
AP® English Language

The Ultimate Guide to 2015 AP® English Language FRQs

  • The Albert Team
  • Last Updated On: March 1, 2022


There were once two neighbors who lived in a small town on the edge of a roaring river. Their houses were set on piles of brick as a precaution for when the river would flood. The bricks had worked for a while, but they had begun to wear away over the years, leaving the houses unstable.

One of the neighbors was always conscious of the fact that there could be another flood, so he spent countless hours building and fortifying the foundation of his house. He would get up early on the weekends and go to sleep late on the weeknights to rebuild his home the right way. While he was busy working, his neighbor would sleep in and go out at night, always commenting to his hard-working neighbor that there would be time for work later.

The hours he spent working resulted in a much stronger foundation. He eventually put the house on a set of sturdy stilts that would hold up against the power of a rushing river. He was never able to take his neighbor up on the offer to go out, but that spring when the waters of the river came rushing toward their houses, it all paid off. His house was left standing, while the flood destroyed his neighbor’s house.


Taking your AP® Language test puts you in the unique situation to choose your path. You can be like the man who put off preparing, going into the test blind. Or you can be like the man who spent long hours in preparation, making sure he was ready when the time came.

You want to be like the man who ensured his house had a stable foundation. The foundation you are building, though, will reflect the time you put into learning how to write stellar essays. That foundation includes learning about what the essays require, the best way to write them, and the scoring process. If you prepare yourself for the Free Response Questions, your writing will stand up to the flood, at least metaphorically.

Test Breakdown

The Free Response Questions (FRQs) are the essay portion of the AP® Language exam. The exam itself has two parts: the first is a multiple-choice section, and the second is the FRQs. This guide provides an overview, strategies, and examples of the FRQs from the College Board. There is a separate guide to the multiple-choice section here.

The FRQ section has two distinct parts: 15 minutes for reading a set of texts and 120 minutes for writing three essays. The 15-minute “reading period” is designed to give you time to read through the documents for question 1 and develop a thoughtful response. Although you are advised to give each essay 40 minutes, there is no set amount of time for any one essay; you may divide the 120 minutes however you want.

The three FRQs are each designed to test a different style of writing. The first question is always a synthesis essay – which is why they give you 15 minutes to read all of the sources you must synthesize. The second essay is rhetorical analysis, requiring you to analyze a text through your essay. The third paper is an argumentative essay.

Each essay is worth one-third of the total grade for the FRQ section, and the FRQ section is worth 55% of the total AP® test. Keep that in mind as you prepare for the exam: while the multiple-choice section is hard, the essays are worth more overall, so divide your study time accordingly.

The scale for essay scores ranges from 1 to 9. A score of 1 reflects an illegible or unintelligible essay, while a score of 9 reflects the best attributes of early college-level writing. You should be shooting to improve your scores into the passing range, which is 5 or above. Note that if you are struggling with the multiple-choice section, a 9-9-9 on the essays can help make up for it.
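As a quick sanity check on the weighting described above, each essay's share of the overall exam score can be computed directly. Here is a minimal sketch, assuming the 55% FRQ weight and even three-way split stated above (the College Board's exact composite-score formula varies by year):

```python
# Approximate weight of each FRQ essay in the overall AP® Language score,
# assuming the weights stated above: the FRQ section is 55% of the exam,
# split evenly across three essays. (Exact composite formulas vary by year.)

FRQ_SECTION_WEIGHT = 0.55   # FRQ section's share of the total exam score
NUM_ESSAYS = 3

essay_weight = FRQ_SECTION_WEIGHT / NUM_ESSAYS   # each essay's share
mcq_weight = 1 - FRQ_SECTION_WEIGHT              # multiple-choice share

print(f"Each essay:      {essay_weight:.1%} of the total score")
print(f"Multiple choice: {mcq_weight:.0%} of the total score")
```

In other words, a single essay counts for roughly 18% of your entire exam score, so no one essay can be safely neglected.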

The Tale of Three Essays

If you are currently taking an AP® class, you have probably experienced the style and formats of the three assignments. You may have learned about the specifics of the different types of essays in class, and you may have already found out which of the three is easiest for you. However, you must possess skill in all three to master the AP® test.

The First Essay (Synthesis)

The first essay on the test is going to be the synthesis essay. This essay can be the trickiest to master, but once you get the hang of it, you will be one step closer to learning the others. The synthesis essay requires you to read six texts, which can be poems, articles, short stories, or even political cartoons.

Once you have read and analyzed the texts, you are asked to craft an argument using at least three of the documents from the set. The sources should be used to build and support your argument, and you must integrate them into a coherent whole.

On the 2015 FRQ section of the AP® exam, the synthesis essay focused on university honor codes. The complete prompt for the section is below:

[Image: the 2015 synthesis essay prompt (Question 1)]

If we break down the task, it asks you to use the six sources to create a “coherent, well-developed argument” for your own position on whether schools should create, maintain, change, or eliminate honor code systems. As you read this, you might have some experience with the idea of honor codes; perhaps your high school has one. You can use that experience, but your response needs to focus on the given texts.

To find the actual documents, you can go here. Taking a look at them will provide some context for the essay samples and their scores.

The question is scored on a scale from 1 to 9, with 9 being the highest. Let’s take a look at some examples of student essays, along with comments from the readers, to break down the dos and don’ts of the FRQ section.

You should always strive to get the highest score possible. Writing a high scoring paper involves learning some practices that will help you write the best possible synthesis essay. Below are three examples taken from student essays.

Craft a Well-Developed Thesis

One of the key elements of scoring high on the synthesis essay section of the FRQ is to craft a well-developed thesis that integrates three sources.

[Image: thesis excerpt from a high-scoring student essay]

This thesis is from a high-scoring essay based on question 1 from the 2015 FRQ. Take note of some of the good things this student is doing:

• The essay mentions three reasons that the student will reference throughout the rest of the essay: honor codes promote a healthy academic environment, statistically lower the rate of academic dishonesty in schools, and adapt to different school environments.

Part of a strong thesis is the use of three reasons to support the main claim. Each of the reasons that supports your claim should come from a different source text. By supporting your claim with three reasons, you ensure that you have integrated at least the three required sources. Remember: getting a 6 or higher requires three or more sources.

• The introduction presents a counterclaim as a contrast to the thesis. The student points out that, “Some argue that honor codes should not be implemented for reasons such as ineffectiveness of the code and creation of a “big-brother”-esque environment…”

This counterclaim sets the student up to include a paragraph that argues against the claims posed in some of the articles, allowing them to use more of the given sources to their advantage. Using a counterclaim sets them up to write a well-supported essay.

Use Sources Effectively

Another essential part of scoring well on the synthesis essay is to utilize the sources effectively. The student demonstrates their command of the text through their second and third paragraphs:

[Image: body paragraphs from the same high-scoring essay]

The student seamlessly integrates the different sources in their essay. Notice how, in the section above, the student moves from one source (Vangell) into information and argument based on another source (Dirmeger and Cartwright). The ability to use the sources together to form a coherent and cohesive whole is one factor that can set your essay apart from other students’.

Have a Well-Developed Reason for Each Source

Lastly, you will want to ensure that you give a well-developed explanation of the texts when speaking about them. Take this for example:

[Image: source-analysis excerpt from a high-scoring student essay]

The student demonstrates deep understanding, and it shows in their writing. You should read a range of texts to prepare for the test. In the example above the student demonstrates a few key skills:

• The student establishes that they understand that Bacall’s comic is satiric and isn’t meant to be taken seriously. The analysis shows the reader that the student understood the text and was able to grasp the nuance of the satire.

• The student also establishes that the use of the spy cam is connected to a philosophical idea like totalitarianism – showing the student understands how the text relates to other parts of the world as a whole.

• The student uses the cartoon as a way to jump into their argument, showing how the fears of critics are unwarranted.

There are some practices that students should avoid on FRQ 1 of the test. Students who do these things can expect to receive low scores on their essays, and if you wish to score above a five, you should avoid them at all costs.

Don’t Change Your Argument Midway Through Your Essay

Changing your argument creates confusion and will make your essay weaker overall. Let’s look at a few pieces from a student essay to see how they change their arguments midway through:

[Image: introduction excerpt from a low-scoring student essay]

Notice that this student talks about the honor system at their school. The student says that it should be maintained in its current form because it is fair but also punishes students. This statement, taken from the end of the intro paragraph, sets up the main crux of the student’s argument, with the idea that they will expand on it in later paragraphs.

However, they do not expand the argument with any evidence:

[Image: body paragraph from the same low-scoring essay]

The student continues to talk about how the system at their school is stable, but at the same time, they offer no proof of the actual policy at their school. They use words like “possible” and “fairly” to describe the system – which seems to suggest they don’t have a good grasp of it.

Up to this point, the student has been somewhat consistent, despite being vague and offering no evidence to support the point about how their school’s honor code is a good example of an honor system that works. In the next paragraph, though, the student’s essay takes a complete turn:

[Image: second-to-last paragraph of the same low-scoring essay]

In their second-to-last paragraph, the student turns from the idea that their school’s honor code is “solid” and instead states that the school should change it to incorporate a peer-enforced honor system. This line of argument doesn’t fit with the rest of the essay and even contradicts their main points.

Most likely the student added this part to their paper after they realized that they had only utilized a single source. The essay ends with confusion and two sources used inadequately. The lesson to learn from this bad essay is that we should stay consistent in our arguments, sticking to the points we discuss at the beginning.

Don’t Fail to Argue the Prompt

One of the easiest ways to fail question one is to write an essay that doesn’t answer the task in the prompt. If we take a look at a sample of a student’s writing, we can see what it looks like when the aim of the essay isn’t focused on the prompt:

[Image: introduction excerpt from a student essay that ignores the prompt]

This student is not focusing on whether or not honor codes work. The student is instead giving information and background about honor codes. This explanation goes on for the entire introductory paragraph of the essay, but in the end, the reader has no idea what the student is going to say in the rest of the essay.

The use of information instead of argument is an ineffective strategy for the AP® Language exam, and you should avoid it. Don’t try to make the essay about something other than the assigned prompt. If you stick to the prompt, you will have a better shot at getting a high score like an 8 or 9.

AP® Readers’ Tips:

  • Read every text before you start your essay. One of the pitfalls of many students is that they do not use enough sources and try to fit them in after the fact.
  • Plan ahead. Ensure that you understand what you are going to say and how you will incorporate the different sources into your writing. You will need at least three sources to score a 6 or higher, so ensure you have at least that many mapped out in your plan.

The Second Essay (Rhetorical Analysis)

The second essay on the FRQ section is always a rhetorical analysis essay. This essay will focus on analyzing a text for an important aspect of the writing. In the case of the 2015 FRQ, the analysis was supposed to concentrate on rhetorical strategies:

[Image: the 2015 rhetorical analysis prompt (Question 2)]

The prompt asks the reader to carefully read the article written by Cesar Chavez and write an essay analyzing the rhetorical choices he uses in it. “Rhetorical choices” is simply another term for rhetorical strategies and includes things like rhetorical appeals and rhetorical devices.

Let’s examine the dos and don’ts for the second essay.

Utilize Specific Examples from the Text in Your Analysis

[Image: analysis excerpt from a high-scoring student essay]

In this high scoring essay, the student goes into their analysis right away. The student points out that Chavez uses precise diction, a rhetorical device, to get his point across. This specific example shows the depth and understanding of the student’s analysis and sets the student up to receive a high score.

Whatever you identify in the text for your analysis, you should be able to point out precisely how it supports your main point. The more depth and accuracy you can give your analysis, the better you will do.

Use Outside Knowledge Effectively to Strengthen Your Argument

The ability to pull in outside knowledge from your classes or books you have read will help enhance your analysis. Let’s take a look at how a student did this on the 2015 exam:

[Image: excerpt from a student essay that uses outside knowledge]

In the example above, the student provides a more in-depth analysis of Chavez’s words by connecting Chavez’s mention of Gandhi to background knowledge of what Gandhi did in British-controlled India. By offering that background, the student can draw a comparison of sorts and show how effective Chavez’s reference is.

Whenever possible, bring in background information that will help with your analysis. It might only seem like extra knowledge about the topic or author, but it could provide some insight into why they chose to write about something or show the full effect of their argument.

Some things to avoid on the rhetorical analysis essay include misreading the passage and providing inadequate analysis of the text.

Don’t Misread the Text

One of the easiest ways to lower your score is to explain something from the text that is incorrect. Let’s look at one of the examples of this from a student essay:

[Image: excerpt from a student essay that misreads the text]

The article mentions that the farm workers’ union was inspired by the work of Martin Luther King Jr. The student’s misreading of the article led them to write, “Chavez’s appeal reached out to an audience of African-American working for justice and equality…” This is a blatant misread of the passage, because nowhere does it indicate that Chavez was reaching out to African-Americans specifically.

This type of misread may seem minor, but it indicates that the student’s grasp of the article is less than what they need to analyze it in depth. It will also alert the reader to that fact, and they may look more closely for other signs of misunderstanding and shallow reading.

Don’t Over-simplify Your Points

You will want to ensure that your analysis is detailed and gets to the very root of the text. Here is an example of simple analysis from a student:

[Image: excerpt from a student essay with oversimplified analysis]

The student references lines from the text, but the student does not go into detail about what those lines say, nor does the student elaborate on why “…readers are overcome with a sense of duty and motivation.” This simplistic analysis of the text leaves a lot to be desired, and it received a low score because it didn’t provide the necessary details to analyze the text accurately.

You should elaborate on each piece of evidence that you bring forth from the text, and be specific about what in the text you are analyzing. The more you pay attention to the smaller details, the better your score will be in the long run.

AP® Readers’ Tips

  • Pay attention to both the holistic (overall) and analytic (particular) views of the piece. You will need to understand both the text as a whole and the specific parts of the text to analyze it effectively.
  • Don’t just analyze the rhetoric used, but instead connect the rhetoric to the specific purpose that Chavez hopes to achieve through his speech. This rule applies to any rhetorical analysis essay.

The Third Essay (Argument)

The third and last essay of the FRQ does not respond to a particular text. Instead, the prompt focuses on crafting an argument about a particular issue. Your essay will need to argue a particular position, though most of the questions put forth by the exam will not be simple either/or questions.

Let’s look at the prompt for the third essay from 2015:

[Image: the 2015 argument essay prompt (Question 3)]

Before we get into the dos and don’ts of the essay, let’s talk about the particular challenge of this task. You are presented with a scenario (in this case, one dealing with small talk), and you are asked to create an argument about that issue.

For 2015, the scenario asks you to argue what value or function you see in small or “polite” talk. You are asked to reference a culture or community you are familiar with and to support your argument with appropriate evidence.

A few of the most important things you can do to ensure you score well on the essay include clearly articulating your thesis and using every example to support your main claim.

Clearly Articulate Your Thesis

As with the synthesis essay, a clear thesis is important for the argumentative essay. The thesis should clearly articulate the essay’s claim, and it should demonstrate that the student understood the requirements of the prompt. Let’s examine a well-written thesis statement:

[Image: thesis excerpt from a high-scoring student essay]

The student, in this case, chose to argue that polite speech serves the purpose of making others more receptive to your purpose. The student then points out three specific situations where polite speech matters: when speaking to superiors, juries, and the general public.

At the end of the thesis statement, the student makes plain the exact nature of the exchange of polite speech for a desired goal. The clarity of their writing and the depth of their understanding come through in their command of language.

Use Examples to Support Main Claim

The best essays are going to use all of their examples to support their main claim. In the case of the third essay, the student sets the essay up so that every example will support the idea that polite speech works as an exchange between those with power and those seeking their purpose.

Let’s take a look at one example of how this support works:

[Image: supporting paragraph from the same high-scoring essay]

The student ensures that they are supporting their main claim. The student is very explicit in tying the example back to the claim with phrases like, “polite speech is an expectation in an environment like school”.

The student points out that there is an expectation of polite speech, and then shows what would happen if polite speech wasn’t used, “…without its implementation, students’ words, and by extension, requests or queries would be disregarded.” This evidence shows that not only is this type of speech required in a school setting but that it is what allows people to get what they want.

This passage demonstrates the level of depth and connection you must make from your evidence to your claim if you want to score well on the third essay of the FRQ. Keep the relationship in mind, and ensure that all your examples explicitly support your claim.

If we take a look at the essay samples from 2015, a few examples stand out as don’ts. In particular, you should avoid circular reasoning and a failure to vary your sentences and writing.

Don’t be Unclear in Your Writing

When you are making an argument based solely on your experiences and reasoning, it can be easy to get bogged down in the details and fail to write a clear, well-reasoned essay. Take your time and ensure you use sound logic in your essay.

Let’s take a look at a sample from an essay that has circular reasoning:

[Image: excerpt from a student essay with circular reasoning]

The essay doesn’t have a clear, logical path. The thesis statement that polite speech is polite doesn’t add anything to the discussion of the value of polite speech, and this reasoning sets the essay up to fail. You should avoid circular arguments and logical fallacies at all costs in your argumentative essay.

There is a three-part process to creating an argument and avoiding the mistakes in the sample above. Arguments have three main parts:

• Claim: what you are arguing is true.

• Evidence: the facts and examples you offer as proof of your claim.

• Warrant: the explanation and reasoning that connect your evidence to your claim.

Without any one of these three parts, an argument is incomplete, and, like the sample above, an incomplete argument will fail to earn you a high score.

Don’t Use a Repetitive Sentence Structure

It seems simple, but many students use simple, repetitive sentence structures. You don’t want your writing to become repetitive, so try to create variety. Let’s take a look at an essay from a student who used repetition too often:

[Image: excerpt from a student essay with repetitive sentence structure]

The mistake this writer makes may have been accidental, but it perfectly represents the problem of repetitive structure. Reread your essay before time is up to ensure that you don’t have any repetitions this obvious (“In short…In short.” above).

This problem can be solved by using a variety of sentence structures, lengths, and formations. Work diligently as you practice to vary all the elements of your sentences, and work to elevate your diction (word choice) to make it more formal and academic.

AP® Readers’ Tips

  • Keep track of all parts of the prompt. One of the easiest ways to drop points is to forget to answer an important aspect of the prompt. In the case of the 2015 prompt, the essay needs to discuss both a community the student is familiar with and the value of “polite” speech in that community.
  • Try to reference literary examples in your writing. There wasn’t much opportunity to reference readings in the 2015 prompt, but if you can cite literature you have read as evidence, it can help boost your scores.

General AP® Readers’ Tips

• Make a plan. One of the best things you can do for any essay you are writing under a time crunch is make a thought-out plan. Sometimes, in the heat of writing, it is easy to forget where we are in our arguments. Having a simple outline can save you from that misfortune.

• Answer the question in your introduction, and be direct. Directly answering the prompt is one of the easiest ways to ensure you get a higher score.

• Clearly indent your paragraphs, and ensure that you always have an easy-to-navigate structure. Topic sentences are a must, so make sure those figure into your structure.

• Use evidence, especially quotes from the texts, and explain what they mean. You need to make an explicit connection between the evidence you use and how it supports your points.

• Part of all great writing is variety. Vary your sentence structures; don’t make all of your sentences short or choppy, but instead try to inject some creativity into your writing. Use transitions, complex sentences, and elevated diction in your writing.

• Use active voice, and make every word add to the paper as a whole. Avoid fluff; you don’t want your paper to look bad because you are trying to pad your word count.

Wrapping up the Ultimate Guide to 2015 AP® English Language FRQs

Now that you better understand the expectations of the AP® Language and Composition FRQ section, you are one step closer to getting your five on the exam. Take what you have learned in this guide and apply it to your writing. Now it is time to practice to perfection.

If you have any more tips or awesome ideas for how to study for the AP® Lang FRQ add them in the comments below.

Looking for AP® English Language practice?

Kickstart your AP® English Language prep with Albert. Start your AP® exam prep today.



IELTS Preparation with Liz: Free IELTS Tips and Lessons, 2024



100 IELTS Essay Questions

Below are practice IELTS essay questions and topics for writing task 2. The 100 essay questions have been used many times over the years. The questions are organised under common topics and essay types. IELTS often uses similar topics for its essays but changes the wording of the essay question.

In order to prepare well for writing task 2, you should prepare ideas for common topics and then practise applying them to the essay questions given. Also see model essays and tips for writing task 2.

Below you will find:

  • Essay Questions By Topic
  • Essay Questions by Essay Type

Please also note that my new Grammar E-book is now available in my store, along with my Ideas for Essay Topics E-book and Advanced Writing Lessons. To visit the store, click here: Liz’s Store

1) Common IELTS Essay Questions

IELTS practice essay questions divided by topic. These topics have been reported by IELTS students in their tests. Essay questions have been recreated as accurately as possible.

  • Art   (5 essay questions)
  • Business & Money   (17 essay questions)
  • Communication & Personality   (20 essay questions)
  • Crime & Punishment   (12 essay questions)
  • Education   (17 essay questions)
  • Environment   (12 essay questions)
  • Family & Children   (8 essay questions)
  • Food & Diet (13 essay questions)
  • Government (6 essay questions)
  • Health   (9 essay questions)
  • Housing, Buildings & Urban Planning (8 essay questions)
  • Language (6 essay questions)
  • Leisure (1 essay question)
  • Media & Advertising   (12 essay questions)
  • Reading  (5 essay questions)
  • Society   (10 essay questions)
  • Space Exploration (3 questions)
  • Sport & Exercise   (6 essay questions)
  • Technology  (6 essay questions)
  • Tourism and Travel   (11 essay questions)
  • Transport  (7 essay questions)
  • Work (17 essay questions)

2) IELTS Essay Questions by Essay Type 

There are 5 main types of essay questions in IELTS writing task 2 (opinion essays, discussion essays, advantage/disadvantage essays, solution essays, and direct question essays). Click on the links below to see some sample essay questions for each type.

  • Opinion Essay Questions
  • Discussion Essay Questions
  • Solution Essay Questions
  • Direct Questions Essay Titles 
  • Advantage / Disadvantage Essay Questions




Copyright Notice

Copyright © Elizabeth Ferguson, 2014 – 2024

All rights reserved.


PrepScholar

53 Stellar College Essay Topics to Inspire You

Most colleges and universities in the United States require applicants to submit at least one essay as part of their application. But trying to figure out what college essay topics you should choose is a tricky process. There are so many potential things you could write about!

In this guide, we go over the essential qualities that make for a great college essay topic and give you 50+ college essay topics you can use for your own statement. In addition, we provide helpful tips for turning your college essay topic into a stellar college essay.

What Qualities Make for a Good College Essay Topic?

Regardless of what you write about in your personal statement for college, there are key features that will always make for a stand-out college essay topic.

#1: It’s Specific

First off, good college essay topics are extremely specific : you should know all the pertinent facts that have to do with the topic and be able to see how the entire essay comes together.

Specificity is essential because it’ll not only make your essay stand out from other statements, but it'll also recreate the experience for admissions officers through its realism, detail, and raw power. You want to tell a story after all, and specificity is the way to do so. Nobody wants to read a vague, bland, or boring story — not even admissions officers!

For example, an OK topic would be your experience volunteering at a cat shelter over the summer. But a better, more specific college essay topic would be how you deeply connected with an elderly cat there named Marty, and how your bond with him made you realize that you want to work with animals in the future.

Remember that specificity in your topic is what will make your essay unique and memorable. It truly is the key to making a strong statement (pun intended)!

#2: It Shows Who You Are

In addition to being specific, good college essay topics reveal to admissions officers who you are: your passions and interests, what is important to you, your best (or possibly even worst) qualities, what drives you, and so on.

The personal statement is critical because it gives schools more insight into who you are as a person and not just who you are as a student in terms of grades and classes.

By coming up with a real, honest topic, you’ll leave an unforgettable mark on admissions officers.

#3: It’s Meaningful to You

The very best college essay topics are those that hold deep meaning to their writers and have truly influenced them in some significant way.

For instance, maybe you plan to write about the first time you played Skyrim to explain how this video game revealed to you the potentially limitless worlds you could create, thereby furthering your interest in game design.

Even if the topic seems trivial, it’s OK to use it — just as long as you can effectively go into detail about why this experience or idea had such an impact on you.

Don’t give in to the temptation to choose a topic that sounds impressive but doesn’t actually hold any deep meaning for you. Admissions officers will see right through this!

Similarly, don’t try to exaggerate some event or experience from your life if it’s not all that important to you or didn’t have a substantial influence on your sense of self.

#4: It’s Unique

College essay topics that are unique are also typically the most memorable, and if there’s anything you want to be during the college application process, it’s that! Admissions officers have to sift through thousands of applications, and the essay is one of the only parts that allows them to really get a sense of who you are and what you value in life.

If your essay is trite or boring, it won’t leave much of an impression, and your application will likely get tossed aside with little chance of admission.

But if your essay topic is very original and different, you’re more likely to earn that coveted second glance at your application.

What does being unique mean exactly, though? Many students assume that they must choose an extremely rare or crazy experience to talk about in their essays —but that's not necessarily what I mean by "unique." Good college essay topics can be unusual and different, yes, but they can also be unique takes on more mundane or common activities and experiences.

For instance, say you want to write an essay about the first time you went snowboarding. Instead of just describing the details of the experience and how you felt during it, you could juxtapose your emotions with a creative and humorous perspective from the snowboard itself. Or you could compare your first attempt at snowboarding with your most recent experience in a snowboarding competition. The possibilities are endless!

#5: It Clearly Answers the Question

Finally, good college essay topics will clearly and fully answer the question(s) in the prompt.

You might fail to directly answer a prompt by misinterpreting what it’s asking you to do, or by answering only part of it (e.g., answering just one out of three questions).

Therefore, make sure you take the time to come up with an essay topic that is in direct response to every question in the prompt.

Take this Coalition Application prompt as an example:

What is the hardest part of being a teenager now? What's the best part? What advice would you give a younger sibling or friend (assuming they would listen to you)?

For this prompt, you’d need to answer all three questions (though it’s totally fine to focus more on one or two of them) to write a compelling and appropriate essay.

This is why we recommend reading and rereading the essay prompt; you should know exactly what it’s asking you to do, well before you start brainstorming possible college application essay topics.


53 College Essay Topics to Get Your Brain Moving

In this section, we give you a list of 53 examples of college essay topics. Use these as jumping-off points to help you get started on your college essay and to ensure that you’re on track to come up with a relevant and effective topic.

All college application essay topics below are categorized by essay prompt type. We’ve identified six general types of college essay prompts:

  • Why This College?
  • Change and Personal Growth
  • Passions, Interests, and Goals
  • Overcoming a Challenge
  • Diversity and Community
  • Solving a Problem

Note that these prompt types could overlap with one another, so you’re not necessarily limited to just one college essay topic in a single personal statement.

  • How a particular major or program will help you achieve your academic or professional goals
  • A memorable and positive interaction you had with a professor or student at the school
  • Something good that happened to you while visiting the campus or while on a campus tour
  • A certain class you want to take or a certain professor you’re excited to work with
  • Some piece of on-campus equipment or facility that you’re looking forward to using
  • Your plans to start a club at the school, possibly to raise awareness of a major issue
  • A study abroad or other unique program that you can’t wait to participate in
  • How and where you plan to volunteer in the community around the school
  • An incredible teacher you studied under and the positive impact they had on you
  • How you went from really liking something, such as a particular movie star or TV show, to not liking it at all (or vice versa)
  • How your own or someone else’s (change in) socioeconomic status made you more aware of poverty
  • A time someone said something to you that made you realize you were wrong
  • How your opinion on a controversial topic, such as gay marriage or DACA, has shifted over time
  • A documentary that made you aware of a particular social, economic, or political issue going on in the country or world
  • Advice you would give to your younger self about friendship, motivation, school, etc.
  • The steps you took in order to kick a bad or self-sabotaging habit
  • A juxtaposition of the first and most recent time you did something, such as dance onstage
  • A book you read that you credit with sparking your love of literature and/or writing
  • A school assignment or project that introduced you to your chosen major
  • A glimpse of your everyday routine and how your biggest hobby or interest fits into it
  • The career and (positive) impact you envision yourself having as a college graduate
  • A teacher or mentor who encouraged you to pursue a specific interest you had
  • How moving around a lot helped you develop a love of international exchange or learning languages
  • A special skill or talent you’ve had since you were young and that relates to your chosen major in some way, such as designing buildings with LEGO bricks
  • Where you see yourself in 10 or 20 years
  • Your biggest accomplishment so far relating to your passion (e.g., winning a gold medal for your invention at a national science competition)
  • A time you lost a game or competition that was really important to you
  • How you dealt with the loss or death of someone close to you
  • A time you did poorly in a class that you expected to do well in
  • How moving to a new school impacted your self-esteem and social life
  • A chronic illness you battled or are still battling
  • Your healing process after having your heart broken for the first time
  • A time you caved under peer pressure and the steps you took to make sure it wouldn’t happen again
  • How you almost gave up on learning a foreign language but stuck with it
  • Why you decided to become a vegetarian or vegan, and how you navigate living with a meat-eating family
  • What you did to overcome a particular anxiety or phobia you had (e.g., stage fright)
  • A history of a failed experiment you did over and over, and how you finally found a way to make it work successfully
  • Someone within your community whom you aspire to emulate
  • A family tradition you used to be embarrassed about but are now proud of
  • Your experience with learning English upon moving to the United States
  • A close friend in the LGBTQ+ community who supported you when you came out
  • A time you were discriminated against, how you reacted, and what you would do differently if faced with the same situation again
  • How you navigate your identity as a multiracial, multiethnic, and/or multilingual person
  • A project or volunteer effort you led to help or improve your community
  • A particular celebrity or role model who inspired you to come out as LGBTQ+
  • Your biggest challenge (and how you plan to tackle it) as a female in a male-dominated field
  • How you used to discriminate against your own community, and what made you change your mind and eventually take pride in who you are and/or where you come from
  • A program you implemented at your school in response to a known problem, such as a lack of recycling cans in the cafeteria
  • A time you stepped in to mediate an argument or fight between two people
  • An app or other tool you developed to make people’s lives easier in some way
  • A time you proposed a solution that worked to an ongoing problem at school, an internship, or a part-time job
  • The steps you took to identify and fix an error in coding for a website or program
  • An important social or political issue that you would fix if you had the means


How to Build a College Essay in 6 Easy Steps

Once you’ve decided on a college essay topic you want to use, it’s time to buckle down and start fleshing out your essay. These six steps will help you transform a simple college essay topic into a full-fledged personal statement.

Step 1: Write Down All the Details

Once you’ve chosen a general topic to write about, get out a piece of paper and get to work on creating a list of all the key details you could include in your essay. These could be things such as the following:

  • Emotions you felt at the time
  • Names, places, and/or numbers
  • Dialogue, or what you or someone else said
  • A specific anecdote, example, or experience
  • Descriptions of how things looked, felt, or seemed

If you can only come up with a few details, then it’s probably best to revisit the list of college essay topics above and choose a different one that you can write more extensively on.

Good college essay topics are typically those that:

  • You remember well (so nothing that happened when you were really young)
  • You're excited to write about
  • You're not embarrassed or uncomfortable to share with others
  • You believe will make you positively stand out from other applicants

Step 2: Figure Out Your Focus and Approach

Once you have all your major details laid out, start to figure out how you could arrange them in a way that makes sense and will be most effective.

It’s important here to really narrow your focus: you don’t need to (and shouldn’t!) discuss every single aspect of your trip to visit family in Indonesia when you were 16. Rather, zero in on a particular anecdote or experience and explain why and how it impacted you.

Alternatively, you could write about multiple experiences while weaving them together with a clear, meaningful theme or concept, such as how your math teacher helped you overcome your struggle with geometry over the course of an entire school year. In this case, you could mention a few specific times she tutored you and most strongly supported you in your studies.

There’s no one right way to approach your college essay, so play around to see what approaches might work well for the topic you’ve chosen.

If you’re really unsure about how to approach your essay, think about what part of your topic was or is most meaningful and memorable to you, and go from there.

Step 3: Structure Your Narrative

  • Beginning: Don’t just spout off a ton of background information here—you want to hook your reader, so try to start in the middle of the action, such as with a meaningful conversation you had or a strong emotion you felt. It could also be a single anecdote if you plan to center your essay around a specific theme or idea.
  • Middle: Here’s where you start to flesh out what you’ve established in the opening. Provide more details about the experience (if a single anecdote) or delve into the various times your theme or idea became most important to you. Use imagery and sensory details to put the reader in your shoes.
  • End: It’s time to bring it all together. Finish describing the anecdote or theme your essay centers around and explain how it relates to you now, what you’ve learned or gained from it, and how it has influenced your goals.


Step 4: Write a Rough Draft

By now you should have all your major details and an outline for your essay written down; these two things will make it easy for you to convert your notes into a rough draft.

At this stage of the writing process, don’t worry too much about vocabulary or grammar and just focus on getting out all your ideas so that they form the general shape of an essay. It’s OK if you’re a little over the essay's word limit — as you edit, you’ll most likely make some cuts to irrelevant and ineffective parts anyway.

If at any point you get stuck and have no idea what to write, revisit steps 1-3 to see whether there are any important details or ideas you might be omitting or not elaborating on enough to get your overall point across to admissions officers.

Step 5: Edit, Revise, and Proofread

As you edit and revise, look out for the following issues in your draft:

  • Sections that are too wordy and don’t say anything important
  • Irrelevant details that don’t enhance your essay or the point you're trying to make
  • Parts that seem to drag or that feel incredibly boring or redundant
  • Areas that are vague and unclear and would benefit from more detail
  • Phrases or sections that are awkwardly placed and should be moved around
  • Areas that feel unconvincing, inauthentic, or exaggerated

Start paying closer attention to your word choice/vocabulary and grammar at this time, too. It’s perfectly normal to edit and revise your college essay several times before asking for feedback, so keep working with it until you feel it’s pretty close to its final iteration.

This step will likely take the longest amount of time — at least several weeks, if not months — so really put effort into fixing up your essay. Once you’re satisfied, do a final proofread to ensure that it’s technically correct.

Step 6: Get Feedback and Tweak as Needed

After you’ve overhauled your rough draft and made it into a near-final draft, give your essay to somebody you trust, such as a teacher or parent, and have them look it over for technical errors and offer you feedback on its content and overall structure.

Use this feedback to make any last-minute changes or edits. If necessary, repeat steps 5 and 6. You want to be extra sure that your essay is perfect before you submit it to colleges!

Recap: From College Essay Topics to Great College Essays

Many different kinds of college application essay topics can get you into a great college. But this doesn’t make it any easier to choose the best topic for you.

In general, the best college essay topics have the following qualities:

  • They’re specific
  • They show who you are
  • They’re meaningful to you
  • They’re unique
  • They clearly answer the question

If you ever need help coming up with an idea of what to write for your essay, just refer to the list of 53 examples of college essay topics above to get your brain juices flowing.

Once you’ve got an essay topic picked out, follow these six steps for turning your topic into an unforgettable personal statement:

  • Write down all the details
  • Figure out your focus and approach
  • Structure your narrative
  • Write a rough draft
  • Edit, revise, and proofread
  • Get feedback and tweak as needed

And with that, I wish you the best of luck on your college essays!

What’s Next?

Writing a college essay is no simple task. Get expert college essay tips with our guides on how to come up with great college essay ideas and how to write a college essay, step by step.

You can also check out this huge list of college essay prompts to get a feel for what types of questions you'll be expected to answer on your applications.

Want to see examples of college essays that absolutely rocked? You're in luck because we've got a collection of 100+ real college essay examples right here on our blog!


Hannah received her MA in Japanese Studies from the University of Michigan and holds a bachelor's degree from the University of Southern California. From 2013 to 2015, she taught English in Japan via the JET Program. She is passionate about education, writing, and travel.



UPSC Mains 2015 Essay Question Paper

Last updated on September 11, 2023 by Alex Andrews George


Section ‘A’

  • Lending hands to someone is better than giving a dole.
  • Quick but steady wins the race.
  • Character of an institution is reflected in its leader.
  • Education without values, as useful as it is, seems rather to make a man more clever devil.
  • Technology cannot replace manpower.
  • Crisis faced in India – moral or economic.
  • Dreams which should not let India sleep.
  • Can capitalism bring inclusive growth?


About Alex Andrews George

Alex Andrews George is a mentor, author, and social entrepreneur. Alex is the founder of ClearIAS and one of the expert Civil Service Exam Trainers in India.

He is the author of many best-seller books like 'Important Judgments that transformed India' and 'Important Acts that transformed India'.

A trusted mentor and pioneer in online training, Alex's guidance, strategies, study materials, and mock exams have helped many aspirants become IAS, IPS, and IFS officers.


First-year applicants: Essays, activities & academics

Rather than asking you to write one long essay, the MIT application consists of several short response questions and essays designed to help us get to know you. Remember that this is not a writing test. Be honest, be open, be authentic—this is your opportunity to connect with us.

You should certainly be thoughtful about your essays, but if you’re thinking too much—spending a lot of time stressing or strategizing about what makes you “look best,” as opposed to the answers that are honest and easy—you’re doing it wrong.

Our questions

For the 2023–2024 application, we’re asking these short answer essay questions:

  • What field of study appeals to you the most right now? (Note: Applicants select from a drop-down list.) Tell us more about why this field of study at MIT appeals to you.
  • We know you lead a busy life, full of activities, many of which are required of you. Tell us about something you do simply for the pleasure of it.
  • How has the world you come from—including your opportunities, experiences, and challenges—shaped your dreams and aspirations?
  • MIT brings people with diverse backgrounds together to collaborate, from tackling the world’s biggest challenges to lending a helping hand. Describe one way you have collaborated with others to learn from them, with them, or contribute to your community together.
  • How did you manage a situation or challenge that you didn’t expect? What did you learn from it?

Depending on the question, we’re looking for responses of approximately 100–200 words each. There is also one final, open-ended, additional-information text box where you can tell us anything else you think we really ought to know.

Please use our form, not a resume, to list your activities. There is only enough space to list four things—please choose the four that mean the most to you and tell us a bit about them.

Self-reported Coursework Form

How you fill out this form will not make or break your application, so don’t stress about it. Use your best judgment—we’re simply trying to get a clear picture of your academic preparation by subject area. We see thousands of different transcripts, so it really helps us to view your coursework and grades in a consistent format.

Here are a few quick tips to help you complete this section:

  • The self-reported coursework should be completed by students in U.S. school systems only. If you attend an international school, we’ll just use your transcript.
  • The information you provide does not replace your official high school transcript, which must be sent to us from your school to verify your self-reported information (in order to avoid accidental misrepresentation, it might help to have a copy of your high school transcript in front of you while completing this form).
  • Avoid abbreviations, if at all possible, and enter the names of your school courses by subject area. Please include all classes you have taken and are currently taking. If your courses were taken outside of your high school (at a local junior college or university, for example), tell us where they were taken in the “Class Name” field.
  • In the “Grade Received” field, list term and/or final grades for each class, as found on your school transcript (semester, trimester, quarter, final, etc.). Use one entry only per class. For example, it’s not necessary to use a separate entry for each semester of the same class. Place all grades for a class in the same field, separating grades with commas.


The Big Short: An Analysis of Financial Missteps and Moral Quandaries


Published: Jun 6, 2024

Words: 585 | Page: 1 | 3 min read

Table of contents

  • Introduction
  • Body Paragraphs
  • The Financial Crisis Depicted
  • Innovative Narrative Techniques
  • Character Development and Ethical Quandaries
  • Critical Reception and Cultural Impact


  • Open access
  • Published: 03 June 2024

Applying large language models for automated essay scoring for non-native Japanese

  • Wenchao Li &
  • Haitao Liu

Humanities and Social Sciences Communications volume 11, Article number: 723 (2024)


  • Language and linguistics

Recent advancements in artificial intelligence (AI) have led to an increased use of large language models (LLMs) for language assessment tasks such as automated essay scoring (AES), automated listening tests, and automated oral proficiency assessments. The application of LLMs for AES in the context of non-native Japanese, however, remains limited. This study explores the potential of LLM-based AES by comparing the efficiency of different models, i.e., two conventional machine learning-based methods (Jess and JWriter), two LLMs (GPT and BERT), and one Japanese local LLM (the Open-Calm large model). To conduct the evaluation, a dataset consisting of 1400 story-writing scripts authored by learners with 12 different first languages was used. Statistical analysis revealed that GPT-4 outperforms Jess, JWriter, BERT, and the Japanese language-specific trained Open-Calm large model in terms of annotation accuracy and predicting learning levels. Furthermore, by comparing 18 different models that utilize various prompts, the study emphasized the significance of prompts in achieving accurate and reliable evaluations using LLMs.
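As a rough illustration of what prompt-based LLM scoring involves, the sketch below assembles a zero-shot rating prompt and parses the predicted proficiency level from a model reply. The rubric wording and level labels are hypothetical, not the paper's actual 18 prompts, and the model call itself is omitted:

```python
import re
from typing import List, Optional


def build_scoring_prompt(essay: str, level_labels: List[str]) -> str:
    """Assemble a zero-shot scoring prompt for an LLM-based AES system.

    The instructions and label set here are illustrative placeholders.
    """
    labels = ", ".join(level_labels)
    return (
        "You are a rater of Japanese learner writing.\n"
        f"Assign exactly one proficiency level from: {labels}.\n"
        "Reply in the form 'Level: <label>'.\n\n"
        f"Essay:\n{essay}"
    )


def parse_level(reply: str) -> Optional[str]:
    """Extract the predicted level label from a model reply, if present."""
    match = re.search(r"Level:\s*(\S+)", reply)
    return match.group(1) if match else None
```

In practice, the prompt string would be sent to the chosen model (GPT-4, Open-Calm, etc.), and the parsed label would be compared against human ratings to compute annotation accuracy; varying the prompt wording is exactly the kind of comparison the study reports.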


Conventional machine learning technology in AES

AES has experienced significant growth with the advancement of machine learning technologies in recent decades. In the earlier stages of AES development, conventional machine learning-based approaches were commonly used. These approaches involved the following procedures: (a) feeding the machine a dataset: a dataset of essays is provided to the machine learning system, serving as the basis for training the model and establishing patterns and correlations between linguistic features and human ratings; (b) training the model on the linguistic features that best represent human ratings and can effectively discriminate learners’ writing proficiency. These features include lexical richness (Lu, 2012; Kyle and Crossley, 2015; Kyle et al. 2021), syntactic complexity (Lu, 2010; Liu, 2008), and text cohesion (Crossley and McNamara, 2016), among others. Conventional machine learning approaches to AES require human intervention, such as manual correction and annotation of essays; this involvement is necessary to create a labeled dataset for training the model. Several AES systems have been developed using conventional machine learning technologies. These include the Intelligent Essay Assessor (Landauer et al. 2003), the e-rater engine by Educational Testing Service (Attali and Burstein, 2006; Burstein, 2003), MyAccess with the IntelliMetric scoring engine by Vantage Learning (Elliot, 2003), and the Bayesian Essay Test Scoring system (Rudner and Liang, 2002). These systems have played a significant role in automating the essay scoring process and providing quick, consistent feedback to learners. However, as touched upon earlier, conventional machine learning approaches rely on predetermined linguistic features and often require manual intervention, making them less flexible and potentially limiting their generalizability to different contexts.

In the context of the Japanese language, conventional machine learning-incorporated AES tools include Jess (Ishioka and Kameda, 2006) and JWriter (Lee and Hasebe, 2017). Jess assesses essays by deducting points from the perfect score, utilizing the Mainichi Daily News newspaper as a database. The evaluation criteria employed by Jess encompass various aspects, such as rhetorical elements (e.g., reading comprehension, vocabulary diversity, percentage of complex words, and percentage of passive sentences), organizational structures (e.g., forward and reverse connection structures), and content analysis (e.g., latent semantic indexing). JWriter employs linear regression analysis to assign weights to various measurement indices, such as average sentence length and total number of characters. These weights are then combined to derive the overall score. A pilot study involving the Jess model was conducted on 1320 essays at different proficiency levels, including primary, intermediate, and advanced. However, the results indicated that the Jess model failed to significantly distinguish between these essay levels. Out of the 16 measures used, four measures, namely median sentence length, median clause length, median number of phrases, and maximum number of phrases, did not show statistically significant differences between the levels. Additionally, two measures exhibited between-level differences but lacked linear progression: the number of attributive declinable words and the kanji/kana ratio. The remaining measures, including maximum sentence length, maximum clause length, number of attributive conjugated words, maximum number of consecutive infinitive forms, maximum number of conjunctive-particle clauses, k characteristic value, percentage of big words, and percentage of passive sentences, demonstrated statistically significant between-level differences and displayed linear progression.

Both Jess and JWriter exhibit notable limitations, including the manual selection of feature parameters and weights, which can introduce biases into the scoring process. The reliance on human annotators to label non-native language essays also introduces potential noise and variability in the scoring. Furthermore, an important concern is the possibility of system manipulation and cheating by learners who are aware of the regression equation utilized by the models (Hirao et al. 2020 ). These limitations emphasize the need for further advancements in AES systems to address these challenges.

Deep learning technology in AES

Deep learning has emerged as one of the approaches for improving the accuracy and effectiveness of AES. Deep learning-based AES methods utilize artificial neural networks that mimic the human brain’s functioning through layered algorithms and computational units. Unlike conventional machine learning, deep learning autonomously learns from the environment and past errors without human intervention. This enables deep learning models to establish nonlinear correlations, resulting in higher accuracy. Recent advancements in deep learning have led to the development of transformers, which are particularly effective in learning text representations. Noteworthy examples include bidirectional encoder representations from transformers (BERT) (Devlin et al. 2019 ) and the generative pretrained transformer (GPT) (OpenAI).

BERT is a linguistic representation model that utilizes a transformer architecture and is trained on two tasks: masked linguistic modeling and next-sentence prediction (Hirao et al. 2020 ; Vaswani et al. 2017 ). In the context of AES, BERT follows specific procedures, as illustrated in Fig. 1 : (a) the tokenized prompts and essays are taken as input; (b) special tokens, such as [CLS] and [SEP], are added to mark the beginning and separation of prompts and essays; (c) the transformer encoder processes the prompt and essay sequences, resulting in hidden layer sequences; (d) the hidden layers corresponding to the [CLS] tokens (T[CLS]) represent distributed representations of the prompts and essays; and (e) a multilayer perceptron uses these distributed representations as input to obtain the final score (Hirao et al. 2020 ).

figure 1

AES system with BERT (Hirao et al. 2020 ).

The training of BERT on a substantial amount of sentence data through the Masked Language Model (MLM) allows it to capture contextual information within the hidden layers. Consequently, BERT is expected to be capable of identifying artificial essays as invalid and assigning them lower scores (Mizumoto and Eguchi, 2023). In the context of AES for nonnative Japanese learners, Hirao et al. (2020) combined the long short-term memory (LSTM) model proposed by Hochreiter and Schmidhuber (1997) with BERT to develop a tailored automated essay scoring system. The findings of their study revealed that the BERT model outperformed both the conventional machine learning approach utilizing character-type features such as “kanji” and “hiragana”, and the standalone LSTM model. Takeuchi et al. (2021) presented an approach to Japanese AES that eliminates the requirement for pre-scored essays by relying solely on reference texts or a model answer for the essay task. They investigated multiple similarity evaluation methods, including frequency of morphemes, idf values calculated on Wikipedia, LSI, LDA, word-embedding vectors, and document vectors produced by BERT. The experimental findings revealed that the method utilizing the frequency of morphemes with idf values exhibited the strongest correlation with human-annotated scores across different essay tasks. The utilization of BERT in AES encounters several limitations. First, essays often exceed the model’s maximum input length. Second, only score labels are available for training, which restricts access to additional information.

Mizumoto and Eguchi (2023) were pioneers in employing the GPT model for AES in non-native English writing. Their study focused on evaluating the accuracy and reliability of AES using the GPT-3 text-davinci-003 model, analyzing a dataset of 12,100 essays from the corpus of nonnative written English (TOEFL11). The findings indicated that AES utilizing the GPT-3 model exhibited a certain degree of accuracy and reliability, and they suggest that GPT-3-based AES systems hold the potential to support human ratings. However, applying the GPT model to AES presents a unique natural language processing (NLP) task that involves considerations such as nonnative language proficiency, the influence of the learner’s first language on the output in the target language, and identifying the linguistic features that best indicate writing quality in a specific language. These linguistic features may differ morphologically or syntactically from those present in the learners’ first language, as observed in (1)–(3).

Isolating

我-送了-他-一本-书

Wǒ-sòngle-tā-yī-běn-shū

1sg-give.pst-him-one.cl-book

“I gave him a book.”

Agglutinative

彼-に-本-を-あげ-まし-た

Kare-ni-hon-o-age-mashi-ta

3sg-dat-book-acc-give.hon-pst

“(I) gave him a book (polite).”

Inflectional

give, give-s, gave, given, giving

Additionally, the morphological agglutination and subject-object-verb (SOV) order in Japanese, along with its idiomatic expressions, pose additional challenges for applying language models in AES tasks (4).

足-が 棒-に なり-ました

Ashi-ga bō-ni nari-mashita

leg-nom stick-dat become-pst

“My leg became like a stick (I am extremely tired).”

The example sentence demonstrates the morpho-syntactic structure of Japanese and the presence of an idiomatic expression. In this sentence, the verb “なる” (naru), meaning “to become”, appears at the end of the sentence. The verb stem “なり” (nari) carries morphemes indicating honorification (“ます” - masu) and tense (“た” - ta), showcasing agglutination. While the sentence can be literally translated as “my leg became like a stick”, it carries an idiomatic interpretation that implies “I am extremely tired”.

To overcome this issue, CyberAgent Inc. ( 2023 ) has developed the Open-Calm series of language models specifically designed for Japanese. Open-Calm consists of pre-trained models available in various sizes, such as Small, Medium, Large, and 7b. Figure 2 depicts the fundamental structure of the Open-Calm model. A key feature of this architecture is the incorporation of the Lora Adapter and GPT-NeoX frameworks, which can enhance its language processing capabilities.

figure 2

GPT-NeoX Model Architecture (Okgetheng and Takeuchi 2024 ).

In a recent study conducted by Okgetheng and Takeuchi ( 2024 ), they assessed the efficacy of Open-Calm language models in grading Japanese essays. The research utilized a dataset of approximately 300 essays, which were annotated by native Japanese educators. The findings of the study demonstrate the considerable potential of Open-Calm language models in automated Japanese essay scoring. Specifically, among the Open-Calm family, the Open-Calm Large model (referred to as OCLL) exhibited the highest performance. However, it is important to note that, as of the current date, the Open-Calm Large model does not offer public access to its server. Consequently, users are required to independently deploy and operate the environment for OCLL. In order to utilize OCLL, users must have a PC equipped with an NVIDIA GeForce RTX 3060 (8 or 12 GB VRAM).

In summary, while the potential of LLMs in automated scoring of nonnative Japanese essays has been demonstrated in two studies—BERT-driven AES (Hirao et al. 2020 ) and OCLL-based AES (Okgetheng and Takeuchi, 2024 )—the number of research efforts in this area remains limited.

Another significant challenge in applying LLMs to AES lies in prompt engineering and ensuring its reliability and effectiveness (Brown et al. 2020 ; Rae et al. 2021 ; Zhang et al. 2021 ). Various prompting strategies have been proposed, such as the zero-shot chain of thought (CoT) approach (Kojima et al. 2022 ), which involves manually crafting diverse and effective examples. However, manual efforts can lead to mistakes. To address this, Zhang et al. ( 2021 ) introduced an automatic CoT prompting method called Auto-CoT, which demonstrates matching or superior performance compared to the CoT paradigm. Another prompt framework is trees of thoughts, enabling a model to self-evaluate its progress at intermediate stages of problem-solving through deliberate reasoning (Yao et al. 2023 ).

Beyond linguistic studies, there has been a noticeable increase in the number of foreign workers in Japan and Japanese learners worldwide (Ministry of Health, Labor, and Welfare of Japan, 2022 ; Japan Foundation, 2021 ). However, existing assessment methods, such as the Japanese Language Proficiency Test (JLPT), J-CAT, and TTBJ Footnote 1 , primarily focus on reading, listening, vocabulary, and grammar skills, neglecting the evaluation of writing proficiency. As the number of workers and language learners continues to grow, there is a rising demand for an efficient AES system that can reduce costs and time for raters and be utilized for employment, examinations, and self-study purposes.

This study aims to explore the potential of LLM-based AES by comparing the effectiveness of five models: two LLMs (GPT Footnote 2 and BERT), one Japanese local LLM (OCLL), and two conventional machine learning-based methods (linguistic feature-based scoring tools - Jess and JWriter).

The research questions addressed in this study are as follows:

To what extent do the LLM-driven AES and linguistic feature-based AES, when used as automated tools to support human rating, accurately reflect test takers’ actual performance?

What influence does the prompt have on the accuracy and performance of LLM-based AES methods?

The subsequent sections of the manuscript cover the methodology, including the assessment measures for nonnative Japanese writing proficiency, criteria for prompts, and the dataset. The evaluation section focuses on the analysis of annotations and rating scores generated by LLM-driven and linguistic feature-based AES methods.

Methodology

The dataset utilized in this study was obtained from the International Corpus of Japanese as a Second Language (I-JAS) Footnote 3 . This corpus consisted of 1000 participants who represented 12 different first languages. For the study, the participants were given a story-writing task on a personal computer. They were required to write two stories based on the 4-panel illustrations titled “Picnic” and “The key” (see Appendix A). Background information for the participants was provided by the corpus, including their Japanese language proficiency levels assessed through two online tests: J-CAT and SPOT. These tests evaluated their reading, listening, vocabulary, and grammar abilities. The learners’ proficiency levels were categorized into six levels aligned with the Common European Framework of Reference for Languages (CEFR) and the Reference Framework for Japanese Language Education (RFJLE): A1, A2, B1, B2, C1, and C2. According to Lee et al. ( 2015 ), there is a high level of agreement (r = 0.86) between the J-CAT and SPOT assessments, indicating that the proficiency certifications provided by J-CAT are consistent with those of SPOT. However, it is important to note that the scores of J-CAT and SPOT do not have a one-to-one correspondence. In this study, the J-CAT scores were used as a benchmark to differentiate learners of different proficiency levels. A total of 1400 essays were utilized, representing the beginner (aligned with A1), A2, B1, B2, C1, and C2 levels based on the J-CAT scores. Table 1 provides information about the learners’ proficiency levels and their corresponding J-CAT and SPOT scores.

A dataset comprising a total of 1400 essays from the story-writing tasks was collected. Among these, 714 essays were utilized to evaluate the reliability of the LLM-based AES method, while the remaining 686 essays were designated as development data to assess the LLM-based AES’s capability to distinguish participants with varying proficiency levels. The GPT-4 API was used in this study. A detailed explanation of the prompt-assessment criteria is provided in Section Prompt. All essays were sent to the model for measurement and scoring.
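The step of packaging each essay together with the assessment criteria into a single request can be sketched as follows. This is a hypothetical illustration, not the study’s actual code: the measure list is abbreviated and the function name is invented; the full prompt text appears in Section Prompt.

```python
# Hypothetical sketch of assembling a GPT-4 scoring request; the function
# name and the abbreviated measure list are illustrative, not the study's code.
MEASURES = [
    "Moving average type-token ratio",
    "Mean length of clause",
    "Mean dependency distance",
    "Number of errors per essay",
    # ... the remaining measures from Table 2
]

def build_scoring_prompt(essay_text: str) -> str:
    header = ("Please evaluate the following Japanese essay and assign a "
              "score on a six-point scale (A1, A2, B1, B2, C1, C2), "
              "together with trait scores based on these criteria:")
    criteria = "\n".join(f"- {m}" for m in MEASURES)
    return f"{header}\n{criteria}\n\nJapanese essay text:\n{essay_text}"

prompt = build_scoring_prompt("出かける前に二人が地図を見ている間に…")
# The resulting string is then sent as the user message of a chat request.
```
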

Measures of writing proficiency for nonnative Japanese

Japanese exhibits a morphologically agglutinative structure where morphemes are attached to the word stem to convey grammatical functions such as tense, aspect, voice, and honorifics, e.g. (5).

食べ-させ-られ-まし-た-か

tabe-sase-rare-mashi-ta-ka

[eat (stem)-causative-passive voice-honorification-tense. past-question marker]

Japanese employs nine case particles to indicate grammatical functions: the nominative case particle が (ga), the accusative case particle を (o), the genitive case particle の (no), the dative case particle に (ni), the locative/instrumental case particle で (de), the ablative case particle から (kara), the directional case particle へ (e), the comitative case particle と (to), and the comparative case particle より (yori). The agglutinative nature of the language, combined with the case particle system, provides an efficient means of distinguishing between active and passive voice, either through morphemes or case particles, e.g. 食べる taberu “eat (conclusive)” (active voice); 食べられる taberareru “be eaten (conclusive)” (passive voice). In the active voice, “パンを食べる” (pan o taberu) translates to “to eat bread”. In the passive voice, it becomes “パンが食べられた” (pan ga taberareta), which means “(the) bread was eaten”. Additionally, different conjugations of the same lemma are counted as one type in order to ensure a comprehensive assessment of the language features; for example, 食べる taberu “eat (conclusive)”, 食べている tabeteiru “eat (progressive)”, and 食べた tabeta “eat (past)” count as one type.

To incorporate these features, previous research (Suzuki, 1999 ; Watanabe et al. 1988 ; Ishioka, 2001 ; Ishioka and Kameda, 2006 ; Hirao et al. 2020 ) has identified complexity, fluency, and accuracy as crucial factors for evaluating writing quality. These criteria are assessed through various aspects, including lexical richness (lexical density, diversity, and sophistication), syntactic complexity, and cohesion (Kyle et al. 2021 ; Mizumoto and Eguchi, 2023 ; Ure, 1971 ; Halliday, 1985 ; Barkaoui and Hadidi, 2020 ; Zenker and Kyle, 2021 ; Kim et al. 2018 ; Lu, 2017 ; Ortega, 2015 ). Therefore, this study proposes five scoring categories: lexical richness, syntactic complexity, cohesion, content elaboration, and grammatical accuracy. A total of 16 measures were employed to capture these categories. The calculation process and specific details of these measures can be found in Table 2 .

T-unit, first introduced by Hunt (1966), is a measure used for evaluating speech and composition. It serves as an indicator of syntactic development and represents the shortest units into which a piece of discourse can be divided without leaving any sentence fragments. In the context of Japanese language assessment, Sakoda and Hosoi (2020) utilized the T-unit as the basic unit to assess the accuracy and complexity of Japanese learners’ speaking and storytelling. The calculation of T-units in Japanese follows these principles:

A single main clause constitutes 1 T-unit, regardless of the presence or absence of dependent clauses, e.g. (6).

ケンとマリはピクニックに行きました (main clause): 1 T-unit.

If a sentence contains a main clause along with subclauses, each subclause is considered part of the same T-unit, e.g. (7).

天気が良かった の で (subclause)、ケンとマリはピクニックに行きました (main clause): 1 T-unit.

In the case of coordinate clauses, where multiple clauses are connected, each coordinated clause is counted separately. Thus, a sentence with coordinate clauses may have 2 T-units or more, e.g. (8).

ケンは地図で場所を探して (coordinate clause)、マリはサンドイッチを作りました (coordinate clause): 2 T-units.
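The three counting principles above can be sketched as a small function. The clause tags are assumed to come from prior annotation (in practice, from a dependency parse or manual coding); the representation is illustrative:

```python
# Toy T-unit counter. Clauses are pre-tagged as "main", "sub" (subordinate,
# attached to a main clause), or "coord" (coordinate); in practice these
# tags would come from a parser or manual annotation.
def count_t_units(tagged_clauses):
    # Each main clause opens one T-unit; its subclauses belong to that unit.
    # Each coordinate clause counts as a T-unit of its own.
    return sum(1 for tag, _ in tagged_clauses if tag in ("main", "coord"))

# (7): subclause + main clause -> 1 T-unit
ex7 = [("sub", "天気が良かったので"), ("main", "ケンとマリはピクニックに行きました")]
# (8): two coordinate clauses -> 2 T-units
ex8 = [("coord", "ケンは地図で場所を探して"), ("coord", "マリはサンドイッチを作りました")]
print(count_t_units(ex7), count_t_units(ex8))  # 1 2
```
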

Lexical diversity refers to the range of words used within a text (Engber, 1995; Kyle et al. 2021) and is considered a useful measure of the breadth of vocabulary in Ln production (Jarvis, 2013a, 2013b).

The type/token ratio (TTR) is widely recognized as a straightforward measure for calculating lexical diversity and has been employed in numerous studies. These studies have demonstrated a strong correlation between TTR and other methods of measuring lexical diversity (e.g., Bentz et al. 2016 ; Čech and Miroslav, 2018 ; Çöltekin and Taraka, 2018 ). TTR is computed by considering both the number of unique words (types) and the total number of words (tokens) in a given text. Given that the length of learners’ writing texts can vary, this study employs the moving average type-token ratio (MATTR) to mitigate the influence of text length. MATTR is calculated using a 50-word moving window. Initially, a TTR is determined for words 1–50 in an essay, followed by words 2–51, 3–52, and so on until the end of the essay is reached (Díez-Ortega and Kyle, 2023 ). The final MATTR scores were obtained by averaging the TTR scores for all 50-word windows. The following formula was employed to derive MATTR:

\(\mathrm{MATTR}(W)=\frac{\sum_{i=1}^{N-W+1}F_{i}}{W(N-W+1)}\)

Here, N refers to the number of tokens in the text and W is the chosen window size (W < N). \({F}_{i}\) is the number of types in each window, and \(\mathrm{MATTR}(W)\) is the mean of the series of type-token ratios (TTRs), based on word form, across all windows. It is expected that individuals with higher language proficiency will produce texts with greater lexical diversity, as indicated by higher MATTR scores.
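A direct implementation of this windowed computation (with a plain-TTR fallback for texts shorter than the window; the function name is ours):

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio over fixed-size sliding windows."""
    n = len(tokens)
    if n < window:           # fall back to plain TTR for very short texts
        return len(set(tokens)) / n
    ttrs = [len(set(tokens[i:i + window])) / window
            for i in range(n - window + 1)]
    return sum(ttrs) / len(ttrs)

# With window=3, every window of ["a","a","b","a","b"] contains 2 types,
# so each per-window TTR is 2/3 and so is their mean.
print(mattr(["a", "a", "b", "a", "b"], window=3))
```
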

Lexical density was captured by the ratio of the number of lexical words to the total number of words (Lu, 2012). Lexical sophistication refers to the utilization of advanced vocabulary, often evaluated through word frequency indices (Crossley et al. 2013; Haberman, 2008; Kyle and Crossley, 2015; Laufer and Nation, 1995; Lu, 2012; Read, 2000). In the context of writing, lexical sophistication can be interpreted as vocabulary breadth, which entails the appropriate usage of vocabulary items across various lexico-grammatical contexts and registers (Garner et al. 2019; Kim et al. 2018; Kyle et al. 2018). In Japanese specifically, words are considered lexically sophisticated if they are not included in the “Japanese Education Vocabulary List Ver 1.0”. Footnote 4 Consequently, lexical sophistication was calculated by determining the number of sophisticated word types relative to the total number of words per essay. Furthermore, it has been suggested that, in Japanese writing, sentences should ideally be no more than 40 to 50 characters long, as this promotes readability. Therefore, the median and maximum sentence length can be considered useful indices for assessment (Ishioka and Kameda, 2006).
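Both ratios reduce to simple counts once tokens are POS-tagged and lemmatized (with a morphological analyzer such as MeCab in practice). The POS labels and the stand-in basic-vocabulary set below are illustrative; the study uses the Japanese Education Vocabulary List Ver 1.0:

```python
LEXICAL_POS = {"noun", "verb", "adjective", "adverb"}
BASIC_VOCAB = {"食べる", "行く", "犬"}   # stand-in for the reference word list

def lexical_density(tagged):
    """Share of lexical (content) words among all tokens.
    `tagged` is a list of (lemma, pos) pairs."""
    lexical = [w for w, pos in tagged if pos in LEXICAL_POS]
    return len(lexical) / len(tagged)

def lexical_sophistication(tagged):
    """Share of sophisticated word types (content lemmas absent from the
    basic vocabulary list) relative to the total number of tokens."""
    soph_types = {w for w, pos in tagged
                  if pos in LEXICAL_POS and w not in BASIC_VOCAB}
    return len(soph_types) / len(tagged)

essay = [("犬", "noun"), ("が", "particle"), ("弁当", "noun"),
         ("を", "particle"), ("食べる", "verb")]
print(lexical_density(essay), lexical_sophistication(essay))  # 0.6 0.2
```
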

Syntactic complexity was assessed based on several measures, including the mean length of clauses, verb phrases per T-unit, clauses per T-unit, dependent clauses per T-unit, complex nominals per clause, adverbial clauses per clause, coordinate phrases per clause, and mean dependency distance (MDD). The MDD reflects the distance between the governor and dependent positions in a sentence. A larger dependency distance indicates a higher cognitive load and greater complexity in syntactic processing (Liu, 2008; Liu et al. 2017). The MDD has been established as an efficient metric for measuring syntactic complexity (Jiang, Ouyang, and Liu, 2019; Li and Yan, 2021). To calculate the MDD, the position numbers of the governor and dependent are subtracted, assuming that words in a sentence are assigned positions in linear order, such as W1 … Wi … Wn. In any dependency relationship between words Wa and Wb, Wa is the governor and Wb is the dependent. The MDD of an entire sentence is obtained by averaging the absolute values of governor minus dependent positions:

MDD = \(\frac{1}{n}{\sum }_{i=1}^{n}|{\rm{D}}{{\rm{D}}}_{i}|\)

In this formula, \(n\) represents the number of dependency relationships in the sentence, and \(DD_{i}\) is the dependency distance of the \(i^{th}\) relationship. For example, for the sentence “Mary-ga John-ni keshigomu-o watashita” [Mary-top John-dat eraser-acc give-pst] “Mary gave John an eraser”, the MDD is 2. Table 3 provides the CSV file used as a prompt for GPT-4.
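Computing the MDD is then a single pass over the dependency pairs. The index convention (word positions counted from 1 in linear order) follows the description above; the toy tree is invented for illustration:

```python
def mean_dependency_distance(dependencies):
    """`dependencies` is a list of (governor_position, dependent_position)
    pairs, with word positions counted from 1 in linear order."""
    distances = [abs(gov - dep) for gov, dep in dependencies]
    return sum(distances) / len(distances)

# Toy tree: word 3 governs words 1 and 2; word 5 governs words 3 and 4.
deps = [(3, 1), (3, 2), (5, 3), (5, 4)]
print(mean_dependency_distance(deps))  # (2 + 1 + 2 + 1) / 4 = 1.5
```
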

Cohesion (semantic similarity) and content elaboration aim to capture the ideas presented in test takers’ essays. Cohesion was assessed using three measures: synonym overlap/paragraph (topic), synonym overlap/paragraph (keywords), and word2vec cosine similarity. Content elaboration and development were measured as the number of metadiscourse markers (types) divided by the number of words. To capture content closely, this study proposes a novel distance-based representation, encoding the cosine distance between i-vectors of the learner’s essay and of the essay task (topic and keywords). The learner’s essay is decoded into a word sequence and aligned to the essay task’s topic and keywords for log-likelihood measurement. The cosine distance yields the content elaboration score for the learner’s essay. The equation for the cosine similarity between target and reference vectors is shown in (11), where \(L=({L}_{1},\ldots ,{L}_{n})\) and \(N=({N}_{1},\ldots ,{N}_{n})\) are the vectors representing the learner’s essay and the task’s topic and keywords, respectively. The content elaboration distance between \(L\) and \(N\) was calculated as follows:

\(\cos \left(\theta \right)=\frac{{\rm{L}}\,\cdot\, {\rm{N}}}{\left|{\rm{L}}\right|{\rm{|N|}}}=\frac{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}{N}_{i}}{\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{L}_{i}^{2}}\sqrt{\mathop{\sum }\nolimits_{i=1}^{n}{N}_{i}^{2}}}\)

A high similarity value indicates a low difference between the two recognition outcomes, which in turn suggests a high level of proficiency in content elaboration.
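Equation (11) is the standard vector cosine; a minimal implementation:

```python
import math

def cosine_similarity(l_vec, n_vec):
    """Cosine of the angle between learner vector L and task vector N."""
    dot = sum(a * b for a, b in zip(l_vec, n_vec))
    norm = (math.sqrt(sum(a * a for a in l_vec))
            * math.sqrt(sum(b * b for b in n_vec)))
    return dot / norm

print(cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 1.0 (identical)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # 0.0 (orthogonal)
```
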

To evaluate the effectiveness of the proposed measures in distinguishing different proficiency levels among nonnative Japanese speakers’ writing, we conducted a multi-faceted Rasch measurement analysis (Linacre, 1994 ). This approach applies measurement models to thoroughly analyze various factors that can influence test outcomes, including test takers’ proficiency, item difficulty, and rater severity, among others. The underlying principles and functionality of multi-faceted Rasch measurement are illustrated in (12).

\(\log \left(\frac{{P}_{{nijk}}}{{P}_{{nij}(k-1)}}\right)={B}_{n}-{D}_{i}-{C}_{j}-{F}_{k}\)

(12) defines the logarithmic transformation of the probability ratio \({P}_{{nijk}}/{P}_{{nij}(k-1)}\) as a function of multiple parameters. Here, n represents the test taker, i denotes a writing proficiency measure, j corresponds to the human rater, and k represents the proficiency score. The parameter \({B}_{n}\) signifies the proficiency level of test taker n (where n ranges from 1 to N). \({D}_{i}\) represents the difficulty parameter of test item i (where i ranges from 1 to L), while \({C}_{j}\) represents the severity of rater j (where j ranges from 1 to J). Additionally, \({F}_{k}\) represents the step difficulty for a test taker to move from score k−1 to k. \({P}_{{nijk}}\) refers to the probability of rater j assigning score k to test taker n for test item i, and \({P}_{{nij}(k-1)}\) to the probability of that rater assigning score k−1. Each facet within the test is treated as an independent parameter and estimated within the same reference framework. To evaluate the consistency of scores obtained through both human and computer analysis, we utilized the Infit mean-square statistic, a chi-square measure divided by its degrees of freedom and weighted with information. It is particularly sensitive to unexpected patterns in responses to items near a person’s proficiency level (Linacre, 2002). Fit statistics are assessed against predefined thresholds for acceptable fit. For the Infit MNSQ, which has a mean of 1.00, different thresholds have been suggested: some propose stricter thresholds ranging from 0.7 to 1.3 (Bond et al. 2021), while others suggest more lenient thresholds ranging from 0.5 to 1.5 (Eckes, 2009). In this study, we adopted the criterion of 0.70–1.30 for the Infit MNSQ.
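Under (12), the adjacent-category log-odds accumulate into category probabilities: \(P_k\) is proportional to the exponential of the running sum of \((B_n - D_i - C_j - F_h)\) for h = 1..k, with the lowest category as baseline. A sketch of that computation (parameter values invented purely for illustration):

```python
import math

def rasch_category_probs(B, D, C, F):
    """Category probabilities for one test taker / item / rater under the
    many-facet Rasch model: log(P_k / P_{k-1}) = B - D - C - F_k,
    where F = [F_1, ..., F_K] are the step difficulties."""
    log_numerators = [0.0]                     # baseline category k = 0
    for f_k in F:
        log_numerators.append(log_numerators[-1] + (B - D - C - f_k))
    denom = sum(math.exp(x) for x in log_numerators)
    return [math.exp(x) / denom for x in log_numerators]

# Invented parameters: proficiency 1.0, item difficulty 0.2, rater
# severity 0.1, three step difficulties.
probs = rasch_category_probs(B=1.0, D=0.2, C=0.1, F=[-0.5, 0.0, 0.5])
print([round(p, 3) for p in probs])  # probabilities over 4 categories, sum 1
```
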

Moving forward, we can now proceed to assess the effectiveness of the 16 proposed measures, organized under five criteria, in accurately distinguishing various levels of writing proficiency among non-native Japanese speakers. To conduct this evaluation, we utilized the development dataset from the I-JAS corpus, as described in Section Dataset. Table 4 provides a measurement report that presents the performance details of the 16 measures under consideration. The measure separation was found to be 4.02, indicating a clear differentiation among the measures. The reliability index for the measure separation was 0.891, suggesting consistency in the measurement. Similarly, the person separation reliability index was 0.802, indicating the accuracy of the assessment in distinguishing between individuals. All 16 measures demonstrated Infit mean squares within a reasonable range, from 0.76 to 1.28. The synonym overlap/paragraph (topic) measure exhibited a relatively high outfit mean square of 1.46, although its Infit mean square falls within the acceptable range. The standard error for the measures ranged from 0.13 to 0.28, indicating the precision of the estimates.

Table 5 further illustrated the weights assigned to different linguistic measures for score prediction, with higher weights indicating stronger correlations between those measures and higher scores. Specifically, the following measures exhibited higher weights compared to others: moving average type token ratio per essay has a weight of 0.0391. Mean dependency distance had a weight of 0.0388. Mean length of clause, calculated by dividing the number of words by the number of clauses, had a weight of 0.0374. Complex nominals per T-unit, calculated by dividing the number of complex nominals by the number of T-units, had a weight of 0.0379. Coordinate phrases rate, calculated by dividing the number of coordinate phrases by the number of clauses, had a weight of 0.0325. Grammatical error rate, representing the number of errors per essay, had a weight of 0.0322.

Criteria (output indicator)

The criteria used to evaluate the writing ability in this study were based on CEFR, which follows a six-point scale ranging from A1 to C2. To assess the quality of Japanese writing, the scoring criteria from Table 6 were utilized. These criteria were derived from the IELTS writing standards and served as assessment guidelines and prompts for the written output.

A prompt is a question or detailed instruction provided to the model to obtain a proper response. After several pilot experiments, we decided to provide the measures (Section Measures of writing proficiency for nonnative Japanese) as the input prompt and use the criteria (Section Criteria (output indicator)) as the output indicator. Regarding the prompt language: given that the LLM was tasked with rating Japanese essays, would prompting in Japanese work better? Footnote 5 We conducted experiments comparing the performance of GPT-4 using both English and Japanese prompts. Additionally, we utilized the Japanese local model OCLL with Japanese prompts. Multiple trials were conducted using the same sample. Regardless of the prompt language used, we consistently obtained the same grading results with GPT-4, which assigned a grade of B1 to the writing sample. This suggested that GPT-4 is reliable and capable of producing consistent ratings regardless of the prompt language. On the other hand, when we used Japanese prompts with the Japanese local model OCLL, we encountered inconsistent grading results. Out of 10 attempts with OCLL, only 6 yielded consistent grading results (B1), while the remaining 4 showed different outcomes, including A1 and B2 grades. These findings indicated that the language of the prompt was not the determining factor for reliable AES. Instead, the size of the training data and the model parameters played crucial roles in achieving consistent and reliable AES results for the language model.

The following is the utilized prompt, which details all measures and requires the LLM to score the essays using holistic and trait scores.

Please evaluate Japanese essays written by Japanese learners and assign a score to each essay on a six-point scale, ranging from A1, A2, B1, B2, C1 to C2. Additionally, please provide trait scores and display the calculation process for each trait score. The scoring should be based on the following criteria:

Moving average type-token ratio.

Number of lexical words (token) divided by the total number of words per essay.

Number of sophisticated word types divided by the total number of words per essay.

Mean length of clause.

Verb phrases per T-unit.

Clauses per T-unit.

Dependent clauses per T-unit.

Complex nominals per clause.

Adverbial clauses per clause.

Coordinate phrases per clause.

Mean dependency distance.

Synonym overlap paragraph (topic and keywords).

Word2vec cosine similarity.

Connectives per essay.

Conjunctions per essay.

Number of metadiscourse markers (types) divided by the total number of words.

Number of errors per essay.

Japanese essay text

出かける前に二人が地図を見ている間に、サンドイッチを入れたバスケットに犬が入ってしまいました。それに気づかずに二人は楽しそうに出かけて行きました。やがて突然犬がバスケットから飛び出し、二人は驚きました。バスケット の 中を見ると、食べ物はすべて犬に食べられていて、二人は困ってしまいました。(ID_JJJ01_SW1)

The score of the example above was B1. Figure 3 provides an example of holistic and trait scores provided by GPT-4 (with a prompt indicating all measures) via Bing Footnote 6 .

figure 3

Example of GPT-4 AES and feedback (with a prompt indicating all measures).
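The prompt above can be assembled programmatically before being submitted to a chat-style LLM. The sketch below is our illustration, not the authors' code: the `MEASURES` list is abbreviated, and the commented-out call shows one way to submit the prompt, assuming the OpenAI Python client.

```python
# Illustrative assembly of the scoring prompt (abbreviated measure list).
MEASURES = [
    "Moving average type-token ratio",
    "Mean length of clause",
    "Mean dependency distance",
    "Number of errors per essay",
    # ... remaining measures from the list above
]

def build_prompt(essay: str) -> str:
    """Combine the scoring instructions, criteria, and essay into one prompt."""
    criteria = "\n".join(f"- {m}" for m in MEASURES)
    return (
        "Please evaluate Japanese essays written by Japanese learners and "
        "assign a score on a six-point scale (A1, A2, B1, B2, C1, C2). "
        "Provide trait scores and show the calculation for each.\n"
        f"Scoring criteria:\n{criteria}\n\nEssay:\n{essay}"
    )

prompt = build_prompt("出かける前に二人が地図を見ている間に…")
print("six-point scale" in prompt)

# To obtain a score (requires an API key; shown here as an assumption about
# the OpenAI client, not the setup used in the study):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-4", messages=[{"role": "user", "content": prompt}])
# print(resp.choices[0].message.content)
```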

Statistical analysis

The aim of this study is to investigate the potential use of LLM for nonnative Japanese AES. It seeks to compare the scoring outcomes obtained from feature-based AES tools, which rely on conventional machine learning technology (i.e. Jess, JWriter), with those generated by AI-driven AES tools utilizing deep learning technology (BERT, GPT, OCLL). To assess the reliability of a computer-assisted annotation tool, the study initially established human-human agreement as the benchmark measure. Subsequently, the performance of the LLM-based method was evaluated by comparing it to human-human agreement.

To assess annotation agreement, the study employed the standard measures of precision, recall, and F-score (Brants 2000 ; Lu 2010 ), along with the quadratically weighted kappa (QWK), to evaluate consistency and agreement in the annotation process. Let A and B represent the two human annotators; comparing their annotations yields the following results. Precision, recall, and F-score are defined in equations (13) to (15).

\({\rm{Recall}}(A,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,A}\)

\({\rm{Precision}}(A,\,B)=\frac{{\rm{Number}}\,{\rm{of}}\,{\rm{identical}}\,{\rm{nodes}}\,{\rm{in}}\,A\,{\rm{and}}\,B}{{\rm{Number}}\,{\rm{of}}\,{\rm{nodes}}\,{\rm{in}}\,B}\)

The F-score is the harmonic mean of recall and precision:

\({\rm{F}}-{\rm{score}}=\frac{2* ({\rm{Precision}}* {\rm{Recall}})}{{\rm{Precision}}+{\rm{Recall}}}\)

The highest possible value of an F-score is 1.0, indicating perfect precision and recall; the lowest possible value is 0, which occurs if either precision or recall is zero.
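The three metrics above can be sketched as follows (illustrative code treating each annotator's output as a multiset of annotated nodes; the clause labels are hypothetical):

```python
# Recall, precision, and F-score between two annotators, per the
# equations above: "identical nodes" are the annotations shared by both.
from collections import Counter

def agreement(a: list, b: list) -> tuple:
    """Return (recall, precision, F-score) between annotators A and B."""
    identical = sum((Counter(a) & Counter(b)).values())
    recall = identical / len(a)
    precision = identical / len(b)
    f = 0.0 if identical == 0 else 2 * precision * recall / (precision + recall)
    return recall, precision, f

# Hypothetical clause-boundary annotations for one essay
a = ["c1", "c2", "c3", "c4"]
b = ["c1", "c2", "c3", "c5"]
r, p, f = agreement(a, b)
print(r, p, round(f, 3))  # 0.75 0.75 0.75
```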

In accordance with Taghipour and Ng ( 2016 ), the calculation of QWK involves two steps:

Step 1: Construct a weight matrix W as follows:

\({W}_{{ij}}=\frac{{(i-j)}^{2}}{{(N-1)}^{2}}\)

Here, i represents the annotation made by the tool, j the annotation made by a human rater, and N the total number of possible annotations. The matrix O is then computed, where \({O}_{i,j}\) denotes the count of data annotated as i by the tool and as j by the human annotator. E is the expected count matrix, normalized so that the sum of its elements matches the sum of the elements in O.

Step 2: With matrices O and E, the QWK is obtained as follows:

\({\rm{K}}=1-\frac{{\sum }_{i,j}{W}_{i,j}\,{O}_{i,j}}{{\sum }_{i,j}{W}_{i,j}\,{E}_{i,j}}\)
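The two steps above can be sketched in Python as follows (an illustrative implementation, not the authors' script; the six CEFR levels are mapped to integers 0–5):

```python
# Quadratically weighted kappa between tool and human ratings,
# following the two-step construction above (weight matrix W, observed
# matrix O, expected matrix E normalized to the same total as O).
import numpy as np

def qwk(tool: list, human: list, n_levels: int = 6) -> float:
    tool = np.asarray(tool); human = np.asarray(human)
    # Step 1: weight matrix W_ij = (i - j)^2 / (N - 1)^2
    idx = np.arange(n_levels)
    W = (idx[:, None] - idx[None, :]) ** 2 / (n_levels - 1) ** 2
    # Observed matrix O: counts of (tool, human) label pairs
    O = np.zeros((n_levels, n_levels))
    for i, j in zip(tool, human):
        O[i, j] += 1
    # Expected matrix E: outer product of marginals, scaled to sum(O)
    E = np.outer(np.bincount(tool, minlength=n_levels),
                 np.bincount(human, minlength=n_levels)).astype(float)
    E *= O.sum() / E.sum()
    # Step 2: K = 1 - sum(W*O) / sum(W*E)
    return 1 - (W * O).sum() / (W * E).sum()

print(qwk([0, 1, 2, 3], [0, 1, 2, 3]))  # 1.0 for perfect agreement
```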

The value of the quadratic weighted kappa increases as the level of agreement improves. Further, to assess the accuracy of LLM scoring, the proportional reductive mean square error (PRMSE) was employed. The PRMSE approach takes into account the variability observed in human ratings to estimate the rater error, which is then subtracted from the variance of the human labels. This calculation provides an overall measure of agreement between the automated scores and true scores (Haberman et al. 2015 ; Loukina et al. 2020 ; Taghipour and Ng, 2016 ). The computation of PRMSE involves the following steps:

Step 1: Calculate the mean squared errors (MSEs) for the scoring outcomes of the computer-assisted tool (MSE tool) and the human scoring outcomes (MSE human).

Step 2: Determine the PRMSE by comparing the MSE of the computer-assisted tool (MSE tool) with the MSE from human raters (MSE human), using the following formula:

\({\rm{PRMSE}}=1-\frac{{{\rm{MSE}}}_{{\rm{tool}}}}{{{\rm{MSE}}}_{{\rm{human}}}}=1-\frac{{\sum }_{i=1}^{n}{({y}_{i}-{\hat{y}}_{i})}^{2}}{{\sum }_{i=1}^{n}{({y}_{i}-\bar{y})}^{2}}\)

In the numerator, ŷ i is the score predicted by the LLM-driven AES system for sample i, so y i − ŷ i is the difference between the human score and the system's prediction for that sample. In the denominator, y i − ȳ is the difference between the human score for sample i and the mean of all human scores, measuring the spread of the human ratings around their average. The PRMSE is obtained by subtracting the ratio of MSE tool to MSE human from 1. PRMSE falls within the range of 0 to 1, with larger values indicating smaller errors in the LLM's scoring relative to the variability of the human ratings; in other words, a higher PRMSE implies that the LLM's scores more accurately predict the true scores (Loukina et al. 2020 ). The interpretation of kappa values follows Landis and Koch ( 1977 ): −1 indicates complete disagreement, 0 indicates random agreement, 0.00–0.20 slight agreement, 0.21–0.40 fair agreement, 0.41–0.60 moderate agreement, 0.61–0.80 substantial agreement, and 0.81–1.00 almost perfect agreement. All statistical analyses were executed using Python scripts.
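The two-step PRMSE computation can be sketched as follows (our illustrative code with hypothetical scores, not the authors' script; CEFR levels are mapped to integers):

```python
# PRMSE per the formula above: 1 - MSE_tool / MSE_human, where MSE_tool
# sums squared prediction errors and MSE_human sums squared deviations of
# the human scores from their mean.
import numpy as np

def prmse(human: list, predicted: list) -> float:
    y = np.asarray(human, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    mse_tool = ((y - y_hat) ** 2).sum()    # Step 1: tool error
    mse_human = ((y - y.mean()) ** 2).sum()  # Step 1: human variability
    return 1 - mse_tool / mse_human        # Step 2

# Hypothetical human scores vs. model predictions (CEFR mapped to 0..5)
print(round(prmse([2, 3, 1, 4, 2], [2, 3, 2, 4, 2]), 3))  # → 0.808
```

A PRMSE near 1 means the tool's predictions deviate from the human scores far less than the human scores deviate from their own mean.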

Results and discussion

Annotation reliability of the LLM

This section focuses on assessing the reliability of the LLM’s annotation and scoring capabilities. To evaluate the reliability, several tests were conducted simultaneously, aiming to achieve the following objectives:

Assess the LLM’s ability to differentiate between test takers with varying levels of writing proficiency.

Determine the level of agreement between the annotations and scoring performed by the LLM and those done by human raters.

The evaluation of the results encompassed several metrics, including: precision, recall, F-Score, quadratically-weighted kappa, proportional reduction of mean squared error, Pearson correlation, and multi-faceted Rasch measurement.

Inter-annotator agreement (human–human annotator agreement)

We started with an agreement test between the two human annotators. Two trained annotators were recruited to annotate the writing-task data for all measures. A total of 714 scripts served as the test data, and each annotation pass took 300–360 min. Inter-annotator agreement was evaluated using the standard measures of precision, recall, F-score, and QWK. Table 7 presents the inter-annotator agreement for the various indicators. As shown, the inter-annotator agreement was fairly high, with F-scores ranging from 1.0 for sentence and word number to 0.666 for grammatical errors.

The findings from the QWK analysis provided further confirmation of the inter-annotator agreement. The QWK values ranged from 0.950 ( p  = 0.000) for sentence and word number to 0.695 ( p  = 0.001) for synonym overlap number (keyword) and grammatical errors.

Agreement of annotation outcomes between human and LLM

To evaluate the consistency between human annotators and LLM annotators (BERT, GPT, OCLL) across the indices, the same test was conducted. The results of the inter-annotator agreement (F-score) between LLM and human annotation are provided in Appendix B-D. The F-scores ranged from 0.706 for grammatical error number (OCLL-human) to a perfect 1.000 (GPT-human) for sentences, clauses, T-units, and words. These findings were further supported by the QWK analysis, which showed agreement levels ranging from 0.807 ( p  = 0.001) for metadiscourse markers (OCLL-human) to 0.962 ( p  = 0.000) for words (GPT-human). The findings demonstrated that the LLM annotation achieved a high level of accuracy in identifying measurement units and counts.

Reliability of LLM-driven AES’s scoring and discriminating proficiency levels

This section examines the reliability of the LLM-driven AES scoring through a comparison of the scoring outcomes produced by human raters and the LLM ( Reliability of LLM-driven AES scoring ). It also assesses the effectiveness of the LLM-based AES system in differentiating participants with varying proficiency levels ( Reliability of LLM-driven AES discriminating proficiency levels ).

Reliability of LLM-driven AES scoring

Table 8 summarizes the QWK coefficient analysis between the scores computed by the human raters and GPT-4 for the individual essays from I-JAS Footnote 7 . As shown, the QWK across all measures ranged from k  = 0.819 for lexical density (number of lexical words (tokens)/number of words per essay) to k  = 0.644 for word2vec cosine similarity. Table 9 further presents the Pearson correlations between the 16 writing proficiency measures scored by human raters and GPT-4 for the individual essays. The correlations ranged from 0.672 for syntactic complexity to 0.734 for grammatical accuracy. The correlations between the writing proficiency scores assigned by human raters and the BERT-based AES system ranged from 0.661 for syntactic complexity to 0.713 for grammatical accuracy, and those between human raters and the OCLL-based AES system ranged from 0.654 for cohesion to 0.721 for grammatical accuracy. These findings indicated an alignment between the assessments made by human raters and both the BERT-based and OCLL-based AES systems across various aspects of writing proficiency.
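The correlation analysis can be sketched as follows (hypothetical human and GPT-4 scores for one proficiency measure; `numpy.corrcoef` stands in for whichever routine the authors used):

```python
# Pearson correlation between human and model scores for one measure.
# The score vectors are hypothetical, with CEFR levels mapped to 0..5.
import numpy as np

human = [2, 3, 1, 4, 2, 5, 3]
gpt4 = [2, 3, 2, 4, 2, 4, 3]

# corrcoef returns the 2x2 correlation matrix; the off-diagonal entry
# is the Pearson r between the two vectors.
r = np.corrcoef(human, gpt4)[0, 1]
print(round(r, 3))
```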

Reliability of LLM-driven AES discriminating proficiency levels

After validating the reliability of the LLM’s annotation and scoring, the subsequent objective was to evaluate its ability to distinguish between various proficiency levels. For this analysis, a dataset of 686 individual essays was utilized. Table 10 presents a sample of the results, summarizing the means, standard deviations, and the outcomes of the one-way ANOVAs based on the measures assessed by the GPT-4 model. A post hoc multiple comparison test, specifically the Bonferroni test, was conducted to identify any potential differences between pairs of levels.

As the results reveal, seven measures showed linear upward or downward progress across the three proficiency levels. These are marked in bold in Table 10 and comprise one measure of lexical richness, i.e. MATTR (lexical diversity); four measures of syntactic complexity, i.e. MDD (mean dependency distance), MLC (mean length of clause), CNT (complex nominals per T-unit), and CPC (coordinate phrases rate); one cohesion measure, i.e. word2vec cosine similarity; and one accuracy measure, i.e. GER (grammatical error rate). Regarding the ability of the sixteen measures to distinguish adjacent proficiency levels, the Bonferroni tests indicated statistically significant differences between the primary and intermediate levels for MLC and GER. One measure of lexical richness, namely LD, along with four measures of syntactic complexity (VPT, CT, DCT, ACC), two measures of cohesion (SOPT, SOPK), and one measure of content elaboration (IMM), exhibited statistically significant differences between proficiency levels; however, these differences did not follow a linear progression between adjacent levels. No significant difference was observed in lexical sophistication between proficiency levels.
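The group comparison described above can be sketched as follows (our illustrative code, not the study's script, with hypothetical mean-length-of-clause values per level; for the post hoc step, the Bonferroni correction simply divides alpha by the number of pairwise comparisons, here 0.05 / 3):

```python
# One-way ANOVA across three proficiency levels, implemented directly
# from the sums of squares (scipy.stats.f_oneway gives the same F).
from statistics import mean

def one_way_anova(groups: dict) -> float:
    """Return the F statistic for a one-way ANOVA over a dict of samples."""
    all_vals = [v for g in groups.values() for v in g]
    grand = mean(all_vals)
    k, n = len(groups), len(all_vals)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical MLC values per proficiency level
groups = {
    "primary":      [6.1, 5.8, 6.4, 6.0, 5.9],
    "intermediate": [7.2, 7.5, 6.9, 7.1, 7.4],
    "advanced":     [8.0, 8.3, 7.8, 8.1, 7.9],
}
print(round(one_way_anova(groups), 2))
# Bonferroni-corrected threshold for the 3 pairwise post hoc tests:
alpha_corrected = 0.05 / 3
```

A large F with a significant p-value licenses the pairwise post hoc tests, each judged against the corrected threshold.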

To summarize, our study aimed to evaluate the reliability and differentiation capabilities of the LLM-driven AES method. For the first objective, we assessed the LLM’s ability to differentiate between test takers with varying levels of writing proficiency using precision, recall, F-score, and quadratically weighted kappa. Regarding the second objective, we compared the scoring outcomes generated by human raters and the LLM to determine the level of agreement, employing quadratically weighted kappa and Pearson correlations across the 16 writing proficiency measures for the individual essays. The results confirmed the feasibility of using the LLM for annotation and scoring in AES for nonnative Japanese. As a result, Research Question 1 has been addressed.

Comparison of BERT-, GPT-, OCLL-based AES, and linguistic-feature-based computation methods

This section aims to compare the effectiveness of five AES methods for nonnative Japanese writing: LLM-driven approaches utilizing BERT, GPT, and OCLL, and linguistic-feature-based approaches using Jess and JWriter. The comparison was conducted by contrasting the ratings obtained from each approach with human ratings. All ratings were derived from the dataset introduced in Section Dataset . To facilitate the comparison, the agreement between the automated methods and human ratings was assessed using QWK and PRMSE. The performance of each approach is summarized in Table 11 .

The QWK coefficient values indicate that LLMs (GPT, BERT, OCLL) and human rating outcomes demonstrated higher agreement compared to feature-based AES methods (Jess and JWriter) in assessing writing proficiency criteria, including lexical richness, syntactic complexity, content, and grammatical accuracy. Among the LLMs, the GPT-4 driven AES and human rating outcomes showed the highest agreement in all criteria, except for syntactic complexity. The PRMSE values suggest that the GPT-based method outperformed linguistic feature-based methods and other LLM-based approaches. Moreover, an interesting finding emerged during the study: the agreement coefficient between GPT-4 and human scoring was even higher than the agreement between different human raters themselves. This discovery highlights the advantage of GPT-based AES over human rating. Ratings involve a series of processes, including reading the learners’ writing, evaluating the content and language, and assigning scores. Within this chain of processes, various biases can be introduced, stemming from factors such as rater biases, test design, and rating scales. These biases can impact the consistency and objectivity of human ratings. GPT-based AES may benefit from its ability to apply consistent and objective evaluation criteria. By prompting the GPT model with detailed writing scoring rubrics and linguistic features, potential biases in human ratings can be mitigated. The model follows a predefined set of guidelines and does not possess the same subjective biases that human raters may exhibit. This standardization in the evaluation process contributes to the higher agreement observed between GPT-4 and human scoring. Section Prompt strategy of the study delves further into the role of prompts in the application of LLMs to AES. It explores how the choice and implementation of prompts can impact the performance and reliability of LLM-based AES methods. 
Furthermore, it is important to acknowledge the strengths of the local model, i.e. the Japanese local model OCLL, which excels in processing certain idiomatic expressions. Nevertheless, our analysis indicated that GPT-4 surpasses local models in AES. This superior performance can be attributed to the larger parameter size of GPT-4, estimated to be between 500 billion and 1 trillion, which exceeds the sizes of both BERT and the local model OCLL.

Prompt strategy

In the context of prompt strategy, Mizumoto and Eguchi ( 2023 ) conducted a study where they applied the GPT-3 model to automatically score English essays in the TOEFL test. They found that the accuracy of the GPT model alone was moderate to fair. However, when they incorporated linguistic measures such as cohesion, syntactic complexity, and lexical features alongside the GPT model, the accuracy significantly improved. This highlights the importance of prompt engineering and providing the model with specific instructions to enhance its performance. In this study, a similar approach was taken to optimize the performance of LLMs. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. Model 1 was used as the baseline, representing GPT-4 without any additional prompting. Model 2, on the other hand, involved GPT-4 prompted with 16 measures that included scoring criteria, efficient linguistic features for writing assessment, and detailed measurement units and calculation formulas. The remaining models (Models 3 to 18) utilized GPT-4 prompted with individual measures. The performance of these 18 different models was assessed using the output indicators described in Section Criteria (output indicator) . By comparing the performances of these models, the study aimed to understand the impact of prompt engineering on the accuracy and effectiveness of GPT-4 in AES tasks.

Based on the PRMSE scores presented in Fig. 4 , it was observed that Model 1, representing GPT-4 without any additional prompting, achieved a fair level of performance. However, Model 2, which utilized GPT-4 prompted with all measures, outperformed all other models in terms of PRMSE score, achieving a score of 0.681. These results indicate that the inclusion of specific measures and prompts significantly enhanced the performance of GPT-4 in AES. Among the measures, syntactic complexity was found to play a particularly significant role in improving the accuracy of GPT-4 in assessing writing quality. Following that, lexical diversity emerged as another important factor contributing to the model’s effectiveness. The study suggests that a well-prompted GPT-4 can serve as a valuable tool to support human assessors in evaluating writing quality. By utilizing GPT-4 as an automated scoring tool, the evaluation biases associated with human raters can be minimized. This has the potential to empower teachers by allowing them to focus on designing writing tasks and guiding writing strategies, while leveraging the capabilities of GPT-4 for efficient and reliable scoring.

figure 4

PRMSE scores of the 18 AES models.

This study aimed to investigate two main research questions: the feasibility of utilizing LLMs for AES and the impact of prompt engineering on the application of LLMs in AES.

To address the first objective, the study compared the effectiveness of five different models: GPT, BERT, the Japanese local LLM (OCLL), and two conventional machine learning-based AES tools (Jess and JWriter). The PRMSE values indicated that the GPT-4-based method outperformed other LLMs (BERT, OCLL) and linguistic feature-based computational methods (Jess and JWriter) across various writing proficiency criteria. Furthermore, the agreement coefficient between GPT-4 and human scoring surpassed the agreement among human raters themselves, highlighting the potential of using the GPT-4 tool to enhance AES by reducing biases and subjectivity, saving time, labor, and cost, and providing valuable feedback for self-study. Regarding the second goal, the role of prompt design was investigated by comparing 18 models, including a baseline model, a model prompted with all measures, and 16 models prompted with one measure at a time. GPT-4, which outperformed BERT and OCLL, was selected as the candidate model. The PRMSE scores of the models showed that GPT-4 prompted with all measures achieved the best performance, surpassing the baseline and other models.

In conclusion, this study has demonstrated the potential of LLMs in supporting human rating in assessments. By incorporating automation, we can save time and resources while reducing biases and subjectivity inherent in human rating processes. Automated language assessments offer the advantage of accessibility, providing equal opportunities and economic feasibility for individuals who lack access to traditional assessment centers or necessary resources. LLM-based language assessments provide valuable feedback and support to learners, aiding in the enhancement of their language proficiency and the achievement of their goals. This personalized feedback can cater to individual learner needs, facilitating a more tailored and effective language-learning experience.

Two important areas merit further exploration. First, prompt engineering requires attention to ensure optimal performance of LLM-based AES across different language types. This study revealed that GPT-4, when prompted with all measures, outperformed models prompted with fewer measures; investigating and refining prompt strategies can therefore enhance the effectiveness of LLMs in automated language assessments. Second, it is crucial to explore the application of LLMs to second-language assessment and learning for oral proficiency, as well as their potential for under-resourced languages. Recent advancements in self-supervised machine learning have significantly improved automatic speech recognition (ASR) systems, opening up new possibilities for creating reliable ASR systems, particularly for under-resourced languages with limited data. Challenges persist in ASR, however: (1) automatic pronunciation evaluation assumes correct word pronunciation, which is problematic for learners in the early stages of language acquisition whose accents are influenced by their native languages, making accurate segmentation of short words difficult; (2) developing precise audio-text transcriptions for non-native accented speech remains a formidable task; and (3) assessing oral proficiency involves capturing linguistic features such as fluency, pronunciation, accuracy, and complexity that are not easily captured by current NLP technology.

Data availability

The dataset utilized was obtained from the International Corpus of Japanese as a Second Language (I-JAS), available at https://www2.ninjal.ac.jp/jll/lsaj/ihome2.html .

J-CAT and TTBJ are two computerized adaptive tests used to assess Japanese language proficiency.

SPOT is a specific component of the TTBJ test.

J-CAT: https://www.j-cat2.org/html/ja/pages/interpret.html

SPOT: https://ttbj.cegloc.tsukuba.ac.jp/p1.html#SPOT .

The study utilized a prompt-based GPT-4 model, developed by OpenAI, which has an impressive architecture with 1.8 trillion parameters across 120 layers. GPT-4 was trained on a vast dataset of 13 trillion tokens, using two stages: initial training on internet text datasets to predict the next token, and subsequent fine-tuning through reinforcement learning from human feedback.

https://www2.ninjal.ac.jp/jll/lsaj/ihome2-en.html .

http://jhlee.sakura.ne.jp/JEV/ by Japanese Learning Dictionary Support Group 2015.

We express our sincere gratitude to the reviewer for bringing this matter to our attention.

On February 7, 2023, Microsoft began rolling out a major overhaul to Bing that included a new chatbot feature based on OpenAI’s GPT-4 (Bing.com).

Appendix E-F present the analysis results of the QWK coefficient between the scores computed by the human raters and the BERT, OCLL models.

Attali Y, Burstein J (2006) Automated essay scoring with e-rater® V.2. J. Technol., Learn. Assess., 4

Barkaoui K, Hadidi A (2020) Assessing Change in English Second Language Writing Performance (1st ed.). Routledge, New York. https://doi.org/10.4324/9781003092346

Bentz C, Tatyana R, Koplenig A, Tanja S (2016) A comparison between morphological complexity. measures: Typological data vs. language corpora. In Proceedings of the workshop on computational linguistics for linguistic complexity (CL4LC), 142–153. Osaka, Japan: The COLING 2016 Organizing Committee

Bond TG, Yan Z, Heene M (2021) Applying the Rasch model: Fundamental measurement in the human sciences (4th ed). Routledge

Brants T (2000) Inter-annotator agreement for a German newspaper corpus. Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00), Athens, Greece, 31 May-2 June, European Language Resources Association

Brown TB, Mann B, Ryder N, et al. (2020) Language models are few-shot learners. Advances in Neural Information Processing Systems, Online, 6–12 December, Curran Associates, Inc., Red Hook, NY

Burstein J (2003) The E-rater scoring engine: Automated essay scoring with natural language processing. In Shermis MD and Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Čech R, Miroslav K (2018) Morphological richness of text. In Masako F, Václav C (ed) Taming the corpus: From inflection and lexis to interpretation, 63–77. Cham, Switzerland: Springer Nature

Çöltekin Ç, Taraka, R (2018) Exploiting Universal Dependencies treebanks for measuring morphosyntactic complexity. In Aleksandrs B, Christian B (ed), Proceedings of first workshop on measuring language complexity, 1–7. Torun, Poland

Crossley SA, Cobb T, McNamara DS (2013) Comparing count-based and band-based indices of word frequency: Implications for active vocabulary research and pedagogical applications. System 41:965–981. https://doi.org/10.1016/j.system.2013.08.002


Crossley SA, McNamara DS (2016) Say more and be more coherent: How text elaboration and cohesion can increase writing quality. J. Writ. Res. 7:351–370

CyberAgent Inc (2023) Open-Calm series of Japanese language models. Retrieved from: https://www.cyberagent.co.jp/news/detail/id=28817

Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, Minnesota, 2–7 June, pp. 4171–4186. Association for Computational Linguistics

Diez-Ortega M, Kyle K (2023) Measuring the development of lexical richness of L2 Spanish: a longitudinal learner corpus study. Studies in Second Language Acquisition 1-31

Eckes T (2009) On common ground? How raters perceive scoring criteria in oral proficiency testing. In Brown A, Hill K (ed) Language testing and evaluation 13: Tasks and criteria in performance assessment (pp. 43–73). Peter Lang Publishing

Elliot S (2003) IntelliMetric: from here to validity. In: Shermis MD, Burstein JC (ed) Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ


Engber CA (1995) The relationship of lexical proficiency to the quality of ESL compositions. J. Second Lang. Writ. 4:139–155

Garner J, Crossley SA, Kyle K (2019) N-gram measures and L2 writing proficiency. System 80:176–187. https://doi.org/10.1016/j.system.2018.12.001

Haberman SJ (2008) When can subscores have value? J. Educat. Behav. Stat., 33:204–229

Haberman SJ, Yao L, Sinharay S (2015) Prediction of true test scores from observed item scores and ancillary data. Brit. J. Math. Stat. Psychol. 68:363–385

Halliday MAK (1985) Spoken and Written Language. Deakin University Press, Melbourne, Australia

Hirao R, Arai M, Shimanaka H et al. (2020) Automated essay scoring system for nonnative Japanese learners. Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), pp. 1250–1257. European Language Resources Association

Hunt KW (1966) Recent Measures in Syntactic Development. Elementary English, 43(7), 732–739. http://www.jstor.org/stable/41386067

Ishioka T (2001) About e-rater, a computer-based automatic scoring system for essays [Konpyūta ni yoru essei no jidō saiten shisutemu e − rater ni tsuite]. University Entrance Examination. Forum [Daigaku nyūshi fōramu] 24:71–76

Hochreiter S, Schmidhuber J (1997) Long short- term memory. Neural Comput. 9(8):1735–1780


Ishioka T, Kameda M (2006) Automated Japanese essay scoring system based on articles written by experts. Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, 17–18 July 2006, pp. 233-240. Association for Computational Linguistics, USA

Japan Foundation (2021) Retrieved from: https://www.jpf.gp.jp/j/project/japanese/survey/result/dl/survey2021/all.pdf

Jarvis S (2013a) Defining and measuring lexical diversity. In Jarvis S, Daller M (ed) Vocabulary knowledge: Human ratings and automated measures (Vol. 47, pp. 13–44). John Benjamins. https://doi.org/10.1075/sibil.47.03ch1

Jarvis S (2013b) Capturing the diversity in lexical diversity. Lang. Learn. 63:87–106. https://doi.org/10.1111/j.1467-9922.2012.00739.x

Jiang J, Quyang J, Liu H (2019) Interlanguage: A perspective of quantitative linguistic typology. Lang. Sci. 74:85–97

Kim M, Crossley SA, Kyle K (2018) Lexical sophistication as a multidimensional phenomenon: Relations to second language lexical proficiency, development, and writing quality. Mod. Lang. J. 102(1):120–141. https://doi.org/10.1111/modl.12447

Kojima T, Gu S, Reid M et al. (2022) Large language models are zero-shot reasoners. Advances in Neural Information Processing Systems, New Orleans, LA, 29 November-1 December, Curran Associates, Inc., Red Hook, NY

Kyle K, Crossley SA (2015) Automatically assessing lexical sophistication: Indices, tools, findings, and application. TESOL Q 49:757–786

Kyle K, Crossley SA, Berger CM (2018) The tool for the automatic analysis of lexical sophistication (TAALES): Version 2.0. Behav. Res. Methods 50:1030–1046. https://doi.org/10.3758/s13428-017-0924-4


Kyle K, Crossley SA, Jarvis S (2021) Assessing the validity of lexical diversity using direct judgements. Lang. Assess. Q. 18:154–170. https://doi.org/10.1080/15434303.2020.1844205

Landauer TK, Laham D, Foltz PW (2003) Automated essay scoring and annotation of essays with the Intelligent Essay Assessor. In Shermis MD, Burstein JC (ed), Automated Essay Scoring: A Cross-Disciplinary Perspective. Lawrence Erlbaum Associates, Mahwah, NJ

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 159–174

Laufer B, Nation P (1995) Vocabulary size and use: Lexical richness in L2 written production. Appl. Linguist. 16:307–322. https://doi.org/10.1093/applin/16.3.307

Lee J, Hasebe Y (2017) jWriter Learner Text Evaluator, URL: https://jreadability.net/jwriter/

Lee J, Kobayashi N, Sakai T, Sakota K (2015) A Comparison of SPOT and J-CAT Based on Test Analysis [Tesuto bunseki ni motozuku ‘SPOT’ to ‘J-CAT’ no hikaku]. Research on the Acquisition of Second Language Japanese [Dainigengo to shite no nihongo no shūtoku kenkyū] (18) 53–69

Li W, Yan J (2021) Probability distribution of dependency distance based on a Treebank of. Japanese EFL Learners’ Interlanguage. J. Quant. Linguist. 28(2):172–186. https://doi.org/10.1080/09296174.2020.1754611

Article   MathSciNet   Google Scholar  

Linacre JM (2002) Optimizing rating scale category effectiveness. J. Appl. Meas. 3(1):85–106

PubMed   Google Scholar  

Linacre JM (1994) Constructing measurement with a Many-Facet Rasch Model. In Wilson M (ed) Objective measurement: Theory into practice, Volume 2 (pp. 129–144). Norwood, NJ: Ablex

Liu H (2008) Dependency distance as a metric of language comprehension difficulty. J. Cognitive Sci. 9:159–191

Liu H, Xu C, Liang J (2017) Dependency distance: A new perspective on syntactic patterns in natural languages. Phys. Life Rev. 21. https://doi.org/10.1016/j.plrev.2017.03.002

Loukina A, Madnani N, Cahill A, et al. (2020) Using PRMSE to evaluate automated scoring systems in the presence of label noise. Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, Seattle, WA, USA → Online, 10 July, pp. 18–29. Association for Computational Linguistics

Lu X (2010) Automatic analysis of syntactic complexity in second language writing. Int. J. Corpus Linguist. 15:474–496

Lu X (2012) The relationship of lexical richness to the quality of ESL learners’ oral narratives. Mod. Lang. J. 96:190–208

Lu X (2017) Automated measurement of syntactic complexity in corpus-based L2 writing research and implications for writing assessment. Lang. Test. 34:493–511

Lu X, Hu R (2022) Sense-aware lexical sophistication indices and their relationship to second language writing quality. Behav. Res. Method. 54:1444–1460. https://doi.org/10.3758/s13428-021-01675-6

Ministry of Health, Labor, and Welfare of Japan (2022) Retrieved from: https://www.mhlw.go.jp/stf/newpage_30367.html

Mizumoto A, Eguchi M (2023) Exploring the potential of using an AI language model for automated essay scoring. Res. Methods Appl. Linguist. 3:100050

Okgetheng B, Takeuchi K (2024) Estimating Japanese Essay Grading Scores with Large Language Models. Proceedings of 30th Annual Conference of the Language Processing Society in Japan, March 2024

Ortega L (2015) Second language learning explained? SLA across 10 contemporary theories. In VanPatten B, Williams J (ed) Theories in Second Language Acquisition: An Introduction

Rae JW, Borgeaud S, Cai T, et al. (2021) Scaling Language Models: Methods, Analysis & Insights from Training Gopher. ArXiv, abs/2112.11446

Read J (2000) Assessing vocabulary. Cambridge University Press. https://doi.org/10.1017/CBO9780511732942

Rudner LM, Liang T (2002) Automated Essay Scoring Using Bayes’ Theorem. J. Technol., Learning and Assessment, 1 (2)

Sakoda K, Hosoi Y (2020) Accuracy and complexity of Japanese Language usage by SLA learners in different learning environments based on the analysis of I-JAS, a learners’ corpus of Japanese as L2. Math. Linguist. 32(7):403–418. https://doi.org/10.24701/mathling.32.7_403

Suzuki N (1999) Summary of survey results regarding comprehensive essay questions. Final report of “Joint Research on Comprehensive Examinations for the Aim of Evaluating Applicability to Each Specialized Field of Universities” for 1996-2000 [shōronbun sōgō mondai ni kansuru chōsa kekka no gaiyō. Heisei 8 - Heisei 12-nendo daigaku no kaku senmon bun’ya e no tekisei no hyōka o mokuteki to suru sōgō shiken no arikata ni kansuru kyōdō kenkyū’ saishū hōkoku-sho]. University Entrance Examination Section Center Research and Development Department [Daigaku nyūshi sentā kenkyū kaihatsubu], 21–32

Taghipour K, Ng HT (2016) A neural approach to automated essay scoring. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, 1–5 November, pp. 1882–1891. Association for Computational Linguistics

Takeuchi K, Ohno M, Motojin K, Taguchi M, Inada Y, Iizuka M, Abo T, Ueda H (2021) Development of essay scoring methods based on reference texts with construction of research-available Japanese essay data. In IPSJ J 62(9):1586–1604

Ure J (1971) Lexical density: A computational technique and some findings. In Coultard M (ed) Talking about Text. English Language Research, University of Birmingham, Birmingham, England

Vaswani A, Shazeer N, Parmar N, et al. (2017) Attention is all you need. In Advances in Neural Information Processing Systems, Long Beach, CA, 4–7 December, pp. 5998–6008, Curran Associates, Inc., Red Hook, NY

Watanabe H, Taira Y, Inoue Y (1988) Analysis of essay evaluation data [Shōronbun hyōka dēta no kaiseki]. Bulletin of the Faculty of Education, University of Tokyo [Tōkyōdaigaku kyōiku gakubu kiyō], Vol. 28, 143–164

Yao S, Yu D, Zhao J, et al. (2023) Tree of thoughts: Deliberate problem solving with large language models. Advances in Neural Information Processing Systems, 36

Zenker F, Kyle K (2021) Investigating minimum text lengths for lexical diversity indices. Assess. Writ. 47:100505. https://doi.org/10.1016/j.asw.2020.100505

Zhang Y, Warstadt A, Li X, et al. (2021) When do you need billions of words of pretraining data? Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Online, pp. 1112-1125. Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.acl-long.90

Download references



What is cloud computing?


With cloud computing, organizations essentially buy a range of services offered by cloud service providers (CSPs). The CSP’s servers host all the client’s applications. Organizations can enhance their computing power more quickly and cheaply via the cloud than by purchasing, installing, and maintaining their own servers.

The cloud-computing model is helping organizations to scale new digital solutions with greater speed and agility—and to create value more quickly. Developers use cloud services to build and run custom applications and to maintain infrastructure and networks for companies of virtually all sizes—especially large global ones. CSPs offer services, such as analytics, to handle and manipulate vast amounts of data. Time to market accelerates, speeding innovation to deliver better products and services across the world.

What are examples of cloud computing’s uses?


Cloud computing came on the scene well before the global pandemic hit in 2020, but the ensuing digital dash helped demonstrate its power and utility. Here are some examples of how businesses and other organizations employ the cloud:

  • A fast-casual restaurant chain’s online orders multiplied exponentially during the 2020 pandemic lockdowns, climbing to 400,000 a day, from 50,000. One pleasant surprise? The company’s online-ordering system could handle the volume—because it had already migrated to the cloud. Thanks to this success, the organization’s leadership decided to accelerate its five-year migration plan to less than one year.
  • A biotech company harnessed cloud computing to deliver the first clinical batch of a COVID-19 vaccine candidate for Phase I trials in just 42 days—thanks in part to breakthrough innovations using scalable cloud data storage and computing to facilitate processes ensuring the drug’s safety and efficacy.
  • Banks use the cloud for several aspects of customer-service management. They automate transaction calls using voice recognition algorithms and cognitive agents (AI-based online self-service assistants directing customers to helpful information or to a human representative when necessary). In fraud and debt analytics, cloud solutions enhance the predictive power of traditional early-warning systems. To reduce churn, they encourage customer loyalty through holistic retention programs managed entirely in the cloud.
  • Automakers are also along for the cloud ride. One company uses a common cloud platform that serves 124 plants, 500 warehouses, and 1,500 suppliers to consolidate real-time data from machines and systems and to track logistics and offer insights on shop floor processes. Use of the cloud could shave 30 percent off factory costs by 2025—and spark innovation at the same time.

That’s not to mention experiences we all take for granted: using apps on a smartphone, streaming shows and movies, participating in videoconferences. All of these things can happen in the cloud.

Learn more about our Cloud by McKinsey, Digital McKinsey, and Technology, Media, & Telecommunications practices.

How has cloud computing evolved?

Going back a few years, legacy infrastructure dominated IT-hosting budgets. Enterprises planned to move a mere 45 percent of their IT-hosting expenditures to the cloud by 2021. Enter COVID-19, and 65 percent of the decision makers surveyed by McKinsey increased their cloud budgets. An additional 55 percent ended up moving more workloads than initially planned. Having witnessed the cloud’s benefits firsthand, 40 percent of companies expect to pick up the pace of implementation.

The cloud revolution has actually been going on for years—more than 20, if you count the founding of Salesforce, widely seen as the first software-as-a-service (SaaS) company, as the takeoff point. Today, the next generation of cloud capabilities, such as serverless computing, makes it easier for software developers to tweak individual functions independently and more efficiently, accelerating the pace of release. Businesses can therefore serve customers and launch products in a more agile fashion. And the cloud continues to evolve.



Cost savings are commonly seen as the primary reason for moving to the cloud, but managing those costs requires a different and more dynamic approach focused on OpEx rather than CapEx. Financial-operations (or FinOps) capabilities can indeed enable the continuous management and optimization of cloud costs. But CSPs have developed their offerings so that the cloud’s greatest value opportunity lies primarily in business innovation and optimization. In 2020, the top-three CSPs reached $100 billion in combined revenues—a minor share of the global $2.4 trillion market for enterprise IT services—leaving huge value to be captured. To go beyond merely realizing cost savings, companies must activate three symbiotic rings of cloud value creation: strategy and management, business domain adoption, and foundational capabilities.

What’s the main reason to move to the cloud?

The pandemic demonstrated that digital transformation can no longer be delayed—and can happen much more quickly than previously imagined. Nothing is more critical to a corporate digital transformation than becoming a cloud-first business. The benefits are faster time to market, simplified innovation and scalability, and reduced risk when effectively managed. The cloud lets companies provide customers with novel digital experiences—in days, not months—and delivers analytics absent on legacy platforms. But to transition to a cloud-first operating model, organizations must make a collective effort that starts at the top. Here are three actions CEOs can take to increase the value their companies get from cloud computing:

  • Establish a sustainable funding model.
  • Develop a new business technology operating model.
  • Set up policies to attract and retain the right engineering talent.

How much value will the cloud create?

Fortune 500 companies adopting the cloud could realize more than $1 trillion in value by 2030, and not from IT cost reductions alone, according to McKinsey’s analysis of 700 use cases.

For example, the cloud speeds up design, build, and ramp-up, shortening time to market when companies have strong DevOps (the combination of development and operations) processes in place; groups of software developers customize and deploy software for operations that support the business. The cloud’s global infrastructure lets companies scale products almost instantly to reach new customers, geographies, and channels. Finally, digital-first companies use the cloud to adopt emerging technologies and innovate aggressively, using digital capabilities as a competitive differentiator to launch and build businesses.

If companies pursue the cloud’s vast potential in the right ways, they will realize huge value. Companies across diverse industries have implemented the public cloud and seen promising results. The successful ones defined a value-oriented strategy across IT and the business, acquired hands-on experience operating in the cloud, adopted a technology-first approach, and developed a cloud-literate workforce.

Learn more about our Cloud by McKinsey and Digital McKinsey practices.

What is the cloud cost/procurement model?

Some cloud services, such as server space, are leased. Leasing requires much less capital up front than buying, offers greater flexibility to switch and expand the use of services, cuts the basic cost of buying hardware and software upfront, and reduces the difficulties of upkeep and ownership. Organizations pay only for the infrastructure and computing services that meet their evolving needs. But an outsourcing model is a more apt analogy: the computing business issues of cloud customers are addressed by third-party providers that deliver innovative computing services on demand to a wide variety of customers, adapt those services to fit specific needs, and work to constantly improve the offering.
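The buy-versus-lease trade-off above comes down to simple break-even arithmetic: ownership front-loads a large capital outlay plus ongoing upkeep, while leasing is a steady monthly operating expense. A minimal sketch, using purely illustrative numbers (the purchase price, upkeep, and lease rate below are assumptions for the example, not figures from this article):

```python
# Illustrative break-even sketch for the buy-vs-lease decision described above.
# All dollar figures are hypothetical assumptions chosen for the example.

def buy_cost(months, purchase_price=120_000, monthly_upkeep=1_500):
    """Total cost of owning hardware: upfront CapEx plus ongoing upkeep (OpEx)."""
    return purchase_price + monthly_upkeep * months

def lease_cost(months, monthly_rate=4_000):
    """Total cost of leasing equivalent capacity: pure OpEx, no upfront spend."""
    return monthly_rate * months

# First month at which owning becomes cheaper than leasing.
break_even = next(m for m in range(1, 240) if buy_cost(m) <= lease_cost(m))
print(break_even)  # 48 months under these assumptions
```

Before the break-even point, leasing wins on cash flow; after it, ownership is cheaper in total—which is why short-lived or variable workloads favor the cloud's pay-as-you-go model.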

What are cloud risks?

The cloud offers huge cost savings and potential for innovation. However, when companies migrate to the cloud, the simple lift-and-shift approach doesn’t reduce costs, so companies must remediate their existing applications to take advantage of cloud services.

For instance, a major financial-services organization wanted to move more than 50 percent of its applications to the public cloud within five years. Its goals were to improve resiliency, time to market, and productivity. But not all its business units needed to transition at the same pace. The IT leadership therefore defined varying adoption archetypes to meet each unit’s technical, risk, and operating-model needs.

Legacy cybersecurity architectures and operating models can also pose problems when companies shift to the cloud. The resulting problems, however, involve misconfigurations rather than inherent cloud security vulnerabilities. One powerful solution? Securing cloud workloads for speed and agility: automated security architectures and processes enable workloads to be processed at a much faster tempo.

What kind of cloud talent is needed?

The talent demands of the cloud differ from those of legacy IT. While cloud computing can improve the productivity of your technology, it requires specialized and sometimes hard-to-find talent—including full-stack developers, data engineers, cloud-security engineers, identity- and access-management specialists, and cloud engineers. The cloud talent model should thus be revisited as you move forward.

Six practical actions can help your organization build the cloud talent you need:

  • Find engineering talent with broad experience and skills.
  • Balance talent maturity levels and the composition of teams.
  • Build an extensive and mandatory upskilling program focused on need.
  • Build an engineering culture that optimizes the developer experience.
  • Consider using partners to accelerate development and assign your best cloud leaders as owners.
  • Retain top talent by focusing on what motivates them.

How do different industries use the cloud?

Different industries are expected to see dramatically different benefits from the cloud. High-tech, retail, and healthcare organizations occupy the top end of the value capture continuum. Electronics and semiconductors, consumer-packaged-goods, and media companies make up the middle. Materials, chemicals, and infrastructure organizations cluster at the lower end.

Nevertheless, myriad use cases provide opportunities to unlock value across industries, as the following examples show:

  • a retailer enhancing omnichannel fulfillment, using AI to optimize inventory across channels and to provide a seamless customer experience
  • a healthcare organization implementing remote health monitoring to conduct virtual trials and improve adherence
  • a high-tech company using chatbots to provide premier-level support combining phone, email, and chat
  • an oil and gas company employing automated forecasting to automate supply-and-demand modeling and reduce the need for manual analysis
  • a financial-services organization implementing customer call optimization using real-time voice recognition algorithms to direct customers in distress to experienced representatives for retention offers
  • a financial-services provider moving applications in customer-facing business domains to the public cloud to penetrate promising markets more quickly and at minimal cost
  • a health insurance carrier accelerating the capture of billions of dollars in new revenues by moving systems to the cloud to interact with providers through easier onboarding

The cloud is evolving to meet the industry-specific needs of companies. From 2021 to 2024, public-cloud spending on vertical applications (such as warehouse management in retailing and enterprise risk management in banking) is expected to grow by more than 40 percent annually. Spending on horizontal workloads (such as customer relationship management) is expected to grow by 25 percent. Healthcare and manufacturing organizations, for instance, plan to spend around twice as much on vertical applications as on horizontal ones.
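The gap those growth rates imply compounds quickly. A quick arithmetic check, treating the stated rates (40 percent and 25 percent per year) as constant over the three years from 2021 to 2024:

```python
# Cumulative effect of 40% vs. 25% annual growth over three years (2021-2024).
# Uses the rates stated above as constant lower-bound assumptions.

vertical_multiplier = 1.40 ** 3    # vertical applications, >40% per year
horizontal_multiplier = 1.25 ** 3  # horizontal workloads, 25% per year

print(round(vertical_multiplier, 2))    # 2.74 -> roughly 2.7x growth
print(round(horizontal_multiplier, 2))  # 1.95 -> roughly 2.0x growth
```

So even at the lower bound, vertical-application spending would nearly triple while horizontal-workload spending roughly doubles, consistent with the plan to spend around twice as much on vertical as on horizontal applications.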

Learn more about our Cloud by McKinsey, Digital McKinsey, Financial Services, Healthcare Systems & Services, Retail, and Technology, Media, & Telecommunications practices.

What are the biggest cloud myths?

Views on cloud computing can be clouded by misconceptions. Here are seven common myths about the cloud—all of which can be debunked:

  • The cloud’s value lies primarily in reducing costs.
  • Cloud computing costs more than in-house computing.
  • On-premises data centers are more secure than the cloud.
  • Applications run more slowly in the cloud.
  • The cloud eliminates the need for infrastructure.
  • The best way to move to the cloud is to focus on applications or data centers.
  • You must lift and shift applications as-is or totally refactor them.

How large must my organization be to benefit from the cloud?

Here’s one more huge misconception: that the cloud is just for big multinational companies. In fact, the cloud can help small local companies become multinational; a company’s benefits from implementing the cloud are not constrained by its size. The cloud shifts the barrier to entry from scale to skill, making it possible for a company of any size to compete if it has people with the right skills. With the cloud, highly skilled small companies can take on established competitors. To realize the cloud’s immense potential value fully, organizations must take a thoughtful approach, with IT and the businesses working together.

For more in-depth exploration of these topics, see McKinsey’s Cloud Insights collection. Learn more about Cloud by McKinsey—and check out cloud-related job opportunities if you’re interested in working at McKinsey.

Articles referenced include:

  • “Six practical actions for building the cloud talent you need,” January 19, 2022, Brant Carson, Dorian Gärtner, Keerthi Iyengar, Anand Swaminathan, and Wayne Vest
  • “Cloud-migration opportunity: Business value grows, but missteps abound,” October 12, 2021, Tara Balakrishnan, Chandra Gnanasambandam, Leandro Santos, and Bhargs Srivathsan
  • “Cloud’s trillion-dollar prize is up for grabs,” February 26, 2021, Will Forrest, Mark Gu, James Kaplan, Michael Liebow, Raghav Sharma, Kate Smaje, and Steve Van Kuiken
  • “Unlocking value: Four lessons in cloud sourcing and consumption,” November 2, 2020, Abhi Bhatnagar, Will Forrest, Naufal Khan, and Abdallah Salami
  • “Three actions CEOs can take to get value from cloud computing,” July 21, 2020, Chhavi Arora, Tanguy Catlin, Will Forrest, James Kaplan, and Lars Vinter


    Footnotes Jump to essay-1 U.S. Const. amend. VI. Jump to essay-2 See Crawford v. Washington, 541 U.S. 36, 68-69 (2004).The Supreme Court in Crawford recognized the existence of two common law Confrontation Clause exceptions that historically permitted the admission of testimonial statements, but it did not expressly approve or disapprove of either.

  27. Applying large language models for automated essay scoring for non

    Recent advancements in artificial intelligence (AI) have led to an increased use of large language models (LLMs) for language assessment tasks such as automated essay scoring (AES), automated ...

  28. Trump v. Anderson: Did the Colorado Supreme Court Err in Excluding

    Footnotes Jump to essay-1 Trump v. Anderson, No. 23-719 (U.S. Mar. 4, 2024). Jump to essay-2 Anderson v. Griswold, 2023 CO 63, ¶ 5. Jump to essay-3 Id. ¶ 225 (We conclude that the foregoing evidence, the great bulk of which was undisputed at trial, established that President Trump engaged in insurrection. Jump to essay-4 Id. ¶ 257 ([B]ecause President Trump is disqualified from holding the ...

  29. Trump's Trial Violated Due Process

    New York's trial of Mr. Trump violated basic due-process principles. "No principle of procedural due process is more clearly established than that notice of the specific charge," the Supreme ...

  30. What is cloud computing: Its uses and benefits

    Cloud computing is the use of comprehensive digital capabilities delivered via the internet for organizations to operate, innovate, and serve customers. It eliminates the need for organizations to host digital applications on their own servers. Group of white spheres on light blue background.