Analysis: Why Students Prefer ChatGPT Over Their Tutors
In our analysis of how ChatGPT brought about a $1 billion market capitalization loss for the online education firm Chegg in May 2023, we pointed out early signals of the chatbot’s growing popularity on college campuses.
One of these indicators was an early survey by Intelligent Magazine in January 2023, reporting that within the first seven weeks after ChatGPT’s launch, about 32 percent of college students had already used the platform to help write papers.
Now the magazine is back with a new poll. This time the publication discovered that students appear to increasingly prefer ChatGPT over tutors—and by surprisingly large margins.
A Shift in Tutoring Preferences?
The new results may herald a substantial shift in students’ preferences for personal academic support. Eighty-five percent of the college and high school respondents reported that ChatGPT’s tutoring was more effective than learning from their tutors. And among parents of elementary school children, almost all of them—96 percent—reported that working with a tutor was less effective for their children than studying with ChatGPT.
Hefty proportions of the respondents had already fired their tutors. Of the college and high school respondents, almost two in five had replaced their tutors with ChatGPT. Of the parents of elementary school pupils, 30 percent made the same switch. And almost all of the respondents cut some tutoring sessions by substituting work with the chatbot.
Consolidating the results from both groups yielded two more jaw-dropping statistics. Overall, 95 percent of the respondents claimed that their grades improved after working with ChatGPT. What’s more, nine out of ten prefer learning from ChatGPT to working with a tutor.
These unexpected conclusions rest on statistical analysis of online responses collected across the United States from two samples during April and May 2023. The respondents included 302 college and high school students who had studied with both ChatGPT and a tutor, along with 485 parents of elementary school pupils who reported that their kids had also studied with both kinds of support. However, both were convenience samples, and we discuss below the pollsters’ choice to employ that controversial approach.
Why Students Appreciate ChatGPT’s Tutoring
Why are the survey respondents so wildly enthusiastic about ChatGPT? Although Intelligent Magazine provides fewer specifics about the benefits respondents experienced, a few factors appear to have emerged.
Speed, efficiency, convenience, and 24-hour availability are four of the chatbot’s advantages over tutors that respondents frequently mentioned. Dr. Diane Gayeski, an Ithaca College strategic communications professor and one of Intelligent’s consultants, explained in an extensive interview with VentureBeat that “students and their parents believe ChatGPT is a highly effective study tool that is more convenient and efficient than traditional tutoring.” She added that “parents find ChatGPT to be an attractive tutor because it’s free, always available, and answers questions quickly.”
Error correction and adaptive feedback on learning progress were two other advantages over tutors cited by those surveyed. “ChatGPT’s ability to correct mistakes makes it easier for children to learn correctly,” said one respondent. “ChatGPT is better able to know what I need, what my weaknesses are, and what I need to strengthen,” said another.
Why Stick With Tutors?
Despite all this enthusiasm over ChatGPT, the fact remains that most respondents didn’t fire their tutors: 61 percent of the college and high school students and 70 percent of the younger kids’ parents kept them. If the chatbot is such an effective and popular tutor, why do so many students and families maintain their relationships with tutors?
The first weak link is that students must provide ChatGPT with effective prompts. “ChatGPT cannot do what experienced human tutors do because it’s reliant on the user entering good prompts,” according to Dr. Gayeski. She continues:
A tutor can offer tailored examples suitable for a 15-year-old. However, while ChatGPT can provide such examples, the user must possess the skill to articulate the request effectively.
That’s a tall order for two reasons. First, eliciting optimal responses from artificial intelligence platforms is a discipline of its own, called prompt engineering, as we pointed out in our recent feature article entitled “The $335,000 ChatGPT Skill Savvy Online Students Need to Know.”
Although it doesn’t require any software coding, prompt engineering is still a difficult learned skill that depends upon hands-on experience, which is why some companies are paying annual salaries over $335,000 for prompt engineering professionals. “Prompt engineering isn’t that easy, and a lot of students don’t necessarily have time to learn how to do it,” says Matt Mittelsteadt, a research fellow at George Mason University’s Mercatus Center, in an interview with Politico.
Second, in an academic context, crafting effective prompts requires sufficient background knowledge about the field of study. The problem is that most high school students and lower-division undergraduates don’t yet possess this fundamental level of knowledge about academic disciplines. After all, that’s what they’re trying to attain through their course and degree program enrollments.
And that lack of understanding about a field’s basic principles makes formulating good prompts even more difficult: “For effective results, ChatGPT is totally reliant on the user entering good prompts,” says Dr. Gayeski. “The student has to have enough background knowledge to give ChatGPT the right type of prompt.”
But there’s a second factor that explains why so many of the respondents stuck with their tutors. Besides subject matter expertise and personalized teaching skills, much of what a tutor provides is a relationship with a student, a benefit that no AI chatbot can ever provide. Through that relationship, the tutor can help develop critical attributes, such as motivation and structure, that the student needs to succeed along with subject-matter knowledge and problem-solving skills. Says Dr. Gayeski:
That’s why many people use personal trainers at a gym. It’s not necessarily that they don’t know how to exercise or use the machines. It’s because having that appointment provides [for] people to get positive feedback and encouragement from the trainer that can foster persistence.
Do Tutoring Cutbacks Make Economic Sense?
One of the more ironic aspects of K-12 students’ suddenly replacing their tutors with a free chatbot is that many of their families aren’t saving any money that way.
As we pointed out in our December 2022 analysis “Online Tutoring: What Works and What Doesn’t,” a series of studies in 2021 and 2022 showed plummeting reading and math scores among K-12 students. Because that pandemic learning loss turned into a hot political issue during the 2022 election, at no other time in history has America spent so much money to provide families with tutoring.
The Biden Administration’s American Rescue Plan allocates $122 billion to school districts, the largest single infusion of federal education dollars in history, and mandates that they spend at least 20 percent of those funds on learning loss. After reviewing the spending plans of 5,400 districts, the education analysis firm Burbio found that 25 percent of them had each budgeted an average of $1.4 million for tutoring. That total works out to $7.56 billion for nationwide tutoring that the districts have to spend before the program expires in September 2024.
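As a quick check on the arithmetic, the short Python sketch below recomputes the tutoring total from the figures quoted above. The two "readings" are our own assumption about how the average might have been applied, not something the Burbio analysis states. The $7.56 billion figure corresponds to applying the $1.4 million average to all 5,400 reviewed districts; applying it only to the 25 percent that budgeted for tutoring gives a smaller number.

```python
# Figures quoted in the paragraph above.
districts_reviewed = 5_400
share_with_tutoring_budget = 0.25
avg_tutoring_budget = 1_400_000  # dollars per district

# Reading A (assumption): the average applies only to the 25 percent
# of reviewed districts that budgeted for tutoring.
districts_with_budget = districts_reviewed * share_with_tutoring_budget
total_a = districts_with_budget * avg_tutoring_budget

# Reading B (assumption): the average applies to every reviewed district.
total_b = districts_reviewed * avg_tutoring_budget

print(f"Reading A: ${total_a / 1e9:.2f} billion")
print(f"Reading B: ${total_b / 1e9:.2f} billion")
```

Which reading the original analysis intended, and whether the total was extrapolated beyond the reviewed districts, is not clear from the figures given here.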
Much of that money is being spent on the most effective method, known as “high-impact” or “high-dosage” tutoring. Staffed by highly trained tutors during school hours, this structured method complements class assignments. And as we discuss at length in our article, it’s been proven effective by two recent European studies conducted by researchers from Harvard University and the London School of Economics. But the approach is so expensive that traditionally it’s been implemented predominantly in the most affluent suburban school districts.
So what we have in 2023 is an unusual situation where families across America are being provided almost $8 billion worth of highly effective tutoring, at no cost, that will no longer be available to them after September 2024. Yet based on the Intelligent Magazine data, at least a third of the students supported by programs like these appear to be cutting back on hours with human tutors in favor of artificial intelligence software. Given those added benefits from working with tutors cited by Dr. Gayeski, one has to wonder if cutting back on free hours spent with top tutors makes sense under these circumstances.
The Risks of Convenience Sampling
There’s one other facet to the Intelligent Magazine survey worth mentioning: the pollsters’ controversial choice of methodology.
These days, when reviewing reports about surveys in the press, most of us assume that we’re reading conclusions drawn from a randomly selected sample: a subset of a population intended to accurately reflect the larger group’s attributes. For example, if we wanted to better understand a company made up of 100 employees where half are female, we might randomly select five of the women and five of the men to study.
This method reduces sampling bias and improves the generalizability of the results. Known as representative sampling, this method is employed in most of the opinion research conducted by national pollsters like Gallup and marketing research firms like the Nielsen Corporation.
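To make the 100-employee example concrete, here is a minimal Python sketch of one way to draw such a sample: group the employees by sex, then pick randomly within each group in proportion to its size. The function name `stratified_sample`, the `employees` list, and the fixed seed are hypothetical, introduced only for illustration. Professional pollsters use considerably more elaborate designs and weighting.

```python
import random

def stratified_sample(population, stratum_of, sample_size, seed=0):
    """Draw a proportionally allocated stratified random sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable

    # Group the population by stratum (here: by sex).
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)

    sample = []
    for members in strata.values():
        # Each stratum's share of the sample matches its share of the population.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))  # random draw without replacement
    return sample

# Hypothetical company: 100 employees, half of them women.
employees = [{"id": i, "sex": "F" if i < 50 else "M"} for i in range(100)]

sample = stratified_sample(employees, lambda e: e["sex"], sample_size=10)
counts = {sex: sum(1 for e in sample if e["sex"] == sex) for sex in ("F", "M")}
print(counts)  # {'F': 5, 'M': 5}
```

With groups that don’t divide evenly, the rounded quotas may not add up exactly to the requested sample size, so a real design would need a rule for handling the remainder.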
But that’s not how Intelligent Magazine’s New York polling firm Pollfish conducted this study. Instead, Pollfish employed an approach known as convenience sampling, also known by names like accidental sampling or grab sampling. This method merely selects participants based on their availability and accessibility. The technique is fast, easy, and cheap, which is why it’s used so frequently in polls of customers at shopping malls, or studies of the most surveyed group on the planet: college students taking introductory courses in psychology.
However, convenience sampling has many disadvantages, and experts have criticized its overuse because this approach limits the results’ generalizability and introduces sampling bias. This means that it’s not really possible to draw accurate conclusions about a larger population based merely on the results from a convenience sample; it’s only possible to draw conclusions about the sample itself.
Bias can also skew results because quickly “grabbing” the most readily available folks to interview may leave out important minority groups that a representative random sample would include.
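A small simulation can illustrate how this kind of bias arises. The Python sketch below is entirely hypothetical: the population, the share of heavy chatbot users, their preferences, and the assumption that heavy users are five times as likely to turn up in an online poll are all invented for illustration and are not drawn from the Intelligent Magazine survey.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 students (all numbers invented):
# 2,000 heavy chatbot users, 90 percent of whom prefer the chatbot,
# and 8,000 other students, 40 percent of whom prefer it.
population = (
    [{"heavy_user": True, "prefers_chatbot": i < 1800} for i in range(2000)]
    + [{"heavy_user": False, "prefers_chatbot": i < 3200} for i in range(8000)]
)

def preference_rate(people):
    """Share of the given people who prefer the chatbot."""
    return sum(p["prefers_chatbot"] for p in people) / len(people)

true_rate = preference_rate(population)  # 5,000 of 10,000

# Simple random sample: every student is equally likely to be chosen.
random_sample = rng.sample(population, 500)

# Convenience sample (assumption): heavy users are five times as likely
# to be reached. Drawn with replacement to keep the sketch short.
weights = [5 if p["heavy_user"] else 1 for p in population]
convenience_sample = rng.choices(population, weights=weights, k=500)

print(f"True rate:          {true_rate:.2f}")
print(f"Random sample:      {preference_rate(random_sample):.2f}")
print(f"Convenience sample: {preference_rate(convenience_sample):.2f}")
```

Under these made-up assumptions, the random sample should land near the true rate, while the convenience sample overstates the preference for the chatbot because the most enthusiastic users are overrepresented.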
Furthermore, the results from these samples are easy to analyze but difficult to replicate. And statistical inference using techniques like confidence intervals, hypothesis testing, and regression analysis frequently isn’t appropriate for convenience sampling. That’s because these calculations can yield inaccurate results when fed data from such a non-random sample.
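For context, the sketch below computes the textbook 95 percent margin of error for a proportion, using the sample sizes and percentages reported earlier in this article. The formula assumes a simple random sample, which is exactly the assumption a convenience sample fails to meet, so the outputs show only what the margins would be if the samples had been random. They should not be read as the survey’s actual uncertainty. The function name `nominal_margin_of_error` is our own.

```python
import math

def nominal_margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a proportion.

    Valid only under simple random sampling; z=1.96 corresponds to
    a 95 percent confidence level.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Figures reported in this article.
students_moe = nominal_margin_of_error(p=0.85, n=302)
parents_moe = nominal_margin_of_error(p=0.96, n=485)

print(f"Students (85% of 302): +/- {students_moe * 100:.1f} points")
print(f"Parents  (96% of 485): +/- {parents_moe * 100:.1f} points")
```

With a non-random sample, the real error can be far larger than these nominal values, because selection bias is not captured by the formula at all.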
Does this mean that opinion researchers should never use convenience sampling? No, not at all. This kind of sampling is appropriate for exploratory research, like pilot and feasibility studies. Grab sampling may also amount to the only available option when fast results are essential, or when an organization doesn’t have a budget sufficient to construct an accurate representative sample.
Intelligent Magazine seems to be attempting to beat competitors by publishing rapid insights, and a convenience sample is certainly one route toward accomplishing that objective. Moreover, these results on tutoring do offer some interesting insights about the behavior of the sample’s respondents, especially where the reported percentages were very large or the differences between compared values were substantial.
However, readers need to be careful about generalizing to the U.S. population from any convenience sample, including this one. Many observers who don’t “read the fine print” or didn’t take university statistics will assume that it’s reasonable to generalize these results to the entire U.S. population, even though, strictly speaking, that would not be appropriate given the limitations inherent in this survey design.
It would make better sense to treat this sort of polling as a pilot study that’s helpful in the absence of other data. Pilot studies typically require subsequent attempts to replicate the results using representative samples, which can support statistical inference and analysis with better reliability and validity.
And in the interest of full disclosure, besides presenting its methodology and referencing Pollfish, Intelligent Magazine has an editorial responsibility to provide a more direct and comprehensive overview discussing the advantages, disadvantages, and appropriate applications of convenience sampling techniques.