The progress that ChatGPT made in an exam in just three months stunned an economics professor.
Bryan Caplan of George Mason University said the chatbot got a D on his economics test in January.
He tried again with the GPT-4 update last week and its score improved to an A.
An economics professor said he was stunned by ChatGPT's progress: in just three months, it improved its score on his economics test from a D to an A.
Bryan Caplan, an economics professor at George Mason University, told Insider that the latest version of ChatGPT could now be responsible for the first big bet he's ever lost.
ChatGPT-3.5 didn't understand basic theory
Writing in a blog post on his Substack "Bet On It" in January, Caplan said he gave ChatGPT questions from his fall midterms.
Caplan said his exam questions test students' understanding of economics rather than have them regurgitate textbooks or complete what are essentially memory exercises.
This is where the old version of ChatGPT tripped up. The bot scored 31 out of a possible 100 on his test, equivalent to a D and well below his 50% median.
Caplan told Insider that the bot failed to understand basic concepts, such as the principles of comparative and absolute advantage. Its answers were also more political than economic, he said.
"ChatGPT does a fine job of imitating a very weak GMU econ student," Caplan wrote in his January blog post.
He isn't the only academic that ChatGPT has disappointed. While it passed a Wharton Business School exam in January, the professor who tested it said it made "surprising mistakes" on simple calculations.
Caplan likes to bet. He's previously placed 23 public bets and won them all. They're usually for modest sums of about $100, and often on technical subjects like predicted unemployment rates and inflation readings.
He also narrowly won a 2008 bet that no member state would leave the European Union before 2020 — the UK left in January of that year.
ChatGPT's responses underwhelmed him so much that Caplan bet an AI model wouldn't score an A on six out of seven of his exams before 2029.
But when ChatGPT-4 was released, its progress stunned Caplan. It scored 73% on the same midterm test, equivalent to an A and among the best scores in his class.
ChatGPT's paywalled upgrade sought to fix some of the early issues with the beta version, GPT-3.5. This purportedly included making ChatGPT 40% more likely to return accurate responses, as well as making it able to handle more nuanced instructions.
For Caplan, the improvements were obvious. The bot gave clear answers to his questions, understanding principles it previously struggled with. It also scored perfect marks explaining and evaluating concepts that economists like Paul Krugman have championed.
"The only thing I can say is it just seems a lot better," Caplan said.
Caplan thought ChatGPT's training data might have picked up his previous blog post where he explained his answers, but colleagues told him this was highly unlikely.
He added that he's already fed the bot new tests it hadn't seen before, where it did even better than its previous 73% grade. "I was very smug in my judgment, and I'm not smug anymore," Caplan said.
Caplan is more confident he'll win his next AI-related wager. He has a bet with Eliezer Yudkowsky, an AI doomer who has sparred with Sam Altman, the CEO of ChatGPT maker OpenAI. Yudkowsky is betting that AI will lead to the end of the world before January 1, 2030.
"I'm probably going to lose this AI bet, but I am totally on board to do a bunch more end-of-the-world AI bets because I think these people are out of their minds," he said.
Tough to test
AI bots have caused headaches for examiners. Professors told Insider that plagiarism can be hard to prove when students use text from ChatGPT because there is no material evidence of wrongdoing.
Caplan said he's thinking of doing away with graded homework in the wake of ChatGPT's rise. He hopes his habit of regularly changing questions will be enough to stop students from learning and regurgitating ChatGPT's responses in exam settings.
Read the original article on Business Insider