BYU News, ChatGPT Can’t Ace This Test, But Experts Think It Soon Will … What It Means For Teaching:
Last month, OpenAI launched its newest AI chatbot product, GPT-4. According to the folks at OpenAI, the bot, which uses machine learning to generate natural language text, passed the bar exam with a score in the 90th percentile, passed 13 of 15 AP exams and got a nearly perfect score on the GRE Verbal test.
Inquiring minds at BYU and 186 other universities wanted to know how OpenAI’s tech would fare on accounting exams. So, they put the original version, ChatGPT, to the test. The researchers say that while it still has work to do in the realm of accounting, it’s a game changer that will change the way everyone teaches and learns — for the better.
“When this technology first came out, everyone was worried that students could now use it to cheat,” said lead study author David Wood, a BYU professor of accounting. “But opportunities to cheat have always existed. So for us, we’re trying to focus on what we can do with this technology now that we couldn’t do before to improve the teaching process for faculty and the learning process for students. Testing it out was eye-opening.” ...
His co-author recruiting pitch on social media exploded: 327 co-authors from 186 educational institutions in 14 countries participated in the research, contributing 25,181 classroom accounting exam questions. They also recruited undergrad BYU students (including Wood’s daughter, Jessica) to feed another 2,268 textbook test bank questions to ChatGPT. The questions covered accounting information systems (AIS), auditing, financial accounting, managerial accounting and tax, and varied in difficulty and type (true/false, multiple choice, short answer, etc.).
Although ChatGPT’s performance was impressive, the students performed better. Students scored an overall average of 76.7%, compared to ChatGPT’s score of 47.4%. On 11.3% of questions, ChatGPT scored higher than the student average, doing particularly well on AIS and auditing. But the AI bot did worse on tax, financial, and managerial assessments, possibly because ChatGPT struggled with the mathematical processes required for the latter type.
The ChatGPT Artificial Intelligence Chatbot: How Well Does It Answer Accounting Assessment Questions?:
ChatGPT, a language-learning model chatbot, has garnered considerable attention for its ability to respond to users’ questions. Using data from 14 countries and 186 institutions, we compare ChatGPT and student performance for 28,085 questions from accounting assessments and textbook test banks. As of January 2023, ChatGPT provides correct answers for 56.5 percent of questions and partially correct answers for an additional 9.4 percent of questions. When considering point values for questions, students significantly outperform ChatGPT with a 76.7 percent average on assessments compared to 47.5 percent for ChatGPT if no partial credit is awarded and 56.5 percent if partial credit is awarded. Still, ChatGPT performs better than the student average for 15.8 percent of assessments when we include partial credit. We provide evidence of how ChatGPT performs on different question types, accounting topics, class levels, open/closed assessments, and test bank questions. We also discuss implications for accounting education and research.
Accounting Today, ChatGPT Bombs Accounting Class