Wednesday, August 23, 2023
Can ChatGPT-4 Really Do Tax?
Andrew Blair-Stanek (Maryland; Google Scholar), Nils Holzenberger (Institut Polytechnique de Paris) & Benjamin Van Durme (Johns Hopkins; Google Scholar), OpenAI Cribbed Our Tax Example, But Can GPT-4 Really Do Tax?, 180 Tax Notes Fed. 1101 (Aug. 14, 2023):
In the livestream introducing GPT-4, OpenAI used one of our SARA [acronym for StAtutory Reasoning Assessment] tax cases verbatim, describing it as a real tax example, even though SARA is a simplified academic data set. In the demo, OpenAI also used our heavily edited SARA version of the IRC. OpenAI incorrectly thought GPT-4 had correctly calculated the tax liability because its answer matched the SARA answer, although our IRC edits change the result from the actual IRC. We tested GPT-4 on the entire SARA data set. It gets tax liabilities exactly right around one-third of the time and miscalculates tax liabilities by over 10 percent nearly a quarter of the time. GPT-4 often misreads even our simplified version of the IRC. In the livestream, the presenter warned, “You should always check with your tax adviser.” Wise advice. ...
GPT-4 is a remarkable model, able to take raw tax statutes and facts and correctly calculate the tax liability around one-third of the time — a large advance over what we thought possible just a few years ago. There is a large community of computer scientists working to expand the usefulness and power of these models. We are optimistic that future models will be able to help proactively find tax minimization strategies.
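The two headline figures in the excerpt (liabilities computed exactly right, and liabilities off by more than 10 percent) are simple proportions over a set of test cases. The following is a minimal sketch of how such figures could be computed. It is not the authors' scoring code: the function name `score_predictions`, the `tolerance` parameter, and the sample numbers are hypothetical, and the SARA evaluation itself may define error differently.

```python
def score_predictions(cases, tolerance=0.10):
    """Return (exact_rate, large_error_rate) for a list of cases.

    cases: list of (predicted, actual) tax liabilities in dollars.
    tolerance: relative error above which a miss counts as "large"
               (0.10 mirrors the 10 percent threshold in the excerpt).
    """
    exact = 0
    large_error = 0
    for predicted, actual in cases:
        if predicted == actual:
            exact += 1
        elif abs(predicted - actual) > tolerance * abs(actual):
            large_error += 1
    n = len(cases)
    return exact / n, large_error / n


# Hypothetical (predicted, actual) pairs, NOT taken from the SARA data set.
sample_cases = [(1000, 1000), (1050, 1000), (1200, 1000), (500, 1000)]
exact_rate, large_error_rate = score_predictions(sample_cases)
print(f"exactly right: {exact_rate:.0%}, off by over 10 percent: {large_error_rate:.0%}")
```

In this made-up sample, one case is exact, one is off by 5 percent (counted in neither bucket), and two are off by more than 10 percent.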
Prior TaxProf Blog coverage:
- Andrew Blair-Stanek, Nils Holzenberger & Benjamin Van Durme, Shelter Check: Proactively Finding Tax Minimization Strategies via AI, 177 Tax Notes Fed. 1515 (Dec. 12, 2022):
https://taxprof.typepad.com/taxprof_blog/2023/08/can-chatgpt-4-really-do-tax.html