Does 2026 AI exhibit intelligence, or can Claude outsmart Pierre or Catherine?
Abstract
Using a sequence of high-school-level mathematics questions that were not available on the Internet, we compare the performance of the popular AI software Claude with that of my friends and fellow human beings Pierre and Catherine. Pierre had solid scientific training as a young man, while Catherine studied literature. All three were given a simulated pre-calculus oral exam consisting of main questions and follow-up questions. We compare their performances and identify the best and worst performers. The outcome is that the current version of Claude, although an extremely useful tool that has probably recorded the solutions to nearly all calculus questions available on the Internet, exhibits only a very limited understanding of the subject and does not show the ability to make intelligent connections between different features of a pre-calculus mathematics problem that it has never seen before.