Using Large Language Models to Analyze Engagement in Computational Thinking via Computational Physics Essays

N. Sanjay Rebello

Abstract

As computational thinking (CT) becomes increasingly important to physics education, the need for authentic, project-based assessments has grown. Open-ended multimodal assignments, such as Computational Physics Essays (CPEs), help capture student reasoning and encourage active learning, but they introduce a significant evaluation bottleneck. Manually grading these notebooks against a detailed taxonomy of computational practices is resource-intensive and limits scalability in large-enrollment courses. In this study, we investigated the viability of using a multimodal Large Language Model (LLM) to automate the evaluation of 100 student-generated CPEs. Using a human-coded baseline, we systematically evaluated the model's capacity to detect student engagement across 20 distinct CT sub-practices and to assign a holistic overall quality score. The LLM performed well on clearly defined tasks, achieving 84% exact agreement with human raters on the binary sub-practices. More subjective constructs proved challenging, however, with the model reaching only 71% agreement on the holistic quality analysis. Our findings demonstrate that while LLMs can reliably automate the detection of specific computational practices, subjective evaluation remains a hurdle.
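To make the agreement figures concrete, the following is a minimal sketch of how exact agreement between human and model codes could be computed for binary sub-practices. The function name `exact_agreement` and the example codes are hypothetical and only illustrate the metric; they are not the authors' pipeline or data.

```python
from typing import Sequence


def exact_agreement(human: Sequence[int], model: Sequence[int]) -> float:
    """Return the fraction of items on which both raters assign the same code."""
    if len(human) != len(model):
        raise ValueError("rating lists must have the same length")
    if not human:
        raise ValueError("rating lists must be non-empty")
    # Count positions where the human code and the model code match exactly.
    matches = sum(h == m for h, m in zip(human, model))
    return matches / len(human)


# Hypothetical codes for one essay across five binary sub-practices
# (1 = practice present, 0 = practice absent). Illustrative values only.
human_codes = [1, 0, 1, 1, 0]
model_codes = [1, 0, 0, 1, 0]

agreement = exact_agreement(human_codes, model_codes)
print(f"{agreement:.0%}")  # prints "80%"
```

Exact agreement is simple to read but does not correct for matches that would occur by chance, so a chance-corrected statistic may be worth reporting alongside it.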
