An Empirical Study on Challenging Math Problem Solving with LLM-based Conversational Agents
Open Access
Author:
Wu, Yiran
Graduate Program:
Informatics
Degree:
Master of Science
Document Type:
Master Thesis
Date of Defense:
September 15, 2024
Committee Members:
Qingyun Wu, Thesis Advisor/Co-Advisor
Lu Lin, Committee Member
Dongwon Lee, Professor in Charge/Director of Graduate Studies
Fenglong Ma, Committee Member
Keywords:
LLM application, LLM Tool-use, AI for Science
Abstract:
The application of Large Language Models (LLMs) to mathematical problems expressed in natural language is a promising area of research, especially given their potential to serve as foundation models across many domains. This thesis explores the efficacy of LLM-powered conversational agents for tackling complex mathematical problems. We evaluate several prompting and tool-use strategies for solving math problems, and we introduce MathChat, a framework for problem solving through dialogue that pairs an LLM agent with a user proxy agent responsible for executing tools and providing additional guidance. This collaborative setup enables a dynamic problem-solving process in which the two agents iteratively discuss and refine solutions.
Our evaluation focuses on challenging high school-level competition problems from the MATH dataset. Using Python-based tool integration, MathChat demonstrates a 6% improvement over previous prompting methods that incorporate tool use and a 16% improvement over the vanilla prompt. The results highlight the potential of MathChat to enhance mathematical problem-solving capabilities in LLMs, offering insights into effective prompting techniques and the benefits of a conversational approach.
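The two-agent workflow described in the abstract (an LLM agent proposing solution steps and Python code, and a user proxy agent executing that code and feeding results back) can be illustrated with a minimal sketch. The sketch below assumes the AutoGen library's AssistantAgent/UserProxyAgent pattern; the model name, working directory, and example problem are illustrative choices, not details taken from the thesis.

```python
# Minimal sketch of a two-agent conversational math-solving loop,
# assuming the AutoGen (pyautogen) AssistantAgent / UserProxyAgent pattern.
from autogen import AssistantAgent, UserProxyAgent

# Illustrative LLM configuration (model choice is an assumption).
llm_config = {"config_list": [{"model": "gpt-4"}]}

# LLM agent: proposes reasoning steps and Python code for the problem.
assistant = AssistantAgent(name="assistant", llm_config=llm_config)

# User proxy agent: executes the proposed code and returns the output,
# allowing the two agents to iterate until a final answer is produced.
user_proxy = UserProxyAgent(
    name="user_proxy",
    human_input_mode="NEVER",
    code_execution_config={"work_dir": "math", "use_docker": False},
)

# Illustrative problem; in the thesis the problems come from the MATH dataset.
problem = "Find the smallest positive integer n such that n^2 + n is divisible by 12."
user_proxy.initiate_chat(assistant, message=problem)
```

In this pattern the conversation itself drives the iteration: each code-execution result from the user proxy becomes context for the LLM agent's next refinement, which is the collaborative dynamic the abstract attributes to MathChat.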