MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education

We present MathVC, the first LLM-powered virtual classroom containing multiple LLM-simulated student characters, with whom a human student can practice their mathematical modeling (MM) skills. Experiments and ablation studies in the paper confirm the effectiveness of the simulation approach and show the promise of MathVC to benefit real-life students in the future.

An example demonstrating the application of MathVC, where students are presented with a math problem (optionally with accompanying data) and engage in discussions on mathematical modeling. Alice, Bob, and Charlie are LLM-simulated student characters. A human student can choose to respond or skip the current turn.

Abstract

Mathematical modeling (MM) is considered a fundamental skill for students in STEM disciplines. Practicing MM is often most effective when students can engage in group discussion and collaborative problem-solving. However, because the teachers and educational resources needed to monitor such group activities are unevenly distributed, students do not always receive equal opportunities for this practice. Meanwhile, large language models (LLMs) have recently demonstrated strong capabilities in both modeling mathematical problems and simulating characters with different traits and properties. Drawing inspiration from these advances, we present MathVC, the first LLM-powered virtual classroom containing multiple LLM-simulated student characters, with whom a human student can practice their MM skills. To encourage each LLM character’s behavior to align with its specified math-relevant properties (termed “characteristic alignment”) and the overall conversational procedure to stay close to an authentic student MM discussion (termed “conversational procedural alignment”), we propose three innovations: integrating MM domain knowledge into the simulation, defining symbolic schemas as the grounding for character simulation, and designing a meta planner at the platform level to drive the conversational procedure. Through experiments and ablation studies, we confirm the effectiveness of our simulation approach and show the promise of MathVC to benefit real-life students in the future.

Updates

The demo and code are coming soon!
2024-04: The arXiv version is online. Please check out our preprint.

Motivations and Challenges

Mathematical modeling (MM) is a critical skill for students pursuing STEM fields, and it is often practiced most effectively through group discussion and collaborative problem-solving. However, orchestrating such discussions and collaborative learning activities can be very challenging for teachers, especially in marginalized communities. At the same time, LLMs show strong capabilities in understanding and solving mathematics problems, and autonomous LLM-based generative agents have shown the capacity to simulate human activities. Observing the pressing need to enhance MM education for resource-limited communities, and inspired by the striking progress of LLMs, we propose MathVC.

We identified two unique challenges in the development of MathVC. The first challenge is characteristic alignment, i.e., aligning an LLM’s character simulation to the authentic characteristics of real human students. The second challenge is conversational procedural alignment, i.e., aligning the overall conversational procedure to an authentic MM discussion among middle-school students.

MathVC Structure

Our system consists of two parts: a meta planner, which organizes the overall conversation and facilitates a smooth multi-stage student discussion, and character simulation, which creates and updates individual student characters.
A key innovation of MathVC lies in the use of symbolic schemas describing how characters form and gradually update their understanding of the mathematics problem. Specifically, we design the Task Schema to describe the elements that are necessary for solving a given MM task and the Character Schema to represent a student character's understanding and modeling plan. In the illustrated example, the LLM changes the correct value 16 to 15 in the task schema to simulate the initial mistake made by the simulated student Charlie.
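To make the two schemas concrete, the following is a minimal Python sketch, not the authors' implementation. The class names mirror the terms above, but every field name (`problem`, `variables`, `plan`), both helper functions, and the placeholder variable `quantity` are hypothetical. In MathVC the LLM itself edits the schema; here a hard-coded override stands in for that edit, reusing the 16-to-15 change described above.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaskSchema:
    """Elements needed to solve one MM task (field names are hypothetical)."""
    problem: str
    variables: Dict[str, float] = field(default_factory=dict)


@dataclass
class CharacterSchema:
    """One character's current understanding and modeling plan (hypothetical layout)."""
    name: str
    variables: Dict[str, float]
    plan: List[str] = field(default_factory=list)


def init_character(task: TaskSchema, name: str, mistakes: Dict[str, float]) -> CharacterSchema:
    """Copy the task's variables, then overwrite selected values to seed a mistake."""
    understood = copy.deepcopy(task.variables)
    for key, wrong_value in mistakes.items():
        if key not in understood:
            raise KeyError(f"unknown task variable: {key}")
        understood[key] = wrong_value
    return CharacterSchema(name=name, variables=understood)


def correct(character: CharacterSchema, task: TaskSchema, key: str) -> None:
    """Restore the true value once a peer has pointed out the error."""
    character.variables[key] = task.variables[key]


# Placeholder task: the real problem text and variable names are not given here.
task = TaskSchema(problem="(placeholder problem text)", variables={"quantity": 16})
charlie = init_character(task, "Charlie", mistakes={"quantity": 15})
```

Keeping the task schema untouched and giving each character its own copy means a mistake seeded for one character cannot leak into another character's understanding, and a later call such as `correct(charlie, task, "quantity")` models the character updating that understanding during the discussion.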

Experiment

Human Evaluation

We compare MathVC with four baselines.

Vanilla Simulation: We directly present the LLM with the math question, the student characteristics, and the dialogue history, and then instruct the LLM to generate responses as a middle-school student.

Domain-Specified Simulation: We augment the vanilla simulation with domain-specific knowledge.

w/ only character schema: We augment the domain-specified simulation with the character schema.
w/ only the meta planner: We augment the domain-specified simulation with the meta planner.
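One way to read the four baselines is as ablations over the three components named in the abstract. The sketch below encodes that reading as feature flags; the `SimulationConfig` class, its flag names, and the `enabled_components` helper are illustrative assumptions, not part of the MathVC codebase.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class SimulationConfig:
    """Which components a simulation variant enables (flag names are hypothetical)."""
    domain_knowledge: bool = False
    character_schema: bool = False
    meta_planner: bool = False


# The four baselines plus the full system, following the descriptions above.
CONFIGS = {
    "Vanilla Simulation": SimulationConfig(),
    "Domain-Specified Simulation": SimulationConfig(domain_knowledge=True),
    "w/ only character schema": SimulationConfig(domain_knowledge=True, character_schema=True),
    "w/ only the meta planner": SimulationConfig(domain_knowledge=True, meta_planner=True),
    "MathVC": SimulationConfig(domain_knowledge=True, character_schema=True, meta_planner=True),
}


def enabled_components(config: SimulationConfig) -> List[str]:
    """Return the names of the components switched on, in declaration order."""
    return [name for name, enabled in asdict(config).items() if enabled]


for label, config in CONFIGS.items():
    print(f"{label}: {enabled_components(config) or ['none']}")
```

Laid out this way, each variant differs from its neighbor by a single flag, which is what lets the evaluation attribute a change in ratings to one component.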

The annotators were instructed to rate each sampled dialogue on the degrees of characteristic alignment and conversational procedural alignment it reflected (1: worst; 4: best).

Method	Characteristic Alignment	Conversational Procedural Alignment
Vanilla Simulation	2.85	2.88
Domain-Specified Simulation	3.30	3.15
w/ only character schema	3.50	3.20
w/ only the meta planner	3.00	3.10
MathVC	3.73	3.53

Takeaways:

Incorporating domain knowledge improves the simulation quality.

Including the character schema enhances characteristic alignment, whereas the meta planner improves alignment only when combined with the character schema; on its own, it scores below the domain-specified simulation on both measures.

MathVC offers the most aligned simulation by harmoniously integrating character schema with meta planner.
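The takeaways can be checked against the table with a little arithmetic. The sketch below hard-codes the reported mean ratings and computes the score differences between systems; the `delta` helper is hypothetical, and only the numbers come from the table.

```python
# Mean ratings from the table: (characteristic, conversational procedural).
SCORES = {
    "Vanilla Simulation": (2.85, 2.88),
    "Domain-Specified Simulation": (3.30, 3.15),
    "w/ only character schema": (3.50, 3.20),
    "w/ only the meta planner": (3.00, 3.10),
    "MathVC": (3.73, 3.53),
}


def delta(system, reference):
    """Score differences (system minus reference), rounded to two decimals."""
    return tuple(round(a - b, 2) for a, b in zip(SCORES[system], SCORES[reference]))


# Domain knowledge helps on both measures.
print(delta("Domain-Specified Simulation", "Vanilla Simulation"))        # (0.45, 0.27)
# The character schema alone adds to both measures.
print(delta("w/ only character schema", "Domain-Specified Simulation"))  # (0.2, 0.05)
# The meta planner alone lowers both measures.
print(delta("w/ only the meta planner", "Domain-Specified Simulation"))  # (-0.3, -0.05)
# Adding the meta planner on top of the character schema helps again.
print(delta("MathVC", "w/ only character schema"))                       # (0.23, 0.33)
```

These are differences of mean ratings on the 1 to 4 scale; the table reports no variance, so the sketch says nothing about statistical significance.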

Case Study

In the vanilla simulation, the discussion bypasses many discussion stages (e.g., establishing the team organization), and Alice, who is characterized as weak at mathematics, does not make any mistakes. The domain-specified simulation improves on this: its responses are shorter, and it simulates how students establish a shared task understanding before planning the problem-solving. However, it still cannot simulate a fully extended, multi-stage conversation (e.g., team organization is still missing). In MathVC, the conversation successfully goes through all stages. It also simulates how Alice initially makes a calculation mistake, which Charlie then corrects. Like a real student, Alice then adjusts her modeling plan based on Charlie’s explanation.

Discussion and Future Work

Our research is motivated by the pressing need to alleviate the dependence on teachers to orchestrate collaborative student learning for mathematical modeling (MM), a critical skill in STEM fields. To the best of our knowledge, MathVC is the first LLM-powered multi-character virtual classroom designed for this purpose. We envision that such a platform can eventually be deployed as effective take-home exercises for students practicing their MM skills. This is particularly important for marginalized communities where teachers and educational resources are limited, and our system has the potential to become a tool to promote educational equity. We also identify a few other ways in which MathVC could assist students in learning. For example, MathVC can reduce the pressure and anxiety students feel about joining peer discussions, thus increasing their opportunities to participate. This is especially beneficial for students who may be reluctant to participate due to shyness, low self-esteem, language barriers, or poor academic performance. Our system thus promotes inclusion, allowing a greater variety of students to participate and increasing the diversity and breadth of participation. Finally, we note that teachers have the flexibility to configure the student characters according to each human student’s background and performance, which can enable personalized learning experiences. In the future, one could also extend our system to automatically optimize the character configuration based on the student’s past performance.

Despite the promise discussed above, several directions remain for future work:

User-friendly interface: Developing a student-friendly interface.

Pilot study: Conducting a pilot study with real human students to obtain more feedback.

Collecting real-life data: Collecting more real-life data from students and fine-tuning LLMs on it to enhance the alignments.

Richer characteristic simulation: Extending our system to simulate richer characteristics, including but not limited to gender, age, name, culture, language, and academic background.

Acknowledgements

We are grateful for support from Microsoft’s Accelerating Foundation Models Research program (https://www.microsoft.com/en-us/research/collaboration/accelerating-foundation-models-research/) for conducting this research. We also appreciate computing resources from the Office of Research Computing (https://orc.gmu.edu) and funding support from the Department of Computer Science at GMU. W. Mifdal was sponsored by the Office of Student Creative Activities and Research (OSCAR) at GMU. We thank Dr. Anthony Eamonn Kelly (GMU, Educational Psychology) and Dr. Melissa A. Gallagher (University of Houston, Department of Curriculum & Instruction) for early discussions about the idea, and students in the GMU NLP group (https://nlp.cs.gmu.edu/) for their thoughtful comments.

BibTeX

If you find our paper helpful, please cite it as follows:

@misc{yue2024mathvc,
      title={MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education},
      author={Murong Yue and Wijdane Mifdal and Yixuan Zhang and Jennifer Suh and Ziyu Yao},
      year={2024},
      eprint={2404.06711},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}