Welcome to My Project Page

Hello again! I've been working on an AI model designed to tackle tough math competition questions from the AI Mathematical Olympiad (AIMO) Progress Prize 2, and I wanted to share some recent discoveries and ideas I've come across along the way.

Why This Matters to Me

I've always believed that advanced math problems offer a great test of an AI's reasoning capabilities. Participating in a Kaggle challenge that sets a target of at least 47 correct solutions out of 50 pushes me to think beyond just final answers. It's a chance to learn how an AI can break down problems, plan each step, and keep its reasoning on track.

Building on TIR Insights

My main starting point has been the NuminaMath TIR approach, which references methods outlined in the ToRA paper. That paper discusses ways to interleave tool usage (running code) with a language model's natural reasoning style. Although the original system aims for agent-based problem solving, I've tried similar ideas in my own setup, with mixed results.
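
To make the idea concrete, here is a minimal sketch of a tool-integrated reasoning loop in the spirit of ToRA. The `generate(prompt)` callable is a hypothetical stand-in for whatever model serves the completions; this illustrates the general pattern, not NuminaMath's actual pipeline.

```python
import re
import subprocess
import sys

FENCE = "`" * 3  # markdown code fence
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)

def run_python(code: str, timeout: int = 10) -> str:
    """Execute a generated snippet in a subprocess and capture its output."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
        return result.stdout if result.returncode == 0 else result.stderr
    except subprocess.TimeoutExpired:
        return "TimeoutError: execution exceeded the limit"

def tir_solve(problem: str, generate, max_rounds: int = 4) -> str:
    """Minimal tool-integrated reasoning loop.

    `generate(prompt) -> str` is a placeholder for any completion call.
    Each round the model reasons and may emit a python code block; we
    run it and append the output so the model can keep going.
    """
    transcript = (
        f"Problem: {problem}\n"
        "Reason step by step. You may write python code blocks; "
        "their printed output will be appended.\n"
    )
    for _ in range(max_rounds):
        reply = generate(transcript)
        transcript += reply
        match = CODE_BLOCK.search(reply)
        if match is None:  # no tool call, so treat the reply as final
            return reply
        output = run_python(match.group(1))
        transcript += f"\noutput:\n{output}\n"
    return transcript
```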

Data, Data, Data

  1. NuminaMath CoT & TIR
    • I started with the NuminaMath CoT dataset and its TIR counterpart. Both are solid sources for problems at or slightly below IMO level.
    • I eventually realized they weren't enough for the hardest problems, so I went looking for more challenging datasets.
  2. Omni Math & AoPS
    • Incorporated difficult samples from Omni Math above a chosen difficulty threshold, along with classic problems from the AoPS dataset (a filtering sketch follows this list).
    • This helped broaden the range of math topics and problem structures.
  3. Synthetic Solutions & Model Diversity
    • I collected solutions from multiple open-source models, such as Marco-o1 and QwQ-32B-Preview.
    • By asking each to solve the same question in different ways, I ended up with a more diverse training set (see the second sketch after this list).
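
For the Omni Math step, the filtering itself is straightforward. Here is a rough sketch; the `KbsdJames/Omni-MATH` dataset id, the split name, and the `difficulty` field are my assumptions about the Hugging Face release and may differ from the actual schema.

```python
from datasets import load_dataset

MIN_DIFFICULTY = 7.0  # keep only the hardest tier; tune to taste

# Dataset id, split, and field names are assumptions about the release.
omni = load_dataset("KbsdJames/Omni-MATH", split="test")
hard = omni.filter(lambda row: float(row["difficulty"]) >= MIN_DIFFICULTY)

print(f"kept {len(hard)} of {len(omni)} problems")
```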
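
And here is the shape of the synthetic-solution collection, with `query_model` as a hypothetical stand-in for whatever inference stack serves the checkpoints. Keeping only solutions whose final boxed answer matches the reference is a cheap way to filter out wrong reasoning before training on it.

```python
import re

# Model names from this post; `query_model` is a stand-in for any
# inference backend that can sample these checkpoints.
MODELS = ["Marco-o1", "QwQ-32B-Preview"]
BOXED = re.compile(r"\\boxed\{([^}]*)\}")

def final_answer(solution: str) -> str | None:
    """Pull the last \\boxed{...} value out of a generated solution."""
    hits = BOXED.findall(solution)
    return hits[-1].strip() if hits else None

def collect_solutions(problem: str, reference: str, query_model, n_samples: int = 4):
    """Sample each model several times and keep only answer-verified solutions."""
    kept = []
    for model in MODELS:
        for _ in range(n_samples):
            prompt = f"Solve, ending with \\boxed{{answer}}:\n{problem}"
            solution = query_model(model, prompt)
            if final_answer(solution) == reference.strip():
                kept.append({"model": model, "solution": solution})
    return kept
```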

Discovering Maisa AI

One interesting find has been Maisa AI. While exploring tool-based approaches, I tested Maisa's system on around 600 sample math problems and was impressed with its consistency. I plan to analyze its accuracy rate more carefully and see whether any of its reasoning patterns can inform my own model's training strategy. In particular, I'm considering ways to learn from one feature of its approach: validating partial solutions automatically.
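
Nothing here describes how Maisa actually validates intermediate steps, so the sketch below is only my guess at the general idea: scan a reasoning trace for purely numeric equalities and re-check each one with sympy, flagging any step that doesn't hold so it can be regenerated.

```python
import re
import sympy as sp

# Matches purely numeric equalities like "12 * 7 = 84".
EQUATION = re.compile(r"([0-9+\-*/^(). ]+)=([0-9+\-*/^(). ]+)")

def check_numeric_steps(reasoning: str) -> list[tuple[str, bool]]:
    """Re-verify numeric equalities found in a reasoning trace.

    Only handles purely numeric steps; symbolic claims are skipped.
    Returns (step, ok) pairs; a False flags a step worth regenerating.
    """
    results = []
    for lhs, rhs in EQUATION.findall(reasoning):
        try:
            diff = sp.sympify(lhs.replace("^", "**")) - sp.sympify(rhs.replace("^", "**"))
            ok = sp.simplify(diff) == 0
        except (sp.SympifyError, SyntaxError):
            continue  # not parseable, so skip rather than judge
        results.append((f"{lhs.strip()} = {rhs.strip()}", ok))
    return results

# Example: the second step is wrong and gets flagged.
trace = "First, 12 * 7 = 84 and then 84 + 9 = 95, so the answer is 95."
print(check_numeric_steps(trace))  # [('12 * 7 = 84', True), ('84 + 9 = 95', False)]
```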

Refining My Approach

Where Everything Stands

Next Steps

  1. Further Tool Experiments: Investigate how to embed a code runner or symbolic solver directly into the workflow (a sketch follows this list).
  2. Performance Analysis: Compare results with solutions from Maisa AI on selected test sets to see where my model might improve.
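
On the symbolic-solver side of item 1, one option is to expose a solver as a named tool alongside the generic code runner. Here is a minimal sketch, assuming a made-up `SOLVE:` convention that the surrounding harness (not shown) would parse and route.

```python
import sympy as sp

def solve_tool(expression: str, variable: str = "x") -> str:
    """A symbolic-solver 'tool': solve expression == 0 for `variable`.

    The model might emit something like `SOLVE: x**2 - 5*x + 6 [x]`
    (a made-up convention), and the harness would route it here
    instead of to the generic code runner.
    """
    var = sp.symbols(variable)
    try:
        roots = sp.solve(sp.sympify(expression), var)
    except (sp.SympifyError, NotImplementedError) as exc:
        return f"error: {exc}"
    return ", ".join(str(r) for r in roots)

# Usage: a quadratic the model could hand off rather than expand by hand.
print(solve_tool("x**2 - 5*x + 6"))  # 2, 3
```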

Thanks for checking out my work! If you're curious about any of these approaches or want to exchange ideas about AI-driven math solutions, feel free to get in touch. I'm excited to keep fine-tuning my system and seeing how far I can push its reasoning skills.