This checks whether an LLM's mathematical output is correct by using Octave to verify it. It did not work as expected with OpenAI's GPT-4o, but the results may differ with other models. (We hope they do.) This is an experiment, and we look forward to people forking it or doing on ...
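The verification idea can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the helper names `build_octave_check`, `run_octave`, and `verify_claim` are hypothetical, the numeric tolerance is an assumption, and the sketch assumes the `octave` command-line program is installed and on `PATH`. It asks Octave to evaluate both sides of a claimed equality and reports whether they agree numerically.

```python
import subprocess


def build_octave_check(lhs: str, rhs: str, tol: float = 1e-9) -> str:
    """Build an Octave one-liner that prints OK when lhs and rhs agree.

    Hypothetical helper: lhs and rhs are Octave expressions given as text,
    for example an expression from the prompt and the LLM's claimed value.
    """
    return (
        f"if abs(({lhs}) - ({rhs})) < {tol:g}, "
        "disp('OK'); else, disp('FAIL'); end"
    )


def run_octave(script: str) -> str:
    """Run the script with the Octave CLI and return its standard output.

    Assumes `octave` is on PATH. The script is executed as code, so
    untrusted LLM output should only be run in a sandboxed environment.
    """
    result = subprocess.run(
        ["octave", "--quiet", "--eval", script],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.stdout


def verify_claim(lhs: str, rhs: str, run=run_octave) -> bool:
    """Return True when Octave reports that both expressions agree."""
    return run(build_octave_check(lhs, rhs)).strip() == "OK"
```

A call such as `verify_claim("sqrt(16) + 1", "5")` would then delegate the comparison to Octave. The `run` parameter is there so that the Octave call can be swapped for a stub when Octave is not available.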
Title MathCheck: A Math Assistant based on a Combination of Computer Algebra Systems and SAT Solvers ...