More from Xena
A month or two ago I wrote this post which expressed my frustration with various issues around private datasets as a way of measuring the mathematical abilities of language models. More generally I was frustrated about the difficulty of being … Continue reading →
Undergraduate mathematicians usually have a hard time defining functions from quotients in Lean, because they have been taught a specific model for quotients in their classes, which is not the model that Lean uses. This post is an attempt to … Continue reading →
So the big news this week is that o3, OpenAI's new language model, got 25% on FrontierMath. Let's start by explaining what this means. Continue reading →
So I'm two months into trying to teach a proof of Fermat's Last Theorem to a computer. We already have one interesting story, which I felt was worth sharing. Continue reading →
More in AI
Don’t believe everything you see
One model appears closer to the center than the rest