More from Xena
A month or two ago I wrote this post which expressed my frustration with various issues around private datasets as a way of measuring the mathematical abilities of language models. More generally I was frustrated about the difficulty of being …
My feed was recently clogged up with news articles reporting that Sam Altman thinks that AGI is here, or will be here next year, or whatever. I will refrain from giving even more air to this nonsense by linking to …
So the big news this week is that o3, OpenAI's new language model, got 25% on FrontierMath. Let's start by explaining what this means.
So I'm two months into trying to teach a proof of Fermat's Last Theorem to a computer. We already have one interesting story, which I felt was worth sharing.
More in AI
Don’t believe everything you see
One model appears closer to the center than the rest