21
So the big news this week is that o3, OpenAI's new language model, got 25% on FrontierMath. Let's start by explaining what this means. Continue reading →
4 months ago

More from Xena

Think of a number: an update

A month or two ago I wrote this post which expressed my frustration with various issues around private datasets as a way of measuring the mathematical abilities of language models. More generally I was frustrated about the difficulty of being … Continue reading →

a month ago 18 votes
What is a quotient?

Undergraduate mathematicians usually have a hard time defining functions from quotients in Lean, because they have been taught a specific model for quotients in their classes, which is not the model that Lean uses. This post is an attempt to … Continue reading →

2 months ago 32 votes
Think of a number.

My feed was recently clogged up with news articles reporting that Sam Altman thinks that AGI is here, or will be here next year, or whatever. I will refrain from giving even more air to this nonsense by linking to … Continue reading →

3 months ago 25 votes
Fermat’s Last Theorem — how it’s going

So I'm two months into trying to teach a proof of Fermat's Last Theorem to a computer. We already have one interesting story, which I felt was worth sharing. Continue reading →

4 months ago 22 votes

More in AI

AI #113: The o3 Era Begins

Enjoy it while it lasts.

8 hours ago 2 votes
OpenAI’s dirty December o3 demo doesn’t readily replicate

Don’t believe everything you see

yesterday 3 votes
o3 Is a Lying Liar

I love o3.

yesterday 3 votes
New Results of State-of-the-art LLMs on 4 Political Orientation Tests

One model appears closer to the center than the rest

2 days ago 4 votes
You Better Mechanize

Or you had better not.

2 days ago 3 votes