A month or two ago I wrote this post which expressed my frustration with various issues around private datasets as a way of measuring the mathematical abilities of language models. More generally I was frustrated about the difficulty of being … Continue reading →
a month ago


More from Xena

What is a quotient?

Undergraduate mathematicians usually have a hard time defining functions from quotients in Lean, because they have been taught a specific model for quotients in their classes, which is not the model that Lean uses. This post is an attempt to … Continue reading →

2 months ago 35 votes
Think of a number.

My feed was recently clogged up with news articles reporting that Sam Altman thinks that AGI is here, or will be here next year, or whatever. I will refrain from giving even more air to this nonsense by linking to … Continue reading →

3 months ago 29 votes
Can AI do maths yet? Thoughts from a mathematician.

So the big news this week is that o3, OpenAI's new language model, got 25% on FrontierMath. Let's start by explaining what this means. Continue reading →

4 months ago 24 votes
Fermat’s Last Theorem — how it’s going

So I'm two months into trying to teach a proof of Fermat's Last Theorem to a computer. We already have one interesting story, which I felt was worth sharing. Continue reading →

4 months ago 25 votes

More in AI

AI #115: The Evil Applications Division

It can be bleak out there, but the candor is very helpful, and you occasionally get a win.

17 hours ago 1 vote
“Everyone is cheating their way through college” with GenAI. Who should bear the costs?

Society is once again left holding the bag

yesterday 1 vote
OpenAI's $3B Bet

Unpacking OpenAI's latest acquisition of Windsurf.

yesterday 1 vote
How projects fail at large tech companies

How do projects fail at large tech companies? As I’ve said many times, failure means executives aren’t happy with how the project turned out. At healthy companies, that typically means that a sensible engineer wouldn’t be happy either, because the project didn’t work or users hated it. But what actually causes the projects to fail? I’ve seen a lot of projects go wrong - both up close and at a distance - in the last ten years. Here are the main reasons why.

Doomed from the start

Lots of projects fail because there’s no way they could possibly have succeeded. In American law, some cases get dismissed at “summary judgment”: even if the plaintiff succeeds in proving everything they aim to prove, it still wouldn’t add up to demonstrating enough illegal activity to win their case. At tech companies, some projects are like that: even if the plan goes off without a hitch, the project is still doomed to fail. Some doomed projects begin with over-ambitious plans. For instance, an executive…

yesterday 1 vote
Help me improve Society's Backend!

Two simple questions to help make Society's Backend better

2 days ago 1 vote