What's Missing From LLM Chatbots: A Sense of Purpose

LLM-based chatbots’ capabilities have been advancing every month. These improvements are mostly measured by benchmarks like MMLU, HumanEval, and MATH (e.g., Claude 3.5 Sonnet, GPT-4o). However, as these benchmarks become more and more saturated, is the user experience improving in proportion to the scores? If we envision a future

10 months ago • 112 votes

More from The Gradient

AGI Is Not Multimodal

"In projecting language back as the model for thought, we lose sight of the tacit embodied understanding that undergirds our intelligence." –Terry Winograd The recent successes of generative AI models have convinced some that AGI is imminent. While these models appear to capture the essence of human

a month ago • 23 votes

Shape, Symmetries, and Structure: The Changing Role of Mathematics in Machine Learning Research

What is the Role of Mathematics in Modern Machine Learning? The past decade has witnessed a shift in how progress is made in machine learning. Research involving carefully designed and mathematically principled architectures results in only marginal improvements while compute-intensive and engineering-first efforts that scale to ever larger training sets

8 months ago • 85 votes

We Need Positive Visions for AI Grounded in Wellbeing

Introduction Imagine yourself a decade ago, jumping directly into the present shock of conversing naturally with an encyclopedic AI that crafts images, writes code, and debates philosophy. Won’t this technology almost certainly transform society — and hasn’t AI’s impact on us so far been

12 months ago • 115 votes

Financial Market Applications of LLMs

The AI revolution drove frenzied investment in both private and public companies and captured the public’s imagination in 2023. Transformational consumer products like ChatGPT are powered by Large Language Models (LLMs) that excel at modeling sequences of tokens that represent words or parts of words [2]. Amazingly, structural

a year ago • 111 votes

More in AI

AI Roundup 129: Personal Superintelligence

August 1, 2025.

yesterday • 5 votes

You're invited to the ML for SWEs Discord server!

I’ve set up a Discord server for paid ML for SWEs subscribers, so you can access all of the curated feeds I use to keep up with AI resources, news, and jobs. It is also a private space for paid subscribers to chat, learn, network, and get feedback from one another.

yesterday • 3 votes

Ads are inevitable in AI, and that's okay

Convergent evolution in LLMs will get us there

5 days ago • 11 votes

The Bitter Lesson versus The Garbage Can

Does process matter? We are about to find out.

5 days ago • 20 votes

The US Should Run Faster on AI Instead of Trying to Trip Up China

Export Controls and Trump's New AI Action Plan

a week ago • 12 votes
