Our mission is to ensure that AGI (Artificial General Intelligence) benefits all of humanity.  Systems that start to point to AGI* are coming into view, and so we think it’s important to understand the moment we are in. AGI is a weakly defined term, but generally speaking we mean it to be a system that can tackle increasingly complex problems, at human level, in many fields. People are tool-builders with an inherent drive to understand and create, which leads to the world getting better for all of us. Each new generation builds upon the discoveries of the generations before to create even more capable tools—electricity, the transistor, the computer, the internet, and soon AGI. Over time, in fits and starts, the steady march of human innovation has brought previously unimaginable levels of prosperity and improvements to almost every aspect of people’s lives. In some sense, AGI is just another tool in this ever-taller scaffolding of human progress we are building together. In another...
a month ago

More from Sam Altman

Reflections

The second birthday of ChatGPT was only a little over a month ago, and now we have transitioned into the next paradigm of models that can do complex reasoning. New years get people in a reflective mood, and I wanted to share some personal thoughts about how it has gone so far, and some of the things I’ve learned along the way. As we get closer to AGI, it feels like an important time to look at the progress of our company. There is still so much to understand, still so much we don’t know, and it’s still so early. But we know a lot more than we did when we started. We started OpenAI almost nine years ago because we believed that AGI was possible, and that it could be the most impactful technology in human history. We wanted to figure out how to build it and make it broadly beneficial; we were excited to try to make our mark on history. Our ambitions were extraordinarily high and so was our belief that the work might benefit society in an equally extraordinary way. At the time, very few people cared, and if they did, it was mostly because they thought we had no chance of success. In 2022, OpenAI was a quiet research lab working on something temporarily called “Chat With GPT-3.5”. (We are much better at research than we are at naming things.) We had been watching people use the playground feature of our API and knew that developers were really enjoying talking to the model. We thought building a demo around that experience would show people something important about the future and help us make our models better and safer. We ended up mercifully calling it ChatGPT instead, and launched it on November 30th of 2022. We always knew, abstractly, that at some point we would hit a tipping point and the AI revolution would get kicked off. But we didn’t know what the moment would be. To our surprise, it turned out to be this. The launch of ChatGPT kicked off a growth curve like nothing we have ever seen—in our company, our industry, and the world broadly. 
We are finally seeing some of the massive upside we have always hoped for from AI, and we can see how much more will come soon. It hasn’t been easy. The road hasn’t been smooth and the right choices haven’t been obvious. In the last two years, we had to build an entire company, almost from scratch, around this new technology. There is no way to train people for this except by doing it, and when the technology category is completely new, there is no one at all who can tell you exactly how it should be done. Building up a company at such high velocity with so little training is a messy process. It’s often two steps forward, one step back (and sometimes, one step forward and two steps back). Mistakes get corrected as you go along, but there aren’t really any handbooks or guideposts when you’re doing original work. Moving at speed in uncharted waters is an incredible experience, but it is also immensely stressful for all the players. Conflicts and misunderstanding abound. These years have been the most rewarding, fun, best, interesting, exhausting, stressful, and—particularly for the last two—unpleasant years of my life so far. The overwhelming feeling is gratitude; I know that someday I’ll be retired at our ranch watching the plants grow, a little bored, and will think back at how cool it was that I got to do the work I dreamed of since I was a little kid. I try to remember that on any given Friday, when seven things go badly wrong by 1 pm. A little over a year ago, on one particular Friday, the main thing that had gone wrong that day was that I got fired by surprise on a video call, and then right after we hung up the board published a blog post about it. I was in a hotel room in Las Vegas. It felt, to a degree that is almost impossible to explain, like a dream gone wrong. Getting fired in public with no warning kicked off a really crazy few hours, and a pretty crazy few days. The “fog of war” was the strangest part. 
None of us were able to get satisfactory answers about what had happened, or why.  The whole event was, in my opinion, a big failure of governance by well-meaning people, myself included. Looking back, I certainly wish I had done things differently, and I’d like to believe I’m a better, more thoughtful leader today than I was a year ago. I also learned the importance of a board with diverse viewpoints and broad experience in managing a complex set of challenges. Good governance requires a lot of trust and credibility. I appreciate the way so many people worked together to build a stronger system of governance for OpenAI that enables us to pursue our mission of ensuring that AGI benefits all of humanity. My biggest takeaway is how much I have to be thankful for and how many people I owe gratitude towards: to everyone who works at OpenAI and has chosen to spend their time and effort going after this dream, to friends who helped us get through the crisis moments, to our partners and customers who supported us and entrusted us to enable their success, and to the people in my life who showed me how much they cared. [1] We all got back to the work in a more cohesive and positive way and I’m very proud of our focus since then. We have done what is easily some of our best research ever. We grew from about 100 million weekly active users to more than 300 million. Most of all, we have continued to put technology out into the world that people genuinely seem to love and that solves real problems. Nine years ago, we really had no idea what we were eventually going to become; even now, we only sort of know. AI development has taken many twists and turns and we expect more in the future. Some of the twists have been joyful; some have been hard. It’s been fun watching a steady stream of research miracles occur, and a lot of naysayers have become true believers. We’ve also seen some colleagues split off and become competitors. 
Teams tend to turn over as they scale, and OpenAI scales really fast. I think some of this is unavoidable—startups usually see a lot of turnover at each new major level of scale, and at OpenAI numbers go up by orders of magnitude every few months. The last two years have been like a decade at a normal company. When any company grows and evolves so fast, interests naturally diverge. And when any company in an important industry is in the lead, lots of people attack it for all sorts of reasons, especially when they are trying to compete with it. Our vision won’t change; our tactics will continue to evolve. For example, when we started we had no idea we would have to build a product company; we thought we were just going to do great research. We also had no idea we would need such a crazy amount of capital. There are new things we have to go build now that we didn’t understand a few years ago, and there will be new things in the future we can barely imagine now.  We are proud of our track-record on research and deployment so far, and are committed to continuing to advance our thinking on safety and benefits sharing. We continue to believe that the best way to make an AI system safe is by iteratively and gradually releasing it into the world, giving society time to adapt and co-evolve with the technology, learning from experience, and continuing to make the technology safer. We believe in the importance of being world leaders on safety and alignment research, and in guiding that research with feedback from real world applications. We are now confident we know how to build AGI as we have traditionally understood it. We believe that, in 2025, we may see the first AI agents “join the workforce” and materially change the output of companies. We continue to believe that iteratively putting great tools in the hands of people leads to great, broadly-distributed outcomes. We are beginning to turn our aim beyond that, to superintelligence in the true sense of the word. 
We love our current products, but we are here for the glorious future. With superintelligence, we can do anything else. Superintelligent tools could massively accelerate scientific discovery and innovation well beyond what we are capable of doing on our own, and in turn massively increase abundance and prosperity. This sounds like science fiction right now, and somewhat crazy to even talk about it. That’s alright—we’ve been there before and we’re OK with being there again. We’re pretty confident that in the next few years, everyone will see what we see, and that the need to act with great care, while still maximizing broad benefit and empowerment, is so important. Given the possibilities of our work, OpenAI cannot be a normal company. How lucky and humbling it is to be able to play a role in this work. (Thanks to Josh Tyrangiel for sort of prompting this. I wish we had had a lot more time.) [1] There were a lot of people who did incredible and gigantic amounts of work to help OpenAI, and me personally, during those few days, but two people stood out from all others. Ron Conway and Brian Chesky went so far above and beyond the call of duty that I’m not even sure how to describe it. I’ve of course heard stories about Ron’s ability and tenaciousness for years and I’ve spent a lot of time with Brian over the past couple of years getting a huge amount of help and advice. But there’s nothing quite like being in the foxhole with people to see what they can really do. I am reasonably confident OpenAI would have fallen apart without their help; they worked around the clock for days until things were done. Although they worked unbelievably hard, they stayed calm and had clear strategic thought and great advice throughout. They stopped me from making several mistakes and made none themselves. They used their vast networks for everything needed and were able to navigate many complex situations. And I’m sure they did a lot of things I don’t know about. 
What I will remember most, though, is their care, compassion, and support. I thought I knew what it looked like to support a founder and a company, and in some small sense I did. But I have never before seen, or even heard of, anything like what these guys did, and now I get more fully why they have the legendary status they do. They are different and both fully deserve their genuinely unique reputations, but they are similar in their remarkable ability to move mountains and help, and in their unwavering commitment in times of need. The tech industry is far better off for having both of them in it. There are others like them; it is an amazingly special thing about our industry and does much more to make it all work than people realize. I look forward to paying it forward. On a more personal note, thanks especially to Ollie for his support that weekend and always; he is incredible in every way and no one could ask for a better partner.

2 months ago 65 votes
GPT-4o

There are two things from our announcement today I wanted to highlight. First, a key part of our mission is to put very capable AI tools in the hands of people for free (or at a great price). I am very proud that we’ve made the best model in the world available for free in ChatGPT, without ads or anything like that.  Our initial conception when we started OpenAI was that we’d create AI and use it to create all sorts of benefits for the world. Instead, it now looks like we’ll create AI and then other people will use it to create all sorts of amazing things that we all benefit from.  We are a business and will find plenty of things to charge for, and that will help us provide free, outstanding AI service to (hopefully) billions of people.  Second, the new voice (and video) mode is the best computer interface I’ve ever used. It feels like AI from the movies; and it’s still a bit surprising to me that it’s real. Getting to human-level response times and expressiveness turns out to be a big change. The original ChatGPT showed a hint of what was possible with language interfaces; this new thing feels viscerally different. It is fast, smart, fun, natural, and helpful. Talking to a computer has never felt really natural for me; now it does. As we add (optional) personalization, access to your information, the ability to take actions on your behalf, and more, I can really see an exciting future where we are able to use computers to do much more than ever before. Finally, huge thanks to the team that poured so much work into making this happen!

10 months ago 147 votes
What I Wish Someone Had Told Me

Optimism, obsession, self-belief, raw horsepower and personal connections are how things get started. Cohesive teams, the right combination of calmness and urgency, and unreasonable commitment are how things get finished. Long-term orientation is in short supply; try not to worry about what people think in the short term, which will get easier over time. It is easier for a team to do a hard thing that really matters than to do an easy thing that doesn’t really matter; audacious ideas motivate people. Incentives are superpowers; set them carefully. Concentrate your resources on a small number of high-conviction bets; this is easy to say but evidently hard to do. You can delete more stuff than you think. Communicate clearly and concisely. Fight bullshit and bureaucracy every time you see it and get other people to fight it too. Do not let the org chart get in the way of people working productively together. Outcomes are what count; don’t let good process excuse bad results. Spend more time recruiting. Take risks on high-potential people with a fast rate of improvement. Look for evidence of getting stuff done in addition to intelligence. Superstars are even more valuable than they seem, but you have to evaluate people on their net impact on the performance of the organization. Fast iteration can make up for a lot; it’s usually ok to be wrong if you iterate quickly. Plans should be measured in decades, execution should be measured in weeks. Don’t fight the business equivalent of the laws of physics. Inspiration is perishable and life goes by fast. Inaction is a particularly insidious type of risk. Scale often has surprising emergent properties. Compounding exponentials are magic. In particular, you really want to build a business that gets a compounding advantage with scale. Get back up and keep going. Working with great people is one of the best parts of life.

a year ago 100 votes
Helion Needs You

Helion has been progressing even faster than I expected and is on pace in 2024 to 1) demonstrate Q > 1 fusion and 2) resolve all questions needed to design a mass-producible fusion generator. The goals of the company are quite ambitious: clean, continuous energy for 1 cent per kilowatt-hour, and the ability to manufacture enough power plants to satisfy the current electrical demand of earth in a ten-year period. If both things happen, it will transform the world. Abundant, clean, and radically inexpensive energy will elevate the quality of life for all of us. Think about how much the cost of energy factors into what we do and use. Also, electricity at this price will allow us to do things like efficiently capture carbon (so although we’ll still rely on gasoline for a while, it’ll be ok). Although Helion’s scientific progress of the past 8 years is phenomenal and necessary, it is not sufficient to rapidly get to this new energy economy. Helion now needs to figure out how to engineer machines that don’t break, how to build a factory and supply chain capable of manufacturing a machine every day, how to work with power grids and governments around the world, and more. The biggest input to the degree and speed of success at the company is now the talent of the people who join the team. Here are a few of the most critical jobs, but please don’t let the lack of a perfect fit deter you from applying.

Electrical Engineer, Low Voltage: https://boards.greenhouse.io/helionenergy/jobs/4044506005
Electrical Engineer, Pulsed Power: https://boards.greenhouse.io/helionenergy/jobs/4044510005
Mechanical Engineer, Generator Systems: https://boards.greenhouse.io/helionenergy/jobs/4044522005
Manager of Mechanical Engineering: https://boards.greenhouse.io/helionenergy/jobs/4044521005

(All current jobs: https://www.helionenergy.com/careers/)

over a year ago 31 votes

More in AI

On (Not) Feeling the AGI

Ben Thompson interviewed Sam Altman recently about building a consumer tech company, and about the history of OpenAI.

21 hours ago 2 votes
Musk, Grok, and “rigorous adherence to truth”

Elon Musk, yesterday: “Rigorous adherence to truth is the only way to build safe AI.”

12 hours ago 1 vote
Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment

We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone. Our goal is to tackle "stop-and-go" waves, those frustrating slowdowns and speedups that usually have no clear cause but lead to congestion and significant energy waste. To train efficient flow-smoothing controllers, we built fast, data-driven simulations that RL agents interact with, learning to maximize energy efficiency while maintaining throughput and operating safely around human drivers. Overall, a small proportion of well-controlled autonomous vehicles (AVs) is enough to significantly improve traffic flow and fuel efficiency for all drivers on the road. Moreover, the trained controllers are designed to be deployable on most modern vehicles, operating in a decentralized manner and relying on standard radar sensors. In our latest paper, we explore the challenges of deploying RL controllers at large scale, from simulation to the field, during this 100-car experiment.

The challenges of phantom jams

Figure: A stop-and-go wave moving backwards through highway traffic.

If you drive, you’ve surely experienced the frustration of stop-and-go waves, those seemingly inexplicable traffic slowdowns that appear out of nowhere and then suddenly clear up. These waves are often caused by small fluctuations in our driving behavior that get amplified through the flow of traffic. We naturally adjust our speed based on the vehicle in front of us. If the gap opens, we speed up to keep up. If they brake, we also slow down. But due to our nonzero reaction time, we might brake just a bit harder than the vehicle in front. The next driver behind us does the same, and this keeps amplifying. Over time, what started as an insignificant slowdown turns into a full stop further back in traffic.
These waves move backward through the traffic stream, leading to significant drops in energy efficiency due to frequent accelerations, accompanied by increased CO2 emissions and accident risk. And this isn’t an isolated phenomenon! These waves are ubiquitous on busy roads when the traffic density exceeds a critical threshold. So how can we address this problem? Traditional approaches like ramp metering and variable speed limits attempt to manage traffic flow, but they often require costly infrastructure and centralized coordination. A more scalable approach is to use AVs, which can dynamically adjust their driving behavior in real time. However, simply inserting AVs among human drivers isn’t enough: they must also drive in a smarter way that makes traffic better for everyone, which is where RL comes in.

Figure: Fundamental diagram of traffic flow. The number of cars on the road (density) affects how much traffic is moving forward (flow). At low density, adding more cars increases flow because more vehicles can pass through. But beyond a critical threshold, cars start blocking each other, leading to congestion, where adding more cars actually slows down overall movement.

Reinforcement learning for wave-smoothing AVs

RL is a powerful control approach where an agent learns to maximize a reward signal through interactions with an environment. The agent collects experience through trial and error, learns from its mistakes, and improves over time. In our case, the environment is a mixed-autonomy traffic scenario, where AVs learn driving strategies to dampen stop-and-go waves and reduce fuel consumption for both themselves and nearby human-driven vehicles. Training these RL agents requires fast simulations with realistic traffic dynamics that can replicate highway stop-and-go behavior.
To achieve this, we leveraged experimental data collected on Interstate 24 (I-24) near Nashville, Tennessee, and used it to build simulations where vehicles replay highway trajectories, creating unstable traffic that AVs driving behind them learn to smooth out.

Figure: Simulation replaying a highway trajectory that exhibits several stop-and-go waves.

We designed the AVs with deployment in mind, ensuring that they can operate using only basic sensor information about themselves and the vehicle in front. The observations consist of the AV’s speed, the speed of the leading vehicle, and the space gap between them. Given these inputs, the RL agent then prescribes either an instantaneous acceleration or a desired speed for the AV. The key advantage of using only these local measurements is that the RL controllers can be deployed on most modern vehicles in a decentralized way, without requiring additional infrastructure.

Reward design

The most challenging part is designing a reward function that, when maximized, aligns with the different objectives that we desire the AVs to achieve:

Wave smoothing: Reduce stop-and-go oscillations.
Energy efficiency: Lower fuel consumption for all vehicles, not just AVs.
Safety: Ensure reasonable following distances and avoid abrupt braking.
Driving comfort: Avoid aggressive accelerations and decelerations.
Adherence to human driving norms: Ensure a “normal” driving behavior that doesn’t make surrounding drivers uncomfortable.

Balancing these objectives together is difficult, as suitable coefficients for each term must be found. For instance, if minimizing fuel consumption dominates the reward, RL AVs learn to come to a stop in the middle of the highway because that is energy optimal. To prevent this, we introduced dynamic minimum and maximum gap thresholds to ensure safe and reasonable behavior while optimizing fuel efficiency.
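To make the trade-off concrete, here is a minimal sketch of a weighted reward with speed-dependent gap bounds. It is an illustration only, not the reward used in the experiment: the weights, the headway values, the linear form of the thresholds, and every function and field name below are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """The three local measurements the article lists as inputs."""
    av_speed: float      # m/s, speed of the AV
    leader_speed: float  # m/s, speed of the leading vehicle
    gap: float           # m, space gap to the leading vehicle


def gap_bounds(av_speed, min_headway_s=1.0, max_headway_s=6.0, min_gap_m=5.0):
    """Hypothetical 'dynamic' thresholds: bounds that grow with speed.

    The time-headway form and the constants are assumptions, not values
    from the experiment.
    """
    lower = max(min_gap_m, min_headway_s * av_speed)
    upper = max(4.0 * min_gap_m, max_headway_s * av_speed)
    return lower, upper


def reward(obs, accel, av_fuel, follower_fuel,
           w_fuel=1.0, w_follower=0.5, w_comfort=0.1, w_gap=1.0):
    """Negative weighted cost; all coefficients are placeholder values."""
    lower, upper = gap_bounds(obs.av_speed)
    # Distance (in meters) by which the gap falls outside the allowed band.
    gap_violation = max(0.0, lower - obs.gap) + max(0.0, obs.gap - upper)
    cost = (
        w_fuel * av_fuel               # energy used by the AV itself
        + w_follower * follower_fuel   # energy used by vehicles behind it
        + w_comfort * accel ** 2       # discourage harsh accelerations
        + w_gap * gap_violation        # keep the gap within the bounds
    )
    return -cost


# Toy comparison with made-up numbers: normal cruising vs. parking far behind.
cruising = reward(Observation(20.0, 20.0, 50.0), accel=0.0,
                  av_fuel=1.0, follower_fuel=1.0)
stopped = reward(Observation(0.0, 20.0, 200.0), accel=0.0,
                 av_fuel=0.0, follower_fuel=0.0)
print(cruising, stopped)
```

With these made-up numbers, an AV that parks 200 m behind its leader scores far worse than one cruising at a normal gap, even though the parked AV burns no fuel. That is the failure mode the gap thresholds are meant to rule out.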
We also penalized the fuel consumption of human-driven vehicles behind the AV to discourage it from learning a selfish behavior that optimizes energy savings for the AV at the expense of surrounding traffic. Overall, we aim to strike a balance between energy savings and having a reasonable and safe driving behavior.

Simulation results

Figure: Illustration of the dynamic minimum and maximum gap thresholds, within which the AV can operate freely to smooth traffic as efficiently as possible.

The typical behavior learned by the AVs is to maintain slightly larger gaps than human drivers, allowing them to absorb upcoming, possibly abrupt, traffic slowdowns more effectively. In simulation, this approach resulted in significant fuel savings of up to 20% across all road users in the most congested scenarios, with fewer than 5% of AVs on the road. And these AVs don’t have to be special vehicles! They can simply be standard consumer cars equipped with a smart adaptive cruise control (ACC), which is what we tested at scale.

Figure: Smoothing behavior of RL AVs. Red: a human trajectory from the dataset. Blue: successive AVs in the platoon, where AV 1 is the closest behind the human trajectory. There are typically between 20 and 25 human vehicles between AVs. Each AV doesn’t slow down as much or accelerate as fast as its leader, leading to decreasing wave amplitude over time and thus energy savings.

100 AV field test: deploying RL at scale

Figure: Our 100 cars parked at our operational center during the experiment week.

Given the promising simulation results, the natural next step was to bridge the gap from simulation to the highway. We took the trained RL controllers and deployed them on 100 vehicles on I-24 during peak traffic hours over several days. This large-scale experiment, which we called the MegaVanderTest, is the largest mixed-autonomy traffic-smoothing experiment ever conducted.
Before deploying RL controllers in the field, we trained and evaluated them extensively in simulation and validated them on the hardware. Overall, the steps towards deployment involved:

Training in data-driven simulations: We used highway traffic data from I-24 to create a training environment with realistic wave dynamics, then validated the trained agent’s performance and robustness in a variety of new traffic scenarios.
Deployment on hardware: After being validated in robotics software, the trained controller is uploaded onto the car and is able to control the set speed of the vehicle. We operate through the vehicle’s on-board cruise control, which acts as a lower-level safety controller.
Modular control framework: One key challenge during the test was not having access to the sensors that provide information about the leading vehicle. To overcome this, the RL controller was integrated into a hierarchical system, the MegaController, which combines a speed planner guide that accounts for downstream traffic conditions, with the RL controller as the final decision maker.
Validation on hardware: The RL agents were designed to operate in an environment where most vehicles were human-driven, requiring robust policies that adapt to unpredictable behavior. We verify this by driving the RL-controlled vehicles on the road under careful human supervision, making changes to the control based on feedback.

Each of the 100 cars is connected to a Raspberry Pi, on which the RL controller (a small neural network) is deployed. The RL controller directly controls the onboard adaptive cruise control (ACC) system, setting its speed and desired following distance.

Once validated, the RL controllers were deployed on 100 cars and driven on I-24 during morning rush hour. Surrounding traffic was unaware of the experiment, ensuring unbiased driver behavior.
Data was collected during the experiment from dozens of overhead cameras placed along the highway, which led to the extraction of millions of individual vehicle trajectories through a computer vision pipeline. Metrics computed on these trajectories indicate a trend of reduced fuel consumption around AVs, as expected from simulation results and previous smaller validation deployments. For instance, we can observe that the closer people are driving behind our AVs, the less fuel they appear to consume on average (which is calculated using a calibrated energy model):

Figure: Average fuel consumption as a function of distance behind the nearest engaged RL-controlled AV in the downstream traffic. As human drivers get further away behind AVs, their average fuel consumption increases.

Another way to measure the impact is to measure the variance of the speeds and accelerations: the lower the variance, the less amplitude the waves should have, which is what we observe from the field test data. Overall, although getting precise measurements from a large amount of camera video data is complicated, we observe a trend of 15 to 20% of energy savings around our controlled cars.

Figure: Data points from all vehicles on the highway over a single day of the experiment, plotted in speed-acceleration space. The cluster to the left of the red line represents congestion, while the one on the right corresponds to free flow. We observe that the congestion cluster is smaller when AVs are present, as measured by computing the area of a soft convex envelope or by fitting a Gaussian kernel.

Final thoughts

The 100-car field operational test was decentralized, with no explicit cooperation or communication between AVs, reflective of current autonomy deployment, and bringing us one step closer to smoother, more energy-efficient highways. Yet, there is still vast potential for improvement.
Scaling up simulations to be faster and more accurate with better human-driving models is crucial for bridging the simulation-to-reality gap. Equipping AVs with additional traffic data, whether through advanced sensors or centralized planning, could further improve the performance of the controllers. For instance, while multi-agent RL is promising for improving cooperative control strategies, it remains an open question how enabling explicit communication between AVs over 5G networks could further improve stability and further mitigate stop-and-go waves. Crucially, our controllers integrate seamlessly with existing adaptive cruise control (ACC) systems, making field deployment feasible at scale. The more vehicles equipped with smart traffic-smoothing control, the fewer waves we’ll see on our roads, meaning less pollution and more fuel savings for everyone!

Many contributors took part in making the MegaVanderTest happen! The full list is available on the CIRCLES project page, along with more details about the project.

Read more: [paper]
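For readers who want to try the variance-based measure from the field-test analysis on their own trajectory data, a minimal sketch follows. The function name, the fixed sampling interval, and the two toy speed traces are assumptions for illustration; the actual analysis ran on camera-derived trajectories and is not reproduced here.

```python
from statistics import pvariance


def wave_intensity(speeds, dt=1.0):
    """Return (speed variance, acceleration variance) for one trajectory.

    `speeds` is a list of speeds in m/s sampled every `dt` seconds.
    Accelerations are estimated by finite differences, which is a
    simplification chosen for this sketch.
    """
    accels = [(b - a) / dt for a, b in zip(speeds, speeds[1:])]
    return pvariance(speeds), pvariance(accels)


# Two made-up traces: steady flow vs. a stop-and-go pattern.
smooth = [20.0, 20.0, 20.0, 20.0, 20.0, 20.0]
wavy = [20.0, 10.0, 20.0, 10.0, 20.0, 10.0]

print(wave_intensity(smooth))
print(wave_intensity(wavy))
```

Lower values on both numbers correspond to weaker waves, so comparing them for traffic near and far from the controlled cars gives a rough, model-free view of the smoothing effect.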

yesterday 4 votes
More on Various AI Action Plans

Last week I covered Anthropic’s relatively strong submission, and OpenAI’s toxic submission. This week I cover several other submissions, and do some follow-up on OpenAI’s entry.

2 days ago 2 votes
Is There a Difference Between Calculation and Computation?

Recently I’ve been producing (for my own amusement) example Curta calculations. One motivation was arguing if a proposed solution method for Dudeney’s digits problem was something that could in fact have been easily executed in 1924. This got me thinking, is there an actual difference between calculation and computation? In […]

3 days ago 5 votes