In theory, one of the main applications for robots should be operating in environments that (for whatever reason) are too dangerous for humans. I say “in theory” because in practice it’s difficult to get robots to do useful stuff in semi-structured or unstructured environments without direct human supervision. This is why there’s been some emphasis recently on teleoperation: Human software teaming up with robot hardware can be a very effective combination. For this combination to work, you need two things. First, an intuitive control system that lets the user embody themselves in the robot to pilot it effectively. And second, a robot that can deliver on the kind of embodiment that the human pilot needs. The second is the more challenging of the two, because humans have very high standards for mobility, strength, and dexterity. But researchers at the Italian Institute of Technology (IIT) have a system that manages to check both boxes, thanks to its enormously powerful quadruped, which now...
a month ago


More from IEEE Spectrum

Video Friday: Meet Mech, a Superhumanoid Robot

Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion.

European Robotics Forum: 25–27 March 2025, STUTTGART, GERMANY
RoboSoft 2025: 23–26 April 2025, LAUSANNE, SWITZERLAND
ICUAS 2025: 14–17 May 2025, CHARLOTTE, NC
ICRA 2025: 19–23 May 2025, ATLANTA, GA
London Humanoids Summit: 29–30 May 2025, LONDON
IEEE RCAR 2025: 1–6 June 2025, TOYAMA, JAPAN
2025 Energy Drone & Robotics Summit: 16–18 June 2025, HOUSTON, TX
RSS 2025: 21–25 June 2025, LOS ANGELES
ETH Robotics Summer School: 21–27 June 2025, GENEVA
IAS 2025: 30 June–4 July 2025, GENOA, ITALY
ICRES 2025: 3–4 July 2025, PORTO, PORTUGAL
IEEE World Haptics: 8–11 July 2025, SUWON, KOREA
IFAC Symposium on Robotics: 15–18 July 2025, PARIS
RoboCup 2025: 15–21 July 2025, BAHIA, BRAZIL

Enjoy today’s videos!

Every time you see a humanoid demo in a warehouse or factory, ask yourself: Would a “superhumanoid” like this actually be a better answer? [ Dexterity ]

The only reason that this is the second video in Video Friday this week, and not the first, is that you’ve almost certainly already seen it. This is a collaboration between the Robotics and AI Institute and Boston Dynamics, and RAI has its own video, which is slightly different. [ Boston Dynamics ] via [ RAI ]

Well, this just looks a little bit like magic. [ University of Pennsylvania Sung Robotics Lab ]

After hours of dance battles with professional choreographers (yes, real human dancers!), PM01 now nails every iconic move from Kung Fu Hustle. [ EngineAI ]

Sanctuary AI has demonstrated industry-leading sim-to-real transfer of learned dexterous manipulation policies for its unique high-degree-of-freedom, high-strength, and high-speed hydraulic hands. [ Sanctuary AI ]

This video is “introducing BotQ, Figure’s new high-volume manufacturing facility for humanoid robots,” but I just see some injection molding and finishing of a few plastic parts. [ Figure ]

DEEP Robotics recently showcased its “One-Touch Navigation” feature, enhancing the intelligent control experience of its robotic dog. The feature offers two modes: map-based point selection and navigation, and video-based point navigation, designed for open terrains and confined spaces, respectively. By simply tapping on a tablet screen or selecting a point in the video feed, the robotic dog can autonomously navigate to the target point, automatically planning its path and intelligently avoiding obstacles, significantly improving traversal efficiency. What’s in the bags, though? [ Deep Robotics ]

This hurts my knees to watch, in a few different ways. [ Unitree ]

Why the recent obsession with two legs when robots could have six instead? So much cuter! [ Jizai ] via [ RobotStart ]

The world must know: who killed Mini-Duck? [ Pollen ]

Seven hours of Digit robots at work at ProMat. And there are two more days of these livestreams if you need more! [ Agility ]

5 days ago 5 votes
AlexNet Source Code Is Now Open Source

In partnership with Google, the Computer History Museum has released the source code to AlexNet, the neural network that in 2012 kickstarted today’s prevailing approach to AI. The source code is available as open source on CHM’s GitHub page.

What Is AlexNet?

AlexNet is an artificial neural network created to recognize the contents of photographic images. It was developed in 2012 by then University of Toronto graduate students Alex Krizhevsky and Ilya Sutskever and their faculty advisor, Geoffrey Hinton.

The Origins of Deep Learning

Hinton is regarded as one of the fathers of deep learning, the type of artificial intelligence that uses neural networks and is the foundation of today’s mainstream AI. Simple three-layer neural networks with only one layer of adaptive weights were first built in the late 1950s, most notably by Cornell researcher Frank Rosenblatt, but they were found to have limitations. [This explainer gives more details on how neural networks work.] In particular, researchers needed networks with more than one layer of adaptive weights, but there wasn’t a good way to train them. By the early 1970s, neural networks had been largely rejected by AI researchers.

Frank Rosenblatt [left, shown with Charles W. Wightman] developed the first artificial neural network, the perceptron, in 1957. Credit: Division of Rare and Manuscript Collections/Cornell University Library

In the 1980s, neural network research was revived outside the AI community by cognitive scientists at the University of California San Diego, under the new name of “connectionism.” After finishing his Ph.D. at the University of Edinburgh in 1978, Hinton had become a postdoctoral fellow at UCSD, where he collaborated with David Rumelhart and Ronald Williams. The three rediscovered the backpropagation algorithm for training neural networks, and in 1986 they published two papers showing that it enabled neural networks to learn multiple layers of features for language and vision tasks. Backpropagation, which is foundational to deep learning today, uses the difference between the current output and the desired output of the network to adjust the weights in each layer, from the output layer backward to the input layer. (A short numeric sketch of this appears below.)

Hinton later moved to the University of Toronto. Away from the centers of traditional AI, his work and that of his graduate students made Toronto a center of deep learning research over the coming decades. One postdoctoral student of Hinton’s was Yann LeCun, now chief scientist at Meta. While working in Toronto, LeCun showed that when backpropagation was used in “convolutional” neural networks, they became very good at recognizing handwritten numbers.

ImageNet and GPUs

Despite these advances, neural networks could not consistently outperform other types of machine learning algorithms. They needed two developments from outside of AI to pave the way. The first was the emergence of vastly larger amounts of data for training, made available through the Web. The second was enough computational power to perform this training, in the form of 3D graphics chips, known as GPUs. By 2012, the time was ripe for AlexNet.

Fei-Fei Li’s ImageNet image dataset, completed in 2009, was pivotal in training AlexNet. Here, Li [right] talks with Tom Kalil at the Computer History Museum. Credit: Douglas Fairbairn/Computer History Museum

The data needed to train AlexNet was found in ImageNet, a project started and led by Stanford professor Fei-Fei Li.
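(An aside for readers who want the mechanics of backpropagation made concrete: below is a minimal two-layer network trained with backpropagation in Python. The data, layer sizes, and learning rate are invented for illustration; this is a sketch of the algorithm, not code from AlexNet.)

```python
# Minimal two-layer network trained with backpropagation (illustrative only).
# The error at the output is pushed back through each layer to adjust weights,
# from the output layer backward to the input layer.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))                   # 64 toy samples, 4 features
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0   # toy binary target

W1 = rng.normal(scale=0.5, size=(4, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

for step in range(500):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)
    out = 1 / (1 + np.exp(-(h @ W2 + b2)))     # sigmoid output
    # Backward pass: gradient of the loss at the output...
    d_out = (out - y) / len(X)
    dW2 = h.T @ d_out; db2 = d_out.sum(0)
    # ...propagated back through the hidden layer's tanh nonlinearity.
    d_h = (d_out @ W2.T) * (1 - h ** 2)
    dW1 = X.T @ d_h; db1 = d_h.sum(0)
    # Gradient descent step on every parameter.
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= 0.5 * g
```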
Beginning in 2006, and against conventional wisdom, Li envisioned a dataset of images covering every noun in the English language. She and her graduate students began collecting images found on the Internet and classifying them using a taxonomy provided by WordNet, a database of words and their relationships to each other. Given the enormity of the task, Li and her collaborators ultimately crowdsourced the labeling of images to gig workers, using Amazon’s Mechanical Turk platform. The ImageNet team launched a competition in 2010 to encourage research teams to improve their image recognition algorithms. But over the next two years, the best systems made only marginal improvements.

Meanwhile, NVIDIA, cofounded by CEO Jensen Huang, had led the way in the 2000s in making GPUs more generalizable and programmable for applications beyond 3D graphics, especially with the CUDA programming system released in 2007. Both ImageNet and CUDA were, like neural networks themselves, fairly niche developments that were waiting for the right circumstances to shine. In 2012, AlexNet brought together these elements (deep neural networks, big datasets, and GPUs) for the first time, with pathbreaking results. Each of these needed the others.

How AlexNet Was Created

By the late 2000s, Hinton’s grad students at the University of Toronto were beginning to use GPUs to train neural networks for both image and speech recognition. Their first successes came in speech recognition, but success in image recognition would point to deep learning as a possible general-purpose solution to AI. One student, Ilya Sutskever, believed that the performance of neural networks would scale with the amount of data available, and the arrival of ImageNet provided the opportunity. In 2011, Sutskever convinced fellow grad student Alex Krizhevsky, who had a keen ability to wring maximum performance out of GPUs, to train a convolutional neural network for ImageNet, with Hinton serving as principal investigator.

AlexNet used NVIDIA GPUs running CUDA code trained on the ImageNet dataset. NVIDIA CEO Jensen Huang was named a 2024 CHM Fellow for his contributions to computer graphics chips and AI. Credit: Douglas Fairbairn/Computer History Museum

Krizhevsky had already written CUDA code for a convolutional neural network using NVIDIA GPUs, called cuda-convnet, trained on the much smaller CIFAR-10 image dataset. He extended cuda-convnet with support for multiple GPUs and other features and retrained it on ImageNet. The training was done on a computer with two NVIDIA cards in Krizhevsky’s bedroom at his parents’ house. Over the course of the next year, he constantly tweaked the network’s parameters and retrained it until it achieved performance superior to its competitors. The network would ultimately be named AlexNet, after Krizhevsky. Geoff Hinton summed up the AlexNet project this way: “Ilya thought we should do it, Alex made it work, and I got the Nobel prize.”

Krizhevsky, Sutskever, and Hinton wrote a paper on AlexNet that was published in the fall of 2012 and presented by Krizhevsky at a computer vision conference in Florence, Italy, in October. Veteran computer vision researchers weren’t convinced, but LeCun, who was at the meeting, pronounced it a turning point for AI. He was right. Before AlexNet, almost none of the leading computer vision papers used neural nets. After it, almost all of them would. In the decade that followed, neural networks went on to synthesize believable human voices, beat champion Go players, and generate artwork, culminating with the release of ChatGPT in November 2022 by OpenAI, a company cofounded by Sutskever.
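For readers who want a concrete feel for what AlexNet computes, here is a compact PyTorch re-creation of the layer stack described in the 2012 paper. This is a modern sketch, not the released cuda-convnet code; the original's local response normalization and two-GPU split are omitted.

```python
# AlexNet-style layer stack (sketch): five convolutional layers with
# overlapping max pooling, followed by three fully connected layers.
import torch
import torch.nn as nn

alexnet = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),
    nn.Linear(4096, 1000),  # one score per ImageNet class
)

# A 227x227 RGB image maps to 1,000 class scores.
print(alexnet(torch.randn(1, 3, 227, 227)).shape)  # torch.Size([1, 1000])
```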
Releasing the AlexNet Source Code

In 2020, I reached out to Krizhevsky to ask about the possibility of allowing CHM to release the AlexNet source code, due to its historical significance. He connected me to Hinton, who was working at Google at the time. Google owned AlexNet, having acquired DNNresearch, the company owned by Hinton, Sutskever, and Krizhevsky. Hinton got the ball rolling by connecting CHM to the right team at Google. CHM worked with the Google team for five years to negotiate the release. The team also helped us identify the specific version of the AlexNet source code to release; there have been many versions of AlexNet over the years. There are other repositories of code called AlexNet on GitHub, but many of these are re-creations based on the famous paper, not the original code. The original source code is now available on CHM’s GitHub page.

This post originally appeared on the blog of the Computer History Museum.

Acknowledgments

Special thanks to Geoffrey Hinton for providing his quote and reviewing the text, to Cade Metz and Alex Krizhevsky for additional clarifications, and to David Bieber and the rest of the team at Google for their work in securing the source code release.

References

Fei-Fei Li, The Worlds I See: Curiosity, Exploration, and Discovery at the Dawn of AI. First edition, Flatiron Books, New York, 2023.

Cade Metz, Genius Makers: The Mavericks Who Brought AI to Google, Facebook, and the World. First edition, Penguin Random House, New York, 2022.

5 days ago 4 votes
IEEE Recognizes Itaipu Dam’s Engineering Achievements

Technology should benefit humanity. One of the most remarkable examples of technology’s potential to provide enduring benefits is the Itaipu Hydroelectric Dam, a massive binational energy project between Brazil and Paraguay. Built on the Paraná River, which forms part of the border between the two nations, Itaipu transformed a once-contested hydroelectric resource into a shared engine of economic progress.

The power plant has held many records. For decades, it was the world’s largest hydroelectric facility; the dam spans the river’s 7.9-kilometer width and reaches a height of 196 meters. Itaipu was also the first hydropower plant to generate more than 100 terawatt-hours of electricity in a year.

To acknowledge Itaipu’s monumental engineering achievement, on 4 March the dam was recognized as an IEEE Milestone during a ceremony in Hernandarias, Paraguay. The ceremony commemorated the project’s impact on engineering and energy production.

Itaipu’s massive scale

By the late 1960s, Brazil and Paraguay recognized the Paraná River’s untapped hydroelectric potential, according to the Global Infrastructure Hub. Brazil, which was undergoing rapid industrialization, sought a stable, renewable energy source to reduce its dependence on fossil fuels. Meanwhile, Paraguay, lacking the financial resources to construct a gigawatt-scale hydroelectric facility independently, entered into a treaty with Brazil in 1973. The agreement granted both countries equal ownership of the dam and its power generation.

Construction began in 1975 and was completed in 1984, costing US $19.6 billion. The scale of the project was staggering: Engineers excavated 50 million cubic meters of earth and rock, poured 12.3 million cubic meters of concrete, and used enough iron and steel to construct 380 Eiffel Towers.

Itaipu was designed for continuous expansion. It initially launched with two 700-megawatt turbine units, providing 1.4 gigawatts of capacity. By 1991, the power plant reached its planned 12.6-GW capacity. In 2006 and 2007, it was expanded to 14 GW with the addition of two more units, for a total of 20.

Although China’s 22.5-GW Three Gorges Dam, on the Yangtze River near the city of Yichang, surpassed Itaipu’s capacity in 2012, the South American dam remains one of the world’s most productive hydroelectric facilities. On average, Itaipu generates around 90 terawatt-hours of electricity annually. It set a record by generating 103.1 TWh in 2016 (surpassed in 2020 by Three Gorges’ 111.8-TWh output). To put 100 TWh into perspective, a power plant would need to burn approximately 50 million tonnes of coal to produce the same amount of energy, according to the U.S. Energy Information Administration.

By harnessing 62,200 cubic meters of river water per second, Itaipu prevents the release of nearly 100 million tonnes of carbon dioxide each year. During its 40-year lifetime, the dam has generated more than 3,000 TWh of electricity, meeting nearly 90 percent of Paraguay’s energy needs and contributing roughly 10 percent of Brazil’s electricity supply. Itaipu’s legacy endures as a testament to the benefits of international cooperation and sustainable energy, and to the power of engineering to shape the future.

IEEE recognition for Itaipu

The IEEE Milestone commemorative plaque, now displayed in the dam’s visitor center, highlights Itaipu’s role as a world leader in hydroelectric power generation. It reads: “Itaipu power plant construction began in 1975 as a joint Brazil-Paraguay venture.
When power generation started in 1984, Itaipu set a world record for the single largest installed hydroelectric capacity (14 GW). For at least three decades, Itaipu produced more electricity annually than any other hydroelectric project. Linking power plants, substations, and transmission lines in both Brazil and Paraguay, Itaipu’s system provided reliable, affordable energy to consumers and industry.” Administered by the IEEE History Center and supported by donors, the Milestone program recognizes outstanding technical developments worldwide. The IEEE Paraguay Section sponsored the nomination.
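As an aside, the article's coal comparison is easy to sanity-check with a back-of-the-envelope calculation. The coal energy content and plant efficiency below are assumed round numbers, not figures from the article or the EIA, but they land in the same ballpark.

```python
# Rough sanity check of "100 TWh is roughly 50 million tonnes of coal".
# Assumptions (invented): coal at ~24 MJ/kg thermal, ~35% plant efficiency.
energy_j = 100e12 * 3600        # 100 TWh in joules (1 Wh = 3600 J)
thermal_j = energy_j / 0.35     # fuel energy needed at 35% efficiency
coal_kg = thermal_j / 24e6      # divide by ~24 MJ per kg of coal
print(f"~{coal_kg / 1e9:.0f} million tonnes of coal")  # ~43 million tonnes
```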

6 days ago 5 votes
Squirrels Inspire Leaping Strategy for Salto Robot

When you see a squirrel jump to a branch, you might think (and I myself thought, up until just now) that they’re doing what birds and primates would do to stick the landing: just grabbing the branch and hanging on. But it turns out that squirrels, being squirrels, don’t actually have prehensile hands or feet, meaning that they can’t grasp things with any significant amount of strength. Instead, they manage to land on branches using a “palmar” grasp, which isn’t really a grasp at all, in the sense that there’s not much grabbing going on. It’s more accurate to say that the squirrel is mostly landing on its palms and then balancing, which is very impressive.

This kind of dynamic stability is a trait that squirrels share with one of our favorite robots: Salto. Salto is a jumper too, and it’s about as non-prehensile as it’s possible to get, having just one limb with basically no grip strength at all. The robot is great at bouncing around on the ground, but vertical mobility would add an entirely new dimension that could lead to some potentially interesting applications, including environmental scouting, search and rescue, and disaster relief. In a paper published today in Science Robotics, roboticists have now taught Salto to leap from one branch to another like squirrels do, using a low-torque gripper and relying on its balancing skills instead.

Squirrel Landing Techniques in Robotics

While we’re going to be mostly talking about robots here (because that’s what we do), there’s an entire paper by many of the same robotics researchers, published in late February in the Journal of Experimental Biology, about how squirrels land on branches this way. While you’d think that the researchers might have found some domesticated squirrels for this, they actually spent about a month bribing wild squirrels on the UC Berkeley campus to bounce around some instrumented perches while high-speed cameras were rolling.

Squirrels aim for perfectly balanced landings, which allow them to immediately jump again. They don’t always get it quite right, of course, and they’re excellent at recovering from branch landings where they go a little bit over or under where they want to be. The research showed how squirrels use their musculoskeletal system to adjust their body position, dynamically absorbing the impact of landing with their forelimbs and altering their mass distribution to turn near misses into successful perches.

It’s these kinds of skills that Salto really needs to be able to usefully make jumps in the real world. When everything goes exactly the way it’s supposed to, jumping and perching is easy, but that almost never happens, and the squirrel research shows how important it is to be able to adapt when things go wonky. It’s not like the little robot has a lot of degrees of freedom to work with: it’s got just one leg, just one foot, a couple of thrusters, and that spinning component which, believe it or not, functions as a tail. And yet, Salto manages to (sometimes!) make it work. Those balanced upright landings are super impressive, although we should mention that Salto only achieved that level of success in two out of 30 trials. It actually fell off the perch only five times; the rest of the time, it managed a landing but then didn’t quite balance, either overshooting or undershooting the branch.
There are some mechanical reasons why this is particularly difficult for Salto. For example, having just one leg to use for both jumping and landing means that the robot’s leg has to be rotated mid-jump. This takes time, and it causes Salto to jump more vertically than squirrels do, since squirrels jump with their back legs and land with their front legs.

Based on these tests, the researchers identified four key features for balanced landings that apply to robots (and squirrels):

1. Power and accuracy are important!
2. It’s easier to land a shallower jump with a more horizontal trajectory.
3. Being able to squish down close to the branch helps with balancing.
4. Responsive actuation is also important!

Of these, Salto is great at the first one, very much not great at the second one, and also not great at the third and fourth ones. So in some sense, it’s amazing that the roboticists have been able to get it to do this branch-to-branch jumping as well as they have.

There’s plenty more to do, though. Squirrels aren’t the only arboreal jumpers out there, and there’s likely more to learn from other animals: Salto was originally inspired by the galago (also known as the bush baby), although those are more difficult to find on the UC Berkeley campus. And while the researchers don’t mention it, the obvious extension to this work is to chain together multiple jumps, and eventually to combine branch jumping with the ground jumping and wall jumping that Salto can already do, to really give those squirrels a jump for their nuts.

a week ago 6 votes
Video Friday: Exploring Phobos

Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion.

European Robotics Forum: 25–27 March 2025, STUTTGART, GERMANY
RoboSoft 2025: 23–26 April 2025, LAUSANNE, SWITZERLAND
ICUAS 2025: 14–17 May 2025, CHARLOTTE, NC
ICRA 2025: 19–23 May 2025, ATLANTA, GA
London Humanoids Summit: 29–30 May 2025, LONDON
IEEE RCAR 2025: 1–6 June 2025, TOYAMA, JAPAN
2025 Energy Drone & Robotics Summit: 16–18 June 2025, HOUSTON, TX
RSS 2025: 21–25 June 2025, LOS ANGELES
ETH Robotics Summer School: 21–27 June 2025, GENEVA
IAS 2025: 30 June–4 July 2025, GENOA, ITALY
ICRES 2025: 3–4 July 2025, PORTO, PORTUGAL
IEEE World Haptics: 8–11 July 2025, SUWON, KOREA
IFAC Symposium on Robotics: 15–18 July 2025, PARIS
RoboCup 2025: 15–21 July 2025, BAHIA, BRAZIL

Enjoy today’s videos!

In 2026, a JAXA spacecraft is heading to the Martian moon Phobos to chuck a little rover at it. [ DLR ]

Happy International Women’s Day! UBTECH humanoid robots Walker S1 deliver flowers to incredible women and wish all women a day filled with love, joy, and empowerment. [ UBTECH ]

TRON 1 demonstrates multi-terrain mobility as a versatile biped mobility platform, empowering innovators to push the boundaries of robotic locomotion and unlocking limitless possibilities in algorithm validation and advanced application development. [ LimX Dynamics ]

This is indeed a very fluid running gait, and the flip is also impressive, but I’m wondering what sort of actual value these skills add, you know? Or even what kind of potential value they’re leading up to. [ EngineAI ]

Designing trajectories for manipulation through contact is challenging, as it requires reasoning about object and robot trajectories as well as complex contact sequences simultaneously. In this paper, we present a novel framework for efficiently and simultaneously designing trajectories of robots, objects, and contacts for contact-rich manipulation. [ Paper ] via [ Mitsubishi Electric Research Laboratories ] Thanks, Yuki!

Running robot, you say? I’m thinking it might actually be a power-walking robot. [ MagicLab ]

Wake up, Reachy! [ Pollen ]

Robot vacuum docks have gotten large enough that we’re now all supposed to pretend that we’re happy they’ve become pieces of furniture. [ Roborock ]

The SeaPerch underwater robot, a “do-it-yourself” maker project, is a popular educational tool for middle and high school students. Developed by MIT Sea Grant, the remotely operated vehicle (ROV) teaches hand-fabrication processes, electronics techniques, and STEM concepts, while encouraging exploration of structures, electronics, and underwater dynamics. [ MIT Sea Grant ]

I was at this RoboGames match! In 2010! And now I feel old! [ Hardcore Robotics ]

Daniel Simu with a detailed breakdown of his circus acrobat partner robot. If you don’t want to watch the whole thing, make sure to check out 3:30. [ Daniel Simu ]

a week ago 8 votes

More in AI

On (Not) Feeling the AGI

Ben Thompson interviewed Sam Altman recently about building a consumer tech company, and about the history of OpenAI.

20 hours ago 2 votes
Musk, Grok, and “rigorous adherence to truth”

Elon Musk, yesterday: “Rigorous adherence to truth is the only way to build safe AI.”

11 hours ago 1 vote
Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment

We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone. Our goal is to tackle “stop-and-go” waves, those frustrating slowdowns and speedups that usually have no clear cause but lead to congestion and significant energy waste. To train efficient flow-smoothing controllers, we built fast, data-driven simulations that RL agents interact with, learning to maximize energy efficiency while maintaining throughput and operating safely around human drivers. Overall, a small proportion of well-controlled autonomous vehicles (AVs) is enough to significantly improve traffic flow and fuel efficiency for all drivers on the road. Moreover, the trained controllers are designed to be deployable on most modern vehicles, operating in a decentralized manner and relying on standard radar sensors. In our latest paper, we explore the challenges of deploying RL controllers at large scale, from simulation to the field, during this 100-car experiment.

The challenges of phantom jams

A stop-and-go wave moving backwards through highway traffic.

If you drive, you’ve surely experienced the frustration of stop-and-go waves, those seemingly inexplicable traffic slowdowns that appear out of nowhere and then suddenly clear up. These waves are often caused by small fluctuations in our driving behavior that get amplified through the flow of traffic. We naturally adjust our speed based on the vehicle in front of us: if the gap opens, we speed up to keep up; if they brake, we also slow down. But due to our nonzero reaction time, we might brake just a bit harder than the vehicle in front. The next driver behind us does the same, and this keeps amplifying. Over time, what started as an insignificant slowdown turns into a full stop further back in traffic (the toy simulation below makes this mechanism concrete). These waves move backward through the traffic stream, leading to significant drops in energy efficiency due to frequent accelerations, accompanied by increased CO2 emissions and accident risk. And this isn’t an isolated phenomenon! These waves are ubiquitous on busy roads when the traffic density exceeds a critical threshold.

So how can we address this problem? Traditional approaches like ramp metering and variable speed limits attempt to manage traffic flow, but they often require costly infrastructure and centralized coordination. A more scalable approach is to use AVs, which can dynamically adjust their driving behavior in real time. However, simply inserting AVs among human drivers isn’t enough: they must also drive in a smarter way that makes traffic better for everyone, which is where RL comes in.

Fundamental diagram of traffic flow: the number of cars on the road (density) affects how much traffic is moving forward (flow). At low density, adding more cars increases flow because more vehicles can pass through. But beyond a critical threshold, cars start blocking each other, leading to congestion, where adding more cars actually slows down overall movement.

Reinforcement learning for wave-smoothing AVs

RL is a powerful control approach where an agent learns to maximize a reward signal through interactions with an environment. The agent collects experience through trial and error, learns from its mistakes, and improves over time.
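Before looking at the RL setup, here is the toy simulation promised above: a minimal follow-the-leader model of how a small braking event amplifies into a stop-and-go wave. All numbers (delay, gain, speeds) are invented for illustration; this is a sketch of the mechanism, not the simulator used in the paper.

```python
# Toy model: each driver copies the leader's speed change after a reaction
# delay, slightly amplified (braking "just a bit harder"). The dip deepens
# down the platoon until cars far back come to a full stop.
import numpy as np

n_cars, n_steps = 15, 600     # 15 cars, 60 s at 0.1 s per step
delay = 10                    # ~1 s reaction time, in steps
gain = 1.2                    # overreaction: 20% harder than the leader

v = np.full((n_steps, n_cars), 30.0)  # speeds in m/s

for t in range(1, n_steps):
    # The lead car taps its brakes between t = 5 s and t = 10 s.
    v[t, 0] = 26.0 if 50 <= t < 100 else 30.0
    for i in range(1, n_cars):
        td = max(t - delay, 1)
        leader_change = v[td, i - 1] - v[td - 1, i - 1]
        v[t, i] = np.clip(v[t - 1, i] + gain * leader_change, 0.0, 35.0)

for i in range(n_cars):
    print(f"car {i:2d}: min speed {v[:, i].min():5.1f} m/s")
```

Running this shows the lead car's 4 m/s dip growing geometrically down the platoon; the last few cars stop entirely, even though the lead car never dropped below 26 m/s.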
In our case, the environment is a mixed-autonomy traffic scenario, where AVs learn driving strategies to dampen stop-and-go waves and reduce fuel consumption for both themselves and nearby human-driven vehicles. Training these RL agents requires fast simulations with realistic traffic dynamics that can replicate highway stop-and-go behavior. To achieve this, we leveraged experimental data collected on Interstate 24 (I-24) near Nashville, Tennessee, and used it to build simulations where vehicles replay highway trajectories, creating unstable traffic that AVs driving behind them learn to smooth out.

Simulation replaying a highway trajectory that exhibits several stop-and-go waves.

We designed the AVs with deployment in mind, ensuring that they can operate using only basic sensor information about themselves and the vehicle in front. The observations consist of the AV’s speed, the speed of the leading vehicle, and the space gap between them. Given these inputs, the RL agent then prescribes either an instantaneous acceleration or a desired speed for the AV. The key advantage of using only these local measurements is that the RL controllers can be deployed on most modern vehicles in a decentralized way, without requiring additional infrastructure.

Reward design

The most challenging part is designing a reward function that, when maximized, aligns with the different objectives that we desire the AVs to achieve:

Wave smoothing: Reduce stop-and-go oscillations.
Energy efficiency: Lower fuel consumption for all vehicles, not just AVs.
Safety: Ensure reasonable following distances and avoid abrupt braking.
Driving comfort: Avoid aggressive accelerations and decelerations.
Adherence to human driving norms: Ensure “normal” driving behavior that doesn’t make surrounding drivers uncomfortable.

Balancing these objectives is difficult, as suitable coefficients for each term must be found. For instance, if minimizing fuel consumption dominates the reward, RL AVs learn to come to a stop in the middle of the highway because that is energy optimal. To prevent this, we introduced dynamic minimum and maximum gap thresholds to ensure safe and reasonable behavior while optimizing fuel efficiency. We also penalized the fuel consumption of human-driven vehicles behind the AV to discourage it from learning a selfish behavior that optimizes energy savings for the AV at the expense of surrounding traffic. Overall, we aim to strike a balance between energy savings and reasonable, safe driving behavior. (A toy sketch of such a reward appears below.)

Simulation results

Illustration of the dynamic minimum and maximum gap thresholds, within which the AV can operate freely to smooth traffic as efficiently as possible.

The typical behavior learned by the AVs is to maintain slightly larger gaps than human drivers, allowing them to absorb upcoming, possibly abrupt, traffic slowdowns more effectively. In simulation, this approach resulted in significant fuel savings of up to 20 percent across all road users in the most congested scenarios, with fewer than 5 percent of AVs on the road. And these AVs don’t have to be special vehicles! They can simply be standard consumer cars equipped with a smart adaptive cruise control (ACC), which is what we tested at scale.

Smoothing behavior of RL AVs. Red: a human trajectory from the dataset. Blue: successive AVs in the platoon, where AV 1 is the closest behind the human trajectory. There are typically between 20 and 25 human vehicles between AVs.
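Here is the toy reward sketch referenced above, showing one way the competing terms might be combined. All names, coefficients, and thresholds are invented for illustration; the actual reward used in the paper differs.

```python
# Hypothetical per-timestep reward for one AV, combining the objectives
# listed above (energy, comfort, safety/norms via gap thresholds).
def reward(accel, gap, min_gap, max_gap, fuel_av, fuel_behind):
    r = 0.0
    r -= 1.0 * fuel_av        # energy: the AV's own fuel consumption
    r -= 0.5 * fuel_behind    # fuel of human drivers behind the AV,
                              # discouraging selfish behavior
    r -= 0.1 * accel ** 2     # comfort: penalize harsh accel/braking
    if gap < min_gap or gap > max_gap:
        r -= 10.0             # stay within the dynamic gap thresholds
    return r

print(reward(accel=0.3, gap=35.0, min_gap=10.0, max_gap=60.0,
             fuel_av=1.2, fuel_behind=5.0))
```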
As the figure shows, each AV doesn’t slow down as much or accelerate as fast as its leader, leading to decreasing wave amplitude over time and thus energy savings.

100 AV field test: deploying RL at scale

Our 100 cars parked at our operational center during the experiment week.

Given the promising simulation results, the natural next step was to bridge the gap from simulation to the highway. We took the trained RL controllers and deployed them on 100 vehicles on the I-24 during peak traffic hours over several days. This large-scale experiment, which we called the MegaVanderTest, is the largest mixed-autonomy traffic-smoothing experiment ever conducted.

Before deploying RL controllers in the field, we trained and evaluated them extensively in simulation and validated them on hardware. Overall, the steps toward deployment involved:

Training in data-driven simulations: We used highway traffic data from I-24 to create a training environment with realistic wave dynamics, then validated the trained agent’s performance and robustness in a variety of new traffic scenarios.

Deployment on hardware: After being validated in robotics software, the trained controller is uploaded onto the car and is able to control the set speed of the vehicle. We operate through the vehicle’s onboard cruise control, which acts as a lower-level safety controller.

Modular control framework: One key challenge during the test was not having sensor access to information about the leading vehicle. To overcome this, the RL controller was integrated into a hierarchical system, the MegaController, which combines a speed planner that accounts for downstream traffic conditions with the RL controller as the final decision maker.

Validation on hardware: The RL agents were designed to operate in an environment where most vehicles were human-driven, requiring robust policies that adapt to unpredictable behavior. We verified this by driving the RL-controlled vehicles on the road under careful human supervision, making changes to the control based on feedback.

Each of the 100 cars is connected to a Raspberry Pi, on which the RL controller (a small neural network) is deployed. The RL controller directly controls the onboard adaptive cruise control (ACC) system, setting its speed and desired following distance.

Once validated, the RL controllers were deployed on 100 cars and driven on I-24 during morning rush hour. Surrounding traffic was unaware of the experiment, ensuring unbiased driver behavior. Data was collected during the experiment from dozens of overhead cameras placed along the highway; a computer vision pipeline extracted millions of individual vehicle trajectories from the footage. Metrics computed on these trajectories indicate a trend of reduced fuel consumption around AVs, as expected from simulation results and previous smaller validation deployments. For instance, we can observe that the closer people are driving behind our AVs, the less fuel they appear to consume on average (calculated using a calibrated energy model):

Average fuel consumption as a function of distance behind the nearest engaged RL-controlled AV in the downstream traffic. As human drivers get further behind AVs, their average fuel consumption increases.

Another way to measure the impact is to measure the variance of the speeds and accelerations: the lower the variance, the less amplitude the waves should have, which is what we observe from the field test data.
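As a sketch of what such a variance metric looks like in practice (the trajectories here are synthetic, not the experiment's data):

```python
# Variance-based smoothness metrics for a vehicle speed trajectory.
# Smoother traffic means lower variance of speeds and accelerations.
import numpy as np

def wave_metrics(speeds, dt=0.1):
    accels = np.diff(speeds) / dt
    return {"speed_var": float(np.var(speeds)),
            "accel_var": float(np.var(accels))}

t = np.arange(0, 60, 0.1)
wavy = 25 + 5 * np.sin(0.5 * t)      # large stop-and-go oscillations
smoothed = 25 + 1 * np.sin(0.5 * t)  # damped oscillations
print("wavy:    ", wave_metrics(wavy))
print("smoothed:", wave_metrics(smoothed))
```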
Overall, although getting precise measurements from a large amount of camera video data is complicated, we observe a trend of 15 to 20 percent energy savings around our controlled cars.

Data points from all vehicles on the highway over a single day of the experiment, plotted in speed-acceleration space. The cluster to the left of the red line represents congestion, while the one on the right corresponds to free flow. We observe that the congestion cluster is smaller when AVs are present, as measured by computing the area of a soft convex envelope or by fitting a Gaussian kernel.

Final thoughts

The 100-car field operational test was decentralized, with no explicit cooperation or communication between AVs, reflective of current autonomy deployment, bringing us one step closer to smoother, more energy-efficient highways. Yet there is still vast potential for improvement. Scaling up simulations to be faster and more accurate, with better human-driving models, is crucial for bridging the simulation-to-reality gap. Equipping AVs with additional traffic data, whether through advanced sensors or centralized planning, could further improve the performance of the controllers. For instance, while multi-agent RL is promising for improving cooperative control strategies, it remains an open question how enabling explicit communication between AVs over 5G networks could further improve stability and mitigate stop-and-go waves. Crucially, our controllers integrate seamlessly with existing adaptive cruise control (ACC) systems, making field deployment feasible at scale. The more vehicles equipped with smart traffic-smoothing control, the fewer waves we’ll see on our roads, meaning less pollution and fuel savings for everyone!

Many contributors took part in making the MegaVanderTest happen! The full list is available on the CIRCLES project page, along with more details about the project.

Read more: [paper]

yesterday 4 votes
More on Various AI Action Plans

Last week I covered Anthropic’s relatively strong submission, and OpenAI’s toxic submission. This week I cover several other submissions, and do some follow-up on OpenAI’s entry.

2 days ago 2 votes
Is There a Difference Between Calculation and Computation?

Recently I’ve been producing (for my own amusement) example Curta calculations. One motivation was a debate over whether a proposed solution method for Dudeney’s digits problem was something that could in fact have been easily executed in 1924. This got me thinking: is there an actual difference between calculation and computation? In […]

3 days ago 5 votes