It was 8 May 1945, Victory in Europe Day. With the German military’s unconditional surrender, the European part of World War II came to an end. Alan Turing and his assistant Donald Bayley celebrated victory in their quiet English way, by taking a long walk together. They had been working side by side for more than a year in a secret electronics laboratory, deep in the English countryside. Bayley, a young electrical engineer, knew little about his boss’s other life as a code breaker, only that Turing would set off on his bicycle every now and then to another secret establishment about 10 miles away along rural lanes, Bletchley Park. As Bayley and the rest of the world would later learn, Bletchley Park was the headquarters of a vast, unprecedented code-breaking operation.

Donald Bayley (1921-2020) graduated with a degree in electrical engineering and was commissioned into the Royal Electrical and Mechanical Engineers. There, he was selected to work with Alan Turing on the Delilah project. In later life he designed the teletypewriter-based “Piccolo” system for secret diplomatic radio communications, adopted by the British Foreign and Commonwealth Office and used worldwide for decades. Credit: Bonhams

“That was the end of that conversation,” Bayley recalled 67 years later.

Turing’s incredible code-breaking work is no longer secret. What’s more, he is renowned both as a founding father of computer science and as a pioneering figure in artificial intelligence. He is not so well-known, however, for his work in electrical engineering. This may be about to change.

In November 2023, a large cache of his wartime papers—nicknamed the “Bayley papers”—was auctioned in London for almost half a million U.S. dollars. The previously unknown cache contains many sheets in Turing’s own handwriting, telling of his top-secret “Delilah” engineering project from 1943 to 1945. Delilah was Turing’s portable voice-encryption system, named after the biblical deceiver of men.
There is also material written by Bayley, often in the form of notes he took while Turing was speaking. It is thanks to Bayley that the papers survived: He kept them until he died in 2020, 66 years after Turing passed away.

When the British Government learned about the sale of these papers at auction, it acted swiftly to put a ban on their export, declaring them to be “an important part of our national story,” and saying “It is right that a UK buyer has the opportunity to purchase these papers.” I was lucky enough to get access to the collection prior to the November sale, when the auction house asked for my assistance in identifying some of the technical material. The Bayley papers shine new light on Turing the engineer.

Alan Turing’s Delilah Project

During the war, Turing realized that cryptology’s new frontier was going to be the encryption of speech. The existing wartime cipher machines—such as the Japanese “Purple” machine, the British Typex, and the Germans’ famous Enigma and teletypewriter-based SZ42—were all for encrypting typewritten text. Text, though, is scarcely the most convenient way for commanders to communicate, and secure voice communication was on the military wish list.

The SIGSALY speech-encryption system was constructed in New York City, under a U.S. Army contract, during 1942 and 1943. It was gigantic, weighing over 50,000 kilograms and filling a room. Turing was familiar with SIGSALY and wanted to miniaturize speech encryption. The result, Delilah, consisted of three small units, each roughly the size of a shoebox. Weighing just 39 kg, including its power pack, Delilah would be at home in a truck, a trench, or a large backpack.

Bell Labs’ top secret installation of the SIGSALY voice-encryption system was a room-size machine that weighed over 50,000 kilograms. Credit: NSA

In 1943, Turing set up bench space in a Nissen hut and worked on Delilah in secret. The hut was at Hanslope Park, a military-run establishment in the middle of nowhere, England.
Today, Hanslope Park is still an ultrasecret intelligence site known as His Majesty’s Government Communications Centre. In the Turing tradition, HMGCC engineers supply today’s British intelligence agents with specialized hardware and software.

Turing seems to have enjoyed the two years he spent at Hanslope Park working on Delilah. He made an old cottage his home and took meals in the Army mess. The commanding officer recalled that he “soon settled down and became one of us.” In 1944, Turing acquired his young assistant, Bayley, who had recently graduated from the University of Birmingham with a bachelor’s degree in electrical engineering. The two became good friends, working together on Delilah until the autumn of 1945. Bayley called Turing simply “Prof,” as everyone did in the Bletchley-Hanslope orbit.

“I admired the originality of his mind,” Bayley told me when I interviewed him in the 1990s. “He taught me a great deal, for which I have always been grateful.”

In return, Bayley taught Turing bench skills. When he first arrived at Hanslope Park, Bayley found Turing wiring together circuits that resembled a “spider’s nest,” he said. He took Turing firmly by the hand and dragged him through breadboarding bootcamp.

Alan Turing and his assistant Donald Bayley created this working prototype of their voice-encryption system, called Delilah. Credit: The National Archives, London

A year later, as the European war ground to a close, Turing and Bayley got a prototype system up and running. This “did all that could be expected of it,” Bayley said. He described the Delilah system as “one of the first to be based on rigorous cryptographic principles.”

How Turing’s Voice-Encryption System Worked

Turing drew inspiration for the voice-encryption system from existing cipher machines for text. Teletypewriter-based cipher machines such as the Germans’ sophisticated SZ42—broken by Turing and his colleagues at Bletchley Park—worked differently from the better known Enigma machine.
Enigma was usually used for messages transmitted over radio in Morse code. It encrypted the letters A through Z by lighting up corresponding letters on a panel, called the lampboard, whose electrical connections with the keyboard were continually changing. The SZ42, by contrast, was attached to a regular teletypewriter that used a 5-bit telegraph code and could handle not just letters, but also numbers and a range of punctuation. Morse code was not involved. (This 5-bit telegraph code was a forerunner of ASCII and Unicode and is still used by some ham radio operators.)

The Delilah voice-encryption machine contained a key unit that generated the pseudorandom numbers used to obscure messages. This blueprint of the key unit features 8 multivibrators (labeled “v1,” “v2,” and so forth). Credit: The National Archives, London

Inside the SZ42, the key was produced by a key generator, consisting of a system of 12 wheels. As the wheels turned, they churned out a continual stream of seemingly random characters. The wheels in the receiver’s machine were synchronized with the sender’s, and so produced the same characters—Y/RABV8WOUJL/H9VF3JX/D5Z in our example. The receiving machine subtracted the key from the incoming ciphertext PNTDOOLLHANC9OAND9NK9CK5, revealing the plaintext ANGREIFEN9UM9NUL9NUL9UHR (a space was always typed as “9”).

In a now declassified report, Turing and Bayley commented that the problem of synchronizing the two key generators had presented them with “formidable difficulties.” But they overcame these and other problems, and eventually demonstrated Delilah using a recording of a speech given by Winston Churchill, successfully encrypting, transmitting, and decrypting it.

This loose-leaf sheet shows a circuit used by Turing in an experiment to measure the cut-off voltage at a triode tube, most likely in connection with the avalanche effect basic to a multivibrator. Multivibrators were an essential component of Delilah’s key-generation module.
Credit: Bonhams

The encryption-decryption process began with discretizing the audio signal, which today we’d call analog-to-digital conversion. This produced a sequence of individual numbers, each corresponding to the signal’s voltage at a particular point in time. Then numbers from Delilah’s key were added to these numbers. During the addition, any digits that needed to be carried over to the next column were left out of the calculation—called “noncarrying” addition, this helped scramble the message. The resulting sequence of numbers was the encrypted form of the speech signal. This was transmitted automatically to a second Delilah at the receiving end. The receiving Delilah subtracted the key from the incoming transmission, and then converted the resulting numbers to voltages to reproduce the original speech.

But the war was winding down, and the military was not attracted to the system. Work on the Delilah project stopped not long after the war ended, when Turing was hired by the British National Physical Laboratory to design and develop an electronic computer. Delilah “had little potential for further development,” Bayley said, and “was soon forgotten.” Yet it offered a very high level of security, and was the first successful demonstration of a compact portable device for voice encryption.

Turing’s Lab Notebook

The two years Turing spent on Delilah produced the Bayley papers. The papers comprise a laboratory notebook, a considerable quantity of loose sheets (some organized into bundles), and—the jewel of the collection—a looseleaf ring binder bulging with pages.

multivibrator, which is a circuit that can be triggered to produce a single voltage pulse or a chain of pulses. In the experiment, the pulse was fed into an oscilloscope and its shape examined.
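The “noncarrying” addition described in the encryption-decryption passage above can be illustrated with a short Python sketch. This is only a toy model under stated assumptions, not a reconstruction of Delilah's circuitry: the decimal base, the three-digit sample width, the sample values, and the use of Python's `random` module as a stand-in key source are all hypothetical choices made for illustration.

```python
import random

BASE = 10   # assumed digit base; the article does not specify one
WIDTH = 3   # assumed number of digits per sample (hypothetical)


def to_digits(n, width=WIDTH, base=BASE):
    """Split a non-negative integer into `width` digits, most significant first."""
    digits = []
    for _ in range(width):
        digits.append(n % base)
        n //= base
    return digits[::-1]


def from_digits(digits, base=BASE):
    """Reassemble a list of digits (most significant first) into an integer."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


def noncarrying_add(a, b):
    """Add column by column, discarding any carry into the next column."""
    return from_digits([(x + y) % BASE for x, y in zip(to_digits(a), to_digits(b))])


def noncarrying_sub(a, b):
    """Inverse of noncarrying_add: subtract column by column without borrowing."""
    return from_digits([(x - y) % BASE for x, y in zip(to_digits(a), to_digits(b))])


def encrypt(samples, key):
    """Sender: add one key number to each sampled value."""
    return [noncarrying_add(s, k) for s, k in zip(samples, key)]


def decrypt(cipher, key):
    """Receiver: subtract the same key numbers to recover the samples."""
    return [noncarrying_sub(c, k) for c, k in zip(cipher, key)]


# Hypothetical sampled values standing in for digitized speech.
samples = [512, 498, 7, 999, 640]

# Stand-in key stream; Delilah's real key came from its wheel-and-multivibrator unit.
rng = random.Random(1945)
key = [rng.randrange(BASE ** WIDTH) for _ in samples]

cipher = encrypt(samples, key)
recovered = decrypt(cipher, key)
```

Because each column is handled independently, subtracting the same key digit by digit undoes the addition exactly, which is why sender and receiver need identical, synchronized key streams.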
Multivibrators were crucial components of Turing’s all-important key generator, and the next page of the notebook, labeled “Measurement of ‘Heaviside function,’ ” shows the voltages measured in part of the same multivibrator circuit.

A key item in the Bayley papers is this lab notebook, whose first 24 pages are in Turing’s handwriting. These detail Turing’s work on the Delilah project prior to Bayley’s arrival in March 1944. Credit: Bonhams

Today, there is intense interest in the use of multivibrators in cryptography. Turing’s key generator, the most original part of Delilah, contained eight multivibrator circuits, along with the five-wheel assembly mentioned previously. In effect the multivibrators were eight more very complicated “wheels,” and there was additional circuitry for enhancing the random appearance of the numbers the multivibrators produced.

The Bandwidth Theorem

Two loose pages, in Turing’s handwriting, explain the so-called bandwidth theorem, now known as the Nyquist-Shannon sampling theorem. This was likely written out for Bayley’s benefit. Credit: Bonhams

Turing’s proof of the theorem is scrawled over two sheets. Most probably he wrote the proof out for Bayley’s benefit. The theorem—which expresses what the sampling rate needs to be if sound waves are to be reproduced accurately—governed Delilah’s conversion of sound waves into numbers, done by sampling vocal frequencies several thousand times a second.

At Bell Labs, Claude Shannon had written a paper sketching previous work on the theorem and then proving his own formulation of it. Shannon wrote the paper in 1940, although it was not published until 1949. Turing worked at Bell Labs for a time in 1943, in connection with SIGSALY, before returning to England and embarking on Delilah. It seems likely that he and Shannon would have discussed sampling rates.

Turing’s “Red Form” Notes

During the war, Hanslope Park housed a large radio-monitoring section.
Shifts of operators continuously searched the airwaves for enemy messages. Enigma transmissions, in Morse code, were identified by their stereotypical military format, while the distinctive warble of the SZ42’s radioteletype signals was instantly recognizable. After latching onto a transmission, an operator filled out an Army-issue form (preprinted in bright red ink). The frequency, the time of interception, and the letters of ciphertext were noted down. This “red form” was then rushed to the code breakers at Bletchley Park.

Writing paper was in short supply in wartime Britain, and Turing used the blank reverse sides of these “red form” sheets, designed for radio operators to note down information about intercepted signals. Credit: Bonhams

Writing paper was in short supply in wartime Britain. Turing evidently helped himself to large handfuls of red forms, scrawling out screeds of notes about Delilah on the blank reverse sides. In one bundle of red forms, numbered by Turing at the corners, he considered a resistance-capacitance network into which a “pulse of area A at time 0” is input. He calculated the charge as the pulse passes through the network, and then calculated the “output volts with pulse of that area.” The following sheets are covered with integral equations involving time, resistance, and charge. Then a scribbled diagram appears, in which a wavelike pulse is analyzed into discrete “steps”—a prelude to several pages of Fourier-type analysis. Turing appended a proof of what he termed the “Fourier theorem,” evidence that these pages may have been a tutorial for Bayley.

Turing’s Lectures for Electrical Engineers

The cover of the looseleaf ring binder is embossed in gilt letters “Queen Mary’s School, Walsall,” where Bayley had once been a pupil. It is crammed with handwritten notes taken by Bayley during a series of evening lectures that Turing gave at Hanslope Park.
The size of Turing’s audience is unknown, but there were numerous young engineers like Bayley at Hanslope.

Turing’s Lectures on Advanced Mathematics for Electrical Engineers. Running to 180 pages, they are the most extensive noncryptographic work by Turing currently known, vying in length with his 1940 write-up about Enigma and the Bombe, affectionately known at Bletchley Park as “Prof’s Book.”

Scientific American ran an article by the legendary computer scientist and AI pioneer John McCarthy, in which he stated that Turing’s work did not play “any direct role in the labors of the men who made the computer a reality.” It was a common view at the time.

A binder filled with Bayley’s notes of Turing’s lectures is the jewel of the recently sold document collection. Credit: Bonhams

As we now know, though, after the war Turing himself designed an electronic computer, called the Automatic Computing Engine, or ACE. What’s more, he designed the programming system for the Manchester University “Baby” computer, as well as the hardware for its punched-tape input/output. Baby came to life in mid-1948. Although small, it was the first truly stored-program electronic computer. Two years later, the prototype of Turing’s ACE ran its first program. The prototype was later commercialized as the English Electric DEUCE (Digital Electronic Universal Computing Engine). Dozens of DEUCEs were purchased—big sales in those days—and so Turing’s computer became a major workhorse during the first decades of the Digital Age.

Turing’s lecture notes are in effect a textbook, terse and selective, on advanced math for circuit engineers, although now very out-of-date, of course. Turing’s knowledge of practical electronics was probably inferior to his assistant’s, initially anyway, since Bayley had studied the subject at university and then was involved with radar before his transfer to Hanslope Park. When it came to the mathematical side of things, however, the situation was very different.
The Bayley papers demonstrate the maturity of Turing’s knowledge of the mathematics of electrical circuit design—knowledge that was essential to the success of the Delilah project. report for the Bonhams auction house.
As IEEE Spectrum reported at the time, it was “the motleyest assortment of vehicles assembled in one place since the filming of Mad Max 2: The Road Warrior.” Not a single entrant made it across the finish line. Some didn’t make it out of the parking lot.

So it’s all the more remarkable that in the second DARPA Grand Challenge, just a year and a half later, five vehicles crossed the finish line. Stanley, developed by the Stanford Racing Team, eked out a first-place win to claim the $2 million purse. This modified Volkswagen Touareg completed the 212-kilometer course in 6 hours, 54 minutes. Carnegie Mellon’s Sandstorm and H1ghlander took second and third place, respectively, with times of 7:05 and 7:14.

So how did the Grand Challenge go from a total bust to having five robust finishers in such a short period of time? It’s definitely a testament to what can be accomplished when engineers rise to a challenge. But the outcome of this one race was preceded by a much longer path of research, and that plus a little bit of luck are what ultimately led to victory.

Before Stanley, there was Minerva

Let’s back up to 1998, when computer scientist Sebastian Thrun was working at Carnegie Mellon and experimenting with a very different robot: a museum tour guide. For two weeks in the summer, Minerva, which looked a bit like a Dalek from “Doctor Who,” navigated an exhibit at the Smithsonian National Museum of American History. Its main task was to roll around and dispense nuggets of information about the displays.

Minerva was a museum tour-guide robot developed by Sebastian Thrun.

In an interview at the time, Thrun acknowledged that Minerva was there to entertain. But Minerva wasn’t just a people pleaser; it was also a machine learning experiment. It had to learn where it could safely maneuver without taking out a visitor or a priceless artifact. Visitor, nonvisitor; display case, not-display case; open floor, not-open floor.
It had to react to humans crossing in front of it in unpredictable ways. It had to learn to “see.”

Fast-forward five years: Thrun transferred to Stanford in July 2003. Inspired by the first Grand Challenge, he organized the Stanford Racing Team with the aim of fielding a robotic car in the second competition.

team’s paper.) A remote-control kill switch, which DARPA required on all vehicles, would deactivate the car before it could become a danger. About 100,000 lines of code did that and much more.

Many of the other 2004 competitors regrouped to try again, and new ones entered the fray. In all, 195 teams applied to compete in the 2005 event. Teams included students, academics, industry experts, and hobbyists.

In the early hours of 8 October, the finalists gathered for the big race. Each team had a staggered start time to help avoid congestion along the route. About two hours before a team’s start, DARPA gave them a CD containing approximately 3,000 GPS coordinates representing the course. Once the team hit go, it was hands off: The car had to drive itself without any human intervention. PBS’s NOVA produced an excellent episode on the 2004 and 2005 Grand Challenges that I highly recommend if you want to get a feel for the excitement, anticipation, disappointment, and triumph.

In the 2005 Grand Challenge, Carnegie Mellon University’s H1ghlander was one of five autonomous cars to finish the race. Credit: Damian Dovarganes/AP

H1ghlander held the pole position, having placed first in the qualifying rounds, followed by Stanley and Sandstorm. H1ghlander pulled ahead early and soon had a substantial lead. That’s where luck, or rather the lack of it, came in.

What went wrong with H1ghlander remained a mystery, even after extensive postrace analysis.
It wasn’t until 12 years after the race—and once again with a bit of luck—that CMU discovered the problem: Pressing on a small electronic filter between the engine control module and the fuel injector caused the engine to lose power and even turn off. Team members speculated that an accident a few weeks before the competition had damaged the filter. (To learn more about how CMU finally figured this out, see Spectrum Senior Editor Evan Ackerman’s 2017 story.)

The Legacy of the DARPA Grand Challenge

Regardless of who won the Grand Challenge, many success stories came out of the contest. A year and a half after the race, Thrun had already made great progress on adaptive cruise control and lane-keeping assistance, which are now readily available on many commercial vehicles. He then worked on Google’s Street View and its initial self-driving cars. CMU’s Red Team worked with NASA to develop rovers for potentially exploring the moon or distant planets. Closer to home, they helped develop self-propelled harvesters for the agricultural sector.

Stanford team leader Sebastian Thrun holds a $2 million check, the prize for winning the 2005 Grand Challenge. Credit: Damian Dovarganes/AP

Of course, there was also a lot of hype, which tended to overshadow the race’s militaristic origins—remember, the “D” in DARPA stands for “defense.” Back in 2000, a defense authorization bill had stipulated that one-third of the U.S. ground combat vehicles be “unmanned” by 2015, and DARPA conceived of the Grand Challenge to spur development of these autonomous vehicles. The U.S. military was still fighting in the Middle East, and DARPA promoters believed self-driving vehicles would help minimize casualties, particularly those caused by improvised explosive devices.
2007 Urban Challenge, in which vehicles navigated a simulated city and suburban environment; the 2012 Robotics Challenge for disaster-response robots; and the 2022 Subterranean Challenge for—you guessed it—robots that could get around underground. Despite the competitions, continued military conflicts, and hefty government contracts, actual advances in autonomous military vehicles and robots did not take off to the extent desired. As of 2023, robotic ground vehicles made up only 3 percent of the global armored-vehicle market.

Much of the contemporary reporting on the Grand Challenge predicted that self-driving cars would take us closer to a “Jetsons” future, with a self-driving vehicle to ferry you around. But two decades after Stanley, the rollout of civilian autonomous cars has been confined to specific applications, such as Waymo robotaxis transporting people around San Francisco or the GrubHub Starships struggling to deliver food across my campus at the University of South Carolina.

A Tale of Two Stanleys

Not long after the 2005 race, Stanley was ready to retire. Recalling his experience testing Minerva at the National Museum of American History, Thrun thought the museum would make a nice home. He loaned it to the museum in 2006, and since 2008 it has resided permanently in the museum’s collections, alongside other remarkable specimens in robotics and automobiles. In fact, it isn’t even the first Stanley in the collection.

Stanley now resides in the collections of the Smithsonian Institution’s National Museum of American History, which also houses another Stanley—this 1910 Stanley Runabout. Credit: Behring Center/National Museum of American History/Smithsonian Institution

That distinction belongs to a 1910 Stanley Runabout, an early steam-powered car introduced at a time when it wasn’t yet clear that the internal-combustion engine was the way to go.
Despite clear drawbacks—steam engines had a nasty tendency to explode—“Stanley steamers” were known for their fine craftsmanship. Fred Marriott set the land speed record while driving a Stanley in 1906. It clocked in at 205.5 kilometers per hour, which was significantly faster than the 21st-century Stanley’s average speed of 30.7 km/h. To be fair, Marriott’s Stanley was racing over a flat, straight course rather than the off-road terrain navigated by Thrun’s Stanley.

Part of a continuing series looking at historical artifacts that embrace the boundless potential of technology.

An abridged version of this article appears in the February 2025 print issue as “Slow and Steady Wins the Race.”

References

Sebastian Thrun and his colleagues at the Stanford Artificial Intelligence Laboratory, along with members of the other groups that sponsored Stanley, published “Stanley: The Robot That Won the DARPA Grand Challenge.” This paper, from the Journal of Field Robotics, explains the vehicle’s development.

The NOVA PBS episode “The Great Robot Race” provides interviews and video footage from both the failed first Grand Challenge and the successful second one. I personally liked the side story of GhostRider, an autonomous motorcycle that competed in both competitions but didn’t quite cut it. (GhostRider also now resides in the Smithsonian’s collection.)

Smithsonian curator Carlene Stephens kindly talked with me about how she collected Stanley for the National Museum of American History and where she sees artifacts like this fitting into the stream of history.
Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. Please send us your events for inclusion.

RoboCup German Open: 12–16 March 2025, NUREMBERG, GERMANY
German Robotics Conference: 13–15 March 2025, NUREMBERG, GERMANY
European Robotics Forum: 25–27 March 2025, STUTTGART, GERMANY
RoboSoft 2025: 23–26 April 2025, LAUSANNE, SWITZERLAND
ICUAS 2025: 14–17 May 2025, CHARLOTTE, NC
ICRA 2025: 19–23 May 2025, ATLANTA, GA
London Humanoids Summit: 29–30 May 2025, LONDON
IEEE RCAR 2025: 1–6 June 2025, TOYAMA, JAPAN
2025 Energy Drone & Robotics Summit: 16–18 June 2025, HOUSTON, TX
RSS 2025: 21–25 June 2025, LOS ANGELES

Enjoy today’s videos!

This video about ‘foster’ Aibos helping kids at a children’s hospital is well worth turning on auto-translated subtitles for.
[ Aibo Foster Program ]

Hello everyone, let me introduce myself again. I am Unitree H1 “Fuxi”. I am now a comedian at the Spring Festival Gala, hoping to bring joy to everyone. Let’s push boundaries every day and shape the future together.
[ Unitree ]

Happy Chinese New Year from PNDbotics!
[ PNDbotics ]

In celebration of the upcoming Year of the Snake, TRON 1 swishes into three little lions, eager to spread hope, courage, and strength to everyone in 2025. Wishing you a Happy Chinese New Year and all the best, TRON TRON TRON!
[ LimX Dynamics ]

Designing planners and controllers for contact-rich manipulation is extremely challenging as contact violates the smoothness conditions that many gradient-based controller synthesis tools assume. We introduce natural baselines for leveraging contact smoothing to compute (a) open-loop plans robust to uncertain conditions and/or dynamics, and (b) feedback gains to stabilize around open-loop plans.
Mr. Bucket is my favorite.
[ Mitsubishi Electric Research Laboratories ]
Thanks, Yuki!
What do you get when you put three aliens in a robotaxi? The first-ever Zoox commercial! We hope you have as much fun watching it as we had creating it and can’t wait for you to experience your first ride in the not-too-distant future. [ Zoox ] The Humanoids Summit at the Computer History Museum in December was successful enough (either because of or in spite of my active participation) that it’s not only happening again in 2025, there’s also going to be a spring version of the conference in London in May! [ Humanoids Summit ] I’m not sure it’ll ever be practical at scale, but I do really like JSK’s musculoskeletal humanoid work. [ Paper ] In November 2024, as part of the CRS-31 mission, flight controllers remotely maneuvered Canadarm2 and Dextre to extract a payload from the SpaceX Dragon cargo ship’s trunk (CRS-31) and install it on the International Space Station. This animation was developed in preparation for the operation and shows just how complex robotic tasks can be. [ Canadian Space Agency ] Staci Americas, a third-party logistics provider, addressed its inventory challenges by implementing the Corvus One™ Autonomous Inventory Management System in its Georgia and New Jersey facilities. The system uses autonomous drones for nightly, lights-out inventory scans, identifying discrepancies and improving workflow efficiency. [ Corvus Robotics ] Thanks, Joan! I would have said that this controller was too small to be manipulated with a pinch grasp. I would be wrong. [ Pollen ] How does NASA plan to use resources on the surface of the Moon? One method is the ISRU Pilot Excavator, or IPEx! Designed by Kennedy Space Center’s Swamp Works team, the primary goal of IPEx is to dig up lunar soil, known as regolith, and transport it across the Moon’s surface. [ NASA ] The TBS Mojito is an advanced forward-swept FPV flying wing platform that delivers unmatched efficiency and flight endurance. 
By focusing relentlessly on minimizing drag, the wing reaches speeds upwards of 200 km/h (125 mph), while cruising at 90-120 km/h (60-75 mph) with minimal power consumption. [ Team BlackSheep ] At Zoox, safety is more than a priority—it’s foundational to our mission and one of the core reasons we exist. Our System Design & Mission Assurance (SDMA) team is responsible for building the framework for safe autonomous driving. Our Co-Founder and CTO, Jesse Levinson, and Senior Director of System Design and Mission Assurance (SDMA), Qi Hommes, hosted a LinkedIn Live to provide an insider’s overview of the teams responsible for developing the metrics that ensure our technology is safe for deployment on public roads. [ Zoox ]
AI-generated voices that can mimic every vocal nuance and tic of human speech, down to specific regional accents. And with just a few seconds of audio, AI can now clone someone’s specific voice. AI agents will make calls on our behalf, conversing with others in natural language. All of that is happening, and will be commonplace soon.

You can’t just label AI-generated speech. It will come in many different forms. So we need a way to recognize AI that works no matter the modality. It needs to work for long or short snippets of audio, even just a second long. It needs to work for any language, and in any cultural context. At the same time, we shouldn’t constrain the underlying system’s sophistication or language complexity.

We have a simple proposal: all talking AIs and robots should use a ring modulator.

In the mid-twentieth century, before it was easy to create actual robotic-sounding speech synthetically, ring modulators were used to make actors’ voices sound robotic. Over the last few decades, we have become accustomed to robotic voices, simply because text-to-speech systems were good enough to produce intelligible speech that was not human-like in its sound. Now we can use that same technology to make synthetic speech that is indistinguishable from human speech sound robotic again.

Responsible AI companies that provide voice synthesis or AI voice assistants in any form should add a ring modulator of some standard frequency (say, between 30 and 80 Hz) and of a minimum amplitude (say, 20 percent). That’s it. People will catch on quickly.

Here are a few clips that illustrate what we’re suggesting. The first clip is an AI-generated “podcast” of this article made by Google’s NotebookLM featuring two AI “hosts.” Google’s NotebookLM created the podcast script and audio given only the text of this article.
The next two clips feature that same podcast with the AIs’ voices modulated more and less subtly by a ring modulator:

Raw audio sample generated by Google’s NotebookLM
Audio sample with added ring modulator (30 Hz, 25%)
Audio sample with added ring modulator (30 Hz, 40%)

We were able to generate the audio effect with a 50-line Python script generated by Anthropic’s Claude. Among the best-known robot voices were those of the Daleks from Doctor Who in the 1960s. Back then robot voices were difficult to synthesize, so the audio was actually an actor’s voice run through a ring modulator. It was set to around 30 Hz, as we did in our example, with different modulation depth (amplitude) depending on how strong the robotic effect is meant to be. Our expectation is that the AI industry will test and converge on a good balance of such parameters and settings, and will use better tools than a 50-line Python script, but this highlights how simple it is to achieve.

We don’t expect scammers to follow our proposal: They’ll find a way no matter what. But that’s always true of security standards, and a rising tide lifts all boats. We think the bulk of the uses will be with popular voice APIs from major companies, and everyone should know that they’re talking with a robot.
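To make the proposal concrete, here is a minimal Python sketch of a ring-modulator effect. It is not the script the authors used, which is not reproduced in this article. The function name `ring_modulate`, the dry/wet mixing formula, and the test tone are assumptions chosen for illustration; only the 30 Hz carrier and the 25 percent depth come from the audio samples described above.

```python
import math


def ring_modulate(samples, sample_rate, carrier_hz=30.0, depth=0.25):
    """Blend a signal with a ring-modulated copy of itself.

    Ring modulation multiplies each input sample by a sine-wave carrier.
    `depth` (0.0 to 1.0) is an assumed mixing parameter: 0.0 returns the
    input unchanged, 1.0 returns the fully ring-modulated signal.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    out = []
    for n, x in enumerate(samples):
        # Carrier value at the time of sample n.
        carrier = math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        # Mix the dry signal with the ring-modulated (x * carrier) signal.
        out.append((1.0 - depth) * x + depth * x * carrier)
    return out


# Hypothetical input: one second of a 220 Hz tone sampled at 8 kHz,
# standing in for a snippet of synthesized speech.
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 220.0 * n / sample_rate) for n in range(sample_rate)]

robotic = ring_modulate(tone, sample_rate, carrier_hz=30.0, depth=0.25)
```

In practice the input would be decoded speech audio rather than a pure tone, and the carrier frequency and depth would be whatever values the industry settles on.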