In the early 1960s, Douglas Engelbart started investigating how computers could augment human intelligence: "If, in your office, you as an intellectual worker were supplied with a computer display backed up by a computer that was alive for you all day and was instantly responsive to every action you had, how much value could you derive from that?" Engelbart developed many features of modern computing that we now take for granted: the mouse,1 hypertext, shared documents, windows, and a graphical user interface. At the 1968 Joint Computer Conference, Engelbart demonstrated these innovations in a groundbreaking presentation, now known as "The Mother of All Demos."

The keyset with my prototype USB interface.

Engelbart's demo also featured an input device known as the keyset, but unlike his other innovations, the keyset failed to catch on. The 5-finger keyset lets you type without moving your hand, entering characters by pressing multiple keys simultaneously as a chord. Christina Engelbart, his daughter, loaned one of Engelbart's keysets to me. I constructed an interface to connect the keyset to USB, so that it can be used with a modern computer. The video below shows me typing with the keyset, using the mouse buttons to select upper case and special characters.2

I wrote this blog post to describe my USB keyset interface. Along the way, however, I got sidetracked by the history of The Mother of All Demos and how it obtained that name. It turns out that Engelbart's demo isn't the first demo to be called "The Mother of All Demos".

Engelbart and The Mother of All Demos

Engelbart's work has its roots in Vannevar Bush's 1945 visionary essay, "As We May Think." Bush envisioned thinking machines, along with the "memex", a compact machine holding a library of collective knowledge with hypertext-style links: "The Encyclopedia Britannica could be reduced to the volume of a matchbox." The memex could search out information based on associative search, building up a hypertext-like trail of connections. In the early 1960s, Engelbart was inspired by Bush's essay and set out to develop means to augment human intellect: "increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems."3 Engelbart founded the Augmentation Research Center at the Stanford Research Institute (now SRI), where he and his team created a system called NLS (oN-Line System).

Engelbart editing a hierarchical shopping list.

In 1968, Engelbart demonstrated NLS to a crowd of two thousand people at the Fall Joint Computer Conference. Engelbart gave the demo from the stage, wearing a crisp shirt and tie and a headset microphone. Engelbart created hierarchical documents, such as the shopping list above, and moved around them with hyperlinks. He demonstrated how text could be created, moved, and edited with the keyset and mouse. Other documents included graphics, crude line drawings by today's standards but cutting-edge for the time. The computer's output was projected onto a giant screen, along with video of Engelbart.

Engelbart using the keyset to edit text. Note that the display doesn't support lowercase text; instead, uppercase is indicated by a line above the character. Adapted from The Mother of All Demos.

Engelbart sat at a specially-designed Herman Miller desk6 that held the keyset, keyboard, and mouse, shown above.
While Engelbart was on stage in San Francisco, the SDS 940 computer4 that ran the NLS software was 30 miles to the south in Menlo Park.5 To the modern eye, the demo resembles a PowerPoint presentation over Zoom, as Engelbart collaborated with Jeff Rulifson and Bill Paxton, miles away in Menlo Park. (Just like a modern Zoom call, the remote connection started with "We're not hearing you. How about now?") Jeff Rulifson browsed the NLS code, jumping between code files with hyperlinks and expanding subroutines by clicking on them. NLS was written in custom high-level languages, which they developed with a "compiler compiler" called TREE-META. The NLS system held interactive documentation as well as tracking bugs and changes. Bill Paxton interactively drew a diagram and then demonstrated how NLS could be used as a database, retrieving information by searching on keywords. (Although Engelbart was stressed by the live demo, Paxton told me that he was "too young and inexperienced to be concerned.")

Bill Paxton, in Menlo Park, communicating with the conference in San Francisco.

Bill English, an electrical engineer, not only built the first mouse for Engelbart but was also the hardware mastermind behind the demo. In San Francisco, the screen images were projected on a 20-foot screen by a Volkswagen-sized Eidophor projector, bouncing light off a modulated oil film. Numerous cameras, video switchers and mixers created the video image. Two leased microwave links and half a dozen antennas connected SRI in Menlo Park to the demo in San Francisco. High-speed modems sent the mouse, keyset, and keyboard signals from the demo back to SRI. Bill English spent months assembling the hardware and network for the demo and then managed the demo behind the scenes, assisted by a team of about 17 people.

Another participant was the famed counterculturist Stewart Brand, known for the Whole Earth Catalog and the WELL, one of the oldest online virtual communities. Brand advised Engelbart on the presentation, as well as running a camera. He'd often point the camera at a monitor to generate swirling psychedelic feedback patterns, reminiscent of the LSD that he and Engelbart had experimented with.

The demo received press attention such as a San Francisco Chronicle article titled "Fantastic World of Tomorrow's Computer". It stated, "The most fantastic glimpse into the computer future was taking place in a windowless room on the third floor of the Civic Auditorium" where Engelbart "made a computer in Menlo Park do secretarial work for him that ten efficient secretaries couldn't do in twice the time." His goal: "We hope to help man do better what he does—perhaps by as much as 50 per cent."

However, the demo received little attention in the following decades.7 Engelbart continued his work at SRI for almost a decade, but as Engelbart commented with frustration, "There was a slightly less than universal perception of our value at SRI".8 In 1977, SRI sold the Augmentation Research Center to Tymshare, a time-sharing computing company. (Timesharing was the cloud computing of the 1970s and 1980s, where companies would use time on a centralized computer.) At Tymshare, Engelbart's system was renamed AUGMENT and marketed as an office automation service, but Engelbart himself was sidelined from development, a situation that he described as sitting in a corner and becoming invisible. Meanwhile, Bill English and some other SRI researchers9 migrated four miles south to Xerox PARC and worked on the Xerox Alto computer.
The Xerox Alto incorporated many ideas from the Augmentation Research Center including the graphical user interface, the mouse, and the keyset. The Alto's keyset was almost identical to the Engelbart keyset, as can be seen in the photo below. The Alto's keyset was most popular for the networked 3D shooter game "Maze War", with the clicking of keysets echoing through the hallways of Xerox PARC.

A Xerox Alto with a keyset on the left.

Xerox famously failed to commercialize the ideas from the Xerox Alto, but Steve Jobs recognized the importance of interactivity, the graphical user interface, and the mouse when he visited Xerox PARC in 1979. Thanks to Steve Jobs, the Apple Lisa and Macintosh ended up with a graphical user interface and the mouse (streamlined to one button instead of three), but he left the keyset behind.10

When McDonnell Douglas acquired Tymshare in 1984, Engelbart and his software—now called Augment—had a new home.11 In 1987, McDonnell Douglas released a text editor and outline processor for the IBM PC called MiniBASE, one of the few PC applications that supported a keyset. The functionality of MiniBASE was almost identical to Engelbart's 1968 demo, but in 1987, MiniBASE was competing against GUI-based word processors such as MacWrite and Microsoft Word, so MiniBASE had little impact. Engelbart left McDonnell Douglas in 1988, forming a research foundation called the Bootstrap Institute to continue his research independently.

The name: "The Mother of All Demos"

The name "The Mother of All Demos" has its roots in the Gulf War. In August 1990, Iraq invaded Kuwait, leading to war between Iraq and a coalition of the United States and 41 other countries. During the months of buildup prior to active conflict, Iraq's leader, Saddam Hussein, exhorted the Iraqi people to prepare for "the mother of all battles",12 a phrase that caught the attention of the media. The battle didn't proceed as Hussein hoped: during exactly 100 hours of ground combat, the US-led coalition liberated Kuwait, pushed into Iraq, crushed the Iraqi forces, and declared a ceasefire.13 Hussein's mother of all battles became the mother of all surrenders.

The phrase "mother of all ..." became the 1990s equivalent of a meme, used as a slightly-ironic superlative. It was applied to everything from The Mother of All Traffic Jams to The Mother of All Windows Books, from The Mother of All Butter Cookies to Apple calling mobile devices The Mother of All Markets.14

In 1991, this superlative was applied to a computer demo, but it wasn't Engelbart's demo. Andy Grove, Intel's president, gave a keynote speech at Comdex 1991 entitled The Second Decade: Computer-Supported Collaboration, a live demonstration of his vision for PC-based video conferencing and wireless communication in the PC's second decade. This complex hour-long demo required almost six months to prepare, with 15 companies collaborating. Intel called this demo "The Mother of All Demos", a name repeated in the New York Times, San Francisco Chronicle, Fortune, and PC Week.15 Andy Grove's demo was a hit, with over 20,000 people requesting a video tape, but the demo was soon forgotten.

On the eve of Comdex, the New York Times wrote about Intel's "Mother of All Demos". Oct 21, 1991, D1-D2.

In 1994, Wired writer Steven Levy wrote Insanely Great: The Life and Times of Macintosh, the Computer that Changed Everything.8 In the second chapter of this comprehensive book, Levy explained how Vannevar Bush and Doug Engelbart "sparked a chain reaction" that led to the Macintosh.
The chapter described Engelbart's 1968 demo in detail including a throwaway line saying, "It was the mother of all demos."16 Based on my research, I think this is the source of the name "The Mother of All Demos" for Engelbart's demo. By the end of the century, multiple publications echoed Levy's catchy phrase. In February 1999, the San Jose Mercury News had a special article on Engelbart, saying that the demonstration was "still called 'the mother of all demos'", a description echoed by the industry publication Computerworld.17 The book Nerds: A Brief History of the Internet stated that the demo "has entered legend as 'the mother of all demos'". By this point, Engelbart's fame for the "mother of all demos" was cemented and the phrase became near-obligatory when writing about him. The classic Silicon Valley history Fire in the Valley (1984), for example, didn't even mention Engelbart, but in the second edition (2000), "The Mother of All Demos" had its own chapter.

Interfacing the keyset to USB

Getting back to the keyset interface, the keyset consists of five microswitches, triggered by the five levers. The switches are wired to a standard DB-25 connector. I used a Teensy 3.6 microcontroller board for the interface, since this board can act both as a USB device and as a USB host. As a USB device, the Teensy can emulate a standard USB keyboard. As a USB host, the Teensy can receive input from a standard USB mouse.

Connecting the keyset to the Teensy is (almost) straightforward: the switches are wired to five data inputs on the Teensy, with the common line connected to ground. The Teensy's input lines can be configured with pullup resistors inside the microcontroller. The result is that a data line shows 1 by default and 0 when the corresponding key is pressed. One complication is that the keyset apparently has a 1.5 kΩ resistor between the leftmost button and ground, maybe to indicate that the device is plugged in. This resistor caused that line to always appear low to the Teensy. To counteract this and allow the Teensy to read the pin, I connected a 1 kΩ pullup resistor to that one line.

The interface code

Reading the keyset and sending characters over USB is mostly straightforward, but there are a few complications. First, it's unlikely that the user will press multiple keyset buttons at exactly the same time. Moreover, the button contacts may bounce. To deal with this, I wait until the buttons have a stable value for 100 ms (a semi-arbitrary delay) before sending a key over USB. The second complication is that with five keys, the keyset only supports 32 characters. To obtain upper case, numbers, special characters, and control characters, the keyset is designed to be used in conjunction with mouse buttons. Thus, the interface needs to act as a USB host, so I can plug in a USB mouse to the interface. If I want the mouse to be usable as a mouse, not just buttons in conjunction with the keyset, the interface must forward mouse events over USB. But it's not that easy, since mouse clicks in conjunction with the keyset shouldn't be forwarded. Otherwise, unwanted clicks will happen while using the keyset.

To emulate a keyboard, the code uses the Keyboard library. This library provides an API to send characters to the destination computer. Inconveniently, the simplest method, print(), supports only regular characters, not special characters like ENTER or BACKSPACE. For those, I needed to use the lower-level press() and release() methods.
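To make the timing logic concrete, here is a minimal sketch of how the stable-for-100-ms check and the Keyboard calls might fit together on the Teensy. This is not the actual interface code: the pin numbers, the simple chord-value-to-letter ordering, and the single ENTER chord are illustrative assumptions; the real table (including the mouse-button columns from the ARC information sheet) is in the code on GitHub.

```cpp
// Illustrative sketch only (Teensy, with its USB Type set to a Keyboard mode).
// Pin numbers and the chord table are assumptions, not the real interface code.

const int keysetPins[5] = {0, 1, 2, 3, 4};   // one input per keyset lever
const unsigned long SETTLE_MS = 100;         // chord must be stable this long

void setup() {
  for (int i = 0; i < 5; i++) {
    pinMode(keysetPins[i], INPUT_PULLUP);    // each switch pulls its line to ground
  }
}

// Read the five switches as a 5-bit chord value (bit set = lever pressed).
int readChord() {
  int chord = 0;
  for (int i = 0; i < 5; i++) {
    if (digitalRead(keysetPins[i]) == LOW) chord |= (1 << i);
  }
  return chord;
}

void loop() {
  static int lastChord = 0;
  static unsigned long stableSince = 0;
  static bool sent = false;

  int chord = readChord();
  if (chord != lastChord) {                  // chord changed: restart the settle timer
    lastChord = chord;
    stableSince = millis();
    sent = false;
  } else if (!sent && chord != 0 && millis() - stableSince >= SETTLE_MS) {
    // Stable for 100 ms: translate and send the chord once.
    // Assumed mapping: chord 1..26 -> 'a'..'z'; the real code also reads the mouse
    // buttons to select other columns and handles the release sequence more carefully.
    if (chord <= 26) {
      Keyboard.print((char)('a' + chord - 1));
    } else {
      Keyboard.press(KEY_ENTER);             // special keys need press()/release()
      Keyboard.release(KEY_ENTER);
    }
    sent = true;
  }
}
```

In the same loop, the real interface would also poll the USB mouse (through the host port) so that its buttons can select the chord column, and forward ordinary mouse motion to the destination computer while suppressing clicks made as part of a chord.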
To read the mouse buttons, the code uses the USBHost_t36 library, the Teensy version of the USB Host library. Finally, to pass mouse motion through to the destination computer, I use the Mouse library.

Conclusions

Engelbart claimed that learning a keyset wasn't difficult—a six-year-old kid could learn it in less than a week—but I'm not willing to invest much time into learning it. In my brief use of the keyset, I found it very difficult to use physically. Pressing four keys at once is difficult, with the worst being all fingers except the ring finger. Combining this with a mouse button or two at the same time gave me the feeling that I was sight-reading a difficult piano piece. Maybe it becomes easier with use, but I noticed that Alto programs tended to treat the keyset as function keys, rather than a mechanism for typing with chords.18 David Liddle of Xerox PARC said, "We found that [the keyset] was tending to slow people down, once you got away from really hot [stuff] system programmers. It wasn't quite so good if you were giving it to other engineers, let alone clerical people and so on."

If anyone else has a keyset that they want to connect via USB (unlikely as it may be), my code is on github.19 Thanks to Christina Engelbart for loaning me the keyset. Thanks to Bill Paxton for answering my questions.

Follow me on Bluesky (@righto.com) or RSS for updates.

Footnotes and references

Engelbart's use of the mouse wasn't arbitrary, but based on research. In 1966, shortly after inventing the mouse, Engelbart carried out a NASA-sponsored study that evaluated six input devices: two types of joysticks, a Graphacon positioner, the mouse, a light pen, and a control operated by the knees (leaving the hands free). The mouse, knee control, and light pen performed best, with users finding the mouse satisfying to use. Although inexperienced subjects had some trouble with the mouse, experienced subjects considered it the best device. A joystick, Graphacon, mouse, knee control, and light pen were examined as input devices. Photos from the study. ↩

The information sheet below from the Augmentation Research Center shows what keyset chords correspond to each character. I used this encoding for my interface software. Each column corresponds to a different combination of mouse buttons. The information sheet for the keyset specifies how to obtain each character. The special characters above are <CD> (Command Delete, i.e. cancel a partially-entered command), <BC> (Backspace Character), <OK> (confirm command), <BW> (Backspace Word), <RC> (Replace Character), <ESC> (which does filename completion). NLS and the Augment software have the concept of a viewspec, a view specification that controls the view of a file. For instance, viewspecs can expand or collapse an outline to show more or less detail, filter the content, or show authorship of sections. The keyset can select viewspecs, as shown below. Back of the keyset information sheet. Viewspecs are explained in more detail in The Mother of All Demos. For my keyset interface, I ignored viewspecs since I don't have software to use these inputs, but it would be easy to modify the code to output the desired viewspec characters. ↩

See Augmenting Human Intellect: A Conceptual Framework, Engelbart's 1962 report. ↩

Engelbart used an SDS 940 computer running the Berkeley Timesharing System. The computer had 64K words of core memory, with 4.5 MB of drum storage for swapping and 96 MB of disk storage for files.
For displays, the computer drove twelve 5" high-resolution CRTs, but these weren't viewed directly. Instead, each CRT had a video camera pointed at it and the video was redisplayed on a larger display in a work station in each office. The SDS 940 was a large 24-bit scientific computer, built by Scientific Data Systems. Although SDS built the first integrated-circuit-based commercial computer in 1965 (the SDS 92), the SDS 940 was a transistorized system. It consisted of multiple refrigerator-sized cabinets, as shown below. Since each memory cabinet held 16K words and the computer at SRI had 64K, SRI's computer had two additional cabinets of memory. Front view of an SDS 940 computer. From the Theory of Operation manual. In the late 1960s, Xerox wanted to get into the computer industry, so Xerox bought Scientific Data Systems in 1969 for $900 million (about $8 billion in current dollars). The acquisition was a disaster. After steadily losing money, Xerox decided to exit the mainframe computer business in 1975. Xerox's CEO summed up the purchase: "With hindsight, we would not have done the same thing." ↩ The Mother of All Demos is on YouTube, as well as a five-minute summary for the impatient. ↩ The desk for the keyset and mouse was designed by Herman Miller, the office furniture company. Herman Miller worked with SRI to design the desks and office walls as part of their plans for the office of the future. Herman Miller invented the cubicle office in 1964, creating a modern replacement for the commonly used open office arrangement. ↩ Engelbart's demo is famous now, but for many years it was ignored. For instance, Electronic Design had a long article on Engelbart's work in 1969 (putting the system on the cover), but there was no mention of the demo. Engelbart's system was featured on the cover of Electronic Design. Feb 1, 1969. (slightly retouched) But by the 1980s, the Engelbart demo started getting attention. The 1986 documentary Silicon Valley Boomtown had a long section on Engelbart's work and the demo. By 1988, the New York Times was referring to the demo as legendary. ↩ Levy had written about Engelbart a decade earlier, in the May 1984 issue of the magazine Popular Computing. The article focused on the mouse, recently available to the public through the Apple Lisa and the IBM PC (as an option). The big issue at the time was how many buttons a mouse should have: three like Engelbart's mouse, the one button that Apple used, or two buttons as Bill Gates preferred. But Engelbart's larger vision also came through in Levy's interview along with his frustration that most of his research had been ignored, overshadowed by the mouse. Notably, there was no mention of Engelbart's 1968 demo in the article. ↩↩ The SRI researchers who moved to Xerox include Bill English, Charles Irby, Jeff Rulifson, Bill Duval, and Bill Paxton (details). ↩ In 2023, Xerox donated the entire Xerox PARC research center to SRI. The research center remained in Palo Alto but became part of SRI. In a sense, this closed the circle, since many of the people and ideas from SRI had gone to PARC in the 1970s. However, both PARC and SRI had changed radically since the 1970s, with the cutting edge of computer research moving elsewhere. ↩ For a detailed discussion of the Augment system, see Tymshare's Augment: Heralding a New Era, Oct 1978. Augment provided a "broad range of information handling capability" that was not available elsewhere. 
Unlike other word processing systems, Augment was targeted at the professional, not clerical workers, people who were "eager to explore the open-ended possibilities" of the interactive process. The main complaints about Augment were its price and that it was not easy to use. Accessing Engelbart's NLS system over ARPANET cost an eye-watering $48,000 a year (over $300,000 a year in current dollars). Tymshare's Augment service was cheaper (about $80 an hour in current dollars), but still much more expensive than a standard word processing service. Overall, the article found that Augment users were delighted with the system: "It is stimulating to belong to the electronic intelligentsia." Users found it to be "a way of life—an absorbing, enriching experience". ↩ William Safire provided background in the New York Times, explaining that "the mother of all battles" originally referred to the battle of Qadisiya in A.D. 636, and Saddam Hussein was referencing that ancient battle. A translator responded, however, that the Arabic expression would be better translated as "the great battle" than "the mother of all battles." ↩ The end of the Gulf War left Saddam Hussein in control of Iraq and left thousands of US troops in Saudi Arabia. These factors would turn out to be catastrophic in the following years. ↩ At the Mobile '92 conference, Apple's CEO, John Sculley, said personal communicators could be "the mother of all markets," while Andy Grove of Intel said that the idea of a wireless personal communicator in every pocket is "a pipe dream driven by greed" (link). In hindsight, Sculley was completely right and Grove was completely wrong. ↩ Some references to Intel's "Mother of all demos" are Computer Industry Gathers Amid Chaos, New York Times, Oct 21, 1991 and "Intel's High-Tech Vision of the Future: Chipmaker proposes using computers to dramatically improve productivity", San Francisco Chronicle, Oct 21, 1991, p24. The title of an article in Microprocessor Report, "Intel Declares Victory in the Mother of All Demos" (Nov. 20, 1991), alluded to the recently-ended war. Fortune wrote about Intel's demo in the Feb 17, 1997 issue. A longer description of Intel's demo is in the book Strategy is Destiny. ↩ Several sources claim that Andy van Dam was the first to call Engelbart's demo "The Mother of All Demos." Although van Dam attended the 1968 demo, I couldn't find any evidence that he coined the phrase. John Markoff, a technology journalist for The New York Times, wrote a book What the Dormouse Said: How the Sixties Counterculture Shaped the Personal Computer Industry. In this book, Markoff wrote about Engelbart's demo, saying "Years later, his talk remained 'the mother of all demos' in the words of Andries van Dam, a Brown University computer scientist." As far as I can tell, van Dam used the phrase but only after it had already been popularized by Levy. ↩ It's curious to write that the demonstration was still called the "mother of all demos" when the phrase was just a few years old. ↩ The photo below shows a keyset from the Xerox Alto. The five keys are labeled with separate functions—Copy, Undelete, Move, Draw, and Fine— for use with ALE, a program for IC design. ALE supported keyset chording in combination with the mouse. ↩ Keyset from a Xerox Alto, courtesy of Digibarn. After I implemented this interface, I came across a project that constructed a 3D-printed chording keyset, also using a Teensy for the USB interface. You can find that project here. ↩
What is the origin of the word "mainframe", referring to a large, complex computer? Most sources agree that the term is related to the frames that held early computers, but the details are vague.1 It turns out that the history is more interesting and complicated than you'd expect. Based on my research, the earliest computer to use the term "main frame" was the IBM 701 computer (1952), which consisted of boxes called "frames." The 701 system consisted of two power frames, a power distribution frame, an electrostatic storage frame, a drum frame, tape frames, and most importantly a main frame. The IBM 701's main frame is shown in the documentation below.2 This diagram shows how the IBM 701 mainframe swings open for access to the circuitry. From "Type 701 EDPM [Electronic Data Processing Machine] Installation Manual", IBM. From Computer History Museum archives. The meaning of "mainframe" has evolved, shifting from being a part of a computer to being a type of computer. For decades, "mainframe" referred to the physical box of the computer; unlike modern usage, this "mainframe" could be a minicomputer or even microcomputer. Simultaneously, "mainframe" was a synonym for "central processing unit." In the 1970s, the modern meaning started to develop—a large, powerful computer for transaction processing or business applications—but it took decades for this meaning to replace the earlier ones. In this article, I'll examine the history of these shifting meanings in detail. Early computers and the origin of "main frame" Early computers used a variety of mounting and packaging techniques including panels, cabinets, racks, and bays.3 This packaging made it very difficult to install or move a computer, often requiring cranes or the removal of walls.4 To avoid these problems, the designers of the IBM 701 computer came up with an innovative packaging technique. This computer was constructed as individual units that would pass through a standard doorway, would fit on a standard elevator, and could be transported with normal trucking or aircraft facilities.7 These units were built from a metal frame with covers attached, so each unit was called a frame. The frames were named according to their function, such as the power frames and the tape frame. Naturally, the main part of the computer was called the main frame. An IBM 701 system at General Motors. On the left: tape drives in front of power frames. Back: drum unit/frame, control panel and electronic analytical control unit (main frame), electrostatic storage unit/frame (with circular storage CRTs). Right: printer, card punch. Photo from BRL Report, thanks to Ed Thelen. The IBM 701's internal documentation used "main frame" frequently to indicate the main box of the computer, alongside "power frame", "core frame", and so forth. 
For instance, each component in the schematics was labeled with its location in the computer, "MF" for the main frame.6 Externally, however, IBM documentation described the parts of the 701 computer as units rather than frames.5

The term "main frame" was used by a few other computers in the 1950s.8 For instance, the JOHNNIAC Progress Report (August 8, 1952) mentions that "the main frame for the JOHNNIAC is ready to receive registers" and they could test the arithmetic unit "in the JOHNNIAC main frame in October."10 An article on the RAND Computer in 1953 stated that "The main frame is completed and partially wired." The main body of a computer called ERMA is labeled "main frame" in the 1955 Proceedings of the Eastern Computer Conference.9

Operator at console of IBM 701. The main frame is on the left with the cover removed. The console is in the center. The power frame (with gauges) is on the right. Photo from NOAA.

The progression of the word "main frame" can be seen in reports from the Ballistics Research Lab (BRL) that list almost all the computers in the United States. In the 1955 BRL report, most computers were built from cabinets or racks; the phrase "main frame" was only used with the IBM 650, 701, and 704. By 1961, the BRL report shows "main frame" appearing in descriptions of the IBM 702, 705, 709, and 650 RAMAC, as well as the Univac FILE 0, FILE I, RCA 501, READIX, and Teleregister Telefile. This shows that the use of "main frame" was increasing, but still mostly an IBM term.

The physical box of a minicomputer or microcomputer

In modern usage, mainframes are distinct from minicomputers or microcomputers. But until the 1980s, the word "mainframe" could also mean the main physical part of a minicomputer or microcomputer. For instance, a "minicomputer mainframe" was not a powerful minicomputer, but simply the main part of a minicomputer.13 For example, the PDP-11 is an iconic minicomputer, but DEC discussed its "mainframe."14 Similarly, the desktop-sized HP 2115A and Varian Data 620i computers also had mainframes.15 As late as 1981, the book Mini and Microcomputers mentioned "a minicomputer mainframe."

"Mainframes for Hobbyists" on the front cover of Radio-Electronics, Feb 1978.

Even microcomputers had a mainframe: the cover of Radio Electronics (1978, above) stated, "Own your own Personal Computer: Mainframes for Hobbyists", using the definition below. An article "Introduction to Personal Computers" in Radio Electronics (Mar 1979) uses a similar meaning: "The first choice you will have to make is the mainframe or actual enclosure that the computer will sit in." The popular hobbyist magazine BYTE also used "mainframe" to describe a microprocessor's box in the 1970s and early 1980s.16 BYTE sometimes used the word "mainframe" both to describe a large IBM computer and to describe a home computer box in the same issue, illustrating that the two distinct meanings coexisted.

Definition from Radio-Electronics: main-frame n: COMPUTER; esp: a cabinet housing the computer itself as distinguished from peripheral devices connected with it: a cabinet containing a motherboard and power supply intended to house the CPU, memory, I/O ports, etc., that comprise the computer itself.

Main frame synonymous with CPU

Words often change meaning through metonymy, where a word takes on the meaning of something closely associated with the original meaning.
Through this process, "main frame" shifted from the physical frame (as a box) to the functional contents of the frame, specifically the central processing unit.17 The earliest instance that I could find of the "main frame" being equated with the central processing unit was in 1955. Survey of Data Processors stated: "The central processing unit is known by other names; the arithmetic and ligical [sic] unit, the main frame, the computer, etc. but we shall refer to it, usually, as the central processing unit." A similar definition appeared in Radio Electronics (June 1957, p37): "These arithmetic operations are performed in what is called the arithmetic unit of the machine, also sometimes referred to as the 'main frame.'"

The US Department of Agriculture's Glossary of ADP Terminology (1960) uses the definition: "MAIN FRAME - The central processor of the computer system. It contains the main memory, arithmetic unit and special register groups." I'll mention that "special register groups" is nonsense that was repeated for years.18 This definition was reused and extended in the government's Automatic Data Processing Glossary, published in 1962 "for use as an authoritative reference by all officials and employees of the executive branch of the Government" (below). This definition was reused in many other places, notably the Oxford English Dictionary.19

Definition from Bureau of the Budget: frame, main, (1) the central processor of the computer system. It contains the main storage, arithmetic unit and special register groups. Synonymous with (CPU) and (central processing unit). (2) All that portion of a computer exclusive of the input, output, peripheral and in some instances, storage units.

By the early 1980s, defining a mainframe as the CPU had become obsolete. IBM stated that "mainframe" was a deprecated term for "processing unit" in the Vocabulary for Data Processing, Telecommunications, and Office Systems (1981); the American National Dictionary for Information Processing Systems (1982) was similar. Computers and Business Information Processing (1983) bluntly stated: "According to the official definition, 'mainframe' and 'CPU' are synonyms. Nobody uses the word mainframe that way."

Equating the mainframe and the CPU led to a semantic conflict in the 1970s, when the CPU became a microprocessor chip rather than a large box. For the most part, this was resolved by breaking apart the definitions of "mainframe" and "CPU", with the mainframe being the computer or class of computers, while the CPU became the processor chip. However, some non-American usages resolved the conflict by using "CPU" to refer to the box/case/tower of a PC. (See discussion at https://news.ycombinator.com/item?id=21336515 and https://superuser.com/questions/1198006/is-it-correct-to-say-that-main-memory-ram-is-a-part-of-cpu.)

Mainframe vs. peripherals

Rather than defining the mainframe as the CPU, some dictionaries defined the mainframe in opposition to the "peripherals", the computer's I/O devices.
The two definitions are essentially the same, but have a different focus.20 One example is the IFIP-ICC Vocabulary of Information Processing (1966) which defined "central processor" and "main frame" as "that part of an automatic data processing system which is not considered as peripheral equipment." Computer Dictionary (1982) had the definition "main frame—The fundamental portion of a computer, i.e. the portion that contains the CPU and control elements of a computer system, as contrasted with peripheral or remote devices usually of an input-output or memory nature."

One reason for this definition was that computer usage was billed for mainframe time, while other tasks such as printing results could save money by taking place directly on the peripherals without using the mainframe itself.21 A second reason was that the mainframe vs. peripheral split mirrored the composition of the computer industry, especially in the late 1960s and 1970s. Computer systems were built by a handful of companies, led by IBM. Compatible I/O devices and memory were built by many other companies that could sell them at a lower cost than IBM.22 Publications about the computer industry needed convenient terms to describe these two industry sectors, and they often used "mainframe manufacturers" and "peripheral manufacturers."

Main Frame or Mainframe?

An interesting linguistic shift is from "main frame" as two independent words to a compound word: either hyphenated "main-frame" or the single word "mainframe." This indicates the change from "main frame" being a type of frame to "mainframe" being a new concept. The earliest instance of hyphenated "main-frame" that I found was from 1959 in IBM Information Retrieval Systems Conference. "Mainframe" as a single, non-hyphenated word appears the same year in Datamation, mentioning the mainframe of the NEAC2201 computer. In 1962, the IBM 7090 Installation Instructions refer to a "Mainframe Diag[nostic] and Reliability Program." (Curiously, the document also uses "main frame" as two words in several places.) The 1962 book Information Retrieval Management discusses how much computer time document queries can take: "A run of 100 or more machine questions may require two to five minutes of mainframe time." This shows that by 1962, "main frame" had semantically shifted to a new word, "mainframe."

The rise of the minicomputer and how the "mainframe" became a class of computers

So far, I've shown how "mainframe" started as a physical frame in the computer, and then was generalized to describe the CPU. But how did "mainframe" change from being part of a computer to being a class of computers? This was a gradual process, largely happening in the mid-1970s as the rise of the minicomputer and microcomputer created a need for a word to describe large computers. Although microcomputers, minicomputers, and mainframes are now viewed as distinct categories, this was not the case at first. For instance, a 1966 computer buyer's guide lumps together computers ranging from desk-sized to 70,000 square feet.23 Around 1968, however, the term "minicomputer" was created to describe small computers.
The story is that the head of DEC in England created the term, inspired by the miniskirt and the Mini Minor car.24 While minicomputers had a specific name, larger computers did not.25 Gradually in the 1970s "mainframe" came to be a separate category, distinct from "minicomputer."2627 An early example is Datamation (1970), describing systems of various sizes: "mainframe, minicomputer, data logger, converters, readers and sorters, terminals." The influential business report EDP first split mainframes from minicomputers in 1972.28 The line between minicomputers and mainframes was controversial, with articles such as "Distinction Helpful for Minis, Mainframes" and "Micro, Mini, or Mainframe? Confusion persists" (1981) attempting to clarify the issue.29

With the development of the microprocessor, computers became categorized as mainframes, minicomputers, or microcomputers. For instance, a 1975 Computerworld article discussed how the minicomputer competes against the microcomputer and mainframes. Adam Osborne's An Introduction to Microcomputers (1977) described computers as divided into mainframes, minicomputers, and microcomputers by price, power, and size. He pointed out the large overlap between categories and avoided specific definitions, stating that "A minicomputer is a minicomputer, and a mainframe is a mainframe, because that is what the manufacturer calls it."32

In the late 1980s, computer industry dictionaries started defining a mainframe as a large computer, often explicitly contrasted with a minicomputer or microcomputer. By 1990, they mentioned the networked aspects of mainframes.33

IBM embraces the mainframe label

Even though IBM is almost synonymous with "mainframe" now, IBM avoided marketing use of the word for many years, preferring terms such as "general-purpose computer."35 IBM's book Planning a Computer System (1962) repeatedly referred to "general-purpose computers" and "large-scale computers", but never used the word "mainframe."34 The announcement of the revolutionary System/360 (1964) didn't use the word "mainframe"; it was called a general-purpose computer system. The announcement of the System/370 (1970) discussed "medium- and large-scale systems." The System/32 introduction (1977) said, "System/32 is a general purpose computer..." The 1982 announcement of the 3084, IBM's most powerful computer at the time, called it a "large scale processor," not a mainframe.

IBM started using "mainframe" as a marketing term in the mid-1980s. For example, the 3270 PC Guide (1986) refers to "IBM mainframe computers." An IBM 9370 Information System brochure (c. 1986) says the system was "designed to provide mainframe power." IBM's brochure for the 3090 processor (1987) called them "advanced general-purpose computers" but also mentioned "mainframe computers." A System/390 brochure (c. 1990) discussed "entry into the mainframe class." The 1990 announcement of the ES/9000 called them "the most powerful mainframe systems the company has ever offered."

The IBM System/390: "The excellent balance between price and performance makes entry into the mainframe class an attractive proposition." IBM System/390 Brochure.

By 2000, IBM had enthusiastically adopted the mainframe label: the z900 announcement used the word "mainframe" six times, calling it the "reinvented mainframe." In 2003, IBM announced "The Mainframe Charter", describing IBM's "mainframe values" and "mainframe strategy." Now, IBM has retroactively applied the name "mainframe" to their large computers going back to 1959 (link), (link).
Mainframes and the general public

While "mainframe" was a relatively obscure computer term for many years, it became widespread in the 1980s. The Google Ngram graph below shows the popularity of "microcomputer", "minicomputer", and "mainframe" in books.36 The terms became popular during the late 1970s and 1980s. The popularity of "minicomputer" and "microcomputer" roughly mirrored the development of these classes of computers. Unexpectedly, even though mainframes were the earliest computers, the term "mainframe" peaked later than the other types of computers.

N-gram graph from Google Books Ngram Viewer.

Dictionary definitions

I studied many old dictionaries to see when the word "mainframe" showed up and how they defined it. To summarize, "mainframe" started to appear in dictionaries in the late 1970s, first defining the mainframe in opposition to peripherals or as the CPU. In the 1980s, the definition gradually changed to the modern definition, with a mainframe distinguished as a large, fast, and often centralized system. These definitions were roughly a decade behind industry usage, which switched to the modern meaning in the 1970s.

The word didn't appear in older dictionaries, such as the Random House College Dictionary (1968) and Merriam-Webster (1974). The earliest definition I could find was in the supplement to Webster's International Dictionary (1976): "a computer and esp. the computer itself and its cabinet as distinguished from peripheral devices connected with it." Similar definitions appeared in Webster's New Collegiate Dictionary (1976, 1980). A CPU-based definition appeared in Random House College Dictionary (1980): "the device within a computer which contains the central control and arithmetic units, responsible for the essential control and computational functions. Also called central processing unit." The Random House Dictionary (1978, 1988 printing) was similar. The American Heritage Dictionary (1982, 1985) combined the CPU and peripheral approaches: "mainframe. The central processing unit of a computer exclusive of peripheral and remote devices."

The modern definition as a large computer appeared alongside the old definition in Webster's Ninth New Collegiate Dictionary (1983): "mainframe (1964): a computer with its cabinet and internal circuits; also: a large fast computer that can handle multiple tasks concurrently." Only the modern definition appears in The New Merriam-Webster Dictionary (1989): "large fast computer", while Webster's Unabridged Dictionary of the English Language (1989) gave: "mainframe. a large high-speed computer with greater storage capacity than a minicomputer, often serving as the central unit in a system of smaller computers. [MAIN + FRAME]." Random House Webster's College Dictionary (1991) and Random House College Dictionary (2001) had similar definitions.

The Oxford English Dictionary is the principal historical dictionary, so it is interesting to see its view. The 1989 OED gave historical definitions as well as defining mainframe as "any large or general-purpose computer, esp. one supporting numerous peripherals or subordinate computers." It has seven historical examples from 1964 to 1984; the earliest is the 1964 Honeywell Glossary. It quotes a 1970 Dictionary of Computers as saying that the word "Originally implied the main framework of a central processing unit on which the arithmetic unit and associated logic circuits were mounted, but now used colloquially to refer to the central processor itself."
The OED also cited a Hewlett-Packard ad from 1974 that used the word "mainframe", but I consider this a mistake as the usage is completely different.15 Encyclopedias A look at encyclopedias shows that the word "mainframe" started appearing in discussions of computers in the early 1980s, later than in dictionaries. At the beginning of the 1980s, many encyclopedias focused on large computers, without using the word "mainframe", for instance, The Concise Encyclopedia of the Sciences (1980) and World Book (1980). The word "mainframe" started to appear in supplements such as Britannica Book of the Year (1980) and World Book Year Book (1981), at the same time as they started discussing microcomputers. Soon encyclopedias were using the word "mainframe", for example, Funk & Wagnalls Encyclopedia (1983), Encyclopedia Americana (1983), and World Book (1984). By 1986, even the Doubleday Children's Almanac showed a "mainframe computer." Newspapers I examined old newspapers to track the usage of the word "mainframe." The graph below shows the usage of "mainframe" in newspapers. The curve shows a rise in popularity through the 1980s and a steep drop in the late 1990s. The newspaper graph roughly matches the book graph above, although newspapers show a much steeper drop in the late 1990s. Perhaps mainframes aren't in the news anymore, but people still write books about them. Newspaper usage of "mainframe." Graph from newspapers.com from 1975 to 2010 shows usage started growing in 1978, picked up in 1984, and peaked in 1989 and 1997, with a large drop in 2001 and after (y2k?). The first newspaper appearances were in classified ads seeking employees, for instance, a 1960 ad in the San Francisco Examiner for people "to monitor and control main-frame operations of electronic computers...and to operate peripheral equipment..." and a (sexist) 1966 ad in the Philadelphia Inquirer for "men with Digital Computer Bkgrnd [sic] (Peripheral or Mainframe)."37 By 1970, "mainframe" started to appear in news articles, for example, "The computer can't work without the mainframe unit." By 1971, the usage increased with phrases such as "mainframe central processor" and "'main-frame' computer manufacturers". 1972 had usages such as "the mainframe or central processing unit is the heart of any computer, and does all the calculations". A 1975 article explained "'Mainframe' is the industry's word for the computer itself, as opposed to associated items such as printers, which are referred to as 'peripherals.'" By 1980, minicomputers and microcomputers were appearing: "All hardware categories-mainframes, minicomputers, microcomputers, and terminals" and "The mainframe and the minis are interconnected." By 1985, the mainframe was a type of computer, not just the CPU: "These days it's tough to even define 'mainframe'. One definition is that it has for its electronic brain a central processor unit (CPU) that can handle at least 32 bits of information at once. ... A better distinction is that mainframes have numerous processors so they can work on several jobs at once." Articles also discussed "the micro's challenge to the mainframe" and asked, "buy a mainframe, rather than a mini?" 
By 1990, descriptions of mainframes became florid: "huge machines laboring away in glass-walled rooms", "the big burner which carries the whole computing load for an organization", "behemoth data crunchers", "the room-size machines that dominated computing until the 1980s", "the giant workhorses that form the nucleus of many data-processing centers", "But it is not raw central-processing-power that makes a mainframe a mainframe. Mainframe computers command their much higher prices because they have much more sophisticated input/output systems." Conclusion After extensive searches through archival documents, I found usages of the term "main frame" dating back to 1952, much earlier than previously reported. In particular, the introduction of frames to package the IBM 701 computer led to the use of the word "main frame" for that computer and later ones. The term went through various shades of meaning and remained fairly obscure for many years. In the mid-1970s, the term started describing a large computer, essentially its modern meaning. In the 1980s, the term escaped the computer industry and appeared in dictionaries, encyclopedias, and newspapers. After peaking in the 1990s, the term declined in usage (tracking the decline in mainframe computers), but the term and the mainframe computer both survive. Two factors drove the popularity of the word "mainframe" in the 1980s with its current meaning of a large computer. First, the terms "microcomputer" and "minicomputer" led to linguistic pressure for a parallel term for large computers. For instance, the business press needed a word to describe IBM and other large computer manufacturers. While "server" is the modern term, "mainframe" easily filled the role back then and was nicely alliterative with "microcomputer" and "minicomputer."38 Second, up until the 1980s, the prototype meaning for "computer" was a large mainframe, typically IBM.39 But as millions of home computers were sold in the early 1980s, the prototypical "computer" shifted to smaller machines. This left a need for a term for large computers, and "mainframe" filled that need. In other words, if you were talking about a large computer in the 1970s, you could say "computer" and people would assume you meant a mainframe. But if you said "computer" in the 1980s, you needed to clarify if it was a large computer. The word "mainframe" is almost 75 years old and both the computer and the word have gone through extensive changes in this time. The "death of the mainframe" has been proclaimed for well over 30 years but mainframes are still hanging on. Who knows what meaning "mainframe" will have in another 75 years? Follow me on Bluesky (@righto.com) or RSS. (I'm no longer on Twitter.) Thanks to the Computer History Museum and archivist Sara Lott for access to many documents. Notes and References The Computer History Museum states: "Why are they called “Mainframes”? Nobody knows for sure. There was no mainframe “inventor” who coined the term. Probably “main frame” originally referred to the frames (designed for telephone switches) holding processor circuits and main memory, separate from racks or cabinets holding other components. Over time, main frame became mainframe and came to mean 'big computer.'" (Based on my research, I don't think telephone switches have any connection to computer mainframes.) Several sources explain that the mainframe is named after the frame used to construct the computer. 
The Jargon File has a long discussion, stating that the term "originally referring to the cabinet containing the central processor unit or ‘main frame’." Ken Uston's Illustrated Guide to the IBM PC (1984) has the definition "MAIN FRAME A large, high-capacity computer, so named because the CPU of this kind of computer used to be mounted on a frame." IBM states that mainframe "Originally referred to the central processing unit of a large computer, which occupied the largest or central frame (rack)." The Microsoft Computer Dictionary (2002) states that the name mainframe "is derived from 'main frame', the cabinet originally used to house the processing unit of such computers." Some discussions of the origin of the word "mainframe" are here, here, here, here, and here. The phrase "main frame" in non-computer contexts has a very old but irrelevant history, describing many things that have a frame. For example, it appears in thousands of patents from the 1800s, including drills, saws, a meat cutter, a cider mill, printing presses, and corn planters. This shows that it was natural to use the phrase "main frame" when describing something constructed from frames. Telephony uses a Main distribution frame or "main frame" for wiring, going back to 1902. Some people claim that the computer use of "mainframe" is related to the telephony use, but I don't think they are related. In particular, a telephone main distribution frame looks nothing like a computer mainframe. Moreover, the computer use and the telephony use developed separately; if the computer use started in, say, Bell Labs, a connection would be more plausible. IBM patents with "main frame" include a scale (1922), a card sorter (1927), a card duplicator (1929), and a card-based accounting machine (1930). IBM's incidental uses of "main frame" are probably unrelated to modern usage, but they are a reminder that punch card data processing started decades before the modern computer. ↩ It is unclear why the IBM 701 installation manual is dated August 27, 1952 but the drawing is dated 1953. I assume the drawing was updated after the manual was originally produced. ↩ This footnote will survey the construction techniques of some early computers; the key point is that building a computer on frames was not an obvious technique. ENIAC (1945), the famous early vacuum tube computer, was constructed from 40 panels forming three walls filling a room (ref, ref). EDVAC (1949) was built from large cabinets or panels (ref) while ORDVAC and CLADIC (1949) were built on racks (ref). One of the first commercial computers, UNIVAC 1 (1951), had a "Central Computer" organized as bays, divided into three sections, with tube "chassis" plugged in (ref ). The Raytheon computer (1951) and Moore School Automatic Computer (1952) (ref) were built from racks. The MONROBOT VI (1955) was described as constructed from the "conventional rack-panel-cabinet form" (ref). ↩ The size and construction of early computers often made it difficult to install or move them. The early computer ENIAC required 9 months to move from Philadelphia to the Aberdeen Proving Ground. For this move, the wall of the Moore School in Philadelphia had to be partially demolished so ENIAC's main panels could be removed. In 1959, moving the SWAC computer required disassembly of the computer and removing one wall of the building (ref). When moving the early computer JOHNNIAC to a different site, the builders discovered the computer was too big for the elevator. 
They had to raise the computer up the elevator shaft without the elevator (ref). This illustrates the benefits of building a computer from moveable frames. ↩ The IBM 701's main frame was called the Electronic Analytical Control Unit in external documentation. ↩ The 701 installation manual (1952) has a frame arrangement diagram showing the dimensions of the various frames, along with a drawing of the main frame, and power usage of the various frames. Service documentation (1953) refers to "main frame adjustments" (page 74). The 700 Series Data Processing Systems Component Circuits document (1955-1959) lists various types of frames in its abbreviation list (below) Abbreviations used in IBM drawings include MF for main frame. Also note CF for core frame, and DF for drum frame, From 700 Series Data Processing Systems Component Circuits (1955-1959). When repairing an IBM 701, it was important to know which frame held which components, so "main frame" appeared throughout the engineering documents. For instance, in the schematics, each module was labeled with its location; "MF" stands for "main frame." Detail of a 701 schematic diagram. "MF" stands for "main frame." This diagram shows part of a pluggable tube module (type 2891) in mainframe panel 3 (MF3) section J, column 29. The blocks shown are an AND gate, OR gate, and Cathode Follower (buffer). From System Drawings 1.04.1. The "main frame" terminology was used in discussions with customers. For example, notes from a meeting with IBM (April 8, 1952) mention "E. S. [Electrostatic] Memory 15 feet from main frame" and list "main frame" as one of the seven items obtained for the $15,000/month rental cost. ↩ For more information on how the IBM 701 was designed to fit on elevators and through doorways, see Building IBM: Shaping an Industry and Technology page 170, The Interface: IBM and the Transformation of Corporate Design page 69. This is also mentioned in "Engineering Description of the IBM Type 701 Computer", Proceedings of the IRE Oct 1953, page 1285. ↩ Many early systems used "central computer" to describe the main part of the computer, perhaps more commonly than "main frame." An early example is the "central computer" of the Elecom 125 (1954). The Digital Computer Newsletter (Apr 1955) used "central computer" several times to describe the processor of SEAC. The 1961 BRL report shows "central computer" being used by Univac II, Univac 1107, Univac File 0, DYSEAC and RCA Series 300. The MIT TX-2 Technical Manual (1961) uses "central computer" very frequently. The NAREC glossary (1962) defined "central computer. That part of a computer housed in the main frame." ↩ This footnote lists some other early computers that used the term "main frame." The October 1956 Digital Computer Newsletter mentions the "main frame" of the IBM NORC. Digital Computer Newsletter (Jan 1959) discusses using a RAMAC disk drive to reduce "main frame processing time." This document also mentions the IBM 709 "main frame." The IBM 704 documentation (1958) says "Each DC voltage is distributed to the main frame..." (IBM 736 reference manual) and "Check the air filters in each main frame unit and replace when dirty." (704 Central Processing Unit). The July 1962 Digital Computer Newsletter discusses the LEO III computer: "It has been built on the modular principle with the main frame, individual blocks of storage, and input and output channels all physically separate." 
The article also mentions that the new computer is more compact with "a reduction of two cabinets for housing the main frame." The IBM 7040 (1964) and IBM 7090 (1962) were constructed from multiple frames, including the processing unit called the "main frame."11 Machines in IBM's System/360 line (1964) were built from frames; some models had a main frame, power frame, wall frame, and so forth, while other models simply numbered the frames sequentially.12 ↩ The 1952 JOHNNIAC progress report is quoted in The History of the JOHNNIAC. This memorandum was dated August 8, 1952, so it is the earliest citation that I found. The June 1953 memorandum also used the term, stating, "The main frame is complete." ↩ A detailed description of IBM's frame-based computer packaging is in Standard Module System Component Circuits pages 6-9. This describes the SMS-based packaging used in the IBM 709x computers, the IBM 1401, and related systems as of 1960. ↩ IBM System/360 computers could have many frames, so they were usually given sequential numbers. The Model 85, for instance, had 12 frames for the processor and four megabytes of memory in 18 frames (at over 1000 pounds each). Some of the frames had descriptive names, though. The Model 40 had a main frame (CPU main frame, CPU frame), a main storage logic frame, a power supply frame, and a wall frame. The Model 50 had a CPU frame, power frame, and main storage frame. The Model 75 had a main frame (consisting of multiple physical frames), storage frames, channel frames, central processing frames, and a maintenance console frame. The compact Model 30 consisted of a single frame, so the documentation refers to the "frame", not the "main frame." For more information on frames in the System/360, see 360 Physical Planning. The Architecture of the IBM System/360 paper refers to the "main-frame hardware." ↩ A few more examples that discuss the minicomputer's mainframe, its physical box: A 1970 article discusses the mainframe of a minicomputer (as opposed to the peripherals) and contrasts minicomputers with large scale computers. A 1971 article on minicomputers discusses "minicomputer mainframes." Computerworld (Jan 28, 1970, p59) discusses minicomputer purchases: "The actual mainframe is not the major cost of the system to the user." Modern Data (1973) mentions minicomputer mainframes several times. ↩ DEC documents refer to the PDP-11 minicomputer as a mainframe. The PDP-11 Conventions manual (1970) defined: "Processor: A unit of a computing system that includes the circuits controlling the interpretation and execution of instructions. The processor does not include the Unibus, core memory, interface, or peripheral devices. The term 'main frame' is sometimes used but this term refers to all components (processor, memory, power supply) in the basic mounting box." In 1976, DEC published the PDP-11 Mainframe Troubleshooting Guide. The PDP-11 mainframe is also mentioned in Computerworld (1977). ↩ Test equipment manufacturers started using the term "main frame" (and later "mainframe") around 1962, to describe an oscilloscope or other test equipment that would accept plug-in modules. I suspect this is related to the use of "mainframe" to describe a computer's box, but it could be independent. Hewlett-Packard even used the term to describe a solderless breadboard, the 5035 Logic Lab. The Oxford English Dictionary (1989) used HP's 1974 ad for the Logic Lab as its earliest citation of mainframe as a single word. 
It appears that the OED confused this use of "mainframe" with the computer use. The OED's citation reads: "1974 Sci. Amer. Apr. 79 The laboratory station mainframe has the essentials built-in (power supply, logic state indicators and programmers, and pulse sources to provide active stimulus for the student's circuits)." Is this a mainframe? The HP 5035A Logic Lab was a power supply and support circuitry for a solderless breadboard. HP's ads referred to this as a "laboratory station mainframe." ↩↩ In the 1980s, the use of "mainframe" to describe the box holding a microcomputer started to conflict with "mainframe" as a large computer. For example, Radio Electronics (October 1982) started using the short-lived term "micro-mainframe" instead of "mainframe" for a microcomputer's enclosure. By 1985, Byte magazine had largely switched to the modern usage of "mainframe." But even as late as 1987, a review of the Apple IIGS described one of the system's components as the '"mainframe" (i.e. the actual system box)'. ↩ Definitions of "central processing unit" disagreed as to whether storage was part of the CPU, part of the main frame, or something separate. This was largely a consequence of the physical construction of early computers. Smaller computers had memory in the same frame as the processor, while larger computers often had separate storage frames for memory. Other computers had some memory with the processor and some external. Thus, the "main frame" might or might not contain memory, and this ambiguity carried over to definitions of CPU. (In modern usage, the CPU consists of the arithmetic/logic unit (ALU) and control circuitry, but excludes memory.) ↩ Many definitions of mainframe or CPU mention "special register groups", an obscure feature specific to the Honeywell 800 computer (1959). (Processors have registers, special registers are common, and some processors have register groups, but only the Honeywell 800 had "special register groups.") However, computer dictionaries kept using this phrase for decades, even though it doesn't make sense for other computers. I wrote a blog post about special register groups here. ↩ This footnote provides more examples of "mainframe" being defined as the CPU. The Data Processing Equipment Encyclopedia (1961) had a similar definition: "Main Frame: The main part of the computer, i.e. the arithmetic or logic unit; the central processing unit." The 1967 IBM 360 operator's guide defined: "The main frame - the central processing unit and main storage." The Department of the Navy's ADP Glossary (1970): "Central processing unit: A unit of a computer that includes the circuits controlling the interpretation and execution of instructions. Synonymous with main frame." This was a popular definition, originally from the ISO, used by IBM (1979) among others. Funk & Wagnalls Dictionary of Data Processing Terms (1970) defined: "main frame: The basic or essential portion of an assembly of hardware, in particular, the central processing unit of a computer." The American National Standard Vocabulary for Information Processing (1970) defined: "central processing unit: A unit of a computer that includes the circuits controlling the interpretation and execution of instructions. Synonymous with main frame." ↩ Both the mainframe vs. peripheral definition and the mainframe as CPU definition made it unclear exactly what components of the computer were included in the mainframe.
It's clear that the arithmetic-logic unit and the processor control circuitry were included, while I/O devices were excluded, but some components such as memory were in a gray area. It's also unclear if the power supply and I/O interfaces (channels) are part of the mainframe. These distinctions were ignored in almost all of the uses of "mainframe" that I saw. An unusual definition in a Goddard Space Center document (1965, below) partitioned equipment into the "main frame" (the electronic equipment), "peripheral equipment" (electromechanical components such as the printer and tape), and "middle ground equipment" (the I/O interfaces). The "middle ground" terminology here appears to be unique. Also note that computers are partitioned into "super speed", "large-scale", "medium-scale", and "small-scale." Definitions from Automatic Data Processing Equipment, Goddard Space Center, 1965. "Main frame" was defined as "The central processing unit of a system including the hi-speed core storage memory bank. (This is the electronic element.)" ↩ This footnote gives some examples of using peripherals to save the cost of mainframe time. IBM 650 documentation (1956) describes how "Data written on tape by the 650 can be processed by the main frame of the 700 series systems." Univac II Marketing Material (1957) discusses various ways of reducing "main frame time" by, for instance, printing from tape off-line. The USAF Guide for auditing automatic data processing systems (1961) discusses how these "off line" operations make the most efficient use of "the more expensive main frame time." ↩ Peripheral manufacturers were companies that built tape drives, printers, and other devices that could be connected to a mainframe built by IBM or another company. The basis for the peripheral industry was antitrust action against IBM that led to the 1956 Consent Decree. Among other things, the consent decree forced IBM to provide reasonable patent licensing, which allowed other firms to build "plug-compatible" peripherals. The introduction of the System/360 in 1964 produced a large market for peripherals, and IBM's large profit margins left plenty of room for other companies. ↩ Computers and Automation, March 1965, categorized computers into five classes, from "Teeny systems" (such as the IBM 360/20) renting for $2000/month, through Small, Medium, and Large systems, up to "Family or Economy Size Systems" (such as the IBM 360/92) renting for $75,000 per month. ↩ The term "minicomputer" was supposedly invented by John Leng, head of DEC's England operations. In the 1960s, he sent back a sales report: "Here is the latest minicomputer activity in the land of miniskirts as I drive around in my Mini Minor", which led to the term becoming popular at DEC. This story is described in The Ultimate Entrepreneur: The Story of Ken Olsen and Digital Equipment Corporation (1988). I'd trust the story more if I could find a reference that wasn't 20 years after the fact. ↩ For instance, Computers and Automation (1971) discussed the role of the minicomputer as compared to "larger computers." A 1975 minicomputer report compared minicomputers to their "general-purpose cousins." ↩ This footnote provides more on the split between minicomputers and mainframes. In 1971, Modern Data Products, Systems, Services contained "... will offer mainframe, minicomputer, and peripheral manufacturers a design, manufacturing, and production facility...." Standard & Poor's Industry Surveys (1972) mentions "mainframes, minicomputers, and IBM-compatible peripherals."
Computerworld (1975) refers to "mainframe and minicomputer systems manufacturers." The 1974 textbook "Information Systems: Technology, Economics, Applications" couldn't decide if mainframes were a part of the computer or a type of computer separate from minicomputers, saying: "Computer mainframes include the CPU and main memory, and in some usages of the term, the controllers, channels, and secondary storage and I/O devices such as tape drives, disks, terminals, card readers, printers, and so forth. However, the equipment for storage and I/O are usually called peripheral devices. Computer mainframes are usually thought of as medium to large scale, rather than mini-computers." Studying U.S. Industrial Outlook reports provides another perspective over time. U.S. Industrial Outlook 1969 divides computers into small, medium-size, and large-scale. Mainframe manufacturers are contrasted with peripheral manufacturers. The same mainframe vs. peripherals opposition appears in U.S. Industrial Outlook 1970 and U.S. Industrial Outlook 1971. The 1971 report also discusses minicomputer manufacturers entering the "maxicomputer market."30 The 1973 report mentions "large computers, minicomputers, and peripherals." U.S. Industrial Outlook 1976 states, "The distinction between mainframe computers, minis, micros, and also accounting machines and calculators should merge into a spectrum." By 1977, the market was separated into "general purpose mainframe computers", "minicomputers and small business computers" and "microprocessors." Family Computing Magazine (1984) had a "Dictionary of Computer Terms Made Simple." It explained that "A Digital computer is either a "mainframe", a "mini", or a "micro." Forty years ago, large mainframes were the only size that a computer could be. They are still the largest size, and can handle more than 100,000,000 instructions per second. PER SECOND! [...] Mainframes are also called general-purpose computers." ↩ In 1974, Congress held antitrust hearings into IBM. The thousand-page report provides a detailed snapshot of the meanings of "mainframe" at the time. For instance, a market analysis report from IDC illustrates the difficulty of defining mainframes and minicomputers in this era (p4952). The "Mainframe Manufacturers" section splits the market into "general-purpose computers" and "dedicated application computers" including "all the so-called minicomputers." Although this section discusses minicomputers, the emphasis is on the manufacturers of traditional mainframes. A second "Plug-Compatible Manufacturers" section discusses companies that manufactured only peripherals. But there's also a separate "Minicomputers" section that focuses on minicomputers (along with microcomputers "which are simply microprocessor-based minicomputers"). My interpretation of this report is that the terminology is in the process of moving from "mainframe vs. peripheral" to "mainframe vs. minicomputer." The statement from Research Shareholders Management (p5416), on the other hand, discusses IBM and the five other mainframe companies; they classify minicomputer manufacturers separately (p5425). Page 5426 mentions "mainframes, small business computers, industrial minicomputers, terminals, communications equipment, and minicomputers." Economist Ralph Miller mentions the central processing unit "(the so-called 'mainframe')" (p5621) and then contrasts independent peripheral manufacturers with mainframe manufacturers (p5622).
The Computer Industry Alliance refers to mainframes and peripherals in multiple places, and "shifting the location of a controller from peripheral to mainframe", as well as "the central processing unit (mainframe)" p5099. On page 5290, "IBM on trial: Monopoly tends to corrupt", from Harper's (May 1974), mentions peripherals compatible with "IBM mainframe units—or, as they are called, central processing computers." ↩ The influential business newsletter EDP provides an interesting view on the struggle to separate the minicomputer market from larger computers. Through 1968, they included minicomputers in the "general-purpose computer" category. But in 1969, they split "general-purpose computers" into "Group A, General Purpose Digital Computers" and "Group B, Dedicated Application Digital Computers." These categories roughly corresponded to larger computers and minicomputers, on the (dubious) assumption that minicomputers were used for a "dedicated application." The important thing to note is that in 1969 they did not use the term "mainframe" for the first category, even though with the modern definition it's the obvious term to use. At the time, EDP used "mainframe manufacturer" or "mainframer"31 to refer to companies that manufactured computers (including minicomputers), as opposed to manufacturers of peripherals. In 1972, EDP first mentioned mainframes and minicomputers as distinct types. In 1973, "microcomputer" was added to the categories. As the 1970s progressed, the separation between minicomputers and mainframes became common. However, the transition was not completely smooth; a 1973 issue included a reference to "mainframe shipments (including minicomputers)." To be specific, the EDP Industry Report (Nov. 28, 1969) gave the following definitions of the two groups of computers: Group A—General Purpose Digital Computers: These comprise the bulk of the computers that have been listed in the Census previously. They are character or byte oriented except in the case of the large-scale scientific machines, which have 36, 48, or 60-bit words. The predominant portion (60% to 80%) of these computers is rented, usually for $2,000 a month or more. Higher level languages such as Fortran, Cobol, or PL/1 are the primary means by which users program these computers. Group B—Dedicated Application Digital Computers: This group of computers includes the "mini's" (purchase price below $25,000), the "midi's" ($25,000 to $50,000), and certain larger systems usually designed or used for one dedicated application such as process control, data acquisition, etc. The characteristics of this group are that the computers are usually word oriented (8, 12, 16, or 24-bits per word), the predominant number (70% to 100%) are purchased, and assembly language (at times Fortran) is the predominant means of programming. This type of computer is often sold to an original equipment manufacturer (OEM) for further system integration and resale to the final user. These definitions strike me as rather arbitrary. ↩ In 1981, Computerworld had articles trying to clarify the distinctions between microcomputers, minicomputers, superminicomputers, and mainframes, as the systems started to overlap. One article, Distinction Helpful for Minis, Mainframes, said that minicomputers were generally interactive, while mainframes made good batch machines and network hosts. Microcomputers had up to 512 KB of memory, while minis were 16-bit machines with 512 KB to 4 MB of memory, costing up to $100,000.
Superminis were 16- to 32-bit machines with 4 MB to 8 MB of memory, costing up to $200,000 but with less memory bandwidth than mainframes. Finally, mainframes were 32-bit machines with more than 8 MB of memory, costing over $200,000. Another article, Micro, Mini, or Mainframe? Confusion persists, described a microcomputer as using an 8-bit architecture and having fewer peripherals, while a minicomputer had a 16-bit architecture and 48 KB to 1 MB of memory. ↩ The miniskirt in the mid-1960s was shortly followed by the midiskirt and maxiskirt. These terms led to the parallel construction of the terms minicomputer, midicomputer, and maxicomputer. The New York Times had a long article Maxi Computers Face Mini Conflict (April 5, 1970) explicitly making the parallel: "Mini vs. Maxi, the reigning issue in the glamorous world of fashion, is strangely enough also a major point of contention in the definitely unsexy realm of computers." Although midicomputer and maxicomputer terminology didn't catch on the way minicomputer did, the terms still saw significant use (example, midicomputer examples, maxicomputer examples). The miniskirt/minicomputer parallel was drawn with varying degrees of sexism. One example is Electronic Design News (1969): "A minicomputer. Like the miniskirt, the small general-purpose computer presents the same basic commodity in a more appealing way." ↩ Linguistically, one indication that a new word has become integrated into the language is when it can be extended to form additional new words. One example is the formation of "mainframers", referring to companies that build mainframes. This word was moderately popular in the 1970s to 1990s. It was even used by the Department of Justice in their 1975 action against IBM, where they described the companies in the systems market as the "mainframe companies" or "mainframers." The word is still used today, but usually refers to people with mainframe skills. Other linguistic extensions of "mainframe" include mainframing, unmainframe, mainframed, nonmainframe, and postmainframe. ↩ More examples of the split between microcomputers and mainframes: Softwide Magazine (1978) describes "BASIC versions for micro, mini and mainframe computers." MSC, a disk system manufacturer, had drives "used with many microcomputer, minicomputer, and mainframe processor types" (1980). ↩ Some examples of computer dictionaries referring to mainframes as a size category: Illustrated Dictionary of Microcomputer Terminology (1978) defines "mainframe" as "(1) The heart of a computer system, which includes the CPU and ALU. (2) A large computer, as opposed to a mini or micro." A Dictionary of Minicomputing and Microcomputing (1982) includes the definition of "mainframe" as "A high-speed computer that is larger, faster, and more expensive than the high-end minicomputers. The boundary between a small mainframe and a large mini is fuzzy indeed." The National Bureau of Standards Future Information Technology (1984) defined: "Mainframe is a term used to designate a medium and large scale CPU." The New American Computer Dictionary (1985) defined "mainframe" as "(1) Specifically, the rack(s) holding the central processing unit and the memory of a large computer. (2) More generally, any large computer. 'We have two mainframes and several minis.'" The 1990 ANSI Dictionary for Information Systems (ANSI X3.172-1990) defined: "mainframe. A large computer, usually one to which other computers are connected in order to share its resources and computing power."
Microsoft Press Computer Dictionary (1991) defined "mainframe computer" as "A high-level computer designed for the most intensive computational tasks. Mainframe computers are often shared by multiple users connected to the computer via terminals." ISO 2382 (1993) defines a mainframe as "a computer, usually in a computer center, with extensive capabilities and resources to which other computers may be connected so that they can share facilities." The Microsoft Computer Dictionary (2002) had an amusingly critical definition of mainframe: "A type of large computer system (in the past often water-cooled), the primary data processing resource for many large businesses and organizations. Some mainframe operating systems and solutions are over 40 years old and have the capacity to store year values only as two digits." ↩ IBM's 1962 book Planning a Computer System describes how the Stretch computer's circuitry was assembled into frames, with the CPU consisting of 18 frames. The picture below shows how a "frame" was, in fact, constructed from a metal frame. In the Stretch computer, the circuitry (left) could be rolled out of the frame (right). ↩ The term "general-purpose computer" is probably worthy of investigation since it was used in a variety of ways. It is one of those phrases that seems obvious until you think about it more closely. On the one hand, a computer such as the Apollo Guidance Computer can be considered general purpose because it runs a variety of programs, even though the computer was designed for one specific mission. On the other hand, minicomputers were often contrasted with "general-purpose computers" because customers would buy a minicomputer for a specific application, unlike a mainframe which would be used for a variety of applications. ↩ The n-gram graph is from the Google Books Ngram Viewer. The curves on the graph should be taken with a grain of salt. First, the usage of words in published books is likely to lag behind "real world" usage. Second, the number of usages in the data set is small, especially at the beginning. Nonetheless, the n-gram graph generally agrees with what I've seen looking at documents directly. ↩ More examples of "mainframe" in want ads: A 1966 ad from Western Union in The Arizona Republic looking for experience "in a systems engineering capacity dealing with both mainframe and peripherals." A 1968 ad in The Minneapolis Star for an engineer with knowledge of "mainframe and peripheral hardware." A 1968 ad from SDS in The Los Angeles Times for an engineer to design "circuits for computer mainframes and peripheral equipment." A 1968 ad in Fort Lauderdale News for "Computer mainframe and peripheral logic design." A 1972 ad in The Los Angeles Times saying "Mainframe or peripheral [experience] highly desired." In most of these ads, the mainframe was in contrast to the peripherals. ↩ A related factor is the development of remote connections from a microcomputer to a mainframe in the 1980s. This led to the need for a word to describe the remote computer, rather than saying "I connected my home computer to the other computer." See the many books and articles on connecting "micro to mainframe." ↩ To see how the prototypical meaning of "computer" changed in the 1980s, I examined the "Computer" article in encyclopedias from that time. The 1980 Concise Encyclopedia of the Sciences discusses a large system with punched-card input. In 1980, the World Book article focused on mainframe systems, starting with a photo of an IBM System/360 Model 40 mainframe.
But in the 1981 supplement and the 1984 encyclopedia, the World Book article opened with a handheld computer game, a desktop computer, and a "large-scale computer." The article described microcomputers, minicomputers, and mainframes. Funk & Wagnalls Encyclopedia (1983) was in the middle of the transition; the article focused on large computers and had photos of IBM machines, but mentioned that future growth was expected in microcomputers. By 1994, the World Book article's main focus was the personal computer, although the mainframe still had a few paragraphs and a photo. This is evidence that the prototypical meaning of "computer" underwent a dramatic shift in the early 1980s from a mainframe to a balance between small and large computers, and then to the personal computer. ↩
Intel released the powerful Pentium processor in 1993, establishing a long-running brand of processors. Earlier, I wrote about the ROM in the Pentium's floating point unit that holds constants such as π. In this post, I'll look at some interesting circuits associated with this ROM. In particular, the circuitry is implemented in BiCMOS, a process that combines bipolar transistors with standard CMOS logic. The photo below shows the Pentium's thumbnail-sized silicon die under a microscope. I've labeled the main functional blocks; the floating point unit is in the lower right with the constant ROM highlighted at the bottom. The various parts of the floating point unit form horizontal stripes. Data buses run vertically through the floating point unit, moving values around the unit. Die photo of the Intel Pentium processor with the floating point constant ROM highlighted in red. Click this image (or any other) for a larger version. The diagram below shows how the circuitry in this post forms part of the Pentium. Zooming in to the bottom of the chip shows the constant ROM, holding 86-bit words: at the left, the exponent section provides 18 bits. At the right, the wider significand section provides 68 bits. Below that, the diagram zooms in on the subject of this article: one of the 86 identical multiplexer/driver circuits that provides the output from the ROM. As you can see, this circuit is a microscopic speck in the chip. Zooming in on the constant ROM's driver circuits at the top of the ROM. The layers In this section, I'll show how the Pentium is constructed from layers. The bottom layer of the chip consists of transistors fabricated on the silicon die. Regions of silicon are doped with impurities to change the electrical properties; these regions appear pinkish in the photo below, compared to the grayish undoped silicon. Thin polysilicon wiring is formed on top of the silicon. Where a polysilicon line crosses doped silicon, a transistor is formed; the polysilicon creates the transistor's gate. Most of these transistors are NMOS and PMOS transistors, but there is a bipolar transistor near the upper right, the large box-like structure. The dark circles are contacts, regions where the metal layer above is connected to the polysilicon or silicon to wire the circuits together. The polysilicon and silicon layers form the Pentium's transistors. This photo shows part of the complete circuit. The Pentium has three layers of metal wiring. The photo below shows the bottom layer, called M1. For the most part, this layer of metal connects the transistors into various circuits, providing wiring over a short distance. The photos in this section show the same region of the chip, so you can match up features between the photos. For instance, the contacts below (black circles) match the black circles above, showing how this metal layer connects to the silicon and polysilicon circuits. You can see some of the silicon and polysilicon in this image, but most of it is hidden by the metal. The Pentium's M1 metal layer is the bottom metal layer. The M2 metal layer (below) sits above the M1 wiring. In this part of the chip, the M2 wires are horizontal. The thicker lines are power and ground. (Because they are thicker, they have lower resistance and can provide the necessary current to the underlying circuitry.) The thinner lines are control signals. The floating point unit is structured so functional blocks are horizontal, while data is transmitted vertically. 
Thus, a horizontal wire can supply a control signal to all the bits in a functional block. The Pentium's M2 layer. The M3 layer is the top metal layer in the Pentium. It is thicker, so it is better suited for the chip's main power and ground lines as well as long-distance bus wiring. In the photo below, the wide line on the left provides power, while the wide line on the right provides ground. The power and ground are distributed through wiring in the M2 and M1 layers until they are connected to the underlying transistors. At the top of the photo, vertical bus lines are visible; these extend for long distances through the floating point unit. Notice the slightly longer line, fourth from the right. This line provides one bit of data from the ROM, provided by the circuitry described below. The dot near the bottom is a via, connecting this line to a short wire in M2, connected to a wire in M1, connected to the silicon of the output transistors. The Pentium's M3 metal layer. Lower layers are visible, but blurry due to the insulating oxide layers. The circuits for the ROM's output The simplified schematic below shows the circuit that I reverse-engineered. This circuit is repeated 86 times, once for each bit in the ROM's word. You might expect the ROM to provide a single 86-bit word. However, to make the layout work better, the ROM provides eight words in parallel. Thus, the circuitry must select one of the eight words with a multiplexer. In particular, each of the 86 circuits has an 8-to-1 multiplexer to select one bit out of the eight. This bit is then stored in a latch. Finally, a high-current driver amplifies the signal so it can be sent through a bus, traveling to a destination halfway across the floating point unit. A high-level schematic of the circuit. I'll provide a quick review of MOS transistors before I explain the circuitry in detail. CMOS circuitry uses two types of transistors—PMOS and NMOS—which are similar but also opposites. A PMOS transistor is turned on by a low signal on the gate, while an NMOS transistor is turned on by a high signal on the gate; the PMOS symbol has an inversion bubble on the gate. A PMOS transistor works best when pulling its output high, while an NMOS transistor works best when pulling its output low. CMOS circuitry normally uses the two types of MOS transistors in a Complementary fashion to implement logic gates, working together. What makes the circuits below interesting is that they often use NMOS and PMOS transistors independently. The symbol for a PMOS transistor and an NMOS transistor. The detailed schematic below shows the circuitry at the transistor and inverter level. I'll go through each of the components in the remainder of this post. A detailed schematic of the circuit. Click for a larger version. The ROM is constructed as a grid: at each grid point, the ROM can have a transistor for a 0 bit, or no transistor for a 1 bit. Thus, the data is represented by the transistor pattern. The ROM holds 304 constants so there are 304 potential transistors associated with each bit of the output word. These transistors are organized in a 38×8 grid. To select a word from the ROM, a select line activates one group of eight potential transistors. Each transistor is connected to ground, so the transistor (if present) will pull the associated line low, for a 0 bit. Note that the ROM itself consists of only NMOS transistors, making it half the size of a truly CMOS implementation. 
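To make the organization concrete, here's a minimal Python sketch of one output bit's slice of the ROM as I understand the structure. The grid contents and the mapping from constant number to grid position are made up for illustration; the real bit pattern comes from the die, not from this code.

```python
# Behavioral sketch of one output bit's slice of the constant ROM.
# A ROM transistor is present where a 0 is stored, since the transistor
# pulls its line to ground; no transistor means the line stays high (1).
ROWS, COLS = 38, 8          # 38 select lines x 8 columns = 304 constants

# Hypothetical contents; the real pattern encodes constants such as pi.
transistor_present = [[(r * 7 + c) % 3 == 0 for c in range(COLS)]
                      for r in range(ROWS)]

def read_row(row_select: int) -> list[int]:
    """Activate one select line and return the eight column-line values.
    The 8-to-1 multiplexer described below picks one of these bits."""
    return [0 if transistor_present[row_select][c] else 1
            for c in range(COLS)]

def read_bit(constant_index: int) -> int:
    """Read this bit of constant 0..303, assuming a simple row-major
    mapping of constant number to (row, column); the chip's actual
    address decoding may differ."""
    row, col = divmod(constant_index, COLS)
    return read_row(row)[col]
```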
For more information on the structure and contents of the ROM, see my earlier article. The ROM grid and multiplexer. A ROM transistor can pull a line low for a 0 bit, but how does the line get pulled high for a 1 bit? This is accomplished by a precharge transistor on each line. Before a read from the ROM, the precharge transistors are all activated, pulling the lines high. If a ROM transistor is present on the line, the line will next be pulled low, but otherwise it will remain high due to the capacitance on the line. Next, the multiplexer above selects one of the 8 lines, depending on which word is being accessed. The multiplexer consists of eight transistors. One transistor is activated by a select line, allowing the ROM's signal to pass through. The other seven transistors are in the off state, blocking those ROM signals. Thus, the multiplexer selects one of the 8 bits from the ROM. The circuit below is the "keeper." As explained above, each ROM line is charged high before reading the ROM. However, this charge can fade away. The job of the keeper is to keep the multiplexer's output high until it is pulled low. This is implemented by an inverter connected to a PMOS transistor. If the signal on the line is high, the PMOS transistor will turn on, pulling the line high. (Note that a PMOS transistor is turned on by a low signal, thus the inverter.) If the ROM pulls the line low, the transistor will turn off and stop pulling the line high. This transistor is very weak, so it is easily overpowered by the signal from the ROM. The transistor on the left ensures that the line is high at the start of the cycle. The keeper circuit. The diagram below shows the transistors for the keeper. The two transistors on the left implement a standard CMOS inverter. On the right, note the weak transistor that holds the line high. You might notice that the weak transistor looks larger and wonder why that makes the transistor weak rather than strong. The explanation is that the transistor is large in the "wrong" dimension. The current capacity of an MOS transistor is proportional to the width/length ratio of its gate. (Width is usually the long dimension and length is usually the skinny dimension.) The weak transistor's length is much larger than the other transistors, so the W/L ratio is smaller and the transistor is weaker. (You can think of the transistor's gate as a bridge between its two sides. A wide bridge with many lanes lets lots of traffic through. However, a long, single-lane bridge will slow down the traffic.) The silicon implementation of the keeper. Next, we come to the latch, which remembers the value read from the ROM. This latch will read its input when the load signal is high. When the load signal goes low, the latch will hold its value. Conceptually, the latch is implemented with the circuit below. A multiplexer selects the lower input when the load signal is active, passing the latch input through to the (inverted) output. But when the load signal goes low, the multiplexer will select the top input, which is feedback of the value in the latch. This signal will cycle through the inverters and the multiplexer, holding the value until a new value is loaded. The inverters are required because the multiplexer itself doesn't provide any amplification; the signal would rapidly die out if not amplified by the inverters. The implementation of the latch. The multiplexer is implemented with two CMOS switches, one to select each multiplexer input. 
Each switch is a pair of PMOS and NMOS transistors that turn on together, allowing a signal to pass through. (See the bottom two transistors below.)1 The upper circuit is trickier. Conceptually, it is an inverter feeding into the multiplexer's CMOS switch. However, the order is switched so the switch feeds into the inverter. The result is not-exactly-a-switch and not-exactly-an-inverter, but the effect is the same. You can also view it as an inverter with power and ground that gets cut off when not selected. I suspect this implementation uses slightly less power than the straightforward implementation. The detailed schematic of the latch. The most unusual circuit is the BiCMOS driver. By adding a few extra processing steps to the regular CMOS manufacturing process, bipolar (NPN and PNP) transistors can be created. The Pentium extensively used BiCMOS circuits since they reduced signal delays by up to 35%. Intel also used BiCMOS for the Pentium Pro, Pentium II, Pentium III, and Xeon processors. However, as chip voltages dropped, the benefit from bipolar transistors dropped too and BiCMOS was eventually abandoned. The BiCMOS driver circuit. In the Pentium, BiCMOS drivers are used when signals must travel a long distance across the chip. (In this case, the ROM output travels about halfway up the floating point unit.) These long wires have a lot of capacitance so a high-current driver circuit is needed and the NPN transistor provides extra "oomph." The diagram below shows how the driver is implemented. The NPN transistor is the large boxy structure in the upper right. When the base (B) is pulled high, current flows from the collector (C), pulling the emitter (E) high and thus rapidly pulling the output high. The remainder of the circuit consists of three inverters, each composed of PMOS and NMOS transistors. When a polysilicon line crosses doped silicon, it creates a transistor gate, so each crossing corresponds to a transistor. The inverters use multiple transistors in parallel to provide more current; the transistor sources and/or drains overlap to make the circuitry more compact. This diagram shows the silicon and polysilicon for the driver circuit. One interesting thing about this circuit is that each inverter is carefully designed to provide the desired current, with a different current for a high output versus a low output. The first inverter (purple boxes) has two PMOS transistors and two NMOS transistors, so it is a regular inverter, balanced for high and low outputs. (This inverter is conceptually part of the latch.) The second inverter (yellow boxes) has three large PMOS transistors and one smaller NMOS transistor, so it has more ability to pull the output high than low. This inverter turns on the NPN transistor by providing a high signal to the base, so it needs more current in the high state. The third inverter (green boxes) has one weak PMOS transistor and seven NMOS transistors, so it can pull its output low strongly, but can barely pull its output high. This inverter pulls the ROM output line low, so it needs enough current to drive the entire bus line. But this inverter doesn't need to pull the output high—that's the job of the NPN transistor—so the PMOS transistor can be weak. The construction of the weak transistor is similar to the keeper's weak transistor; its gate length is much larger than the other transistors, so it provides less current.
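Putting the pieces together, here is a rough logic-level model in Python of one bit's path from the ROM to the bus. It is a sketch of my understanding, not a transcription of the circuit: the electrical details (precharge timing, the keeper's weak pull-up, the NPN's drive strength) are reduced to comments, and only the logical behavior remains.

```python
class RomOutputBit:
    """Logic-level sketch of one ROM output bit: precharge, ROM pull-down,
    8-to-1 multiplexer, keeper, latch, and BiCMOS driver."""

    def __init__(self, transistor_present):
        # transistor_present: eight booleans, True if a ROM transistor sits
        # on that column, i.e. that word stores a 0 in this bit position.
        self.transistor_present = transistor_present
        self.latched = 1          # value held by the latch between loads

    def read(self, word_select: int, load: bool) -> int:
        # Precharge pulls all eight column lines high...
        lines = [1] * 8
        # ...then any present ROM transistor pulls its line low.
        for col, present in enumerate(self.transistor_present):
            if present:
                lines[col] = 0
        # The 8-to-1 NMOS multiplexer passes only the selected column.
        # (The keeper merely holds this node high until the ROM pulls it
        # low, so it is invisible at the logic level.)
        mux_out = lines[word_select]
        # The latch is transparent while `load` is high and holds its
        # value when `load` drops.
        if load:
            self.latched = mux_out
        # The BiCMOS driver amplifies the value onto the long bus line;
        # logically it is treated as a buffer here (inversions ignored).
        return self.latched

bit = RomOutputBit([True, False, True, True, False, False, True, False])
print(bit.read(word_select=1, load=True))   # word 1 has no transistor -> 1
```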
Conclusions The diagram below shows how the functional blocks are arranged in the complete circuit, from the ROM at the bottom to the output at the top. The floating point unit is constructed with a constant width for each bit—38.5 µm—so the circuitry is designed to fit into this width. The layout of this circuitry was hand-optimized to fit as tightly as possible. In comparison, much of the Pentium's circuitry was arranged by software using a standard-cell approach, which is much easier to design but not as dense. Since each bit in the floating point unit is repeated many times, hand-optimization paid off here. The silicon and polysilicon of the circuit, showing the functional blocks. This circuit contains 47 transistors. Since it is duplicated once for each bit, it has 4042 transistors in total, a tiny fraction of the Pentium's 3.1 million transistors. In comparison, the MOS 6502 processor has about 3500-4500 transistors, depending on how you count. In other words, the circuit to select a word from the Pentium's ROM is about as complex as the entire 6502 processor. This illustrates the dramatic growth in processor complexity described by Moore's law. I plan to write more about the Pentium, so follow me on Bluesky (@righto.com) or RSS for updates. (I'm no longer on Twitter.) You might enjoy reading about the Pentium Navajo rug. Notes The 8-to-1 multiplexer and the latch's multiplexer use different switch implementations: the first is built from NMOS transistors while the second is built from paired PMOS and NMOS transistors. The reason is that NMOS transistors are better at pulling signals low, while PMOS transistors are better at pulling signals high. Combining the transistors creates a switch that passes low and high signals efficiently, which is useful in the latch. The 8-to-1 multiplexer, however, only needs to pull signals low (due to the precharging), so the NMOS-only multiplexer works in this role. (Note that early NMOS processors like the 6502 and 8086 built multiplexers and pass-transistor logic solely out of NMOS transistors. This illustrates that you can use NMOS-only switches with both logic levels, but performance is better if you add PMOS transistors.) ↩
Addition is harder than you'd expect, at least for a computer. Computers use multiple types of adder circuits with different tradeoffs of size versus speed. In this article, I reverse-engineer an 8-bit adder in the Pentium's floating point unit. This adder turns out to be a carry-lookahead adder, in particular, a type known as "Kogge-Stone."1 In this article, I'll explain how a carry-lookahead adder works and I'll show how the Pentium implemented it. Warning: lots of Boolean logic ahead. The Pentium die, showing the adder. Click this image (or any other) for a larger version. The die photo above shows the main functional units of the Pentium. The adder, in the lower right, is a small component of the floating point unit. It is not a general-purpose adder, but is used only for determining quotient digits during division. It played a role in the famous Pentium FDIV division bug, which I wrote about here. The hardware implementation The photo below shows the carry-lookahead adder used by the divider. The adder itself consists of the circuitry highlighted in red. At the top, logic gates compute signals in parallel for each of the 8 pairs of inputs: partial sum, carry generate, and carry propagate. Next, the complex carry-lookahead logic determines in parallel if there will be a carry at each position. Finally, XOR gates apply the carry to each bit. Note that the sum/generate/propagate circuitry consists of 8 repeated blocks, and the same with the carry XOR circuitry. The carry lookahead circuitry, however, doesn't have any visible structure since it is different for each bit.2 The carry-lookahead adder that feeds the lookup table. This block of circuitry is just above the PLA on the die. I removed the metal layers, so this photo shows the doped silicon (dark) and the polysilicon (faint gray). The large amount of circuitry in the middle is used for testing; see the footnote.3 At the bottom, the drivers amplify control signals for various parts of the circuit. The carry-lookahead adder concept The problem with addition is that carries make addition slow. Consider calculating 99999+1 by hand. You'll start with 9+1=10, then carry the one, generating another carry, which generates another carry, and so forth, until you go through all the digits. Computer addition has the same problem: If you're adding two numbers, the low-order bits can generate a carry that then propagates through all the bits. An adder that works this way—known as a ripple carry adder—will be slow because the carry has to ripple through all the bits. As a result, CPUs use special circuits to make addition faster. One solution is the carry-lookahead adder. In this adder, all the carry bits are computed in parallel, before computing the sums. Then, the sum bits can be computed in parallel, using the carry bits. As a result, the addition can be completed quickly, without waiting for the carries to ripple through the entire sum. It may seem impossible to compute the carries without computing the sum first, but there's a way to do it. For each bit position, you determine signals called "carry generate" and "carry propagate". These signals can then be used to determine all the carries in parallel. The generate signal indicates that the position generates a carry. For instance, if you add binary 1xx and 1xx (where x is an arbitrary bit), a carry will be generated from the top bit, regardless of the unspecified bits. On the other hand, adding 0xx and 0xx will never produce a carry. 
Thus, the generate signal is produced for the first case but not the second. But what about 1xx plus 0xx? We might get a carry, for instance, 111+001, but we might not get a carry, for instance, 101+001. In this "maybe" case, we set the carry propagate signal, indicating that a carry into the position will get propagated out of the position. For example, if there is a carry out of the middle position, 1xx+0xx will have a carry from the top bit. But if there is no carry out of the middle position, then there will not be a carry from the top bit. In other words, the propagate signal indicates that a carry into the top bit will be propagated out of the top bit. To summarize, adding 1+1 will generate a carry. Adding 0+1 or 1+0 will propagate a carry. Thus, the generate signal is formed at each position by Gn = An·Bn, where A and B are the inputs. The propagate signal is Pn = An+Bn, the logical-OR of the inputs.4 Now that the propagate and generate signals are defined, they can be used to compute the carry Cn at each bit position: C1 = G0: a carry into bit 1 occurs if a carry is generated from bit 0. C2 = G1 + G0P1: A carry into bit 2 occurs if bit 1 generates a carry or bit 1 propagates a carry from bit 0. C3 = G2 + G1P2 + G0P1P2: A carry into bit 3 occurs if bit 2 generates a carry, or bit 2 propagates a carry generated from bit 1, or bits 2 and 1 propagate a carry generated from bit 0. C4 = G3 + G2P3 + G1P2P3 + G0P1P2P3: A carry into bit 4 occurs if a carry is generated from bit 3, 2, 1, or 0 along with the necessary propagate signals. ... and so forth, getting more complicated with each bit ... The important thing about these equations is that they can be computed in parallel, without waiting for a carry to ripple through each position. Once each carry is computed, the sum bits can be computed in parallel: Sn = An ⊕ Bn ⊕ Cn. In other words, the two input bits and the computed carry are combined with exclusive-or. Implementing carry lookahead with a parallel prefix adder The straightforward way to implement carry lookahead is to directly implement the equations above. However, this approach requires a lot of circuitry due to the complicated equations. Moreover, it needs gates with many inputs, which are slow for electrical reasons.5 The Pentium's adder implements the carry lookahead in a different way, called the "parallel prefix adder."7 The idea is to produce the propagate and generate signals across ranges of bits, not just single bits as before. For instance, the propagate signal P32 indicates that a carry into bit 2 would be propagated out of bit 3. And G30 indicates that bits 3 to 0 generate a carry out of bit 3. Using some mathematical tricks,6 you can take the P and G values for two smaller ranges and merge them into the P and G values for the combined range. For instance, you can start with the P and G values for bits 0 and 1, and produce P10 and G10. These could be merged with P32 and G32 to produce P30 and G30, indicating if a carry is propagated across bits 3-0 or generated by bits 3-0. Note that Gn0 is the carry-lookahead value we need for bit n, so producing these G values gives the results that we need from the carry-lookahead implementation. This merging process is more efficient than the "brute force" implementation of the carry-lookahead logic since logic subexpressions can be reused. This merging process can be implemented in many ways, including Kogge-Stone, Brent-Kung, and Ladner-Fischer.
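To make the formulas concrete, here's a short Python sketch (mine, not Intel's circuit) that adds two bytes using the direct "brute force" carry-lookahead equations above; the parallel-prefix version that the Pentium actually uses is covered next.

```python
def carry_lookahead_add(a: int, b: int, width: int = 8):
    """Add two width-bit numbers using the direct carry-lookahead
    equations: Gn = An AND Bn, Pn = An OR Bn, and
    C(n+1) = Gn + G(n-1)Pn + G(n-2)P(n-1)Pn + ..."""
    A = [(a >> i) & 1 for i in range(width)]
    B = [(b >> i) & 1 for i in range(width)]
    G = [A[i] & B[i] for i in range(width)]    # generate
    P = [A[i] | B[i] for i in range(width)]    # propagate

    carry = [0] * (width + 1)                  # carry[i] = carry into bit i
    for n in range(width):
        c = 0
        for j in range(n, -1, -1):             # each term Gj * P(j+1)...Pn
            term = G[j]
            for k in range(j + 1, n + 1):
                term &= P[k]
            c |= term
        carry[n + 1] = c

    # Sum bits: Sn = An xor Bn xor Cn, all computable "in parallel".
    total = sum((A[i] ^ B[i] ^ carry[i]) << i for i in range(width))
    return total, carry[width]                 # width-bit sum and carry-out

assert carry_lookahead_add(0b10110101, 0b01101011) == (0b00100000, 1)
```

The nested loops spell out the ever-longer product terms; in hardware, those products are exactly the many-input gates that make the brute-force approach unattractive.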
The different algorithms have different tradeoffs of performance versus circuit area. In the next section, I'll show how the Pentium implements the Kogge-Stone algorithm. The Pentium's implementation of the carry-lookahead adder The Pentium's adder is implemented with four layers of circuitry. The first layer produces the propagate and generate signals (P and G) for each bit, along with a partial sum (the sum without any carries). The second layer merges pairs of neighboring P and G values, producing, for instance, G65 and P21. The third layer generates the carry-lookahead bits by merging previous P and G values. This layer is complicated because it has different circuitry for each bit. Finally, the fourth layer applies the carry bits to the partial sum, producing the final arithmetic sum. Here is the schematic of the adder, from my reverse engineering. The circuit in the upper left is repeated 8 times to produce the propagate, generate, and partial sum for each bit. This corresponds to the first layer of logic. At the left are the circuits to merge the generate and propagate signals across pairs of bits. These circuits are the second layer of logic. Schematic of the Pentium's 8-bit carry-lookahead adder. Click for a larger version. The circuitry at the right is the interesting part—it computes the carries in parallel and then computes the final sum bits using XOR. This corresponds to the third and fourth layers of circuitry respectively. The circuitry gets more complicated going from bottom to top as the bit position increases. The diagram below is the standard diagram that illustrates how a Kogge-Stone adder works. It's rather abstract, but I'll try to explain it. The diagram shows how the P and G signals are merged to produce each output at the bottom. Each line corresponds to both the P and the G signal. Each square box generates the P and G signals for that bit. (Confusingly, the vertical and diagonal lines have the same meaning, indicating inputs going into a diamond and outputs coming out of a diamond.) Each diamond combines two ranges of P and G signals to generate new P and G signals for the combined range. Thus, the signals cover wider ranges as they progress downward, ending with the Gn0 signals that are the outputs. A diagram of an 8-bit Kogge-Stone adder highlighting the carry out of bit 6 (green) and out of bit 2 (purple). Modification of the diagram by Robey Pointer, Wikimedia Commons. It may be easier to understand the diagram by starting with the outputs. I've highlighted two circuits: The purple circuit computes the carry into bit 3 (out of bit 2), while the green circuit computes the carry into bit 7 (out of bit 6). Following the purple output upward, note that it forms a tree reaching bits 2, 1, and 0, so it generates the carry based on these bits, as desired. In more detail, the upper purple diamond combines the P and G signals for bits 2 and 1, generating P21 and G21. The lower purple diamond merges in P0 and G0 to create P20 and G20. Signal G20 indicates if bits 2 through 0 generate a carry; this is the desired carry value into bit 3. Now, look at the green output and see how it forms a tree going upward, combining bits 6 through 0. Notice how it takes advantage of the purple carry output, reducing the circuitry required. It also uses P65, P43, and the corresponding G signals. Comparing with the earlier schematic shows how the diagram corresponds to the schematic, but abstracts out the details of the gates.
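Here's the same 8-bit addition written as a Kogge-Stone parallel-prefix computation in Python. This follows the textbook algorithm shown in the diagram (each diamond is one call to merge, and the spans double at each level); it is not a gate-by-gate transcription of the Pentium's schematic, and the merge operator is the standard one rather than anything derived from the die.

```python
def merge(hi, lo):
    """Combine (G, P) for an upper bit range with (G, P) for the adjacent
    lower range: the combined range generates a carry if the upper part
    generates one, or propagates a carry generated by the lower part."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)

def kogge_stone_add(a: int, b: int, width: int = 8):
    A = [(a >> i) & 1 for i in range(width)]
    B = [(b >> i) & 1 for i in range(width)]
    # Square boxes: per-bit (G, P).
    gp = [(A[i] & B[i], A[i] | B[i]) for i in range(width)]
    # Diamonds: at each level, bit i merges in the result from `span` bits
    # lower, so after log2(width) levels gp[i] holds (G[i..0], P[i..0]).
    span = 1
    while span < width:
        gp = [merge(gp[i], gp[i - span]) if i >= span else gp[i]
              for i in range(width)]
        span *= 2
    # G[i..0] is the carry out of bit i, i.e. the carry into bit i+1.
    carry_in = [0] + [g for g, _ in gp]
    total = sum((A[i] ^ B[i] ^ carry_in[i]) << i for i in range(width))
    return total, carry_in[width]

assert kogge_stone_add(0b10110101, 0b01101011) == (0b00100000, 1)
```

Scheduling the merges differently (fewer diamonds but more levels) gives the other arrangements the article mentions, such as Brent-Kung.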
Comparing the diagram to the schematic, each square box corresponds to the circuit in the upper left of the schematic that generates P and G, the first layer of circuitry. The first row of diamonds corresponds to the pairwise combination circuitry on the left of the schematic, the second layer of circuitry. The remaining diamonds correspond to the circuitry on the right of the schematic, with each column corresponding to a bit, the third layer of circuitry. (The diagram ignores the final XOR step, the fourth layer of circuitry.) Next, I'll show how the diagram above, the logic equations, and the schematic are related. The diagram below shows the logic equation for C7 and how it is implemented with gates; this corresponds to the green diamonds above. The gates on the left below compute G63; this corresponds to the middle green diamond on the left. The next gate below computes P63 from P65 and P43; this corresponds to the same green diamond. The last gates mix in C3 (the purple line above); this corresponds to the bottom green diamond. As you can see, the diamonds abstract away the complexity of the gates. Finally, the colored boxes below show how the gate inputs map onto the logic equation. Each input corresponds to multiple terms in the equation (6 inputs replace 28 terms), showing how this approach reduces the circuitry required. This diagram shows how the carry into bit 7 is computed, comparing the equations to the logic circuit. There are alternatives to the Kogge-Stone adder. For example, a Brent-Kung adder (below) uses a different arrangement with fewer diamonds but more layers. Thus, a Brent-Kung adder uses less circuitry but is slower. (You can follow each output upward to verify that the tree reaches the correct inputs.) A diagram of an 8-bit Brent-Kung adder. Diagram by Robey Pointer, Wikimedia Commons. Conclusions The photo below shows the adder circuitry. I've removed the top two layers of metal, leaving the bottom layer of metal. Underneath the metal, polysilicon wiring and doped silicon regions are barely visible; they form the transistors. At the top are eight blocks of gates to generate the partial sum, generate, and propagate signals for each bit. (This corresponds to the first layer of circuitry as described earlier.) In the middle is the carry lookahead circuitry. It is irregular since each bit has different circuitry. (This corresponds to the second and third layers of circuitry, jumbled together.) At the bottom, eight XOR gates combine the carry lookahead output with the partial sum to produce the adder's output. (This corresponds to the fourth layer of circuitry.) The Pentium's adder circuitry with the top two layers of metal removed. The Pentium uses many adders for different purposes: in the integer unit, in the floating point unit, and for address calculation, among others. Floating-point division is known to use a carry-save adder to hold the partial remainder at each step; see my post on the Pentium FDIV division bug for details. I don't know what types of adders are used in other parts of the chip, but maybe I'll reverse-engineer some of them. Follow me on Bluesky (@righto.com) or RSS for updates. (I'm no longer on Twitter.) Footnotes and references Strangely, the original paper by Kogge and Stone had nothing to do with addition and carries. Their 1973 paper was titled "A Parallel Algorithm for the Efficient Solution of a General Class of Recurrence Equations."
It described how to solve recurrence problems on parallel computers, in particular the massively parallel ILLIAC IV. As far as I can tell, it wasn't until 1987 that their algorithm was applied to carry lookahead, in Fast Area-Efficient VLSI Adders. ↩ I'm a bit puzzled why the circuit uses an 8-bit carry-lookahead adder since only 7 bits are used. Moreover, the carry-out is unused and the adder's bottom output bit is not connected to anything. Perhaps the 8-bit adder was a standard logic block at Intel and was used as-is. ↩ I probably won't make a separate blog post on the testing circuitry, so I'll put details in this footnote. Half of the circuitry in the adder block is used to test the lookup table. The reason is that a chip such as the Pentium is very difficult to test: if one out of 3.1 million transistors goes bad, how do you detect it? For a simple processor like the 8080, you can run through the instruction set and be fairly confident that any problem would turn up. But with a complex chip, it is almost impossible to come up with an instruction sequence that would test every bit of the microcode ROM, every bit of the cache, and so forth. Starting with the 386, Intel added circuitry to the processor solely to make testing easier; about 2.7% of the transistors in the 386 were for testing. To test a ROM inside the processor, Intel added circuitry to scan the entire ROM and checksum its contents. Specifically, a pseudo-random number generator runs through each address, while another circuit computes a checksum of the ROM output, forming a "signature" word. At the end, if the signature word has the right value, the ROM is almost certainly correct. But if there is even a single bit error, the checksum will be wrong and the chip will be rejected. The pseudo-random numbers and the checksum are both implemented with linear feedback shift registers (LFSR), a shift register along with a few XOR gates to feed the output back to the input; a small software sketch of the idea appears after these footnotes. For more information on testing circuitry in the 386, see Design and Test of the 80386, written by Pat Gelsinger, who became Intel's CEO years later. Even with the test circuitry, 48% of the transistor sites in the 386 were untested. The instruction-level test suite to test the remaining circuitry took almost 800,000 clock cycles to run. The overhead of the test circuitry was about 10% more transistors in the blocks that were tested. In the Pentium, the circuitry to test the lookup table PLA is just below the 7-bit adder. An 11-bit LFSR creates the 11-bit input value to the lookup table. A 13-bit LFSR hashes the two-bit quotient result from the PLA, forming a 13-bit checksum. The checksum is fed serially to test circuitry elsewhere in the chip, where it is merged with other test data and written to a register. If the register is 0 at the end, all the tests pass. In particular, if the checksum is correct, you can be 99.99% sure that the lookup table is operating as expected. The ironic thing is that this test circuit was useless for the FDIV bug: it ensured that the lookup table held the intended values, but the intended values were wrong. Why did Intel generate test addresses with a pseudo-random sequence instead of a sequential counter? It turns out that a linear feedback shift register (LFSR) is slightly more compact than a counter. This LFSR trick was also used in a touch-tone chip and the program counter of the Texas Instruments TMS 1000 microcontroller (1974).
The bits 1+1 will set generate, but should propagate be set too? It doesn't make a difference as far as the equations are concerned. This adder sets propagate for 1+1, but some other adders do not. The answer depends on whether you use an inclusive-OR or an exclusive-OR gate to produce the propagate signal. ↩

One solution is to implement the carry-lookahead circuit in blocks of four. This can be scaled up with a second level of carry lookahead that provides carry lookahead across each group of four blocks. A third level can provide carry lookahead for groups of four second-level blocks, and so forth. This approach requires O(log(N)) levels for N-bit addition. It is used by the venerable 74181 ALU, a chip used by many minicomputers in the 1970s; I reverse-engineered the 74181 here. The 74182 chip provides the carry lookahead for the higher levels. ↩

I won't go into the mathematics of merging P and G signals; see, for example, Adder Circuits or Carry Lookahead Adders for additional details. The important property is that the carry-merge operator is associative (the (G, P) pairs form a monoid under it), so the sub-ranges can be combined in any grouping; the short code example after these footnotes illustrates the point. This flexibility is what allows different algorithms with different tradeoffs. ↩

The idea behind a prefix adder is that we want to see if there is a carry out of bit 0, bits 0-1, bits 0-2, bits 0-3, bits 0-4, and so forth. These are all the prefixes of the word. Since the prefixes are computed in parallel, it's called a parallel-prefix adder. ↩
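To make the merge operator concrete, here is a tiny Python illustration of the associativity mentioned in the P and G footnote above; the encoding of the (G, P) pairs and the particular groupings are just examples.

```python
# Each (G, P) pair summarizes a span of bits; "merge" combines an upper span
# with the adjacent lower span. The exhaustive check confirms associativity.
from itertools import product

def merge(hi, lo):
    """Combine (G, P) of an upper bit span with (G, P) of the adjacent lower span."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return (g_hi | (p_hi & g_lo), p_hi & p_lo)

pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
for b3, b2, b1, b0 in product(pairs, repeat=4):
    serial   = merge(merge(merge(b3, b2), b1), b0)  # one long chain of merges
    balanced = merge(merge(b3, b2), merge(b1, b0))  # a tree of merges, as a prefix adder uses
    assert serial == balanced
```

Because every grouping yields the same (G, P) for the full span, the adder designer is free to pick whichever grouping best trades off gate count against depth, which is exactly the difference between the Kogge-Stone and Brent-Kung arrangements above.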