
More from Computer Ads from the Past

Adam Osborne Interview by Practical Computing (1983)

Osborne reveals his philosophy

4 days ago 4 votes
Plus Post: VTech Laser MSX2

The exciting new MSX computer with more features for less money

5 days ago 4 votes
MicroTimes Interviews Jim Clark from Mosaic Communications (1994)

They discuss how to commercialize the internet.

6 days ago 6 votes
Anchor Pad

Two experts want your computer. Which one will get to it first?

a week ago 10 votes

More in technology

Notes on the Pentium's microcode circuitry

Most people think of machine instructions as the fundamental steps that a computer performs. However, many processors have another layer of software underneath: microcode. With microcode, instead of building the processor's control circuitry from complex logic gates, the control logic is implemented with code known as microcode, stored in the microcode ROM. To execute a machine instruction, the computer internally executes several simpler micro-instructions, specified by the microcode. In this post, I examine the microcode ROM in the original Pentium, looking at the low-level circuitry.

The photo below shows the Pentium's thumbnail-sized silicon die under a microscope. I've labeled the main functional blocks. The microcode ROM is highlighted at the right. If you look closely, you can see that the microcode ROM consists of two rectangular banks, one above the other.

This die photo of the Pentium shows the location of the microcode ROM.

The image below shows a closeup of the two microcode ROM banks. Each bank provides 45 bits of output; together they implement a micro-instruction that is 90 bits long. Each bank consists of a grid of transistors arranged into 288 rows and 720 columns. The microcode ROM holds 4608 micro-instructions, 414,720 bits in total. At this magnification, the ROM appears featureless, but it is covered with horizontal wires, each just 1.5 µm thick.

The 90 output lines from the ROM, with a closeup of six lines exiting the ROM.

The ROM's 90 output lines are collected into a bundle of wires between the banks, as shown above. The detail shows how six of the bits exit from the banks and join the bundle. This bundle exits the ROM to the left, travels to various parts of the chip, and controls the chip's circuitry. The output lines are in the chip's top metal layer (M3): the Pentium has three layers of metal wiring with M1 on the bottom, M2 in the middle, and M3 on top.
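The figures quoted above for the ROM's size are consistent with each other, which can be checked with a few lines of arithmetic. This is only a sanity check of the numbers in the text; the variable names are mine, and the 16 columns per output bit is simply what falls out of dividing 720 columns by 45 output bits.

```python
# Figures quoted in the text for the Pentium's microcode ROM.
ROWS_PER_BANK = 288
COLUMNS_PER_BANK = 720
BANKS = 2
OUTPUT_BITS_PER_BANK = 45
MICRO_INSTRUCTIONS = 4608

# Total storage: every row/column crossing in both banks is one bit.
total_bits = ROWS_PER_BANK * COLUMNS_PER_BANK * BANKS

# Width of one micro-instruction: both banks contribute 45 bits.
word_bits = OUTPUT_BITS_PER_BANK * BANKS

# Each output bit is picked from a group of columns in its bank.
columns_per_output_bit = COLUMNS_PER_BANK // OUTPUT_BITS_PER_BANK

# Number of words implied by the geometry: one word per (row, column-in-group).
words_from_geometry = ROWS_PER_BANK * columns_per_output_bit

print(total_bits)              # 414720
print(word_bits)               # 90
print(columns_per_output_bit)  # 16
print(words_from_geometry)     # 4608
```

The geometry gives 4608 words of 90 bits, matching the stated total of 414,720 bits.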
The Pentium has a large number of bits in its micro-instruction, 90 bits compared to 21 bits in the 8086. Presumably, the Pentium has a "horizontal" microcode architecture, where the microcode bits correspond to low-level control signals, as opposed to "vertical" microcode, where the bits are encoded into denser micro-instructions. I don't have any information on the Pentium's encoding of microcode; unlike the 8086, the Pentium's patents don't provide any clues. The 8086's microcode ROM holds 512 micro-instructions, much less than the Pentium's 4608 micro-instructions. This makes sense, given the much greater complexity of the Pentium's instruction set, including the floating-point unit on the chip.

The image below shows a closeup of the Pentium's microcode ROM. For this image, I removed the three layers of metal and the polysilicon layer to expose the chip's underlying silicon. The pattern of silicon doping is visible, showing the transistors and thus the data stored in the ROM. If you have enough time, you can extract the bits from the ROM by examining the silicon and seeing where transistors are present.

A closeup of the ROM showing how bits are encoded in the layout of transistors.

Before explaining the ROM's circuitry, I'll review how an NMOS transistor is constructed. A transistor can be considered a switch between the source and drain, controlled by the gate. The source and drain regions (green) consist of silicon doped with impurities to change its semiconductor properties, forming N+ silicon. (These regions are visible in the photo above.) The gate consists of a layer of polysilicon (red), separated from the silicon by a very thin insulating oxide layer. Whenever polysilicon crosses active silicon, a transistor is formed.

Diagram showing the structure of an NMOS transistor.

Bits are stored in the ROM through the pattern of transistors in the grid.
The presence or absence of a transistor stores a 0 or 1 bit.[1] The closeup below shows eight bits of the microcode ROM. There are four transistors present and four gaps where transistors are missing. Thus, this part of the ROM holds four 0 bits and four 1 bits. For the diagram below, I removed the three metal layers and the polysilicon to show the underlying silicon. I colored doped (active) silicon regions green, and drew in the horizontal polysilicon lines in red. As explained above, a transistor is created if polysilicon crosses doped silicon. Thus, the contents of the ROM are defined by the pattern of silicon regions, which creates the transistors.

Eight bits of the microcode ROM, with four transistors present.

The horizontal silicon lines are used as wiring to provide ground to the transistors, while the horizontal polysilicon lines select one of the rows in the ROM. The transistors in that row will turn on, pulling the associated output lines low. That is, the presence of a transistor in a row causes the output to be pulled low, while the absence of a transistor causes the output line to remain high.

A schematic corresponding to the eight bits above.

The diagram below shows the silicon, polysilicon, and bottom metal (M1) layers. I removed the metal from the left to reveal the silicon and polysilicon underneath, but the pattern of vertical metal lines continues there. As shown earlier, the silicon pattern forms transistors. Each horizontal silicon line has a connection to ground through a metal line (not shown). The horizontal polysilicon lines select a row. When polysilicon lines cross doped silicon, the gate of a transistor is formed. Two transistors may share the drain, as in the transistor pair on the left.

Diagram showing the silicon, polysilicon, and M1 layers.

The vertical metal wires form the outputs.
The circles are contacts between the metal wire and the silicon of a transistor.[2] Short metal jumpers connect the polysilicon lines to the metal layer above, which will be described next.

The image below shows the upper left corner of the ROM. The yellowish metal lines are the top metal layer (M3), while the reddish metal lines are the middle metal layer (M2). The thick yellowish M3 lines distribute ground to the ROM. Underneath the horizontal M3 line, a horizontal M2 line also distributes ground. The grids of black dots are numerous contacts between the M3 line and the M2 line, providing a low-resistance connection. The M2 line, in turn, connects to vertical M1 ground lines underneath; these wide vertical lines are faintly visible. These M1 lines connect to the silicon, as shown earlier, providing ground to each transistor. This illustrates the complexity of power distribution in the Pentium: the thick top metal (M3) is the primary distribution of +5 volts and ground through the chip, but power must be passed down through M2 and M1 to reach the transistors.

The upper left corner of the ROM.

The other important feature above is the horizontal metal lines, which help distribute the row-select signals. As shown earlier, horizontal polysilicon lines provide the row-select signals to the transistors. However, polysilicon is not as good a conductor as metal, so long polysilicon lines have too much resistance. The solution is to run metal lines in parallel, periodically connected to the underlying polysilicon lines and reducing the overall resistance. Since the vertical metal output lines are in the M1 layer, the horizontal row-select lines run in the M2 layer so they don't collide. Short "jumpers" in the M1 layer connect the M2 lines to the polysilicon lines.

To summarize, each ROM bank contains a grid of transistors and transistor vacancies to define the bits of the ROM.
The ROM is carefully designed so the different layers (silicon, polysilicon, M1, and M2) work together to maximize the ROM's performance and density.

Microcode Address Register

As the Pentium executes an instruction, it provides the address of each micro-instruction to the microcode ROM. The Pentium holds this address, the micro-address, in the Microcode Address Register (MAR). The MAR is a 13-bit register located above the microcode ROM.

The diagram below shows the Microcode Address Register above the upper ROM bank. It consists of 13 bits; each bit has multiple latches to hold the value as well as any pushed subroutine micro-addresses. Between bits 7 and 8, some buffer circuitry amplifies the control signals that go to each bit's circuitry. At the right, drivers amplify the outputs from the MAR, sending the signals to the row drivers and column-select circuitry that I will discuss below. To the left of the MAR is a 32-bit register that is apparently unrelated to the microcode ROM, although I haven't determined its function.

The Microcode Address Register is located above the upper ROM bank.

The outputs from the Microcode Address Register select rows and columns in the microcode ROM, as I'll explain below. Bits 12 through 7 of the MAR select a block of 8 rows, while bits 6 through 4 select a row in this block. Bits 3 through 0 select one column out of each group of 16 columns to select an output bit. Thus, the microcode address controls what word is provided by the ROM.

Several different operations can be performed on the Microcode Address Register. When executing a machine instruction, the MAR must be loaded with the address of the corresponding microcode routine. (I haven't determined how this address is generated.) As microcode is executed, the MAR is usually incremented to move to the next micro-instruction. However, the MAR can branch to a new micro-address as required.
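The way the 13-bit micro-address is split into row and column selects can be sketched in a few lines. This is a minimal illustration of the bit fields described above, not a model of the actual circuitry: the function names are hypothetical, and the sketch assumes that block numbers simply count up from 0, which says nothing about the physical top-to-bottom ordering on the die.

```python
def split_micro_address(addr):
    """Split a 13-bit micro-address into (row_block, row_in_block, column).

    Bit fields follow the description in the text:
      bits 12..7 -> block of 8 rows
      bits  6..4 -> row within the block
      bits  3..0 -> one column out of each group of 16
    """
    if not 0 <= addr < (1 << 13):
        raise ValueError("micro-address must fit in 13 bits")
    row_block = (addr >> 7) & 0x3F
    row_in_block = (addr >> 4) & 0x7
    column = addr & 0xF
    return row_block, row_in_block, column


def rom_row(addr):
    """Logical row index (assumed to count from 0) selected by the address."""
    row_block, row_in_block, _ = split_micro_address(addr)
    return row_block * 8 + row_in_block


print(split_micro_address(165))   # (1, 2, 5)
print(split_micro_address(4607))  # (35, 7, 15)
print(rom_row(4607))              # 287
```

With 288 rows per bank, only row blocks 0 through 35 correspond to physical rows, so the highest address that lands in the ROM under this numbering is 4607, in line with the 4608 micro-instructions the ROM holds.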
The MAR also supports microcode subroutine calls; it will push the current micro-address and jump to the new micro-address. At the end of the micro-subroutine, the micro-address is popped so execution returns to the previous location. The MAR supports three levels of subroutine calls, as it contains three registers to hold the stack of pushed micro-addresses.

The MAR receives control signals and addresses from standard-cell logic located above the MAR. Strangely, in Intel's published floorplans for the Pentium, this standard-cell logic is labeled as part of the branch prediction logic, which is above it. However, carefully tracing the signals from the standard-cell logic shows that it is connected to the Microcode Address Register, not the branch predictor.

Row-select drivers

As explained above, each ROM bank has 288 rows of transistors, with polysilicon lines to select one of the rows. To the right of the ROM is circuitry that activates one of these row-select lines, based on the micro-address. Each row matches a different 9-bit address. A straightforward implementation would use a 9-input AND gate for each row, matching a particular pattern of 9 address bits or their complements. However, this implementation would require 576 very large AND gates, so it is impractical. Instead, the Pentium uses an optimized implementation with one 6-input AND gate for each group of 8 rows. The remaining three address bits are decoded once at the top of the ROM. As a result, each row only needs one gate, detecting if its group of eight rows is selected and if the particular one of eight is selected.

Simplified schematic of the row driver circuitry.

The schematic above shows the circuitry for a group of eight rows, slightly simplified.[3] At the top, three address bits are decoded, generating eight output lines with one active at a time. The remaining six address bits are inverted, providing the bit and its complement to the decoding circuitry.
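The two-level decoding described above can be modeled in software as a rough sketch. It follows the simplified schematic only at the logic level: three address bits become eight one-hot lines, six bits are supplied in true and complement form, each group of eight rows has one 6-input match, and each row combines its group match with one of the eight lines. The function names and the assumption that groups are numbered from 0 in address order are mine.

```python
def predecode(row_address):
    """Turn a 9-bit row address into the shared decoder signals.

    Returns (one_of_eight, true_lines, complement_lines):
      one_of_eight     - 8 lines, exactly one active, from the low 3 bits
      true_lines       - the upper 6 address bits
      complement_lines - the inverted upper 6 address bits
    """
    low3 = row_address & 0x7
    high6 = (row_address >> 3) & 0x3F
    one_of_eight = [int(i == low3) for i in range(8)]
    true_lines = [(high6 >> bit) & 1 for bit in range(6)]
    complement_lines = [1 - t for t in true_lines]
    return one_of_eight, true_lines, complement_lines


def row_select_lines(row_address, groups=36):
    """Return the row-select lines for one bank (8 rows per group)."""
    one_of_eight, true_lines, complement_lines = predecode(row_address)
    lines = []
    for group in range(groups):
        # 6-input AND: each input is wired to either the true or the
        # complement line, depending on the group's own number.
        group_match = all(
            true_lines[bit] if (group >> bit) & 1 else complement_lines[bit]
            for bit in range(6)
        )
        # One gate per row: group selected AND this one-of-eight line active.
        for row in range(8):
            lines.append(int(group_match and one_of_eight[row]))
    return lines


lines = row_select_lines(10)
print(len(lines), sum(lines), lines.index(1))  # 288 1 10
```

In this model a row address beyond the 36 populated groups activates no line at all, since no group's 6-input gate matches it.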
Thus, the 9 bits are converted into 20 signals that flow through the decoders, a large number of wires, but not unmanageable. Each group of eight rows has a 6-input AND gate that matches a particular 6-bit address, determined by which inputs are complemented and which are not.[4] The NAND gate and inverter at the left combine the 3-bit decoding and the 6-bit decoding, activating the appropriate row.

Since there are up to 720 transistors in each row, the row-select lines need to be driven with high current. Thus, the row-select drivers use large transistors, roughly 25 times the size of a regular transistor. To fit these transistors into the same vertical spacing as the rest of the decoding circuitry, a tricky packing is used. The drivers for each group of 8 rows are packed into a 3×3 grid, except the first column has two drivers (since there are 8 drivers in the group, not 9). To avoid a gap, the drivers in the first column are larger vertically and squashed horizontally.

Output circuitry

The schematic below shows the multiplexer circuit that selects one of 16 columns for a microcode output bit. The first stage has four 4-to-1 multiplexers. Next, another 4-to-1 multiplexer selects one of the outputs. Finally, a BiCMOS driver amplifies the output for transmission to the rest of the processor.

The 16-to-1 multiplexer/output driver.

In more detail, the ROM and the first multiplexer are essentially NMOS circuits, rather than CMOS. Specifically, the ROM's grid of transistors is constructed from NMOS transistors that can pull a column line low, but there are no PMOS transistors in the grid to pull the line high (since that would double the size of the ROM). Instead, the multiplexer includes precharge transistors to pull the lines high, presumably in the clock phase before the ROM is read. The capacitance of the lines will keep the line high unless it is pulled low by a transistor in the grid.
One of the four transistors in the multiplexer is activated (by control signal a, b, c, or d) to select the desired line. The output goes to a "keeper" circuit, which keeps the output high unless it is pulled low. The keeper uses an inverter with a weak PMOS transistor that can only provide a small pull-up current. A stronger low input will overpower this transistor, switching the state of the keeper.

The output of this multiplexer, along with the outputs of three other multiplexers, goes to the second-stage multiplexer,[5] which selects one of its four inputs, based on control signals e, f, g, and h. The output of this multiplexer is held in a latch built from two inverters. The second inverter has weak transistors so the latch can be easily forced into the desired state. The output from the first latch goes through a CMOS switch into a second latch, creating a flip-flop. The output from the second latch goes to a BiCMOS driver, which drives one of the 90 microcode output lines. Most processors are built from CMOS circuitry (i.e. NMOS and PMOS transistors), but the Pentium is built from BiCMOS circuitry: bipolar transistors as well as CMOS. At the time, bipolar transistors improved performance for high-current drivers; see my article on the Pentium's BiCMOS circuitry.

The diagram below shows three bits of the microcode output. This circuitry is for the upper ROM bank; the circuitry is mirrored for the lower bank. The circuitry matches the schematic above. Each of the three blocks has 16 input lines from the ROM grid. Four 4-to-1 multiplexers reduce this to 4 lines, and the second multiplexer selects a single line. The result is latched and amplified by the output driver. (Note the large square shape of the bipolar transistors.) Next is the shift register that processes the microcode ROM outputs for testing. The shift register uses XOR logic for its feedback; unlike the rest of the circuitry, the XOR logic is irregular since only some bits are fed into XOR gates.
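A shift register with XOR feedback on only some of its bits is the structure usually called a linear feedback shift register (LFSR). As a rough sketch of how such a register can fold a stream of ROM output words into a single check value, the toy model below XORs each word into the register as it shifts. Everything specific here is an assumption for illustration: the 16-bit width, the tap positions, the toy data, and the function names are hypothetical and are not taken from the Pentium.

```python
WIDTH = 16                      # assumed register width, for illustration only
MASK = (1 << WIDTH) - 1
TAPS = (16, 14, 13, 11)         # illustrative tap positions (1-based), not the Pentium's


def lfsr_step(state):
    """Shift the register left by one, feeding back the XOR of the tapped bits."""
    feedback = 0
    for tap in TAPS:
        feedback ^= (state >> (tap - 1)) & 1
    return ((state << 1) | feedback) & MASK


def signature(words):
    """Fold a sequence of WIDTH-bit words into a single check value."""
    state = 0
    for word in words:
        # Advance the register, then XOR the next ROM word into it.
        state = lfsr_step(state) ^ (word & MASK)
    return state


# Toy "ROM" contents (arbitrary deterministic values).
rom = [(i * 40503) & MASK for i in range(64)]
good = signature(rom)

# Flip a single bit in one word and recompute.
damaged = list(rom)
damaged[17] ^= 1 << 5
print(signature(damaged) != good)  # True
```

Because the shift step in this model is reversible and linear, a single flipped bit can never cancel itself out, so the final value always changes; errors in several words can occasionally cancel, which is why such a check is described as almost certain rather than certain.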
Three bits of output from the microcode. I removed the three metal layers to show the polysilicon and silicon.

Circuitry for testing

Why does the microcode ROM have shift registers and XOR gates? The reason is that a chip such as the Pentium is very difficult to test: if one out of 3.1 million transistors goes bad, how do you detect it? For a simple processor like the 8086, you can run through the instruction set and be fairly confident that any problem would turn up. But with a complex chip, it is almost impossible to design an instruction sequence that would test every bit of the microcode ROM, every bit of the cache, and so forth. Starting with the 386, Intel added circuitry to the processor solely to make testing easier; about 2.7% of the transistors in the 386 were for testing. The Pentium has this testing circuitry for many ROMs and PLAs, including the division PLA that caused the infamous FDIV bug.

To test a ROM inside the processor, Intel added circuitry to scan the entire ROM and checksum its contents. Specifically, a pseudo-random number generator runs through each address, while another circuit computes a checksum of the ROM output, forming a "signature" word. At the end, if the signature word has the right value, the ROM is almost certainly correct. But if there is even a single bit error, the checksum will be wrong and the chip will be rejected. The pseudo-random numbers and the checksum are both implemented with linear feedback shift registers (LFSR), a shift register along with a few XOR gates to feed the output back to the input. For more information on testing circuitry in the 386, see Design and Test of the 80386, written by Pat Gelsinger, who became Intel's CEO years later.

Conclusions

You'd think that implementing a ROM would be straightforward, but the Pentium's microcode ROM is surprisingly complex due to its optimized structure and its circuitry for testing.
I haven't been able to determine much about how the microcode works, except that the micro-instruction is 90 bits wide and the ROM holds 4608 micro-instructions in total. But hopefully you've found this look at the circuitry interesting.

Disclaimer: this should all be viewed as slightly speculative and there are probably some errors. I didn't want to prefix every statement with "I think that..." but you should pretend it is there.

I plan to write more about the implementation of the Pentium, so follow me on Bluesky (@righto.com) or RSS for updates. Peter Bosch has done some reverse engineering of the Pentium II microcode; his information is here.

Footnotes and references

1. It is arbitrary if a transistor corresponds to a 0 bit or a 1 bit. A transistor will pull the output line low (i.e. a 0 bit), but the signal could be inverted before it is used. More analysis of the circuitry or ROM contents would clear this up.

2. When looking at a ROM like this, the contact pattern seems like it should tell you the contents of the ROM. Unfortunately, this doesn't work. Since a contact can be attached to one or two transistors, the contact pattern doesn't give you enough information. You need to see the silicon to determine the transistor pattern and thus the bits.

3. I simplified the row driver schematic. The most interesting difference is that the NAND gates are optimized to use three transistors each, instead of four transistors. The trick is that one of the NMOS transistors is essentially shared across the group of 8 drivers; an inverter drives the low side of all eight gates. The second simplification is that the 6-input AND gate is implemented with two 3-input NAND gates and a NOR gate for electrical reasons. Also, the decoder that converts 3 bits into 8 select lines is located between the banks, at the right, not at the top of the ROM as I showed in the schematic. Likewise, the inverters for the 6 row-select bits are not at the top. Instead, there are 6 inverters and 6 buffers arranged in a column to the right of the ROM, which works better for the layout. These are BiCMOS drivers so they can provide the high-current outputs necessary for the long wires and numerous transistor gates that they must drive.

4. The inputs to the 6-input AND gate are arranged in a binary counting pattern, selecting each row in sequence. This binary arrangement is standard for a ROM's decoder circuitry and is a good way to recognize a ROM on a die. The Pentium has 36 row decoders, rather than the 64 that you'd expect from a 6-bit input. The ROM was made to the size necessary, rather than a full power of two. In most ROMs, it's difficult to determine if the ROM is addressed bottom-to-top or top-to-bottom. However, because the microcode ROM's counting pattern is truncated, one can see that the top bank starts with 0 at the top and counts downward, while the bottom bank is reversed, starting with 0 at the bottom and counting upward.

5. A note to anyone trying to read the ROM contents: it appears that the order of entries in a group of 16 is inconsistent, so a straightforward attempt to visually read the ROM will end up with scrambled data. That is, some of the groups are reversed. I don't see any obvious pattern in which groups are reversed.

A closeup of the first stage output mux. This image shows the M1 metal layer.

In the diagram above, look at the contacts from the select lines, connecting the select lines to the mux transistors. The contacts on the left are the mirror image of the contacts on the right, so the columns will be accessed in the opposite order. This mirroring pattern isn't consistent, though; sometimes neighboring groups are mirrored and sometimes they aren't. I don't know why the circuitry has this layout. Sometimes mirroring adjacent groups makes the layout more efficient, but the inconsistent mirroring argues against this. Maybe an automated layout system decided this was the best way. Or maybe Intel did this to provide a bit of obfuscation against reverse engineering.

23 hours ago 3 votes
Electricity and the speed of light

If it's all just electromagnetic waves, why is electricity in a conductor moving slower than visible light?

19 hours ago 3 votes
The April Fools joke that might have got me fired

Everyone should pull one great practical joke in their lifetimes. This one was mine, and I think it's past the statute of limitations. The story is true. Only the names are redacted to protect the guilty. My first job out of college was a database programmer, even though my undergraduate degree had nothing to do with computers and my current profession still mostly doesn't. The reason was that the University I worked for couldn't afford competitive wages, but they did offer various fringe benefits, and they were willing to train someone who at least had decent working knowledge. I, as a newly minted graduate of the august University of California system, had decent working knowledge at least of BSD/386 and SunOS, but more importantly also had the glowing recommendation of my predecessor who was being promoted into a new position. I was hired, which was their first mistake. The system I was hired to work on was an HP 9000 K250, one of Hewlett-Packard's big PA-RISC servers. I wish I had a photograph of it, but all I have are a couple bad scans of some bad Polaroids of my office and none of the server room. The server room was downstairs from my office back in the days when server rooms were on-premises, complete with a swipe card lock and a halon system that would give you a few seconds of grace before it flooded everything. The K250 hulked in there where it had recently replaced what I think was an Encore mini of some sort (probably a Multimax, since it was a few years old and the 88K Encores would have been too new for the University), along with the AIX RS/6000s that provided student and faculty shell accounts and E-mail, the bonded T1 lines, some of the terminal servers, the massive Cabletron routers and a lot of the telco stuff. One of the tape reels from the Encore hangs on my wall today as a memento. 
The K250 and the Encore it replaced (as well as the L-Class that later replaced the K250 when I was a consultant) ran an all-singing, all-dancing student information system called CARS. CARS is still around, renamed Jenzabar, though I suspect that many of its underpinnings remain if you look under the table. In those days CARS was a massive overlay that was loaded atop the operating system and database, which when I started were, respectively, HP/UX 10.20 and Informix. (I'm old.) It used Informix tables, screens and stored procedures plus its own text UI libraries to run code written variously as Perform screens, SQL, C-shell scripts and plain old C or ESQL/C. Everything was tracked in RCS using overgrown Makefiles. I had the admin side (resource management, financials, attendance trackers, etc.) and my office partner had the academic side (mostly grades and faculty tracking). My job was to write and maintain this code and shortly after to help the University create custom applications in CARS' brand-spanking new web module, which chose the new hotness in scripting languages, i.e., Perl. Fortuitously I had learned Perl in, appropriately enough, a computational linguistics course. CARS also managed most of the printers on campus except for the few that the RS/6000s controlled directly. Most of the campus admin printers were HP LaserJet 4 units of some derivation equipped with JetDirect cards for networking. These are great warhorse printers, some of the best laser printers HP ever made. I suspect there were line printers other places, but those printers were largely what existed in the University's offices. It turns out that the READY message these printers show on their VFD panels is changeable. 
I don't remember where I read this, probably idly paging through the manual over a lunch break, but initially the only fun things I could think of to do were to have the printer say hi to my boss when she sent jobs to it, stuff like that (whereupon she would tell me to get back to work). Then it dawned on me: because I had access to the printer spools on the K250, and the spool directories were conveniently named the same as their hostnames, I knew where each and every networked LaserJet on campus was.

I was young, rash and motivated. This was a hack I just couldn't resist. It would be even better than what had been my favourite joke at my alma mater, where campus services, notable for posting various service suspension notices, posted one April Fools' Day that gravity itself would be suspended to various buildings. I felt sure this hack would eclipse that too.

The plan on April Fools' Day was to get into work at OMG early o'clock and iterate over every entry in the spool, sending it a sequence that would change the READY message to INSERT 5 CENTS. This would cause every networked LaserJet on campus to appear to ask for a nickel before you printed anything. The script was very simple (this is the actual script, I saved it):

The ^[ was a literal ASCII 27 ESCape character, and netto was a simple netcat-like script I had written in these days before netcat was widely used. That's it. Now, let me be clear: the printer was still ready! The effect was merely cosmetic! It would still print if you sent jobs to it! Nevertheless, to complete the effect, this message was sent out on the campus-wide administration mailing list (which I also saved):

At the end of the day I would reset everything back to READY, smile smugly, and continue with my menial existence. That was the plan.

Having sent this out, I fielded a few anxious calls from people who laughed uproariously when they realized, and I reset their printers manually afterwards.
The people who knew me, knew I was a practical joker, took note of the date, and sent approving replies. One of the best was sent to me later in the day by intercampus mail, printed on their laser printer, with a nickel taped to it. Unfortunately, not everybody on campus knew me, and those who did not not only did not call me, but instead called university administration directly. By 8:30am it was chaos in the main office and this filtered up to the head of HR, who most definitely did know me, and told me I'd better send a retraction before the CFO got in or I was in big trouble. That went wrong also, because my retraction said that campus administration was not considering charging per-page fees when in fact they actually were, so I had to retract it and send a new retraction that didn't call attention to that fact. I also ran the script to reset everything early. Eventually the hubbub finally settled down around noon. Everybody in the office thought it was very funny. Even my boss, who officially disapproved, thought it was somewhat funny. The other thing that went wrong, as if all that weren't enough, was that the director of IT — which is to say, my boss's boss — was away on vacation when all this took place. (Read E-mail remotely? Who does that?) I compounded this situation with the tactical error of going skiing over the coming weekend and part of the next week, most of which I spent snowplowing down the bunny slopes face first, so that he discovered all the angry E-mail in his box without me around to explain myself. (My office partner remembers him coming in wide-eyed asking, "what did he do??") When I returned, it was icier in the office than it had been on the mountain. The assistant director, who thought it was funny, was in trouble for not putting a lid on it, and I was in really big trouble for doing it in the first place. 
I was appropriately contrite and made various apologies and was an uncharacteristically model employee for an unnaturally long period of time. The Ice Age eventually thawed and the incident was officially dropped except for a "poor judgment" on my next performance review and the satisfaction of what was then considered the best practical joke ever pulled on campus. Indeed, everyone agreed it was much more technically accomplished than the previous award winner, where someone had supposedly gotten it around the grounds that the security guards at the entrance would be charging a nominal admission fee per head. Years later they still said it was legendary. I like to think they still do.

10 hours ago 3 votes
XSS To RCE By Abusing Custom File Handlers - Kentico Xperience CMS (CVE-2025-2748)

We know what you’re waiting for - this isn’t it. Today, we’re back with more tales of our adventures in Kentico’s Xperience CMS. Due to its wide usage, the type of solution, and the types of enterprises using this solution

7 hours ago 2 votes
You have got to be kidding me

Mia Sato writing for The Verge: Elon Musk’s $1 Million Handout Winners Are Connected to Republican Causes On Sunday, a few thousand people in Green Bay, Wisconsin, gathered to hear Elon Musk speak — and give away two giant cardboard checks for $1 million. Attendance at the event

an hour ago 1 votes