The most popular SQLite and PostgreSQL database drivers in Go are (roughly) 20-76% slower than alternative Go drivers on insert-heavy benchmarks of mine. So if you are bulk-inserting data with Go (and potentially also bulk-retrieving data with Go), you may want to consider the driver carefully. And you may want to consider avoiding database/sql. Some driver authors have noted and benchmarked issues with database/sql, so it may be the case that database/sql is responsible for some of this overhead. And indeed the variations between drivers in this post will be demonstrated both by using database/sql and by avoiding it. This post won't specifically prove that the variation is due to the database/sql interface. But that doesn't change the premise. It has also been suggested that other frontends like sqlc, sqlx, and GORM do worse than database/sql. This post is built on the workload, environment, libraries, and methodology in my databases-intuition repo on GitHub. See the repo for details that will help...
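To make the distinction concrete, here is a minimal, hedged sketch (not the benchmark code from the repo) of bulk-inserting the same rows into Postgres two ways: through database/sql with a prepared INSERT, and through pgx's native CopyFrom, which skips the database/sql layer and the per-row round trips. The connection string, table, and columns are made up for illustration.

```go
package main

import (
	"context"
	"database/sql"
	"log"

	"github.com/jackc/pgx/v5"
	_ "github.com/jackc/pgx/v5/stdlib" // registers the "pgx" database/sql driver
)

// Hypothetical connection string and table, for illustration only.
const dsn = "postgres://localhost:5432/bench"

// insertViaDatabaseSQL goes through database/sql: a transaction, a prepared
// statement, and one Exec (round trip) per row.
func insertViaDatabaseSQL(rows [][]any) error {
	db, err := sql.Open("pgx", dsn)
	if err != nil {
		return err
	}
	defer db.Close()

	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op after a successful Commit

	stmt, err := tx.Prepare("INSERT INTO items (id, name) VALUES ($1, $2)")
	if err != nil {
		return err
	}
	defer stmt.Close()

	for _, r := range rows {
		if _, err := stmt.Exec(r[0], r[1]); err != nil {
			return err
		}
	}
	return tx.Commit()
}

// insertViaPgxCopy uses pgx directly: COPY streams all rows in one shot and
// skips the database/sql layer entirely.
func insertViaPgxCopy(rows [][]any) error {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, dsn)
	if err != nil {
		return err
	}
	defer conn.Close(ctx)

	_, err = conn.CopyFrom(ctx,
		pgx.Identifier{"items"},
		[]string{"id", "name"},
		pgx.CopyFromRows(rows))
	return err
}

func main() {
	rows := [][]any{{1, "a"}, {2, "b"}}
	if err := insertViaDatabaseSQL(rows); err != nil {
		log.Fatal(err)
	}
	if err := insertViaPgxCopy(rows); err != nil {
		log.Fatal(err)
	}
}
```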
a year ago

More from Notes on software development

Things that go wrong with disk IO

There are a few interesting scenarios to keep in mind when writing applications (not just databases!) that read and write files, particularly in transactional contexts where you actually care about the integrity of the data and when you are editing data in place (versus copy-on-write, for example). If I don't say otherwise, I'm talking about behavior on Linux.

The research version of this blog post is Parity Lost and Parity Regained and Characteristics, Impact, and Tolerance of Partial Disk Failures. These two papers also go into the frequency of some of the issues discussed here. These behaviors actually happen in real life!

Thank you to Alex Miller and George Xanthakis for reviewing a draft of this post.

Terminology

Some of these terms are reused in different contexts, and sometimes they are reused because they effectively mean the same thing in a certain configuration. But I'll try to be explicit to avoid confusion.

Sector: The smallest amount of data that can be read and written atomically by hardware. It used to be 512 bytes, but on modern disks it is often 4KiB. There doesn't seem to be any safe assumption you can make about sector size, despite file system defaults (see below). You must check your disks to know.

Block (filesystem/kernel view): Typically set to the sector size, since only this block size is atomic. The default in ext4 is 4KiB.

Page (kernel view): A disk block that is in memory. Any read or write smaller than a block will pull the entire block into kernel memory, even if less than that amount is sent back to userland.

Page (database/application view): The smallest amount of data the system (database, application, etc.) chooses to act on when it's read or written or held in memory. The page size is some multiple of the filesystem/kernel block size (including the multiple being 1). SQLite's default page size is 4KiB. MySQL's default page size is 16KiB. Postgres's default page size is 8KiB.

Things that go wrong

The data didn't reach disk

By default, file writes succeed when the data is copied into kernel memory (buffered IO). The man page for write(2) says:

A successful return from write() does not make any guarantee that data has been committed to disk. On some filesystems, including NFS, it does not even guarantee that space has successfully been reserved for the data. In this case, some errors might be delayed until a future write(), fsync(2), or even close(2). The only way to be sure is to call fsync(2) after you are done writing all your data.

If you don't call fsync on Linux, the data isn't necessarily durably on disk, and if the system crashes or restarts before the disk writes the data to non-volatile storage, you may lose data. With O_DIRECT, file writes succeed when the data is copied to at least the disk cache. Alternatively, you could open the file with O_DIRECT|O_SYNC (or O_DIRECT|O_DSYNC) and forgo fsync calls. fsync on macOS is a no-op. If you're confused, read Userland Disk I/O. Postgres, SQLite, MongoDB, and MySQL fsync data before considering a transaction successful by default. RocksDB does not.

The data was fsynced but fsync failed

fsync isn't guaranteed to succeed. And when it fails you can't tell which write failed. It may not even be a failure of a write to a file that your process opened:

Ideally, the kernel would report errors only on file descriptions on which writes were done that subsequently failed to be written back. The generic pagecache infrastructure does not track the file descriptions that have dirtied each individual page, however, so determining which file descriptors should get back an error is not possible. Instead, the generic writeback error tracking infrastructure in the kernel settles for reporting errors to fsync on all file descriptions that were open at the time that the error occurred. In a situation with multiple writers, all of them will get back an error on a subsequent fsync, even if all of the writes done through that particular file descriptor succeeded (or even if there were no writes on that file descriptor at all).

Don't be 2018-era Postgres. The only way to have known which exact write failed would be to open the file with O_DIRECT|O_SYNC (or O_DIRECT|O_DSYNC), though this is not the only way to handle fsync failures.

The data was corrupted

If you don't checksum your data on write and check the checksum on read (as well as doing periodic scrubbing a la ZFS), you will never be aware if and when the data gets corrupted, and you will have to restore (who knows how far back in time) from backups if and when you notice. ZFS, MongoDB (WiredTiger), MySQL (InnoDB), and RocksDB checksum data by default. Postgres and SQLite do not (though databases created from Postgres 18+ will). You should probably turn on checksums on any system that supports it, regardless of the default.

The data was partially written

Only when the page size you write = block size of your filesystem = sector size of your disk is a write guaranteed to be atomic. If you need to write multiple sectors of data atomically, there is the risk that some sectors are written and then the system crashes or restarts. This is called torn writes or torn pages. Postgres, SQLite, and MySQL (InnoDB) handle torn writes. Torn writes are by definition not relevant to immutable storage systems like RocksDB (and other LSM-tree or copy-on-write systems like MongoDB (WiredTiger)) unless writes (that update metadata) span sectors. If your system duplicates all writes, as MySQL (InnoDB) does, or as you can with data=journal in ext4, you may also not have to worry about torn writes. On the other hand, this amplifies writes 2x.

The data didn't reach disk, part 2

Sometimes fsync succeeds but the data isn't actually on disk because the disk is lying. These are called lost writes or phantom writes. You can be resilient to phantom writes by always reading back what you wrote (expensive) or versioning what you wrote. Databases and file systems generally do not seem to handle this situation.

The data was written to the wrong place, read from the wrong place

If you aren't including where data is supposed to be on disk as part of the checksum or page itself, you risk being unaware that you wrote data to the wrong place or that you read from the wrong place. This is called misdirected writes/reads. Databases and file systems generally do not seem to handle this situation.

Further reading

In increasing levels of paranoia (laudatory), follow ZFS, Andrea and Remzi Arpaci-Dusseau, and TigerBeetle.
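As a concrete (and hedged) illustration of two of the cheaper defenses above, here is a small Go sketch that writes fixed-size pages, fsyncs after writing, and stores a CRC that covers both the page contents and the page's intended offset, so corruption and misdirected reads fail verification on read-back. The page layout, file name, and 4KiB size are assumptions for illustration, not any particular database's format, and the sketch does nothing about torn or phantom writes.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
	"log"
	"os"
)

const pageSize = 4096 // illustrative; check your filesystem block and disk sector sizes

// checksum covers the page payload plus the offset the page is supposed to
// live at, so a page read from the wrong place fails verification.
func checksum(payload []byte, offset int64) uint32 {
	h := crc32.NewIEEE()
	h.Write(payload)
	binary.Write(h, binary.LittleEndian, offset)
	return h.Sum32()
}

// writePage writes one page and fsyncs. Without the Sync call, a successful
// Write only means the data reached kernel memory, not the disk.
func writePage(f *os.File, pageNo int64, payload []byte) error {
	if len(payload) > pageSize-4 {
		return fmt.Errorf("payload larger than page")
	}
	page := make([]byte, pageSize)
	copy(page, payload)
	offset := pageNo * pageSize
	binary.LittleEndian.PutUint32(page[pageSize-4:], checksum(page[:pageSize-4], offset))
	if _, err := f.WriteAt(page, offset); err != nil {
		return err
	}
	return f.Sync() // fsync(2); per the post, macOS needs F_FULLFSYNC instead
}

// readPage re-verifies the checksum, catching silent corruption and
// misdirected reads/writes.
func readPage(f *os.File, pageNo int64) ([]byte, error) {
	page := make([]byte, pageSize)
	offset := pageNo * pageSize
	if _, err := f.ReadAt(page, offset); err != nil {
		return nil, err
	}
	want := binary.LittleEndian.Uint32(page[pageSize-4:])
	if checksum(page[:pageSize-4], offset) != want {
		return nil, fmt.Errorf("page %d: checksum mismatch (corrupt or misdirected)", pageNo)
	}
	return page[:pageSize-4], nil
}

func main() {
	f, err := os.OpenFile("pages.db", os.O_RDWR|os.O_CREATE, 0o644)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	if err := writePage(f, 0, []byte("hello")); err != nil {
		log.Fatal(err)
	}
	if _, err := readPage(f, 0); err != nil {
		log.Fatal(err)
	}
}
```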

5 days ago 4 votes
Phil Eaton on Technical Blogging

This is an external post of mine. Click here if you are not redirected.

a week ago 8 votes
Minimal downtime Postgres major version upgrades with EDB Postgres Distributed

This is an external post of mine. Click here if you are not redirected.

a month ago 13 votes
From web developer to database developer in 10 years

Last month I completed my first year at EnterpriseDB. I'm on the team that built and maintains pglogical and who, over the years, contributed a good chunk of the logical replication functionality that exists in community Postgres. Most of my work, our work, is in C and Rust with tests in Perl and Python. Our focus these days is a descendant of pglogical called Postgres Distributed which supports replicating DDL, tunable consistency across the cluster, etc. This post is about how I got here.

Black boxes

I was a web developer from 2014-2021†. I wrote JavaScript and HTML and CSS and whatever server-side language: Python or Go or PHP. I was a hands-on engineering manager from 2017-2021. I was pretty clueless about databases and indeed database knowledge was not a serious part of any interview I did. Throughout that time (2014-2021) I wanted to move my career forward as quickly as possible so I spent much of my free time doing educational projects and writing about them on this blog (or previous incarnations of it). I learned how to write primitive HTTP servers, how to write little parsers and interpreters and compilers. It was a virtuous cycle because the internet (Hacker News anyway) liked reading these posts and I wanted to learn how the black boxes worked.

But I shied away from data structures and algorithms (DSA) because they seemed complicated and useless to the work that I did. That is, until 2020 when an inbox page I built started loading more and more slowly as the inbox grew. My coworker pointed me at Use The Index, Luke and the DSA scales fell from my eyes. I wanted to understand this new black box so I built a little in-memory SQL database with support for indexes.

I'm a college dropout, so even while I was interested in compilers and interpreters earlier in my career I never dreamed I could get a job working on them. Only geniuses and PhDs did that work and I was neither. The idea of working on a database felt the same. However, I could work on little database side projects like I had done before on other topics, so I did. Or a series of explorations of Raft implementations, others' and my own.

Startups

From 2021-2023 I tried to start a company, and when that didn't pan out I joined TigerBeetle as a cofounder to work on marketing and community. It was during this time I started the Software Internals Discord and /r/databasedevelopment, which have since kind of exploded in popularity among professionals and academics in database and distributed systems. TigerBeetle was my first job at a database company, and while I contributed bits of code I was not a developer there. It was a way into the space. And indeed it was an incredible learning experience, both on the cofounder side and on the database side. I wrote articles with King and Joran that helped teach and affirm for myself the basics of databases and consensus-based distributed systems.

Holding out

When I left TigerBeetle in 2023 I was still not sure if I could get a job as an actual database developer. My network had exploded since 2021 (when I started my own company that didn't pan out) so I had no trouble getting referrals at database companies. But my background kept leading hiring managers to suggest putting me on cloud teams doing orchestration in Go around a database rather than working on the database itself. I was unhappy with this type-casting, so I held out while unemployed and continued to write posts and host virtual hackweeks messing with Postgres and MySQL. I started the first incarnation of the Software Internals Book Club during this time, reading Designing Data-Intensive Applications with 5-10 other developers in Bryant Park. During this time I also started the NYC Systems Coffee Club.

Postgres

After about four months of searching I ended up with three good offers, all to do C and Rust development on Postgres (extensions) as an individual contributor. Working on extensions might sound like the definition of not-sexy, but Postgres APIs are so loosely abstracted it's really as if you're working on Postgres itself. You can mess with almost anything in Postgres, so you have to be very aware of what you're doing. And when you can't mess with something in Postgres because an API doesn't yet exist, companies have the tendency to just fork Postgres so they can. (This tendency isn't specific to Postgres; almost every open-source database company seems to have a long-running internal fork or two of the database.)

EnterpriseDB

Two of the three offers were from early-stage startups, and after more than 3 years being part of the earliest stages of startups I was happy for a break. But the third offer was from one of the biggest contributors to Postgres, a 20-year-old company called EnterpriseDB. (You can probably come up with different rankings of companies using different metrics, so I'm only saying EnterpriseDB is one of the biggest contributors.) It seemed like the best place to be to learn a lot and contribute something meaningful.

My coworkers are a mix of Postgres veterans (people who contributed the WAL to Postgres, who contributed MVCC to Postgres, who contributed logical decoding and logical replication, who contributed parallel queries; the list goes on and on) and developers who started at EnterpriseDB on technical support, or who were previously Postgres administrators. It's quite a mix. Relatively few geniuses or PhDs, despite what I used to think, but they certainly work hard and have hard-earned experience.

Anyway, I've now been working at EnterpriseDB for over a year so I wanted to share this retrospective. I also wanted to cover what it's like coming from engineering management and founding companies to going back to being an individual contributor. (Spoiler: incredibly enjoyable.) But it has been hard enough to make myself write this much so I'm calling it a day. :)

I wrote a post about the winding path I took from web developer to database developer over 10 years. pic.twitter.com/tf8bUDRzjV — Phil Eaton (@eatonphil) February 15, 2025

† From 2011-2014 I also did contract web development, but this was part-time while I was in school.

a month ago 20 votes
Edit for clarity

I have the fortune to review a few important blog posts every year, and the biggest value I add is to call out sentences or sections that make no sense. It is quite simple and you can do it too. Without clarity, only those at your company in marketing and sales (whose job it is to work with what they get) will give you the courtesy of a cursory read and a like on LinkedIn. This is all that most corporate writing achieves. It is the norm and it is understandable. But if you want to reach an audience beyond those folks, you have to make sure you're not writing nonsense. And you, as reviewer and editor, have the chance to call out nonsense if you can get yourself to recognize it.

Immune to nonsense

But especially when editing blog posts at work, it is easy to gloss over things that make no sense because we are so constantly bombarded by things that make no sense. Maybe it's buzzwords or cliches, or simply lack of rapport. We become immune to nonsense. And even worse, without care, as we become more experienced, we become more fearful to say "I have no idea what you are talking about". We're afraid to look incompetent by admitting our confusion. This fear is understandable, but is itself stupid. And I will trust you to deal with this on your own.

Read it out loud

So as you review a post, read it out loud to yourself. And if you find yourself saying "what on earth are you talking about", add that as a comment, as gently as you feel you should. It is not offensive to say this (depending on how you say it). It is surely the case that the author did not know they were making no sense. It is worse to not mention your confusion and allow the author to look like an idiot or a bore.

Once you can call out what does not make sense to you, then read the post again and consider what would not make sense to someone without the context you have. Someone outside your company. Of course you need to make assumptions about the audience to a degree. It is likely your customers or prospects you have in mind. Not your friends or family. With the audience you have in mind, would what you're reading make any sense? Has the author given sufficient background or introduced relevant concepts before bringing up something new?

Again, this is a second step though. The first step is to make sure that the post makes sense to you. In almost every draft I read, at my company or not, there is something that does not make sense to me. Do two paragraphs need to be reordered because the first one accidentally depended on information mentioned in the second? Are you making ambiguous use of pronouns? And so on.

In closing

Clarity on its own will put you in the 99th percentile of writing. Beyond that it definitely still matters if you are compelling and original and whatnot. But too often it seems we focus on being exciting rather than being clear. But it doesn't matter if you've got something exciting if it makes no sense to your reader. This sounds like mundane guidance, but I have reviewed many posts that were reviewed by other people and no one else called out nonsense. I feel compelled to mention how important it is.

Wrote a new post on the most important, and perhaps least done, thing you can do while reviewing a blog post: edit for clarity. pic.twitter.com/ODblOUzB3g — Phil Eaton (@eatonphil) January 29, 2025

2 months ago 23 votes

More in technology

Electricity and the speed of light

If it's all just electromagnetic waves, why is electricity in a conductor moving slower than visible light?

20 hours ago 3 votes
The April Fools joke that might have got me fired

Everyone should pull one great practical joke in their lifetimes. This one was mine, and I think it's past the statute of limitations. The story is true. Only the names are redacted to protect the guilty.

My first job out of college was as a database programmer, even though my undergraduate degree had nothing to do with computers and my current profession still mostly doesn't. The reason was that the University I worked for couldn't afford competitive wages, but they did offer various fringe benefits, and they were willing to train someone who at least had decent working knowledge. I, as a newly minted graduate of the august University of California system, had decent working knowledge at least of BSD/386 and SunOS, but more importantly also had the glowing recommendation of my predecessor who was being promoted into a new position. I was hired, which was their first mistake.

The system I was hired to work on was an HP 9000 K250, one of Hewlett-Packard's big PA-RISC servers. I wish I had a photograph of it, but all I have are a couple bad scans of some bad Polaroids of my office and none of the server room. The server room was downstairs from my office back in the days when server rooms were on-premises, complete with a swipe card lock and a halon system that would give you a few seconds of grace before it flooded everything. The K250 hulked in there where it had recently replaced what I think was an Encore mini of some sort (probably a Multimax, since it was a few years old and the 88K Encores would have been too new for the University), along with the AIX RS/6000s that provided student and faculty shell accounts and E-mail, the bonded T1 lines, some of the terminal servers, the massive Cabletron routers and a lot of the telco stuff. One of the tape reels from the Encore hangs on my wall today as a memento.

The K250 and the Encore it replaced (as well as the L-Class that later replaced the K250 when I was a consultant) ran an all-singing, all-dancing student information system called CARS. CARS is still around, renamed Jenzabar, though I suspect that many of its underpinnings remain if you look under the table. In those days CARS was a massive overlay that was loaded atop the operating system and database, which when I started were, respectively, HP/UX 10.20 and Informix. (I'm old.) It used Informix tables, screens and stored procedures plus its own text UI libraries to run code written variously as Perform screens, SQL, C-shell scripts and plain old C or ESQL/C. Everything was tracked in RCS using overgrown Makefiles. I had the admin side (resource management, financials, attendance trackers, etc.) and my office partner had the academic side (mostly grades and faculty tracking). My job was to write and maintain this code and shortly after to help the University create custom applications in CARS' brand-spanking new web module, which chose the new hotness in scripting languages, i.e., Perl. Fortuitously I had learned Perl in, appropriately enough, a computational linguistics course.

CARS also managed most of the printers on campus except for the few that the RS/6000s controlled directly. Most of the campus admin printers were HP LaserJet 4 units of some derivation equipped with JetDirect cards for networking. These are great warhorse printers, some of the best laser printers HP ever made. I suspect there were line printers other places, but those printers were largely what existed in the University's offices.

It turns out that the READY message these printers show on their VFD panels is changeable. I don't remember where I read this, probably idly paging through the manual over a lunch break, but initially the only fun things I could think of to do was to have the printer say hi to my boss when she sent jobs to it, stuff like that (whereupon she would tell me to get back to work). Then it dawned on me: because I had access to the printer spools on the K250, and the spool directories were conveniently named the same as their hostnames, I knew where each and every networked LaserJet on campus was. I was young, rash and motivated. This was a hack I just couldn't resist.

It would be even better than what had been my favourite joke at my alma mater, where campus services, notable for posting various service suspension notices, posted one April Fools' Day that gravity itself would be suspended to various buildings. I felt sure this hack would eclipse that too.

The plan on April Fools' Day was to get into work at OMG early o'clock and iterate over every entry in the spool, sending it a sequence that would change the READY message to INSERT 5 CENTS. This would cause every networked LaserJet on campus to appear to ask for a nickel before you printed anything. The script was very simple (this is the actual script, I saved it):

The ^[ was a literal ASCII 27 ESCape character, and netto was a simple netcat-like script I had written in these days before netcat was widely used. That's it. Now, let me be clear: the printer was still ready! The effect was merely cosmetic! It would still print if you sent jobs to it!

Nevertheless, to complete the effect, this message was sent out on the campus-wide administration mailing list (which I also saved):

At the end of the day I would reset everything back to READY, smile smugly, and continue with my menial existence. That was the plan.

Having sent this out, I fielded a few anxious calls from people who laughed uproariously when they realized, and I reset their printers manually afterwards. The people who knew me, knew I was a practical joker, took note of the date, and sent approving replies. One of the best was sent to me later in the day by intercampus mail, printed on their laser printer, with a nickel taped to it.

Unfortunately, not everybody on campus knew me, and those who did not not only did not call me, but instead called university administration directly. By 8:30am it was chaos in the main office and this filtered up to the head of HR, who most definitely did know me, and told me I'd better send a retraction before the CFO got in or I was in big trouble. That went wrong also, because my retraction said that campus administration was not considering charging per-page fees when in fact they actually were, so I had to retract it and send a new retraction that didn't call attention to that fact. I also ran the script to reset everything early. Eventually the hubbub finally settled down around noon. Everybody in the office thought it was very funny. Even my boss, who officially disapproved, thought it was somewhat funny.

The other thing that went wrong, as if all that weren't enough, was that the director of IT — which is to say, my boss's boss — was away on vacation when all this took place. (Read E-mail remotely? Who does that?) I compounded this situation with the tactical error of going skiing over the coming weekend and part of the next week, most of which I spent snowplowing down the bunny slopes face first, so that he discovered all the angry E-mail in his box without me around to explain myself. (My office partner remembers him coming in wide-eyed asking, "what did he do??") When I returned, it was icier in the office than it had been on the mountain. The assistant director, who thought it was funny, was in trouble for not putting a lid on it, and I was in really big trouble for doing it in the first place. I was appropriately contrite and made various apologies and was an uncharacteristically model employee for an unnaturally long period of time. The Ice Age eventually thawed and the incident was officially dropped except for a "poor judgment" on my next performance review and the satisfaction of what was then considered the best practical joke ever pulled on campus. Indeed, everyone agreed it was much more technically accomplished than the previous award winner, where someone had supposedly gotten it around the grounds that the security guards at the entrance would be charging a nominal admission fee per head. Years later they still said it was legendary. I like to think they still do.
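The actual script mentioned above isn't reproduced in this excerpt. Purely as a rough illustration of the mechanism described (walk the spool directories, which were named after the printers, and push a new ready message to each JetDirect card), here is a hedged sketch in Go. The spool path is invented, the original was a tiny shell script using a homemade netcat-like tool rather than Go, and the PJL RDYMSG command over port 9100 is an assumption about how the escape sequence was delivered.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"time"
)

// setReadyMessage sends a PJL RDYMSG command to a printer's raw JetDirect
// port. It only changes the front-panel text; the printer keeps printing.
func setReadyMessage(host, msg string) error {
	conn, err := net.DialTimeout("tcp", host+":9100", 3*time.Second)
	if err != nil {
		return err
	}
	defer conn.Close()
	const uel = "\x1b%-12345X" // Universal Exit Language prefix (the literal ESC from the story)
	_, err = fmt.Fprintf(conn, "%s@PJL RDYMSG DISPLAY = %q\r\n%s", uel, msg, uel)
	return err
}

func main() {
	// Hypothetical spool path; the story only says each directory was named
	// after its printer's hostname.
	entries, err := os.ReadDir("/var/spool/lp/printers")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		if err := setReadyMessage(e.Name(), "INSERT 5 CENTS"); err != nil {
			fmt.Fprintf(os.Stderr, "%s: %v\n", e.Name(), err)
		}
	}
}
```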

10 hours ago 3 votes
XSS To RCE By Abusing Custom File Handlers - Kentico Xperience CMS (CVE-2025-2748)

We know what you’re waiting for - this isn’t it. Today, we’re back with more tales of our adventures in Kentico’s Xperience CMS. Due to its wide usage, the type of solution, and the types of enterprises using this solution

7 hours ago 2 votes
You have got to be kidding me

Mia Sato, writing for The Verge: "Elon Musk's $1 Million Handout Winners Are Connected to Republican Causes." On Sunday, a few thousand people in Green Bay, Wisconsin, gathered to hear Elon Musk speak — and give away two giant cardboard checks for $1 million. Attendance at the event

2 hours ago 1 votes
Forgot your safe combination? This Arduino-controlled autodialer can crack it for you

Safes are designed specifically to be impenetrable — that’s kind of the whole point. That’s great when you need to protect something, but it is a real problem when you forget the combination to your safe or when a safe’s combination becomes lost to history. In such situations, Charles McNall’s safe-cracking autodialer device can help. […]

5 hours ago 1 votes