Full Width [alt+shift+f] Shortcuts [alt+shift+k]
Sign Up [alt+shift+s] Log In [alt+shift+l]
32
I’ve posted another command-line tool on GitHub: randline, which gives you a random selection of lines in a file: $ randline < /usr/share/dict/words ultraluxurious $ randline 3 < /usr/share/dict/words unexceptionably baselessness salinity There are lots of tools that solve this problem; I wrote my own as a way to get some more Rust practice and try a new-to-me technique called reservoir sampling. Existing approaches There’s a shuf command in coreutils which is designed to do this exact thing: $ shuf -n 3 /usr/share/dict/words brimstone melody's reimbursed But I don’t have coreutils on my Mac, so I can’t use shuf. You can do this in lots of other ways using tools like awk, sort and perl. If you’re interested, check out these Stack Overflow and Unix & Linux Stack Exchange threads for examples. For my needs, I wrote a tiny Python script called randline which I saved in my PATH years ago, and I haven’t thought much about since: import random import sys if __name__ == "__main__": ...
2 weeks ago

More from alexwlchan

Adding auto-generated cover images to EPUBs downloaded from AO3

I was chatting with a friend recently, and she mentioned an annoyance when reading fanfiction on her iPad. She downloads fic from AO3 as EPUB files, and reads it in the Kindle app – but the files don’t have a cover image, and so the preview thumbnails aren’t very readable: She’s downloaded several hundred stories, and these thumbnails make it difficult to find things in the app’s “collections” view. This felt like a solvable problem. There are tools to add cover images to EPUB files, if you already have the image. The EPUB file embeds some key metadata, like the title and author. What if you had a tool that could extract that metadata, auto-generate an image, and use it as the cover? So I built that. It’s a small site where you upload EPUB files you’ve downloaded from AO3, the site generates a cover image based on the metadata, and it gives you an updated EPUB to download. The new covers show the title and author in large text on a coloured background, so they’re much easier to browse in the Kindle app: If you’d find this helpful, you can use it at alexwlchan.net/my-tools/add-cover-to-ao3-epubs/ Otherwise, I’m going to explain how it works, and what I learnt from building it. There are three steps to this process: Open the existing EPUB to get the title and author Generate an image based on that metadata Modify the EPUB to insert the new cover image Let’s go through them in turn. Open the existing EPUB I’ve not worked with EPUB before, and I don’t know much about it. My first instinct was to look for Python EPUB libraries on PyPI, but there was nothing appealing. The results were either very specific tools (convert EPUB to/from format X) or very unmaintained (the top result was last updated in April 2014). I decied to try writing my own code to manipulate EPUBs, rather than using somebody else’s library. I had a vague memory that EPUB files are zips, so I changed the extension from .epub to .zip and tried unzipping one – and it turns out that yes, it is a zip file, and the internal structure is fairly simple. I found a file called content.opf which contains metadata as XML, including the title and author I’m looking for: <?xml version='1.0' encoding='utf-8'?> <package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="uuid_id"> <metadata xmlns:opf="http://www.idpf.org/2007/opf" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:calibre="http://calibre.kovidgoyal.net/2009/metadata"> <dc:title>Operation Cameo</dc:title> <meta name="calibre:timestamp" content="2025-01-25T18:01:43.253715+00:00"/> <dc:language>en</dc:language> <dc:creator opf:file-as="alexwlchan" opf:role="aut">alexwlchan</dc:creator> <dc:identifier id="uuid_id" opf:scheme="uuid">13385d97-35a1-4e72-830b-9757916d38a7</dc:identifier> <meta name="calibre:title_sort" content="operation cameo"/> <dc:description><p>Some unusual orders arrive at Operation Mincemeat HQ.</p></dc:description> <dc:publisher>Archive of Our Own</dc:publisher> <dc:subject>Fanworks</dc:subject> <dc:subject>General Audiences</dc:subject> <dc:subject>Operation Mincemeat: A New Musical - SpitLip</dc:subject> <dc:subject>No Archive Warnings Apply</dc:subject> <dc:date>2023-12-14T00:00:00+00:00</dc:date> </metadata> … That dc: prefix was instantly familiar from my time working at Wellcome Collection – this is Dublin Core, a standard set of metadata fields used to describe books and other objects. I’m unsurprised to see it in an EPUB; this is exactly how I’d expect it to be used. I found an article that explains the structure of an EPUB file, which told me that I can find the content.opf file by looking at the root-path element inside the mandatory META-INF/container.xml file which is every EPUB. I wrote some code to find the content.opf file, then a few XPath expressions to extract the key fields, and I had the metadata I needed. Generate a cover image I sketched a simple cover design which shows the title and author. I wrote the first version of the drawing code in Pillow, because that’s what I’m familiar with. It was fine, but the code was quite flimsy – it didn’t wrap properly for long titles, and I couldn’t get custom fonts to work. Later I rewrote the app in JavaScript, so I had access to the HTML canvas element. This is another tool that I haven’t worked with before, so a fun chance to learn something new. The API felt fairly familiar, similar to other APIs I’ve used to build HTML elements. This time I did implement some line wrapping – there’s a measureText() API for canvas, so you can see how much space text will take up before you draw it. I break the text into words, and keeping adding words to a line until measureText tells me the line is going to overflow the page. I have lots of ideas for how I could improve the line wrapping, but it’s good enough for now. I was also able to get fonts working, so I picked Georgia to match the font used for titles on AO3. Here are some examples: I had several ideas for choosing the background colour. I’m trying to help my friend browse her collection of fic, and colour would be a useful way to distinguish things – so how do I use it? I realised I could get the fandom from the EPUB file, so I decided to use that. I use the fandom name as a seed to a random number generator, then I pick a random colour. This means that all the fics in the same fandom will get the same colour – for example, all the Star Wars stories are a shade of red, while Star Trek are a bluey-green. This was a bit harder than I expected, because it turns out that JavaScript doesn’t have a built-in seeded random number generator – I ended up using some snippets from a Stack Overflow answer, where bryc has written several pseudorandom number generators in plain JavaScript. I didn’t realise until later, but I designed something similar to the placeholder book covers in the Apple Books app. I don’t use Apple Books that often so it wasn’t a deliberate choice to mimic this style, but clearly it was somewhere in my subconscious. One difference is that Apple’s app seems to be picking from a small selection of background colours, whereas my code can pick a much nicer variety of colours. Apple’s choices will have been pre-approved by a designer to look good, but I think mine is more fun. Add the cover image to the EPUB My first attempt to add a cover image used pandoc: pandoc input.epub --output output.epub --epub-cover-image cover.jpeg This approach was no good: although it added the cover image, it destroyed the formatting in the rest of the EPUB. This made it easier to find the fic, but harder to read once you’d found it. An EPUB file I downloaded from AO3, before/after it was processed by pandoc. So I tried to do it myself, and it turned out to be quite easy! I unzipped another EPUB which already had a cover image. I found the cover image in OPS/images/cover.jpg, and then I looked for references to it in content.opf. I found two elements that referred to cover images: <?xml version="1.0" encoding="UTF-8"?> <package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="PrimaryID"> <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf"> <meta name="cover" content="cover-image"/> … </metadata> <manifest> <item id="cover-image" href="images/cover.jpg" media-type="image/jpeg" properties="cover-image"/> … </manifest> </package> This gave me the steps for adding a cover image to an EPUB file: add the image file to the zipped bundle, then add these two elements to the content.opf. Where am I going to deploy this? I wrote the initial prototype of this in Python, because that’s the language I’m most familiar with. Python has all the libraries I need: The zipfile module can unpack and modify the EPUB/ZIP The xml.etree or lxml modules can manipulate XML The Pillow library can generate images I built a small Flask web app: you upload the EPUB to my server, my server does some processing, and sends the EPUB back to you. But for such a simple app, do I need a server? I tried rebuilding it as a static web page, doing all the processing in client-side JavaScript. That’s simpler for me to host, and it doesn’t involve a round-trip to my server. That has lots of other benefits – it’s faster, less of a privacy risk, and doesn’t require a persistent connection. I love static websites, so can they do this? Yes! I just had to find a different set of libraries: The JSZip library can unpack and modify the EPUB/ZIP, and is the only third-party code I’m using in the tool Browsers include DOMParser for manipulating XML I’ve already mentioned the HTML <canvas> element for rendering the image This took a bit longer because I’m not as familiar with JavaScript, but I got it working. As a bonus, this makes the tool very portable. Everything is bundled into a single HTML file, so if you download that file, you have the whole tool. If my friend finds this tool useful, she can save the file and keep a local copy of it – she doesn’t have to rely on my website to keep using it. What should it look like? My first design was very “engineer brain” – I just put the basic controls on the page. It was fine, but it wasn’t good. That might be okay, because the only person I need to be able to use this app is my friend – but wouldn’t it be nice if other people were able to use it? If they’re going to do that, they need to know what it is – most people aren’t going to read a 2,500 word blog post to understand a tool they’ve never heard of. (Although if you have read this far, I appreciate you!) I started designing a proper page, including some explanations and descriptions of what the tool is doing. I got something that felt pretty good, including FAQs and acknowledgements, and I added a grey area for the part where you actually upload and download your EPUBs, to draw the user’s eye and make it clear this is the important stuff. But even with that design, something was missing. I realised I was telling you I’d create covers, but not showing you what they’d look like. Aha! I sat down and made up a bunch of amusing titles for fanfic and fanfic authors, so now you see a sample of the covers before you upload your first EPUB: This makes it clearer what the app will do, and was a fun way to wrap up the project. What did I learn from this project? Don’t be scared of new file formats My first instinct was to look for a third-party library that could handle the “complexity” of EPUB files. In hindsight, I’m glad I didn’t find one – it forced me to learn more about how EPUBs work, and I realised I could write my own code using built-in libraries. EPUB files are essentially ZIP files, and I only had basic needs. I was able to write my own code. Because I didn’t rely on a library, now I know more about EPUBs, I have code that’s simpler and easier for me to understand, and I don’t have a dependency that may cause problems later. There are definitely some file formats where I need existing libraries (I’m not going to write my own JPEG parser, for example) – but I should be more open to writing my own code, and not jumping to add a dependency. Static websites can handle complex file manipulations I love static websites and I’ve used them for a lot of tasks, but mostly read-only display of information – not anything more complex or interactive. But modern JavaScript is very capable, and you can do a lot of things with it. Static pages aren’t just for static data. One of the first things I made that got popular was find untagged Tumblr posts, which was built as a static website because that’s all I knew how to build at the time. Somewhere in the intervening years, I forgot just how powerful static sites can be. I want to build more tools this way. Async JavaScript calls require careful handling The JSZip library I’m using has a lot of async functions, and this is my first time using async JavaScript. I got caught out several times, because I forgot to wait for async calls to finish properly. For example, I’m using canvas.toBlob to render the image, which is an async function. I wasn’t waiting for it to finish, and so the zip would be repackaged before the cover image was ready to add, and I got an EPUB with a missing image. Oops. I think I’ll always prefer the simplicity of synchronous code, but I’m sure I’ll get better at async JavaScript with practice. Final thoughts I know my friend will find this helpful, and that feels great. Writing software that’s designed for one person is my favourite software to write. It’s not hyper-scale, it won’t launch the next big startup, and it’s usually not breaking new technical ground – but it is useful. I can see how I’m making somebody’s life better, and isn’t that what computers are for? If other people like it, that’s a nice bonus, but I’m really thinking about that one person. Normally the one person I’m writing software for is me, so it’s extra nice when I can do it for somebody else. If you want to try this tool yourself, go to alexwlchan.net/my-tools/add-cover-to-ao3-epubs/ If you want to read the code, it’s all on GitHub. [If the formatting of this post looks odd in your feed reader, visit the original article]

7 hours ago 4 votes
Looking at images in a spreadsheet

I’ve had a couple of projects recently where I needed to work with a list that involved images. For example, choosing a series of photos to print, or making an inventory of Lego parts. I could write a simple text list, but it’s really helpful to be able to see the images as part of the list, especially when I’m working with other people. The best tool I’ve found is Google Sheets – not something I usually associate with pictures! I’m using Google Sheets, and I use the IMAGE function, which inserts an image into a cell. For example: =IMAGE("https://www.google.com/images/srpr/logo3w.png") There’s a similar function in Microsoft Excel, but not in Apple Numbers. This function can reference values in other cells, so I’ll often prepare my spreadsheet in another tool – say, a Python script – and include an image URL in one of the columns. When I import the spreadsheet into Google Sheets, I use IMAGE() to reference that column, and then I see inline images. After that, I tend to hide the column with the image URL, and resize the rows/columns containing images to make them bigger and easier to look at. I often pair this with the HYPERLINK function, which can add a clickable link to a cell. This is useful to link to the source of the image, or to more detail I can’t fit in the spredsheet. I don’t know how far this approach can scale – I’ve never tried more than a thousand or so images in a single spreadsheet – but it’s pretty cool that it works at all! Using a spreadsheet gives me a simple, lightweight interface that most people are already familiar with. It doesn’t take much work on my part, and I get useful features like sorting and filtering for “free”. Previously I’d only thought of spreadsheets as a tool for textual data, and being able to include images has made them even more powerful. [If the formatting of this post looks odd in your feed reader, visit the original article]

a week ago 23 votes
How I test Rust command-line apps with `assert_cmd`

Rust has become my go-to language for my personal toolbox – small, standalone utilities like create_thumbnail, emptydir, and dominant_colours. There’s no place for Rust in my day job, so having some self-contained hobby projects means I can still have fun playing with it. I’ve been using the assert_cmd crate to test my command line tools, but I wanted to review my testing approach before I write my next utility. My old code was fine and it worked, but that’s about all you could say about it – it wasn’t clean or idiomatic Rust, and it wasn’t especially readable. My big mistake was trying to write Rust like Python. I’d written wrapper functions that would call assert_cmd and return values, then I wrote my own assertions a bit like I’d write a Python test. I missed out on the nice assertion helpers in the crate. I’d skimmed just enough of the assert_cmd documentation to get something working, but I hadn’t read it properly. As I was writing this blog post, I went back and read the documentation in more detail, to understand the right way to use the crate. Here are some examples of how I’m using it in my refreshed test suites: Testing a basic command This test calls dominant_colours with a single argument, then checks it succeeds and that a single line is printed to stdout: use assert_cmd::Command; /// If every pixel in an image is the same colour, then the image /// has a single dominant colour. #[test] fn it_prints_the_colour() { Command::cargo_bin("dominant_colours") .unwrap() .arg("./src/tests/red.png") .assert() .success() .stdout("#fe0000\n") .stderr(""); } If I have more than one argument or flag, I can replace .arg with .args to pass a list: use assert_cmd::Command; /// It picks the best colour from an image to go with a background -- /// the colour with sufficient contrast and the most saturation. #[test] fn it_chooses_the_right_colour_for_a_light_background() { Command::cargo_bin("dominant_colours") .unwrap() .args(&[ "src/tests/stripes.png", "--max-colours=5", "--best-against-bg=#fff", ]) .assert() .success() .stdout("#693900\n") .stderr(""); } Alternatively, I can omit .arg and .args if I don’t need to pass any arguments. Testing error cases Most of my tests are around error handling – call the tool with bad input, and check it returns a useful error message. I can check that the command failed, the exit code, and the error message printed to stderr: use assert_cmd::Command; /// Getting the dominant colour of a file that doesn't exist is an error. #[test] fn it_fails_if_you_pass_an_nonexistent_file() { Command::cargo_bin("dominant_colours") .unwrap() .arg("doesnotexist.jpg") .assert() .failure() .code(1) .stdout("") .stderr("No such file or directory (os error 2)\n"); } Comparing output to a regular expression All the examples so far are doing an exact match for the stdout/stderr, but sometimes I need something more flexible. Maybe I only know what part of the output will look like, or I only care about checking how it starts. If so, I can use the predicate::str::is_match predicate from the predicates crate and define a regular expression I want to match against. Here’s an example where I’m checking the output contains a version number, but not what the version number is: use assert_cmd::Command; use predicates::prelude::*; /// If I run `dominant_colours --version`, it prints the version number. #[test] fn it_prints_the_version() { // Match strings like `dominant_colours 1.2.3` let is_version_string = predicate::str::is_match(r"^dominant_colours [0-9]+\.[0-9]+\.[0-9]+\n$").unwrap(); Command::cargo_bin("dominant_colours") .unwrap() .arg("--version") .assert() .success() .stdout(is_version_string) .stderr(""); } Creating focused helper functions I have a couple of helper functions for specific test scenarios. I try to group these by common purpose – they should be testing similar behaviour. I’m trying to avoid creating helpers for the sake of reducing repetitive code. For example, I have a helper function that passes a single invalid file to dominant_colours and checks the error message is what I expect: use assert_cmd::Command; use predicates::prelude::*; /// Getting the dominant colour of a file that doesn't exist is an error. #[test] fn it_fails_if_you_pass_an_nonexistent_file() { assert_file_fails_with_error( "./doesnotexist.jpg", "No such file or directory (os error 2)\n", ); } /// Try to get the dominant colours for a file, and check it fails /// with the given error message. fn assert_file_fails_with_error( path: &str, expected_stderr: &str, ) -> assert_cmd::assert::Assert { Command::cargo_bin("dominant_colours") .unwrap() .arg(path) .assert() .failure() .code(1) .stdout("") .stderr(predicate::eq(expected_stderr)) } Initially I wrote this helper just calling .stderr(expected_stderr) to do an exact match, like in previous tests, but I got an error “expected_stderr escapes the function body here”. I’m not sure what that means – it’s something to do with borrowing – but wrapping it in a predicate seems to fix the error, so I’m happy. My test suite is a safety net, not a playground Writing this blog post has helped me refactor my tests into something that’s actually good. I’m sure there’s still room for improvement, but this is the first iteration that I feel happy with. It’s no coincidence that it looks very similar to other test suites using assert_cmd. My earlier approaches were far too clever. I was over-abstracting to hide a few lines of boilerplate, which made the tests harder to follow. I even wrote a macro with a variadic interface because of a minor annoyance, which is stretching the limits of my Rust knowledge. It was fun to write, but it would have been a pain to debug or edit later. It’s okay to have a bit of repetition in a test suite, if it makes them easier to read. I keep having to remind myself of this – I’m often tempted to create helper functions whose sole purpose is to remove boilerplate, or create some clever parametrisation which only made sense as I’m writing it. I need to resist the urge to compress my test code. My new tests are more simple and more readable. There’s a time and a place for clever code, but my test suite isn’t it. [If the formatting of this post looks odd in your feed reader, visit the original article]

3 weeks ago 34 votes
How I use the notes field in my password manager

I use 1Password to store the passwords for my online accounts, and I’ve been reviewing it as a new year cleanup task. I’ve been deleting unused accounts, changing old passwords which were weak, and making sure I’ve enabled multi-factor authentication for key accounts. Each 1Password item has a notes field, and I use it to record extra information about each account. I’ve never seen anybody else talk about these notes, or how they use them, but I find it invaluable, so I thought it’d be worth explaining what I do. (This is different from the Secure Notes feature – I’m talking about the notes attached to other 1Password items, not standalone notes.) Lots of password managers have a notes field, so you can keep notes even if you don’t use 1Password. I use the notes field as a mini-changelog, where I write dated entries to track the history of each account. Here’s some of the stuff I write down: Why did I create this account? If the purpose of an account isn’t obvious, I write a note that explains why I created it. This happens more often than you might think. For example, there are lots of ticketing websites that don’t allow a guest checkout – you have to make an account. If I’m only booking a single event, I’ll save the account in 1Password, and without a note it would be easy to forget why the account exists. Why did I make significant changes? I write down the date and details of anything important I change in an account, like: Updating the email address Changing the password Adding or removing authentication methods like passkeys Enabling multi-factor authentication If it’s useful, I include an explanation of why I made a change, not just what it was. For example, when I change a password: was it because the old password was weak, because the site forced a reset, or because I thought a password might be compromised? Somebody recently tried to hack into my broadband account, so I reset the password as a precaution. I wrote a note about it, so if I see signs of another hacking attempt, I’ll remember what happened the first time. Why is it set up in an unusual way? There are a small number of accounts that I set up in a different way to the rest of my accounts. I write down the reason, so my future self knows why and doesn’t try to “fix” the account later. For example, most of my accounts are linked to my @alexwlchan.net email address – but a small number of them are tied to other email addresses. When I do this, I wrote a note explaining why I deliberately linked that account to another email. The most common reason I do this is because the account is particularly important, and if I lost access to my @alexwlchan.net email, I wouldn’t want to lose access to that account at the same time. What are the password rules? I write down any frustrating password rules I discover. This is particularly valuable if those rules aren’t explicitly documented, and you can only discover them by trial and error. I include a date with each of these rules, in case they change later. These notes reduce confusion and annoyance if I ever have to change the password. It also means that when I’m reviewing my passwords later, I know that there’s a reason I picked a fairly weak password – I’d done the best I could given the site’s requirements. Here’s a real example: “the password reset UI won’t tell you this, but passwords longer than 16 characters are silently truncated, and must be alphanumeric only”. Do I have multi-factor authentication? I enable multi-factor authentication (MFA) for important accounts, but I don’t put the MFA codes in 1Password. Keeping the password and MFA code in the same app is collapsing multiple authentication factors back into one. If I do enable MFA, I write a note in 1Password that says when I enabled it, where to find my MFA codes, and where to find my account recovery codes. For example, when I used hardware security keys at work, I wrote notes about where the keys were stored (“in the fire safe, ask Jane Smith in IT to unlock”) and how to identify different keys (“pink ribbon = workflow account”). These details weren’t sensitive security information, but they were easy to forget. Sometimes I choose not to enable MFA even though it’s available, and I write a note about that as well. For example, Wikipedia supports MFA but it’s described as “experimental and optional”, so I’ve decided not to enable it on my account yet. Why doesn’t this account exist any more? When I deactivate an account, I don’t delete it from 1Password. Instead, I write a final note explaining why and how I deactivated it, and then I move it to the Archive. I prefer this to deleting the entry – it means I still have some record that the account existed, and I can see how long it existed for. I’ve only had to retrieve something from the archive a handful of times, but I was glad I could do so, and I don’t see any downside to having a large archive. I also use the archive for accounts that I can’t delete, but are probably gone. For example, I have old accounts with utility companies that have been acquired or gone bust, and their website no longer exists. My account is probably gone, but I have no way of verifying that. Moving it to the archive gets it out of the way, and I still have the password if it ever comes back. What am I going to forget? I’m not trying to create a comprehensive audit trail of my online accounts – I’m just writing down stuff I think will be helpful later, and which I know I’m likely to forget. It only takes a few seconds to write each note. Writing notes is always a tricky balance. You want to capture the useful information, but you don’t want the note-taking to become a chore, or for the finished notes to be overwhelming. I’ve only been writing notes in my password manager for a few years, so I might not have the right balance yet – but I’m almost certainly better off than I was before, when I wasn’t writing any. I’m really glad I started keeping notes in my password manager, and if you’ve never done it, I’d encourage you to try it. [If the formatting of this post looks odd in your feed reader, visit the original article]

4 weeks ago 48 votes

More in programming

Adding auto-generated cover images to EPUBs downloaded from AO3

I was chatting with a friend recently, and she mentioned an annoyance when reading fanfiction on her iPad. She downloads fic from AO3 as EPUB files, and reads it in the Kindle app – but the files don’t have a cover image, and so the preview thumbnails aren’t very readable: She’s downloaded several hundred stories, and these thumbnails make it difficult to find things in the app’s “collections” view. This felt like a solvable problem. There are tools to add cover images to EPUB files, if you already have the image. The EPUB file embeds some key metadata, like the title and author. What if you had a tool that could extract that metadata, auto-generate an image, and use it as the cover? So I built that. It’s a small site where you upload EPUB files you’ve downloaded from AO3, the site generates a cover image based on the metadata, and it gives you an updated EPUB to download. The new covers show the title and author in large text on a coloured background, so they’re much easier to browse in the Kindle app: If you’d find this helpful, you can use it at alexwlchan.net/my-tools/add-cover-to-ao3-epubs/ Otherwise, I’m going to explain how it works, and what I learnt from building it. There are three steps to this process: Open the existing EPUB to get the title and author Generate an image based on that metadata Modify the EPUB to insert the new cover image Let’s go through them in turn. Open the existing EPUB I’ve not worked with EPUB before, and I don’t know much about it. My first instinct was to look for Python EPUB libraries on PyPI, but there was nothing appealing. The results were either very specific tools (convert EPUB to/from format X) or very unmaintained (the top result was last updated in April 2014). I decied to try writing my own code to manipulate EPUBs, rather than using somebody else’s library. I had a vague memory that EPUB files are zips, so I changed the extension from .epub to .zip and tried unzipping one – and it turns out that yes, it is a zip file, and the internal structure is fairly simple. I found a file called content.opf which contains metadata as XML, including the title and author I’m looking for: <?xml version='1.0' encoding='utf-8'?> <package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="uuid_id"> <metadata xmlns:opf="http://www.idpf.org/2007/opf" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:calibre="http://calibre.kovidgoyal.net/2009/metadata"> <dc:title>Operation Cameo</dc:title> <meta name="calibre:timestamp" content="2025-01-25T18:01:43.253715+00:00"/> <dc:language>en</dc:language> <dc:creator opf:file-as="alexwlchan" opf:role="aut">alexwlchan</dc:creator> <dc:identifier id="uuid_id" opf:scheme="uuid">13385d97-35a1-4e72-830b-9757916d38a7</dc:identifier> <meta name="calibre:title_sort" content="operation cameo"/> <dc:description><p>Some unusual orders arrive at Operation Mincemeat HQ.</p></dc:description> <dc:publisher>Archive of Our Own</dc:publisher> <dc:subject>Fanworks</dc:subject> <dc:subject>General Audiences</dc:subject> <dc:subject>Operation Mincemeat: A New Musical - SpitLip</dc:subject> <dc:subject>No Archive Warnings Apply</dc:subject> <dc:date>2023-12-14T00:00:00+00:00</dc:date> </metadata> … That dc: prefix was instantly familiar from my time working at Wellcome Collection – this is Dublin Core, a standard set of metadata fields used to describe books and other objects. I’m unsurprised to see it in an EPUB; this is exactly how I’d expect it to be used. I found an article that explains the structure of an EPUB file, which told me that I can find the content.opf file by looking at the root-path element inside the mandatory META-INF/container.xml file which is every EPUB. I wrote some code to find the content.opf file, then a few XPath expressions to extract the key fields, and I had the metadata I needed. Generate a cover image I sketched a simple cover design which shows the title and author. I wrote the first version of the drawing code in Pillow, because that’s what I’m familiar with. It was fine, but the code was quite flimsy – it didn’t wrap properly for long titles, and I couldn’t get custom fonts to work. Later I rewrote the app in JavaScript, so I had access to the HTML canvas element. This is another tool that I haven’t worked with before, so a fun chance to learn something new. The API felt fairly familiar, similar to other APIs I’ve used to build HTML elements. This time I did implement some line wrapping – there’s a measureText() API for canvas, so you can see how much space text will take up before you draw it. I break the text into words, and keeping adding words to a line until measureText tells me the line is going to overflow the page. I have lots of ideas for how I could improve the line wrapping, but it’s good enough for now. I was also able to get fonts working, so I picked Georgia to match the font used for titles on AO3. Here are some examples: I had several ideas for choosing the background colour. I’m trying to help my friend browse her collection of fic, and colour would be a useful way to distinguish things – so how do I use it? I realised I could get the fandom from the EPUB file, so I decided to use that. I use the fandom name as a seed to a random number generator, then I pick a random colour. This means that all the fics in the same fandom will get the same colour – for example, all the Star Wars stories are a shade of red, while Star Trek are a bluey-green. This was a bit harder than I expected, because it turns out that JavaScript doesn’t have a built-in seeded random number generator – I ended up using some snippets from a Stack Overflow answer, where bryc has written several pseudorandom number generators in plain JavaScript. I didn’t realise until later, but I designed something similar to the placeholder book covers in the Apple Books app. I don’t use Apple Books that often so it wasn’t a deliberate choice to mimic this style, but clearly it was somewhere in my subconscious. One difference is that Apple’s app seems to be picking from a small selection of background colours, whereas my code can pick a much nicer variety of colours. Apple’s choices will have been pre-approved by a designer to look good, but I think mine is more fun. Add the cover image to the EPUB My first attempt to add a cover image used pandoc: pandoc input.epub --output output.epub --epub-cover-image cover.jpeg This approach was no good: although it added the cover image, it destroyed the formatting in the rest of the EPUB. This made it easier to find the fic, but harder to read once you’d found it. An EPUB file I downloaded from AO3, before/after it was processed by pandoc. So I tried to do it myself, and it turned out to be quite easy! I unzipped another EPUB which already had a cover image. I found the cover image in OPS/images/cover.jpg, and then I looked for references to it in content.opf. I found two elements that referred to cover images: <?xml version="1.0" encoding="UTF-8"?> <package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="PrimaryID"> <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf"> <meta name="cover" content="cover-image"/> … </metadata> <manifest> <item id="cover-image" href="images/cover.jpg" media-type="image/jpeg" properties="cover-image"/> … </manifest> </package> This gave me the steps for adding a cover image to an EPUB file: add the image file to the zipped bundle, then add these two elements to the content.opf. Where am I going to deploy this? I wrote the initial prototype of this in Python, because that’s the language I’m most familiar with. Python has all the libraries I need: The zipfile module can unpack and modify the EPUB/ZIP The xml.etree or lxml modules can manipulate XML The Pillow library can generate images I built a small Flask web app: you upload the EPUB to my server, my server does some processing, and sends the EPUB back to you. But for such a simple app, do I need a server? I tried rebuilding it as a static web page, doing all the processing in client-side JavaScript. That’s simpler for me to host, and it doesn’t involve a round-trip to my server. That has lots of other benefits – it’s faster, less of a privacy risk, and doesn’t require a persistent connection. I love static websites, so can they do this? Yes! I just had to find a different set of libraries: The JSZip library can unpack and modify the EPUB/ZIP, and is the only third-party code I’m using in the tool Browsers include DOMParser for manipulating XML I’ve already mentioned the HTML <canvas> element for rendering the image This took a bit longer because I’m not as familiar with JavaScript, but I got it working. As a bonus, this makes the tool very portable. Everything is bundled into a single HTML file, so if you download that file, you have the whole tool. If my friend finds this tool useful, she can save the file and keep a local copy of it – she doesn’t have to rely on my website to keep using it. What should it look like? My first design was very “engineer brain” – I just put the basic controls on the page. It was fine, but it wasn’t good. That might be okay, because the only person I need to be able to use this app is my friend – but wouldn’t it be nice if other people were able to use it? If they’re going to do that, they need to know what it is – most people aren’t going to read a 2,500 word blog post to understand a tool they’ve never heard of. (Although if you have read this far, I appreciate you!) I started designing a proper page, including some explanations and descriptions of what the tool is doing. I got something that felt pretty good, including FAQs and acknowledgements, and I added a grey area for the part where you actually upload and download your EPUBs, to draw the user’s eye and make it clear this is the important stuff. But even with that design, something was missing. I realised I was telling you I’d create covers, but not showing you what they’d look like. Aha! I sat down and made up a bunch of amusing titles for fanfic and fanfic authors, so now you see a sample of the covers before you upload your first EPUB: This makes it clearer what the app will do, and was a fun way to wrap up the project. What did I learn from this project? Don’t be scared of new file formats My first instinct was to look for a third-party library that could handle the “complexity” of EPUB files. In hindsight, I’m glad I didn’t find one – it forced me to learn more about how EPUBs work, and I realised I could write my own code using built-in libraries. EPUB files are essentially ZIP files, and I only had basic needs. I was able to write my own code. Because I didn’t rely on a library, now I know more about EPUBs, I have code that’s simpler and easier for me to understand, and I don’t have a dependency that may cause problems later. There are definitely some file formats where I need existing libraries (I’m not going to write my own JPEG parser, for example) – but I should be more open to writing my own code, and not jumping to add a dependency. Static websites can handle complex file manipulations I love static websites and I’ve used them for a lot of tasks, but mostly read-only display of information – not anything more complex or interactive. But modern JavaScript is very capable, and you can do a lot of things with it. Static pages aren’t just for static data. One of the first things I made that got popular was find untagged Tumblr posts, which was built as a static website because that’s all I knew how to build at the time. Somewhere in the intervening years, I forgot just how powerful static sites can be. I want to build more tools this way. Async JavaScript calls require careful handling The JSZip library I’m using has a lot of async functions, and this is my first time using async JavaScript. I got caught out several times, because I forgot to wait for async calls to finish properly. For example, I’m using canvas.toBlob to render the image, which is an async function. I wasn’t waiting for it to finish, and so the zip would be repackaged before the cover image was ready to add, and I got an EPUB with a missing image. Oops. I think I’ll always prefer the simplicity of synchronous code, but I’m sure I’ll get better at async JavaScript with practice. Final thoughts I know my friend will find this helpful, and that feels great. Writing software that’s designed for one person is my favourite software to write. It’s not hyper-scale, it won’t launch the next big startup, and it’s usually not breaking new technical ground – but it is useful. I can see how I’m making somebody’s life better, and isn’t that what computers are for? If other people like it, that’s a nice bonus, but I’m really thinking about that one person. Normally the one person I’m writing software for is me, so it’s extra nice when I can do it for somebody else. If you want to try this tool yourself, go to alexwlchan.net/my-tools/add-cover-to-ao3-epubs/ If you want to read the code, it’s all on GitHub. [If the formatting of this post looks odd in your feed reader, visit the original article]

7 hours ago 4 votes
Non-alcoholic apéritifs

I’ve been doing Dry January this year. One thing I missed was something for apéro hour, a beverage to mark the start of the evening. Something complex and maybe bitter, not like a drink you’d have with lunch. I found some good options. Ghia sodas are my favorite. Ghia is an NA apéritif based on grape juice but with enough bitterness (gentian) and sourness (yuzu) to be interesting. You can buy a bottle and mix it with soda yourself but I like the little cans with extra flavoring. The Ginger and the Sumac & Chili are both great. Another thing I like are low-sugar fancy soda pops. Not diet drinks, they still have a little sugar, but typically 50 calories a can. De La Calle Tepache is my favorite. Fermented pineapple is delicious and they have some fun flavors. Culture Pop is also good. A friend gave me the Zero book, a drinks cookbook from the fancy restaurant Alinea. This book is a little aspirational but the recipes are doable, it’s just a lot of labor. Very fancy high end drink mixing, really beautiful flavor ideas. The only thing I made was their gin substitute (mostly junipers extracted in glycerin) and it was too sweet for me. Need to find the right use for it, a martini definitely ain’t it. An easier homemade drink is this Nonalcoholic Dirty Lemon Tonic. It’s basically a lemonade heavily flavored with salted preserved lemons, then mixed with tonic. I love the complexity and freshness of this drink and enjoy it on its own merits. Finally, non-alcoholic beer has gotten a lot better in the last few years thanks to manufacturing innovations. I’ve been enjoying NA Black Butte Porter, Stella Artois 0.0, Heineken 0.0. They basically all taste just like their alcoholic uncles, no compromise. One thing to note about non-alcoholic substitutes is they are not cheap. They’ve become a big high end business. Expect to pay the same for an NA drink as one with alcohol even though they aren’t taxed nearly as much.

2 days ago 5 votes
It burns

The first time we had to evacuate Malibu this season was during the Franklin fire in early December. We went to bed with our bags packed, thinking they'd probably get it under control. But by 2am, the roaring blades of fire choppers shaking the house got us up. As we sped down the canyon towards Pacific Coast Highway (PCH), the fire had reached the ridge across from ours, and flames were blazing large out the car windows. It felt like we had left the evacuation a little too late, but they eventually did get Franklin under control before it reached us. Humans have a strange relationship with risk and disasters. We're so prone to wishful thinking and bad pattern matching. I remember people being shocked when the flames jumped the PCH during the Woolsey fire in 2017. IT HAD NEVER DONE THAT! So several friends of ours had to suddenly escape a nightmare scenario, driving through burning streets, in heavy smoke, with literally their lives on the line. Because the past had failed to predict the future. I feel into that same trap for a moment with the dramatic proclamations of wind and fire weather in the days leading up to January 7. Warning after warning of "extremely dangerous, life-threatening wind" coming from the City of Malibu, and that overly-bureaucratic-but-still-ominous "Particularly Dangerous Situation" designation. Because, really, how much worse could it be? Turns out, a lot. It was a little before noon on the 7th when we first saw the big plumes of smoke rise from the Palisades fire. And immediately the pattern matching ran astray. Oh, it's probably just like Franklin. It's not big yet, they'll get it out. They usually do. Well, they didn't. By the late afternoon, we had once more packed our bags, and by then it was also clear that things actually were different this time. Different worse. Different enough that even Santa Monica didn't feel like it was assured to be safe. So we headed far North, to be sure that we wouldn't have to evacuate again. Turned out to be a good move. Because by now, into the evening, few people in the connected world hadn't started to see the catastrophic images emerging from the Palisades and Eaton fires. Well over 10,000 houses would ultimately burn. Entire neighborhoods leveled. Pictures that could be mistaken for World War II. Utter and complete destruction. By the night of the 7th, the fire reached our canyon, and it tore through the chaparral and brush that'd been building since the last big fire that area saw in 1993. Out of some 150 houses in our immediate vicinity, nearly a hundred burned to the ground. Including the first house we moved to in Malibu back in 2009. But thankfully not ours. That's of course a huge relief. This was and is our Malibu Dream House. The site of that gorgeous home office I'm so fond to share views from. Our home. But a house left standing in a disaster zone is still a disaster. The flames reached all the way up to the base of our construction, incinerated much of our landscaping, and devoured the power poles around it to dysfunction. We have burnt-out buildings every which way the eye looks. The national guard is still stationed at road blocks on the access roads. Utility workers are tearing down the entire power grid to rebuild it from scratch. It's going to be a long time before this is comfortably habitable again. So we left. That in itself feels like defeat. There's an urge to stay put, and to help, in whatever helpless ways you can. But with three school-age children who've already missed over a months worth of learning from power outages, fire threats, actual fires, and now mudslide dangers, it was time to go. None of this came as a surprise, mind you. After Woolsey in 2017, Malibu life always felt like living on borrowed time to us. We knew it, even accepted it. Beautiful enough to be worth the risk, we said.  But even if it wasn't a surprise, it's still a shock. The sheer devastation, especially in the Palisades, went far beyond our normal range of comprehension. Bounded, as it always is, by past experiences. Thus, we find ourselves back in Copenhagen. A safe haven for calamities of all sorts. We lived here for three years during the pandemic, so it just made sense to use it for refuge once more. The kids' old international school accepted them right back in, and past friendships were quickly rebooted. I don't know how long it's going to be this time. And that's an odd feeling to have, just as America has been turning a corner, and just as the optimism is back in so many areas. Of the twenty years I've spent in America, this feels like the most exciting time to be part of the exceptionalism that the US of A offers. And of course we still are. I'll still be in the US all the time on both business, racing, and family trips. But it won't be exclusively so for a while, and it won't be from our Malibu Dream House. And that burns.

2 days ago 6 votes
Slow, flaky, and failing

Thou shalt not suffer a flaky test to live, because it’s annoying, counterproductive, and dangerous: one day it might fail for real, and you won’t notice. Here’s what to do.

3 days ago 7 votes
Name that Ware, January 2025

The ware for January 2025 is shown below. Thanks to brimdavis for contributing this ware! …back in the day when you would get wares that had “blue wires” in them… One thing I wonder about this ware is…where are the ROMs? Perhaps I’ll find out soon! Happy year of the snake!

3 days ago 5 votes