Recently I was writing a shell script to deal with the AWS CLI, and I wanted to filter the list of results using jq. Specifically, I wanted to filter using some of the AWS tags, which are a bit unwieldy – although the tags form a set of key/value pairs, they’re returned as a list of objects with Key/Value keys. For example, given this list of two instances, how do I get the JSON object for the bastion host?

```json
[
  {
    "InstanceId": "i-123456789",
    "Tags": [
      { "Key": "Environment", "Value": "Production" },
      { "Key": "Name", "Value": "container-host_a88676" }
    ]
  },
  {
    "InstanceId": "i-987654321",
    "Tags": [
      { "Key": "Environment", "Value": "Production" },
      { "Key": "Name", "Value": "bastion-host_517e67" }
    ]
  }
]
```

I found a few snippets around the Internet, but they all did a complicated combination of map and select. I cobbled this together, which seemed to...
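The teaser is cut off here, but the shape of the problem is clear. As a purely illustrative aside – this is not the author's jq snippet, and the function names and input file are made up – here is how the same tag filtering might look in Python:

```python
# Hypothetical sketch, not the author's solution: collapse AWS-style
# Tags lists into plain dicts, then filter on the Name tag.
import json


def tags_to_dict(tags: list[dict]) -> dict:
    """Collapse [{"Key": ..., "Value": ...}, ...] into a plain dict."""
    return {t["Key"]: t["Value"] for t in tags}


def find_by_name_prefix(instances: list[dict], prefix: str) -> list[dict]:
    """Return every instance whose Name tag starts with the given prefix."""
    return [
        inst
        for inst in instances
        if tags_to_dict(inst.get("Tags", [])).get("Name", "").startswith(prefix)
    ]


with open("instances.json") as f:  # assumed input file
    instances = json.load(f)

for bastion in find_by_name_prefix(instances, "bastion-host"):
    print(bastion["InstanceId"])
```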


More from alexwlchan

Handling JSON objects with duplicate names in Python

Consider the following JSON object:

```json
{
  "sides": 4,
  "colour": "red",
  "sides": 5,
  "colour": "blue"
}
```

Notice that sides and colour both appear twice. This looks invalid, but I learnt recently that this is actually legal JSON syntax! It’s unusual and discouraged, but it’s not completely forbidden.

This was a big surprise to me. I think of JSON objects as key/value pairs, and I associate them with data structures like a dict in Python or a Hash in Ruby – both of which only allow unique keys. JSON has no such restriction, and I started thinking about how to handle it.

What does the JSON spec say about duplicate names?

JSON is described by several standards, which Wikipedia helpfully explains for us:

> After RFC 4627 had been available as its “informational” specification since 2006, JSON was first standardized in 2013, as ECMA‑404. RFC 8259, published in 2017, is the current version of the Internet Standard STD 90, and it remains consistent with ECMA‑404. That same year, JSON was also standardized as ISO/IEC 21778:2017.

The ECMA and ISO/IEC standards describe only the allowed syntax, whereas the RFC covers some security and interoperability considerations. All three of these standards explicitly allow the use of duplicate names in objects.

ECMA‑404 and ISO/IEC 21778:2017 have identical text to describe the syntax of JSON objects, and they say (emphasis mine):

> An object structure is represented as a pair of curly bracket tokens surrounding zero or more name/value pairs. […] The JSON syntax does not impose any restrictions on the strings used as names, *does not require that name strings be unique*, and does not assign any significance to the ordering of name/value pairs. These are all semantic considerations that may be defined by JSON processors or in specifications defining specific uses of JSON for data interchange.

RFC 8259 goes further and strongly recommends against duplicate names, but the use of SHOULD means it isn’t completely forbidden:

> The names within an object SHOULD be unique.

The same document warns about the consequences of ignoring this recommendation:

> An object whose names are all unique is interoperable in the sense that all software implementations receiving that object will agree on the name-value mappings. When the names within an object are not unique, the behavior of software that receives such an object is unpredictable. Many implementations report the last name/value pair only. Other implementations report an error or fail to parse the object, and some implementations report all of the name/value pairs, including duplicates.

So it’s technically valid, but it’s unusual and discouraged. I’ve never heard of a use case for JSON objects with duplicate names. I’m sure there was a good reason for it being allowed by the spec, but I can’t think of it.

Most JSON parsers – including jq, JavaScript, and Python – will silently discard all but the last instance of a duplicate name. Here’s an example in Python:

```python
>>> import json
>>> json.loads('{"sides": 4, "colour": "red", "sides": 5, "colour": "blue"}')
{'colour': 'blue', 'sides': 5}
```

What if I wanted to decode the whole object, or throw an exception if I see duplicate names?

This happened to me recently. I was editing a JSON file by hand, and I’d copy/paste objects to update the data. I also had scripts which could update the file. I forgot to update the name on one of the JSON objects, so there were two name/value pairs with the same name. When I ran the script, it silently erased the first value.
I was able to recover the deleted value from the Git history, but I wondered how I could prevent this happening again. How could I make the script fail, rather than silently delete data?

Decoding duplicate names in Python

When Python decodes a JSON object, it first parses the object as a list of name/value pairs, then it turns that list of name/value pairs into a dictionary.

We can see this by looking at the JSONObject function in the CPython source code: it builds a list pairs, and at the end of the function, it calls dict(pairs) to turn the list into a dictionary. This relies on the fact that dict() can take an iterable of key/value tuples and create a dictionary:

```python
>>> dict([('sides', 4), ('colour', 'red')])
{'colour': 'red', 'sides': 4}
```

The docs for dict() tell us that it will discard duplicate keys: “if a key occurs more than once, the last value for that key becomes the corresponding value in the new dictionary”.

```python
>>> dict([('sides', 4), ('colour', 'red'), ('sides', 5), ('colour', 'blue')])
{'colour': 'blue', 'sides': 5}
```

We can customise what Python does with the list of name/value pairs. Rather than calling dict(), we can pass our own function to the object_pairs_hook parameter of json.loads(), and Python will call that function on the list of pairs. This allows us to parse objects in a different way.

For example, we can just return the literal list of name/value pairs:

```python
>>> import json
>>> json.loads(
...     '{"sides": 4, "colour": "red", "sides": 5, "colour": "blue"}',
...     object_pairs_hook=lambda pairs: pairs
... )
[('sides', 4), ('colour', 'red'), ('sides', 5), ('colour', 'blue')]
```

We could also use the multidict library to get a dict-like data structure which supports multiple values per key. This is based on HTTP headers and URL query strings, two environments where it’s common to have multiple values for a single key:

```python
>>> from multidict import MultiDict
>>> md = json.loads(
...     '{"sides": 4, "colour": "red", "sides": 5, "colour": "blue"}',
...     object_pairs_hook=lambda pairs: MultiDict(pairs)
... )
>>> md
<MultiDict('sides': 4, 'colour': 'red', 'sides': 5, 'colour': 'blue')>
>>> md['sides']
4
>>> md.getall('sides')
[4, 5]
```

Preventing silent data loss

If we want to throw an exception when we see duplicate names, we need a longer function. Here’s the code I wrote:

```python
import collections
import typing


def dict_with_unique_names(
    pairs: list[tuple[str, typing.Any]]
) -> dict[str, typing.Any]:
    """
    Convert a list of name/value pairs to a dict, but only if the
    names are unique.

    If there are non-unique names, this function throws a ValueError.
    """
    # First try to parse the object as a dictionary; if it's the same
    # length as the pairs, then we know all the names were unique and
    # we can return immediately.
    pairs_as_dict = dict(pairs)

    if len(pairs_as_dict) == len(pairs):
        return pairs_as_dict

    # Otherwise, let's work out what the repeated name(s) were, so we
    # can throw an appropriate error message for the user.
    name_tally = collections.Counter(n for n, _ in pairs)

    repeated_names = [n for n, count in name_tally.items() if count > 1]
    assert len(repeated_names) > 0

    if len(repeated_names) == 1:
        raise ValueError(f"Found repeated name in JSON object: {repeated_names[0]}")
    else:
        raise ValueError(
            f"Found repeated names in JSON object: {', '.join(repeated_names)}"
        )
```

If I use this as my object_pairs_hook when parsing an object which has all unique names, it returns the normal dict I’d expect:
```python
>>> json.loads(
...     '{"sides": 4, "colour": "red"}',
...     object_pairs_hook=dict_with_unique_names
... )
{'colour': 'red', 'sides': 4}
```

But if I’m parsing an object with one or more repeated names, the parsing fails and throws a ValueError:

```python
>>> json.loads(
...     '{"sides": 4, "colour": "red", "sides": 5}',
...     object_pairs_hook=dict_with_unique_names
... )
Traceback (most recent call last):
[…]
ValueError: Found repeated name in JSON object: sides

>>> json.loads(
...     '{"sides": 4, "colour": "red", "sides": 5, "colour": "blue"}',
...     object_pairs_hook=dict_with_unique_names
... )
Traceback (most recent call last):
[…]
ValueError: Found repeated names in JSON object: sides, colour
```

This is precisely the behaviour I want – throwing an exception, not silently dropping data.

Encoding non-unique names in Python

It’s hard to think of a use case, but this post feels incomplete without at least a brief mention. If you want to encode custom data structures with Python’s JSON library, you can subclass JSONEncoder and define how those structures should be serialised. Here’s a rudimentary attempt at doing that for a MultiDict:

```python
class MultiDictEncoder(json.JSONEncoder):
    def encode(self, o: typing.Any) -> str:
        # If this is a MultiDict, we need to construct the JSON string
        # manually -- first encode each name/value pair, then construct
        # the JSON object literal.
        if isinstance(o, MultiDict):
            name_value_pairs = [
                f'{super().encode(str(name))}: {self.encode(value)}'
                for name, value in o.items()
            ]
            return '{' + ', '.join(name_value_pairs) + '}'

        return super().encode(o)
```

and here’s how you use it:

```python
>>> md = MultiDict([('sides', 4), ('colour', 'red'), ('sides', 5), ('colour', 'blue')])
>>> json.dumps(md, cls=MultiDictEncoder)
'{"sides": 4, "colour": "red", "sides": 5, "colour": "blue"}'
```

This is rough code, and you shouldn’t use it – it’s only an example. I’m constructing the JSON string manually, so it doesn’t handle edge cases like indentation or special characters. There are almost certainly bugs, and you’d need to be more careful if you wanted to use this for real.

In practice, if I had to encode a multi-dict as JSON, I’d encode it as a list of objects which each have a key and a value field. For example:

```json
[
  {"key": "sides",  "value": 4},
  {"key": "colour", "value": "red"},
  {"key": "sides",  "value": 5},
  {"key": "colour", "value": "blue"}
]
```

This is a pretty standard pattern, and it won’t trip up JSON parsers which aren’t expecting duplicate names.

Do you need to worry about this?

This isn’t a big deal. JSON objects with duplicate names are pretty unusual – this is the first time I’ve ever encountered one, and it was a mistake. Trying to account for this edge case in every project that uses JSON would be overkill. It would add complexity to my code and probably never catch a single error.

This started when I made a copy/paste error that introduced the initial duplication, and then a script modified the JSON file and caused some data loss. That’s a somewhat unusual workflow, because most JSON files are exclusively modified by computers, and this wouldn’t be an issue.

I’ve added this error handling to my javascript-data-files library, but I don’t anticipate adding it to other projects. I use that library for my static website archives, which is where I had this issue.

Although I won’t use this code exactly, it’s been good practice at writing custom encoders/decoders in Python. That is something I do all the time – I’m often encoding native Python types as JSON, and I want to get the same type back when I decode later. I’ve been writing my own subclasses of JSONEncoder and JSONDecoder for a while.
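As an aside, here's what that round-trip pattern can look like. This sketch is mine, not from the post – it treats datetime values as the native Python type being encoded, and tags them so the decoder can spot them on the way back out:

```python
# A minimal sketch (not from the original post) of the encode/decode
# round trip described above, using datetime as the native Python type.
import datetime
import json
import typing


class DatetimeEncoder(json.JSONEncoder):
    def default(self, o: typing.Any) -> typing.Any:
        # Serialise datetimes as a tagged object we can recognise later.
        if isinstance(o, datetime.datetime):
            return {"__datetime__": o.isoformat()}
        return super().default(o)


def datetime_decoder_hook(d: dict) -> typing.Any:
    # Turn the tagged object back into a real datetime on the way out.
    if set(d.keys()) == {"__datetime__"}:
        return datetime.datetime.fromisoformat(d["__datetime__"])
    return d


event = {"name": "backup", "ran_at": datetime.datetime(2025, 4, 15, 19, 45)}
encoded = json.dumps(event, cls=DatetimeEncoder)
decoded = json.loads(encoded, object_hook=datetime_decoder_hook)
assert decoded == event
```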
Now I know a bit more about how Python decodes JSON, and object_pairs_hook is another tool I can consider using. This was a fun deep dive for me, and I hope you found it helpful too.

A faster way to copy SQLite databases between computers

I store a lot of data in SQLite databases on remote servers, and I often want to copy them to my local machine for analysis or backup. When I’m starting a new project and the database is near-empty, this is a simple rsync operation:

```console
$ rsync --progress username@server:my_remote_database.db my_local_database.db
```

As the project matures and the database grows, this gets slower and less reliable. Downloading a 250MB database from my web server takes about a minute over my home Internet connection, and that’s pretty small – most of my databases are multiple gigabytes in size. I’ve been trying to make these copies go faster, and I recently discovered a neat trick.

What really slows me down is my indexes. I have a lot of indexes in my SQLite databases, which dramatically speed up my queries, but also make the database file larger and slower to copy. (In one database, there’s an index which single-handedly accounts for half the size on disk!) The indexes don’t store anything unique – they just duplicate data from other tables to make queries faster. Copying the indexes makes the transfer less efficient, because I’m copying the same data multiple times. I was thinking about ways to skip copying the indexes, and I realised that SQLite has built-in tools to make this easy.

Dumping a database as a text file

SQLite allows you to dump a database as a text file. If you use the .dump command, it prints the entire database as a series of SQL statements. This text file can often be significantly smaller than the original database.

Here’s the command:

```console
$ sqlite3 my_database.db .dump > my_database.db.txt
```

And here’s what the beginning of that file looks like:

```sql
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE IF NOT EXISTS "tags" (
    [name] TEXT PRIMARY KEY,
    [count_uses] INTEGER NOT NULL
);
INSERT INTO tags VALUES('carving',260);
INSERT INTO tags VALUES('grass',743);
…
```

Crucially, this reduces the large and disk-heavy indexes into a single line of text – it’s an instruction to create an index, not the index itself:

```sql
CREATE INDEX [idx_photo_locations] ON [photos] ([longitude], [latitude]);
```

This means that I’m only storing each value once, rather than the many times it may be stored across the original table and my indexes. This is how the text file can be smaller than the original database.

If you want to reconstruct the database, you pipe this text file back to SQLite:

```console
$ cat my_database.db.txt | sqlite3 my_reconstructed_database.db
```

Because the SQL statements are very repetitive, this text responds well to compression:

```console
$ sqlite3 explorer.db .dump | gzip -c > explorer.db.txt.gz
```

To give you an idea of the potential savings, here’s the relative disk size for one of my databases:

| File | Size on disk |
|------|--------------|
| original SQLite database | 3.4 GB |
| text file (`sqlite3 my_database.db .dump`) | 1.3 GB |
| gzip-compressed text (`sqlite3 my_database.db .dump \| gzip -c`) | 240 MB |

The gzip-compressed text file is 14× smaller than the original SQLite database – that makes downloading the database much faster.

My new ssh+rsync command

Rather than copying the database directly, now I create a gzip-compressed text file on the server, copy that to my local machine, and reconstruct the database.
Like so:

```console
# Create a gzip-compressed text file on the server
ssh username@server "sqlite3 my_remote_database.db .dump | gzip -c > my_remote_database.db.txt.gz"

# Copy the gzip-compressed text file to my local machine
rsync --progress username@server:my_remote_database.db.txt.gz my_local_database.db.txt.gz

# Remove the gzip-compressed text file from my server
ssh username@server "rm my_remote_database.db.txt.gz"

# Uncompress the text file
gunzip my_local_database.db.txt.gz

# Reconstruct the database from the text file
cat my_local_database.db.txt | sqlite3 my_local_database.db

# Remove the local text file
rm my_local_database.db.txt
```

A database dump is a stable copy source

This approach fixes another issue I’ve had when copying SQLite databases. If it takes a long time to copy a database and it gets updated midway through, rsync may give me an invalid database file. The first half of the file is pre-update, the second half is post-update, and they don’t match. When I try to open the database locally, I get an error:

```
database disk image is malformed
```

By creating a text dump before I start the copy operation, I’m giving rsync a stable copy source. That text dump isn’t going to change midway through the copy, so I’ll always get a complete and consistent text file.

This approach has saved me hours when working with large databases, and made my downloads both faster and more reliable. If you have to copy around large SQLite databases, give it a try.
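If you end up running these steps a lot, they're easy to fold into a single script. Here's a rough Python sketch of the same workflow – the function name is mine, the host and database paths are placeholders, and it assumes ssh, rsync, sqlite3, gzip, and gunzip are all available:

```python
# Rough sketch (not from the original post) automating the dump/copy/
# rebuild steps above. Host names and paths are placeholders.
import os
import subprocess


def fetch_sqlite_db(host: str, remote_db: str, local_db: str) -> None:
    dump = f"{remote_db}.txt.gz"

    # Dump the remote database to a gzip-compressed text file.
    subprocess.run(
        ["ssh", host, f"sqlite3 {remote_db} .dump | gzip -c > {dump}"],
        check=True,
    )

    try:
        # Copy the compressed dump to the local machine.
        subprocess.run(["rsync", "--progress", f"{host}:{dump}", dump], check=True)
    finally:
        # Remove the remote dump, whether or not the copy succeeded.
        subprocess.run(["ssh", host, f"rm -f {dump}"], check=True)

    # Rebuild the local database from scratch, so stale data can't linger.
    if os.path.exists(local_db):
        os.remove(local_db)
    subprocess.run(f"gunzip -c {dump} | sqlite3 {local_db}", shell=True, check=True)
    os.remove(dump)


fetch_sqlite_db("username@server", "my_remote_database.db", "my_local_database.db")
```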

A flash of light in the darkness

I support dark mode on this site, and as part of the dark theme, I have a colour-inverted copy of the default background texture. I like giving my website a subtle bit of texture, which I think makes it stand out from a web which is mostly solid-colour backgrounds. Both my textures are based on the “White Waves” pattern made by Stas Pimenov.

I was setting these images as my background with two CSS rules, using the prefers-color-scheme: dark media feature to use the alternate image in dark mode:

```css
body {
  background: url('https://alexwlchan.net/theme/white-waves-transparent.png');
}

@media (prefers-color-scheme: dark) {
  body {
    background: url('https://alexwlchan.net/theme/black-waves-transparent.png');
  }
}
```

This works, mostly. But I prefer light mode, so while I wrote this CSS and I do some brief testing whenever I make changes, I’m not using the site in dark mode. I know how dark mode works in my local development environment, not how it feels as a day-to-day user.

Late last night I was using my phone in dark mode to avoid waking the other people in the house, and I opened my site. I saw a brief flash of white, and then the dark background texture appeared. That flash of bright white is precisely what you don’t want when you’re using dark mode, but it happened anyway. I made a note to work it out in the morning, then I went to bed.

Now I’m fully awake, it’s obvious what happened. Because my only background is the image URL, there’s a brief gap between the CSS being parsed and the background image being loaded. In that time, the browser doesn’t have anything to put in the background, so you just get pure white.

This was briefly annoying in the moment, but it would be even worse if the background texture never loaded. I have light text on black in dark mode, but without the background image it’s just light text on white, which is barely readable.

I never noticed this in local development, because I’m usually working in a well-lit room where that white flash would be far less obvious. I’m also using a local version of the site, which loads near-instantly and where the background image is almost certainly saved in my browser cache.

I’ve made two changes to prevent this happening again.

First, I’ve added a colour to use as a fallback until the image loads. The CSS background property supports adding a colour, which is used until the image loads, or as a fallback if it doesn’t. I already use this in a few places, and now I’ve added it to my body background:

```css
body {
  background: url('https://…/white-waves-transparent.png') #fafafa;
}

@media (prefers-color-scheme: dark) {
  body {
    background: url('https://…/black-waves-transparent.png') #0d0d0d;
  }
}
```

This avoids the flash of unstyled background before the image loads – the browser will use a solid dark background until it gets the texture.

Second, I’ve added rel="preload" elements to the head of the page, so the browser will start loading the background textures faster. These elements are a clue to the browser that these resources are going to be useful when it renders the page, so it should start loading them as soon as possible:

```html
<link
  rel="preload"
  href="https://alexwlchan.net/theme/white-waves-transparent.png"
  as="image"
  type="image/png"
  media="(prefers-color-scheme: light)"
/>
<link
  rel="preload"
  href="https://alexwlchan.net/theme/black-waves-transparent.png"
  as="image"
  type="image/png"
  media="(prefers-color-scheme: dark)"
/>
```

This means the browser is downloading the appropriate texture at the same time as it’s downloading the CSS file.
Previously it had to download the CSS file, parse it, and only then would it know to start downloading the texture. With the preload, it’s a bit faster!

The difference is probably imperceptible if you’re on a fast connection, but it’s a small win and I can’t see any downside (as long as I scope the preload correctly, and don’t preload resources I don’t end up using). I’ve seen a lot of sites using <link rel="preload"> and I’ve only half-understood what it is and why it’s useful – I’m glad to have a chance to use it myself, so I can understand it better.

This bug reminds me of a phenomenon called flash of unstyled text. Back when custom fonts were fairly new, you’d often see web pages appear briefly with the default font before custom fonts finished loading. There are well-understood techniques for preventing this, so it’s unusual to see that brief unstyled text on modern web pages – but the same issue is affecting me in dark mode. I avoided using custom fonts on the web to avoid tackling this issue, but it got me anyway! In these dark times for the web, old bugs are new again.

Beyond `None`: actionable error messages for `keyring.get_password()`

I’m a big fan of keyring, a Python module made by Jason R. Coombs for storing secrets in the system keyring. It works on multiple operating systems, and it knows what password store to use for each of them. For example, if you’re using macOS it puts secrets in the Keychain, but if you’re on Windows it uses Credential Locker.

The keyring module is a safe and portable way to store passwords, more secure than using a plaintext config file or an environment variable. The same code will work on different platforms, because keyring handles the hard work of choosing which password store to use.

It has a straightforward API: the keyring.set_password and keyring.get_password functions will handle a lot of use cases.

```python
>>> import keyring
>>> keyring.set_password("xkcd", "alexwlchan", "correct-horse-battery-staple")
>>> keyring.get_password("xkcd", "alexwlchan")
'correct-horse-battery-staple'
```

Although this API is simple, it’s not perfect – I have some frustrations with the get_password function. In a lot of my projects, I’m now using a small function that wraps get_password.

What do I find frustrating about keyring.get_password?

If you look up a password that isn’t in the system keyring, get_password returns None rather than throwing an exception:

```python
>>> print(keyring.get_password("xkcd", "the_invisible_man"))
None
```

I can see why this makes sense for the library overall – a non-existent password is very normal, and not exceptional behaviour – but in my projects, None is rarely a usable value.

I normally use keyring to retrieve secrets that I need to access protected resources – for example, an API key to call an API that requires authentication. If I can’t get the right secrets, I know I can’t continue. Indeed, continuing often leads to more confusing errors when some other function unexpectedly gets None, rather than a string.

For a while, I wrapped get_password in a function that would throw an exception if it couldn’t find the password:

```python
def get_required_password(service_name: str, username: str) -> str:
    """
    Get password from the specified service.

    If a matching password is not found in the system keyring,
    this function will throw an exception.
    """
    password = keyring.get_password(service_name, username)

    if password is None:
        raise RuntimeError(f"Could not retrieve password {(service_name, username)}")

    return password
```

When I use this function, my code will fail as soon as it fails to retrieve a password, rather than when it tries to use None as the password. This worked well enough for my personal projects, but it wasn’t a great fit for shared projects. I could make sense of the error, but not everyone could do the same.

What’s that password meant to be?

A good error message explains what’s gone wrong, and gives the reader clear steps for fixing the issue. The error message above is only doing half the job. It tells you what’s gone wrong (it couldn’t get the password) but it doesn’t tell you how to fix it.

As I started using this snippet in codebases that I work on with other developers, I got questions when other people hit this error. They could guess that they needed to set a password, but the error message doesn’t explain how, or what password they should be setting. For example, is this a secret they should pick themselves? Is it a password in our shared password vault? Or do they need an API key for a third-party service? If so, where do they find it?

I still think my initial error was an improvement over letting None be used in the rest of the codebase, but I realised I could go further.
This is my extended wrapper:

```python
def get_required_password(service_name: str, username: str, explanation: str) -> str:
    """
    Get password from the specified service.

    If a matching password is not found in the system keyring,
    this function will throw an exception and explain to the user
    how to set the required password.
    """
    password = keyring.get_password(service_name, username)

    if password is None:
        raise RuntimeError(
            "Unable to retrieve required password from the system keyring!\n"
            "\n"
            "You need to:\n"
            "\n"
            f"1/ Get the password. Here's how: {explanation}\n"
            "\n"
            "2/ Save the new password in the system keyring:\n"
            "\n"
            f"    keyring set {service_name} {username}\n"
        )

    return password
```

The explanation argument allows me to explain what the password is for to a future reader, and what value it should have. That information can often be found in a code comment or in documentation, but putting it in an error message makes it more visible.

Here’s one example:

```python
get_required_password(
    "flask_app",
    "secret_key",
    explanation=(
        "Pick a random value, e.g. with\n"
        "\n"
        "    python3 -c 'import secrets; print(secrets.token_hex())'\n"
        "\n"
        "This password is used to securely sign the Flask session cookie. "
        "See https://flask.palletsprojects.com/en/stable/config/#SECRET_KEY"
    ),
)
```

If you call this function and there’s no keyring entry for flask_app/secret_key, you get the following error:

```
Unable to retrieve required password from the system keyring!

You need to:

1/ Get the password. Here's how: Pick a random value, e.g. with

    python3 -c 'import secrets; print(secrets.token_hex())'

This password is used to securely sign the Flask session cookie. See https://flask.palletsprojects.com/en/stable/config/#SECRET_KEY

2/ Save the new password in the system keyring:

    keyring set flask_app secret_key
```

It’s longer, but this error message is far more informative. It tells you what’s wrong, how to save a password, and what the password should be.

This is based on a real example where the previous error message led to a misunderstanding. A co-worker saw a missing password called “secret key” and thought it referred to a secret key for calling an API, and didn’t realise it was actually for signing Flask session cookies. Now that I can write a more informative error message, I can prevent that misunderstanding from happening again. (We also renamed the secret, for additional clarity.)

It takes time to write this explanation, which will only ever be seen by a handful of people, but I think it’s important. If somebody sees it at all, it’ll be when they’re setting up the project for the first time. I want that setup process to be smooth and straightforward.

I don’t use this wrapper in all my code, particularly small or throwaway toys that won’t last long enough for this to be an issue. But in larger codebases that will be used by other developers, and which I expect to last a long time, I use it extensively. Writing a good explanation now can avoid frustration later.

Localising the `<time>` element with JavaScript

I’ve been writing some internal dashboards recently, and one hard part is displaying timestamps. Our server does everything in UTC, but the team is split across four different timezones, so the server timestamps aren’t always easy to read.

For most people, it’s harder to understand a UTC timestamp than a timestamp in your local timezone. Did that event happen just now, an hour ago, or much further back? Was that at the beginning of your working day? Or at the end?

Then I remembered that I tried to solve this five years ago at a previous job. I wrote a JavaScript snippet that converts UTC timestamps into human-friendly text. It displays times in your local time zone, and adds a short suffix if the time happened recently. For example:

```
today @ 12:00 BST (1 hour ago)
```

In my old project, I was writing timestamps in a <div> and I had to opt into the human-readable text for every date on the page. It worked, but it was a bit fiddly. Doing it again, I thought of a more elegant solution.

HTML has a <time> element for expressing datetimes, which is a more meaningful wrapper than a <div>. When I render the dashboard on the server, I don’t know the user’s timezone, so I include the UTC timestamp in the page like so:

```html
<time datetime="2025-04-15 19:45:00Z">
  Tue, 15 Apr 2025 at 19:45 UTC
</time>
```

I put a machine-readable date and time string with a timezone offset string in the datetime attribute, and then a more human-readable string in the text of the element.

Then I add this JavaScript snippet to the page:

```javascript
window.addEventListener("DOMContentLoaded", function() {
  document.querySelectorAll("time").forEach(function(timeElem) {
    // Set the `title` attribute to the original text, so a user
    // can hover over a timestamp to see the UTC time.
    timeElem.setAttribute("title", timeElem.innerText);

    // Replace the display text with a human-friendly date string
    // which is localised to the user's timezone.
    timeElem.innerText = getHumanFriendlyDateString(
      timeElem.getAttribute("datetime")
    );
  })
});
```

This updates any <time> element on the page to use a human-friendly date string, which is localised to the user’s timezone. For example, I’m in the UK so that becomes:

```html
<time datetime="2025-04-15 19:45:00Z" title="Tue, 15 Apr 2025 at 19:45 UTC">
  Tue, 15 Apr 2025 at 20:45 BST
</time>
```

In my experience, these timestamps are easier and more intuitive for people to read. I always include a timezone string (e.g. BST, EST, PDT) so it’s obvious that I’m showing a localised timestamp. If you really need the UTC timestamp, it’s in the title attribute, so you can see it by hovering over it. (Sorry, mouseless users, but I don’t think any of my team are browsing our dashboards from their phone or tablet.)

If the JavaScript doesn’t load, you see the plain old UTC timestamp. It’s not ideal, but the page still loads and you can see all the information – this behaviour is an enhancement, not an essential.

To me, this is the unfulfilled promise of the <time> element. In my fantasy world, web page authors would write the time in a machine-readable format, and browsers would show it in a way that makes sense for the reader. They’d take into account their language, locale, and time zone. I understand why that hasn’t happened – it’s much easier said than done. You need so much context to know what’s the “right” thing to do when dealing with datetimes, and guessing without that context is at the heart of many datetime bugs. These sorts of human-friendly, localised timestamps are very handy sometimes, and a complete mess at other times.
In my staff-only dashboards, I have that context. I know what these timestamps mean, who’s going to be reading them, and I think they’re a helpful addition that makes the data easier to read.


More in programming

I switched from GMail and nobody died

Whether we like it or not, email is widely used to identify a person. Code sent to email is used as authentication and sometimes as authorisation for certain actions. I’m not comfortable with Google having such power over me, especially given the fact that they practically don’t have any support you can appeal to. If your Google account is blocked, that’s it. Maybe you know someone from Google and they can help you, but for most of us mortals that’s not an option.

Write the most clever code you possibly can

I started writing this early last week but Real Life Stuff happened and now you're getting the first-draft late this week. Warning, unedited thoughts ahead!

New Logic for Programmers release! v0.9 is out! This is a big release, with a new cover design, several rewritten chapters, online code samples and much more. See the full release notes at the changelog page, and get the book here!

Write the cleverest code you possibly can

There are millions of articles online about how programmers should not write "clever" code, and instead write simple, maintainable code that everybody understands. Sometimes the example of "clever" code looks like this (src):

```python
# Python
p=n=1
exec("p*=n*n;n+=1;"*~-int(input()))
print(p%n)
```

This is code-golfing, the sport of writing the most concise code possible. Obviously you shouldn't run this in production for the same reason you shouldn't eat dinner off a Rembrandt.

Other times the example looks like this:

```python
def is_prime(x):
    if x == 1:
        return True
    return all([x%n != 0 for n in range(2, x)])
```

This is "clever" because it uses a single list comprehension, as opposed to a "simple" for loop. Yes, "list comprehensions are too clever" is something I've read in one of these articles. I've also talked to people who think that datatypes besides lists and hashmaps are too clever to use, that most optimizations are too clever to bother with, and even that functions and classes are too clever and code should be a linear script.[1]

Clever code is anything using features or domain concepts we don't understand. Something that seems unbearably clever to me might be utterly mundane for you, and vice versa. How do we make something utterly mundane? By using it and working at the boundaries of our skills. Almost everything I'm "good at" comes from banging my head against it more than is healthy.

That suggests a really good reason to write clever code: it's an excellent form of purposeful practice. Writing clever code forces us to code outside of our comfort zone, developing our skills as software engineers.

> Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you [will get excellent debugging practice at exactly the right level required to push your skills as a software engineer]
> — Brian Kernighan, probably

There are other benefits, too, but first let's kill the elephant in the room:[2]

Don't commit clever code

I am proposing writing clever code as a means of practice. Being at work is a job with coworkers who will not appreciate if your code is too clever. Similarly, don't use too many innovative technologies. Don't put anything in production you are uncomfortable with.

We can still responsibly write clever code at work, though:

- Solve a problem in both a simple and a clever way, and then only commit the simple way. This works well for small scale problems where trying the "clever way" only takes a few minutes.
- Write our personal tools cleverly. I'm a big believer of the idea that most programmers would benefit from writing more scripts and support code customized to their particular work environment. This is a great place to practice new techniques, languages, etc.
- If clever code is absolutely the best way to solve a problem, then commit it with extensive documentation explaining how it works and why it's preferable to simpler solutions. Bonus: this potentially helps the whole team upskill.

Writing clever code...
...teaches simple solutions

Usually, code that's called too clever composes several powerful features together — the "not a single list comprehension or function" people are the exception. Josh Comeau's "don't write clever code" article gives this example of "too clever":

```javascript
const extractDataFromResponse = (response) => {
  const [Component, props] = response;
  const resultsEntries = Object.entries({ Component, props });
  const assignIfValueTruthy = (o, [k, v]) => (v ? { ...o, [k]: v } : o);
  return resultsEntries.reduce(assignIfValueTruthy, {});
}
```

What makes this "clever"? I count eight language features composed together: entries, argument unpacking, implicit objects, splats, ternaries, higher-order functions, and reductions. Would code that used only one or two of these features still be "clever"? I don't think so. These features exist for a reason, and oftentimes they make code simpler than not using them.

We can, of course, learn these features one at a time. Writing the clever version (but not committing it) gives us practice with all eight at once and also with how they compose together. That knowledge comes in handy when we want to apply a single one of the ideas.

I've recently had to do a bit of pandas for a project. Whenever I have to do a new analysis, I try to write it as a single chain of transformations, and then as a more balanced set of updates.

...helps us master concepts

Even if the composite parts of a "clever" solution aren't by themselves useful, it still makes us better at the overall language, and that's inherently valuable. A few years ago I wrote Crimes with Python's Pattern Matching. It involves writing horrible code like this:

```python
from abc import ABC

class NotIterable(ABC):
    @classmethod
    def __subclasshook__(cls, C):
        return not hasattr(C, "__iter__")

def f(x):
    match x:
        case NotIterable():
            print(f"{x} is not iterable")
        case _:
            print(f"{x} is iterable")

if __name__ == "__main__":
    f(10)
    f("string")
    f([1, 2, 3])
```

This composes Python match statements, which are broadly useful, and abstract base classes, which are incredibly niche. But even if I never use ABCs in real production code, it helped me understand Python's match semantics and Method Resolution Order better.

...prepares us for necessity

Sometimes the clever way is the only way. Maybe we need something faster than the simplest solution. Maybe we are working with constrained tools or frameworks that demand cleverness. Peter Norvig argued that design patterns compensate for missing language features. I'd argue that cleverness is another means of compensating: if our tools don't have an easy way to do something, we need to find a clever way.

You see this a lot in formal methods like TLA+. Need to check a hyperproperty? Cast your state space to a directed graph. Need to compose ten specifications together? Combine refinements with state machines. Most difficult problems have a "clever" solution.

The real problem is that clever solutions have a skill floor. If normal use of the tool is at difficulty 3 out of 10, then basic clever solutions are at 5 out of 10, and it's hard to jump those two steps in the moment you need the cleverness. But if you've practiced with writing overly clever code, you're used to working at a 7 out of 10 level in short bursts, and then you can "drop down" to 5/10. I don't know if that makes too much sense, but I see it happen a lot in practice.
...builds comradery

On a few occasions, after getting a pull request merged, I pulled the reviewer over and said "check out this horrible way of doing the same thing". I find that as long as people know they're not going to be subjected to a clever solution in production, they enjoy seeing it!

Next week's newsletter will probably also be late; after that we should be back to a regular schedule for the rest of the summer.

[1] Mostly grad students outside of CS who have to write scripts to do research. And in more than one data scientist. I think it's correlated with using Jupyter.

[2] If I don't put this at the beginning, I'll get a bajillion responses like "your team will hate you"

A Little Bit Now, A Lotta Bit Later

In mid-March we released a big bug fix update—elementary OS 8.0.1—and since then we’ve been hard at work on even more bug fixes and some new exciting features that I’m excited to share with you today! Read ahead to find out what we’ve released recently and what you can help us test in Early Access.

Quick Settings

[Image: Quick Settings has a new “Prevent Sleep” toggle]

Leo added a new “Prevent Sleep” toggle. This is useful when you’re giving a presentation or have a long-running background task where you want to temporarily avoid letting the computer go to sleep on its normal schedule.

We also fixed a bug where the “Dark Mode” toggle would cancel the dark mode schedule when used. We now have proper schedule snoozing, so when you manually toggle Dark Mode on or off while using a timed or sunset-to-sunrise schedule, your schedule will resume on the next schedule change instead of being canceled completely.

Vishal also fixed an issue that caused some apps to report being improperly closed on system shutdown or restart, and on the lock screen we now show the “Suspend” button rather than the “Lock” button.

System Settings

Locale settings has a fresh layout thanks to Alain, with its options aligned more cleanly and improved links to additional settings.

[Image: Locale Settings has a more responsive design]

We’ve also added the phrase “about this device” as a search term for the System page and improved interface copy when a restart is required to finish installing updates, based on your feedback. Plus, Stanisław improved stylus detection in Wacom settings, preventing a crash when no stylus is found.

AppCenter

We now show a small label next to the download button for apps which contain in-app purchases. This is especially useful for easily identifying free-to-play games or alt stores like Steam or Heroic Games Launcher.

[Image: AppCenter now shows when apps have in-app purchases]

Plus, we now reload app icons on-the-fly as their data is processed, thanks to Italo. That means you’ll no longer get occasionally stuck with an AppCenter which shows missing images for apps that have taken a bit longer than usual to load.

Get These Updates

As always, pop open System Settings → System on elementary OS 8 and hit “Update All” to get these updates plus your regular security, bug fix, and translation updates. Or set up automatic updates and get a notification when updates are ready to install!

Early Access

Our development focus recently has been on some of the bigger features that will likely land for either elementary OS 8.1 or 9. We’ve got a new app, big changes to the design of our desktop itself, a whole lot of under-the-hood cleanup, and the return of some key system services thanks to a new open source project.

Monitor

[Image: We’re now shipping a System Monitor app by default]

By popular demand—and thanks to the hard work of Stanisław—we have a new system monitor app called “Monitor” shipping in Early Access. Monitor provides usage information for your processor, GPU, memory, storage, network, and currently running processes.

[Image: You can optionally see system information in the panel with Monitor]

You can also optionally get a ton of glanceable information shown in the panel. There’s currently a lot of work happening to port Monitor to GTK4 and improve its functionality under the Secure Session, so make sure to report any issues you find!

Multitasking

[Image: The Dock is getting a workspace switcher]

Probably the biggest change to the Pantheon shell since its early inception, the Dock is getting a new workspace switcher!
The workspace switcher works in a familiar way to the one you may have seen in the Multitasking View:

- Your currently open workspaces are represented as tiles with the icons of apps running on them;
- You can select a workspace to switch to it;
- You can drag-and-drop workspaces to rearrange them;
- And you can use the “+” button to create a new blank workspace.

One new trick, however, is that selecting the workspace you’re already on will launch Multitasking View.

The new workspace switcher makes it so much more accessible to multitask with just the mouse and get an overview of your workflows without having to first enter the Multitasking View. We’re really excited to hear what people think about it!

[Image: You can close apps from Multitasking View by swiping up]

Another very satisfying feature for folks using touch input: you can now swipe up windows in the Multitasking View to close them. This is a really familiar gesture for those of us with Android and iOS devices, and feels really natural for managing a big stack of windows without having to aim for a small “x” button.

GTK4 Porting

We’ve recently landed the port of Tasks to GTK4. So far that comes with a few fixes to tighten up its design, with much more possible in the future. Please make sure to help us test it thoroughly for any regressions!

[Image: Tasks has a slightly tightened up design]

We’re also making great progress on porting the panel to GTK4. So far we have branches in review for Nightlight, Bluetooth, Datetime, and Network indicators. Power, Keyboard, and Quick Settings indicators all have in-progress branches. That leaves just Applications, Sound, and Notifications.

So far these ports don’t come with major feature changes, but they do involve lots of cleaning up and modernizing of these code bases, and in some cases fixing bugs! When the port is finished, we should see immediate performance gains and we’ll have a much better foundation for future releases. You can follow along with our progress porting everything to GTK4 in this GitHub Project.

And More

When you take a screenshot using keyboard shortcuts or by secondary-clicking an app’s window handle, we now send a notification letting you know that it was successful and where to find the resulting image. Plus there’s a handy button that opens Files with your screenshot pre-selected.

We’re also testing beaconDB as a replacement for Mozilla Location Services (MLS). If you’re not aware, we relied on MLS in previous versions of elementary OS to provide location information for devices that don’t have a GPS radio. Unfortunately Mozilla discontinued the service last June, and we’ve been left without a replacement until now. Without these services, not only did maps and weather apps cease to function, but system features like automatic timezone detection and features that rely on sunset and sunrise times no longer worked properly.

beaconDB offers a drop-in replacement for MLS that uses wireless networks, Bluetooth devices, and cell towers to provide location data when requested. All of its data is crowd-sourced and opt-in, and several distributions are now defaulting to using it as their location services data provider. I’ve set up a small sponsorship from elementary on Liberapay to support the project. If you can help support beaconDB either by sponsoring or providing stumbler data, I’d highly encourage you to do so!

Sponsors

At the moment we’re at 23% of our monthly funding goal and 336 Sponsors on GitHub! Shoutouts to everyone helping us reach our goals here.
Your monthly sponsorship funds development and makes sure we have the resources we need to give you the best version of elementary OS we can! Monthly release candidate builds and daily Early Access builds are available to GitHub Sponsors from any tier! Beware that Early Access builds are not considered stable and you will encounter fresh issues when you run them. We’d really appreciate reporting any problems you encounter with the Feedback app or directly on GitHub.

The System-Level Foundation of Assembly

Tracing how the CPU, OS, and ELF format shape the structure of your assembly code

Taking a break

I've been publishing at least one blog post every week on this blog for about 2.5 years. I kept it up even when I was very sick last year with Lyme disease. It's time for me to take a break and reset. This is the right time, because the world is very difficult for me to move through right now and I'm just burnt out. I need to focus my energy on things that give me energy and right now, that's not writing and that's not tech. I'll come back to this, and it might look a little different. This is my last post for at least a month. It might be longer, if I still need more time, but I won't return before the end of May. I know I need at least that long to heal, and I also need that time to focus on music. I plan to play a set at West Philly Porchfest, so this whole month I'll be prepping that set. If you want to follow along with my music, you can find it on my bandcamp (only one track, but I'll post demos of the others that I prepare for Porchfest as they come together). And if you want to reach out, my inbox is open. Be kind to yourself. Stay well, drink some water. See you in a while.
