How to add a directory to your PATH

from Julia Evans in programming

I was talking to a friend about how to add a directory to your PATH today. It’s something that feels “obvious” to me since I’ve been using the terminal for a long time, but when I searched for instructions for how to do it, I actually couldn’t find something that explained all of the steps – a lot of them just said “add this to ~/.bashrc”, but what if you’re not using bash? What if your bash config is actually in a different file? And how are you supposed to figure out which directory to add anyway?

So I wanted to try to write down some more complete directions and mention some of the gotchas I’ve run into over the years.

Here’s a table of contents:

- step 1: what shell are you using?
- step 2: find your shell’s config file
  - a note on bash’s config file
- step 3: figure out which directory to add
  - step 3.1: double check it’s the right directory
- step 4: edit your shell config
- step 5: restart your shell
- problems:
  - problem 1: it ran the wrong program
  - problem 2: the program isn’t being run from...
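The excerpt above is cut off before the actual steps, so here is only a generic sketch of what "edit your shell config" can look like for bash or zsh. The helper name path_prepend and the ~/bin directory are made up for illustration, and other shells (fish, for example) use different syntax.

```shell
# Hypothetical helper for a bash/zsh config file (not taken from the article):
# put a directory at the front of PATH unless it is already listed.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already in PATH, leave it alone
        *) PATH="$1:$PATH" ;;     # otherwise put it first so it wins lookups
    esac
}

# ~/bin is an assumed example directory
path_prepend "$HOME/bin"
export PATH
```

Putting the directory first means programs in it take priority over programs with the same name elsewhere in PATH; the change only shows up in shells started after the config file is edited.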

4 months ago


More from Julia Evans

New zine: The Secret Rules of the Terminal

Hello! After many months of writing deep dive blog posts about the terminal, on Tuesday I released a new zine called “The Secret Rules of the Terminal”!

You can get it for $12 here: https://wizardzines.com/zines/terminal, or get a 15-pack of all my zines here.

Here’s the cover: [image]

the table of contents

Here’s the table of contents: [image]

why the terminal?

At first when I thought about writing about the terminal I was a bit baffled. After all – you just type in a command and run it, right? How hard could it be?

But then I ran a terminal workshop for some folks who were new to the terminal, and somebody asked this question: “how do I quit? Ctrl+C isn’t working!”

This question has a very simple answer (they’d run man pngquant, so they just needed to press q to quit). But it made me think about how even though different situations in the terminal look extremely similar (it’s all text!), the way they behave can be very different. Something as simple as “quitting” is different depending on whether you’re in a REPL (Ctrl+D), a full screen program like less (q), or a noninteractive program (Ctrl+C).

And then I realized that the terminal was way more complicated than I’d been giving it credit for.

there are a million tiny inconsistencies

The more I thought about using the terminal, the more I realized that the terminal has a lot of tiny inconsistencies like:

- sometimes you can use the arrow keys to move around, but sometimes pressing the arrow keys just prints ^[[D
- sometimes you can use the mouse to select text, but sometimes you can’t
- sometimes your commands get saved to a history when you run them, and sometimes they don’t
- some shells let you use the up arrow to see the previous command, and some don’t

If you use the terminal daily for 10 or 20 years, even if you don’t understand exactly why these things happen, you’ll probably build an intuition for them. But having an intuition for them isn’t the same as understanding why they happen.
When writing this zine I actually had to do a lot of work to figure out exactly what was happening in the terminal to be able to talk about how to reason about it.

the rules aren’t written down anywhere

It turns out that the “rules” for how the terminal works (how do you edit a command you type in? how do you quit a program? how do you fix your colours?) are extremely hard to fully understand, because “the terminal” is actually made of many different pieces of software (your terminal emulator, your operating system, your shell, the core utilities like grep, and every other random terminal program you’ve installed) which are written by different people with different ideas about how things should work.

So I wanted to write something that would explain:

- how the 4 pieces of the terminal (your shell, terminal emulator, programs, and TTY driver) fit together to make everything work
- some of the core conventions for how you can expect things in your terminal to work
- lots of tips and tricks for how to use terminal programs

this zine explains the most useful parts of terminal internals

Terminal internals are a mess. A lot of it is just the way it is because someone made a decision in the 80s and now it’s impossible to change, and honestly I don’t think learning everything about terminal internals is worth it.

But some parts are not that hard to understand and can really make your experience in the terminal better, like:

- if you understand what your shell is responsible for, you can configure your shell (or use a different one!) to access your history more easily, get great tab completion, and so much more
- if you understand escape codes, it’s much less scary when cating a binary to stdout messes up your terminal, you can just type reset and move on
- if you understand how colour works, you can get rid of bad colour contrast in your terminal so you can actually read the text

I learned a surprising amount writing this zine

When I wrote How Git Works, I thought I knew how Git worked, and I was right. But the terminal is different. Even though I feel totally confident in the terminal and even though I’ve used it every day for 20 years, I had a lot of misunderstandings about how the terminal works and (unless you’re the author of tmux or something) I think there’s a good chance you do too.

A few things I learned that are actually useful to me:

- I understand the structure of the terminal better and so I feel more confident debugging weird terminal stuff that happens to me (I was even able to suggest a small improvement to fish!). Identifying exactly which piece of software is causing a weird thing to happen in my terminal still isn’t easy but I’m a lot better at it now.
- you can write a shell script to copy to your clipboard over SSH
- how reset works under the hood (it does the equivalent of stty sane; sleep 1; tput reset) – basically I learned that I don’t ever need to worry about remembering stty sane or tput reset and I can just run reset instead
- how to look at the invisible escape codes that a program is printing out (run unbuffer program > out; less out)
- why the builtin REPLs on my Mac like sqlite3 are so annoying to use (they use libedit instead of readline)

blog posts I wrote along the way

As usual these days I wrote a bunch of blog posts about various side quests:

- How to add a directory to your PATH
- “rules” that terminal problems follow
- why pipes sometimes get “stuck”: buffering
- some terminal frustrations
- ASCII control characters in my terminal on “what’s the deal with Ctrl+A, Ctrl+B, Ctrl+C, etc?”
- entering text in the terminal is complicated
- what’s involved in getting a “modern” terminal setup?
- reasons to use your shell’s job control
- standards for ANSI escape codes, which is really me trying to figure out if I think the terminfo database is serving us well today

people who helped with this zine

A long time ago I used to write zines mostly by myself but with every project I get more and more help. I met with Marie Claire LeBlanc Flanagan every weekday from September to June to work on this one.

The cover is by Vladimir Kašiković, Lesley Trites did copy editing, Simon Tatham (who wrote PuTTY) did technical review, our Operations Manager Lee did the transcription as well as a million other things, and Jesse Luehrs (who is one of the very few people I know who actually understands the terminal’s cursed inner workings) had so many incredibly helpful conversations with me about what is going on in the terminal.

get the zine

Here are some links to get the zine again:

- get The Secret Rules of the Terminal
- get a 15-pack of all my zines here.

As always, you can get either a PDF version to print at home or a print version shipped to your house. The only caveat is print orders will ship in August – I need to wait for orders to come in to get an idea of how many I should print before sending it to the printer.

a week ago • 14 votes

Using `make` to compile C programs (for non-C-programmers)

I have never been a C programmer but every so often I need to compile a C/C++ program from source. This has been kind of a struggle for me: for a long time, my approach was basically “install the dependencies, run make, if it doesn’t work, either try to find a binary someone has compiled or give up”.

“Hope someone else has compiled it” worked pretty well when I was running Linux but since I’ve been using a Mac for the last couple of years I’ve been running into more situations where I have to actually compile programs myself.

So let’s talk about what you might have to do to compile a C program! I’ll use a couple of examples of specific C programs I’ve compiled and talk about a few things that can go wrong. Here are three programs we’ll be talking about compiling:

- paperjam
- sqlite
- qf (a pager you can run to quickly open files from a search with rg -n THING | qf)

step 1: install a C compiler

This is pretty simple: on an Ubuntu system if I don’t already have a C compiler I’ll install one with:

    sudo apt-get install build-essential

This installs gcc, g++, and make. The situation on a Mac is more confusing but it’s something like “install xcode command line tools”.

step 2: install the program’s dependencies

Unlike some newer programming languages, C doesn’t have a dependency manager. So if a program has any dependencies, you need to hunt them down yourself. Thankfully because of this, C programmers usually keep their dependencies very minimal and often the dependencies will be available in whatever package manager you’re using.

There’s almost always a section explaining how to get the dependencies in the README, for example in paperjam’s README, it says:

    To compile PaperJam, you need the headers for the libqpdf and libpaper libraries (usually available as libqpdf-dev and libpaper-dev packages).

    You may need a2x (found in AsciiDoc) for building manual pages.

So on a Debian-based system you can install the dependencies like this.
    sudo apt install -y libqpdf-dev libpaper-dev

If a README gives a name for a package (like libqpdf-dev), I’d basically always assume that they mean “in a Debian-based Linux distro”: if you’re on a Mac brew install libqpdf-dev will not work. I still have not 100% gotten the hang of developing on a Mac so I don’t have many tips there yet. I guess in this case it would be brew install qpdf if you’re using Homebrew.

step 3: run ./configure (if needed)

Some C programs come with a Makefile and some instead come with a script called ./configure. For example, if you download sqlite’s source code, it has a ./configure script in it instead of a Makefile.

My understanding of this ./configure script is:

- You run it, it prints out a lot of somewhat inscrutable output, and then it either generates a Makefile or fails because you’re missing some dependency
- The ./configure script is part of a system called autotools that I have never needed to learn anything about beyond “run it to generate a Makefile”.

I think there might be some options you can pass to get the ./configure script to produce a different Makefile but I have never done that.

step 4: run make

The next step is to run make to try to build a program. Some notes about make:

- Sometimes you can run make -j8 to parallelize the build and make it go faster
- It usually prints out a million compiler warnings when compiling the program. I always just ignore them. I didn’t write the software! The compiler warnings are not my problem.

compiler errors are often dependency problems

Here’s an error I got while compiling paperjam on my Mac:

    /opt/homebrew/Cellar/qpdf/12.0.0/include/qpdf/InputSource.hh:85:19: error: function definition does not declare parameters
       85 |     qpdf_offset_t last_offset{0};
          |                   ^

Over the years I’ve learned it’s usually best not to overthink problems like this: if it’s talking about qpdf, there’s a good chance it just means that I’ve done something wrong with how I’m including the qpdf dependency.
Now let’s talk about some ways to get the qpdf dependency included in the right way.

the world’s shortest introduction to the compiler and linker

Before we talk about how to fix dependency problems: building C programs is split into 2 steps:

1. Compiling the code into object files (with gcc or clang)
2. Linking those object files into a final binary (with ld)

It’s important to know this when building a C program because sometimes you need to pass the right flags to the compiler and linker to tell them where to find the dependencies for the program you’re compiling.

make uses environment variables to configure the compiler and linker

If I run make on my Mac to install paperjam, I get this error:

    c++ -o paperjam paperjam.o pdf-tools.o parse.o cmds.o pdf.o -lqpdf -lpaper
    ld: library 'qpdf' not found

This is not because qpdf is not installed on my system (it actually is!). But the compiler and linker don’t know how to find the qpdf library. To fix this, we need to:

- pass "-I/opt/homebrew/include" to the compiler (to tell it where to find the header files)
- pass "-L/opt/homebrew/lib -liconv" to the linker (to tell it where to find library files and to link in iconv)

And we can get make to pass those extra parameters to the compiler and linker using environment variables! To see how this works: inside paperjam’s Makefile you can see a bunch of environment variables, like LDLIBS here:

    paperjam: $(OBJS)
            $(LD) -o $@ $^ $(LDLIBS)

Everything you put into the LDLIBS environment variable gets passed to the linker (ld) as a command line argument.

secret environment variable: CPPFLAGS

Makefiles sometimes define their own environment variables that they pass to the compiler/linker, but make also has a bunch of “implicit” environment variables which it will automatically pass to the C compiler and linker. There’s a full list of implicit environment variables here, but one of them is CPPFLAGS, which gets automatically passed to the C compiler.

(technically it would be more normal to use CXXFLAGS for this, but this particular Makefile hardcodes CXXFLAGS so setting CPPFLAGS was the only way I could find to set the compiler flags without editing the Makefile)

how to use CPPFLAGS and LDLIBS to fix this compiler error

Now that we’ve talked about how CPPFLAGS and LDLIBS get passed to the compiler and linker, here’s the final incantation that I used to get the program to build successfully!

    CPPFLAGS="-I/opt/homebrew/include" LDLIBS="-L/opt/homebrew/lib -liconv" make paperjam

This passes -I/opt/homebrew/include to the compiler and -L/opt/homebrew/lib -liconv to the linker.

Also I don’t want to pretend that I “magically” knew that those were the right arguments to pass, figuring them out involved a bunch of confused Googling that I skipped over in this post. I will say that:

- the -I compiler flag tells the compiler which directory to find header files in, like /opt/homebrew/include/qpdf/QPDF.hh
- the -L linker flag tells the linker which directory to find libraries in, like /opt/homebrew/lib/libqpdf.a
- the -l linker flag tells the linker which libraries to link in, like -liconv means “link in the iconv library”, or -lm means “link math”

tip: how to just build 1 specific file: make $FILENAME

Yesterday I discovered this cool tool called qf which you can use to quickly open files from the output of ripgrep.

qf is in a big directory of various tools, but I only wanted to compile qf. So I just compiled qf, like this:

    make qf

Basically if you know (or can guess) the output filename of the file you’re trying to build, you can tell make to just build that file by running make $FILENAME

tip: look at how other packaging systems built the same C program

If you’re having trouble building a C program, maybe other people had problems building it too!
Every Linux distribution has build files for every package that they build, so even if you can’t install packages from that distribution directly, maybe you can get tips from that Linux distro for how to build the package. Realizing this (thanks to my friend Dave) was a huge ah-ha moment for me.

For example, this line from the nix package for paperjam says:

    env.NIX_LDFLAGS = lib.optionalString stdenv.hostPlatform.isDarwin "-liconv";

This is basically saying “pass the linker flag -liconv to build this on a Mac”, so that’s a clue we could use to build it.

That same file also says env.NIX_CFLAGS_COMPILE = "-DPOINTERHOLDER_TRANSITION=1";. I’m not sure what this means, but when I try to build the paperjam package I do get an error about something called a PointerHolder, so I guess that’s somehow related to the “PointerHolder transition”.

step 5: installing the binary

Once you’ve managed to compile the program, probably you want to install it somewhere! Some Makefiles have an install target that lets you install the tool on your system with make install. I’m always a bit scared of this (where is it going to put the files? what if I want to uninstall them later?), so if I’m compiling a pretty simple program I’ll often just manually copy the binary to install it instead, like this:

    cp qf ~/bin

step 6: maybe make your own package!

Once I figured out how to do all of this, I realized that I could use my new make knowledge to contribute a paperjam package to Homebrew! Then I could just brew install paperjam on future systems.

The good thing is that even if the details of all of the different packaging systems vary, they fundamentally all use C compilers and linkers.

it can be useful to understand a little about C even if you’re not a C programmer

I think all of this is an interesting example of how it can be useful to understand some basics of how C programs work (like “they have header files”) even if you’re never planning to write a nontrivial C program in your life.
It feels good to have some ability to compile C/C++ programs myself, even though I’m still not totally confident about all of the compiler and linker flags and I still plan to never learn anything about how autotools works other than “you run ./configure to generate the Makefile”. Also one important thing I left out is LD_LIBRARY_PATH / DYLD_LIBRARY_PATH (which you use to tell the dynamic linker at runtime where to find dynamically linked files) because I can’t remember the last time I ran into an LD_LIBRARY_PATH issue and couldn’t find an example.
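To make the CPPFLAGS / LDLIBS mechanism from the post a bit more concrete, here is a small sketch that can be run without paperjam or qpdf installed. It writes a throwaway Makefile (a stand-in invented for this example, not paperjam’s real Makefile) whose only target echoes the two variables, then runs make with the same environment variables as the incantation above. The function name show_flags and the target name show are made up.

```shell
#!/bin/sh
# Sketch: show that make picks up CPPFLAGS and LDLIBS from the environment.
# The Makefile written here is a throwaway stand-in, not paperjam's Makefile.

demo_dir=$(mktemp -d)

# Recipe lines in a Makefile must start with a tab, so printf writes \t explicitly.
printf 'show:\n\t@echo "compiler flags: $(CPPFLAGS)"\n\t@echo "linker flags: $(LDLIBS)"\n' > "$demo_dir/Makefile"

show_flags() {
    (
        cd "$demo_dir" || exit 1
        # Variables placed in front of a command are set only for that command.
        CPPFLAGS="-I/opt/homebrew/include" \
        LDLIBS="-L/opt/homebrew/lib -liconv" \
            make -s show
    )
}

if command -v make >/dev/null 2>&1; then
    show_flags
else
    echo "make is not installed; skipping the demo"
fi
```

Running make -s show a second time without the two variables in front prints empty flags, which is a quick way to see that the values really come from the environment rather than from the Makefile.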

3 weeks ago • 16 votes

Standards for ANSI escape codes

Hello! Today I want to talk about ANSI escape codes.

For a long time I was vaguely aware of ANSI escape codes (“that’s how you make text red in the terminal and stuff”) but I had no real understanding of where they were supposed to be defined or whether or not there were standards for them. I just had a kind of vague “there be dragons” feeling around them. While learning about the terminal this year, I’ve learned that:

1. ANSI escape codes are responsible for a lot of usability improvements in the terminal (did you know there’s a way to copy to your system clipboard when SSHed into a remote machine?? It’s an escape code called OSC 52!)
2. They aren’t completely standardized, and because of that they don’t always work reliably. And because they’re also invisible, it’s extremely frustrating to troubleshoot escape code issues.

So I wanted to put together a list for myself of some standards that exist around escape codes, because I want to know if they have to feel unreliable and frustrating, or if there’s a future where we could all rely on them with more confidence.

- what’s an escape code?
- ECMA-48
- xterm control sequences
- terminfo
- should programs use terminfo?
- is there a “single common set” of escape codes?
- some reasons to use terminfo
- some more documents/standards
- why I think this is interesting

what’s an escape code?

Have you ever pressed the left arrow key in your terminal and seen ^[[D? That’s an escape code! It’s called an “escape code” because the first character is the “escape” character, which is usually written as ESC, \x1b, \E, \033, or ^[.

Escape codes are how your terminal emulator communicates various kinds of information (colours, mouse movement, etc) with programs running in the terminal. There are two kinds of escape codes:

1. input codes which your terminal emulator sends for keypresses or mouse movements that don’t fit into Unicode. For example “left arrow key” is ESC[D, “Ctrl+left arrow” might be ESC[1;5D, and clicking the mouse might be something like ESC[M :3.
2. output codes which programs can print out to colour text, move the cursor around, clear the screen, hide the cursor, copy text to the clipboard, enable mouse reporting, set the window title, etc.

Now let’s talk about standards!

ECMA-48

The first standard I found relating to escape codes was ECMA-48, which was originally published in 1976.

ECMA-48 does two things:

1. Define some general formats for escape codes (like “CSI” codes, which are ESC[ + something and “OSC” codes, which are ESC] + something)
2. Define some specific escape codes, like how “move the cursor to the left” is ESC[D, or “turn text red” is ESC[31m. In the spec, the “cursor left” one is called CURSOR LEFT and the one for changing colours is called SELECT GRAPHIC RENDITION.

The formats are extensible, so there’s room for others to define more escape codes in the future. Lots of escape codes that are popular today aren’t defined in ECMA-48: for example it’s pretty common for terminal applications (like vim, htop, or tmux) to support using the mouse, but ECMA-48 doesn’t define escape codes for the mouse.

xterm control sequences

There are a bunch of escape codes that aren’t defined in ECMA-48, for example:

- enabling mouse reporting (where did you click in your terminal?)
- bracketed paste (did you paste that text or type it in?)
- OSC 52 (which terminal applications can use to copy text to your system clipboard)

I believe (correct me if I’m wrong!) that these and some others came from xterm, are documented in XTerm Control Sequences, and have been widely implemented by other terminal emulators.

This list of “what xterm supports” is not a standard exactly, but xterm is extremely influential and so it seems like an important document.
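Here is a small sketch of two of the output codes mentioned so far: the ECMA-48 “turn text red” code (ESC[31m) and OSC 52. The function names red and osc52_copy are made up for this example. Two things are assumptions beyond the post: ESC[0m is used to reset the colour afterwards, and the OSC 52 layout (ESC ] 52 ; c ; base64 text, ended by BEL) follows my reading of the XTerm Control Sequences document. Whether the clipboard actually changes depends on the terminal emulator supporting OSC 52 and allowing it.

```shell
# ESC is written as \033 here; \x1b, \E and ^[ are other spellings of the same byte.

red() {
    # SELECT GRAPHIC RENDITION: ESC[31m switches to red text,
    # ESC[0m (assumed reset code) switches back to the default rendition.
    printf '\033[31m%s\033[0m' "$1"
}

osc52_copy() {
    # OSC 52: ESC ] 52 ; c ; <base64 of the text> BEL
    # "c" selects the clipboard; tr strips the line wrapping some base64 tools add.
    encoded=$(printf '%s' "$1" | base64 | tr -d '\n')
    printf '\033]52;c;%s\a' "$encoded"
}

red "this text is red"; echo

# cat -v makes the invisible escape character visible as ^[
red "this text is red" | cat -v; echo

osc52_copy "copied via OSC 52"
```

Because the codes are just bytes written to stdout, the same functions work over SSH: the bytes travel back to the local terminal emulator, which is what interprets them.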
terminfo

In the 80s (and to some extent today, but my understanding is that it was MUCH more dramatic in the 80s) there was a huge amount of variation in what escape codes terminals actually supported.

To deal with this, there’s a database of escape codes for various terminals called “terminfo”.

It looks like the standard for terminfo is called X/Open Curses, though you need to create an account to view that standard for some reason. It defines the database format as well as a C library interface (“curses”) for accessing the database.

For example you can run this bash snippet to see every possible escape code for “clear screen” for all of the different terminals your system knows about:

    for term in $(toe -a | awk '{print $1}')
    do
      echo $term
      infocmp -1 -T "$term" 2>/dev/null | grep 'clear=' | sed 's/clear=//g;s/,//g'
    done

On my system (and probably every system I’ve ever used?), the terminfo database is managed by ncurses.

should programs use terminfo?

I think it’s interesting that there are two main approaches that applications take to handling ANSI escape codes:

1. Use the terminfo database to figure out which escape codes to use, depending on what’s in the TERM environment variable. Fish does this, for example.
2. Identify a “single common set” of escape codes which works in “enough” terminal emulators and just hardcode those.

Some examples of programs/libraries that take approach #2 (“don’t use terminfo”) include:

- kakoune
- python-prompt-toolkit
- linenoise
- libvaxis
- chalk

I got curious about why folks might be moving away from terminfo and I found this very interesting and extremely detailed rant about terminfo from one of the fish maintainers, which argues that:

    [the terminfo authors] have done a lot of work that, at the time, was extremely important and helpful. My point is that it no longer is.

I’m not going to do it justice so I’m not going to summarize it, I think it’s worth reading.

is there a “single common set” of escape codes?
I was just talking about the idea that you can use a “common set” of escape codes that will work for most people. But what is that set? Is there any agreement?

I really do not know the answer to this at all, but from doing some reading it seems like it’s some combination of:

- The codes that the VT100 supported (though some aren’t relevant on modern terminals)
- what’s in ECMA-48 (which I think also has some things that are no longer relevant)
- What xterm supports (though I’d guess that not everything in there is actually widely supported enough)

and maybe ultimately “identify the terminal emulators you think your users are going to use most frequently and test in those”, the same way web developers do when deciding which CSS features are okay to use

I don’t think there are any resources like Can I use…? or Baseline for the terminal though. (in theory terminfo is supposed to be the “caniuse” for the terminal but it seems like it often takes 10+ years to add new terminal features when people invent them which makes it very limited)

some reasons to use terminfo

I also asked on Mastodon why people found terminfo valuable in 2025 and got a few reasons that made sense to me:

- some people expect to be able to use the TERM environment variable to control how programs behave (for example with TERM=dumb), and there’s no standard for how that should work in a post-terminfo world
- even though there’s less variation between terminal emulators than there was in the 80s, there’s far from zero variation: there are graphical terminals, the Linux framebuffer console, the situation you’re in when connecting to a server via its serial console, Emacs shell mode, and probably more that I’m missing
- there is no one standard for what the “single common set” of escape codes is, and sometimes programs use escape codes which aren’t actually widely supported enough

some more documents/standards

A few more documents and standards related to escape codes, in no particular order:

- the Linux console_codes man page documents escape codes that Linux supports
- how the VT 100 handles escape codes & control sequences
- the kitty keyboard protocol
- OSC 8 for links in the terminal (and notes on adoption)
- A summary of ANSI standards from tmux
- this terminal features reporting specification from iTerm
- sixel graphics

why I think this is interesting

I sometimes see people saying that the unix terminal is “outdated”, and since I love the terminal so much I’m always curious about what incremental changes might make it feel less “outdated”.

Maybe if we had a clearer standards landscape (like we do on the web!) it would be easier for terminal emulator developers to build new features and for authors of terminal applications to more confidently adopt those features so that we can all benefit from them and have a richer experience in the terminal.

Obviously standardizing ANSI escape codes is not easy (ECMA-48 was first published almost 50 years ago and we’re still not there!). But the situation with HTML/CSS/JS used to be extremely bad too and now it’s MUCH better, so maybe there’s hope.
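The two approaches from the “should programs use terminfo?” section can be put side by side in a few lines of shell. This is only a sketch: the function names red_start and attrs_reset are made up, tput is used as the command-line way to query terminfo, and treating an unset TERM the same as TERM=dumb is an assumption of this example rather than any standard.

```shell
# Print the "start red text" sequence.
red_start() {
    case "${TERM:-dumb}" in
        dumb) return 0 ;;        # TERM=dumb (or unset, by assumption): print no escape codes
    esac
    # Approach 1: ask terminfo (through tput) for this terminal's "setaf" capability.
    if command -v tput >/dev/null 2>&1 && tput setaf 1 2>/dev/null; then
        return 0                 # tput already printed the sequence
    fi
    # Approach 2: fall back to the hardcoded ECMA-48 code for red text.
    printf '\033[31m'
}

# Print the "back to normal" sequence, with the same two-step logic.
attrs_reset() {
    case "${TERM:-dumb}" in
        dumb) return 0 ;;
    esac
    if command -v tput >/dev/null 2>&1 && tput sgr0 2>/dev/null; then
        return 0
    fi
    printf '\033[0m'
}

red_start; printf 'warning'; attrs_reset; echo
```

Running the last line as TERM=dumb sh script.sh prints plain text with no escape codes, which is the kind of TERM-based control the Mastodon replies describe.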

3 months ago • 38 votes

Some terminal frustrations

A few weeks ago I ran a terminal survey (you can read the results here) and at the end I asked:

    What’s the most frustrating thing about using the terminal for you?

1600 people answered, and I decided to spend a few days categorizing all the responses. Along the way I learned that classifying qualitative data is not easy but I gave it my best shot. I ended up building a custom tool to make it faster to categorize everything.

As with all of my surveys the methodology isn’t particularly scientific. I just posted the survey to Mastodon and Twitter, ran it for a couple of days, and got answers from whoever happened to see it and felt like responding.

Here are the top categories of frustrations!

I think it’s worth keeping in mind while reading these comments that

- 40% of people answering this survey have been using the terminal for 21+ years
- 95% of people answering the survey have been using the terminal for at least 4 years

These comments aren’t coming from total beginners.

Here are the categories of frustrations! The number in brackets is the number of people with that frustration. Honestly I don’t know how interesting this is to other people – I’m just writing this up for myself because I’m trying to write a zine about the terminal and I wanted to get a sense for what people are having trouble with.

remembering syntax (115)

People talked about struggles remembering:

- the syntax for CLI tools like awk, jq, sed, etc
- the syntax for redirects
- keyboard shortcuts for tmux, text editing, etc

One example comment:

    There are just so many little “trivia” details to remember for full functionality. Even after all these years I’ll sometimes forget where it’s 2 or 1 for stderr, or forget which is which for > and >>.
switching terminals is hard (91)

People talked about struggling with switching systems (for example home/work computer or when SSHing) and running into:

- OS differences in keyboard shortcuts (like Linux vs Mac)
- systems which don’t have their preferred text editor (“no vim” or “only vim”)
- different versions of the same command (like Mac OS grep vs GNU grep)
- no tab completion
- a shell they aren’t used to (“the subtle differences between zsh and bash”)

as well as differences inside the same system like pagers being not consistent with each other (git diff pagers, other pagers).

One example comment:

    I got used to fish and vi mode which are not available when I ssh into servers, containers.

color (85)

Lots of problems with color, like:

- programs setting colors that are unreadable with a light background color
- finding a colorscheme they like (and getting it to work consistently across different apps)
- color not working inside several layers of SSH/tmux/etc
- not liking the defaults
- not wanting color at all and struggling to turn it off

This comment felt relatable to me:

    Getting my terminal theme configured in a reasonable way between the terminal emulator and fish (I did this years ago and remember it being tedious and fiddly and now feel like I’m locked into my current theme because it works and I dread touching any of that configuration ever again).

keyboard shortcuts (84)

Half of the comments on keyboard shortcuts were about how on Linux/Windows, the keyboard shortcut to copy/paste in the terminal is different from in the rest of the OS.
Some other issues with keyboard shortcuts other than copy/paste:

- using Ctrl-W in a browser-based terminal and closing the window
- the terminal only supports a limited set of keyboard shortcuts (no Ctrl-Shift-, no Super, no Hyper, lots of ctrl- shortcuts aren’t possible like Ctrl-,)
- the OS stopping you from using a terminal keyboard shortcut (like by default Mac OS uses Ctrl+left arrow for something else)
- issues using emacs in the terminal
- backspace not working (2)

other copy and paste issues (75)

Aside from “the keyboard shortcut for copy and paste is different”, there were a lot of OTHER issues with copy and paste, like:

- copying over SSH
- how tmux and the terminal emulator both do copy/paste in different ways
- dealing with many different clipboards (system clipboard, vim clipboard, the “middle click” clipboard on Linux, tmux’s clipboard, etc) and potentially synchronizing them
- random spaces added when copying from the terminal
- pasting multiline commands which automatically get run in a terrifying way
- wanting a way to copy text without using the mouse

discoverability (55)

There were lots of comments about this, which all came down to the same basic complaint – it’s hard to discover useful tools or features! This comment kind of summed it all up:

    How difficult it is to learn independently. Most of what I know is an assorted collection of stuff I’ve been told by random people over the years.

steep learning curve (44)

A lot of comments about it generally having a steep learning curve. A couple of example comments:

    After 15 years of using it, I’m not much faster than using it than I was 5 or maybe even 10 years ago.

and

    That I know I could make my life easier by learning more about the shortcuts and commands and configuring the terminal but I don’t spend the time because it feels overwhelming.
history (42)

Some issues with shell history: history not being shared between terminal tabs (16); limits that are too short (4); history not being restored when terminal tabs are restored; losing history because the terminal crashed; not knowing how to search history.

One example comment:

It wasted a lot of time until I figured it out and still annoys me that “history” on zsh has such a small buffer; I have to type “history 0” to get any useful length of history.

bad documentation (37)

People talked about: documentation being generally opaque; lack of examples in man pages; programs which don’t have man pages.

Here’s a representative comment:

Finding good examples and docs. Man pages often not enough, have to wade through stack overflow

scrollback (36)

A few issues with scrollback: programs printing out too much data making you lose scrollback history; resizing the terminal messes up the scrollback; lack of timestamps; GUI programs that you start in the background printing stuff out that gets in the way of other programs’ outputs.

One example comment:

When resizing the terminal (in particular: making it narrower) leads to broken rewrapping of the scrollback content because the commands formatted their output based on the terminal window width.

“it feels outdated” (33)

Lots of comments about how the terminal feels hampered by legacy decisions and how users often end up needing to learn implementation details that feel very esoteric. One example comment:

Most of the legacy cruft, it would be great to have a green field implementation of the CLI interface.

shell scripting (32)

Lots of complaints about POSIX shell scripting. There’s a general feeling that shell scripting is difficult but also that switching to a different less standard scripting language (fish, nushell, etc) brings its own problems. One example comment:

Shell scripting. My tolerance to ditch a shell script and go to a scripting language is pretty low. It’s just too messy and powerful. Screwing up can be costly so I don’t even bother.
more issues

Some more issues that were mentioned at least 10 times:

(31) inconsistent command line arguments: is it -h or help or --help?
(24) keeping dotfiles in sync across different systems
(23) performance (e.g. “my shell takes too long to start”)
(20) window management (potentially with some combination of tmux tabs, terminal tabs, and multiple terminal windows. Where did that shell session go?)
(17) generally feeling scared/uneasy (“The debilitating fear that I’m going to do some mysterious Bad Thing with a command and I will have absolutely no idea how to fix or undo it or even really figure out what happened”)
(16) terminfo issues (“Having to learn about terminfo if/when I try a new terminal emulator and ssh elsewhere.”)
(16) lack of image support (sixel etc)
(15) SSH issues (like having to start over when you lose the SSH connection)
(15) various tmux/screen issues (for example lack of integration between tmux and the terminal emulator)
(15) typos & slow typing
(13) the terminal getting messed up for various reasons (pressing Ctrl-S, cating a binary, etc)

that’s all!

I’m not going to make a lot of commentary on these results, but here are a couple of categories that feel related to me:

remembering syntax & history (often the thing you need to remember is something you’ve run before!)
discoverability & the learning curve (the lack of discoverability is definitely a big part of what makes it hard to learn)

4 months ago • 36 votes

More in programming

Logical Quantifiers in Software

I realize that for all I've talked about Logic for Programmers in this newsletter, I never once explained basic logical quantifiers. They're both simple and incredibly useful, so let's do that this week!

Sets and quantifiers

A set is a collection of unordered, unique elements. {1, 2, 3, …} is a set, as are "every programming language", "every programming language's Wikipedia page", and "every function ever defined in any programming language's standard library". You can put whatever you want in a set, with some very specific limitations to avoid certain paradoxes.[2]

Once we have a set, we can ask "is something true for all elements of the set" and "is something true for at least one element of the set?" IE, is it true that every programming language has a set collection type in the core language? We would write it like this:

    # all of them
    all l in ProgrammingLanguages: HasSetType(l)

    # at least one
    some l in ProgrammingLanguages: HasSetType(l)

This is the notation I use in the book because it's easy to read, type, and search for. Mathematicians historically had a few different formats; the one I grew up with was ∀x ∈ set: P(x) to mean all x in set, and ∃ to mean some. I use these when writing for just myself, but find them confusing to programmers when communicating.

"All" and "some" are respectively referred to as "universal" and "existential" quantifiers.

Some cool properties

We can simplify expressions with quantifiers, in the same way that we can simplify !(x && y) to !x || !y.

First of all, quantifiers are commutative with themselves. some x: some y: P(x,y) is the same as some y: some x: P(x, y). For this reason we can write some x, y: P(x,y) as shorthand. We can even do this when quantifying over different sets, writing some x, x' in X, y in Y instead of some x, x' in X: some y in Y. We cannot do this with "alternating quantifiers": all p in Person: some m in Person: Mother(m, p) says that every person has a mother.
some m in Person: all p in Person: Mother(m, p) says that someone is every person's mother.

Second, existentials distribute over || while universals distribute over &&. "There is some url which returns a 403 or 404" is the same as "there is some url which returns a 403 or some url that returns a 404", and "all PRs pass the linter and the test suites" is the same as "all PRs pass the linter and all PRs pass the test suites".

Finally, some and all are duals: some x: P(x) == !(all x: !P(x)), and vice-versa. Intuitively: if some file is malicious, it's not true that all files are benign.

All these rules together mean we can manipulate quantifiers almost as easily as we can manipulate regular booleans, putting them in whatever form is easiest to use in programming. Speaking of which, how do we use this in programming?

How we use this in programming

First of all, people clearly have a need for directly using quantifiers in code. If we have something of the form:

    for x in list:
        if P(x):
            return true
    return false

That's just some x in list: P(x). And this is a prevalent pattern, as you can see by using GitHub code search. It finds over 500k examples of this pattern in Python alone! That can be simplified via using the language's built-in quantifiers: the Python would be any(P(x) for x in list). (Note this is not quantifying over sets but iterables. But the idea translates cleanly enough.)

More generally, quantifiers are a key way we express higher-level properties of software. What does it mean for a list to be sorted in ascending order? That all i, j in 0..<len(l): if i < j then l[i] <= l[j]. When should a ratchet test fail? When some f in functions - exceptions: Uses(f, bad_function). Should the image classifier work upside down? all i in images: classify(i) == classify(rotate(i, 180)). These are the properties we verify with tests and types and MISU and whatnot;[1] it helps to be able to make them explicit!
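As a rough illustration of the properties above, here is a small Python sketch using the built-in any and all. The helper names some and every, the file list, and the is_malicious predicate are made up for this example; only any, all, and itertools.combinations come from Python itself.

```python
from itertools import combinations


def some(pred, xs):
    # Existential quantifier: some x in xs: pred(x)
    return any(pred(x) for x in xs)


def every(pred, xs):
    # Universal quantifier: all x in xs: pred(x)
    return all(pred(x) for x in xs)


def is_sorted(l):
    # all i, j in 0..<len(l): if i < j then l[i] <= l[j]
    # combinations(range(n), 2) yields exactly the index pairs with i < j.
    return all(l[i] <= l[j] for i, j in combinations(range(len(l)), 2))


# Hypothetical data and predicate, purely for illustration.
files = ["notes.txt", "setup.exe", "readme.md"]


def is_malicious(name):
    return name.endswith(".exe")


# Duality: some x: P(x) == !(all x: !P(x))
lhs = some(is_malicious, files)
rhs = not every(lambda f: not is_malicious(f), files)
duality_holds = lhs == rhs
```

Checking every index pair is quadratic, so this is_sorted is a direct transcription of the property rather than something to ship; comparing adjacent elements is the usual linear-time form.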
One cool use case that'll be in the book's next version: database invariants are universal statements over the set of all records, like all a in accounts: a.balance > 0. That's enforceable with a CHECK constraint. But what about something like all i, i' in intervals: NoOverlap(i, i')? That isn't covered by CHECK, since it spans two rows.

Quantifier duality to the rescue! The invariant is equivalent to !(some i, i' in intervals: Overlap(i, i')), so is preserved if the query SELECT COUNT(*) FROM intervals CROSS JOIN intervals … returns 0. This means we can test it via a database trigger.[3]

There are a lot more use cases for quantifiers, but this is enough to introduce the ideas! Next week's the one year anniversary of the book entering early access, so I'll be writing a bit about that experience and how the book changed. It's crazy how crude v0.1 was compared to the current version.

[1] MISU ("make illegal states unrepresentable") means using data representations that rule out invalid values. For example, if you have a location -> Optional(item) lookup and want to make sure that each item is in exactly one location, consider instead changing the map to item -> location. This is a means of implementing the property all i in item, l, l' in location: if ItemIn(i, l) && l != l' then !ItemIn(i, l').

[2] Specifically, a set can't be an element of itself, which rules out constructing things like "the set of all sets" or "the set of sets that don't contain themselves".

[3] Though note that when you're inserting or updating an interval, you already have that row's fields in the trigger's NEW keyword. So you can just query !(some i in intervals: Overlap(new, i)), which is more efficient.
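The interval invariant above could be checked along these lines, shown here with Python's sqlite3 module. The table layout (id, lo, hi), the half-open interval convention, and the overlap condition are assumptions made for this sketch; the article elides the actual query.

```python
import sqlite3


def count_overlaps(conn):
    # Counts unordered pairs of distinct rows whose half-open [lo, hi)
    # intervals overlap. a.id < b.id skips self-pairs and mirrored pairs.
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM intervals AS a CROSS JOIN intervals AS b
        WHERE a.id < b.id AND a.lo < b.hi AND b.lo < a.hi
        """
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
# Hypothetical schema, for illustration only.
conn.execute(
    "CREATE TABLE intervals (id INTEGER PRIMARY KEY, lo INTEGER, hi INTEGER)"
)
conn.executemany(
    "INSERT INTO intervals (lo, hi) VALUES (?, ?)",
    [(0, 5), (5, 10), (20, 30)],
)
invariant_before = count_overlaps(conn) == 0

# This row overlaps [5, 10), so the invariant no longer holds.
conn.execute("INSERT INTO intervals (lo, hi) VALUES (?, ?)", (8, 12))
invariant_after = count_overlaps(conn) == 0
```

The invariant holds exactly when the count is 0, which is the !(some i, i': Overlap(i, i')) form of the universal statement.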

17 hours ago • 3 votes

Setting Element Ordering With HTML Rewriter Using CSS

After shipping my work transforming HTML with Netlify’s edge functions I realized I have a little bug: the order of the icons specified in the URL doesn’t match the order in which they are displayed on screen.

Why’s this happening? I have a bunch of links in my HTML document, like this:

    <icon-list>
      <a href="/1/">…</a>
      <a href="/2/">…</a>
      <a href="/3/">…</a>
    </icon-list>

I use html-rewriter in my edge function to strip out the HTML for icons not specified in the URL. So for a request to:

    /lookup?id=1&id=2

My HTML will be transformed like so:

    <icon-list>
      <a href="/1/">…</a>
      <a href="/2/">…</a>
    </icon-list>

Resulting in less HTML over the wire to the client.

But what about the order of the IDs in the URL? What if the request is to:

    /lookup?id=2&id=1

Instead of:

    /lookup?id=1&id=2

In the source HTML document containing all the icons, they’re marked up in reverse chronological order. But the request for this page may specify a different order for icons in the URL. So how do I rewrite the HTML to match the URL’s ordering?

The problem is that html-rewriter doesn’t give me a fully-parsed DOM to work with. I can’t do things like “move this node to the top” or “move this node to position x”. With html-rewriter, you only “see” each element as it streams past. Once it passes by, your chance at modifying it is gone. (It seems that’s just the way these edge function tools are designed to work: it keeps them lean and performant, and I can’t shoot myself in the foot.)

So how do I change the icon’s display order to match what’s in the URL if I can’t modify the order of the elements in the HTML? CSS to the rescue! Because my markup is just a bunch of <a> tags inside a custom element and I’m using CSS grid for layout, I can use the order property in CSS! All the IDs are in the URL, and their position as parameters has meaning, so I assign their ordering to each element as it passes by html-rewriter.
Here’s some pseudo code:

    // Get all the IDs in the URL
    const ids = url.searchParams.getAll("id");

    // Select all the icons in the HTML
    rewriter.on("icon-list a", {
      element: (element) => {
        // Get the ID
        const id = element.getAttribute('id');

        if (ids.includes(id)) {
          // If it's in our list, set its order
          // position from the URL
          const order = ids.indexOf(id);
          element.setAttribute(
            "style",
            `order: ${order}`
          );
        } else {
          // Otherwise, remove it
          element.remove();
        }
      },
    });

Boom! I didn’t have to change the order in the source HTML document, but I can still get the display order to match what’s in the URL. I love shifty little workarounds like this!
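The ordering logic in the pseudo code above does not depend on html-rewriter, so it can be sketched separately. The following Python version is only an illustration of the idea: the function name order_styles and the list of icon IDs are hypothetical, and it uses the standard urllib.parse module instead of any edge-function API.

```python
from urllib.parse import parse_qs, urlsplit


def order_styles(url, icon_ids):
    # IDs in the order they appear in the query string, e.g. ["2", "1"].
    requested = parse_qs(urlsplit(url).query).get("id", [])

    # Position of the first occurrence of each ID (mirrors indexOf).
    position = {}
    for index, icon_id in enumerate(requested):
        position.setdefault(icon_id, index)

    # Icons that were requested get an inline style with their CSS order;
    # icons missing from the result are the ones that would be removed.
    return {
        icon_id: f"order: {position[icon_id]}"
        for icon_id in icon_ids
        if icon_id in position
    }


styles = order_styles("/lookup?id=2&id=1", ["1", "2", "3"])
```

Building the position lookup once avoids scanning the ID list for every element, which only matters if the number of requested icons is large.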

18 hours ago • 2 votes

The missing part of Espressif’s reset circuit

In the previous article, we peeked at the reset circuit of ESP-Prog with an oscilloscope, and reproduced it with basic components. We observed that it did not behave quite as expected. In this article, we’ll look into the missing pieces.

An incomplete circuit

For a hint, we’ll first look a bit more closely at the …

The post The missing part of Espressif’s reset circuit appeared first on Quentin Santos.

17 hours ago • 2 votes

clamp / median / range

Here are a few tangentially-related ideas vaguely near the theme of comparison operators.

comparison style
clamp style
clamp is median
clamp in range
range style
style clash?

comparison style

Some languages such as BCPL, Icon, Python have chained comparison operators, like

    if min <= x <= max:
        ...

In languages without chained comparison, I like to write comparisons as if they were chained, like,

    if min <= x && x <= max {
        // ...
    }

A rule of thumb is to prefer less than (or equal) operators and avoid greater than. In a sequence of comparisons, order values from (expected) least to greatest.

clamp style

The clamp() function ensures a value is between some min and max,

    def clamp(min, x, max):
        if x < min: return min
        if max < x: return max
        return x

I like to order its arguments matching the expected order of the values, following my rule of thumb for comparisons. (I used that flavour of clamp() in my article about GCRA.) But I seem to be unusual in this preference, based on a few examples I have seen recently.

clamp is median

Last month, Fabian Giesen pointed out a way to resolve this difference of opinion: A function that returns the median of three values is equivalent to a clamp() function that doesn’t care about the order of its arguments.

This version is written so that it returns NaN if any of its arguments is NaN. (When an argument is NaN, both of its comparisons will be false.)
    fn med3(a: f64, b: f64, c: f64) -> f64 {
        match (a <= b, b <= c, c <= a) {
            (false, false, false) => f64::NAN,
            (false, false, true) => b, // a > b > c
            (false, true, false) => a, // c > a > b
            (false, true, true) => c, // b <= c <= a
            (true, false, false) => c, // b > c > a
            (true, false, true) => a, // c <= a <= b
            (true, true, false) => b, // a <= b <= c
            (true, true, true) => b, // a == b == c
        }
    }

When two of its arguments are constant, med3() should compile to the same code as a simple clamp(); but med3()’s misuse-resistance comes at a small cost when the arguments are not known at compile time.

clamp in range

If your language has proper range types, there is a nicer way to make clamp() resistant to misuse:

    fn clamp(x: f64, r: RangeInclusive<f64>) -> f64 {
        let (&min,&max) = (r.start(), r.end());
        if x < min { return min }
        if max < x { return max }
        return x;
    }

    let x = clamp(x, MIN..=MAX);

range style

For a long time I have been fond of the idea of a simple counting for loop that matches the syntax of chained comparisons, like

    for min <= x <= max:
        ...

By itself this is silly: too cute and too ad-hoc.

I’m also dissatisfied with the range or slice syntax in basically every programming language I’ve seen. I thought it might be nice if the cute comparison and iteration syntaxes were aspects of a more generally useful range syntax, but I couldn’t make it work.

Until recently when I realised I could make use of prefix or mixfix syntax, instead of confining myself to infix. So now my fantasy pet range syntax looks like

    >= min < max // half-open
    >= min <= max // inclusive

And you might use it in a pattern match

    if x is >= min < max {
        // ...
    }

Or as an iterator

    for x in >= min < max {
        // ...
    }

Or to take a slice

    xs[>= min < max]

style clash?

It’s kind of ironic that these range examples don’t follow the left-to-right, lesser-to-greater rule of thumb that this post started off with. (x is not lexically between min and max!)
But that rule of thumb is really intended for languages such as C that don’t have ranges. Careful stylistic conventions can help to avoid mistakes in nontrivial conditional expressions. It’s much better if language and library features reduce the need for nontrivial conditions and catch mistakes automatically.
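To make the “clamp is median” observation concrete, here is a small Python sketch. It is not a translation of the Rust med3() above: it takes the median by sorting, which is simpler to read but is a swapped-in technique, and it handles NaN with an explicit check. The argument names lo and hi are chosen for this example.

```python
import math
from itertools import permutations


def clamp(lo, x, hi):
    # Order-sensitive clamp: arguments follow the expected order lo <= x <= hi.
    if x < lo:
        return lo
    if hi < x:
        return hi
    return x


def med3(a, b, c):
    # Median of three values; NaN in, NaN out.
    if math.isnan(a) or math.isnan(b) or math.isnan(c):
        return math.nan
    return sorted((a, b, c))[1]


# med3 gives the clamp() result no matter how the arguments are ordered.
lo, hi = 0, 10
order_independent = all(
    med3(*args) == clamp(lo, x, hi)
    for x in (-3, 0, 5, 10, 15)
    for args in permutations((lo, x, hi))
)
```

The check at the end only exercises a handful of integer values; it is a sanity check of the equivalence when lo <= hi, not a proof.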

yesterday • 3 votes

C++ engineering decision in SumatraPDF code

SumatraPDF is a medium size (120k+ loc, not counting dependencies) Windows GUI (win32) C++ code base started by me and written by mostly 2 people.

The goals of SumatraPDF are to be: fast, small, packed with features, and yet with a thoughtfully minimal UI.

It’s not just a matter of pride in craftsmanship of writing code. I believe being fast and small are a big reason for SumatraPDF’s success. People notice when an app starts in an instant because that’s sadly not the norm in modern software.

The engineering goals of SumatraPDF are: reliable (no crashes); fast compilation to enable fast iteration.

SumatraPDF has been successful achieving those objectives so I’m writing up my C++ implementation decisions. I know those decisions are controversial. Maybe not Terry Davis level of controversial but still. You probably won’t adopt them. Even if you wanted to, you probably couldn’t. There’s no way code like this would pass Google review. Not because it’s bad but because it’s different. Diverging from mainstream this much is only feasible if you have total control: it’s your company or your own open-source project. If my ideas were just like everyone else’s ideas, there would be little point in writing about them, would there?

Use UTF8 strings internally

My app only runs on Windows and a string native to Windows is WCHAR* where each character consumes 2 bytes. Despite that I mostly use char* assumed to be utf8-encoded. I only decided on that after lots of code was written so it was a refactoring odyssey that is still ongoing. My initial impetus was to be able to compile non-GUI parts under Linux and Mac. I abandoned that goal but I think that’s a good idea anyway. WCHAR* strings are 2x larger than char*. That’s more memory used which also makes the app slower. Binaries are bigger if string constants are WCHAR*. The implementation rule is simple: I only convert to WCHAR* when calling Windows API. When Windows API returns WCHAR* I convert it to utf-8.
No exceptions

Do you want to hear a joke? “Zero-cost exceptions”. Throwing and catching exceptions generate bloated code. Exceptions are a non-local control flow that makes it hard to reason about a program. Every memory allocation becomes a potential leak. But RAII, you protest. RAII is a “solution” to a problem created by exceptions. How about I don’t create the problem in the first place.

Hard core #include discipline

I wrote about it in depth.

My objects are not shy

I don’t bother with private and protected. struct is just class with guts exposed by default, so I use that. While intellectually I understand the reasoning behind hiding implementation details, in practice it becomes busy work of typing noise and then even more typing when you change your mind about visibility. I’m the only person working on the code so I don’t need to force those of lesser intellect to write the code properly.

My objects are shy

At the same time I minimize what goes into a class, especially methods. The smaller the class, the faster the build. A common problem is adding too many methods to a class. You have a StrVec class for array of strings. A lesser programmer is tempted to add Join(const char* sep) method to StrVec. A wise programmer makes it a stand-alone function: Join(const StrVec& v, const char* sep). This is enabled by making everything in a class public. If you limit visibility you then have to use friend to allow Join() function access what it needs. Another example of “solution” to self-inflicted problems.

Minimize #ifdef

#ifdef is problematic because it creates code paths that I don’t always build. I provide arm64, intel 32-bit and 64-bit builds but typically only develop with 64-bit intel build. Every #ifdef that branches on architecture introduces potential for compilation error which I’ll only know about when my daily ci build fails.
Consider 2 possible implementations of IsProcess64Bit():

Bad:

    bool IsProcess64Bit() {
    #ifdef _WIN64
        return true;
    #else
        return false;
    #endif
    }

Good:

    bool IsProcess64Bit() {
        return sizeof(uintptr_t) == 8;
    }

The bad version has a bug: it was correct when I was only doing intel builds but became buggy when I added arm64 builds. This conflicts with the goal of smallest possible size but it’s worth it.

Stress testing

SumatraPDF supports a lot of very complex document and image formats. Complex formats require complex code that is likely to have bugs. I also have lots of files in those formats. I’ve added stress testing functionality where I point SumatraPDF to a folder with files and tell it to render all of them. For greater coverage, I also simulate some of the possible UI actions users can take like searching, switching view modes etc.

Crash reporting

I wrote about it in depth.

Heavy use of CrashIf()

C/C++ programmers are familiar with assert() macro. CrashIf() is my version of that, tailored to my needs. The purpose of assert / CrashIf is to add checks to detect incorrect use of APIs or invalid states in the program. For example, if the code tries to access an element of an array at an invalid index (negative or larger than size of the array), it indicates a bug in the program. I want to be notified about such bugs both when I test SumatraPDF and when it runs on users’ computers. As the name implies, it’ll crash (by de-referencing null pointer) and therefore generate a crash report. It’s enabled in debug and pre-release builds but not in release builds. Release builds have many, many users so I worry about too many crash reports.

premake to generate Visual Studio solution

Visual Studio uses XML files as a list of files in the project and build format. The format is impossible to work with in a text editor so you have no choice but to use Visual Studio to edit the project / solution.
To add a new file: find the right UI element, click here, click there, pick a file using file picker, click again. To change a compilation setting of a project or a file? Find the right UI element, click here, click there, type this, confirm that. You accidentally changed compilation settings of 1 file out of a hundred? Good luck figuring out which one. Go over all files in UI one by one. In other words: managing project files using Visual Studio UI is a nightmare.

Premake is a solution. It’s a meta-build system. You define your build using lua scripts, which look like text configuration files. Premake then can generate Visual Studio projects, Xcode projects, makefiles etc. That’s the meta part. It was truly a life saver on a project with lots of files (SumatraPDF’s own are over 300, many times more for third party libraries).

Using /analyze and cppcheck

cppcheck and /analyze flag in cl.exe are tools to find bugs in C++ code via static analysis. They are like a C++ compiler but instead of generating code, they analyze control flow in a program to find potential problems. It’s a cheap way to find some bugs, so there’s no excuse to not run them from time to time on your code.

Using asan builds

Address Sanitizer (asan) is a compiler flag /fsanitize=address that instruments the code with checks for common memory-related bugs like using an object after freeing it, over-writing values on the stack, freeing an object twice, writing past allocated memory. The downside of this instrumentation is that the code is much slower due to overhead of instrumentation. I’ve created a project for release build with asan and run it occasionally, especially in stress test.

Write for the debugger

Programmers love to code golf i.e. put as much code on one line as possible. As if lines of code were expensive. Many would write:

Bad:

    // ...
    return (char*)(start + offset);

I write:

Good:

    // ...
    char* s = (char*)(start + offset);
    return s;

Why?
Imagine you’re in a debugger stepping through a debug build of your code. The second version makes it trivial to set a breakpoint at return s line and look at the value of s. The first doesn’t. I don’t optimize for smallest number of lines of code but for how easy it is to inspect the state of the program in the debugger. In practice it means that I intentionally create intermediary variables like s in the example above.

Do it yourself standard library

I’m not using STL. Yes, I wrote my own string and vector class. There are several reasons for that.

Historical reason

When I started SumatraPDF over 15 years ago STL was crappy.

Bad APIs

Today STL is still crappy. STL implementations improved greatly but the APIs still suck. There’s no API to insert something in the middle of a string or a vector. I understand the intent of separation of data structures and algorithms but I’m a pragmatist and to my pragmatist eyes v.insert (v.begin(), myarray, myarray+3); is just stupid compared to v.insert(3, el).

Code bloat

STL is bloated. Heavy use of templates leads to lots of generated code i.e. surprisingly large binaries for a supposedly low-level language. That bloat is invisible i.e. you won’t know unless you inspect generated binaries, which no one does. The bloat is out of my control. Even if I notice, I can’t fix STL classes. All I can do is to write my non-bloaty alternative, which is what I did.

Slow compilation times

Compilation of C code is not fast but it feels zippy compared to compilation of C++ code. Heavy use of templates is a big part of it. STL implementations are over-templatized and need to provide all the C++ support code (operators, iterators etc.). As a pragmatist, I only implement the absolute minimum functionality I use in my code. I minimize use of templates. For example Str and WStr could be a single template but are 2 implementations.

I don’t understand C++

I understand the subset of C++ I use but the whole of C++ is impossibly complicated.
For example I’ve read a bunch about std::move() and I’m not confident I know how to use it correctly and that’s just one of many complicated things in C++. C++ is too subtle and I don’t want my code to be a puzzle.

Possibility of optimized implementations

I wrote a StrVec class that is optimized for storing vector of strings. It’s more efficient than std::vector<std::string> by a large margin and I use it extensively.

Temporary allocator and pool allocators

I use temporary allocators heavily. They make the code faster and smaller. Technically STL has support for non-standard allocators but the API is so bad that I would rather not. My temporary allocator and pool allocators are very small and simple and I can add support for them only when beneficial.

Minimize unsigned int

STL and standard C library like to use size_t and other unsigned integers. I think it was a mistake. Go shows that you can just use int. Having two types leads to cast-apalooza. I don’t like visual noise in my code. Unsigned integers are also more dangerous. When you subtract you can end up with a bigger value. Indexing from end is subtle: for (size_t i = n; i >= 0; i--) is buggy because i >= 0 is always true for unsigned. Sadly I only realized this recently so there’s a lot of code still to refactor to change use of size_t to int.

Mostly raw pointers

No std::unique_ptr for me.

Warnings are errors

C++ makes a distinction between compilation errors and compilation warnings. I don’t like sloppy code and polluting build output with warning messages so for my own code I use a compiler flag that turns warnings into errors, which forces me to fix the warnings.
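The unsigned pitfalls mentioned under “Minimize unsigned int” can be illustrated without a C++ compiler. The Python sketch below only emulates 32-bit unsigned arithmetic with a modulo; the helper name unsigned_sub and the 32-bit width are assumptions for this example (size_t is typically 64 bits on 64-bit builds, and the same reasoning applies).

```python
UINT32_MODULUS = 2**32


def unsigned_sub(a, b):
    # Emulates 32-bit unsigned subtraction: the result wraps modulo 2**32
    # instead of going negative.
    return (a - b) % UINT32_MODULUS


# "When you subtract you can end up with a bigger value."
wrapped = unsigned_sub(3, 5)

# Counting down past zero: decrementing 0 wraps to the maximum value,
# so a loop condition like `i >= 0` can never become false.
after_zero = unsigned_sub(0, 1)
still_non_negative = after_zero >= 0
```

With a signed int the same countdown reaches -1 and the loop stops, which is the behaviour the loop author usually intended.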

yesterday • 2 votes
