More from Evan Jones - Software Engineer | Computer Scientist
You can't safely use the C setenv() or unsetenv() functions in a program that uses threads. Those functions modify global state, and can cause other threads calling getenv() to crash. This also causes crashes in other languages that use those C standard library functions, such as Go's os.Setenv (Go issue) and Rust's std::env::set_var() (Rust issue). I ran into this in a Go program, because Go's built-in DNS resolver can call C's getaddrinfo(), which uses environment variables. This cost me 2 days to track down and file the Go bug. Sadly, this problem has been known for decades. For example, an article from January 2017 said: "None of this is new, but we do re-discover it roughly every five years. See you in 2022." This was only one year off! (She wrote an update in October 2023 after I emailed her about my Go bug.) This is a flaw in the POSIX standard, which extends the C Standard to allow modifying environment varibles. The most infuriating part is that many people who could influence the standard or maintain the C libraries don't see this as a problem. The argument is that the specification clearly documents that setenv() cannot be used with threads. Therefore, if someone does this, the crashes are their fault. We should apparently read every function's specification carefully, not use software written by others, and not use threads. These are unrealistic assumptions in modern software. I think we should instead strive to create APIs that are hard to screw up, and evolve as the ecosystem changes. The C language and standard library continue to play an important role at the base of most software. We either need to figure out how to improve it, or we need to figure out how to abandon it. Why is setenv() not thread-safe? The biggest problem is that getenv() returns a char*, with no need for applications to free it later. One thread could be using this pointer when another thread changes the same environment variable using setenv() or unsetenv(). The getenv() function is perfect if environment variables never change. For example, for accessing a process's initial table of environment variables (see the System V ABI: AMD64 Section 3.4.1). It turns out the C Standard only includes getenv(), so according to C, that is exactly how this should work. However, most implementations also follow the POSIX standard (e.g. POSIX.1-2017), which extends C to include functions that modify the environment. This means the current getenv() API is problematic. Even worse, putenv() adds a char* to the set of environment variables. It is explicitly required that if the application modifies the memory after putenv() returns, it modifies the environment variables. This means applications can modify the value passed to putenv() at any time, without any synchronization. FreeBSD used to implement putenv() by copying the value, but it changed it with FreeBSD 7 in 2008, which suggests some programs really do depend on modifying the environment in this fashion (see FreeBSD putenv man page). As a final problem, environ is a NULL-terminated array of pointers (char**) that an application can read and assign to (see definition in POSIX.1-2017). This is how applications can iterate over all environment variables. Accesses to this array are not thread-safe. However, in my experience many fewer applications use this than getenv() and setenv(). However, this does cause some libraries to not maintain the set of environment variables in a thread-safe way, since they directly update this table. Environment variable implementations Implementations need to choose what do do when an application overwrites an existing variable. I looked at glibc, musl, Solaris/Illumos, and FreeBSD/Apple's C standard libraries, and they make the following choices: Never free environment variables (glibc, Solaris/Illumos): Calling setenv() repeatedly is effectively a memory leak. However, once a value is returned from getenv(), it is immutable and can be used by threads safely. Free the environment variables (musl, FreeBSD/Apple): Using the pointer returned by getenv() after another thread calls setenv() can crash. A second problem is ensuring the set of environment variables is updated in a thread-safe fashion. This is what causes crashes in glibc. glibc uses an array to hold pointers to the "NAME=value" strings. It holds a lock in setenv() when changing this array, but not in getenv(). If a thread calling setenv() needs to resize the array of pointers, it copies the values to a new array and frees the previous one. This can cause other threads executing getenv() to crash, since they are now iterating deallocated memory. This is particularly annoying since glibc already leaks environment variables, and holds a lock in setenv(). All it needs to do is hold the lock inside getenv(), and it would no longer crash. This would make getenv() slightly slower. However, getenv() already uses a linear search of the array, so performance does not appear to be a concern. More sophisticated implementations are possible if this is a problem, such as Solaris/Illumos's lock-free implementation. Why do programs use environment variables? Environment variables useful for configuring shared libraries or language runtimes that are included in other programs. This allows users to change the configuration, without program authors needing to explicitly pass the configuration in. One alternative is command line flags, which requires programs to parse them and pass them in to the libraries. Another alternative are configuration files, which then need some other way to disable or configure, to be able to test new configurations. Environment variables are a simple solution. AS a result, many libraries call getenv() (see a partial list below). Since many libraries are configured through environment variables, a program may need to change these variables to configure the libraries it uses. This is common at application startup. This causes programs to need to call setenv(). Given this issue, it seems like libraries should also provide a way to explicitly configure any settings, and avoid using environment variables. We should fix this problem, and we can In my opinion, it is rediculous that this has been a known problem for so long. It has wasted thousands of hours of people's time, either debugging the problems, or debating what to do about it. We know how to fix the problem. First, we can make a thread-safe implementation, like Illumos/Solaris. This has some limitations: it leaks memory in setenv(), and is still unsafe if a program uses putenv() or the environ variable. However, this is an improvement over the current Linux and Apple implementations. The second solution is to add new APIs to get one and get all environment variables that are thread-safe by design, like Microsoft's getenv_s() (see below for the controversy around C11's "Annex K"). My preferred solution would be to do both. This would reduce the chances of hitting this problem for existing programs and libraries, and also provide a path to avoid the problems entirely for new code or languages like Go and Rust. My rough idea would be the following: Add a function to copy one single environment variable to a user-specified buffer, similar to getenv_s(). Add a thread-safe API to iterate over all environment variables, or to copy all variables out. Mark getenv() as deprecated, recommending the new thread-safe getenv() function instead. Mark putenv() as deprecated, recommending setenv() instead. Mark environ as deprecated, recommending environment variable functions instead. Update the implementation of environment varibles to be thread-safe. This requires leaking memory if getenv() is used on a variable, but we can detect if the old functions are used, and only leak memory in that case. This means programs written in other languages will avoid these problems as soon as their runtimes are updated. Update the C and POSIX standards to require the above changes. This would be progress. The getenv_s / C Standard Annex K controversy Microsoft provides getenv_s(), which copies the environment variable into a caller-provided buffer. This is easy to make thread-safe by holding a read lock while copying the variable. After the function returns, future changes to the environment have no effect. This is included in the C11 Standard as Annex K "Bounds Checking Interfaces". The C standard Annexes are optional features. This Annex includes new functions intended to make it harder to make mistakes with buffers that are the wrong size. The first draft of this extension was published in 2003. This is when Microsoft was focusing on "Trustworthy Computing" after a January 2002 memo from Bill Gates. Basically, Windows wasn't designed to be connected to the Internet, and now that it was, people were finding many security problems. Lots of them were caused by buffer handling mistakes. Microsoft developed new versions of a number of problematic functions, and added checks to the Visual C++ compiler to warn about using the old ones. They then attempted to standardize these functions. My understanding is the people responsible for the Unix POSIX standards did not like the design of these functions, so they refused to implement them. For more details, see Field Experience With Annex K published in September 2015, Stack Overflow: Why didn't glibc implement _s functions? updated March 2023, and Rich Felker of musl on both technical and social reasons for not implementing Annex K from February 2019. I haven't looked at the rest of the functions, but having spent way too long looking at getenv(), the general idea of getenv_s() seems like a good idea to me. Standardizing this would help avoid this problem. Incomplete list of common environment variables This is a list of some uses of environment variables from fairly widely used libraries and services. This shows that environment variables are pretty widely used. Cloud Provider Credentials and Services AWS's SDKs for credentials (e.g. AWS_ACCESS_KEY_ID) Google Cloud Application Default Credentials (e.g. GOOGLE_APPLICATION_CREDENTIALS) Microsoft Azure Default Azure Credential (e.g. AZURE_CLIENT_ID) AWS's Lambda serverless product: sets a large number of variables like AWS_REGION, AWS_LAMBDA_FUNCTION_NAME, and credentials like AWS_SECRET_ACCESS_KEY Google Cloud Run serverless product: configuration like PORT, K_SERVICE, K_REVISION Kubernetes service discovery: Defines variables SERVICE_NAME_HOST and SERVICE_NAME_PORT. Third-party C/C++ Libraries OpenTelemetry: Metrics and tracing. Many environment variables like OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES. OpenSSL: many configurable variables like HTTPS_PROXY, OPENSSL_CONF, OPENSSL_ENGINES. BoringSSL: Google's fork of OpenSSL used in Chrome and others. It reads SSLKEYLOGFILE just like OpenSSL for logging TLS keys for debugging. Libcurl: proxies, SSL/TLS configuration and debugging like HTTPS_PROXY, CURL_SSL_BACKEND, CURL_DEBUG. Libpq Postgres client library: connection parameters including credentials like PGHOSTADDR, PGDATABASE, and PGPASSWORD. Rust Standard Library std::thread RUST_MIN_STACK: Calls std::env::var() on the first call to spawn() a new thread. It is cached in a static atomic variable and never read again. See implementation in thread::min_stack(). std::backtrace RUST_LIB_BACKTRACE: Calls std::env::var() on the first call to capture a backtrace. It is cached in a static atomic variable and never read again. See implementation in Backtrace::enabled().
I was wondering: how often do nanosecond timestamps collide on modern systems? The answer is: very often, like 5% of all samples, when reading the clock on all 4 physical cores at the same time. As a result, I think it is unsafe to assume that a raw nanosecond timestamp is a unique identifier. I wrote a small test program to test this. I used Go, which records both the "absolute" time and the "monotonic clock" relative time on each call to time.Now(), so I compared both the relative difference between consecutive timestamps, as well as just the absolute timestamps. As expected, the behavior depends on the system, so I observe very different results on Mac OS X and Linux. On Linux, within a single thread, both the absolute and monotonic times always increase. On my system, the minimum increment was 32 ns. Between threads, approximately 5% of the absolute times were exactly the same as other threads. Even with 2 threads on a 4 core system, approximately 2% of timestamps collided. On Mac OS X: the absolute time has microsecond resolution, so there are an astronomical number of collisions when I repeat this same test. Even within a thread I often observe the monotonic clock not increment. See the test program on Github if you are curious.
The read() and write() system calls take a variable-length byte array as an argument. As a simplified model, the time for the system call should be some constant "per-call" time, plus time directly proportional to the number of bytes in the array. That is, the time for each call should be time = (per_call_minimum_time) + (array_len) × (per_byte_time). With this model, using a larger buffer should increase throughput, asymptotically approaching 1/per_byte_time. I was curious: do real system calls behave this way? What are the ideal buffer sizes for read() and write() if we want to maximize throughput? I decided to do some experiments with blocking I/O. These are not rigorous, and I suspect the results will vary significantly if the hardware and software are different than one the system I tested. The really short answer is that a buffer of 32 KiB is a good starting point on today's systems, and I would want to measure the performance to go beyond that. However, for large writes, performance can increase. On Linux, the simple model holds for small buffers (≤ 4 KiB), but once the program approaches the maximum throughput, the throughput becomes highly variable and in many cases decreases as the buffers get larger. For blocking I/O, approximately 32 KiB is large enough to hit the maximum throughput for read(), but write() throughput improves with buffers up to around 256 KiB - 1 MiB. The reason for the asymmetry is that the Linux kernel will only write less than the entire buffer (a "short write") if there is an error (e.g. a signal causing EINTR). Thus, larger write buffers means the operating system needs to switch to the process less often. On the other head, "short reads", where a read() returns less than the maximum length, become increasingly common as the buffer size increases, which diminishes the benefit. There is a SO_RCVLOWAT socket option to change this that I did not test. The experiments were run on two 16 CPU Google Cloud T2D instances, which use AMD EPYC Milan processors (3rd generation, released in 2021). Each core is a real physical core. I used Ubuntu 23.04 running kernel 6.2.0-1005-gcp. My benchmark program is written in Rust and is available on Github. On localhost, Unix sockets were able to transfer data at approximately 9000 MiB/s. Localhost TCP sockets were a bit slower, around 7000 MiB/s. When using two separate cloud VMs with a networking throughput limit of 32 Gbps = 3800 MiB/s, I needed to use 6 TCP sockets to reliably reach that maximum throughput. A single TCP socket gets around 1400 MiB/s with 256 KiB buffers, with peaks as high as 2200 MiB/s. Experiment 1: /dev/zero and /dev/urandom My first experiment is reading from the /dev/zero and /dev/urandom devices. These are software devices implemented by the kernel, so they should have low overhead and low variability, since other tasks are not involved. Reading from /dev/urandom should be much slower than /dev/zero since the kernel must generate random bytes, rather than just zeros. The chart below shows the throughput for reading from /dev/zero as the buffer size is increased. The results show that the basic linear time per system call model holds until the system reaches maximum throughput (256 kiB buffer = 39000 MiB/s for /dev/zero, or 16 kiB = 410 MiB/s for /dev/urandom). As the buffer size increases further, the throughput decreases as the buffers get too big. This suggests that some other cost for larger buffers starts to outweigh the reduction in number of system calls. Perhaps CPU caches become less effective? The AMD EPYC Milan (3rd gen) CPU I tested on has 32 KiB of L1 data cache and 512 KiB of L2 data cache per core. The performance decreases don't exactly line up with these numbers, but it still seems plausible. The numbers for /dev/urandom are substantially lower, but otherwise similar. I did a linear least-squares fit on the average time per system call, shown in the following chart. If I use all the data, the fit is not good, because the trend changes for larger buffers. However, if I use the data up to the maximum throughput at 256 KiB, the fit is very good, as shown on the chart below. The linear fit models the minimum time per system call as 167 ns, with 0.0235 ns/byte additional time. If we want to use smaller buffers, using a 64 KiB buffer for reading from /dev/zero gets within 95% of the maximum throughput. Experiment 2: Unix and localhost TCP sockets Exchanging data with other processes is the thing I am actually interested in, so I tested Unix and TCP sockets on a single machine. In this case, I varied both the write buffer size and the read buffer size. Unfortunately, these results vary a lot. A more robust comparison would require running each experiment many times, and using some sort of statistical comparison. However, this "quick and dirty" experiment satisfied my curiousity, so I didn't do that. As a result, my conclusions here are vague. The simple model that increasing buffer size should decrease overhead is true, but only until the buffers are about 4 KiB. Above that point, the results start to be highly variable, and it is much harder to draw general conclusion. However, appears that increasing the write buffer size generally is quite helpful up to at least 256 KiB, and often needed as much as 1 MiB to get the highest localhost throughput. I suspect this is because on Linux with blocking sockets, write() will not return until it has written all the data in the buffer, unless there is an error (e.g. EINTR). As a result, passing a large buffer means the kernel can do a lot of the work without needing to switch back to user space. Unfortunately, the same is not true for read(), which often returns "short reads" with any data that is available in the buffer. This starts with buffer sizes around 2 KiB, with the percentage of short reads increasing as the buffer size gets larger. This means the simple model does not hold, because we aren't actually increasing the bytes per read call. I suspect this is a factor which means this microbenchmark is likely not representative of real programs. A real program will do something with the buffer, which will provide time for more data to be buffered in the kernel, and would probably decrease the number of short reads. This likely means larger buffers are in practice more useful than this microbenchmark suggests. As a result of this, the highest throughput often was achievable with small read buffers. I'm somewhat arbitrarily selecting 16 KiB at the best read buffer, and 256 KiB as the best write buffer, although a 1 MiB write buffer seems to be To give a sense of how variable the results are, the plot below shows the local Unix socket throughput for each read and write buffer throughput size. I apologize for the ugly plot. I did not want to spend the time to make it more beautiful. This plot is interactive so you can slice the data to the area of interest. I recommend zooming in to the left hand size with read buffers up to about 300 KiB. The first thing to note is at least on Linux with blocking sockets, the writer will almost never have a "short write", where the write system call returns before writing all the data in the buffer. Unless there is a signal (EINTR) or some other "error" condition, write() will not return until all the bytes are written. The same is not true for reads. The read() system call will often return a "short" read, starting around buffer sizes of 2 KiB. The percentage of short reads generally increases as buffer sizes get bigger, which is logical. Another note is that sockets have in-kernel send and receive buffers. I did not tune these at all. It is possible that better performance is possible by tuning these settings, but that was not my goal. I wanted to know what happens "out of the box" for general-purpose programs without any special tuning. Experiment 3: TCP between two hosts In this experiment, I used two separate hosts connected with 32 Gbps networking in Google Cloud. I first tested the TCP throughput using iperf, to independently verify the network performance. A single TCP connection with iperf is not enough to fully utilize the network. I tried fiddling with some command line options and with Kernel settings like net.ipv4.tcp_rmem and wasn't able to get much better than about 12 Gb/s = 1400 MiB/s. The throughput is also highly varible. Here is some example output with iperf reporting at 2 second intervals, where you can see the throughput ranging from 10 to 19 Gb/s, with an average over the entire interval of 12 Gb/s. To hit the maximum network throughput, I need to use 6 or more parallel TCP connections (iperf -c IP_ADDRESS --time 60 --interval 2 -l 262144 -P 6). Using 3 connections gets around 26 Gb/s, and using 4 or 5 will occasionally hit the maximum, but will also occasionally drop down. Using at least 6 seems to reliably stay at the maximum. Due to this variability, it is hard to draw any conclusions about buffer size. In particular: a single TCP connection is not limited by CPU. The system uses about 40% of a single CPU core, basically all in the kernel. This is more about how the buffer sizes may impact scheduling choices. That said, it is clear that you cannot hit the maximum throughput with a small write buffer. The experiments with 4 KiB write buffers reached approximately 300 MiB/s, while an 8 KiB write buffer was much faster, around 1400 MiB/s. Larger still generally seems better, up to around 256 KiB, which occasionally reached 2200 MiB/s = 17.6 Gb/s. The plot below shows the TCP socket throughput for each read and write buffer size. Again, I apologize for the ugly plot.
This is a post for myself, because I wasted a lot of time understanding this bug, and I want to be able to remember it in the future. I expect close to zero others to be interested. The C standard library function isspace() returns a non-zero value (true) for the six "standard" ASCII white-space characters ('\t', '\n', '\v', '\f', '\r', ' '), and any locale-specific characters. By default, a program starts in the "C" locale, which will only return true for the six ASCII white-space characters. However, if the program changes locales, it can return true for other values. As a result, unless you really understand locales, you should use your own version of this function, or ICU4C's u_isspace() function. An implementation of isspace() for ASCII is one line: /* Returns true for the 6 ASCII white-space characters: \t \n \v \f \r ' '. */ int isspace_ascii(int c) { return c == '\t' || c == '\n' || c == '\v' || c == '\f' || c == '\r' || c == ' '; } I ran into this because On Mac OS X, Postgres switches to the system's default locale, which is something that uses UTF-8 (e.g. en_US.UTF-8, fr_CA.UTF-8, etc). In this case, isspace() returns true for Unicode white-space values, which includes 0x85 = NEL = Next Line, and 0xA0 = NBSP = No-Break Space. This caused a bug in parsing Postgres Hstore values that use Unicode. I have attempted to submit a patch to fix this (mailing list post, commitfest entry). For a program to demonstrate the behaviour on different systems, see isspace_locale on Github.
More in programming
While Stripe is a widely admired company for things like its creation of the Sorbet typer project, I personally think that Stripe’s most interesting strategy work is also among its most subtle: its willingness to significantly prioritize API stability. This strategy is almost invisible externally. Internally, discussions around it were frequent and detailed, but mostly confined to dedicated API design conversations. API stability isn’t just a technical design quirk, it’s a foundational decision in an API-driven business, and I believe it is one of the unsung heroes of Stripe’s business success. This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts. Reading this document To apply this strategy, start at the top with Policy. To understand the thinking behind this strategy, read sections in reverse order, starting with Explore. More detail on this structure in Making a readable Engineering Strategy document. Policy & Operation Our policies for managing API changes are: Design for long API lifetime. APIs are not inherently durable. Instead we have to design thoughtfully to ensure they can support change. When designing a new API, build a test application that doesn’t use this API, then migrate to the new API. Consider how integrations might evolve as applications change. Perform these migrations yourself to understand potential friction with your API. Then think about the future changes that we might want to implement on our end. How would those changes impact the API, and how would they impact the application you’ve developed. At this point, take your API to API Review for initial approval as described below. Following that approval, identify a handful of early adopter companies who can place additional pressure on your API design, and test with them before releasing the final, stable API. All new and modified APIs must be approved by API Review. API changes may not be enabled for customers prior to API Review approval. Change requests should be sent to api-review email group. For examples of prior art, review the api-review archive for prior requests and the feedback they received. All requests must include a written proposal. Most requests will be approved asynchronously by a member of API Review. Complex or controversial proposals will require live discussions to ensure API Review members have sufficient context before making a decision. We never deprecate APIs without an unavoidable requirement to do so. Even if it’s technically expensive to maintain support, we incur that support cost. To be explicit, we define API deprecation as any change that would require customers to modify an existing integration. If such a change were to be approved as an exception to this policy, it must first be approved by the API Review, followed by our CEO. One example where we granted an exception was the deprecation of TLS 1.2 support due to PCI compliance obligations. When significant new functionality is required, we add a new API. For example, we created /v1/subscriptions to support those workflows rather than extending /v1/charges to add subscriptions support. With the benefit of hindsight, a good example of this policy in action was the introduction of the Payment Intents APIs to maintain compliance with Europe’s Strong Customer Authentication requirements. Even in that case the charge API continued to work as it did previously, albeit only for non-European Union payments. We manage this policy’s implied technical debt via an API translation layer. We release changed APIs into versions, tracked in our API version changelog. However, we only maintain one implementation internally, which is the implementation of the latest version of the API. On top of that implementation, a series of version transformations are maintained, which allow us to support prior versions without maintaining them directly. While this approach doesn’t eliminate the overhead of supporting multiple API versions, it significantly reduces complexity by enabling us to maintain just a single, modern implementation internally. All API modifications must also update the version transformation layers to allow the new version to coexist peacefully with prior versions. In the future, SDKs may allow us to soften this policy. While a significant number of our customers have direct integrations with our APIs, that number has dropped significantly over time. Instead, most new integrations are performed via one of our official API SDKs. We believe that in the future, it may be possible for us to make more backwards incompatible changes because we can absorb the complexity of migrations into the SDKs we provide. That is certainly not the case yet today. Diagnosis Our diagnosis of the impact on API changes and deprecation on our business is: If you are a small startup composed of mostly engineers, integrating a new payments API seems easy. However, for a small business without dedicated engineers—or a larger enterprise involving numerous stakeholders—handling external API changes can be particularly challenging. Even if this is only marginally true, we’ve modeled the impact of minimizing API changes on long-term revenue growth, and it has a significant impact, unlocking our ability to benefit from other churn reduction work. While we believe API instability directly creates churn, we also believe that API stability directly retains customers by increasing the migration overhead even if they wanted to change providers. Without an API change forcing them to change their integration, we believe that hypergrowth customers are particularly unlikely to change payments API providers absent a concrete motivation like an API change or a payment plan change. We are aware of relatively few companies that provide long-term API stability in general, and particularly few for complex, dynamic areas like payments APIs. We can’t assume that companies that make API changes are ill-informed. Rather it appears that they experience a meaningful technical debt tradeoff between the API provider and API consumers, and aren’t willing to consistently absorb that technical debt internally. Future compliance or security requirements—along the lines of our upgrade from TLS 1.2 to TLS 1.3 for PCI—may necessitate API changes. There may also be new tradeoffs exposed as we enter new markets with their own compliance regimes. However, we have limited ability to predict these changes at this point.
Recently I got a question on formal methods1: how does it help to mathematically model systems when the system requirements are constantly changing? It doesn't make sense to spend a lot of time proving a design works, and then deliver the product and find out it's not at all what the client needs. As the saying goes, the hard part is "building the right thing", not "building the thing right". One possible response: "why write tests"? You shouldn't write tests, especially lots of unit tests ahead of time, if you might just throw them all away when the requirements change. This is a bad response because we all know the difference between writing tests and formal methods: testing is easy and FM is hard. Testing requires low cost for moderate correctness, FM requires high(ish) cost for high correctness. And when requirements are constantly changing, "high(ish) cost" isn't affordable and "high correctness" isn't worthwhile, because a kinda-okay solution that solves a customer's problem is infinitely better than a solid solution that doesn't. But eventually you get something that solves the problem, and what then? Most of us don't work for Google, we can't axe features and products on a whim. If the client is happy with your solution, you are expected to support it. It should work when your customers run into new edge cases, or migrate all their computers to the next OS version, or expand into a market with shoddy internet. It should work when 10x as many customers are using 10x as many features. It should work when you add new features that come into conflict. And just as importantly, it should never stop solving their problem. Canonical example: your feature involves processing requested tasks synchronously. At scale, this doesn't work, so to improve latency you make it asynchronous. Now it's eventually consistent, but your customers were depending on it being always consistent. Now it no longer does what they need, and has stopped solving their problems. Every successful requirement met spawns a new requirement: "keep this working". That requirement is permanent, or close enough to decide our long-term strategy. It takes active investment to keep a feature behaving the same as the world around it changes. (Is this all a pretentious of way of saying "software maintenance is hard?" Maybe!) Phase changes In physics there's a concept of a phase transition. To raise the temperature of a gram of liquid water by 1° C, you have to add 4.184 joules of energy.2 This continues until you raise it to 100°C, then it stops. After you've added two thousand joules to that gram, it suddenly turns into steam. The energy of the system changes continuously but the form, or phase, changes discretely. Software isn't physics but the idea works as a metaphor. A certain architecture handles a certain level of load, and past that you need a new architecture. Or a bunch of similar features are independently hardcoded until the system becomes too messy to understand, you remodel the internals into something unified and extendable. etc etc etc. It's doesn't have to be totally discrete phase transition, but there's definitely a "before" and "after" in the system form. Phase changes tend to lead to more intricacy/complexity in the system, meaning it's likely that a phase change will introduce new bugs into existing behaviors. Take the synchronous vs asynchronous case. A very simple toy model of synchronous updates would be Set(key, val), which updates data[key] to val.3 A model of asynchronous updates would be AsyncSet(key, val, priority) adds a (key, val, priority, server_time()) tuple to a tasks set, and then another process asynchronously pulls a tuple (ordered by highest priority, then earliest time) and calls Set(key, val). Here are some properties the client may need preserved as a requirement: If AsyncSet(key, val, _, _) is called, then eventually db[key] = val (possibly violated if higher-priority tasks keep coming in) If someone calls AsyncSet(key1, val1, low) and then AsyncSet(key2, val2, low), they should see the first update and then the second (linearizability, possibly violated if the requests go to different servers with different clock times) If someone calls AsyncSet(key, val, _) and immediately reads db[key] they should get val (obviously violated, though the client may accept a slightly weaker property) If the new system doesn't satisfy an existing customer requirement, it's prudent to fix the bug before releasing the new system. The customer doesn't notice or care that your system underwent a phase change. They'll just see that one day your product solves their problems, and the next day it suddenly doesn't. This is one of the most common applications of formal methods. Both of those systems, and every one of those properties, is formally specifiable in a specification language. We can then automatically check that the new system satisfies the existing properties, and from there do things like automatically generate test suites. This does take a lot of work, so if your requirements are constantly changing, FM may not be worth the investment. But eventually requirements stop changing, and then you're stuck with them forever. That's where models shine. As always, I'm using formal methods to mean the subdiscipline of formal specification of designs, leaving out the formal verification of code. Mostly because "formal specification" is really awkward to say. ↩ Also called a "calorie". The US "dietary Calorie" is actually a kilocalorie. ↩ This is all directly translatable to a TLA+ specification, I'm just describing it in English to avoid paying the syntax tax ↩
We received over 2,200 applications for our just-closed junior programmer opening, and now we're going through all of them by hand and by human. No AI screening here. It's a lot of work, but we have a great team who take the work seriously, so in a few weeks, we'll be able to invite a group of finalists to the next phase. This highlights the folly of thinking that what it'll take to land a job like this is some specific list of criteria, though. Yes, you have to present a baseline of relevant markers to even get into consideration, like a great cover letter that doesn't smell like AI slop, promising projects or work experience or educational background, etc. But to actually get the job, you have to be the best of the ones who've applied! It sounds self-evident, maybe, but I see questions time and again about it, so it must not be. Almost every job opening is grading applicants on the curve of everyone who has applied. And the best candidate of the lot gets the job. You can't quantify what that looks like in advance. I'm excited to see who makes it to the final stage. I already hear early whispers that we got some exceptional applicants in this round. It would be great to help counter the narrative that this industry no longer needs juniors. That's simply retarded. However good AI gets, we're always going to need people who know the ins and outs of what the machine comes up with. Maybe not as many, maybe not in the same roles, but it's truly utopian thinking that mankind won't need people capable of vetting the work done by AI in five minutes.
I like the job title “Design Engineer”. When required to label myself, I feel partial to that term (I should, I’ve written about it enough). Lately I’ve felt like the term is becoming more mainstream which, don’t get me wrong, is a good thing. I appreciate the diversification of job titles, especially ones that look to stand in the middle between two binaries. But — and I admit this is a me issue — once a title starts becoming mainstream, I want to use it less and less. I was never totally sure why I felt this way. Shouldn’t I be happy a title I prefer is gaining acceptance and understanding? Do I just want to rebel against being labeled? Why do I feel this way? These were the thoughts simmering in the back of my head when I came across an interview with the comedian Brian Regan where he talks about his own penchant for not wanting to be easily defined: I’ve tried over the years to write away from how people are starting to define me. As soon as I start feeling like people are saying “this is what you do” then I would be like “Alright, I don't want to be just that. I want to be more interesting. I want to have more perspectives.” [For example] I used to crouch around on stage all the time and people would go “Oh, he’s the guy who crouches around back and forth.” And I’m like, “I’ll show them, I will stand erect! Now what are you going to say?” And then they would go “You’re the guy who always feels stupid.” So I started [doing other things]. He continues, wondering aloud whether this aversion to not being easily defined has actually hurt his career in terms of commercial growth: I never wanted to be something you could easily define. I think, in some ways, that it’s held me back. I have a nice following, but I’m not huge. There are people who are huge, who are great, and deserve to be huge. I’ve never had that and sometimes I wonder, ”Well maybe it’s because I purposely don’t want to be a particular thing you can advertise or push.” That struck a chord with me. It puts into words my current feelings towards the job title “Design Engineer” — or any job title for that matter. Seven or so years ago, I would’ve enthusiastically said, “I’m a Design Engineer!” To which many folks would’ve said, “What’s that?” But today I hesitate. If I say “I’m a Design Engineer” there are less follow up questions. Now-a-days that title elicits less questions and more (presumed) certainty. I think I enjoy a title that elicits a “What’s that?” response, which allows me to explain myself in more than two or three words, without being put in a box. But once a title becomes mainstream, once people begin to assume they know what it means, I don’t like it anymore (speaking for myself, personally). As Brian says, I like to be difficult to define. I want to have more perspectives. I like a title that befuddles, that doesn’t provide a presumed sense of certainty about who I am and what I do. And I get it, that runs counter to the very purpose of a job title which is why I don’t think it’s good for your career to have the attitude I do, lol. I think my own career evolution has gone something like what Brian describes: Them: “Oh you’re a Designer? So you make mock-ups in Photoshop and somebody else implements them.” Me: “I’ll show them, I’ll implement them myself! Now what are you gonna do?” Them: “Oh, so you’re a Design Engineer? You design and build user interfaces on the front-end.” Me: “I’ll show them, I’ll write a Node server and setup a database that powers my designs and interactions on the front-end. Now what are they gonna do?” Them: “Oh, well, we I’m not sure we have a term for that yet, maybe Full-stack Design Engineer?” Me: “Oh yeah? I’ll frame up a user problem, interface with stakeholders, explore the solution space with static designs and prototypes, implement a high-fidelity solution, and then be involved in testing, measuring, and refining said solution. What are you gonna call that?” [As you can see, I have some personal issues I need to work through…] As Brian says, I want to be more interesting. I want to have more perspectives. I want to be something that’s not so easily definable, something you can’t sum up in two or three words. I’ve felt this tension my whole career making stuff for the web. I think it has led me to work on smaller teams where boundaries are much more permeable and crossing them is encouraged rather than discouraged. All that said, I get it. I get why titles are useful in certain contexts (corporate hierarchies, recruiting, etc.) where you’re trying to take something as complicated and nuanced as an individual human beings and reduce them to labels that can be categorized in a database. I find myself avoiding those contexts where so much emphasis is placed in the usefulness of those labels. “I’ve never wanted to be something you could easily define” stands at odds with the corporate attitude of, “Here’s the job req. for the role (i.e. cog) we’re looking for.” Email · Mastodon · Bluesky
I've been biking in Brooklyn for a few years now! It's hard for me to believe it, but I'm now one of the people other bicyclists ask questions to now. I decided to make a zine that answers the most common of those questions: Bike Brooklyn! is a zine that touches on everything I wish I knew when I started biking in Brooklyn. A lot of this information can be found in other resources, but I wanted to collect it in one place. I hope to update this zine when we get significantly more safe bike infrastructure in Brooklyn and laws change to make streets safer for bicyclists (and everyone) over time, but it's still important to note that each release will reflect a specific snapshot in time of bicycling in Brooklyn. All text and illustrations in the zine are my own. Thank you to Matt Denys, Geoffrey Thomas, Alex Morano, Saskia Haegens, Vishnu Reddy, Ben Turndorf, Thomas Nayem-Huzij, and Ryan Christman for suggestions for content and help with proofreading. This zine is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, so you can copy and distribute this zine for noncommercial purposes in unadapted form as long as you give credit to me. Check out the Bike Brooklyn! zine on the web or download pdfs to read digitally or print here!