Full Width [alt+shift+f] Shortcuts [alt+shift+k]
Sign Up [alt+shift+s] Log In [alt+shift+l]
22
I’m wondering why you talk about “branches” at all. No such thing should exist. – Linus Torvalds, 2005, git@vger.kernel.org Git branches are hard to think about. But “GitHub” forces you to think about branches. A lot. Instead of futzing with GitHub’s feature branches, many big projects1 opt for a stacked patches workflow—the kind Gerrit and Phorge offer. I can see compelling reasons why, I’ll focus on two: Slow review Integration pain 🐌 Slow review Code review decision fatigue is real—big changes make for slow reviews.2 In GitHub, pull requests are synonymous with feature branches. So it’s tempting to cram commits into a branch until it’s feature-complete, then push for review: git checkout -b feature-foo --track origin/main origin/main …*:・゚✧ git commit -m refactor *:・゚✧ … …*:・゚✧ git commit -m add new code *:・゚✧ … …*:・゚✧ git commit -m change the UI *:・゚✧ … git push origin feature-foo GitHub flow: a local feature branch becomes a review with three commits But this is an...
a year ago

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from Tyler Cipriani: blog

Digging into git commit templates

Any code of your own that you haven’t looked at for six or more months might as well have been written by someone else. – Eagleson’s Law After scouring git history, I found the correct config file, but someone removed it. Their full commit message read: Remove config. Don't bring it back. Very. helpful. But I get it; it’s hard to care about commit messages when you’re making a quick change. Git commit templates can help. Commit templates provide a scaffold for your commit messages, reminding you to answer questions like: What problem are you solving? Why is this the solution? What alternatives did you consider? Where can I read more? What is a git commit template? When you type git commit, git pops open your text editor1. Git can pre-fill your editor with a commit template—a form that reminds you of everything it’s easy to forget when writing a commit. Creating a commit template is simple. Create a plaintext file – mine lives at ~/.config/git/message.txt Tell git to use it: git config --global \ commit.template '~/.config/git/message.txt' My template packs everything I know about writing a commit. Project-specific templates Large projects, such as Linux kernel, git, and MediaWiki, have their own commit guidelines. Git templates can remind you about these per-project requirements if you add a commit template to a project’s .git/config file. Another way to do this is git’s includeIf configuration setting. includeIf lets you override git config settings when you’re working under directories you define. For example, all my Wikimedia work lives in ~/Projects/Wikimedia and at the bottom of my ~/.config/git/config I have: [includeIf "gitdir:~/Projects/Wikimedia/**"] path = ~/.config/git/config.wikimedia In config.wikimedia, I point to my Wikimedia-specific commit template (along with other necessary git settings: my user.email, core.hooksPath, and a pushInsteadOf url to push to ssh even when I clone via https). Forge-specific templates Personal git commit templates lead to better commits, which make for a better history. The forge-specific pull-request templates are a band-aid, the cheap kind that falls off in the shower. There’s no incentive for GitHub to make git history better: the worse your commit history, the more you rely on GitHub. Still, all the major pull-request-style forges let you foist a pull-request template on your contributors. As a contributor, I dislike filling those out—they add unnecessary friction. Commit message contents Your commit template allows you do the hard thinking upfront. Then, when you make a commit, you simply follow the template. My template asks questions I answer with my commit message: 72ch. wide -------------------------------------------------------- BODY # | # - Why should this change be made? | # - What problem are you solving? | # - Why this solution? | # - What's wrong with the current code? | # - Are there other ways to do it? | # - How can the reviewer confirm it works? | # | # ---------------------------------------------------------------- /BODY But other clever folks cooked up conventions you could incorporate: Conventional commits – how do your commits relate to semantic versioning? This makes it easier for SRE and downstream users. Problem/Solution format – first pioneered by ZeroMQ2, this format anticipates the questions of future developers and reviewers. Gitmoji – developed for the GitHub crowd, this format defines an emoji shorthand that makes it easy to spot changes of a particular type. Commit message formatting How you format text affects how people read it. My template also deals with text formatting rules3: Subject – 50 characters or less, capitalized, no end punctuation. Body – Wrap at 72 characters with a blank line separating it from the subject. Trailers – Standard formats with a blank line separating them from the body. People will read your commit in different contexts: git log, git shortlog, and git rebase. But git’s pager has no line wrapping by default. I hard wrap at 72 characters because that makes text easier to read in wide terminals.4 Finally, my template addresses trailers, reminding me about standard trailers supported in the projects I’m working on. Git can interpret trailers, which can be useful later. For example, if I wanted a tab-separated list of commits and their related tasks I could find that with git log: $ TAB=%x09 $ BUG_TRAILER='%(trailers:key=Bug,valueonly=true,separator=%x2C )' $ SHORT_HASH=%h $ SUBJ=%s $ FORMAT="${SHORT_HASH}${TAB}${BUG_TRAILER}${TAB}${GIT_SUBJ}" $ git log --topo-order --no-merges \ --format="$FORMAT" d2b09deb12f T359762 Rewrite Kurdish (ku) Latin to Arabic converter 28123a6a262 T332865 tests: Remove non-static fallback in HookRunnerTestBase 4e919a307a4 T328919 tests: Remove unused argument from data provider in PageUpdaterTest bedd0f685f9 objectcache: Improve `RESTBagOStuff::handleError()` 2182a0c4490 T393219 tests: Remove two data provider in RestStructureTest Git commit templates free your brain from remembering what you should write, allowing you to focus on the story you should tell. Your future self will thank you for the effort. Starting with core.editor in your git config, $VISUAL or $EDITOR in your shell, finally falling back to vi.↩︎ I think…↩︎ All cribbed from Tim Pope↩︎ Another story I’ve heard: a standard terminal allows 80 characters per line. git log indents commit messages with 4 spaces. A 72-character-per-line commit centers text on an 80-character-per-line terminal. To me, readability in modern terminals is a better reason to wrap than kowtowing to antiquated terminals.↩︎

2 weeks ago 2 votes
Boox Go 10.3, two months in

[The] Linux kernel uses GPLv2, and if you distribute GPLv2 code, you have to provide a copy of the source (and modifications) once someone asks for it. And now I’m asking nicely for you to do so 🙂 – Joga, bbs.onyx-international.com Boox in split screen, typewriter mode In January, I bought a Boox Go 10.3—a 10.3-inch, 300-ppi, e-ink Android tablet. After two months, I use the Boox daily—it’s replaced my planner, notebook, countless PDF print-offs, and the good parts of my phone. But Boox’s parent company, Onyx, is sketchy. I’m conflicted. The Boox Go is a beautiful, capable tablet that I use every day, but I recommend avoiding as long as Onyx continues to disregard the rights of its users. How I’m using my Boox My e-ink floor desk Each morning, I plop down in front of my MagicHold laptop stand and journal on my Boox with Obsidian. I use Syncthing to back up my planner and sync my Zotero library between my Boox and laptop. In the evening, I review my PDF planner and plot for tomorrow. I use these apps: Obsidian – a markdown editor that syncs between all my devices with no fuss for $8/mo. Syncthing – I love Syncthing—it’s an encrypted, continuous file sync-er without a centralized server. Meditation apps1 – Guided meditation away from the blue light glow of my phone or computer is better. Before buying the Boox, I considered a reMarkable. The reMarkable Paper Pro has a beautiful color screen with a frontlight, a nice pen, and a “type folio,” plus it’s certified by the Calm Tech Institute. But the reMarkable is a distraction-free e-ink tablet. Meanwhile, I need distraction-lite. What I like Calm(ish) technology – The Boox is an intentional device. Browsing the internet, reading emails, and watching videos is hard, but that’s good. Apps – Google Play works out of the box. I can install F-Droid and change my launcher without difficulty. Split screen – The built-in launcher has a split screen feature. I use it to open a PDF side-by-side with a notes doc. Reading – The screen is a 300ppi Carta 1200, making text crisp and clear. What I dislike I filmed myself typing at 240fps, each frame is 4.17ms. Boox’s typing latency is between 150ms and 275ms at the fastest refresh rate inside Obsidian. Typing – Typing latency is noticeable. At Boox’s highest refresh rate, after hitting a key, text takes between 150ms to 275ms to appear. I can still type, though it’s distracting at times. The horror of the default pen Accessories Pen – The default pen looks like a child’s whiteboard marker and feels cheap. I replaced it with the Kindle Scribe Premium pen, and the writing experience is vastly improved. Cover – It’s impossible to find a nice cover. I’m using a $15 cover that I’m encasing in stickers. Tool switching – Swapping between apps is slow and clunky. I blame Android and the current limitations of e-ink more than Boox. No frontlight – The Boox’s lack of frontlight prevents me from reading more with it. I knew this when I bought my Boox, but devices with frontlights seem to make other compromises. Onyx The Chinese company behind Boox, Onyx International, Inc., runs the servers where Boox shuttles tracking information. I block this traffic with Pi-Hole2. pihole-ing whatever telemetry Boox collects I inspected this traffic via Mitm proxy—most traffic was benign, though I never opted into sending any telemetry (nor am I logged in to a Boox account). But it’s also an Android device, so it’s feeding telemetry into Google’s gaping maw, too. Worse, Onyx is flouting the terms of the GNU Public License, declining to release Linux kernel modifications to users. This is anathema to me—GPL violations are tantamount to theft. Onyx’s disregard for user rights makes me regret buying the Boox. Verdict I’ll continue to use the Boox and feel bad about it. I hope my digging in this post will help the next person. Unfortunately, the e-ink tablet market is too niche to support the kind of solarpunk future I’d always imagined. But there’s an opportunity for an open, Linux-based tablet to dominate e-ink. Linux is playing catch-up on phones with PostmarketOS. Meanwhile, the best e-ink tablets have to offer are old, unupdateable versions of Android, like the OS on the Boox. In the future, I’d love to pay a license- and privacy-respecting company for beautiful, calm technology and recommend their product to everyone. But today is not the future. I go back and forth between “Waking Up” and “Calm”↩︎ Using github.com/JordanEJ/Onyx-Boox-Blocklist↩︎

3 months ago 20 votes
Eventually consistent plain text accounting

.title { text-wrap: balance } Spending for October, generated by piping hledger → R Over the past six months, I’ve tracked my money with hledger—a plain text double-entry accounting system written in Haskell. It’s been surprisingly painless. My previous attempts to pick up real accounting tools floundered. Hosted tools are privacy nightmares, and my stint with GnuCash didn’t last. But after stumbling on Dmitry Astapov’s “Full-fledged hledger” wiki1, it clicked—eventually consistent accounting. Instead of modeling your money all at once, take it one hacking session at a time. It should be easy to work towards eventual consistency. […] I should be able to [add financial records] bit by little bit, leaving things half-done, and picking them up later with little (mental) effort. – Dmitry Astapov, Full-Fledged Hledger Principles of my system I’ve cobbled together a system based on these principles: Avoid manual entry – Avoid typing in each transaction. Instead, rely on CSVs from the bank. CSVs as truth – CSVs are the only things that matter. Everything else can be blown away and rebuilt anytime. Embrace version control – Keep everything under version control in Git for easy comparison and safe experimentation. Learn hledger in five minutes hledger concepts are heady, but its use is simple. I divide the core concepts into two categories: Stuff hledger cares about: Transactions – how hledger moves money between accounts. Journal files – files full of transactions Stuff I care about: Rules files – how I set up accounts, import CSVs, and move money between accounts. Reports – help me see where my money is going and if I messed up my rules. Transactions move money between accounts: 2024-01-01 Payday income:work $-100.00 assets:checking $100.00 This transaction shows that on Jan 1, 2024, money moved from income:work into assets:checking—Payday. The sum of each transaction should be $0. Money comes from somewhere, and the same amount goes somewhere else—double-entry accounting. This is powerful technology—it makes mistakes impossible to ignore. Journal files are text files containing one or more transactions: 2024-01-01 Payday income:work $-100.00 assets:checking $100.00 2024-01-02 QUANSHENG UVK5 assets:checking $-29.34 expenses:fun:radio $29.34 Rules files transform CSVs into journal files via regex matching. Here’s a CSV from my bank: Transaction Date,Description,Category,Type,Amount,Memo 09/01/2024,DEPOSIT Paycheck,Payment,Payment,1000.00, 09/04/2024,PizzaPals Pizza,Food & Drink,Sale,-42.31, 09/03/2024,Amazon.com*XXXXXXXXY,Shopping,Sale,-35.56, 09/03/2024,OBSIDIAN.MD,Shopping,Sale,-10.00, 09/02/2024,Amazon web services,Personal,Sale,-17.89, And here’s a checking.rules to transform that CSV into a journal file so I can use it with hledger: # checking.rules # -------------- # Map CSV fields → hledger fields[0] fields date,description,category,type,amount,memo,_ # `account1`: the account for the whole CSV.[1] account1 assets:checking account2 expenses:unknown skip 1 date-format %m/%d/%Y currency $ if %type Payment account2 income:unknown if %category Food & Drink account2 expenses:food:dining # [0]: <https://hledger.org/hledger.html#field-names> # [1]: <https://hledger.org/hledger.html#account-field> With these two files (checking.rules and 2024-09_checking.csv), I can make the CSV into a journal: $ > 2024-09_checking.journal \ hledger print \ --rules-file checking.rules \ -f 2024-09_checking.csv $ head 2024-09_checking.journal 2024-09-01 DEPOSIT Paycheck assets:checking $1000.00 income:unknown $-1000.00 2024-09-02 Amazon web services assets:checking $-17.89 expenses:unknown $17.89 Reports are interesting ways to view transactions between accounts. There are registers, balance sheets, and income statements: $ hledger incomestatement \ --depth=2 \ --file=2024-09_bank.journal Revenues: $1000.00 income:unknown ----------------------- $1000.00 Expenses: $42.31 expenses:food $63.45 expenses:unknown ----------------------- $105.76 ----------------------- Net: $894.24 At the beginning of September, I spent $105.76 and made $1000, leaving me with $894.24. But a good chunk is going to the default expense account, expenses:unknown. I can use the hleger aregister to see what those transactions are: $ hledger areg expenses:unknown \ --file=2024-09_checking.journal \ -O csv | \ csvcut -c description,change | \ csvlook | description | change | | ------------------------ | ------ | | OBSIDIAN.MD | 10.00 | | Amazon web services | 17.89 | | Amazon.com*XXXXXXXXY | 35.56 | l Then, I can add some more rules to my checking.rules: if OBSIDIAN.MD account2 expenses:personal:subscriptions if Amazon web services account2 expenses:personal:web:hosting if Amazon.com account2 expenses:personal:shopping:amazon Now, I can reprocess my data to get a better picture of my spending: $ > 2024-09_bank.journal \ hledger print \ --rules-file bank.rules \ -f 2024-09_bank.csv $ hledger bal expenses \ --depth=3 \ --percent \ -f 2024-09_checking2.journal 30.0 % expenses:food:dining 33.6 % expenses:personal:shopping 9.5 % expenses:personal:subscriptions 16.9 % expenses:personal:web -------------------- 100.0 % For the Amazon.com purchase, I lumped it into the expenses:personal:shopping account. But I could dig deeper—download my order history from Amazon and categorize that spending. This is the power of working bit-by-bit—the data guides you to the next, deeper rabbit hole. Goals and non-goals Why am I doing this? For years, I maintained a monthly spreadsheet of account balances. I had a balance sheet. But I still had questions. Spending over six months, generated by piping hledger → gnuplot Before diving into accounting software, these were my goals: Granular understanding of my spending – The big one. This is where my monthly spreadsheet fell short. I knew I had money in the bank—I kept my monthly balance sheet. I budgeted up-front the % of my income I was saving. But I had no idea where my other money was going. Data privacy – I’m unwilling to hand the keys to my accounts to YNAB or Mint. Increased value over time – The more time I put in, the more value I want to get out—this is what you get from professional tools built for nerds. While I wished for low-effort setup, I wanted the tool to be able to grow to more uses over time. Non-goals—these are the parts I never cared about: Investment tracking – For now, I left this out of scope. Between monthly balances in my spreadsheet and online investing tools’ ability to drill down, I was fine.2 Taxes – Folks smarter than me help me understand my yearly taxes.3 Shared system – I may want to share reports from this system, but no one will have to work in it except me. Cash – Cash transactions are unimportant to me. I withdraw money from the ATM sometimes. It evaporates. hledger can track all these things. My setup is flexible enough to support them someday. But that’s unimportant to me right now. Monthly maintenance I spend about an hour a month checking in on my money Which frees me to spend time making fancy charts—an activity I perversely enjoy. Income vs. Expense, generated by piping hledger → gnuplot Here’s my setup: $ tree ~/Documents/ledger . ├── export │   ├── 2024-balance-sheet.txt │   └── 2024-income-statement.txt ├── import │   ├── in │   │   ├── amazon │   │   │   └── order-history.csv │   │   ├── credit │   │   │   ├── 2024-01-01_2024-02-01.csv │   │   │   ├── ... │   │   │   └── 2024-10-01_2024-11-01.csv │   │   └── debit │   │   ├── 2024-01-01_2024-02-01.csv │   │   ├── ... │   │   └── 2024-10-01_2024-11-01.csv │   └── journal │   ├── amazon │   │   └── order-history.journal │   ├── credit │   │   ├── 2024-01-01_2024-02-01.journal │   │   ├── ... │   │   └── 2024-10-01_2024-11-01.journal │   └── debit │   ├── 2024-01-01_2024-02-01.journal │   ├── ... │   └── 2024-10-01_2024-11-01.journal ├── rules │   ├── amazon │   │   └── journal.rules │   ├── credit │   │   └── journal.rules │   ├── debit │   │   └── journal.rules │   └── common.rules ├── 2024.journal ├── Makefile └── README Process: Import – download a CSV for the month from each account and plop it into import/in/<account>/<dates>.csv Make – run make Squint – Look at git diff; if it looks good, git add . && git commit -m "💸" otherwise review hledger areg to see details. The Makefile generates everything under import/journal: journal files from my CSVs using their corresponding rules. reports in the export folder I include all the journal files in the 2024.journal with the line: include ./import/journal/*/*.journal Here’s the Makefile: SHELL := /bin/bash RAW_CSV = $(wildcard import/in/**/*.csv) JOURNALS = $(foreach file,$(RAW_CSV),$(subst /in/,/journal/,$(patsubst %.csv,%.journal,$(file)))) .PHONY: all all: $(JOURNALS) hledger is -f 2024.journal > export/2024-income-statement.txt hledger bs -f 2024.journal > export/2024-balance-sheet.txt .PHONY clean clean: rm -rf import/journal/**/*.journal import/journal/%.journal: import/in/%.csv @echo "Processing csv $< to $@" @echo "---" @mkdir -p $(shell dirname $@) @hledger print --rules-file rules/$(shell basename $$(dirname $<))/journal.rules -f "$<" > "$@" If I find anything amiss (e.g., if my balances are different than what the bank tells me), I look at hleger areg. I may tweak my rules or my CSVs and then I run make clean && make and try again. Simple, plain text accounting made simple. And if I ever want to dig deeper, hledger’s docs have more to teach. But for now, the balance of effort vs. reward is perfect. while reading a blog post from Jonathan Dowland↩︎ Note, this is covered by full-fledged hledger – Investements↩︎ Also covered in full-fledged hledger – Tax returns↩︎

7 months ago 46 votes
Subliminal git commits

Luckily, I speak Leet. – Amita Ramanujan, Numb3rs, CBS’s IRC Drama There’s an episode of the CBS prime-time drama Numb3rs that plumbs the depths of Dr. Joel Fleischman’s1 knowledge of IRC. In one scene, Fleischman wonders, “What’s ‘leet’”? “Leet” is writing that replaces letters with numbers, e.g., “Numb3rs,” where 3 stands in for e. In short, leet is like the heavy-metal “S” you drew in middle school: Sweeeeet. / \ / | \ | | | \ \ | | | \ | / \ / ASCII art version of your misspent youth. Following years of keen observation, I’ve noticed Git commit hashes are also letters and numbers. Git commit hashes are, as Fleischman might say, prime targets for l33tification. What can I spell with a git commit? DenITDao via orlybooks) With hexidecimal we can spell any word containing the set of letters {A, B, C, D, E, F}—DEADBEEF (a classic) or ABBABABE (for Mama Mia aficionados). This is because hexidecimal is a base-16 numbering system—a single “digit” represents 16 numbers: Base-10: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 16 15 Base-16: 0 1 2 3 4 5 6 7 8 9 A B C D E F Leet expands our palette of words—using 0, 1, and 5 to represent O, I, and S, respectively. I created a script that scours a few word lists for valid words and phrases. With it, I found masterpieces like DADB0D (dad bod), BADA55 (bad ass), and 5ADBAB1E5 (sad babies). Manipulating commit hashes for fun and no profit Git commit hashes are no mystery. A commit hash is the SHA-1 of a commit object. And a commit object is the commit message with some metadata. $ mkdir /tmp/BADA55-git && cd /tmp/BAD55-git $ git init Initialized empty Git repository in /tmp/BADA55-git/.git/ $ echo '# BADA55 git repo' > README.md && git add README.md && git commit -m 'Initial commit' [main (root-commit) 68ec0dd] Initial commit 1 file changed, 1 insertion(+) create mode 100644 README.md $ git log --oneline 68ec0dd (HEAD -> main) Initial commit Let’s confirm we can recreate the commit hash: $ git cat-file -p 68ec0dd > commit-msg $ sha1sum <(cat \ <(printf "commit ") \ <(wc -c < commit-msg | tr -d '\n') \ <(printf '%b' '\0') commit-msg) 68ec0dd6dead532f18082b72beeb73bd828ee8fc /dev/fd/63 Our repo’s first commit has the hash 68ec0dd. My goal is: Make 68ec0dd be BADA55. Keep the commit message the same, visibly at least. But I’ll need to change the commit to change the hash. To keep those changes invisible in the output of git log, I’ll add a \t and see what happens to the hash. $ truncate -s -1 commit-msg # remove final newline $ printf '\t\n' >> commit-msg # Add a tab $ # Check the new SHA to see if it's BADA55 $ sha1sum <(cat \ <(printf "commit ") \ <(wc -c < commit-msg | tr -d '\n') \ <(printf '%b' '\0') commit-msg) 27b22ba5e1c837a34329891c15408208a944aa24 /dev/fd/63 Success! I changed the SHA-1. Now to do this until we get to BADA55. Fortunately, user not-an-aardvark created a tool for that—lucky-commit that manipulates a commit message, adding a combination of \t and [:space:] characters until you hit a desired SHA-1. Written in rust, lucky-commit computes all 256 unique 8-bit strings composed of only tabs and spaces. And then pads out commits up to 48-bits with those strings, using worker threads to quickly compute the SHA-12 of each commit. It’s pretty fast: $ time lucky_commit BADA555 real 0m0.091s user 0m0.653s sys 0m0.007s $ git log --oneline bada555 (HEAD -> main) Initial commit $ xxd -c1 <(git cat-file -p 68ec0dd) | grep -cPo ': (20|09)' 12 $ xxd -c1 <(git cat-file -p HEAD) | grep -cPo ': (20|09)' 111 Now we have an more than an initial commit. We have a BADA555 initial commit. All that’s left to do is to make ALL our commits BADA55 by abusing git hooks. $ cat > .git/hooks/post-commit && chmod +x .git/hooks/post-commit #!/usr/bin/env bash echo 'L337-ifying!' lucky_commit BADA55 $ echo 'A repo that is very l33t.' >> README.md && git commit -a -m 'l33t' L337-ifying! [main 0e00cb2] l33t 1 file changed, 1 insertion(+) $ git log --oneline bada552 (HEAD -> main) l33t bada555 Initial commit And now I have a git repo almost as cool as the sweet “S” I drew in middle school. This is a Northern Exposure spin off, right? I’ve only seen 1:48 of the show…↩︎ or SHA-256 for repos that have made the jump to a more secure hash function↩︎

8 months ago 63 votes
The Pull Request

A brief and biased history. Oh yeah, there’s pull requests now – GitHub blog, Sat, 23 Feb 2008 When GitHub launched, it had no code review. Three years after launch, in 2011, GitHub user rtomayko became the first person to make a real code comment, which read, in full: “+1”. Before that, GitHub lacked any way to comment on code directly. Instead, pull requests were a combination of two simple features: Cross repository compare view – a feature they’d debuted in 2010—git diff in a web page. A comments section – a feature most blogs had in the 90s. There was no way to thread comments, and the comments were on a different page than the diff. GitHub pull requests circa 2010. This is from the official documentation on GitHub. Earlier still, when the pull request debuted, GitHub claimed only that pull requests were “a way to poke someone about code”—a way to direct message maintainers, but one that lacked any web view of the code whatsoever. For developers, it worked like this: Make a fork. Click “pull request”. Write a message in a text form. Send the message to someone1 with a link to your fork. Wait for them to reply. In effect, pull requests were a limited way to send emails to other GitHub users. Ten years after this humble beginning—seven years after the first code comment—when Microsoft acquired GitHub for $7.5 Billion, this cobbled-together system known as “GitHub flow” had become the default way to collaborate on code via Git. And I hate it. Pull requests were never designed. They emerged. But not from careful consideration of the needs of developers or maintainers. Pull requests work like they do because they were easy to build. In 2008, GitHub’s developers could have opted to use git format-patch instead of teaching the world to juggle branches. Or they might have chosen to generate pull requests using the git request-pull command that’s existed in Git since 2005 and is still used by the Linux kernel maintainers today2. Instead, they shrugged into GitHub flow, and that flow taught the world to use Git. And commit histories have sucked ever since. For some reason, github has attracted people who have zero taste, don’t care about commit logs, and can’t be bothered. – Linus Torvalds, 2012 “Someone” was a person chosen by you from a checklist of the people who had also forked this repository at some point.↩︎ Though to make small, contained changes you’d use git format-patch and git am.↩︎

9 months ago 79 votes

More in programming

AI is a gamechanger for TLA+ users

New Logic for Programmers Release v0.10 is now available! This is a minor release, mostly focused on logic-based refactoring, with new material on set types and testing refactors are correct. See the full release notes at the changelog page. Due to conference pressure v0.11 will also likely be a minor release. AI is a gamechanger for TLA+ users TLA+ is a specification language to model and debug distributed systems. While very powerful, it's also hard for programmers to learn, and there's always questions of connecting specifications with actual code. That's why The Coming AI Revolution in Distributed Systems caught my interest. In the post, Cheng Huang claims that Azure successfully used LLMs to examine an existing codebase, derive a TLA+ spec, and find a production bug in that spec. "After a decade of manually crafting TLA+ specifications", he wrote, "I must acknowledge that this AI-generated specification rivals human work". This inspired me to experiment with LLMs in TLA+ myself. My goals are a little less ambitious than Cheng's: I wanted to see how LLMs could help junior specifiers write TLA+, rather than handling the entire spec automatically. Details on what did and didn't work below, but my takeaway is that LLMs are an immense specification force multiplier. All tests were done with a standard VSCode Copilot subscription, writing Claude 3.7 in Agent mode. Other LLMs or IDEs may be more or less effective, etc. Things Claude was good at Fixing syntax errors TLA+ uses a very different syntax than mainstream programming languages, meaning beginners make a lot of mistakes where they do a "programming syntax" instead of TLA+ syntax: NotThree(x) = \* should be ==, not = x != 3 \* should be #, not != The problem is that the TLA+ syntax checker, SANY, is 30 years old and doesn't provide good information. Here's what it says for that snippet: Was expecting "==== or more Module body" Encountered "NotThree" at line 6, column 1 That only isolates one error and doesn't tell us what the problem is, only where it is. Experienced TLA+ users get "error eyes" and can quickly see what the problem is, but beginners really struggle with this. The TLA+ foundation has made LLM integration a priority, so the VSCode extension naturally supports several agents actions. One of these is running SANY, meaning an agent can get an error, fix it, get another error, fix it, etc. Provided the above sample and asked to make it work, Claude successfully fixed both errors. It also fixed many errors in a larger spec, as well as figure out why PlusCal specs weren't compiling to TLA+. This by itself is already enough to make LLMs a worthwhile tool, as it fixes one of the biggest barriers to entry. Understanding error traces When TLA+ finds a violated property, it outputs the sequence of steps that leads to the error. This starts in plaintext, and VSCode parses it into an interactive table: Learning to read these error traces is a skill in itself. You have to understand what's happening in each step and how it relates back to the actually broken property. It takes a long time for people to learn how to do this well. Claude was successful here, too, accurately reading 20+ step error traces and giving a high-level explanation of what went wrong. It also could condense error traces: if ten steps of the error trace could be condensed into a one-sentence summary (which can happen if you're modeling a lot of process internals) Claude would do it. I did have issues here with doing this in agent mode: while the extension does provide a "run model checker" command, the agent would regularly ignore this and prefer to run a terminal command instead. This would be fine except that the LLM consistently hallucinated invalid commands. I had to amend every prompt with "run the model checker via vscode, do not use a terminal command". You can skip this if you're willing to copy and paste the error trace into the prompt. As with syntax checking, if this was the only thing LLMs could effectively do, that would already be enough1 to earn a strong recommend. Even as a TLA+ expert I expect I'll be using this trick regularly. Boilerplate tasks TLA+ has a lot of boilerplate. One of the most notorious examples is UNCHANGED rules. Specifications are extremely precise — so precise that you have to specify what variables don't change in every step. This takes the form of an UNCHANGED clause at the end of relevant actions: RemoveObjectFromStore(srv, o, s) == /\ o \in stored[s] /\ stored' = [stored EXCEPT ![s] = @ \ {o}] /\ UNCHANGED <<capacity, log, objectsize, pc>> Writing this is really annoying. Updating these whenever you change an action, or add a new variable to the spec, is doubly so. Syntax checking and error analysis are important for beginners, but this is what I wanted for myself. I took a spec and prompted Claude Add UNCHANGED <> for each variable not changed in an action. And it worked! It successfully updated the UNCHANGED in every action. (Note, though, that it was a "well-behaved" spec in this regard: only one "action" happened at a time. In TLA+ you can have two actions happen simultaneously, that each update half of the variables, meaning neither of them should have an UNCHANGED clause. I haven't tested how Claude handles that!) That's the most obvious win, but Claude was good at handling other tedious work, too. Some examples include updating vars (the conventional collection of all state variables), lifting a hard-coded value into a model parameter, and changing data formats. Most impressive to me, though, was rewriting a spec designed for one process to instead handle multiple processes. This means taking all of the process variables, which originally have types like Int, converting them to types like [Process -> Int], and then updating the uses of all of those variables in the spec. It didn't account for race conditions in the new concurrent behavior, but it was an excellent scaffold to do more work. Writing properties from an informal description You have to be pretty precise with your intended property description but it handles converting that precise description into TLA+'s formalized syntax, which is something beginners often struggle with. Things it is less good at Generating model config files To model check TLA+, you need both a specification (.tla) and a model config file (.cfg), which have separate syntaxes. Asking the agent to generate the second often lead to it using TLA+ syntax. It automatically fixed this after getting parsing errors, though. Fixing specs Whenever the ran model checking and discovered a bug, it would naturally propose a change to either the invalid property or the spec. Sometimes the changes were good, other times the changes were not physically realizable. For example, if it found that a bug was due to a race condition between processes, it would often suggest fixing it by saying race conditions were okay. I mean yes, if you say bugs are okay, then the spec finds that bugs are okay! Or it would alternatively suggest adding a constraint to the spec saying that race conditions don't happen. But that's a huge mistake in specification, because race conditions happen if we don't have coordination. We need to specify the mechanism that is supposed to prevent them. Finding properties of the spec After seeing how capable it was at translating my properties to TLA+, I started prompting Claude to come up with properties on its own. Unfortunately, almost everything I got back was either trivial, uninteresting, or too coupled to implementation details. I haven't tested if it would work better to ask it for "properties that may be violated". Generating code from specs I have to be specific here: Claude could sometimes convert Python into a passable spec, an vice versa. It wasn't good at recognizing abstraction. For example, TLA+ specifications often represent sequential operations with a state variable, commonly called pc. If modeling code that nonatomically retrieves a counter value and increments it, we'd have one action that requires pc = "Get" and sets the new value to "Inc", then another that requires it be "Inc" and sets it to "Done". I found that Claude would try to somehow convert pc into part of the Python program's state, rather than recognize it as a TLA+ abstraction. On the other side, when converting python code to TLA+ it would often try to translate things like sleep into some part of the spec, not recognizing that it is abstractable into a distinct action. I didn't test other possible misconceptions, like converting randomness to nondeterminism. For the record, when converting TLA+ to Python Claude tended to make simulators of the spec, rather than possible production code implementing the spec. I really wasn't expecting otherwise though. Unexplored Applications Things I haven't explored thoroughly but could possibly be effective, based on what I know about TLA+ and AI: Writing Java Overrides Most TLA+ operators are resolved via TLA+ interpreters, but you can also implement them in "native" Java. This lets you escape the standard language semantics and add capabilities like executing programs during model-checking or dynamically constrain the depth of the searched state space. There's a lot of cool things I think would be possible with overrides. The problem is there's only a handful of people in the world who know how to write them. But that handful have written quite a few overrides and I think there's enough there for Claude to work with. Writing specs, given a reference mechanism In all my experiments, the LLM only had my prompts and the occasional Python script as information. That makes me suspect that some of its problems with writing and fixing specs come down to not having a system model. Maybe it wouldn't suggest fixes like "these processes never race" if it had a design doc saying that the processes can't coordinate. (Could a Sufficiently Powerful LLM derive some TLA+ specification from a design document?) Connecting specs and code This is the holy grail of TLA+: taking a codebase and showing it correctly implements a spec. Currently the best ways to do this are by either using TLA+ to generate a test suite, or by taking logged production traces and matching them to TLA+ behaviors. This blog post discusses both. While I've seen a lot of academic research into these approaches there are no industry-ready tools. So if you want trace validation you have to do a lot of manual labour tailored to your specific product. If LLMs could do some of this work for us then that'd really amplify the usefulness of TLA+ to many companies. Thoughts Right now, agents seem good at the tedious and routine parts of TLA+ and worse at the strategic and abstraction parts. But, since the routine parts are often a huge barrier to beginners, this means that LLMs have the potential to make TLA+ far, far more accessible than it previously was. I have mixed thoughts on this. As an advocate, this is incredible. I want more people using formal specifications because I believe it leads to cheaper, safer, more reliable software. Anything that gets people comfortable with specs is great for our industry. As a professional TLA+ consultant, I'm worried that this obsoletes me. Most of my income comes from training and coaching, which companies will have far less demand of now. Then again, maybe this an opportunity to pitch "agentic TLA+ training" to companies! Anyway, if you're interested in TLA+, there has never been a better time to try it. I mean it, these tools handle so much of the hard part now. I've got a free book available online, as does the inventor of TLA+. I like this guide too. Happy modeling! Dayenu. ↩

10 hours ago 1 votes
The future of react-router just got a lot brighter (tip)

Today we go over the recent announcement blog post of react-router where they talk about their open governance model and what the future will look like

5 hours ago 1 votes
Notes from Andreas Fredriksson’s “Context is Everything”

I quite enjoyed this talk. Some of the technical details went over my head (I don’t know what “split 16-bit mask into two 8-bit LTUs” means) but I could still follow the underlying point. First off, Andreas has a great story at the beginning about how he has a friend with a browser bookmarklet that replaces every occurrence of the word “dependency” with the word “liability”. Can you imagine npm working that way? Inside package.json: { "liabilities": { "react": "^19.0.0", "typescript": "^5.0.0" }, "devLiabilities": {...} } But I digress, back to Andreas. He points out that the context of your problems and the context of someone else’s problems do not overlap as often as we might think. It’s so unlikely that someone else tried to solve exactly our same problem with exactly our same constraints that [their solution or abstraction] will be the most economical or the best choice for us. It might be ok, but it won’t be the best thing. So while we immediately jump to tools built by others, the reality is that their tools were built for their problems and therefore won’t overlap with our problems as much or as often as we’re led to believe. In Andreas’ example, rather than using a third-party library to parse JSON and turn it into something, he writes his own bespoke parser for the problem at hand. His parser ignores a whole swath of abstractions a more generalized parser solves for, and guess what? His is an order of magnitude faster! Solving problems in the wrong domain and then glueing things together is always much, much worse [in terms of performance] than solving for what you actually need to solve. It’s fun watching him step through the performance gains as he goes from a generalized solution to one more tailored to his own specific context. What really resonates in his step-by-step process is how, as problems present themselves, you see how much easier it is to deal with performance issues for stuff you wrote vs. stuff others wrote. Not only that, but you can debug way faster! (Just think of the last time you tried to debug a file 1) you wrote, vs. 2) one you vendored vs. 3) one you installed deep down in node_modules somewhere.) Andreas goes from 41MB/s throughput to 1.25GB/s throughput without changing the behavior of the program. He merely removed a bunch of generalized abstractions he wasn’t using and didn’t need. Surprise, surprise: not doing unnecessary things is faster! You should always consider the unique context of your situation and weigh trade-offs. A “generic” solution means a solution “not tuned for your use case”. Email · Mastodon · Bluesky

2 days ago 2 votes
Adding graphics support to DandeGUI

<![CDATA[DandeGUI now does graphics and this is what it looks like. Some text and graphics output windows created with DandeGUI on Medley Interlisp. In addition to the square root table text output demo, I created the other graphics windows with the newly implemented functionality. For example, this code draws the random circles of the top window: (DEFUN RANDOM-CIRCLES (&KEY (N 200) (MAX-R 50) (WIDTH 640) (HEIGHT 480)) (LET ((RANGE-X (- WIDTH ( 2 MAX-R))) (RANGE-Y (- HEIGHT ( 2 MAX-R))) (SHADES (LIST IL:BLACKSHADE IL:GRAYSHADE (RANDOM 65536)))) (DANDEGUI:WITH-GRAPHICS-WINDOW (STREAM :TITLE "Random Circles") (DOTIMES (I N) (DECLARE (IGNORE I)) (IL:FILLCIRCLE (+ MAX-R (RANDOM RANGE-X)) (+ MAX-R (RANDOM RANGE-Y)) (RANDOM MAX-R) (ELT SHADES (RANDOM 3)) STREAM))))) GUI:WITH-GRAPHICS-WINDOW, GUI:OPEN-GRAPHICS-STREAM, and GUI:WITH-GRAPHICS-STREAM are the main additions. These functions and macros are the equivalent for graphics of what GUI:WITH-OUTPUT-TO-WINDOW, GUI:OPEN-WINDOW-STREAM, and GUI:WITH-WINDOW-STREAM, respectively, do for text. The difference is the text facilities send output to TEXTSTREAM streams whereas the graphics facilities to IMAGESTREAM, a type of device-independent graphics streams. Under the hood DandeGUI text windows are customized TEdit windows with an associated TEXTSTREAM. TEdit is the rich text editor of Medley Interlisp. Similarly, the graphics windows of DandeGUI run the Sketch line drawing editor under the hood. Sketch windows have an IMAGESTREAM which Interlisp graphics primitives like IL:DRAWLINE and IL:DRAWPOINT accept as an output destination. DandeGUI creates and manages Sketch windows with the type of stream the graphics primitives require. In other words, IMAGESTREAM is to Sketch what TEXTSTREAM is to TEdit. The benefits of programmatically using Sketch for graphics are the same as TEdit windows for text: automatic window repainting, scrolling, and resizing. The downside is overhead. Scrolling more than a few thousand graphics elements is slow and adding even more may crash the system. However, this is an acceptable tradeoff. The new graphics functions and macros work similarly to the text ones, with a few differences. First, DandeGUI now depends on the SKETCH and SKETCH-STREAM library modules which it automatically loads. Since Sketch has no notion of a read-only drawing area GUI:OPEN-GRAPHICS-STREAM achieves the same effect by other means: (DEFUN OPEN-GRAPHICS-STREAM (&KEY (TITLE "Untitled")) "Open a new window and return the associated IMAGESTREAM to send graphics output to. Sets the window title to TITLE if supplied." (LET ((STREAM (IL:OPENIMAGESTREAM '|Untitled| 'IL:SKETCH '(IL:FONTS ,DEFAULT-FONT*))) (WINDOW (IL:\\SKSTRM.WINDOW.FROM.STREAM STREAM))) (IL:WINDOWPROP WINDOW 'IL:TITLE TITLE) ;; Disable left and middle-click title bar menu (IL:WINDOWPROP WINDOW 'IL:BUTTONEVENTFN NIL) ;; Disable sketch editing via right-click actions (IL:WINDOWPROP WINDOW 'IL:RIGHTBUTTONFN NIL) ;; Disable querying the user whether to save changes (IL:WINDOWPROP WINDOW 'IL:DONTQUERYCHANGES T) STREAM)) Only the mouse gestures and commands of the middle-click title bar menu and the right-click menu change the drawing area interactively. To disable these actions GUI:OPEN-GRAPHICS-STREAM removes their menu handlers by setting to NIL the window properties IL:BUTTONEVENTFN and IL:RIGHTBUTTONFN. This way only programmatic output can change the drawing area. The function also sets IL:DONTQUERYCHANGES to T to prevent querying whether to save the changes at window close. By design output to DandeGUI windows is not permanent, so saving isn't necessary. GUI:WITH-GRAPHICS-STREAM and GUI:WITH-GRAPHICS-WINDOW are straightforward: (DEFMACRO WITH-GRAPHICS-STREAM ((VAR STREAM) &BODY BODY) "Perform the operations in BODY with VAR bound to the graphics window STREAM. Evaluates the forms in BODY in a context in which VAR is bound to STREAM which must already exist, then returns the value of the last form of BODY." `(LET ((,VAR ,STREAM)) ,@BODY)) (DEFMACRO WITH-GRAPHICS-WINDOW ((VAR &KEY TITLE) &BODY BODY) "Perform the operations in BODY with VAR bound to a new graphics window stream. Creates a new window titled TITLE if supplied, binds VAR to the IMAGESTREAM associated with the window, and executes BODY in this context. Returns the value of the last form of BODY." `(WITH-GRAPHICS-STREAM (,VAR (OPEN-GRAPHICS-STREAM :TITLE (OR ,TITLE "Untitled"))) ,@BODY)) Unlike GUI:WITH-TEXT-STREAM and GUI:WITH-TEXT-WINDOW, which need to call GUI::WITH-WRITE-ENABLED to establish a read-only environment after every output operation, GUI:OPEN-GRAPHICS-STREAM can do this only once at window creation. GUI:CLEAR-WINDOW, GUI:WINDOW-TITLE, and GUI:PRINT-MESSAGE now work with graphics streams in addition to text streams. For IMAGESTREAM arguments GUI:PRINT-MESSAGE prints to the system prompt window as Sketch stream windows have no prompt area. The random circles and fractal triangles graphics demos round up the latest additions. #DandeGUI #CommonLisp #Interlisp #Lisp a href="https://remark.as/p/journal.paoloamoroso.com/adding-graphics-support-to-dandegui"Discuss.../a Email | Reply @amoroso@oldbytes.space !--emailsub--]]>

2 days ago 4 votes
Exploring the web in 1995

By the end of 1995, the web moved outward and into the hands of everyone. The post Exploring the web in 1995 appeared first on The History of the Web.

3 days ago 5 votes