Optimizing calling Windows DLL functions in Go

from Krzysztof Kowalczyk blog [alt+shift+b] in programming

Go team wrote golang.org/x/sys/windows package to call functions in a Windows DLL. Their way is inefficient and this article describes a better way. The sys/windows way To call a function in a DLL, let’s say kernel32.dll, we must: load the dll into memory with LoadLibrary get the address of a function in the dll call the function at that address Here’s how it looks when you use sys/windows library: var ( libole32 *windows.LazyDLL coCreateInstance *windows.LazyProc ) func init() { libole32 = windows.NewLazySystemDLL("ole32.dll") coCreateInstance = libole32.NewProc("CoCreateInstance") } func CoCreateInstance(rclsid *GUID, pUnkOuter *IUnknown, dwClsContext uint32, riid *GUID, ppv *unsafe.Pointer) HRESULT { ret, _, _ := syscall.SyscallN(coCreateInstance.Addr(), 5, uintptr(unsafe.Pointer(rclsid)), uintptr(unsafe.Pointer(pUnkOuter)), uintptr(dwClsContext), uintptr(unsafe.Pointer(riid)), uintptr(unsafe.Pointer(ppv)), 0, ) return HRESULT(ret) } The...

a month ago

Remove from reading list Add to reading list [alt+a] Read now [→]

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from Krzysztof Kowalczyk blog

Fixing Zed's debugger keybindings

Zed has a good debugger built-in but their keybindings are crazy. Thankfully, you can remap them by opening menu Zed / Settings / Open key bindings and adding this to the file: { "use_key_equivalents": true, "bindings": { "f11": "debugger::StepInto", "f10": "debugger::StepOver" // "shift shift": "file_finder::Toggle" } }, Those are standard Windows debugging keybindings. F7 and Ctrl + F11 just don’t make sense to me.

2 weeks ago • 13 votes

Ideas for faster web dev cycle

I strongly believe that fast iteration cycle is important for productivity. In programming fast iteration means: how quickly can I go from finishing writing the code to testing it. That’s why I don’t like languages that compile slowly: slow compilation puts a hard limit on your iteration cycle. That’s a general principle. How does it translate to actions? I’m glad you asked. Simulated errors I’m writing a web app and some code paths might fail e.g. server returns an error response. When that happens I want to display an error message to the user. I want to test that but server doesn’t typically return errors. I can modify the server to return error and restart it but in a compiled language like Go it’s a whole thing. Instead I can force error condition in the code. Because web dev typically offers hot reload of code, I can modify the code to pretend the request failed, save the code, reload the app and I’m testing the error handling. To make it less ad-hoc another strategy is to have debug flags on window object e.g.: window.debug.simulateError = false Then the code will have: try { if (window.debug.simulateError) { throw Error("fetch failed"); } let rsp = await fetch(...); } That way I can toggle window.debug.simulateError in dev tools console, without changing the code. I have to repeat this code for every fetch(). More principled approach is: async function myFetch(uri, opts) { if (window.debug.simulateError) { throw new Error("featch failed"); } return await fetch(uri, opts); } To go even further, we could simulate different kinds of network errors: Response.ok is false response is 404 or 500 failed to reach the server We can change simulateError from bool to a number and have: async function myFetch(uri, opts) { let se = window.debug.simulateError; if (se === 1) { // simulate Response.ok is false return ...; } if (se === 2) { // simulate 404 response return; } if (se === 3) { // simulate network offline } return await fetch(uri, opts); } Start by show dialog Let’s say I’m working on a new dialog e.g. rename dialog. To get to that dialog I have to perform some UI action e.g. use context menu and click the Rename note menu item. Not a big deal but I’m still working on the dialog so it’s a bit annoying to repeat that UI action every time I move the button to the right, to see how it’ll look. In Svelte we do: let showingRenameDialog = $state(false); function showRenameDialog() { showingRenameDialog = true; } {#if showingRenameDialog} <RenameDialog ...></RenameDialog> {'if} To speed up dev cycle: let showingRenameDialog = $state(true); // show while we're working ont While I’m still designing and writing code for the dialog, show it by default. That way app reloads due to changing code won’t require you to manually redo UI actions to trigger the dialog. Ad-hoc test code Let’s say you’re writing a non-trivial code server code to process a request. Let’s say the request is a POST with body containing zip file with images which the server needs to unzip, resize the files, save them to a file system. You want to test it as you implement the logic but iteration cycle is slow: you write the code recompile and restart the server go through UI actions to send the request Most of the code (unpacking zip file, resizing images) can be tested without doing the request. So you isolate the function: func resizeImagesInZip(zipData []byte) error { // the code you're writing on } Now you have to trigger it easily. The simplest way is ad-hoc test code: func main() { if true { zipData := loadTestZipFileMust() resizeImagesInZip(zipData); return; } } While you’re working on the code, the server just runs the test code. When you’re done, you switch it off: func main() { if false { // leave the code so that you can re-enable it // easily in the future zipData := loadTestZipFileMust() resizeImagesInZip(zipData); return; } } Those are few tactical tips to increase dev cycles. You can come up with more such ideas by asking yourself: how can I speed iteration cycle?

2 weeks ago • 12 votes

AltTab for Mac OS

AltTab solves a simple problem really well: it brings Windows-style Alt + Tab window switching to Mac OS. That’s it. That’s all there is to it. I think Windows way of Alt - tabbing windows is better than Mac’s. It’s a must for someone like me, who switches between using Windows and Mac OS. That small difference drove me mad when using Mac so super thankful someone wrote an app that fixes it. It’s free and open source so no excuse to not to use it.

2 weeks ago • 12 votes

Stage manager in Mac OS

Stage Manager in Mac OS is not a secret, but I’ve only learned about it recently. It’s off by default so you have to enable it in system settings: It’s hard to describe in words, so you’ll have to try it. And experiment a bit because I didn’t get in the first hour. It’s a certain way to manage windows for less clutter. Imagine you have a browser, an editor and a terminal i.e. 3 windows. Might be annoying to have them all shown on screen. When Stage Manager is enabled, thumbnails of windows are on the left edge of the screen and you can switch to the window by clicking the thumbnail. By default Stage Manager shows each window by itself on screen. You can group them by dragging thumbnail onto screen. For example, when I’m developing, I’m using text editor and terminal so I will group them so they are both visible on screen at the same time, but not the other apps. So far I’m enjoying using Stage Manager.

2 weeks ago • 13 votes

Zed debug setup for go server / Svelte web app

Today I figured out how to setup Zed to debug, at the same time, my go server and Svelte web app. My dev setup for working on my web app is: go server is run with -run-dev arg go server provides backend apis and proxies requests it doesn’t handle to a vite dev server that does the serving of JavaScript etc. files from my Svelte code go server in -run-dev mode automatically launches vite dev go server runs on port 9339 It’s possible to setup Zed to debug both server and frontend JavaScript code. In retrospect it’s simple, but took me a moment to figure out. I needed to create the following .zed/debug.json: // Project-local debug tasks // // For more documentation on how to configure debug tasks, // see: https://zed.dev/docs/debugger [ { "adapter": "Delve", "label": "run go server", "request": "launch", "mode": "debug", "program": ".", "cwd": "${ZED_WORKTREE_ROOT}", "args": ["-run-dev", "-no-open"], "buildFlags": [], "env": {} }, { "adapter": "JavaScript", "label": "Debug in Chrome", "type": "chrome", "request": "launch", "url": "http://localhost:9339/", "webRoot": "$ZED_WORKTREE_ROOT/src", "console": "integratedTerminal", "skipFiles": ["<node_internals>/**"] } ] It’s mostly self-exploratory. First entry tells Zed to build go program with go build . and run the resulting executable under the debugger with -run-dev -no-open args. Second entry tells to launch Chrome in debug mode with http://localhost:9339/ and that files seen by Chrome come from src/ directory i.e. if browser loads /foo.js the source file is src/foo.js. This is necessary to be able to set breakpoints in Zed and have them propagate to Chrome. This eliminates the need for terminal so I can edit and debug with just Zed and Chrome. This is a great setup. I’m impressed with Zed.

2 weeks ago • 15 votes

More in programming

YouTube has earned its crown

I often give Google a lot of shit for shutting down services whenever they're bored, hire a new executive, or face a three-day weekend. The company seems institutionally incapable of standing behind the majority of the products they launch for longer than a KPI cycle. But when the company does decide that something is pivotal to the business, it's an entirely different story. And that's the tale of YouTube: The King of Internet Archives (Video Edition). I've just revived my YouTube channel after realizing just how often video has become my go-to for learning. This entire Linux adventure I've gotten myself into started by watching YouTube creators like ThePrimeagen, Typecraft, and Bread on Penguins. I learned about mechanical keyboards from Hipyo Tech. Devoured endless mini PC reviews from Level1Techs and Robtech. Oh, and took a side quest into retro gaming handhelds with Retro Game Corps. But it was when putting together the playlists for my own channel that YouTube's royal role in internet archival really stood out. Like with the original Rails Demo from 19 years ago(!), the infamous talk at Startup School from 2009, or my very first RailsConf keynote from 2006. You'd be hard-pressed to find any video content on the internet from those days anywhere else. I notice that with podcast appearances from even just a few years ago that have gone missing already. Decentralization is wonderful in many ways, but it's very much subject to link rot and disappearing content. I love how you can pull in videos from other channels onto your own page as well. I've gathered up a bunch of the many podcast appearances I've done, and even dedicated an entire playlist to the 69(!!) clips from the Lex Fridman interview. The majority of the RailsConf and Rails World keynotes are on a list. So is the old On Writing Software Well series that I keep meaning to bring back. When you're working in small tech, it's really easy to become so jaded with big tech that you become ideologically blind to the benefits they do bring. I find no inconsistency in cheering much of the antitrust agenda against Google while also celebrating their work on Chrome or their stewardship of YouTube. Any company as large as Google is bound to be full of contradictions, ambitions, and behaviors. We ought to have the capacity to cheer for the good parts and boo at the bad parts without feeling like frauds. So today, I choose to cheer for YouTube. It's an international treasure of learning, enthusiasm, and discovery.

23 hours ago • 4 votes

Writing: Blog Posts and Songs

I was listening to a podcast interview with the Jackson Browne (American singer/songwriter, political activist, and inductee into the Rock and Roll Hall of Fame) and the interviewer asks him how he approaches writing songs with social commentaries and critiques — something along the lines of: “How do you get from the New York Times headline on a social subject to the emotional heart of a song that matters to each individual?” Browne discusses how if you’re too subtle, people won’t know what you’re talking about. And if you’re too direct, you run the risk of making people feel like they’re being scolded. Here’s what he says about his songwriting: I want this to sound like you and I were drinking in a bar and we’re just talking about what’s going on in the world. Not as if you’re at some elevated place and lecturing people about something they should know about but don’t but [you think] they should care. You have to get to people where [they are, where] they do care and where they do know. I think that’s a great insight for anyone looking to have a connecting, effective voice. I know for me, it’s really easily to slide into a lecturing voice — you “should” do this and you “shouldn’t” do that. But I like Browne’s framing of trying to have an informal, conversational tone that meets people where they are. Like you’re discussing an issue in the bar, rather than listening to a sermon. Chris Coyier is the canonical example of this that comes to mind. I still think of this post from CSS Tricks where Chris talks about how to have submit buttons that go to different URLs: When you submit that form, it’s going to go to the URL /submit. Say you need another submit button that submits to a different URL. It doesn’t matter why. There is always a reason for things. The web is a big place and all that. He doesn’t conjure up some universally-applicable, justified rationale for why he’s sharing this method. Nor is there any pontificating on why this is “good” or “bad”. Instead, like most of Chris’ stuff, I read it as a humble acknowledgement of the practicalities at hand — “Hey, the world is a big place. People have to do crafty things to make their stuff work. And if you’re in that situation, here’s something that might help what ails ya.” I want to work on developing that kind of a voice because I love reading voices like that. Email · Mastodon · Bluesky

2 days ago • 4 votes

Doing versus Delegating

A staff+ skill

2 days ago • 7 votes

p-fast trie, but smaller

Previously, I wrote some sketchy ideas for what I call a p-fast trie, which is basically a wide fan-out variant of an x-fast trie. It allows you to find the longest matching prefix or nearest predecessor or successor of a query string in a set of names in O(log k) time, where k is the key length. My initial sketch was more complicated and greedy for space than necessary, so here’s a simplified revision. (“p” now stands for prefix.) layout A p-fast trie stores a lexicographically ordered set of names. A name is a sequence of characters from some small-ish character set. For example, DNS names can be represented as a set of about 50 letters, digits, punctuation and escape characters, usually one per byte of name. Names that are arbitrary bit strings can be split into chunks of 6 bits to make a set of 64 characters. Every unique prefix of every name is added to a hash table. An entry in the hash table contains: A shared reference to the closest name lexicographically greater than or equal to the prefix. Multiple hash table entries will refer to the same name. A reference to a name might instead be a reference to a leaf object containing the name. The length of the prefix. To save space, each prefix is not stored separately, but implied by the combination of the closest name and prefix length. A bitmap with one bit per possible character, corresponding to the next character after this prefix. For every other prefix that matches this prefix and is one character longer than this prefix, a bit is set in the bitmap corresponding to the last character of the longer prefix. search The basic algorithm is a longest-prefix match. Look up the query string in the hash table. If there’s a match, great, done. Otherwise proceed by binary chop on the length of the query string. If the prefix isn’t in the hash table, reduce the prefix length and search again. (If the empty prefix isn’t in the hash table then there are no names to find.) If the prefix is in the hash table, check the next character of the query string in the bitmap. If its bit is set, increase the prefix length and search again. Otherwise, this prefix is the answer. predecessor Instead of putting leaf objects in a linked list, we can use a more complicated search algorithm to find names lexicographically closest to the query string. It’s tricky because a longest-prefix match can land in the wrong branch of the implicit trie. Here’s an outline of a predecessor search; successor requires more thought. During the binary chop, when we find a prefix in the hash table, compare the complete query string against the complete name that the hash table entry refers to (the closest name greater than or equal to the common prefix). If the name is greater than the query string we’re in the wrong branch of the trie, so reduce the length of the prefix and search again. Otherwise search the set bits in the bitmap for one corresponding to the greatest character less than the query string’s next character; if there is one remember it and the prefix length. This will be the top of the sub-trie containing the predecessor, unless we find a longer match. If the next character’s bit is set in the bitmap, continue searching with a longer prefix, else stop. When the binary chop has finished, we need to walk down the predecessor sub-trie to find its greatest leaf. This must be done one character at a time – there’s no shortcut. thoughts In my previous note I wondered how the number of search steps in a p-fast trie compares to a qp-trie. I have some old numbers measuring the average depth of binary, 4-bit, 5-bit, 6-bit and 4-bit, 5-bit, dns qp-trie variants. A DNS-trie varies between 7 and 15 deep on average, depending on the data set. The number of steps for a search matches the depth for exact-match lookups, and is up to twice the depth for predecessor searches. A p-fast trie is at most 9 hash table probes for DNS names, and unlikely to be more than 7. I didn’t record the average length of names in my benchmark data sets, but I guess they would be 8–32 characters, meaning 3–5 probes. Which is far fewer than a qp-trie, though I suspect a hash table probe takes more time than chasing a qp-trie pointer. (But this kind of guesstimate is notoriously likely to be wrong!) However, a predecessor search might need 30 probes to walk down the p-fast trie, which I think suggests a linked list of leaf objects is a better option.

2 days ago • 4 votes

Software books I wish I could read

New Logic for Programmers Release! v0.11 is now available! This is over 20% longer than v0.10, with a new chapter on code proofs, three chapter overhauls, and more! Full release notes here. Software books I wish I could read I'm writing Logic for Programmers because it's a book I wanted to have ten years ago. I had to learn everything in it the hard way, which is why I'm ensuring that everybody else can learn it the easy way. Books occupy a sort of weird niche in software. We're great at sharing information via blogs and git repos and entire websites. These have many benefits over books: they're free, they're easily accessible, they can be updated quickly, they can even be interactive. But no blog post has influenced me as profoundly as Data and Reality or Making Software. There is no blog or talk about debugging as good as the Debugging book. It might not be anything deeper than "people spend more time per word on writing books than blog posts". I dunno. So here are some other books I wish I could read. I don't think any of them exist yet but it's a big world out there. Also while they're probably best as books, a website or a series of blog posts would be ok too. Everything about Configurations The whole topic of how we configure software, whether by CLI flags, environmental vars, or JSON/YAML/XML/Dhall files. What causes the configuration complexity clock? How do we distinguish between basic, advanced, and developer-only configuration options? When should we disallow configuration? How do we test all possible configurations for correctness? Why do so many widespread outages trace back to misconfiguration, and how do we prevent them? I also want the same for plugin systems. Manifests, permissions, common APIs and architectures, etc. Configuration management is more universal, though, since everybody either uses software with configuration or has made software with configuration. The Big Book of Complicated Data Schemas I guess this would kind of be like Schema.org, except with a lot more on the "why" and not the what. Why is important for the Volcano model to have a "smokingAllowed" field?1 I'd see this less as "here's your guide to putting Volcanos in your database" and more "here's recurring motifs in modeling interesting domains", to help a person see sources of complexity in their own domain. Does something crop up if the references can form a cycle? If a relationship needs to be strictly temporary, or a reference can change type? Bonus: path dependence in data models, where an additional requirement leads to a vastly different ideal data model that a company couldn't do because they made the old model. (This has got to exist, right? Business modeling is a big enough domain that this must exist. Maybe The Essence of Software touches on this? Man I feel bad I haven't read that yet.) Computer Science for Software Engineers Yes, I checked, this book does not exist (though maybe this is the same thing). I don't have any formal software education; everything I know was either self-taught or learned on the job. But it's way easier to learn software engineering that way than computer science. And I bet there's a lot of other engineers in the same boat. This book wouldn't have to be comprehensive or instructive: just enough about each topic to understand why it's an area of study and appreciate how research in it eventually finds its way into practice. MISU Patterns MISU, or "Make Illegal States Unrepresentable", is the idea of designing system invariants in the structure of your data. For example, if a Contact needs at least one of email or phone to be non-null, make it a sum type over EmailContact, PhoneContact, EmailPhoneContact (from this post). MISU is great. Most MISU in the wild look very different than that, though, because the concept of MISU is so broad there's lots of different ways to achieve it. And that means there are "patterns": smart constructors, product types, properly using sets, newtypes to some degree, etc. Some of them are specific to typed FP, while others can be used in even untyped languages. Someone oughta make a pattern book. My one request would be to not give them cutesy names. Do something like the Aarne–Thompson–Uther Index, where items are given names like "Recognition by manner of throwing cakes of different weights into faces of old uncles". Names can come later. The Tools of '25 Not something I'd read, but something to recommend to junior engineers. Starting out it's easy to think the only bit that matters is the language or framework and not realize the enormous amount of surrounding tooling you'll have to learn. This book would cover the basics of tools that enough developers will probably use at some point: git, VSCode, very basic Unix and bash, curl. Maybe the general concepts of tools that appear in every ecosystem, like package managers, build tools, task runners. That might be easier if we specialize this to one particular domain, like webdev or data science. Ideally the book would only have to be updated every five years or so. No LLM stuff because I don't expect the tooling will be stable through 2026, to say nothing of 2030. A History of Obsolete Optimizations Probably better as a really long blog series. Each chapter would be broken up into two parts: A deep dive into a brilliant, elegant, insightful historical optimization designed to work within the constraints of that era's computing technology What we started doing instead, once we had more compute/network/storage available. c.f. A Spellchecker Used to Be a Major Feat of Software Engineering. Bonus topics would be brilliance obsoleted by standardization (like what people did before git and json were universal), optimizations we do today that may not stand the test of time, and optimizations from the past that did. Sphinx Internals I need this. I've spent so much goddamn time digging around in Sphinx and docutils source code I'm gonna throw up. Systems Distributed Talk Today! Online premier's at noon central / 5 PM UTC, here! I'll be hanging out to answer questions and be awkward. You ever watch a recording of your own talk? It's real uncomfortable! In this case because it's a field on one of Volcano's supertypes. I guess schemas gotta follow LSP too ↩

2 days ago • 9 votes

New here?