More from charity.wtf
This article was originally commissioned by Luca Rossi (paywalled) for refactoring.fm, on February 11th, 2025. Luca edited a version of it that emphasized the importance of building “10x engineering teams” . It was later picked up by IEEE Spectrum (!!!), who scrapped most of the teams content and published a different, shorter piece on March […]
A few eagle-eyed readers have noticed that it’s been 4 weeks since my last entry in what I have been thinking of as my “niblet series” — one small piece per week, 1000 words or less, for the next three months. This is true. However, I did leave myself some wiggle room in my original […]
Hi friends! We’re on week three of my 12-week practice in writing one bite-sized topic per week — scoping it down, writing straight through, trying real hard to avoid over-writing or editing down to a pulp. Week 1 — “On Writing, Social Media, and Finding the Line of Embarrassment” Week 2 — “On Dropouts and […]
In my early twenties I had a cohort of friends and coworkers, all Silicon Valley engineers, all quite good at their jobs, all college dropouts. We developed a shared conviction that only losers got computer science degrees. This sounds like a joke, or a self-defense mechanism, but it was neither. We were serious. We held […]
Brace yourself, because I’m about to utter a sequence of words I never thought I would hear myself say: I really miss posting on Twitter. I really, really miss it. It’s funny, because Twitter was never not a trash fire. There was never a time when it felt like we were living through some kind […]
More in programming
The use of std::string should be banned in C++ code bases. I’m sure this statement sounds like heresy and you want to burn me at the stake. But is it really controversial? Java, C#, Go, JavaScript, Python, Ruby, PHP: they all have immutable strings that are basically 2 machine words: a pointer to the string data and the size of the string. If they have an equivalent of std::string, it’s something like StringBuilder. C++ should also use immutable strings in 97% of situations. The problem is gravity: the existing code, the culture. They all pull you strongly towards std::string, and going against the current is the hardest thing there is. There isn’t a standard type for that. You can use the newish std::span<const char>, but there really should be a std::str (or some such). I did that in SumatraPDF, where I mostly pass char*, but I don’t expect many other C++ code bases to switch away from std::string.
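The "2 machine words" claim is easy to check for Go, one of the languages listed above. The following is a minimal sketch, not taken from the original post; the helper name stringHeaderWords is made up for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringHeaderWords reports how many machine words a Go string value
// occupies, not counting the bytes of the text it points to.
// (Hypothetical helper, named here only for illustration.)
func stringHeaderWords() uintptr {
	var s string
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

func main() {
	long := "a fairly long string whose bytes live somewhere else in memory"
	// The string value itself is the same size no matter how long the text is.
	fmt.Println("words per string value:", stringHeaderWords())
	fmt.Println("same size as empty string:", unsafe.Sizeof(long) == unsafe.Sizeof(""))
}
```

Passing such a value around copies only the pointer and the length, never the text.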
The Go team wrote the golang.org/x/sys/windows package to call functions in Windows DLLs. Their approach is memory-inefficient, and this article describes a better way.

The sys/windows way

To call a function in a DLL, let’s say kernel32.dll, we must:

    load the DLL into memory with LoadLibrary
    get the address of a function in the DLL
    call the function at that address

Here’s how it looks when you use the sys/windows library:

    var (
        libole32         *windows.LazyDLL
        coCreateInstance *windows.LazyProc
    )

    func init() {
        libole32 = windows.NewLazySystemDLL("ole32.dll")
        coCreateInstance = libole32.NewProc("CoCreateInstance")
    }

    func CoCreateInstance(rclsid *GUID, pUnkOuter *IUnknown, dwClsContext uint32, riid *GUID, ppv *unsafe.Pointer) HRESULT {
        // SyscallN takes the function address followed by the arguments;
        // it needs neither an argument count nor zero padding.
        ret, _, _ := syscall.SyscallN(coCreateInstance.Addr(),
            uintptr(unsafe.Pointer(rclsid)),
            uintptr(unsafe.Pointer(pUnkOuter)),
            uintptr(dwClsContext),
            uintptr(unsafe.Pointer(riid)),
            uintptr(unsafe.Pointer(ppv)),
        )
        return HRESULT(ret)
    }

The problem

The problem is that this is memory-inefficient. For every function, all we need is:

    the name of the function, to get its address in a DLL. That is a string, so it’s 8 bytes (address of the string) + 8 bytes (size of the string) + the content of the string.
    the address of the function, which is 8 bytes on a 64-bit CPU

Unfortunately, in sys/windows each function requires this:

    type LazyProc struct {
        Name string
        mu   sync.Mutex
        l    *LazyDLL
        proc *Proc
    }

    type Proc struct {
        Dll  *DLL
        Name string
        addr uintptr
    }

    // sync.Mutex
    type Mutex struct {
        _  noCopy
        mu isync.Mutex
    }

    // isync.Mutex
    type Mutex struct {
        state int32
        sema  uint32
    }

Let’s eyeball the size of all those structures on a 64-bit CPU:

    LazyProc: 16 + sizeof(Mutex) + 8 + 8 = 32 + sizeof(Mutex)
    Proc: 8 + 16 + 8 = 32
    Mutex: 8

Total: 32 + 32 + 8 = 72 bytes per function, and that’s not counting possible memory padding for allocations. Windows has a lot of functions, so this adds up.

Additionally, at startup we call NewProc for every function, even those the program never uses. This increases startup time.
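The eyeballed sizes can be checked with unsafe.Sizeof. The sketch below uses local stand-in types (dll, lazyDLL, proc, lazyProc) that only copy the field layout quoted above. They are not the real golang.org/x/sys/windows types, so the result holds only as long as the real structs keep that layout.

```go
package main

import (
	"fmt"
	"sync"
	"unsafe"
)

// Local stand-ins that mirror the field layout quoted in the article.
// They are assumptions for illustration, not the sys/windows types.
type dll struct{}

type lazyDLL struct{}

type proc struct {
	Dll  *dll
	Name string
	addr uintptr
}

type lazyProc struct {
	Name string
	mu   sync.Mutex
	l    *lazyDLL
	proc *proc
}

// perFunctionBytes adds up the fixed cost of one lazyProc plus the proc
// it eventually points to, excluding the bytes of the name itself.
func perFunctionBytes() uintptr {
	return unsafe.Sizeof(lazyProc{}) + unsafe.Sizeof(proc{})
}

func main() {
	fmt.Println("lazyProc:", unsafe.Sizeof(lazyProc{}))
	fmt.Println("proc:    ", unsafe.Sizeof(proc{}))
	fmt.Println("total:   ", perFunctionBytes())
}
```

On a 64-bit platform the total should match the 72 bytes computed by hand; on a 32-bit platform the pointers and string headers are half the size, so the total is smaller.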
The better way

What we ultimately need is a uintptr for the address of the function. It’ll be looked up lazily. Let’s say we use 8 functions from ole32.dll. We can use a single array of uintptr values to store the function pointers:

    var oleFuncPtrs [8]uintptr
    var oleFuncNames = []string{"CoCreateInstance", "CoGetClassObject", ... }

    const kCoCreateInstance = 0
    const kCoGetClassObject = 1
    // etc.

    // sentinel stored in funcPtrs after a failed lookup
    const kFuncMissing = 1

    func funcAddrInDLL(dll *windows.LazyDLL, funcPtrs []uintptr, funcIdx int, funcNames []string) uintptr {
        addr := funcPtrs[funcIdx]
        if addr == kFuncMissing {
            // we already tried to look it up and didn't find it
            // this can happen because an older version of Windows might not implement this function
            return 0
        }
        if addr != 0 {
            return addr
        }
        // look up the function by name in the dll
        name := funcNames[funcIdx]
        // ...
        return addr
    }

In real life this would need protection against concurrent access, e.g. with a mutex.

Saving on strings

The following is not efficient:

    var oleFuncNames = []string{"CoCreateInstance", "CoGetClassObject", ... }

In addition to the text of each string, Go needs 16 bytes per name: 8 for a pointer to the string and 8 for the size of the string. We can be more efficient by storing all names as a single string:

    var oleFuncNames = `
    CoCreateInstance
    CoGetClassObject
    `

Only when we look up a function by name do we need to construct a temporary string that is a slice of oleFuncNames. We need to know the offset and size inside oleFuncNames, which we can cleverly encode as a single number:

    // Auto-generated shell procedure identifier: cache index | str start | str past-end.
    const (
        _PROC_SHCreateItemFromIDList      _PROC_SHELL = 0 | (9 << 16) | (31 << 32)
        _PROC_SHCreateItemFromParsingName _PROC_SHELL = 1 | (32 << 16) | (59 << 32)
        // ...
    )

We pack the info into a single number:

    bits 0-15: index of the function in the array of function pointers
    bits 16-31: start of the function name in the multi-name string
    bits 32-47: end (past-the-end offset) of the function name in the multi-name string

This technique requires code generation. It would be too difficult to write those numbers manually.

References

This technique is used in https://github.com/rodrigocfd/windigo, a Go library of Win32 bindings. See e.g. https://github.com/rodrigocfd/windigo/blob/master/internal/dll/dll_gdi.go
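To make the packed identifier and the lazy cache concrete, here is a minimal, platform-independent sketch of the idea. It is not the windigo code. The names procID, procCache and procMissing, the hand-computed offsets, and the resolve callback (a stand-in for the real DLL lookup, so the example runs anywhere) are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// procNames holds every function name in one string, so there is a single
// string header for the whole table instead of one per name.
const procNames = "CoCreateInstance\nCoGetClassObject\n"

// procID packs three fields into one number:
// bits 0-15 cache index, bits 16-31 name start, bits 32-47 name past-end.
type procID uint64

// Offsets are written by hand here; the article generates them with a tool.
const (
	procCoCreateInstance procID = 0 | (0 << 16) | (16 << 32)
	procCoGetClassObject procID = 1 | (17 << 16) | (33 << 32)
)

func (id procID) index() int { return int(id & 0xFFFF) }

// name slices the shared string; no new string data is allocated.
func (id procID) name() string {
	start := (id >> 16) & 0xFFFF
	end := (id >> 32) & 0xFFFF
	return procNames[start:end]
}

// procMissing marks a lookup that already failed, so it is not retried.
const procMissing = 1

type procCache struct {
	mu    sync.Mutex
	addrs [2]uintptr
	// resolve is a hypothetical stand-in for looking the name up in a DLL.
	// It returns 0 when the function does not exist.
	resolve func(name string) uintptr
}

// addr returns the cached address, resolving it on first use.
func (c *procCache) addr(id procID) uintptr {
	c.mu.Lock()
	defer c.mu.Unlock()
	switch cached := c.addrs[id.index()]; cached {
	case procMissing:
		return 0
	case 0:
		// not looked up yet; fall out of the switch and resolve below
	default:
		return cached
	}
	a := c.resolve(id.name())
	if a == 0 {
		c.addrs[id.index()] = procMissing
		return 0
	}
	c.addrs[id.index()] = a
	return a
}

func main() {
	calls := 0
	c := &procCache{resolve: func(name string) uintptr {
		calls++
		if name == "CoCreateInstance" {
			return 0x1000 // made-up address
		}
		return 0
	}}
	fmt.Println(procCoCreateInstance.name(), c.addr(procCoCreateInstance))
	fmt.Println(procCoGetClassObject.name(), c.addr(procCoGetClassObject))
	fmt.Println(procCoGetClassObject.name(), c.addr(procCoGetClassObject))
	fmt.Println("resolver calls:", calls)
}
```

A single mutex around the whole cache keeps the sketch short. Since each slot is one uintptr, atomic loads and stores would probably work as well, at the cost of occasionally resolving the same function twice.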
How a wild side-quest became the source of many of the articles you’ve read—and have come to expect—in this publication
Watch now | Privilege levels, syscall conventions, and how assembly code talks to the Linux kernel