The Halting Problem is a terrible example of NP-Harder

from Computer Things [alt+shift+b] in programming

Short one this time because I have a lot going on this week. In computation complexity, NP is the class of all decision problems (yes/no) where a potential proof (or "witness") for "yes" can be verified in polynomial time. For example, "does this set of numbers have a subset that sums to zero" is in NP. If the answer is "yes", you can prove it by presenting a set of numbers. We would then verify the witness by 1) checking that all the numbers are present in the set (~linear time) and 2) adding up all the numbers (also linear). NP-complete is the class of "hardest possible" NP problems. Subset sum is NP-complete. NP-hard is the set all problems at least as hard as NP-complete. Notably, NP-hard is not a subset of NP, as it contains problems that are harder than NP-complete. A natural question to ask is "like what?" And the canonical example of "NP-harder" is the halting problem (HALT): does program P halt on input C? As the argument goes, it's undecidable, so obviously not in NP. I...

3 months ago

Remove from reading list Add to reading list [alt+a] Read now [→]

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from Computer Things

Maybe writing speed actually is a bottleneck for programming

I'm a big (neo)vim buff. My config is over 1500 lines and I regularly write new scripts. I recently ported my neovim config to a new laptop. Before then, I was using VSCode to write, and when I switched back I immediately saw a big gain in productivity. People often pooh-pooh vim (and other assistive writing technologies) by saying that writing code isn't the bottleneck in software development. Reading, understanding, and thinking through code is! Now I don't know how true this actually is in practice, because empirical studies of time spent coding are all over the place. Most of them, like this study, track time spent in the editor but don't distinguish between time spent reading code and time spent writing code. The only one I found that separates them was this study. It finds that developers spend only 5% of their time editing. It also finds they spend 14% of their time moving or resizing editor windows, so I don't know how clean their data is. But I have a bigger problem with "writing is not the bottleneck": when I think of a bottleneck, I imagine that no amount of improvement will lead to productivity gains. Like if a program is bottlenecked on the network, it isn't going to get noticeably faster with 100x more ram or compute. But being able to type code 100x faster, even with without corresponding improvements to reading and imagining code, would be huge. We'll assume the average developer writes at 80 words per minute, at five characters a word, for 400 characters a minute.What could we do if we instead wrote at 8,000 words/40k characters a minute? Writing fast Boilerplate is trivial Why do people like type inference? Because writing all of the types manually is annoying. Why don't people like boilerplate? Because it's annoying to write every damn time. Programmers like features that help them write less! That's not a problem if you can write all of the boilerplate in 0.1 seconds. You still have the problem of reading boilerplate heavy code, but you can use the remaining 0.9 seconds to churn out an extension that parses the file and presents the boilerplate in a more legible fashion. We can write more tooling This is something I've noticed with LLMs: when I can churn out crappy code as a free action, I use that to write lots of tools that assist me in writing good code. Even if I'm bottlenecked on a large program, I can still quickly write a script that helps me with something. Most of these aren't things I would have written because they'd take too long to write! Again, not the best comparison, because LLMs also shortcut learning the relevant APIs, so also optimize the "understanding code" part. Then again, if I could type real fast I could more quickly whip up experiments on new apis to learn them faster. We can do practices that slow us down in the short-term Something like test-driven development significantly slows down how fast you write production code, because you have to spend a lot more time writing test code. Pair programming trades speed of writing code for speed of understanding code. A two-order-of-magnitude writing speedup makes both of them effectively free. Or, if you're not an eXtreme Programming fan, you can more easily follow the The Power of Ten Rules and blanket your code with contracts and assertions. We could do more speculative editing This is probably the biggest difference in how we'd work if we could write 100x faster: it'd be much easier to try changes to the code to see if they're good ideas in the first place. How often have I tried optimizing something, only to find out it didn't make a difference? How often have I done a refactoring only to end up with lower-quality code overall? Too often. Over time it makes me prefer to try things that I know will work, and only "speculatively edit" when I think it be a fast change. If I could code 100x faster it would absolutely lead to me trying more speculative edits. This is especially big because I believe that lots of speculative edits are high-risk, high-reward: given 50 things we could do to the code, 49 won't make a difference and one will be a major improvement. If I only have time to try five things, I have a 10% chance of hitting the jackpot. If I can try 500 things I will get that reward every single time. Processes are built off constraints There are just a few ideas I came up with; there are probably others. Most of them, I suspect, will share the same property in common: they change the process of writing code to leverage the speedup. I can totally believe that a large speedup would not remove a bottleneck in the processes we currently use to write code. But that's because those processes are developed work within our existing constraints. Remove a constraint and new processes become possible. The way I see it, if our current process produces 1 Utils of Software / day, a 100x writing speedup might lead to only 1.5 UoS/day. But there are other processes that produce only 0.5 UoS/d because they are bottlenecked on writing speed. A 100x speedup would lead to 10 UoS/day. The problem with all of this that 100x speedup isn't realistic, and it's not obvious whether a 2x improvement would lead to better processes. Then again, one of the first custom vim function scripts I wrote was an aid to writing unit tests in a particular codebase, and it lead to me writing a lot more tests. So maybe even a 2x speedup is going to be speed things up, too. Patreon Stuff I wrote a couple of TLA+ specs to show how to model fork-join algorithms. I'm planning on eventually writing them up for my blog/learntla but it'll be a while, so if you want to see them in the meantime I put them up on Patreon.

2 days ago • 6 votes

Logic for Programmers Turns One

I released Logic for Programmers exactly one year ago today. It feels weird to celebrate the anniversary of something that isn't 1.0 yet, but software projects have a proud tradition of celebrating a dozen anniversaries before 1.0. I wanted to share about what's changed in the past year and the work for the next six+ months. The Road to 0.1 I had been noodling on the idea of a logic book since the pandemic. The first time I wrote about it on the newsletter was in 2021! Then I said that it would be done by June and would be "under 50 pages". The idea was to cover logic as a "soft skill" that helped you think about things like requirements and stuff. That version sucked. If you want to see how much it sucked, I put it up on Patreon. Then I slept on the next draft for three years. Then in 2024 a lot of business fell through and I had a lot of free time, so with the help of Saul Pwanson I rewrote the book. This time I emphasized breadth over depth, trying to cover a lot more techniques. I also decided to self-publish it instead of pitching it to a publisher. Not going the traditional route would mean I would be responsible for paying for editing, advertising, graphic design etc, but I hoped that would be compensated by much higher royalties. It also meant I could release the book in early access and use early sales to fund further improvements. So I wrote up a draft in Sphinx, compiled it to LaTeX, and uploaded the PDF to leanpub. That was in June 2024. Since then I kept to a monthly cadence of updates, missing once in November (short-notice contract) and once last month (Systems Distributed). The book's now on v0.10. What's changed? A LOT v0.1 was very obviously an alpha, and I have made a lot of improvements since then. For one, the book no longer looks like a Sphinx manual. Compare! Also, the content is very, very different. v0.1 was 19,000 words, v.10 is 31,000.1 This comes from new chapters on TLA+, constraint/SMT solving, logic programming, and major expansions to the existing chapters. Originally, "Simplifying Conditionals" was 600 words. Six hundred words! It almost fit in two pages! The chapter is now 2600 words, now covering condition lifting, quantifier manipulation, helper predicates, and set optimizations. All the other chapters have either gotten similar facelifts or are scheduled to get facelifts. The last big change is the addition of book assets. Originally you had to manually copy over all of the code to try it out, which is a problem when there are samples in eight distinct languages! Now there are ready-to-go examples for each chapter, with instructions on how to set up each programming environment. This is also nice because it gives me breaks from writing to code instead. How did the book do? Leanpub's all-time visualizations are terrible, so I'll just give the summary: 1180 copies sold, $18,241 in royalties. That's a lot of money for something that isn't fully out yet! By comparison, Practical TLA+ has made me less than half of that, despite selling over 5x as many books. Self-publishing was the right choice! In that time I've paid about $400 for the book cover (worth it) and maybe $800 in Leanpub's advertising service (probably not worth it). Right now that doesn't come close to making back the time investment, but I think it can get there post-release. I believe there's a lot more potential customers via marketing. I think post-release 10k copies sold is within reach. Where is the book going? The main content work is rewrites: many of the chapters have not meaningfully changed since 1.0, so I am going through and rewriting them from scratch. So far four of the ten chapters have been rewritten. My (admittedly ambitious) goal is to rewrite three of them by the end of this month and another three by the end of next. I also want to do final passes on the rewritten chapters; as most of them have a few TODOs left lying around. (Also somehow in starting this newsletter and publishing it I realized that one of the chapters might be better split into two chapters, so there could well-be a tenth technique in v0.11 or v0.12!) After that, I will pass it to a copy editor while I work on improving the layout, making images, and indexing. I want to have something worthy of printing on a dead tree by 1.0. In terms of timelines, I am very roughly estimating something like this: Summer: final big changes and rewrites Early Autumn: graphic design and copy editing Late Autumn: proofing, figuring out printing stuff Winter: final ebook and initial print releases of 1.0. (If you know a service that helps get self-published books "past the finish line", I'd love to hear about it! Preferably something that works for a fee, not part of royalties.) This timeline may be disrupted by official client work, like a new TLA+ contract or a conference invitation. Needless to say, I am incredibly excited to complete this book and share the final version with you all. This is a book I wished for years ago, a book I wrote because nobody else would. It fills a critical gap in software educational material, and someday soon I'll be able to put a copy on my bookshelf. It's exhilarating and terrifying and above all, satisfying. It's also 150 pages vs 50 pages, but admittedly this is partially because I made the book smaller with a larger font. ↩

a week ago • 15 votes

Logical Quantifiers in Software

I realize that for all I've talked about Logic for Programmers in this newsletter, I never once explained basic logical quantifiers. They're both simple and incredibly useful, so let's do that this week! Sets and quantifiers A set is a collection of unordered, unique elements. {1, 2, 3, …} is a set, as are "every programming language", "every programming language's Wikipedia page", and "every function ever defined in any programming language's standard library". You can put whatever you want in a set, with some very specific limitations to avoid certain paradoxes.2 Once we have a set, we can ask "is something true for all elements of the set" and "is something true for at least one element of the set?" IE, is it true that every programming language has a set collection type in the core language? We would write it like this: # all of them all l in ProgrammingLanguages: HasSetType(l) # at least one some l in ProgrammingLanguages: HasSetType(l) This is the notation I use in the book because it's easy to read, type, and search for. Mathematicians historically had a few different formats; the one I grew up with was ∀x ∈ set: P(x) to mean all x in set, and ∃ to mean some. I use these when writing for just myself, but find them confusing to programmers when communicating. "All" and "some" are respectively referred to as "universal" and "existential" quantifiers. Some cool properties We can simplify expressions with quantifiers, in the same way that we can simplify !(x && y) to !x || !y. First of all, quantifiers are commutative with themselves. some x: some y: P(x,y) is the same as some y: some x: P(x, y). For this reason we can write some x, y: P(x,y) as shorthand. We can even do this when quantifying over different sets, writing some x, x' in X, y in Y instead of some x, x' in X: some y in Y. We can not do this with "alternating quantifiers": all p in Person: some m in Person: Mother(m, p) says that every person has a mother. some m in Person: all p in Person: Mother(m, p) says that someone is every person's mother. Second, existentials distribute over || while universals distribute over &&. "There is some url which returns a 403 or 404" is the same as "there is some url which returns a 403 or some url that returns a 404", and "all PRs pass the linter and the test suites" is the same as "all PRs pass the linter and all PRs pass the test suites". Finally, some and all are duals: some x: P(x) == !(all x: !P(x)), and vice-versa. Intuitively: if some file is malicious, it's not true that all files are benign. All these rules together mean we can manipulate quantifiers almost as easily as we can manipulate regular booleans, putting them in whatever form is easiest to use in programming. Speaking of which, how do we use this in in programming? How we use this in programming First of all, people clearly have a need for directly using quantifiers in code. If we have something of the form: for x in list: if P(x): return true return false That's just some x in list: P(x). And this is a prevalent pattern, as you can see by using GitHub code search. It finds over 500k examples of this pattern in Python alone! That can be simplified via using the language's built-in quantifiers: the Python would be any(P(x) for x in list). (Note this is not quantifying over sets but iterables. But the idea translates cleanly enough.) More generally, quantifiers are a key way we express higher-level properties of software. What does it mean for a list to be sorted in ascending order? That all i, j in 0..<len(l): if i < j then l[i] <= l[j]. When should a ratchet test fail? When some f in functions - exceptions: Uses(f, bad_function). Should the image classifier work upside down? all i in images: classify(i) == classify(rotate(i, 180)). These are the properties we verify with tests and types and MISU and whatnot;1 it helps to be able to make them explicit! One cool use case that'll be in the book's next version: database invariants are universal statements over the set of all records, like all a in accounts: a.balance > 0. That's enforceable with a CHECK constraint. But what about something like all i, i' in intervals: NoOverlap(i, i')? That isn't covered by CHECK, since it spans two rows. Quantifier duality to the rescue! The invariant is equivalent to !(some i, i' in intervals: Overlap(i, i')), so is preserved if the query SELECT COUNT(*) FROM intervals CROSS JOIN intervals … returns 0 rows. This means we can test it via a database trigger.3 There are a lot more use cases for quantifiers, but this is enough to introduce the ideas! Next week's the one year anniversary of the book entering early access, so I'll be writing a bit about that experience and how the book changed. It's crazy how crude v0.1 was compared to the current version. MISU ("make illegal states unrepresentable") means using data representations that rule out invalid values. For example, if you have a location -> Optional(item) lookup and want to make sure that each item is in exactly one location, consider instead changing the map to item -> location. This is a means of implementing the property all i in item, l, l' in location: if ItemIn(i, l) && l != l' then !ItemIn(i, l'). ↩ Specifically, a set can't be an element of itself, which rules out constructing things like "the set of all sets" or "the set of sets that don't contain themselves". ↩ Though note that when you're inserting or updating an interval, you already have that row's fields in the trigger's NEW keyword. So you can just query !(some i in intervals: Overlap(new, i')), which is more efficient. ↩

2 weeks ago • 18 votes

You can cheat a test suite with a big enough polynomial

Hi nerds, I'm back from Systems Distributed! I'd heartily recommend it, wildest conference I've been to in years. I have a lot of work to catch up on, so this will be a short newsletter. In an earlier version of my talk, I had a gag about unit tests. First I showed the test f([1,2,3]) == 3, then said that this was satisfied by f(l) = 3, f(l) = l[-1], f(l) = len(l), f(l) = (129*l[0]-34*l[1]-617)*l[2] - 443*l[0] + 1148*l[1] - 182. Then I progressively rule them out one by one with more unit tests, except the last polynomial which stubbornly passes every single test. If you're given some function of f(x: int, y: int, …): int and a set of unit tests asserting specific inputs give specific outputs, then you can find a polynomial that passes every single unit test. To find the gag, and as SMT practice, I wrote a Python program that finds a polynomial that passes a test suite meant for max. It's hardcoded for three parameters and only finds 2nd-order polynomials but I think it could be generalized with enough effort. The code Full code here, breakdown below. from z3 import * # type: ignore s1, s2 = Solver(), Solver() Z3 is just the particular SMT solver we use, as it has good language bindings and a lot of affordances. As part of learning SMT I wanted to do this two ways. First by putting the polynomial "outside" of the SMT solver in a python function, second by doing it "natively" in Z3. I created two solvers so I could test both versions in one run. a0, a, b, c, d, e, f = Consts('a0 a b c d e f', IntSort()) x, y, z = Ints('x y z') t = "a*x+b*y+c*z+d*x*y+e*x*z+f*y*z+a0" Both Const('x', IntSort()) and Int('x') do the exact same thing, the latter being syntactic sugar for the former. I did not know this when I wrote the program. To keep the two versions in sync I represented the equation as a string, which I later eval. This is one of the rare cases where eval is a good idea, to help us experiment more quickly while learning. The polynomial is a "2nd-order polynomial", even though it doesn't have x^2 terms, as it has xy and xz terms. lambdamax = lambda x, y, z: eval(t) z3max = Function('z3max', IntSort(), IntSort(), IntSort(), IntSort()) s1.add(ForAll([x, y, z], z3max(x, y, z) == eval(t))) lambdamax is pretty straightforward: create a lambda with three parameters and eval the string. The string "a*x" then becomes the python expression a*x, a is an SMT symbol, while the x SMT symbol is shadowed by the lambda parameter. To reiterate, a terrible idea in practice, but a good way to learn faster. z3max function is a little more complex. Function takes an identifier string and N "sorts" (roughly the same as programming types). The first N-1 sorts define the parameters of the function, while the last becomes the output. So here I assign the string identifier "z3max" to be a function with signature (int, int, int) -> int. I can load the function into the model by specifying constraints on what z3max could be. This could either be a strict input/output, as will be done later, or a ForAll over all possible inputs. Here I just use that directly to say "for all inputs, the function should match this polynomial." But I could do more complicated constraints, like commutativity (f(x, y) == f(y, x)) or monotonicity (Implies(x < y, f(x) <= f(y))). Note ForAll takes a list of z3 symbols to quantify over. That's the only reason we need to define x, y, z in the first place. The lambda version doesn't need them. inputs = [(1,2,3), (4, 2, 2), (1, 1, 1), (3, 5, 4)] for g in inputs: s1.add(z3max(*g) == max(*g)) s2.add(lambdamax(*g) == max(*g)) This sets up the joke: adding constraints to each solver that the polynomial it finds must, for a fixed list of triplets, return the max of each triplet. for s, func in [(s1, z3max), (s2, lambdamax)]: if s.check() == sat: m = s.model() for x, y, z in inputs: print(f"max([{x}, {y}, {z}]) =", m.evaluate(func(x, y, z))) print(f"max([x, y, z]) = {m[a]}x + {m[b]}y", f"+ {m[c]}z +", # linebreaks added for newsletter rendering f"{m[d]}xy + {m[e]}xz + {m[f]}yz + {m[a0]}\n") Output: max([1, 2, 3]) = 3 # etc max([x, y, z]) = -133x + 130y + -10z + -2xy + 62xz + -46yz + 0 max([1, 2, 3]) = 3 # etc max([x, y, z]) = -17x + 16y + 0z + 0xy + 8xz + -6yz + 0 I find that z3max (top) consistently finds larger coefficients than lambdamax does. I don't know why. Practical Applications Test-Driven Development recommends a strict "red-green refactor" cycle. Write a new failing test, make the new test pass, then go back and refactor. Well, the easiest way to make the new test pass would be to paste in a new polynomial, so that's what you should be doing. You can even do this all automatically: have a script read the set of test cases, pass them to the solver, and write the new polynomial to your code file. All you need to do is write the tests! Pedagogical Notes Writing the script took me a couple of hours. I'm sure an LLM could have whipped it all up in five minutes but I really want to learn SMT and LLMs may decrease learning retention.1 Z3 documentation is not... great for non-academics, though, and most other SMT solvers have even worse docs. One useful trick I use regularly is to use Github code search to find code using the same APIs and study how that works. Turns out reading API-heavy code is a lot easier than writing it! Anyway, I'm very, very slowly feeling like I'm getting the basics on how to use SMT. I don't have any practical use cases yet, but I wanted to learn this skill for a while and glad I finally did. Caveat I have not actually read the study, for all I know it could have a sample size of three people, I'll get around to it eventually ↩

3 weeks ago • 21 votes

Solving LinkedIn Queens with SMT

No newsletter next week I’ll be speaking at Systems Distributed. My talk isn't close to done yet, which is why this newsletter is both late and short. Solving LinkedIn Queens in SMT The article Modern SAT solvers: fast, neat and underused claims that SAT solvers1 are "criminally underused by the industry". A while back on the newsletter I asked "why": how come they're so powerful and yet nobody uses them? Many experts responded saying the reason is that encoding SAT kinda sucked and they rather prefer using tools that compile to SAT. I was reminded of this when I read Ryan Berger's post on solving “LinkedIn Queens” as a SAT problem. A quick overview of Queens. You’re presented with an NxN grid divided into N regions, and have to place N queens so that there is exactly one queen in each row, column, and region. While queens can be on the same diagonal, they cannot be adjacently diagonal. (Important note: Linkedin “Queens” is a variation on the puzzle game Star Battle, which is the same except the number of stars you place in each row/column/region varies per puzzle, and is usually two. This is also why 'queens' don’t capture like chess queens.) Ryan solved this by writing Queens as a SAT problem, expressing properties like "there is exactly one queen in row 3" as a large number of boolean clauses. Go read his post, it's pretty cool. What leapt out to me was that he used CVC5, an SMT solver.2 SMT solvers are "higher-level" than SAT, capable of handling more data types than just boolean variables. It's a lot easier to solve the problem at the SMT level than at the SAT level. To show this, I whipped up a short demo of solving the same problem in Z3 (via the Python API). Full code here, which you can compare to Ryan's SAT solution here. I didn't do a whole lot of cleanup on it (again, time crunch!), but short explanation below. The code from z3 import * # type: ignore from itertools import combinations, chain, product solver = Solver() size = 9 # N Initial setup and modules. size is the number of rows/columns/regions in the board, which I'll call N below. # queens[n] = col of queen on row n # by construction, not on same row queens = IntVector('q', size) SAT represents the queen positions via N² booleans: q_00 means that a Queen is on row 0 and column 0, !q_05 means a queen isn't on row 0 col 5, etc. In SMT we can instead encode it as N integers: q_0 = 5 means that the queen on row 0 is positioned at column 5. This immediately enforces one class of constraints for us: we don't need any constraints saying "exactly one queen per row", because that's embedded in the definition of queens! (Incidentally, using 0-based indexing for the board was a mistake on my part, it makes correctly encoding the regions later really painful.) To actually make the variables [q_0, q_1, …], we use the Z3 affordance IntVector(str, n) for making n variables at once. solver.add([And(0 <= i, i < size) for i in queens]) # not on same column solver.add(Distinct(queens)) First we constrain all the integers to [0, N), then use the incredibly handy Distinct constraint to force all the integers to have different values. This guarantees at most one queen per column, which by the pigeonhole principle means there is exactly one queen per column. # not diagonally adjacent for i in range(size-1): q1, q2 = queens[i], queens[i+1] solver.add(Abs(q1 - q2) != 1) One of the rules is that queens can't be adjacent. We already know that they can't be horizontally or vertically adjacent via other constraints, which leaves the diagonals. We only need to add constraints that, for each queen, there is no queen in the lower-left or lower-right corner, aka q_3 != q_2 ± 1. We don't need to check the top corners because if q_1 is in the upper-left corner of q_2, then q_2 is in the lower-right corner of q_1! That covers everything except the "one queen per region" constraint. But the regions are the tricky part, which we should expect because we vary the difficulty of queens games by varying the regions. regions = { "purple": [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (0, 7), (0, 8), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (1, 1), (8, 1)], "red": [(1, 2), (2, 2), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), (6, 2), (7, 1), (7, 2), (8, 2), (8, 3),], # you get the picture } # Some checking code left out, see below The region has to be manually coded in, which is a huge pain. (In the link, some validation code follows. Since it breaks up explaining the model I put it in the next section.) for r in regions.values(): solver.add(Or( *[queens[row] == col for (row, col) in r] )) Finally we have the region constraint. The easiest way I found to say "there is exactly one queen in each region" is to say "there is a queen in region 1 and a queen in region 2 and a queen in region 3" etc." Then to say "there is a queen in region purple" I wrote "q_0 = 0 OR q_0 = 1 OR … OR q_1 = 0 etc." Why iterate over every position in the region instead of doing something like (0, q[0]) in r? I tried that but it's not an expression that Z3 supports. if solver.check() == sat: m = solver.model() print([(l, m[l]) for l in queens]) Finally, we solve and print the positions. Running this gives me: [(q__0, 0), (q__1, 5), (q__2, 8), (q__3, 2), (q__4, 7), (q__5, 4), (q__6, 1), (q__7, 3), (q__8, 6)] Which is the correct solution to the queens puzzle. I didn't benchmark the solution times, but I imagine it's considerably slower than a raw SAT solver. Glucose is really, really fast. But even so, solving the problem with SMT was a lot easier than solving it with SAT. That satisfies me as an explanation for why people prefer it to SAT. Sanity checks One bit I glossed over earlier was the sanity checking code. I knew for sure that I was going to make a mistake encoding the region, and the solver wasn't going to provide useful information abut what I did wrong. In cases like these, I like adding small tests and checks to catch mistakes early, because the solver certainly isn't going to catch them! all_squares = set(product(range(size), repeat=2)) def test_i_set_up_problem_right(): assert all_squares == set(chain.from_iterable(regions.values())) for r1, r2 in combinations(regions.values(), 2): assert not set(r1) & set(r2), set(r1) & set(r2) The first check was a quick test that I didn't leave any squares out, or accidentally put the same square in both regions. Converting the values into sets makes both checks a lot easier. Honestly I don't know why I didn't just use sets from the start, sets are great. def render_regions(): colormap = ["purple", "red", "brown", "white", "green", "yellow", "orange", "blue", "pink"] board = [[0 for _ in range(size)] for _ in range(size)] for (row, col) in all_squares: for color, region in regions.items(): if (row, col) in region: board[row][col] = colormap.index(color)+1 for row in board: print("".join(map(str, row))) render_regions() The second check is something that prints out the regions. It produces something like this: 111111111 112333999 122439999 124437799 124666779 124467799 122467899 122555889 112258899 I can compare this to the picture of the board to make sure I got it right. I guess a more advanced solution would be to print emoji squares like 🟥 instead. Neither check is quality code but it's throwaway and it gets the job done so eh. "Boolean SATisfiability Solver", aka a solver that can find assignments that make complex boolean expressions true. I write a bit more about them here. ↩ "Satisfiability Modulo Theories" ↩

a month ago • 22 votes

More in programming

Increase software sales by 50% or more

This is re-post of How to Permanently Increase Your Sales by 50% or More in Only One Day article by Steve Pavlina Of all the things you can do to increase your sales, one of the highest leverage activities is attempting to increase your products’ registration rate. Increasing your registration rate from 1.0% to 1.5% means that you simply convince one more downloader out of every 200 to make the decision to buy. Yet that same tiny increase will literally increase your sales by a full 50%. If you’re one of those developers who simply slapped the ubiquitous 30-day trial incentive on your shareware products without going any further than that, then I think a 50% increase in your registration rate is a very attainable goal you can achieve if you spend just one full day of concentrated effort on improving your product’s ability to sell. My hope is that this article will get you off to a good start and get you thinking more creatively. And even if you fail, your result might be that you achieve only a 25% or a 10% increase. How much additional money would that represent to you over the next five years of sales? What influence, if any, did the title of this article have on your decision to read it? If I had titled this article, “Registration Incentives,” would you have been more or less likely to read it now? Note that the title expresses a specific and clear benefit to you. It tells you exactly what you can expect to gain by reading it. Effective registration incentives work the same way. They offer clear, specific benefits to the user if a purchase is made. In order to improve your registration incentives, the first thing you need to do is to adopt some new beliefs that will change your perspective. I’m going to introduce you to what I call the “lies of success” in the shareware industry. These are statements that are not true at all, but if you accept them as true anyway, you’ll achieve far better results than if you don’t. Rule 1: What you are selling is merely the difference between the shareware and the registered versions, not the registered version itself. Note that this is not a true statement, but if you accept it as true, you’ll immediately begin to see the weaknesses in your registration incentives. If there are few additional benefits for buying the full version vs. using the shareware version, then you aren’t offering the user strong enough incentives to make the full purchase. Rule 2: The sole purpose of the shareware version is to close the sale. This is our second lie of success. Note the emphasis on the word “close.” Your shareware version needs to act as a direct sales vehicle. It must be able to take the user all the way to the point of purchase, i.e. your online order form, ideally with nothing more than a few mouse clicks. Anything that detracts from achieving a quick sale is likely to hurt sales. Rule 3: The customer’s perspective is the only one that matters. Defy this rule at your peril. Customers don’t care that you spent 2000 hours creating your product. Customers don’t care that you deserve the money for your hard work. Customers don’t care that you need to do certain things to prevent piracy. All that matters to them are their own personal wants and needs. Yes, these are lies of success. Some customers will care, but if you design your registration incentives assuming they only care about their own self-interests, your motivation to buy will be much stronger than if you merely appeal to their sense of honesty, loyalty, or honor. Assume your customers are all asking, “What’s in it for me if I choose to buy? What will I get? How will this help me?” I don’t care if you’re selling to Fortune 500 companies. At some point there will be an individual responsible for causing the purchase to happen, and that individual is going to consider how the purchase will affect him/her personally: “Will this purchase get me fired? Will it make me look good in front of my peers? Will this make my job easier or harder?” Many shareware developers get caught in the trap of discriminating between honest and dishonest users, believing that honest users will register and dishonest ones won’t. This line of thinking will ultimately get you nowhere, and it violates the third lie of success. When you make a purchase decision, how often do you use honesty as the deciding factor? Do you ever say, “I will buy this because I’m honest?” Or do you consider other more selfish factors first, such as how it will make you feel to purchase the software? The truth is that every user believes s/he is honest, so no user applies the honesty criterion when making a purchase decision. Thinking of your users in terms of honest ones vs. dishonest ones is a complete waste of time because that’s not how users primarily view themselves. Rule 4: Customers buy on emotion and justify with fact. If you’re honest with yourself, you’ll see that this is how you make most purchase decisions. Remember the last time you bought a computer. Is it fair to say that you first became emotionally attached to the idea of owning a new machine? For me, it’s the feeling of working faster, owning the latest technology, and being more productive that motivates me to go computer shopping. Once I’ve become emotionally committed, the justifications follow: “It’s been two years since I’ve upgraded, it will pay for itself with the productivity boost I gain, I can easily afford it, I’ve worked hard and I deserve a new machine, etc.” You use facts to justify the purchase. Once you understand how purchase decisions are made, you can see that your shareware products need to first get the user emotionally invested in the purchase, and then you give them all the facts they need to justify it. Now that we’ve gotten these four lies of success out of the way, let’s see how we might apply them to create some compelling registration incentives. Let’s start with Rule 1. What incentives can be spawned from this rule? The common 30-day trial is one obvious derivative. If you are only selling the difference between the shareware and registered versions, then a 30-day trial implies that you are selling unlimited future days of usage of the program after the trial period expires. This is a powerful incentive, and it’s been proven effective for products that users will continue to use month after month. 30-day trials are easy for users to understand, and they’re also easy to implement. You could also experiment with other time periods such as 10 days, 14 days, or 90 days. The only way of truly knowing which will work best for your products is to experiment. But let’s see if we can move a bit beyond the basic 30-day trial here by mixing in a little of Rule 3. How would the customer perceive a 30-day trial? In most cases 30 days is plenty of time to evaluate a product. But in what situations would a 30-day trial have a negative effect? A good example is when the user downloads, installs, and briefly checks out a product s/he may not have time to evaluate right away. By the time the user gets around to fully evaluating it, the shareware version has already expired, and a sale may be lost as a result. To get around this limitation, many shareware developers have started offering 30 days of actual program usage instead of 30 consecutive days. This allows the user plenty of time to try out the program at his/her convenience. Another possibility would be to limit the number of times the program can be run. The basic idea is that you are giving away limited usage and selling unlimited usage of the program. This incentive definitely works if your product is one that will be used frequently over a long period of time (much longer than the trial period). The flip side of usage limitation is to offer an additional bonus for buying within a certain period of time. For instance, in my game Dweep, I offer an extra 5 free bonus levels to everyone who buys within the first 10 days. In truth I give the bonus levels to everyone who buys, but the incentive is real from the customer’s point of view. Remember Rule 3 - it doesn’t matter what happens on my end; it only matters what the customer perceives. Any customer that buys after the first 10 days will be delighted anyway to receive a bonus they thought they missed. So if your product has no time-based incentives at all, this is the first place to start. When would you pay your bills if they were never due, and no interest was charged on late payments? Use time pressure to your advantage, either by disabling features in the shareware version after a certain time or by offering additional bonuses for buying sooner rather than later. If nothing else and if it’s legal in your area, offer a free entry in a random monthly drawing for a small prize, such as one of your other products, for anyone who buys within the first X days. Another logical derivative of Rule 1 is the concept of feature limitation. On the crippling side, you can start with the registered version and begin disabling functionality to create the shareware version. Disabling printing in a shareware text editor is a common strategy. So is corrupting your program’s output with a simple watermark. For instance, your shareware editor could print every page with your logo in the background. Years ago the Association of Shareware Professionals had a strict policy against crippling, but that policy was abandoned, and crippling has been recognized as an effective registration incentive. It is certainly possible to apply feature limitation without having it perceived as crippling. This is especially easy for games, which commonly offer a limited number of playable levels in the shareware version with many more levels available only in the registered version. In this situation you offer the user a seemingly complete experience of your product in the shareware version, and you provide additional features on top of that for the registered version. Time-based incentives and feature-based incentives are perhaps the two most common strategies used by shareware developers for enticing users to buy. Which will work best for you? You will probably see the best results if you use both at the same time. Imagine you’re the end user for a moment. Would you be more likely to buy if you were promised additional features and given a deadline to make the decision? I’ve seen several developers who were using only one of these two strategies increase their registration rates dramatically by applying the second strategy on top of the first. If you only use time-based limitations, how could you apply feature limitation as well? Giving the user more reasons to buy will translate to more sales per download. One you have both time-based and feature-based incentives to buy, the next step is to address the user’s perceived risk by applying a risk-reversal strategy. Fortunately, the shareware model already reduces the perceived risk of purchasing significantly, since the user is able to try before buying. But let’s go a little further, keeping Rule 3 in mind. What else might be a perceived risk to the user? What if the user reaches the end of the trial period and still isn’t certain the product will do what s/he needs? What if the additional features in the registered version don’t work as the user expects? What can we do to make the decision to purchase safer for the user? One approach is to offer a money-back guarantee. I’ve been offering a 60-day unconditional money-back guarantee on all my products since January 2000. If someone asks for their money back for any reason, I give them a full refund right away. So what is my return rate? Well, it’s about 8%. Just kidding! Would it surprise you to learn that my return rate at the time of this writing is less than 0.2%? Could you handle two returns out of every 1000 sales? My best estimate is that this one technique increased my sales by 5-10%, and it only took a few minutes to implement. When I suggest this strategy to other shareware developers, the usual reaction is fear. “But everyone would rip me off,” is a common response. I suggest trying it for yourself on an experimental basis; a few brave souls have already tried it and are now offering money-back guarantees prominently. Try putting it up on your web site for a while just to convince yourself it works. You can take it down at any time. After a few months, if you’re happy with the results, add the guarantee to your shareware products as well. I haven’t heard of one bad outcome yet from those who’ve tried it. If you use feature limitation in your shareware products, another important component of risk reversal is to show the user exactly what s/he will get in the full version. In Dweep I give away the first five levels in the demo version, and purchasing the full version gets you 147 more levels. When I thought about this from the customer’s perspective (Rule 3), I realized that a perceived risk is that s/he doesn’t know if the registered version levels will be as fun as the demo levels. So I released a new demo where you can see every level but only play the first five. This lets the customer see all the fun that awaits them. So if you have a feature-limited product, show the customer how the feature will work. For instance, if your shareware version has printing disabled, the customer could be worried that the full version’s print capability won’t work with his/her printer or that the output quality will be poor. A better strategy is to allow printing, but to watermark the output. This way the customer can still test and verify the feature, and it doesn’t take much imagination to realize what the output will look like without the watermark. Our next step is to consider Rule 2 and include the ability close the sale. It is imperative that you include an “instant gratification” button in your shareware products, so the customer can click to launch their default web browser and go directly to your online order form. If you already have a “buy now” button in your products, go a step further. A small group of us have been finding that the more liberally these buttons are used, the better. If you only have one or two of these buttons in your shareware program, you should increase the count by at least an order of magnitude. The current Dweep demo now has over 100 of these buttons scattered throughout the menus and dialogs. This makes it extremely easy for the customer to buy, since s/he never has to hunt around for the ordering link. What should you label these buttons? “Buy now” or “Register now” are popular, so feel free to use one of those. I took a slightly different approach by trying to think like a customer (Rule 3 again). As a customer the word “buy” has a slightly negative association for me. It makes me think of parting with my cash, and it brings up feelings of sacrifice and pressure. The words “buy now” imply that I have to give away something. So instead, I use the words, “Get now.” As a customer I feel much better about getting something than buying something, since “getting” brings up only positive associations. This is the psychology I use, but at present, I don’t know of any hard data showing which is better. Unless you have a strong preference, trust your intuition. Make it as easy as possible for the willing customer to buy. The more methods of payment you accept, the better your sales will be. Allow the customer to click a button to print an order form directly from your program and mail it with a check or money order. On your web order form, include a link to a printable text order form for those who are afraid to use their credit cards online. If you only accept two or three major credit cards, sign up with a registration service to handle orders for those you don’t accept. So far we’ve given the customer some good incentives to buy, minimized perceived risk, and made it easy to make the purchase. But we haven’t yet gotten the customer emotionally invested in making the purchase decision. That’s where Rule 4 comes in. First, we must recognize the difference between benefits and features. We need to sell the sizzle, not the steak. Features describe your product, while benefits describe what the user will get by using your product. For instance, a personal information manager (PIM) program may have features such as daily, weekly, and monthly views; task and event timers; and a contact database. However, the benefits of the program might be that it helps the user be more organized, earn more money, and enjoy more free time. For a game, the main benefit might be fun. For a nature screensaver, it could be relaxation, beauty appreciation, or peace. Features are logical; benefits are emotional. Logical features are an important part of the sale, but only after we’ve engaged the customer’s emotions. Many products do a fair job of getting the customer emotionally invested during the trial period. If you have an addictive program or one that’s fun to use, such as a game, you may have an easy time getting the customer emotionally attached to using it because the experience is already emotional in nature. But whatever your product is, you can increase your sales by clearly illustrating the benefits of making the purchase. A good place to do this is in your nag screens. I use nag screens both before and after the program runs to remind the user of the benefits of buying the full version. At the very least, include a nag screen when the customer exits the program, so the last thing s/he sees will be a reminder of the product’s benefits. Take this opportunity to sell the user on the product. Don’t expect features like “customizable colors” to motivate anyone to buy. Paint a picture of what benefits the user will obtain with the full version. Will I save time? Will I have more fun? Will I live longer, save money, or feel better? The simple change from feature-oriented selling to benefit-oriented selling can easily double or triple your sales. Be sure to use this approach on your web site as well if you don’t already. Developers who’ve recently made the switch have been reporting some amazing results. If you’re drawing a blank when trying to come up with benefits for your products, the best thing you can do is to email some of your old customers and ask them why they bought your program. What did it do for them? I’ve done this and was amazed at the answers I got back. People were buying my games for reasons I’d never anticipated, and that told me which benefits I needed to emphasize in my sales pitch. The next key is to make your offer irresistible to potential customers. Find ways to offer the customer so much value that it would be harder to say no than to say yes. Take a look at your shareware product as if you were a potential customer who’d never seen it before. Being totally honest with yourself, would you buy this program if someone else had written it? If not, don’t stop here. As a potential customer, what additional benefits or features would put you over the top and convince you to buy? More is always better than less. In the original version of Dweep, I offered ten levels in the demo and thirty in the registered version. Now I offer only five demo levels and 152 in the full version, plus a built-in level editor. Originally, I offered the player twice the value of the demo; now I’m offering over thirty times the value. I also offer free hints and solutions to every level; the benefit here is that it minimizes player frustration. As I keep adding bonuses for purchasing, the offer becomes harder and harder to resist. What clever bonuses can you throw in for registering? Take the time to watch an infomercial. Notice that there is always at least one “FREE” bonus thrown in. Consider offering a few extra filters for an image editor, ten extra images for a screensaver, or extra levels for a game. What else might appeal to your customers? Be creative. Your bonus doesn’t even have to be software-based. Offer a free report about building site traffic with your HTML editor, include an essay on effective time management with your scheduling program, or throw in a small business success guide with your billing program. If you make such programs, you shouldn’t have too much trouble coming up with a few pages of text that would benefit your customers. Keep working at it until your offer even looks irresistible to you. If all the bonuses you offer can be delivered electronically, how many can you afford to include? If each one only gains one more customer in a thousand (0.1%), would it be worth the effort over the lifetime of your sales? So how do you know if your registration incentives are strong enough? And how do you know if your product is over-crippled? Where do you draw the line? These are tough issues, but there is a good way to handle them if your product is likely to be used over a long period of time, particularly if it’s used on a daily basis. Simply make your program gradually increase its registration incentives over time. One easy way to do this is with a delay timer on your nag screens that increases each time the program is run. Another approach is to disable certain features at set intervals. You begin by disabling non-critical features and gradually move up to disabling key functionality. The program becomes harder and harder to continue using for free, so the benefits of registering become more and more compelling. Instead of having your program completely disable itself after your trial period, you gradually degrade its usability with additional usage. This approach can be superior to a strict 30-day trial, since it allows your program to still be used for a while, but after prolonged usage it becomes effectively unusable. However, you don’t simply shock the user by taking away all the benefits s/he has become accustomed to on a particular day. Instead, you begin with a gentle reminder that becomes harder and harder to ignore. There may be times when your 30-day trial shuts off at an inconvenient time for the user, and you may lose a sale as a result. For instance, the user may not have the money at the time, or s/he may be busy at the trial’s end and forget to register. In that case s/he may quickly replace what was lost with a competitor’s trial version. The gradual degradation approach allows the user to continue using your product, but with increasing difficulty over time. Eventually, there is a breaking point where the user either decides to buy or to stop using the program completely, but this can be done within a window of time at the user’s convenience. Hopefully this article has gotten you thinking creatively about all the overlooked ways you can entice people to buy your shareware products. The most important thing you can do is to begin seeing your products through your customers’ eyes. What additional motivation would convince you to buy? What would represent an irresistible offer to you? There is no limit to how many incentives you can add. Don’t stop at just one or two; instead, give the customer a half dozen or more reasons to buy, and you’ll see your registration rate soar. Is it worth spending a day to do this? I think so.

yesterday • 4 votes

Maybe writing speed actually is a bottleneck for programming

2 days ago • 6 votes

Occupation and Preoccupation

Here’s Jony Ive in his Stripe interview: What we make stands testament to who we are. What we make describes our values. It describes our preoccupations. It describes beautiful succinctly our preoccupation. I’d never really noticed the connection between these two words: occupation and preoccupation. What comes before occupation? Pre-occupation. What comes before what you do for a living? What you think about. What you’re preoccupied with. What you think about will drive you towards what you work on. So when you’re asking yourself, “What comes next? What should I work on?” Another way of asking that question is, “What occupies my thinking right now?” And if what you’re occupied with doesn’t align with what you’re preoccupied with, perhaps it's time for a change. Email · Mastodon · Bluesky

2 days ago • 3 votes

American hype

There's no country on earth that does hype better than America. It's one of the most appealing aspects about being here. People are genuinely excited about the future and never stop searching for better ways to work, live, entertain, and profit. There's a unique critical mass in the US accelerating and celebrating tomorrow. The contrast to Europe couldn't be greater. Most Europeans are allergic to anything that even smells like a commercial promise of a better tomorrow. "Hype" is universally used as a term to ridicule anyone who dares to be excited about something new, something different. Only a fool would believe that real progress is possible! This is cultural bedrock. The fault lines have been settling for generations. It'll take an earthquake to move them. You see this in AI, you saw it in the Internet. Europeans are just as smart, just as inventive as their American brethren, but they don't do hype, so they're rarely the ones able to sell the sizzle that public opinion requires to shift its vision for tomorrow. To say I have a complicated relationship with venture capital is putting it mildly. I've spent a career proving the counter narrative. Proving that you can build and bootstrap an incredible business without investor money, still leave a dent in the universe, while enjoying the spoils of capitalism. And yet... I must admit that the excesses of venture capital are integral to this uniquely American advantage on hype. The lavish overspending during the dot-com boom led directly to a spectacular bust, but it also built the foundation of the internet we all enjoy today. Pets.com and Webvan flamed out such that Amazon and Shopify could transform ecommerce out of the ashes. We're in the thick of peak hype on AI right now. Fantastical sums are chasing AGI along with every dumb derivative mirage along the way. The most outrageous claims are being put forth on the daily. It's easy to look at that spectacle with European eyes and roll them. Some of it is pretty cringe! But I think that would be a mistake. You don't have to throw away your critical reasoning to accept that in the face of unknown potential, optimism beats pessimism. We all have to believe in something, and you're much better off believing that things can get better than not. Americans fundamentally believe this. They believe the hype, so they make it come to fruition. Not every time, not all of them, but more of them, more of the time than any other country in the world. That really is exceptional.

2 days ago • 3 votes

File sync is very slow

I’m working on a Go library appendstore for append-only store of lots of things in a single file. To make things as robust as possible I was calling os.File.Sync() after each append. Sync() is waiting until the data is acknowledged as truly, really written to disk (as opposed to maybe floating somewhere in disk drive’s write buffer). Oh boy, is it slow. A test of appending 1000 records would take over 5 seconds. After removing the Sync() it would drop to 5 milliseconds. 1000x faster. I made sync optional - it’s now up to the user of the library to pick it, defaults to non-sync. Is it unsafe now? Well, the reality is that it probably doesn’t matter. I don’t think lots of software does the sync due to slowness and the world still runs.

2 days ago • 2 votes

New here?