AI & The Science of Creativity

from Jim Nielsen’s Blog [alt+shift+b] in programming

In an effort to better understand how all this AI stuff works, I’ve been chipping away at Stephen Wolfram’s meticulous piece, “What Is ChatGPT Doing … and Why Does It Work?”. As you likely know, ChatGPT works by guessing at the next word. Here’s Stephen: when ChatGPT does something like write an essay what it’s essentially doing is just asking over and over again “given the text so far, what should the next word be?”—and each time adding a word What strikes me in Stephen’s description is how it determines what word to guess next. Here’s Stephen again: at each step it gets a list of words with probabilities. But which one should it actually pick…? One might think it should be the “highest-ranked” word (i.e. the one to which the highest “probability” was assigned). But this is where a bit of voodoo begins to creep in. Because for some reason—that maybe one day we’ll have a scientific-style understanding of—if we always pick the highest-ranked word, we’ll typically get a very “flat”...

over a year ago

Remove from reading list Add to reading list [alt+a] Read now [→]

Comments

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from Jim Nielsen’s Blog

Some Love For Python

I really enjoyed watching Python: The Documentary (from CultRepo, formerly Honeypot, same makers as the TypeScript documentary). Personally, I don’t write much Python and am not involved in the broader Python community. That said, I love how this documentary covers a lot of the human problems in tech and not just the technical history of Python as language. For example: How do you handle succession from a pivotal creator? How do you deal with poor representation? How do you fund and steer open projects? How do you build community? How do you handle the fallout of major version changes? And honestly, all the stories around these topics as told from the perspective of Python feel like lessons to learn from. Here are a few things that stood out to me. Guido van Rossum, Creator of Python, Sounds Cool The film interviews Drew Houston, Founder/CEO at Dropbox, because he hired Python’s creator Guido van Rossum for a stint. This is what Drew had to say about his time working with Guido: It’s hard for me to think of someone who has had more impact with lower ego [than Guido] For tech, that’s saying something! Now that is a legacy if you ask me. The Python Community Sounds Cool Brett Cannon famously gave a talk at a Python conference where he said he “came for the language, but stayed for the community”. In the documentary they interview him and he adds: The community is the true strength of Pyhon. It’s not just the language, it’s the people. ❤️ This flies in the face of the current era we’re in, where it’s the technology that matters. How it disrupts or displaces people is insignificant next to the fantastic capabilities it purports to wield. But here’s this language surrounded by people who acknowledge that the community around the language is its true strength. People are the true strength. Let me call this out again, in case it’s not sinking in: Here’s a piece of technology where the people around it seem to acknowledge that the technology itself is only secondary to the people it was designed to serve. How incongruous is that belief with so many other pieces of technology we’ve seen through the years? What else do we have, if not each other? That’s something worth amplifying. Mariatta, Python Core Developer, Sounds Cool I absolutely loved the story of @mariatta@fosstodon.org. If you’re not gonna watch the documentary, at least watch the ~8 minutes of her story. Watched it? Ok, here’s my quick summary: She loves to program, but everywhere she looks it’s men. At work. At conferences. On core teams. She hears about pyladies and wants to go to Pycon where she can meet them. She goes to Pycon and sees Guido van Rossum stand up and say he wants 2 core contributors to Python that are female. She thinks, “Oh that’s cool! I’m not good enough for that, but I bet they’ll find someone awesome.” The next year she goes to the conference and finds out they’re still looking for those 2 core contributors. She thinks “Why not me?” and fires off an email to Guido. Here’s her recollection on composing that email: I felt really scared. I didn’t feel like I deserved mentorship from Guido van Rossum. I really hesitated to send this email to him, but in the end I realized I want to try. This was a great opportunity for me. I hit the send button. And later, her feelings on becoming the first female core contributor to Python: When you don’t have role models you can relate to, you don’t believe you can do it. ❤️ Mad respect. I love her story. As Jessica McKellar says in the film, Mariatta’s is an inspiring story and “a vision of what is possible in other communities”. Python Is Refreshing I’ve spent years in “webdev” circles — and there are some great ones — but this Python documentary was, to me, a tall, refreshing glass of humanity. Go Python! Email · Mastodon · Bluesky

2 days ago • 4 votes

Trying to Make Sense of Casing Conventions on the Web

(I present to you my stream of consciousness on the topic of casing as it applies to the web platform.) I’m reading about the new command and commandfor attributes — which I’m super excited about, declarative behavior invocation in HTML? YES PLEASE!! — and one thing that strikes me is the casing in these APIs. For example, the command attribute has a variety of values in HTML which correspond to APIs in JavaScript. The show-popover attribute value maps to .showPopover() in JavaScript. hide-popover maps to .hidePopover(), etc. So what we have is: lowercase in attribute names e.g. commandfor="..." kebab-case in attribute values e.g. show-popover camelCase for JS counterparts e.g. showPopover() After thinking about this a little more, I remember that HTML attributes names are case insensitive, so the browser will normalize them to lowercase during parsing. Given that, I suppose you could write commandFor="..." but it’s effectively the same. Ok, lowercase attribute names in HTML makes sense. The related popover attributes follow the same convention: popovertarget popovertargetaction And there are many other attribute names in HTML that are lowercase, e.g.: maxlength novalidate contenteditable autocomplete formenctype So that all makes sense. But wait, there are some attribute names with hyphens in them, like aria-label="..." and data-value="...". So why isn’t it command-for="..."? Well, upon further reflection, I suppose those attributes were named that way for extensibility’s sake: they are essentially wildcard attributes that represent a family of attributes that are all under the same namespace: aria-* and data-*. But wait, isn’t that an argument for doing popover-target and popover-target-action? Or command and command-for? But wait (I keep saying that) there are kebab-case attribute names in HTML — like http-equiv on the <meta> tag, or accept-charset on the form tag — but those seem more like legacy exceptions. It seems like the only answer here is: there is no rule. Naming is driven by convention and decisions are made on a case-by-case basis. But if I had to summarize, it would probably be that the default casing for new APIs tends to follow the rules I outlined at the start (and what’s reflected in the new command APIs): lowercase for HTML attributes names kebab-case for HTML attribute values camelCase for JS counterparts Let’s not even get into SVG attribute names We need one of those “bless this mess” signs that we can hang over the World Wide Web. Email · Mastodon · Bluesky

6 days ago • 14 votes

Successive Prototypes Bridge the Gap Between Idea and Reality

Dismissing an idea because it doesn’t work in your head is doing a disservice to the idea. (Same for dismissing someone else’s idea because it doesn’t work in your head.) The only way to truly know if an idea works is to test it. The gap between an idea and reality is the work. You can’t dismiss something as “not working” without doing the work. When collaborating with others, different ideas can be put forward which end up in competition with each other. We debate which is best, but verbal descriptions don’t do justice to ideas — so the idea that wins is the one whose champion is the most persuasive (or has the most institutional authority). You don’t want that. You want an environment where ideas can be evaluated based on their substance and not on the personal attributes of the person advocating them. This is the value of prototypes. We can’t visualize or predict how our own ideas will play out, let alone other people’s. This is why it’s necessary to bring them to life, have them take concrete form. It’s the only way to do them justice. (Picture a cute puppy in your head. I’ve got one too. Now how do we determine who’s imagining the cuter puppy? We can’t. We have to produce a concrete manifestation for contrast and comparison.) Prototypes are how we bridge the gap between idea and reality. They’re an iterative, evolutionary, exploratory form of birthing ideas that test their substance. People will bow out to a good persuasive argument. They’ll bow out to their boss saying it should be one way or another. But it’s hard to bow out to a good idea you can see, taste, touch, smell, or use. Email · Mastodon · Bluesky

2 weeks ago • 17 votes

Consistent Navigation Across My Inconsistent Websites, Part II

I refreshed the little thing that let’s you navigate consistently between my inconsistent subdomains (video recording). Here’s the tl;dr on the update: I had to remove some features on each site to make this feel right. Takeaway: adding stuff is easy, removing stuff is hard. The element is a web component and not even under source control (🤫). I serve it directly from my cdn. If I want to make an update, I tweak the file on disk and re-deploy. Takeaway: cowboy codin’, yee-haw! Live free and die hard. So. Many. Iterations. All of which led to what? A small, iterative evolution. Takeaway: it’s ok for design explorations to culminate in updates that look more like an evolution than a mutation. Want more info on the behind-the-scenes work? Read on! Design Explorations It might look like a simple iteration on what I previously had, but that doesn’t mean I didn’t explore the universe of possibilities first before coming back to the current iteration. v0: Tabs! A tab-like experience seemed the most natural, but how to represent it? I tried a few different ideas. On top. On bottom. Different visual styles, etc. And of course, gotta explore how that plays out on desktop too. Some I liked, some I didn’t. As much as I wanted to play with going to the edges of the viewport, I realized that every browser is different and you won't be able to get a consistent “bleed-like” visual experience across browsers. For example, if you try to make tabs that bleed to the edges, it looks nice in a frame in Figma, and even in some browsers. But it won’t look right in all browser, like iOS Safari. So I couldn’t reliably leverage the idea of a bounded canvas as a design element — which, I should’ve known, has always been the case with the web. v1: Bottom Tabs With a Site Theme I really like this pattern on mobile devices, so I thought maybe I’d consider it for navigating between my sites. But how to theme across differently-styled sites? The favicon styles seemed like a good bet! And, of course, what do to on larger devices? Just stacking it felt like overkill, so I explored moving it to the edge. I actually prototyped this in code, but I didn’t like how it felt so I scratched the idea and went other directions. v2: The Unification The more I explored what to do with this element, the more it started taking on additional responsibility. “What if I unified its position with site-specific navigation?” I thought. This led to design explorations where the disparate subdomains began to take on not just a unified navigational element, but a unified header. And I made small, stylistic explorations with the tabs themselves too. You can see how I played toyed with the idea of a consistent header across all my sites (not an intended goal, but ya know, scope creep gets us all). As I began to explore more possibilities than I planned for, things started to get out of hand. v3: Do More. MORE. MORE!! Questions I began asking: Why aren’t these all under the same domain?! What if I had a single domain for feeds across all of them, e.g. feeds.jim-nielsen.com? What about icons instead of words? Wait, wait, wait Jim. Consistent navigation across inconsistent sites. That’s the goal. Pare it back a little. v4: Reigning It Back In To counter my exploratory ambitions, I told myself I needed to ship something without the need to modify the entire design style of all my sites. So how do I do that? That got me back to a simpler premise: consistent navigation across my inconsistent sites. Better — and implementable. Technical Details The implementation was pretty simple. I basically just forked my previous web component and changed some styles. That’s it. The only thing I did different was I moved the web component JS file from being part of my www.jim-nielsen.com git repository to a standalone file (not under git control) on my CDN. This felt like one of the exceptions to the rule of always keeping stuff under version control. It’s more of the classic FTP-style approach to web development. Granted, it’s riskier, but it’s also way more flexible. And I’m good with that trade-off for now. (Ask me again in a few months if I’ve done anything terrible and now have regrets.) Each site implements the component like this (with a different subdomain attribute for each site): <script type="module" src="https://cdn.jim-nielsen.com/shared/jim-site-switcher.js"></script> <jim-site-switcher subdomain="blog"></jim-site-switcher> That’s really all there is to say. Thanks to Zach for prodding me to make this post. Email · Mastodon · Bluesky

3 weeks ago • 15 votes

Bottomless Subtleties

Jason Fried writes in his post “Knives and battleships”: Specific tools and familiar ingredients combined in different ratios, different molds, for different purposes. Like a baker working from the same tight set of pantry ingredients to make a hundred distinct recipes. You wouldn't turn to them and say "enough with the butter, flour, sugar, baking powder, and eggs already!" Getting the same few things right in different ways is a career's worth of work. Mastery comes from a lifetime of putting together the basics in different combinations. I think of Beethoven’s 5th and its famous “short-short-short-long” motif. The entire symphony is essentially the same core idea repeated and developed relentlessly! The same four notes (da-da-da-DAH!) moving between instruments, changing keys, etc. Beethoven took something basic — a four note motif — and extracted an enormous set of variations. Its genius is in illustrating how much can be explored and expressed within constraints (rather than piling on “more and more” novel stuff). Back to Jason’s point: the simplest building blocks in any form — music, code, paint, cooking — implemented with restraint can be combined in an almost infinite set of pleasing ways. As Devine noted (and I constantly link back to): we haven’t even begun to scratch the surface of what we can do with less. Email · Mastodon · Bluesky

3 weeks ago • 18 votes

More in programming

If Apple cared about privacy

Defaults matter

11 hours ago • 5 votes

ARM is great, ARM is terrible (and so is RISC-V)

I’ve long been interested in new and different platforms. I ran Debian on an Alpha back in the late 1990s and was part of the Alpha port team; then I helped bootstrap Debian on amd64. I’ve got somewhere around 8 Raspberry Pi devices in active use right now, and the free NNCPNET Internet email service … Continue reading ARM is great, ARM is terrible (and so is RISC-V) →

9 hours ago • 2 votes

Many Hard Leetcode Problems are Easy Constraint Problems

In my first interview out of college I was asked the change counter problem: Given a set of coin denominations, find the minimum number of coins required to make change for a given number. IE for USA coinage and 37 cents, the minimum number is four (quarter, dime, 2 pennies). I implemented the simple greedy algorithm and immediately fell into the trap of the question: the greedy algorithm only works for "well-behaved" denominations. If the coin values were [10, 9, 1], then making 37 cents would take 10 coins in the greedy algorithm but only 4 coins optimally (10+9+9+9). The "smart" answer is to use a dynamic programming algorithm, which I didn't know how to do. So I failed the interview. But you only need dynamic programming if you're writing your own algorithm. It's really easy if you throw it into a constraint solver like MiniZinc and call it a day. int: total; array[int] of int: values = [10, 9, 1]; array[index_set(values)] of var 0..: coins; constraint sum (c in index_set(coins)) (coins[c] * values[c]) == total; solve minimize sum(coins); You can try this online here. It'll give you a prompt to put in total and then give you successively-better solutions: coins = [0, 0, 37]; ---------- coins = [0, 1, 28]; ---------- coins = [0, 2, 19]; ---------- coins = [0, 3, 10]; ---------- coins = [0, 4, 1]; ---------- coins = [1, 3, 0]; ---------- Lots of similar interview questions are this kind of mathematical optimization problem, where we have to find the maximum or minimum of a function corresponding to constraints. They're hard in programming languages because programming languages are too low-level. They are also exactly the problems that constraint solvers were designed to solve. Hard leetcode problems are easy constraint problems.1 Here I'm using MiniZinc, but you could just as easily use Z3 or OR-Tools or whatever your favorite generalized solver is. More examples This was a question in a different interview (which I thankfully passed): Given a list of stock prices through the day, find maximum profit you can get by buying one stock and selling one stock later. It's easy to do in O(n^2) time, or if you are clever, you can do it in O(n). Or you could be not clever at all and just write it as a constraint problem: array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]; var int: buy; var int: sell; var int: profit = prices[sell] - prices[buy]; constraint sell > buy; constraint profit > 0; solve maximize profit; Reminder, link to trying it online here. While working at that job, one interview question we tested out was: Given a list, determine if three numbers in that list can be added or subtracted to give 0? This is a satisfaction problem, not a constraint problem: we don't need the "best answer", any answer will do. We eventually decided against it for being too tricky for the engineers we were targeting. But it's not tricky in a solver; include "globals.mzn"; array[int] of int: numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]; array[index_set(numbers)] of var {0, -1, 1}: choices; constraint sum(n in index_set(numbers)) (numbers[n] * choices[n]) = 0; constraint count(choices, -1) + count(choices, 1) = 3; solve satisfy; Okay, one last one, a problem I saw last year at Chipy AlgoSIG. Basically they pick some leetcode problems and we all do them. I failed to solve this one: Given an array of integers heights representing the histogram's bar height where the width of each bar is 1, return the area of the largest rectangle in the histogram. The "proper" solution is a tricky thing involving tracking lots of bookkeeping states, which you can completely bypass by expressing it as constraints: array[int] of int: numbers = [2,1,5,6,2,3]; var 1..length(numbers): x; var 1..length(numbers): dx; var 1..: y; constraint x + dx <= length(numbers); constraint forall (i in x..(x+dx)) (y <= numbers[i]); var int: area = (dx+1)*y; solve maximize area; output ["(\(x)->\(x+dx))*\(y) = \(area)"] There's even a way to automatically visualize the solution (using vis_geost_2d), but I didn't feel like figuring it out in time for the newsletter. Is this better? Now if I actually brought these questions to an interview the interviewee could ruin my day by asking "what's the runtime complexity?" Constraint solvers runtimes are unpredictable and almost always than an ideal bespoke algorithm because they are more expressive, in what I refer to as the capability/tractability tradeoff. But even so, they'll do way better than a bad bespoke algorithm, and I'm not experienced enough in handwriting algorithms to consistently beat a solver. The real advantage of solvers, though, is how well they handle new constraints. Take the stock picking problem above. I can write an O(n²) algorithm in a few minutes and the O(n) algorithm if you give me some time to think. Now change the problem to Maximize the profit by buying and selling up to max_sales stocks, but you can only buy or sell one stock at a given time and you can only hold up to max_hold stocks at a time? That's a way harder problem to write even an inefficient algorithm for! While the constraint problem is only a tiny bit more complicated: include "globals.mzn"; int: max_sales = 3; int: max_hold = 2; array[int] of int: prices = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]; array [1..max_sales] of var int: buy; array [1..max_sales] of var int: sell; array [index_set(prices)] of var 0..max_hold: stocks_held; var int: profit = sum(s in 1..max_sales) (prices[sell[s]] - prices[buy[s]]); constraint forall (s in 1..max_sales) (sell[s] > buy[s]); constraint profit > 0; constraint forall(i in index_set(prices)) (stocks_held[i] = (count(s in 1..max_sales) (buy[s] <= i) - count(s in 1..max_sales) (sell[s] <= i))); constraint alldifferent(buy ++ sell); solve maximize profit; output ["buy at \(buy)\n", "sell at \(sell)\n", "for \(profit)"]; Most constraint solving examples online are puzzles, like Sudoku or "SEND + MORE = MONEY". Solving leetcode problems would be a more interesting demonstration. And you get more interesting opportunities to teach optimizations, like symmetry breaking. Because my dad will email me if I don't explain this: "leetcode" is slang for "tricky algorithmic interview questions that have little-to-no relevance in the actual job you're interviewing for." It's from leetcode.com. ↩

9 hours ago • 2 votes

btrfs on a Raspberry Pi

I’m something of a filesystem geek, I guess. I first wrote about ZFS on Linux 14 years ago, and even before I used ZFS, I had used ext2/3/4, jfs, reiserfs, xfs, and no doubt some others. I’ve also used btrfs. I last posted about it in 2014, when I noted it has some advantages over … Continue reading btrfs on a Raspberry Pi →

yesterday • 3 votes

Stumbling upon

Something like a channel changer, for the web. That's what the idea was at first. But it led to a whole new path of discovery that even the site's creators couldn't have predicted. The post Stumbling upon appeared first on The History of the Web.

yesterday • 8 votes

New here?