More from seangoedecke.com RSS feed
What does it mean to get things done? In the abstract, you can complete a mathematical proof or a problem set, but the real world is much fuzzier. Suppose I plant a tree in my backyard. Once the sapling is in the ground, is that done? Not really. There’s always more work to do: clearing the ground around it, watering, keeping pests away, pruning, and so on. Programming large web applications is more like planting a tree than completing a mathematical proof. Once you write a service, you can keep working on it forever if you want to. In large tech companies, this fact is a trap for competent but unagentic engineers. They see an infinite queue of tasks that they’re capable of doing, and they start delivering a stream of marginal improvements to a particular subsystem. From their perspective, it feels like they’re crushing it. After all, they’re putting out work at their top speed: no downtime, no waiting on other teams. But they’re not doing their actual job, which is to deliver the most…
There’s a brand of tech influencer now that’s all about sharing the perfect prompt for any situation. The tweets in question typically read something like “this prompt will make you superhuman”, or “this prompt will be a 20k growth consultant in your pocket”. There’s a kernel of truth here - it’s surprising how much small alterations in a prompt can affect the quality of language model outputs - but overall it’s just a bit silly. Searching for the perfect prompt is just not how you should be engaging with language models. I’ve believed for a while that getting good at AI is not really about “prompt engineering”. Instead, it’s about getting a sense of what language models are good and bad at, of when it’s useful to continue a conversation with an LLM and when you should back out and start a brand-new conversation, of when to use reasoning models and when not to, of when you can broadly trust the model output and when you need to go over it with a fine-tooth comb, and so on. For instance…
I have delivered a lot of successful engineering projects. When I start on a project, I’m now very (perhaps unreasonably) confident that I will ship it successfully. Even so, in every single one of these projects there is a period - perhaps a day, or even a week - where it feels like everything has gone wrong and the project will be a disaster. I call this the valley of engineering despair. A huge part of becoming good at running projects is anticipating and enduring this period. The start of a project always feels good. I have a clear idea of what needs doing, and there’s plenty of time to do it. The very end of a project usually feels good too - by that point all the important pieces are ready, and it’s just a matter of getting the final tweaks and bugfixes in. The hard part is the middle of the project, when all these things are happening at the same time: you’re discovering that some of the things you thought would be easy are actually surprisingly hard; new requirements have come…
People have been making fun of OpenAI models for being overly sycophantic for months now. I even wrote a post advising users to pretend that their work was written by someone else, to counteract the model’s natural desire to shower praise on the user. With the latest GPT-4o update, this tendency has been turned up even further. It’s now easy to convince the model that you’re the smartest, funniest, most handsome human in the world. This is bad for obvious reasons. Lots of people use ChatGPT for advice or therapy. It seems dangerous for ChatGPT to validate people’s belief that they’re always in the right. There are extreme examples on Twitter of ChatGPT agreeing with people that they’re a prophet sent by God, or that they’re making the right choice to go off their medication. These aren’t complicated jailbreaks - the model will actively push you down this path. I think it’s fair to say that sycophancy is the first LLM “dark pattern”. Dark patterns are user interfaces that are designed…
More in AI
It can be bleak out there, but the candor is very helpful, and you occasionally get a win.
Society is once again left holding the bag
Two simple questions to help make Society's Backend better