Full Width [alt+shift+f] Shortcuts [alt+shift+k]
Sign Up [alt+shift+s] Log In [alt+shift+l]
15
Entering 2025, I decided to spend some time exploring the topic of agents. I started reading Anthropic’s Building effective agents, followed by Chip Huyen’s AI Engineering. I kicked off a major workstream at work on using agents, and I also decided to do a personal experiment of sorts. This is a general commentary on building that project. What I wanted to build was a simple chat interface where I could write prompts, select models, and have the model use tools as appropriate. My side goal was to build this using Cursor and generally avoid writing code directly as much as possible, but I found that generally slower than writing code in emacs while relying on 4o-mini to provide working examples to pull from. Similarly, while I initially envisioned building this in fullstack TypeScript via Cursor, I ultimately bailed into a stack that I’m more comfortable, and ended up using Python3, FastAPI, PostgreSQL, and SQLAlchemy with the async psycopg3 driver. It’s been a… while… since I started a...
2 weeks ago

Improve your reading experience

Logged in users get linked directly to articles resulting in a better reading experience. Please login for free, it takes less than 1 minute.

More from Irrational Exuberance

Diagnosis in engineering strategy.

Once you’ve written your strategy’s exploration, the next step is working on its diagnosis. Diagnosis is understanding the constraints and challenges your strategy needs to address. In particular, it’s about doing that understanding while slowing yourself down from deciding how to solve the problem at hand before you know the problem’s nuances and constraints. If you ever find yourself wanting to skip the diagnosis phase–let’s get to the solution already!–then maybe it’s worth acknowledging that every strategy that I’ve seen fail, did so due to a lazy or inaccurate diagnosis. It’s very challenging to fail with a proper diagnosis, and almost impossible to succeed without one. The topics this chapter will cover are: Why diagnosis is the foundation of effective strategy, on which effective policy depends. Conversely, how skipping the diagnosis phase consistently ruins strategies A step-by-step approach to diagnosing your strategy’s circumstances How to incorporate data into your diagnosis effectively, and where to focus on adding data Dealing with controversial elements of your diagnosis, such as pointing out that your own executive is one of the challenges to solve Why it’s more effective to view difficulties as part of the problem to be solved, rather than a blocking issue that prevents making forward progress The near impossibility of an effective diagnosis if you don’t bring humility and self-awareness to the process Into the details we go! This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts. Diagnosis is strategy’s foundation One of the challenges in evaluating strategy is that, after the fact, many effective strategies are so obvious that they’re pretty boring. Similarly, most ineffective strategies are so clearly flawed that their authors look lazy. That’s because, as a strategy is operated, the reality around it becomes clear. When you’re writing your strategy, you don’t know if you can convince your colleagues to adopt a new approach to specifying APIs, but a year later you know very definitively whether it’s possible. Building your strategy’s diagnosis is your attempt to correctly recognize the context that the strategy needs to solve before deciding on the policies to address that context. Done well, the subsequent steps of writing strategy often feel like an afterthought, which is why I think of diagnosis as strategy’s foundation. Where exploration was an evaluation-free activity, diagnosis is all about evaluation. How do teams feel today? Why did that project fail? Why did the last strategy go poorly? What will be the distractions to overcome to make this new strategy successful? That said, not all evaluation is equal. If you state your judgment directly, it’s easy to dispute. An effective diagnosis is hard to argue against, because it’s a web of interconnected observations, facts, and data. Even for folks who dislike your conclusions, the weight of evidence should be hard to shift. Strategy testing, explored in the Refinement section, takes advantage of the reality that it’s easier to diagnose by doing than by speculating. It proposes a recursive diagnosis process until you have real-world evidence that the strategy is working. How to develop your diagnosis Your strategy is almost certain to fail unless you start from an effective diagnosis, but how to build a diagnosis is often left unspecified. That’s because, for most folks, building the diagnosis is indeed a dark art: unspecified, undiscussion, and uncontrollable. I’ve been guilty of this as well, with The Engineering Executive’s Primer’s chapter on strategy staying silent on the details of how to diagnose for your strategy. So, yes, there is some truth to the idea that forming your diagnosis is an emergent, organic process rather than a structured, mechanical one. However, over time I’ve come to adopt a fairly structured approach: Braindump, starting from a blank sheet of paper, write down your best understanding of the circumstances that inform your current strategy. Then set that piece of paper aside for the moment. Summarize exploration on a new piece of paper, review the contents of your exploration. Pull in every piece of diagnosis from similar situations that resonates with you. This is true for both internal and external works! For each diagnosis, tag whether it fits perfectly, or needs to be adjusted for your current circumstances. Then, once again, set the piece of paper aside. Mine for distinct perspectives on yet another blank page, talking to different stakeholders and colleagues who you know are likely to disagree with your early thinking. Your goal is not to agree with this feedback. Instead, it’s to understand their view. The Crux by Richard Rumelt anchors diagnosis in this approach, emphasizing the importance of “testing, adjusting, and changing the frame, or point of view.” Synthesize views into one internally consistent perspective. Sometimes the different perspectives you’ve gathered don’t mesh well. They might well explicitly differ in what they believe the underlying problem is, as is typical in tension between platform and product engineering teams. The goal is to competently represent each of these perspectives in the diagnosis, even the ones you disagree with, so that later on you can evaluate your proposed approach against each of them. When synthesizing feedback goes poorly, it tends to fail in one of two ways. First, the author’s opinion shines through so strongly that it renders the author suspect. Your goal is never to agree with every team’s perspective, just as your diagnosis should typically avoid crowning any perspective as correct: a reader should generally be appraised of the details and unaware of the author. The second common issue is when a group tries to jointly own the synthesis, but create a fractured perspective rather than a unified one. I generally find that having one author who is accountable for representing all views works best to address both of these issues. Test drafts across perspectives. Once you’ve written your initial diagnosis, you want to sit down with the people who you expect to disagree most fervently. Iterate with them until they agree that you’ve accurately captured their perspective. It might be that they disagree with some other view points, but they should be able to agree that others hold those views. They might argue that the data you’ve included doesn’t capture their full reality, in which case you can caveat the data by saying that their team disagrees that it’s a comprehensive lens. Don’t worry about getting the details perfectly right in your initial diagnosis. You’re trying to get the right crumbs to feed into the next phase, strategy refinement. Allowing yourself to be directionally correct, rather than perfectly correct, makes it possible to cover a broad territory quickly. Getting caught up in perfecting details is an easy way to anchor yourself into one perspective prematurely. At this point, I hope you’re starting to predict how I’ll conclude any recipe for strategy creation: if these steps feel overly mechanical to you, adjust them to something that feels more natural and authentic. There’s no perfect way to understand complex problems. That said, if you feel uncertain, or are skeptical of your own track record, I do encourage you to start with the above approach as a launching point. Incorporating data into your diagnosis The strategy for Navigating Private Equity ownership’s diagnosis includes a number of details to help readers understand the status quo. For example the section on headcount growth explains headcount growth, how it compares to the prior year, and providing a mental model for readers to translate engineering headcount into engineering headcount costs: Our Engineering headcount costs have grown by 15% YoY this year, and 18% YoY the prior year. Headcount grew 7% and 9% respectively, with the difference between headcount and headcount costs explained by salary band adjustments (4%), a focus on hiring senior roles (3%), and increased hiring in higher cost geographic regions (1%). If everyone evaluating a strategy shares the same foundational data, then evaluating the strategy becomes vastly simpler. Data is also your mechanism for supporting or critiquing the various views that you’ve gathered when drafting your diagnosis; to an impartial reader, data will speak louder than passion. If you’re confident that a perspective is true, then include a data narrative that supports it. If you believe another perspective is overstated, then include data that the reader will require to come to the same conclusion. Do your best to include data analysis with a link out to the full data, rather than requiring readers to interpret the data themselves while they are reading. As your strategy document travels further, there will be inevitable requests for different cuts of data to help readers understand your thinking, and this is somewhat preventable by linking to your original sources. If much of the data you want doesn’t exist today, that’s a fairly common scenario for strategy work: if the data to make the decision easy already existed, you probably would have already made a decision rather than needing to run a structured thinking process. The next chapter on refining strategy covers a number of tools that are useful for building confidence in low-data environments. Whisper the controversial parts At one time, the company I worked at rolled out a bar raiser program styled after Amazon’s, where there was an interviewer from outside the team that had to approve every hire. I spent some time arguing against adding this additional step as I didn’t understand what we were solving for, and I was surprised at how disinterested management was about knowing if the new process actually improved outcomes. What I didn’t realize until much later was that most of the senior leadership distrusted one of their peers, and had rolled out the bar raiser program solely to create a mechanism to control that manager’s hiring bar when the CTO was disinterested holding that leader accountable. (I also learned that these leaders didn’t care much about implementing this policy, resulting in bar raiser rejections being frequently ignored, but that’s a discussion for the Operations for strategy chapter.) This is a good example of a strategy that does make sense with the full diagnosis, but makes little sense without it, and where stating part of the diagnosis out loud is nearly impossible. Even senior leaders are not generally allowed to write a document that says, “The Director of Product Engineering is a bad hiring manager.” When you’re writing a strategy, you’ll often find yourself trying to choose between two awkward options: Say something awkward or uncomfortable about your company or someone working within it Omit a critical piece of your diagnosis that’s necessary to understand the wider thinking Whenever you encounter this sort of debate, my advice is to find a way to include the diagnosis, but to reframe it into a palatable statement that avoids casting blame too narrowly. I think it’s helpful to discuss a few concrete examples of this, starting with the strategy for navigating private equity, whose diagnosis includes: Based on general practice, it seems likely that our new Private Equity ownership will expect us to reduce R&D headcount costs through a reduction. However, we don’t have any concrete details to make a structured decision on this, and our approach would vary significantly depending on the size of the reduction. There are many things the authors of this strategy likely feel about their state of reality. First, they are probably upset about the fact that their new private equity ownership is likely to eliminate colleagues. Second, they are likely upset that there is no clear plan around what they need to do, so they are stuck preparing for a wide range of potential outcomes. However they feel, they don’t say any of that, they stick to precise, factual statements. For a second example, we can look to the Uber service migration strategy: Within infrastructure engineering, there is a team of four engineers responsible for service provisioning today. While our organization is growing at a similar rate as product engineering, none of that additional headcount is being allocated directly to the team working on service provisioning. We do not anticipate this changing. The team didn’t agree that their headcount should not be growing, but it was the reality they were operating in. They acknowledged their reality as a factual statement, without any additional commentary about that statement. In both of these examples, they found a professional, non-judgmental way to acknowledge the circumstances they were solving. The authors would have preferred that the leaders behind those decisions take explicit accountability for them, but it would have undermined the strategy work had they attempted to do it within their strategy writeup. Excluding critical parts of your diagnosis makes your strategies particularly hard to evaluate, copy or recreate. Find a way to say things politely to make the strategy effective. As always, strategies are much more about realities than ideals. Reframe blockers as part of diagnosis When I work on strategy with early-career leaders, an idea that comes up a lot is that an identified problem means that strategy is not possible. For example, they might argue that doing strategy work is impossible at their current company because the executive team changes their mind too often. That core insight is almost certainly true, but it’s much more powerful to reframe that as a diagnosis: if we don’t find a way to show concrete progress quickly, and use that to excite the executive team, our strategy is likely to fail. This transforms the thing preventing your strategy into a condition your strategy needs to address. Whenever you run into a reason why your strategy seems unlikely to work, or why strategy overall seems difficult, you’ve found an important piece of your diagnosis to include. There are never reasons why strategy simply cannot succeed, only diagnoses you’ve failed to recognize. For example, we knew in our work on Uber’s service provisioning strategy that we weren’t getting more headcount for the team, the product engineering team was going to continue growing rapidly, and that engineering leadership was unwilling to constrain how product engineering worked. Rather than preventing us from implementing a strategy, those components clarified what sort of approach could actually succeed. The role of self-awareness Every problem of today is partially rooted in the decisions of yesterday. If you’ve been with your organization for any duration at all, this means that you are directly or indirectly responsible for a portion of the problems that your diagnosis ought to recognize. This means that recognizing the impact of your prior actions in your diagnosis is a powerful demonstration of self-awareness. It also suggests that your next strategy’s success is rooted in your self-awareness about your prior choices. Don’t be afraid to recognize the failures in your past work. While changing your mind without new data is a sign of chaotic leadership, changing your mind with new data is a sign of thoughtful leadership. Summary Because diagnosis is the foundation of effective strategy, I’ve always found it the most intimidating phase of strategy work. While I think that’s a somewhat unavoidable reality, my hope is that this chapter has somewhat prepared you for that challenge. The four most important things to remember are simply: form your diagnosis before deciding how to solve it, try especially hard to capture perspectives you initially disagree with, supplement intuition with data where you can, and accept that sometimes you’re missing the data you need to fully understand. The last piece in particular, is why many good strategies never get shared, and the topic we’ll address in the next chapter on strategy refinement.

10 hours ago 3 votes
Exploring for strategy.

A surprising number of strategies are doomed from inception because their authors get attached to one particular approach without considering alternatives that would work better for their current circumstances. This happens when engineers want to pick tools solely because they are trending, and when executives insist on adopting the tech stack from their prior organization where they felt comfortable. Exploration is the antidote to early anchoring, forcing you to consider the problem widely before evaluating any of the paths forward. Exploration is about updating your priors before assuming the industry hasn’t evolved since you last worked on a given problem. Exploration is continuing to believe that things can get better when you’re not watching. This chapter covers: The goals of the exploration phase of strategy creation When to explore (always first!) and when it makes sense to stop exploring How to explore a topic, including discussion of the most common mechanisms: mining for internal precedent, reading industry papers and books, and leveraging your external network Why avoiding judgment is an essential part of exploration By the end of this chapter, you’ll be able to conduct an exploration for the current or next strategy that you work on. What is exploration? One of the frequent senior leadership anti-patterns I’ve encountered in my career is The Grand Migration, where a new leader declares that a massive migration to a new technology stack–typically the stack used by their former employer–will solve every pressing problem. What’s distinguishing about the Grand Migration is not the initially bad selection, but the single-minded ferocity with which the senior leader pushes for their approach, even when it becomes abundantly clear to others that it doesn’t solve the problem at hand. These senior leaders are very intelligent, but have allowed themselves to be framed in by their initial thinking from prior experiences. Accepting those early thoughts as the foundation of their strategy, they build the entire strategy on top of those ideas, and eventually there is so much weight standing on those early assumptions that it becomes impossible to acknowledge the errors. Exploration is the deliberate practice of searching through a strategy’s problem and solution spaces before allowing yourself to commit to a given approach. It’s understanding how others have approached the same problem recently and in the past. It’s doing this both in trendy companies you admire, and in practical companies that actually resemble yours. Most exploration will be external to your team, but depending on your company, much of your exploration might be internal to the company. If you’re in a massive engineering organization of 100,000, there are likely existing internal solutions to your problem that you’ve never heard of. Conversely, if you’re in an organization of 50 engineers, it’s likely that much of your exploration will be external. When to explore Exploration is the first step of good strategy work. Even when you want to skip it, you will always regret skipping it, because you’ll inadvertently frame yourself into whatever approach you focus on first. Especially when it comes to problems that you’ve solved previously, exploration is the only thing preventing you from over-indexing on your prior experiences. Try to continue exploration until you know how three similar teams within your company and three similar companies have recently solved the same problem. Further, make sure you are able to explain the thinking behind those decisions. At that point,you should be ready to stop exploring and move on to the diagnosis step of strategy creation. Exploration should always come with a minimum and maximum timeframe: less than a few hours is very suspicious, and more than a week is generally questionably as well. How to explore While the details of each exploration will differ a bit, the overarching approach tends to be pretty similar across strategies. After I open up the draft strategy document I’m working on, my general approach to exploration is: Start throwing in every resource I can think of related to that problem. For example, in the Uber service provisioning strategy, I started by collecting recent papers on Mesos, Kubernetes, and Aurora to understand the state of the industry on orchestration. Do some web searching, foundational model prompting, and checking with a few current and prior colleagues about what topics and resources I might be missing. For example, for the Calm engineering strategy, I focused on talking with industry peers on tools they’d used to focus a team with diffuse goals. Summarize the list of resources I’ve gathered, organizing them by which I want to explore, and which I won’t spend time on but are worth mentioning. For example, the Large Language Model adoption strategy’s exploration section documents the variety of resources the team explored before completing it. Work through the list one by one, continuing to collect notes in the strategy document. When you’re done, synthesize those into a concise, readable summary of what you’ve learned. For example, the monolith decomposition strategy synthesizes the exploration of a broad topic into four paragraphs, with links out to references. Stop once I generally understand how a handful of similar internal and external teams have recently approached this problem. Of all the steps in strategy creation, exploration is inherently open-ended, and you may find a different approach works better for you. If you’re not sure what to do, try following the above steps closely. If you have a different approach that you’re confident in–as long as it’s not skipping exploration!–then go ahead and try that instead. While not discussed in this chapter, you can also use some techniques like Wardley mapping, covered in the Refinement chapter, to support your exploration phase. Wardley mapping is a strategy tool designed within a different strategy tradition, and consequently categorizing it as either solely an exploration tool or a refinement tool ignores some of its potential uses. There’s no perfect way to do strategy: take what works for you and use it. Mine internal precedent One of the most powerful forms of strategy is simply documenting how similar decisions have been made internally: often this is enough to steer how similar future decisions are made within your organization. This approach, documented in Staff Engineer’s Write five, then synthesize, is also the most valuable step of exploration for those working in established companies. If you are a tenured engineer within your organization, then it’s somewhat safe to assume that you are aware of the typical internal approaches. Even then, it’s worth poking around to see if there are any related skunkworks projects happening internally. This is doubly true if you’ve joined the organization recently, or are distant from the codebase itself. In that case, it’s almost always worth poking around to see what already exists. Sometimes the internal approach isn’t ideal, but it’s still superior because it’s already been implemented and there’s someone else maintaining it. In the long-run, your strategy can ride along as someone else addresses the issues that aren’t perfect fits. Using your network How should we control access to user data’s exploration section begins with: Our experience is that best practices around managing internal access to user data are widely available through our networks, and otherwise hard to find. The exact rationale for this is hard to determine, While there are many topics with significant public writing out there, my experience is that there are many topics where there’s very little you can learn without talking directly to practitioners. This is especially true for security, compliance, operating at truly large scale, and competitive processes like optimizing advertising spend. Further, it’s surprisingly common to find that how people publicly describe solving a problem and how they actually approach the problem are largely divorced. This is why having a broad personal network is exceptionally powerful, and makes it possible to quickly understand the breadth of possible solutions. It also provides access to the practical downsides to various approaches, which are often omitted when talking to public proponents. In a recent strategy session, a proposal came up that seemed off to me, and I was able to text–and get answers to those texts–industry peers before the meeting ended, which invalidated the room’s assumptions about what was and was not possible. A disagreement that might have taken weeks to resolve was instead resolved in a few minutes, and we were able to figure out next steps in that meeting rather than waiting a week for the next meeting when we’d realized our mistake. Of course, it’s also important to hold information from your network with skepticism. I’ve certainly had my network be wrong, and your network never knows how your current circumstances differ from theirs, so blindly accepting guidance from your network is never the right decision either. If you’re looking for a more detailed coverage on building your network, this topic has also come up in Staff Engineer’s chapter on Build a network of peers, and The Engineering Executive’s Primer’s chapter on Building your executive network. It feels silly to cover the same topic a third time, but it’s a foundational technique for effective decision making. Read widely; read narrowly Reading has always been an important part of my strategy work. There are two distinct motions to this approach: read widely on an ongoing basis to broaden your thinking, and read narrowly on the specific topic you’re working on. Starting with reading widely, I make an effort each year to read ten to twenty industry-relevant works. These are not necessarily new releases, but are new releases for me. Importantly, I try to read things that I don’t know much about or that I initially disagree with. Some of my recent reads were Chip War, Building Green Software, Tidy First?, and How Big Things Get Done. From each of these books, I learned something, and over time they’ve built a series of bookmarks in my head about ideas that might apply to new problems. On the other end of things is reading narrowly. When I recently started working on an AI agents strategy, the first thing I did was read through Chip Huyen’s AI Engineering, which was an exceptionally helpful survey. Similarly, when we started thinking about Uber’s service migration, we read a number of industry papers, including Large-scale cluster management at Google with Borg and Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. None of these readings had all the answers to the problems I was working on, but they did an excellent job at helping me understand the range of options, as well as identifying other references to consult in my exploration. I’ll mention two nuances that will help a lot here. First, I highly encourage getting comfortable with skimming books. Even tightly edited books will have a lot of content that isn’t particularly relevant to your current goals, and you should skip that content liberally. Second, what you read doesn’t have to be books. It can be blog posts, essays, interview transcripts, or certainly it can be books. In this context, “reading” doesn’t event have to actually be reading. There are conference talks that contain just as much as a blog post, and conferences that cover as much breadth as a book. There are also conference talks without a written equivalent, such as Dan Na’s excellent Pushing Through Friction. Each job is an education Experience is frequently disregarded in the technology industry, and there are ways to misuse experience by copying too liberally the solutions that worked in different circumstances, but the most effective, and the slowest, mechanism for exploring is continuing to work in the details of meaningful problems. You probably won’t choose every job to optimize for learning, but allowing you to instantly explore more complex problems over time–recognizing that a bit of your data will have become stale each time–is uniquely valuable. Save judgment for later As I’ve mentioned several times, the point of exploration is to go broad with the goal of understanding approaches you might not have considered, and invalidating things you initially think are true. Both of those things are only possible if you save judgment for later: if you’re passing judgment about whether approaches are “good” or “bad”, then your exploration is probably going astray. As a soft rule, I’d argue that if no one involved in a strategy has changed their mind about something they believed when you started the exploration step, then you’re not done exploring. This is especially true when it comes to strategy work by senior leaders. Their beliefs are often well-justified by years of experience, but it’s unclear which parts of their experience have become stale over time. Summary At this point, I hope you feel comfortable exploring as the first step of your strategy work, and understand the likely consequences of skipping this step. It’s not an overstatement to say that every one of the worst strategic failures I’ve encountered would have been prevented by its primary author taking a few days to explore the space before anchoring on a particular approach. A few days of feeling slow are always worth avoiding years of misguided efforts.

a week ago 10 votes
How should we control access to user data?

At some point in a startup’s lifecycle, they decide that they need to be ready to go public in 18 months, and a flurry of IPO-readiness activity kicks off. This strategy focuses on a company working on IPO readiness, which has identified a gap in their internal controls for managing access to their users’ data. It’s a company that wants to meaningfully improve their security posture around user data access, but which has had a number of failed security initiatives over the years. Most of those initiatives have failed because they significantly degraded internal workflows for teams like customer support, such that the initial progress was reverted and subverted over time, to little long-term effect. This strategy represents the Chief Information Security Officer’s (CISO) attempt to acknowledge and overcome those historical challenges while meeting their IPO readiness obligations, and–most importantly–doing right by their users. This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts. Reading this document To apply this strategy, start at the top with Policy. To understand the thinking behind this strategy, read sections in reverse order, starting with Explore, then Diagnose and so on. Relative to the default structure, this document has been refactored in two ways to improve readability: first, Operation has been folded into Policy; second, Refine has been embedded in Diagnose. More detail on this structure in Making a readable Engineering Strategy document. Policy & Operations Our new policies, and the mechanisms to operate them are: Controls for accessing user data must be significantly stronger prior to our IPO. Senior leadership, legal, compliance and security have decided that we are not comfortable accepting the status quo of our user data access controls as a public company, and must meaningfully improve the quality of resource-level access controls as part of our pre-IPO readiness efforts. Our Security team is accountable for the exact mechanisms and approach to addressing this risk. We will continue to prioritize a hybrid solution to resource-access controls. This has been our approach thus far, and the fastest available option. Directly expose the log of our resource-level accesses to our users. We will build towards a user-accessible log of all company accesses of user data, and ensure we are comfortable explaining each and every access. In addition, it means that each rationale for access must be comprehensible and reasonable from a user perspective. This is important because it aligns our approach with our users’ perspectives. They will be able to evaluate how we access their data, and make decisions about continuing to use our product based on whether they agree with our use. Good security discussions don’t frame decisions as a compromise between security and usability. We will pursue multi-dimensional tradeoffs to simultaneously improve security and efficiency. Whenever we frame a discussion on trading off between security and utility, it’s a sign that we are having the wrong discussion, and that we should rethink our approach. We will prioritize mechanisms that can both automatically authorize and automatically document the rationale for accesses to customer data. The most obvious example of this is automatically granting access to a customer support agent for users who have an open support ticket assigned to that agent. (And removing that access when that ticket is reassigned or resolved.) Measure progress on percentage of customer data access requests justified by a user-comprehensible, automated rationale. This will anchor our approach on simultaneously improving the security of user data and the usability of our colleagues’ internal tools. If we only expand requirements for accessing customer data, we won’t view this as progress because it’s not automated (and consequently is likely to encourage workarounds as teams try to solve problems quickly). Similarly, if we only improve usability, charts won’t represent this as progress, because we won’t have increased the number of supported requests. As part of this effort, we will create a private channel where the security and compliance team has visibility into all manual rationales for user-data access, and will directly message the manager of any individual who relies on a manual justification for accessing user data. Expire unused roles to move towards principle of least privilege. Today we have a number of roles granted in our role-based access control (RBAC) system to users who do not use the granted permissions. To address that issue, we will automatically remove roles from colleagues after 90 days of not using the role’s permissions. Engineers in an active on-call rotation are the exception to this automated permission pruning. Weekly reviews until we see progress; monthly access reviews in perpetuity. Starting now, there will be a weekly sync between the security engineering team, teams working on customer data access initiatives, and the CISO. This meeting will focus on rapid iteration and problem solving. This is explicitly a forum for ongoing strategy testing, with CISO serving as the meeting’s sponsor, and their Principal Security Engineer serving as the meeting’s guide. It will continue until we have clarity on the path to 100% coverage of user-comprehensible, automated rationales for access to customer data. Separately, we are also starting a monthly review of sampled accesses to customer data to ensure the proper usage and function of the rationale-creation mechanisms we build. This meeting’s goal is to review access rationales for quality and appropriateness, both by reviewing sampled rationales in the short-term, and identifying more automated mechanisms for identifying high-risk accesses to review in the future. Exceptions must be granted in writing by CISO. While our overarching Engineering Strategy states that we follow an advisory architecture process as described in Facilitating Software Architecture, the customer data access policy is an exception and must be explicitly approved, with documentation, by the CISO. Start that process in the #ciso channel. Diagnose We have a strong baseline of role-based access controls (RBAC) and audit logging. However, we have limited mechanisms for ensuring assigned roles follow the principle of least privilege. This is particularly true in cases where individuals change teams or roles over the course of their tenure at the company: some individuals have collected numerous unused roles over five-plus years at the company. Similarly, our audit logs are durable and pervasive, but we have limited proactive mechanisms for identifying anomalous usage. Instead they are typically used to understand what occurred after an incident is identified by other mechanisms. For resource-level access controls, we rely on a hybrid approach between a 3rd-party platform for incoming user requests, and approval mechanisms within our own product. Providing a rationale for access across these two systems requires manual work, and those rationales are later manually reviewed for appropriateness in a batch fashion. There are two major ongoing problems with our current approach to resource-level access controls. First, the teams making requests view them as a burdensome obligation without much benefit to them or on behalf of the user. Second, because the rationale review steps are manual, there is no verifiable evidence of the quality of the review. We’ve found no evidence of misuse of user data. When colleagues do access user data, we have uniformly and consistently found that there is a clear, and reasonable rationale for that access. For example, a ticket in the user support system where the user has raised an issue. However, the quality of our documented rationales is consistently low because it depends on busy people manually copying over significant information many times a day. Because the rationales are of low quality, the verification of these rationales is somewhat arbitrary. From a literal compliance perspective, we do provide rationales and auditing of these rationales, but it’s unclear if the majority of these audits increase the security of our users’ data. Historically, we’ve made significant security investments that caused temporary spikes in our security posture. However, looking at those initiatives a year later, in many cases we see a pattern of increased scrutiny, followed by a gradual repeal or avoidance of the new mechanisms. We have found that most of them involved increased friction for essential work performed by other internal teams. In the natural order of performing work, those teams would subtly subvert the improvements because it interfered with their immediate goals (e.g. supporting customer requests). As such, we have high conviction from our track record that our historical approach can create optical wins internally. We have limited conviction that it can create long-term improvements outside of significant, unlikely internal changes (e.g. colleagues are markedly less busy a year from now than they are today). It seems likely we need a new approach to meaningfully shift our stance on these kinds of problems. Explore Our experience is that best practices around managing internal access to user data are widely available through our networks, and otherwise hard to find. The exact rationale for this is hard to determine, but it seems possible that it’s a topic that folks are generally uncomfortable discussing in public on account of potential future liability and compliance issues. In our exploration, we found two standardized dimensions (role-based access controls, audit logs), and one highly divergent dimension (resource-specific access controls): Role-based access controls (RBAC) are a highly standardized approach at this point. The core premise is that users are mapped to one or more roles, and each role is granted a certain set of permissions. For example, a role representing the customer support agent might be granted permission to deactivate an account, whereas a role representing the sales engineer might be able to configure a new account. Audit logs are similarly standardized. All access and mutation of resources should be tied in a durable log to the human who performed the action. These logs should be accumulated in a centralized, queryable solution. One of the core challenges is determining how to utilize these logs proactively to detect issues rather than reactively when an issue has already been flagged. Resource-level access controls are significantly less standardized than RBAC or audit logs. We found three distinct patterns adopted by companies, with little consistency across companies on which is adopted. Those three patterns for resource-level access control were: 3rd-party enrichment where access to resources is managed in a 3rd-party system such as Zendesk. This requires enriching objects within those systems with data and metadata from the product(s) where those objects live. It also requires implementing actions on the platform, such as archiving or configuration, allowing them to live entirely in that platform’s permission structure. The downside of this approach is tight coupling with the platform vendor, any limitations inherent to that platform, and the overhead of maintaining engineering teams familiar with both your internal technology stack and the platform vendor’s technology stack. 1st-party tool implementation where all activity, including creation and management of user issues, is managed within the core product itself. This pattern is most common in earlier stage companies or companies whose customer support leadership “grew up” within the organization without much exposure to the approach taken by peer companies. The advantage of this approach is that there is a single, tightly integrated and infinitely extensible platform for managing interactions. The downside is that you have to build and maintain all of that work internally rather than pushing it to a vendor that ought to be able to invest more heavily into their tooling. Hybrid solutions where a 3rd-party platform is used for most actions, and is further used to permit resource-level access within the 1st-party system. For example, you might be able to access a user’s data only while there is an open ticket created by that user, and assigned to you, in the 3rd-party platform. The advantage of this approach is that it allows supporting complex workflows that don’t fit within the platform’s limitations, and allows you to avoid complex coupling between your product and the vendor platform. Generally, our experience is that all companies implement RBAC, audit logs, and one of the resource-level access control mechanisms. Most companies pursue either 3rd-party enrichment with a sizable, long-standing team owning the platform implementation, or rely on a hybrid solution where they are able to avoid a long-standing dedicated team by lumping that work into existing teams.

2 weeks ago 14 votes
Is engineering strategy useful?

While I frequently hear engineers bemoan a missing strategy, they rarely complete the thought by articulating why the missing strategy matters. Instead, it serves as more of a truism: the economy used to be better, children used to respect their parents, and engineering organizations used to have an engineering strategy. This chapter starts by exploring something I believe quite strongly: there’s always an engineering strategy, even if there’s nothing written down. From there, we’ll discuss why strategy, especially written strategy, is such a valuable opportunity for organizations that take it seriously. We’ll dig into: Why there’s always a strategy, even when people say there isn’t How strategies have been impactful across my career How inappropriate strategies create significant organizational pain without much compensating impact How written strategy drives organizational learning The costs of not writing strategy down How strategy supports personal learning and developing, even in cases where you’re not empowered to “do strategy” yourself By this chapter’s end, hopefully you will agree with me that strategy is an undertaking worth investing your–and your organization’s–time in. This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts. There’s always a strategy I’ve never worked somewhere where people didn’t claim there as no strategy. In many of those companies, they’d say there was no engineering strategy. Once I became an executive and was able to document and distribute an engineering strategy, accusations of missing strategy didn’t go away, they just shfited to focus on a missing product or company strategy. This even happened at companies that definitively had engineering strategies like Stripe in 2016 which had numerous pillars to a clear engineering strategy such as: Maintain backwards API compatibilty, at almost any cost (e.g. force an upgrade from TLS 1.2 to TLS 1.3 to retain PCI compliance, but don’t force upgrades from the /v1/charges endpoint to the /v1/payment_intents endpoint) Work in Ruby in a monorepo, unless it’s the PCI environment, data processing, or data science work Engineers are fully responsible for the usability of their work, even when there are product or engineering managers involved Working there it was generally clear what the company’s engineering strategy was on any given topic. That said, it sometimes required asking around, and over time certain decisions became sufficiently contentious that it became hard to definitively answer what the strategy was. For example, the adoptino of Ruby versus Java became contentious enough that I distributed a strategy attempting to mediate the disagreement, Magnitudes of exploration, although it wasn’t a particularly successful effort (for reasons that are obvious in hindsight, particularly the lack of any enforcement mechanism). In the same sense that William Gibson said “The future is already here – it’s just not very evenly distributed,” there is always a strategy embedded into an organization’s decisions, although in many organizations that strategy is only visible to a small group, and may be quickly forgotten. If you ever find yourself thinking that a strategy doesn’t exist, I’d encourage you to instead ask yourself where the strategy lives if you can’t find it. Once you do find it, you may also find that the strategy is quite ineffective, but I’ve simply never found that it doesn’t exist. Strategy is impactful In “We are a product engineering company!”, we discuss Calm’s engineering strategy to address pervasive friction within the engineering team. The core of that strategy is clarifying how Calm makes major technology decisions, along with documenting the motivating goal steering those decisions: maximizing time and energy spent on creating their product. That strategy reduced friction by eliminating the cause of ongoing debate. It was successful in resetting the team’s focus. It also caused several engineers to leave the company, because it was incompatible with their priorities. It’s easy to view that as a downside, but I don’t think it was. A clear, documented strategy made it clear to everyone involved what sort of game we were playing, the rules for that game, and for the first time let them accurately decide if they wanted to be part of that game with the wider team. Creating alignment is one of the ways that strategy makes an impact, but it’s certainly not the only way. Some of the ways that strategies support the creating organization are: Concentrating company investment into a smaller space. For example, deciding not to decompose a monolith allows you to invest the majority of your tooling efforts on one language, one test suite, and one deployment mechanism. Many interesting properties only available through universal adoption. For example, moving to an “N-1 policy” on backfilled roles is a significant opportunity for managing costs, but only works if consistently adopted. As another example, many strategies for disaster recovery or multi-region are only viable if all infrastructure has a common configuration mechanism. Focus execution on what truly matters. For example, Uber’s service migration strategy allowed a four engineer team to migrate a thousand services operated by two thousand engineers to a new provisioning and orchestration platform in less than a year. This was an extraordinarily difficult project, and was only possible because of clear thinking. Creating a knowledge repository of how your organization thinks. Onboarding new hires, particularly senior new hires, is much more effective with documented strategy. For example, most industry professionals today have a strongly held opinion on how to adopt large language models. New hires will have a strong opinion as well, but they’re unlikely to share your organization’s opinion unless there’s a clear document they can read to understand it. There are some things that a strategy, even a cleverly written one, cannot do. However, it’s always been my experience that developing a strategy creates progress, even if the progress is understanding the inherent disagreement preventing agreement. Inappropriate strategy is especially impactful While good strategy can accomplish many things, it sometimes feels that inappropriate strategy is far more impactful. Of course, impactful in all the wrong ways. Digg V4 remains the worst considered strategy I’ve personally participated in. It was a complete rewrite of the Digg V3.5 codebase from a PHP monolith to a PHP frontend and backend of a dozen Python services. It also moved the database from sharded MySQL to an early version of Cassandra. Perhaps worst, it replaced the nuanced algorithms developed over a decade with a hack implemented a few days before launch. Although it’s likely Digg would have struggled to become profitable due to its reliance on search engine optimization for traffic, and Google’s frequently changing search algorithm of that era, the engineering strategy ensured we died fast rather than having an opportunity to dig our way out. Importantly, it’s not just Digg. Almost every engineering organization you drill into will have it’s share of unused platform projects that captured decades of engineering years to the detriment of an important opportunity. A shocking number of senior leaders join new companies and initiate a grand migration that attempts to entirely rewrite the architecture, switch programming languages, or otherwise shift their new organization to resemble a prior organization where they understood things better. Inappropriate versus bad When I first wrote this section, I just labeled this sort of strategy as “bad.” The challenge with that term is that the same strategy might well be very effective in a different set of circumstances. For example, if Digg had been a three person company with no revenue, rewriting from scratch could have the right decision! As a result, I’ve tried to prefer the term “inappropriate” rather than “bad” to avoid getting caught up on whether a given approach might work in other circumstances. Every approach undoubtedly works in some organization. Written strategy drives organizational learning When I joined Carta, I noticed we had an inconsistent approach to a number of important problems. Teams had distinct standard kits for how they approached new projects. Adoption of existing internal platforms was inconsistent, as was decision making around funding new internal platforms. There was widespread agreement that we were decomposing our monolith, but no agreement on how we were doing it. Coming into such a permissive strategy environment, with strong, differing perspectives on the ideal path forward, one of my first projects was writing down an explicit engineering strategy along with our newly formed Navigators team, itself a part of our new engineering strategy. Navigators at Carta As discussed in Navigators, we developed a program at Carta to have explicitly named individual contributor, technical leaders to represent key parts of the engineering organization. This representative leadership group made it possible to iterate on strategy with a small team of about ten engineers that represented the entire organization, rather than take on the impossible task of negotiating with 400 engineers directly. This written strategy made it possible to explicitly describe the problems we saw, and how we wanted to navigate those problems. Further, it was an artifact that we were able to iterate on in a small group, but then share widely for feedback from teams we might have missed. After initial publishing, we shared it widely and talked about it frequently in engineering all-hands meetings. Then we came back to it each year, or when things stopped making much sense, and revised it. As an example, our initial strategy didn’t talk about artificial intelligence at all. A few months later, we extended it to mention a very conservative approach to using Large Language Models. Most recently, we’ve revised the artificial intelligence portion again, as we dive deeply into agentic workflows. A lot of people have disagreed with parts of the strategy, which is great: that’s one of the key benefits of a written strategy, it’s possible to precisely disagree. From that disagreement, we’ve been able to evolve our strategy. Sometimes because there’s new information like the current rapidly evolution of artificial intelligence pratices, and other times because our initial approach could be improved like in how we gated membership of the initial Navigators team. New hires are able to disagree too, and do it from an informed place rather than coming across as attached to their prior company’s practices. In particular, they’re able to understand the historical thinking that motivated our decisions, even when that context is no longer obvious. At the time we paused decomposition of our monolith, there was significant friction in service provisioning, but that’s far less true today, which makes the decision seem a bit arbitrary. Only the written document can consistently communicate that context across a growing, shifting, and changing organization. With oral history, what you believe is highly dependent on who you talk with, which shapes your view of history and the present. With writen history, it’s far more possible to agree at scale, which is the prerequisite to growing at scale rather than isolating growth to small pockets of senior leadership. The cost of implicit strategy We just finished talking about written strategy, and this book spends a lot of time on this topic, including a chapter on how to structure strategies to maximize readability. It’s not just because of the positives created by written strategy, but also because of the damage unwritten strategy creates. Vulnerable to misinterpretation. Information flow in verbal organizations depends on an individual being in a given room for a decision, and then accurately repeating that information to the others who need it. However, it’s common to see those individuals fail to repeat that information elsewhere. Sometimes their interpretation is also faulty to some degree. Both of these create significant problems in operating strategy. Two-headed organizations Some years ago, I started moving towards a model where most engineering organizations I worked with have two leaders: one who’s a manager, and another who is a senior engineer. This was partially to ensure engineering context was included in senior decision making, but it was also to reduce communication errors. Errors in point-to-point communication are so prevalent when done one-to-one, that the only solution I could find for folks who weren’t reading-oriented communicators was ensuring I had communicated strategy (and other updates) to at least two people. Inconsistency across teams. At one company I worked in, promotions to Staff-plus role happened at a much higher rate in the infrastructure engineering organization than the product engineering team. This created a constant drain out of product engineering to work on infrastructure shaped problems, even if those problems weren’t particularly valuable to the business. New leaders had no idea this informal policy existed, and they would routinely run into trouble in calibration discussions. They also weren’t aware they needed to go argue for a better policy. Worse, no one was sure if this was a real policy or not, so it was ultimately random whether this perspective was represented for any given promotion: sometimes good promotions would be blocked, sometimes borderline cases would be approved. Inconsistency over time. Implementing a new policy tends to be a mix of persistent and one-time actions. For example, let’s say you wanted to standardize all HTTP operations to use the same library across your codebase. You might add a linter check to reject known alternatives, and you’ll probably do a one-time pass across your codebase standardizing on that library. However, two years later there are another three random HTTP libraries in your codebase, creeping into the cracks surrounding your linting. If the policy is written down, and a few people read it, then there’s a number of ways this could be nonetheless prevented. If it’s not written down, it’s much less likely someone will remember, and much more likely they won’t remember the rationale well enough to argue about it. Hazard to new leadership. When a new Staff-plus engineer or executive joins a company, it’s common to blame them for failing to understand the existing context behind decisions. That’s fair: a big part of senior leadership is uncovering and understanding context. It’s also unfair: explicit documentation of prior thinking would have made this much easier for them. Every particularly bad new-leader onboarding that I’ve seen has involved a new leader coming into an unfilled role, that the new leader’s manager didn’t know how to do. In those cases, success is entirely dependent on that new leader’s ability and interest in learning. In most ways, the practice of documenting strategy has a lot in common with succession planning, where the full benefits accrue to the organization rather than to the individual doing it. It’s possible to maintain things when the original authors are present, appreciating the value requires stepping outside yourself for a moment to value things that will matter most to the organization when you’re no longer a member. Information herd immunity A frequent objection to written strategy is that no one reads anything. There’s some truth to this: it’s extremely hard to get everyone in an organization to know something. However, I’ve never found that goal to be particularly important. My view of information dispersal in an organization is the same as Herd immunity: you don’t need everyone to know something, just to have enough people who know something that confusion doesn’t propagate too far. So, it may be impossible for all engineers to know strategy details, but you certainly can have every Staff-plus engineer and engineering manager know those details. Strategy supports personal learning While I believe that the largest benefits of strategy accrue to the organization, rather than the individual creating it, I also believe that strategy is an underrated avenue for self-development. The ways that I’ve seen strategy support personal development are: Creating strategy builds self-awareness. Starting with a concrete example, I’ve worked with several engineers who viewed themselves as extremely senior, but frequently demanded that projects were implemented using new programming languages or technologies because they personally wanted to learn about the technology. Their internal strategy was clear–they wanted to work on something fun–but following the steps to build an engineering strategy would have created a strategy that even they agreed didn’t make sense. Strategy supports situational awareness in new environments. Wardley mapping talks a lot about situational awareness as a prerequisite to good strategy. This is ensuring you understand the realities of your circumstances, which is the most destructive failure of new senior engineering leaders. By explicitly stating the diagnosis where the strategy applied, it makes it easier for you to debug why reusing a prior strategy in a new team or company might not work. Strategy as your personal archive. Just as documented strategy is institutional memory, it also serves as personal memory to understand the impact of your prior approaches. Each of us is an archivist of our prior work, pulling out the most valuable pieces to address the problem at hand. Over a long career, memory fades–and motivated reasoning creeps in–but explicit documentation doesn’t. Indeed, part of the reason I started working on this book now rather than later is that I realized I was starting to forget the details of the strategy work I did earlier in my career. If I wanted to preserve the wisdom of that era, and ensure I didn’t have to relearn the same lessons in the future, I had to write it now. Summary We’ve covered why strategy can be a valuable learning mechanism for both your engineering organization and for you. We’ve shown how strategies have helped organizations deal with service migrations, monolith decomposition, and right-sizing backfilling. We’ve also discussed how inappropriate strategy contributed to Digg’s demise. However, if I had to pick two things to emphasize as this chapter ends, it wouldn’t be any of those things. Rather, it would be two themes that I find are the most frequently ignored: There’s always a strategy, even if it isn’t written down. The single biggest act you can take to further strategy in your organization is to write down strategy so it can be debated, agreed upon, and explicitly evolved. Discussions around topics like strategy often get caught up in high prestige activities like making controversial decisions, but the most effective strategists I’ve seen make more progress by actually performing the basics: writing things down, exploring widely to see how other companies solve the same problem, accepting feedback into their draft from folks who disagree with them. Strategy is useful, and doing strategy can be simple, too.

3 weeks ago 21 votes

More in programming

Diagnosis in engineering strategy.

Once you’ve written your strategy’s exploration, the next step is working on its diagnosis. Diagnosis is understanding the constraints and challenges your strategy needs to address. In particular, it’s about doing that understanding while slowing yourself down from deciding how to solve the problem at hand before you know the problem’s nuances and constraints. If you ever find yourself wanting to skip the diagnosis phase–let’s get to the solution already!–then maybe it’s worth acknowledging that every strategy that I’ve seen fail, did so due to a lazy or inaccurate diagnosis. It’s very challenging to fail with a proper diagnosis, and almost impossible to succeed without one. The topics this chapter will cover are: Why diagnosis is the foundation of effective strategy, on which effective policy depends. Conversely, how skipping the diagnosis phase consistently ruins strategies A step-by-step approach to diagnosing your strategy’s circumstances How to incorporate data into your diagnosis effectively, and where to focus on adding data Dealing with controversial elements of your diagnosis, such as pointing out that your own executive is one of the challenges to solve Why it’s more effective to view difficulties as part of the problem to be solved, rather than a blocking issue that prevents making forward progress The near impossibility of an effective diagnosis if you don’t bring humility and self-awareness to the process Into the details we go! This is an exploratory, draft chapter for a book on engineering strategy that I’m brainstorming in #eng-strategy-book. As such, some of the links go to other draft chapters, both published drafts and very early, unpublished drafts. Diagnosis is strategy’s foundation One of the challenges in evaluating strategy is that, after the fact, many effective strategies are so obvious that they’re pretty boring. Similarly, most ineffective strategies are so clearly flawed that their authors look lazy. That’s because, as a strategy is operated, the reality around it becomes clear. When you’re writing your strategy, you don’t know if you can convince your colleagues to adopt a new approach to specifying APIs, but a year later you know very definitively whether it’s possible. Building your strategy’s diagnosis is your attempt to correctly recognize the context that the strategy needs to solve before deciding on the policies to address that context. Done well, the subsequent steps of writing strategy often feel like an afterthought, which is why I think of diagnosis as strategy’s foundation. Where exploration was an evaluation-free activity, diagnosis is all about evaluation. How do teams feel today? Why did that project fail? Why did the last strategy go poorly? What will be the distractions to overcome to make this new strategy successful? That said, not all evaluation is equal. If you state your judgment directly, it’s easy to dispute. An effective diagnosis is hard to argue against, because it’s a web of interconnected observations, facts, and data. Even for folks who dislike your conclusions, the weight of evidence should be hard to shift. Strategy testing, explored in the Refinement section, takes advantage of the reality that it’s easier to diagnose by doing than by speculating. It proposes a recursive diagnosis process until you have real-world evidence that the strategy is working. How to develop your diagnosis Your strategy is almost certain to fail unless you start from an effective diagnosis, but how to build a diagnosis is often left unspecified. That’s because, for most folks, building the diagnosis is indeed a dark art: unspecified, undiscussion, and uncontrollable. I’ve been guilty of this as well, with The Engineering Executive’s Primer’s chapter on strategy staying silent on the details of how to diagnose for your strategy. So, yes, there is some truth to the idea that forming your diagnosis is an emergent, organic process rather than a structured, mechanical one. However, over time I’ve come to adopt a fairly structured approach: Braindump, starting from a blank sheet of paper, write down your best understanding of the circumstances that inform your current strategy. Then set that piece of paper aside for the moment. Summarize exploration on a new piece of paper, review the contents of your exploration. Pull in every piece of diagnosis from similar situations that resonates with you. This is true for both internal and external works! For each diagnosis, tag whether it fits perfectly, or needs to be adjusted for your current circumstances. Then, once again, set the piece of paper aside. Mine for distinct perspectives on yet another blank page, talking to different stakeholders and colleagues who you know are likely to disagree with your early thinking. Your goal is not to agree with this feedback. Instead, it’s to understand their view. The Crux by Richard Rumelt anchors diagnosis in this approach, emphasizing the importance of “testing, adjusting, and changing the frame, or point of view.” Synthesize views into one internally consistent perspective. Sometimes the different perspectives you’ve gathered don’t mesh well. They might well explicitly differ in what they believe the underlying problem is, as is typical in tension between platform and product engineering teams. The goal is to competently represent each of these perspectives in the diagnosis, even the ones you disagree with, so that later on you can evaluate your proposed approach against each of them. When synthesizing feedback goes poorly, it tends to fail in one of two ways. First, the author’s opinion shines through so strongly that it renders the author suspect. Your goal is never to agree with every team’s perspective, just as your diagnosis should typically avoid crowning any perspective as correct: a reader should generally be appraised of the details and unaware of the author. The second common issue is when a group tries to jointly own the synthesis, but create a fractured perspective rather than a unified one. I generally find that having one author who is accountable for representing all views works best to address both of these issues. Test drafts across perspectives. Once you’ve written your initial diagnosis, you want to sit down with the people who you expect to disagree most fervently. Iterate with them until they agree that you’ve accurately captured their perspective. It might be that they disagree with some other view points, but they should be able to agree that others hold those views. They might argue that the data you’ve included doesn’t capture their full reality, in which case you can caveat the data by saying that their team disagrees that it’s a comprehensive lens. Don’t worry about getting the details perfectly right in your initial diagnosis. You’re trying to get the right crumbs to feed into the next phase, strategy refinement. Allowing yourself to be directionally correct, rather than perfectly correct, makes it possible to cover a broad territory quickly. Getting caught up in perfecting details is an easy way to anchor yourself into one perspective prematurely. At this point, I hope you’re starting to predict how I’ll conclude any recipe for strategy creation: if these steps feel overly mechanical to you, adjust them to something that feels more natural and authentic. There’s no perfect way to understand complex problems. That said, if you feel uncertain, or are skeptical of your own track record, I do encourage you to start with the above approach as a launching point. Incorporating data into your diagnosis The strategy for Navigating Private Equity ownership’s diagnosis includes a number of details to help readers understand the status quo. For example the section on headcount growth explains headcount growth, how it compares to the prior year, and providing a mental model for readers to translate engineering headcount into engineering headcount costs: Our Engineering headcount costs have grown by 15% YoY this year, and 18% YoY the prior year. Headcount grew 7% and 9% respectively, with the difference between headcount and headcount costs explained by salary band adjustments (4%), a focus on hiring senior roles (3%), and increased hiring in higher cost geographic regions (1%). If everyone evaluating a strategy shares the same foundational data, then evaluating the strategy becomes vastly simpler. Data is also your mechanism for supporting or critiquing the various views that you’ve gathered when drafting your diagnosis; to an impartial reader, data will speak louder than passion. If you’re confident that a perspective is true, then include a data narrative that supports it. If you believe another perspective is overstated, then include data that the reader will require to come to the same conclusion. Do your best to include data analysis with a link out to the full data, rather than requiring readers to interpret the data themselves while they are reading. As your strategy document travels further, there will be inevitable requests for different cuts of data to help readers understand your thinking, and this is somewhat preventable by linking to your original sources. If much of the data you want doesn’t exist today, that’s a fairly common scenario for strategy work: if the data to make the decision easy already existed, you probably would have already made a decision rather than needing to run a structured thinking process. The next chapter on refining strategy covers a number of tools that are useful for building confidence in low-data environments. Whisper the controversial parts At one time, the company I worked at rolled out a bar raiser program styled after Amazon’s, where there was an interviewer from outside the team that had to approve every hire. I spent some time arguing against adding this additional step as I didn’t understand what we were solving for, and I was surprised at how disinterested management was about knowing if the new process actually improved outcomes. What I didn’t realize until much later was that most of the senior leadership distrusted one of their peers, and had rolled out the bar raiser program solely to create a mechanism to control that manager’s hiring bar when the CTO was disinterested holding that leader accountable. (I also learned that these leaders didn’t care much about implementing this policy, resulting in bar raiser rejections being frequently ignored, but that’s a discussion for the Operations for strategy chapter.) This is a good example of a strategy that does make sense with the full diagnosis, but makes little sense without it, and where stating part of the diagnosis out loud is nearly impossible. Even senior leaders are not generally allowed to write a document that says, “The Director of Product Engineering is a bad hiring manager.” When you’re writing a strategy, you’ll often find yourself trying to choose between two awkward options: Say something awkward or uncomfortable about your company or someone working within it Omit a critical piece of your diagnosis that’s necessary to understand the wider thinking Whenever you encounter this sort of debate, my advice is to find a way to include the diagnosis, but to reframe it into a palatable statement that avoids casting blame too narrowly. I think it’s helpful to discuss a few concrete examples of this, starting with the strategy for navigating private equity, whose diagnosis includes: Based on general practice, it seems likely that our new Private Equity ownership will expect us to reduce R&D headcount costs through a reduction. However, we don’t have any concrete details to make a structured decision on this, and our approach would vary significantly depending on the size of the reduction. There are many things the authors of this strategy likely feel about their state of reality. First, they are probably upset about the fact that their new private equity ownership is likely to eliminate colleagues. Second, they are likely upset that there is no clear plan around what they need to do, so they are stuck preparing for a wide range of potential outcomes. However they feel, they don’t say any of that, they stick to precise, factual statements. For a second example, we can look to the Uber service migration strategy: Within infrastructure engineering, there is a team of four engineers responsible for service provisioning today. While our organization is growing at a similar rate as product engineering, none of that additional headcount is being allocated directly to the team working on service provisioning. We do not anticipate this changing. The team didn’t agree that their headcount should not be growing, but it was the reality they were operating in. They acknowledged their reality as a factual statement, without any additional commentary about that statement. In both of these examples, they found a professional, non-judgmental way to acknowledge the circumstances they were solving. The authors would have preferred that the leaders behind those decisions take explicit accountability for them, but it would have undermined the strategy work had they attempted to do it within their strategy writeup. Excluding critical parts of your diagnosis makes your strategies particularly hard to evaluate, copy or recreate. Find a way to say things politely to make the strategy effective. As always, strategies are much more about realities than ideals. Reframe blockers as part of diagnosis When I work on strategy with early-career leaders, an idea that comes up a lot is that an identified problem means that strategy is not possible. For example, they might argue that doing strategy work is impossible at their current company because the executive team changes their mind too often. That core insight is almost certainly true, but it’s much more powerful to reframe that as a diagnosis: if we don’t find a way to show concrete progress quickly, and use that to excite the executive team, our strategy is likely to fail. This transforms the thing preventing your strategy into a condition your strategy needs to address. Whenever you run into a reason why your strategy seems unlikely to work, or why strategy overall seems difficult, you’ve found an important piece of your diagnosis to include. There are never reasons why strategy simply cannot succeed, only diagnoses you’ve failed to recognize. For example, we knew in our work on Uber’s service provisioning strategy that we weren’t getting more headcount for the team, the product engineering team was going to continue growing rapidly, and that engineering leadership was unwilling to constrain how product engineering worked. Rather than preventing us from implementing a strategy, those components clarified what sort of approach could actually succeed. The role of self-awareness Every problem of today is partially rooted in the decisions of yesterday. If you’ve been with your organization for any duration at all, this means that you are directly or indirectly responsible for a portion of the problems that your diagnosis ought to recognize. This means that recognizing the impact of your prior actions in your diagnosis is a powerful demonstration of self-awareness. It also suggests that your next strategy’s success is rooted in your self-awareness about your prior choices. Don’t be afraid to recognize the failures in your past work. While changing your mind without new data is a sign of chaotic leadership, changing your mind with new data is a sign of thoughtful leadership. Summary Because diagnosis is the foundation of effective strategy, I’ve always found it the most intimidating phase of strategy work. While I think that’s a somewhat unavoidable reality, my hope is that this chapter has somewhat prepared you for that challenge. The four most important things to remember are simply: form your diagnosis before deciding how to solve it, try especially hard to capture perspectives you initially disagree with, supplement intuition with data where you can, and accept that sometimes you’re missing the data you need to fully understand. The last piece in particular, is why many good strategies never get shared, and the topic we’ll address in the next chapter on strategy refinement.

10 hours ago 3 votes
My friend, JT

I’ve had a cat for almost a third of my life.

2 hours ago 2 votes
[Course Launch] Hands-on Introduction to X86 Assembly

A Live, Interactive Course for Systems Engineers

4 hours ago 2 votes
It’s cool to care

I’m sitting in a small coffee shop in Brooklyn. I have a warm drink, and it’s just started to snow outside. I’m visiting New York to see Operation Mincemeat on Broadway – I was at the dress rehearsal yesterday, and I’ll be at the opening preview tonight. I’ve seen this show more times than I care to count, and I hope US theater-goers love it as much as Brits. The people who make the show will tell you that it’s about a bunch of misfits who thought they could do something ridiculous, who had the audacity to believe in something unlikely. That’s certainly one way to see it. The musical tells the true story of a group of British spies who tried to fool Hitler with a dead body, fake papers, and an outrageous plan that could easily have failed. Decades later, the show’s creators would mirror that same spirit of unlikely ambition. Four friends, armed with their creativity, determination, and a wardrobe full of hats, created a new musical in a small London theatre. And after a series of transfers, they’re about to open the show under the bright lights of Broadway. But when I watch the show, I see a story about friendship. It’s about how we need our friends to help us, to inspire us, to push us to be the best versions of ourselves. I see the swaggering leader who needs a team to help him truly achieve. The nervous scientist who stands up for himself with the support of his friends. The enthusiastic secretary who learns wisdom and resilience from her elder. And so, I suppose, it’s fitting that I’m not in New York on my own. I’m here with friends – dozens of wonderful people who I met through this ridiculous show. At first, I was just an audience member. I sat in my seat, I watched the show, and I laughed and cried with equal measure. After the show, I waited at stage door to thank the cast. Then I came to see the show a second time. And a third. And a fourth. After a few trips, I started to see familiar faces waiting with me at stage door. So before the cast came out, we started chatting. Those conversations became a Twitter community, then a Discord, then a WhatsApp. We swapped fan art, merch, and stories of our favourite moments. We went to other shows together, and we hung out outside the theatre. I spent New Year’s Eve with a few of these friends, sitting on somebody’s floor and laughing about a bowl of limes like it was the funniest thing in the world. And now we’re together in New York. Meeting this kind, funny, and creative group of people might seem as unlikely as the premise of Mincemeat itself. But I believed it was possible, and here we are. I feel so lucky to have met these people, to take this ridiculous trip, to share these precious days with them. I know what a privilege this is – the time, the money, the ability to say let’s do this and make it happen. How many people can gather a dozen friends for even a single evening, let alone a trip halfway round the world? You might think it’s silly to travel this far for a theatre show, especially one we’ve seen plenty of times in London. Some people would never see the same show twice, and most of us are comfortably into double or triple-figures. Whenever somebody asks why, I don’t have a good answer. Because it’s fun? Because it’s moving? Because I enjoy it? I feel the need to justify it, as if there’s some logical reason that will make all of this okay. But maybe I don’t have to. Maybe joy doesn’t need justification. A theatre show doesn’t happen without people who care. Neither does a friendship. So much of our culture tells us that it’s not cool to care. It’s better to be detached, dismissive, disinterested. Enthusiasm is cringe. Sincerity is weakness. I’ve certainly felt that pressure – the urge to play it cool, to pretend I’m above it all. To act as if I only enjoy something a “normal” amount. Well, fuck that. I don’t know where the drive to be detached comes from. Maybe it’s to protect ourselves, a way to guard against disappointment. Maybe it’s to seem sophisticated, as if having passions makes us childish or less mature. Or perhaps it’s about control – if we stay detached, we never have to depend on others, we never have to trust in something bigger than ourselves. Being detached means you can’t get hurt – but you’ll also miss out on so much joy. I’m a big fan of being a big fan of things. So many of the best things in my life have come from caring, from letting myself be involved, from finding people who are a big fan of the same things as me. If I pretended not to care, I wouldn’t have any of that. Caring – deeply, foolishly, vulnerably – is how I connect with people. My friends and I care about this show, we care about each other, and we care about our joy. That care and love for each other is what brought us together, and without it we wouldn’t be here in this city. I know this is a once-in-a-lifetime trip. So many stars had to align – for us to meet, for the show we love to be successful, for us to be able to travel together. But if we didn’t care, none of those stars would have aligned. I know so many other friends who would have loved to be here but can’t be, for all kinds of reasons. Their absence isn’t for lack of caring, and they want the show to do well whether or not they’re here. I know they care, and that’s the important thing. To butcher Tennyson: I think it’s better to care about something you cannot affect, than to care about nothing at all. In a world that’s full of cynicism and spite and hatred, I feel that now more than ever. I’d recommend you go to the show if you haven’t already, but that’s not really the point of this post. Maybe you’ve already seen Operation Mincemeat, and it wasn’t for you. Maybe you’re not a theatre kid. Maybe you aren’t into musicals, or history, or war stories. That’s okay. I don’t mind if you care about different things to me. (Imagine how boring the world would be if we all cared about the same things!) But I want you to care about something. I want you to find it, find people who care about it too, and hold on to them. Because right now, in this city, with these people, at this show? I’m so glad I did. And I hope you find that sort of happiness too. Some of the people who made this trip special. Photo by Chloe, and taken from her Twitter. Timing note: I wrote this on February 15th, but I delayed posting it because I didn’t want to highlight the fact I was away from home. [If the formatting of this post looks odd in your feed reader, visit the original article]

yesterday 3 votes
Stick with the customer

One of the biggest mistakes that new startup founders make is trying to get away from the customer-facing roles too early. Whether it's customer support or it's sales, it's an incredible advantage to have the founders doing that work directly, and for much longer than they find comfortable. The absolute worst thing you can do is hire a sales person or a customer service agent too early. You'll miss all the golden nuggets that customers throw at you for free when they're rejecting your pitch or complaining about the product. Seeing these reasons paraphrased or summarized destroy all the nutrients in their insights. You want that whole-grain feedback straight from the customers' mouth!  When we launched Basecamp in 2004, Jason was doing all the customer service himself. And he kept doing it like that for three years!! By the time we hired our first customer service agent, Jason was doing 150 emails/day. The business was doing millions of dollars in ARR. And Basecamp got infinitely, better both as a market proposition and as a product, because Jason could funnel all that feedback into decisions and positioning. For a long time after that, we did "Everyone on Support". Frequently rotating programmers, designers, and founders through a day of answering emails directly to customers. The dividends of doing this were almost as high as having Jason run it all in the early years. We fixed an incredible number of minor niggles and annoying bugs because programmers found it easier to solve the problem than to apologize for why it was there. It's not easy doing this! Customers often offer their valuable insights wrapped in rude language, unreasonable demands, and bad suggestions. That's why many founders quit the business of dealing with them at the first opportunity. That's why few companies ever do "Everyone On Support". That's why there's such eagerness to reduce support to an AI-only interaction. But quitting dealing with customers early, not just in support but also in sales, is an incredible handicap for any startup. You don't have to do everything that every customer demands of you, but you should certainly listen to them. And you can't listen well if the sound is being muffled by early layers of indirection.

yesterday 4 votes