Ethereum today is incredibly congested—it’s even more congested now than it was during the height of the ICO bubble. This is impressive, but also worrying! Ethereum 2.0 is still a ways away, but the tiny island of Ethereum 1.0 is already populated to the point of saturation.

Artist’s rendition of Ethereum 1.0 (source)

You’ve probably heard that Ethereum 2.0 is going to be sharded. Beyond base scalability improvements, sharding is how Ethereum 2.0 is going to scale to meet demand. But many people have asked—will sharding really work for DeFi? After all, sharding breaks composability, and isn’t composability the main thing about DeFi? (Money Legos™ and so on.)

Let’s draw out this line of thinking. Ethereum 2.0 is going to create a bunch of shards, which will work like loosely connected blockchains. But all the DeFi activity will end up living on a single shard, since it all wants to coagulate together. So we’ll end up in the exact same place we started: one huge DeFi shard, massive congestion.

The same crowded island, but a little bigger now. (source)

In a sense, this vision is almost certainly correct. But it’s wrong to be alarmed about this: in fact, this is perfectly fine and to be expected! Let me paint a thought experiment for you.

Cities, Suburbs, and Farmland

Imagine the day that Ethereum 2.0 launches with full smart contracts. On day one, it’s empty, like a fresh and untouched landscape. Eager Ethereum 1.0 settlers disperse across the shards.

Ethereum 2.0 on day one. (Source)

Will they spread uniformly across this landscape? Of course not! The first settlers will want to band together and form cities. In cities, individuals live and work together because they benefit from coordination and proximity. In exchange for the increased productivity of living in a city, those settlers are willing to pay higher rents and endure more congestion (in Ethereum, gas prices). But it’s worth it! These early cities are the centers of commerce.
It might be too expensive for most people, but that’s okay. Those who benefit the most from being near the center of commerce are incentivized to move there. This first city in Ethereum 2.0 will likely be the DeFi shard.

The city-like DeFi shard. Ah, the hustle and bustle of composability! (Source)

That DeFi shard will be the place where the major DeFi protocols settle—those that benefit from high velocity and from being connected to large liquidity pools for liquidations, flash loans, or whatever. Maybe there will be one major financial shard, like London, or two city shards with their own specializations, like New York City and Chicago. I expect that if there is a second city shard, it will be for centralized exchange settlement, separated from DeFi and all of its chaos.

City shards will be expensive and high-throughput, with primarily high-value transactions (otherwise the gas costs would be prohibitive). But isn’t that awful? Wasn’t that what we were trying to avoid? Now normal people won’t be able to use the DeFi shard at all!

Ah, but hold on. The other shards will not be empty. Most people will live on the outskirts of the cities. You don’t need to live in Manhattan to occasionally trek up there when you want to buy something exotic. Most of the time, you’ll be just fine living on another shard and making it into the metropolis when you really need to—the DeFi shard is only a few minutes’ cross-shard transaction away.

So where will most people live? I expect there will be two other kinds of shards: suburbs and farmlands. Suburbs are places where lots of people will live at relatively low cost, with access to decent services—not a ton, but enough to get by for most of their needs. If you want to do something fancy, like get a flash loan to refinance a multi-million dollar Maker vault into a recursively leveraged Compound position, then hey, you might have to take the train to the DeFi shard to do that.
But if you want to do something simple at the local corner store, like swap some ETH for DAI or buy some WBTC, that’ll be easy enough. Almost every ERC-20 will be cross-shard tokenized and available in the suburbs, and there will be local market makers for the most popular tokens and simple use cases. And like real suburbs, most suburban shards will look pretty much the same. Suburbs will see medium-throughput, medium-value transactions. It will be economical for most people to just park their assets here and live out their blockchain lives in middle-class tranquility.

The blockchain ‘burbs. For us normies. (Source)

Finally, there are the farmland shards. These are the rural areas that are empty of people. If you are a blockchain game that is mostly doing its own thing and doesn’t immediately need to interoperate with other assets, you can settle all your game actions directly onto a farmland shard. Or if you’re a weather app just dumping a bunch of data on-chain, you’d rather do it in an unpopulated area, because why not? It’s not like that shard is being used for anything important.

Ah, perfect for dumping my homomorphically encrypted supply chain data! (Source)

If there are pollutive activities that are uneconomical in cities or suburbs, take them to the boonies. There are no DeFi services or tokens to displace anyway. Out here, no one is all that bothered. Farmland shards allow for high-throughput, low-value transactions to your heart’s content.

Blockchain Urban Planning

This vision of DeFi in Ethereum 2.0, if true, tells us two things. First, yes, there will be congested shards on Ethereum 2.0! And the most congested shards will host the highest-value parts of DeFi that benefit from composability. Nevertheless, DeFi will also expand cross-shard so it can provide ancillary services in suburb shards, akin to the local branch of a national bank. But sharding doesn’t mean that activity is uniformly spread across shards.
That’s not only impossible—it’s economically stupid. Let high-value enterprises move into the cities, let boring families move to the suburbs, and let farmlands do their thing far away from the valuable real estate.

Heterogeneity = economic efficiency. (Source)

(This also gives you a sense of why programmatically load-balancing contracts across shards is unwise! We should assume that protocols and contract deployers are making rational choices about where to live. Uprooting a business from a city center and transplanting it onto farmland to “load balance the city” would be a disastrous mistake.)

You can think of sharding as offering a similar vision of interoperability as Cosmos or Polkadot: many different blockchains, each specialized for certain economic equilibria, with a superhighway connecting them all. Except in Ethereum 2.0’s case, all those shards will speak the same language, share the same tooling, and benefit from the immense community that Ethereum has already garnered.

Ethereum 2.0 is a big and challenging vision, and it carries a lot of execution risk. But at this point, I don’t think it’s possible for Ethereum to lose its status as DeFi’s largest city. It’s already the Wall Street of crypto, and it doesn’t look like there are any serious contenders to challenge its dominance.

In the meantime, will we have to pin our scaling hopes on layer 2? I see layer 2 systems as shopping malls built on top of Ethereum. They’ll be an improvement, but it’s likely they won’t be enough. Impatient builders may instead construct makeshift suburbs and farmland around Ethereum 1.0, via bridges to other blockchains. In other words, if Ethereum 2.0 takes too long, outsource your minor cities to another blockchain, as Serum is doing, or wait for the Cosmos/Polkadot interoperability story to materialize. Will DeFi wait for Ethereum 2.0, or will the germ just spread wherever it can?

For now, one thing is clear: DeFi is going to be too big for Ethereum as it exists today.
Where it grows from here, only time will tell. This piece was originally published on Bankless.
Imagine a college friend reached out to you and said, “Hey, I have a business idea. I’m going to run a market making bot. I’ll always quote a price no matter who’s asking, and for my pricing algorithm I’ll use x * y = k. That’s pretty much it. Want to invest?” You’d run away.

Well, turns out your friend just described Uniswap. Uniswap is the world’s simplest on-chain market making operation. Seemingly from nowhere, it has exploded in volume in the last year, crowning itself the world’s largest “DEX” by volume. If you haven’t paid close attention to what’s happening in DeFi in the last year, you’re probably wondering: what is going on here?

Uniswap v2 volume. Credit: Uniswap.info

(If you’re already familiar with Uniswap and AMMs, skip ahead to the section titled “The Cambrian AMM Explosion.”)

For the uninitiated: Uniswap is an automated market maker (AMM). You can think of an AMM as a primitive robotic market maker that is always willing to quote prices between two assets according to a simple pricing algorithm. For Uniswap, it prices the two assets so that the number of units it holds of each asset, multiplied together, is always equal to a fixed constant.

That’s a bit of a mouthful: if Uniswap owns some units of token x and some units of token y, it prices any trade so that the final quantities of x and y it owns, multiplied together, are equal to a fixed constant, k. This is formalized as the constant product equation: x * y = k.

This might strike you as a weird and arbitrary way to price two assets. Why would maintaining some fixed multiple between your units of inventory ensure that you quote the right price?

Uniswap by example

Let’s say we fund a Uniswap pool with 50 apples (a) and 50 bananas (b), so anyone is free to pay apples for bananas or bananas for apples. Let’s assume the exchange rate between apples and bananas is exactly 1:1 on their primary market.
Because the Uniswap pool holds 50 of each fruit, the constant product rule gives us a * b = 2500—for any trade, Uniswap must maintain the invariant that our inventory of fruit, multiplied together, equals 2500.

So let’s say a customer comes to our Uniswap pool to buy an apple. How many bananas will she need to pay? If she buys an apple, our pool will be left with 49 apples, but 49 * b still has to equal 2500. Solving for b, we get 51.02 total bananas. Since we already have 50 bananas in inventory, we’ll need 1.02 extra bananas for that apple (we’ll allow fractional bananas in this universe), so the price we quote her is 1.02 bananas per apple.

Note that this is close to the natural price of 1:1! Because it’s a small order, there is only a little slippage. But what if the order is larger?

You can interpret the slope at each point as the marginal exchange rate.

If she wants to buy 10 apples, Uniswap would charge her 12.5 bananas, for a unit price of 1.25 bananas per apple. And if she wanted a huge order of 25 apples—half of all the apples in inventory—the unit price would be 2 bananas per apple! (You can intuit this because if one side of the pool halves, the other side needs to double.)

The important thing to realize is that Uniswap cannot deviate from this pricing curve. If someone wants to buy some apples and later someone else wants to buy some bananas, Uniswap will sweep back and forth through this pricing curve, wherever demand carries it.

Uniswap sweeping back and forth through its pricing curve after a series of trades.

Now here’s the kicker: if the true exchange rate between apples and bananas is 1:1, then after the first customer purchases 10 apples, our Uniswap pool will be left with 40 apples and 62.5 bananas. If an arbitrageur then steps in and buys 12.5 bananas, returning the pool to its original state, Uniswap would charge them a unit price of only 0.8 apples per banana. Uniswap would underprice the bananas!
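The walk-through above can be reproduced in a few lines of Python. This is a minimal sketch of the constant product rule only (fees ignored); the function name is mine, not Uniswap’s:

```python
# Sketch of constant-product (x * y = k) pricing, reproducing the
# apples/bananas numbers above. Fees are ignored; names are illustrative.

def bananas_owed(apples: float, bananas: float, apples_out: float) -> float:
    """Bananas a buyer must pay to take `apples_out` apples from the pool,
    keeping apples * bananas constant."""
    k = apples * bananas
    new_bananas = k / (apples - apples_out)
    return new_bananas - bananas

# Pool funded with 50 apples and 50 bananas (k = 2500):
print(bananas_owed(50, 50, 1))   # ~1.02 bananas for 1 apple
print(bananas_owed(50, 50, 10))  # 12.5 bananas (1.25 per apple)
print(bananas_owed(50, 50, 25))  # 50.0 bananas (2.0 per apple)
```

Note how the unit price worsens as the order size grows: that is the slippage built into the curve.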
It’s as though our algorithm now realizes it’s heavy on bananas, so it prices bananas cheap to attract apples and rebalance its inventory. Uniswap is constantly performing this dance—slightly moving off the real exchange rate, then sashaying back in line thanks to arbitrageurs.

Impermanent Loss in a Nutshell

This should give you a sense for how Uniswap pricing works. But this still raises the question—is Uniswap good at what it does? Does this thing actually generate profits? After all, any market maker can quote prices, but it’s another thing to make money.

The answer is: it depends! Specifically, it depends on a concept known as impermanent loss. Here’s how it works.

Uniswap charges a small fee for every trade (currently 0.3%). This is in addition to the nominal price. So if apples and bananas always and forever trade at 1:1, these fees will simply accumulate over time as the market maker sweeps back and forth across the exchange rate. Compared to the baseline of just holding those 50 apples and 50 bananas, the Uniswap pool will end up with more fruit at the end, thanks to all the fees.

But what if the real exchange rate between apples and bananas suddenly changes? Say a drone strike takes out a banana farm, and now there’s a massive banana shortage. Bananas are like gold now. The exchange rate soars to 5 apples : 1 banana. What happens on Uniswap?

The very next second, an arbitrageur swoops in to pick off the cheaply priced bananas in your Uniswap pool. They size their trade so that they purchase every banana that’s priced below the new exchange rate of 5:1. That means they’ll need to move the curve until the pool holds five times as many apples as bananas, satisfying (5b) * b = 2500. Running the math out, they’d purchase 27.64 bananas for a grand total of 61.80 apples. This comes out to an average price of 2.2 apples : 1 banana, way under market, netting the equivalent of 76.4 free apples.

And where does that profit come from? Of course, it comes at the expense of the pool!
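That arbitrage accounting can be checked directly. The sketch below reproduces the banana-shortage numbers and also states the standard divergence-loss formula for a 50/50 constant-product pool (a well-known general result, not specific to this post):

```python
import math

# The banana-shortage example: pool starts at 50/50 (k = 2500), the price
# jumps to 5 apples per banana, and an arbitrageur trades until a = 5b.
k, price = 2500.0, 5.0
b = math.sqrt(k / price)      # ~22.36 bananas left in the pool
a = k / b                     # ~111.80 apples left in the pool

pool_value = a + b * price    # pool value, denominated in apples
hold_value = 50 + 50 * price  # value of just holding the original 50/50
print(round(hold_value - pool_value, 1))  # ~76.4 apples of lost value

# In general, for a price ratio r (new price / old price), an LP in a 50/50
# constant-product pool ends up with 2*sqrt(r)/(1+r) of the hold value:
def divergence_loss(r: float) -> float:
    return 2 * math.sqrt(r) / (1 + r) - 1

print(f"{divergence_loss(5.0):.1%}")  # ~-25.5%, i.e. 76.4 of 300 apples
```

A 5x price move costs the LP about a quarter of the portfolio relative to holding, before fees.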
And indeed, if you do the accounting, you’ll see that the Uniswap pool is now down exactly 76.4 apples’ worth of value compared to someone who had just held the original 50 apples and 50 bananas. Uniswap sold off its bananas too cheaply, because it had no idea bananas had become so valuable in the real world.

This phenomenon is known as impermanent loss. Whenever the exchange rate moves, it manifests as arbitrageurs sniping cheap assets until the pool is correctly priced. (These losses are “impermanent” because if the true exchange rate later reverts back to 1:1, then it’s as if you never lost that money to begin with. It’s a dumb name, but oh well.)

Pools make money through fees, and they lose money via impermanent loss. It’s all a function of demand and price divergence—demand works for you, and price divergence works against you.

This is Uniswap in a nutshell. You can go a lot deeper, of course, but this is enough background for you to understand what’s happening in this space.

Since its launch in 2018, Uniswap has taken DeFi by storm. This is especially amazing given that the original version of Uniswap was only about 300 lines of code! (AMMs themselves have a long lineage, but constant product market makers are a relatively recent invention.) Uniswap is completely permissionless and can be funded by anyone. It doesn’t even need an oracle. In retrospect, it’s incredibly elegant, one of the simplest possible products you could have invented, and yet it arose seemingly from nowhere to dominate DeFi.

The Cambrian AMM Explosion

Since Uniswap’s rise, there has been an explosion of innovation in AMMs. A legion of Uniswap descendants have emerged, each with its own specialized features.

Uniswap, Balancer, and Curve trading volume. Source: Dune Analytics

Though they all inherit the core design of Uniswap, they each come with their own specialized pricing function.
Take Curve, which uses a mixture of constant product and constant sum, or Balancer, whose multi-asset pricing function is defined by a multi-dimensional surface. There are even shifted curves that can run out of inventory, like the ones Foundation uses to sell limited-edition goods.

The Stableswap curve (blue), used in Curve. Source: Curve whitepaper

Different curves are better suited to certain assets, because they embed different assumptions about the price relationship between the assets being quoted. You can see in the chart above that the Stableswap curve (blue) approximates a straight line most of the time, meaning that in most of its trading range, the two stablecoins will be priced very close to each other. Constant product is a decent starting place if you don’t know anything about the two assets, but if we know the two assets are stablecoins that will probably be worth about the same, then the Stableswap curve will produce more competitive pricing.

Of course, there are infinitely many specific curves an AMM could adopt for pricing. We can abstract over all of these different pricing functions and call the whole category CFMMs: constant function market makers.

Seeing the growth in CFMM volume, it’s tempting to assume that they are going to take over the world—that in the future, all on-chain liquidity will be provided by CFMMs. But not so fast! CFMMs are dominating today. But in order to get a clear sense of how DeFi evolves from here, we need to understand when CFMMs thrive and when they do poorly.

The Correlation Spectrum

Let’s stick with Uniswap, since it’s the simplest CFMM to analyze. Let’s say you want to be a Uniswap LP (liquidity provider) in the ETH/DAI pool.
By funding this pool, there are two things you must simultaneously believe for being an LP to be better than just holding onto your original funds:

1. The ratio in value between ETH and DAI will not change too much (if it does, that will manifest as impermanent loss)
2. Lots of fees will be paid in this pool

To the extent that the pool exhibits impermanent loss, the fees need to more than make up for it. Note that for a pair that includes a stablecoin, to the extent that you’re bullish on ETH appreciating, you’re also assuming that there will be a lot of impermanent loss!

The general principle is this: the Uniswap thesis works best when the two assets are mean-reverting. Think of a pool like USDC/DAI or WBTC/TBTC—these are assets that should exhibit minimal impermanent loss and will purely accrue fees over time. Note that impermanent loss is not merely a question of volatility (in fact, highly volatile mean-reverting pairs are great, because they’ll produce lots of trading fees).

We can accordingly draw a hierarchy of the most profitable Uniswap pools, all other things being equal. Mean-reverting pairs are obvious. Correlated pairs often move together, so Uniswap won’t exhibit as much impermanent loss there. Uncorrelated pairs like ETH/DAI are rough, but sometimes the fees can make up for it. And then there are the inversely correlated pairs: these are absolutely awful for Uniswap.

Imagine someone on a prediction market going long Trump, long Biden, and putting both longs in a Uniswap pool. By definition, eventually one of these two assets will be worth $1 and the other will be worth $0. At the end of the pool, an LP will have nothing but impermanent loss! (Prediction markets always stop trading before the markets resolve, but outcomes are often decided well before the market actually resolves.)

So Uniswap works really well for certain pairs and terribly for others. But it’s hard not to notice that almost all of the top Uniswap pools so far have been profitable!
In fact, even the ETH/DAI pool has been profitable since inception.

Uniswap returns for the ETH/DAI pool (vs. holding 50/50 ETH/DAI). Source: ZumZoom Analytics

This demands explanation. Despite their flaws, CFMMs have been impressively profitable market makers. How is this possible? To answer this question, it pays to understand a bit about how market makers work.

Market making in a nutshell

Market makers are in the business of providing liquidity to a market. There are three primary ways market makers make money: designated market making arrangements (traditionally paid by asset issuers), fee rebates (traditionally paid by an exchange), and pocketing a spread when they’re making a market (what Uniswap does).

You see, all market making is a battle against two kinds of order flow: informed flow and uninformed flow. Say you’re quoting the BTC/USD market, and a fat BTC sell order arrives. You have to ask yourself: is this just someone looking for liquidity, or does this person know something I don’t? If this counterparty just realized that a PlusToken cache moved, and hence selling pressure is incoming, then you’re about to trade some perfectly good USD for some not-so-good BTC. On the other hand, if this is some rando selling because they need to pay their rent, then it doesn’t mean anything in particular and you should charge them a small spread.

As a market maker, you make money on the uninformed flow. Uninformed flow is random—on any given day, someone is buying, someone is selling, and at the end of the day it cancels out. If you charge each of them the spread, you’ll make money in the long run. (This phenomenon is why market makers will pay for order flow from Robinhood, which is mostly uninformed retail flow.)

So a market maker’s principal job is to differentiate between informed and uninformed flow. The more likely the flow is informed, the higher the spread you need to charge.
If the flow is definitely informed, you should pull your bids entirely, because you’ll pretty much always lose money when informed flow is willing to trade against you. (Another way to think about this: uninformed flow is willing to pay above true value for an asset—that’s your spread. Informed flow is only willing to pay below the true value of an asset, so when you trade against them, you’re actually the one mispricing the trade. These orders know something you don’t.)

The very same principle applies to Uniswap. Some people are trading on Uniswap because they randomly want to swap some ETH for DAI today. This is your uninformed retail flow, the random walk of trading activity that just produces fees. This is awesome.

Then you have the arbitrageurs: they are your informed flow. They are picking off mispriced pools. In a sense, they are performing work for Uniswap by bringing its prices back in line. But in another sense, they are transferring money from liquidity providers to themselves.

For any market maker to make money, they need to maximize the ratio of uninformed retail flow to arbitrageur flow. But Uniswap can’t tell the difference between the two! Uniswap has no idea if an order is dumb retail money or an arbitrageur. It just obediently quotes x * y = k, no matter the market conditions.

So if there’s a new player in town that offers better pricing than Uniswap, like Curve or Balancer, you should expect retail flow to migrate to whichever service offers better pricing. Given Uniswap’s pricing model and fixed fees (0.3% on each trade), it’s hard to see it competing on the most competitive pools—Curve is both more optimized for stablecoins and charges only 0.04% per trade. Over time, if Uniswap pools get outcompeted on slippage, they will be left with majority arbitrageur flow. Retail flow is fickle, but arbitrage opportunities continually arise as the market moves around.
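To see concretely why a flatter curve wins stablecoin retail flow, compare slippage under pure constant product against pure constant sum, the two invariants Stableswap blends between. This is an illustration with made-up pool sizes, not Curve’s actual pricing math:

```python
# Cost of buying 1,000 units from a 10,000/10,000 pool under the two pure
# invariants that Stableswap interpolates between. Illustrative only.

def constant_product_cost(x: float, y: float, dx: float) -> float:
    """Units of y paid for dx units of x under x * y = k."""
    return x * y / (x - dx) - y

def constant_sum_cost(x: float, y: float, dx: float) -> float:
    """Units of y paid for dx units of x under x + y = k: always 1:1,
    but the pool can be drained entirely."""
    assert dx <= x, "pool out of inventory"
    return dx

x, y, dx = 10_000.0, 10_000.0, 1_000.0
print(constant_product_cost(x, y, dx))  # ~1111.1 -> ~11% slippage
print(constant_sum_cost(x, y, dx))      # 1000.0  -> zero slippage
```

Constant sum quotes the better stablecoin price but can be fully drained; constant product never runs dry but charges slippage. Blending the two is what makes a stablecoin-specialized curve more competitive near the 1:1 price.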
This failure to compete on pricing is not just bad—its badness gets amplified. Uniswap has a network effect around liquidity on the way up, but it’s also reflexive on the way down. As Curve starts to eat the stablecoin-specific volume, the DAI/USDC pair on Uniswap will start to lose LPs, which will in turn make the pricing worse, which will attract even less volume, further disincentivizing LPs, and so on. So goes the way of network effects—it’s a rocket on the way up, but on the way down it incinerates on re-entry.

Of course, these arguments apply no less to Balancer and Curve. It will be difficult for each of them to maintain fees once they get undercut by a market maker with better pricing and lower fees. Inevitably, this will result in a race to the bottom on fees and massive margin compression. (Which is exactly what happens to normal market makers! It’s a super competitive business!)

But that still doesn’t explain: why are all of the CFMMs growing like crazy?

Why are CFMMs winning?

Let’s take stablecoins. CFMMs are clearly going to win this vertical. Imagine a big traditional market maker like Jump Trading were to start market making stablecoins on DeFi tomorrow. First they’d need to do a lot of upfront integration work; then, to continue operating, they’d need to continually pay their traders, maintain their trading software, and pay for office space. They’d have significant fixed costs and operating costs.

Curve, meanwhile, has no costs at all. Once the contracts are deployed, it operates all on its own. (Even the computing cost—the gas fees—is paid by end users!) And what is Jump doing when quoting USDC/USDT that’s so much more complicated than what Curve is doing? Stablecoin market making is largely inventory management. There’s not as much fancy ML or proprietary knowledge that goes into it, so if Curve does 80% as well as Jump there, that’s probably good enough.

But ETH/DAI is a much more complex market.
When Uniswap is quoting a price, it isn’t looking at exchange order books, modeling liquidity, or looking at historical volatility like Jump would—it’s just closing its eyes and shouting x * y = k! Compared to normal market makers, Uniswap has the sophistication of a refrigerator. But so long as normal market makers are not on DeFi, Uniswap will monopolize the market, because it has zero startup costs and zero operating expenses.

Here’s another way to think about it: Uniswap is the first scrappy merchant to set up shop in this new marketplace called DeFi. Even with all its flaws, Uniswap is being served up a virtual monopoly. When you have a monopoly, you get all of the retail flow. And if the ratio between retail flow and arbitrageur flow is what principally determines Uniswap’s profitability, no wonder Uniswap is raking it in!

But once the retail flow starts going elsewhere, this cycle is likely to end. LPs will start to suffer and withdraw liquidity.

But this is only half of the explanation. Remember: long before we had Uniswap, we had tons of DEXes! Uniswap has decimated order-book-based DEXes like IDEX or 0x. What explains why Uniswap beat out all the order book exchanges?

From Order Books to AMMs

I believe there are four reasons why Uniswap beat out order book exchanges.

First, Uniswap is extremely simple. This means low complexity, low surface area for hacks, and low integration costs. Not to mention low gas costs! This really matters when you’re implementing all your trades on top of the equivalent of a decentralized graphing calculator.

This is not a small point. Once next-generation high-throughput blockchains arrive, I suspect the order book model will eventually dominate, as it does in the normal financial world. But will it be dominant on Ethereum 1.0? The extraordinary constraints of Ethereum 1.0 select for simplicity. When you can’t do complex things, you have to do the best simple thing.
Uniswap is a pretty good simple thing.

Second, Uniswap has a very small regulatory surface. (This is the same reason Bram Cohen believes BitTorrent succeeded.) Uniswap is trivially decentralized and requires no off-chain inputs. Compared to order book DEXes that have to tiptoe around the perception of operating an exchange, Uniswap is free to innovate as a pure financial utility.

Third, it’s extremely easy to provide liquidity to Uniswap. The one-click “set it and forget it” LP experience is a lot easier than getting active market makers to provide liquidity on an order book exchange, especially before DeFi attracts serious volume. This is critical, because much of the liquidity on Uniswap is provided by a small set of beneficent whales. These whales are not as sensitive to returns, so the one-click experience on Uniswap makes it painless for them to participate. Crypto designers have a bad habit of ignoring mental transaction costs and assuming market participants are infinitely diligent. Uniswap made liquidity provision dead simple, and that has paid off.

The last reason why Uniswap has been so successful is the ease of creating incentivized pools. In an incentivized pool, the creator of a pool airdrops tokens onto liquidity providers, juicing their LP returns above the standard Uniswap returns. This phenomenon has also been termed “liquidity farming.” Some of Uniswap’s highest-volume pools have been incentivized via airdrops, including AMPL, sETH, and JRT. For Balancer and Curve, all of their pools are currently incentivized with their own native token.

Recall that one of the three ways traditional market makers make money is through designated market making agreements, paid by the asset issuer. In a sense, an incentivized pool is a designated market maker agreement, translated for DeFi: an asset issuer pays an AMM to provide liquidity for their pair, with the payment delivered via token airdrop.
But there’s an additional dimension to incentivized pools. They have allowed CFMMs to serve as more than mere market makers: they now double as marketing and distribution tools for token projects. Via incentivized pools, CFMMs create a sybil-resistant way to distribute tokens to speculators who want to accumulate the token, while simultaneously bootstrapping a liquid initial market. It also gives purchasers something to do with the token—don’t just turn around and sell it; deposit it and earn some yield! You could call this the poor man’s staking. It’s a powerful marketing flywheel for an early token project, and I expect this to become integrated into the token go-to-market playbook.

These factors go a long way toward explaining why Uniswap has been so successful. (I haven’t touched on “Initial DeFi Offerings,” but that’s a topic for another day.)

That said, I don’t believe Uniswap’s success will last forever. If the constraints of Ethereum 1.0 created the conditions for CFMMs to dominate, then Ethereum 2.0 and layer 2 systems will enable more complex markets to flourish. Furthermore, DeFi’s star has been rising, and as mainstream users and volumes arrive, they will attract serious market makers. Over time, I expect this to cause Uniswap’s market share to contract.

Five years from now, what role will CFMMs play in DeFi? In 2025, I don’t expect CFMMs as they look today to be the dominant way people trade anymore. In the history of technology, transitions like this are common. In the early days of the Internet, web portals like Yahoo were the first affordance to take off on the Web. The constrained environment of the early Web was perfectly suited to being organized by hand-crafted directories. These portals grew like crazy as mainstream users started coming online! But we now know that portals were a temporary stepping stone on the path to organizing the Internet’s information.

The original Yahoo homepage and the original Google homepage

What are CFMMs a stepping stone to?
Will something replace them, or will CFMMs evolve alongside DeFi? In my next post, entitled Unbundling Uniswap, I’ll try to answer this question.

Massive thanks to Hasu, Ivan Bogatyy, Ashwin Ramachandran, Kevin Hu, Tom Schmidt, and Mia Deng for their comments and feedback on this piece.

Disclosure: Dragonfly Capital does not hold a position in any of the assets listed in this article aside from ETH.
Flash loans have been the center of attention lately. Recently two hackers used flash loans to attack the margin trading protocol bZx, first in a $350K attack and later in a $600K copycat attack.

These attacks were, in a word, magnificent. In each attack, a penniless attacker instantaneously borrowed hundreds of thousands of dollars of ETH, threaded it through a chain of vulnerable on-chain protocols, extracted hundreds of thousands of dollars in stolen assets, and then paid back their massive ETH loans. All of this happened in an instant—that is, in a single Ethereum transaction.

Cover art by Carmine Infantino

We don’t know who these attackers were or where they came from. Both started with basically nothing and walked away with hundreds of thousands of dollars in value. Neither left any traces to identify themselves.

In the wake of these attacks, I’ve been thinking a lot about flash loans and their implications for the security of DeFi. I think this is worth thinking through in public. In short: I believe flash loans are a big security threat. But flash loans are not going away, and we need to think carefully about the impact they will have on DeFi security going forward.

What is a flash loan?

The term “flash loan” was coined by Max Wolff, the creator of Marble Protocol, in 2018. Marble marketed itself as a “smart contract bank,” and its product was a simple yet brilliant DeFi innovation: zero-risk loans via a smart contract.

How can a loan have zero risk? Traditional lenders take on two forms of risk. The first is default risk: if the borrower runs off with the money, that obviously sucks. The second is illiquidity risk: if a lender lends out too many of their assets at the wrong times, or doesn’t receive timely repayments, the lender may be unexpectedly illiquid and unable to meet their own obligations.

Flash loans mitigate both risks.
A flash loan basically works like this: I will lend you as much money as you want for this single transaction. But, by the end of this transaction, you must pay me at least as much as I lent you. If you are unable to do that, I will automatically roll back your transaction! (Yep, smart contracts can do that.) Simply put, your flash loan is atomic: if you fail to pay back the loan, the whole thing gets reverted as though the loan never happened. Something like this could only exist on blockchains. You could not do flash loans on, say, BitMEX. This is because smart contract platforms process transactions one at a time, so everything that happens in a transaction is executed serially as a batch operation. You can think of this as your transaction “freezing time” while it’s executing. A centralized exchange, on the other hand, can have race conditions such that a leg of your order fails to fill. On the blockchain, you’re guaranteed that all of your code runs one line after the next. So let’s think about the economics here for a second. Traditional lenders are compensated for two things: the risk they’re taking on (default risk and illiquidity risk), and for the opportunity cost of the capital they’re lending out (e.g., if I can get 2% interest elsewhere on that capital, the borrower must pay me more than the risk-free 2%). Flash loans are different. Flash loans literally have no risk and no opportunity cost! This is because the borrower “froze time” for the duration of their flash loan, so in anyone else’s eyes, the system’s capital was never at risk and never encumbered, therefore it could not have earned interest elsewhere (i.e., it did not have an opportunity cost). This means, in a sense, there’s no cost to being a flash lender. This is deeply counterintuitive. So how much should a flash loan cost at equilibrium? Basically, flash loans should be free. 
Or more properly, a small enough fee to amortize the cost of including the extra 3 lines of code to make an asset flash-lendable:

```solidity
interface Lender {
    function goWild() external;
}

contract FlashERC20 is ERC20 {
    using SafeMath for uint256;

    function flash(uint256 amount) external {
        // Credit the borrower, hand over control, then claw the funds back.
        // If the borrower can't cover the sub(), SafeMath reverts the
        // whole transaction: the loan never happened.
        balances[msg.sender] = balances[msg.sender].add(amount);
        Lender(msg.sender).goWild();
        balances[msg.sender] = balances[msg.sender].sub(amount);
    }
}
```

h/t Remco Bloemen

Flash loans cannot charge interest in the traditional sense, because the loan is active for zero time (any APR * 0 = 0). And of course, if flash lenders charged higher rates, they’d quickly be outcompeted by other flash lending pools that charged lower rates. Flash lending makes capital a true commodity. This race to the bottom inevitably results in zero fees or a tiny nominal fee. dYdX currently charges 0 fees for flash lending. AAVE, on the other hand, charges 0.09% on the principal for flash loans. I suspect this is not sustainable, and indeed, some in their community have called for slashing fees to 0. (Note that neither of the attacks we saw used AAVE as their flash lending pool.) What are flash loans useful for? Flash loans were originally marketed on the premise that they’d primarily be used for arbitrage. Marble’s breakout announcement claimed: “With flash lending, a trader can borrow from the Marble bank, buy a token on one DEX, sell the token on another DEX for a higher price, repay the bank, and pocket the arbitrage profit all in a single atomic transaction.” And it’s true — by volume, most of the flash loans we’ve seen so far have been used for this kind of arbitrage. Flash loan usage on AAVE. Credit: AAVE But volumes have been tiny. AAVE has originated barely over $10K of borrows since inception. This is minuscule compared to the arbitrage and liquidations market on DeFi. This is because most arbitrage is performed by competitive arbitrageurs running sophisticated bots. 
They engage in on-chain priority gas auctions and use gas tokens to optimize transaction fees. It’s a very competitive market — these guys are perfectly happy to keep some tokens on their balance sheet to optimize their earnings. On the other hand, borrowing on AAVE costs about 80K gas and charges 0.09% of the principal — a steep price to pay for an arbitrageur competing over tiny margins. In fact, in most AAVE arbitrages, the borrower ended up paying more in fees to the lending pool than they took home. In the long run, arbitrageurs are unlikely to use flash loans except in special circumstances. But flash loans have other more compelling use cases in DeFi. One example is refinancing loans. For example, say I have a Maker vault (CDP) with $100 of ETH locked in it, and I drew a loan of 40 DAI from it—so after subtracting my debt, I’ve got a $60 net position. Now say I want to refinance into Compound for a better interest rate. Normally I’d need to go out and repurchase that 40 DAI to close out my CDP, which requires some up-front capital. Instead, I can flash borrow 40 DAI, close out the $100 CDP, deposit $60 of my unlocked ETH into Compound, convert the other $40 of ETH back into DAI through Uniswap, and use that to repay the flash loan. Boom, atomic 0-capital refinancing. That’s pretty magical! It’s a great example of money legos™ at work. 1x.ag actually built a margin trading aggregator that automates this kind of thing using flash loans. But as cool as flash loans can be, the bZx attackers showed us that they aren’t just fun and games. Flash attacks have big security implications I’ve increasingly come to believe that what flash loans really unlock are flash attacks — capital-intensive attacks funded by flash loans. We saw the first glimpses of this in the recent bZx hacks, and I suspect that’s only the tip of the spear. There are two main reasons why flash loans are especially attractive to attackers. 
First, many attacks require lots of up-front capital (such as oracle manipulation attacks). If you’re earning a positive ROI on $10M of ETH, it’s probably not arbitrage — you’re likely up to some nonsense. Second, flash loans minimize taint for attackers. If I have an idea of how to manipulate an oracle with $10M of Ether, even if I own that much Ether, I might not want to risk it with my own capital. My ETH will get tainted, exchanges might reject my deposits, and it will be hard to launder. It’s risky! But if I take out a flash loan for $10M, then who cares? It’s all upside. It’s not like the collateral pool of dYdX will be considered tainted because that’s where my loan came from — the taint on dYdX just sort of evaporates. You might not like that exchange blacklisting is part of the blockchain security model today. It’s quite squishy and centralized. But it’s an important reality that informs the calculus behind these attacks. In the Bitcoin white paper, Satoshi famously claimed that Bitcoin is secure from attack because: “[The attacker] ought to find it more profitable to play by the rules […] than to undermine the system and validity of his own wealth.” With flash loans, attackers no longer need to have any skin in the game. Flash loans materially change the risks for an attacker. And remember, flash loans can stack! Subject to the gas limit, you could literally aggregate every flash loanable pool in a single transaction (upwards of $50M) and bring all that capital thundering down onto a single vulnerable contract. It’s a $50M battering ram that now anyone can slam into any on-chain piñata, so long as money comes out. This is scary. Now, of course, you shouldn’t be able to attack a protocol by just having a lot of money. If the DeFi stack is as secure as it’s claimed to be, all this shouldn’t be a problem — what kind of protocol isn’t secure against a rich whale? Not accounting for that is just negligence, you might say. 
And yet we acknowledge that Ethereum itself can be 51% attacked for less than $200K/hr**. That’s not that much money! If Ethereum’s own security model is basically built around capital constraints, why are we so quick to scoff at DeFi applications that can be successfully attacked for $10M? (** To be clear, I don’t believe these numbers—the figure conveniently ignores slippage and the dearth of supply—plus consensus-layer security and application-layer security are different beasts. But you get the point.) So how can you mitigate flash attacks? Say I’m a DeFi protocol and I want to avoid getting flash attacked. The natural question might be — can I detect whether the user interacting with me is using a flash loan? The simple answer is: no. The EVM doesn’t let you read storage from any other contract. Thus, if you want to know what’s going on in another contract, it’s on that contract to tell you. So if you wanted to know whether a flash loan contract was actively being used, you’d have to ask the contract directly. Today many of the lending protocols don’t respond to such queries (and there’s no way to enforce that a flash lender does in general). Plus, even if you tried to check, any such query could easily be misdirected using a proxy contract, or by chaining across flash lending pools. It’s simply not possible to tell in general whether a depositor is using a flash loan. Take that in for a second. If someone is knocking on your contract’s front door with $10M, it’s impossible to tell whether it’s their own money or not. So what real options do we have to protect against flash attacks? There are a few approaches we could consider. Convince flash lending pools to stop offering this service. Ha, just kidding. It’s crypto, you guys! In all seriousness, trying to get lending pools to stop offering flash lending is like trying to stop noise pollution—it’s a classic tragedy of the commons. 
It’s in every protocol’s individual interest to offer flash loans, and there are legitimate reasons why their users want this functionality. So we can safely dismiss this. Flash loans aren’t going away. Force critical transactions to span two blocks. Remember, flash loans allow you to borrow capital within the span of a single transaction. If you require that a capital-intensive transaction span two blocks, then the user must take out their loan for at least two blocks, defeating any flash attacks. (Note: for this to work, the user has to have their value locked up between the two blocks, preventing them from repaying the loan. If you don’t think through the design correctly, a user could just flash attack in both blocks.) Obviously this comes at a steep UX tradeoff: it means that transactions will no longer be synchronous. It sucks for users, and it’s a tough bullet to bite. Many developers bemoan asynchronous smart contract operations, such as interacting with layer 2 or cross-shard communication in Ethereum 2.0. Ironically, asynchrony actually makes these systems secure against flash attacks, since you cannot traverse a shard or a layer 2 in a single atomic transaction. This means no flash attacks across ETH 2.0 shards or against DEXes on layer 2. Request on-chain proofs that a user’s prior balance wasn’t altered by a flash loan. We could defeat flash attacks if there were some way to detect what a user’s real balance was — that is, what their balance was before they took out the loan. There’s no way to do that natively in the EVM, but you can sort of hack it. Here’s what you do: before a user interacts with your protocol, you demand a Merkle proof demonstrating that at the end of the previous block, they had enough balance to account for the capital they’re currently using. You’d need to keep track of this for each user in each block. (Credit to Ari Juels for outlining this approach to me.) This kind of works. 
Of course, it has some gnarly problems: verifying these proofs on-chain is incredibly expensive, and no user in their right mind wants to generate them and pay the gas fees for this whole thing. Plus, users might have changed their balance earlier in the same block for perfectly legitimate reasons. So while theoretically it has some merit, it’s not a practical solution. None of the three solutions I’ve proposed is particularly promising. I’m convinced that there is no good general defense against flash attacks. But there are two specific applications that do have specific mitigations against flash attacks: market-based price oracles and governance tokens. For market-based price oracles like Uniswap or OasisDEX, flash attacks make it so you cannot under any circumstances use the current mid-market price as an oracle. It’s child’s play for an attacker to move the mid-market price within a single transaction and manufacture a flash crash, corrupting the price oracle. The best solution here is to use a weighted average of the last X blocks, via either a TWAP or a VWAP. Uniswap v2 will offer this natively. There’s also Polaris, a generalized approach for offering moving averages for DeFi protocols. Ironically, Polaris was also built by Max Wolff, the original creator of Marble. (Polaris is now abandoned, but much credit to Max for seeing around that corner.) On-chain governance is its own can of worms. On-chain governance is usually determined by coin-weighted voting among holders of the governance token. But if those governance tokens are in a flash lending pool, then any attacker can scoop up a giant pile of coins and bash on any outcome they want. Of course, most governance protocols require those coins to be locked up for the voting period, which defeats flash attacks. But some forms of voting don’t require this, such as carbon votes or Maker’s executive contract. 
With flash attacks now on the table, these forms of voting should be considered completely broken. Ideally, it’d be great if governance tokens weren’t flash loanable at all. But this isn’t up to you as an issuer — it’s up to the market. Thus, all governance actions should require lockups to protect against flash attacks. Compound’s new COMP token goes a step further by time-weighting all protocol votes, weakening even regular loan attacks against their governance token. More broadly, all governance tokens must have timelocks. A timelock enforces that all governance decisions must wait a period of time before they go live (for Compound’s timelock, it’s 2 days). This allows the system to recover from any unanticipated governance attacks. Even though MKR isn’t yet flash borrowable in bulk, MakerDAO was recently called out for being vulnerable to this sort of attack. It recently implemented a 24-hour timelock, closing this attack vector. What does all of this mean for the long term? I believe the bZx attacks changed things. This will not be the last flash attack. The second bZx attack was the first copycat, and I suspect it will set off a wave of attacks in the coming months. Now thousands of clever teenagers from the remotest parts of the world are poking at all these DeFi legos, examining them under a microscope, trying to discover if there is some way they can pull off a flash attack. If they manage to exploit a vulnerability, they too could make a few hundred thousand dollars — a life-changing sum in most parts of the world. Some people claim that flash attacks don’t change anything because these attacks were always possible if the attacker was well-capitalized. That’s both correct and incredibly incorrect. Most whales don’t know how to hack smart contracts, and most brilliant attackers don’t have millions of dollars lying around. Now anyone can rent a $50M wrecking ball for pennies. This changes the way every building needs to be constructed from now on. 
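The timelock defense described above is simple enough to sketch. Here is a minimal toy model in Python (this is not Compound's actual Timelock contract, though the 2-day delay matches the figure cited above):

```python
class Timelock:
    """Toy governance timelock: queued actions can't execute until a delay passes."""
    DELAY = 2 * 24 * 3600  # 2 days, in seconds (Compound's cited delay)

    def __init__(self):
        self.queue = {}  # action -> earliest allowed execution timestamp

    def queue_action(self, action, now):
        self.queue[action] = now + self.DELAY

    def execute(self, action, now):
        eta = self.queue[action]
        if now < eta:
            # Even a vote won with (flash-)borrowed tokens can't take effect
            # immediately; defenders have DELAY seconds to respond.
            raise RuntimeError("timelock not yet elapsed")
        del self.queue[action]
        return f"executed {action}"

t = Timelock()
t.queue_action("drain-the-treasury", now=0)
try:
    t.execute("drain-the-treasury", now=3600)  # 1 hour later: blocked
except RuntimeError:
    pass
print(t.execute("drain-the-treasury", now=Timelock.DELAY))  # succeeds after 2 days
```

The point of the design is visible in the `execute` check: a malicious proposal cannot be both passed and enacted within a single transaction, so the community gets a window to exit or counter-vote.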
After the bZx hacks, being hit by a flash attack will be as embarrassing as getting hit by re-entrancy after the DAO hack: you will get no sympathy. You should have known better. Lastly, these episodes have gotten me thinking about an old concept in crypto: miner-extractable value. Miner-extractable value is the total value that miners can extract from a blockchain system. This includes block rewards and fees, but it also includes more mischievous forms of value extraction, such as reordering transactions or inserting rogue transactions into a block. At bottom, you should think of all of these flash attacks as single transactions in the mempool that make tons of money. For example, the second bZx attack resulted in $645K profit in ETH in a single transaction. If you’re a miner and you’re about to start mining a new block, imagine looking at the previous block’s transactions and saying to yourself… “wait, what? Why am I about to try to mine a new block for ~$500, when that last block contains $645K of profit in it??” Instead of extending the chain, it’d be in your interest to go back and try to rewrite history such that you were the flash attacker instead. Think about it: that transaction alone was worth more than 4 hours’ worth of honestly mined Ethereum blocks! This is isomorphic to having a special super-block that contains 1000x the normal block reward — just as you’d expect, the rational result of such a super-block should be a dogpile of miners competing to orphan the tip of the chain and steal that block for themselves. Artist’s visualization of a miner dogpile. Credit: AP Photo/Denis Poroy At equilibrium, all flash attacks should ultimately be extracted by miners. (Note that they should also end up stealing all on-chain arbitrage and liquidations.) This will, ironically, serve as a deterrent against flash attacks, since it will leave attackers unable to monetize their discoveries of these vulnerabilities. 
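For concreteness, here's the back-of-the-envelope math behind that "more than 4 hours" comparison. The ~$500 per block and the 13-second block time are rough assumptions about rewards plus fees at the time, not measured figures:

```python
# All figures are rough, illustrative assumptions.
attack_profit = 645_000   # USD: profit of the second bZx attack, in one tx
block_value = 500         # USD per block: assumed miner reward + fees
block_time_s = 13         # approximate Ethereum block time, in seconds

blocks_equivalent = attack_profit / block_value             # 1290 blocks
hours_equivalent = blocks_equivalent * block_time_s / 3600  # ~4.7 hours

print(f"{hours_equivalent:.1f} hours of honest mining")
```

Under these assumptions, a single flash-attack transaction is worth about 1,290 blocks of honest mining, which is why the incentive to reorganize the chain is real.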
Perhaps eventually miners will start soliciting attack code through private channels and pay the would-be attacker a finder’s fee. Technically, this could be done trustlessly using zero-knowledge proofs. (Weird to think about, right?) But that’s all pretty sci-fi for now. Miners obviously aren’t doing this today. Why aren’t they? Tons of reasons. It’s hard, it’s a lot of work, the EVM sucks to simulate, it’s risky, there would be bugs that would result in lost funds or orphaned blocks, it’d cause an uproar and the rogue mining pool might have a PR crisis and be branded an “enemy of Ethereum.” For now miners would lose more in business, R&D, and orphaned blocks than they’d gain by trying to do this. That’s true today. It won’t be true forever. This lends yet another motivation for Ethereum to hurry up and transition to Ethereum 2.0. DeFi on Ethereum, while always entertaining, is absolutely and irrevocably broken. DeFi is not stable on a PoW chain, because all high-value transactions are subject to miner reappropriation (also known as time bandit attacks). For these systems to work at scale, you need finality—the inability for miners to rewrite confirmed blocks. This will protect transactions in previous blocks from getting reappropriated. Plus if DeFi protocols exist on separate Ethereum 2.0 shards, they won’t be vulnerable to flash attacks. In my estimation, flash attacks give us a small but useful reminder that it’s early days. We’re still far from having sustainable architecture for building the financial system of the future. For now, flash loans will be the new normal. Maybe in the long run, all assets on Ethereum will be available for flash loans: all of the collateral held by exchanges, all the collateral in Uniswap, maybe all ERC-20s themselves. Who knows—it’s only a few lines of code.
By Haseeb Qureshi and Ashwin Ramachandran It was August 2017. The price of Ether was near an all-time high, the Ethereum blockchain was exploding with usage, and the chain was buckling under the ever-increasing demand. Researchers and developers were frantically searching for new scalability solutions. At blockchain conferences around the world, developers debated scaling proposals. The Ethereum community was desperate for a solution. In the middle of this frenzy, the first version of the Plasma paper was released, promising a layer-2 scaling solution that could handle “nearly all financial computation worldwide.” TechCrunch reporting on Plasma Fast forward to 2020. Ethereum is as slow as ever, and yet it has survived all the so-called Ethereum killers. Ethereum 2.0’s launch date keeps receding further into the future, and Plasma seems to have disappeared entirely, with many development groups shuttering operations. And yet, new solutions such as optimistic and ZK rollups are being hailed as the best scaling solutions. But the memory of Plasma seems to have vanished without a trace. So who killed Plasma? Let’s go back to what it was like in early 2017. Ethereum had just gone mainstream for the first time, and there was limitless optimism about what would soon be possible. It was claimed that all valuable assets would soon be tokenized. Meetups in San Francisco were standing room only, and crowds materialized whenever Ethereum was mentioned. But Ethereum wasn’t scaling. In the middle of this craze, Vitalik Buterin and Joseph Poon published a paper introducing a new layer-2 scalability solution called Plasma. Vitalik and Joseph Poon introduce Plasma at a meetup in San Francisco Plasma claimed to allow Ethereum to scale to Visa-level transaction volumes, and its bold claims triggered a wave of developer and community excitement. Soon after, the Ethereum research community rallied around Plasma as the salvation to Ethereum’s scaling woes. 
But what exactly was Plasma, and why didn’t it end up fulfilling its promises? How does Plasma work? The original Plasma paper described a mechanism for constructing a MapReduce “tree of blockchains”. Each node in the tree would represent a unique blockchain connected to its parent, and all of these blockchains would be arranged in a massive hierarchy. This initial specification, however, was vague and complex. Soon after its release, Vitalik simplified the spec in a new paper appropriately named MVP (Minimal Viable Plasma). The Plasma “Tree of Blockchains” MVP proposed a stripped-down version of Plasma: a simple UTXO-based sidechain that would be safe under data unavailability. But what is a sidechain? And what does it mean for data to be unavailable? Before we delve into Plasma, let’s walk through what these terms mean. A sidechain is simply a blockchain that is attached to another blockchain. Sidechains can be operated in many different ways, such as by a trusted third party, a federation, or a consensus algorithm. For example, Blockstream participates in a federated sidechain on the Bitcoin network called Liquid. Liquid allows for higher transaction throughput, which it achieves via a tradeoff in its trust model: users must trust the federation not to collude and steal funds. The chain operators in this context are the various members of the Liquid federation, such as Blockstream the company. Visualization of sidechain transfers (exemplified by Liquid). Credit: Georgios Konstantopoulos A sidechain is attached to a larger blockchain (like Bitcoin) via a two-way peg. Users can deposit funds on the sidechain by sending them to a particular address or smart contract on the main chain. This is referred to as a peg-in transaction. To withdraw funds, users can perform the same operation on the sidechain to retrieve their funds on the main chain. This is referred to as a peg-out transaction. But how does this relate to Plasma? 
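Before getting to that, the peg-in/peg-out mechanic above can be captured in a toy ledger. This is illustrative Python only; note that the operator's honesty is simply assumed here, which is exactly the trust problem discussed next:

```python
class TwoWayPeg:
    """Toy two-way peg between a main-chain ledger and a sidechain ledger."""
    def __init__(self):
        self.main = {}    # user -> balance on the main chain
        self.locked = 0   # funds held by the peg contract/federation
        self.side = {}    # user -> balance on the sidechain

    def peg_in(self, user, amount):
        # Deposit: lock funds on the main chain, mint them on the sidechain.
        assert self.main.get(user, 0) >= amount
        self.main[user] -= amount
        self.locked += amount
        self.side[user] = self.side.get(user, 0) + amount

    def peg_out(self, user, amount):
        # Withdrawal: burn sidechain funds, release the locked main-chain funds.
        assert self.side.get(user, 0) >= amount
        self.side[user] -= amount
        self.locked -= amount
        self.main[user] += amount

peg = TwoWayPeg()
peg.main["alice"] = 50
peg.peg_in("alice", 30)   # alice: 20 on main, 30 on side, 30 locked
peg.peg_out("alice", 10)  # alice: 30 on main, 20 on side, 20 locked
print(peg.main["alice"], peg.side["alice"], peg.locked)  # 30 20 20
```

In the toy model, `peg_out` honors every valid withdrawal automatically; on a real federated sidechain, that step depends on the operators cooperating.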
As we saw in the example above, moving funds out of a sidechain requires one critical component: trust. Users must trust the operators of the sidechain not to abscond with funds. But isn’t the main feature of blockchains trustlessness? What if users want to interact with a sidechain without trusting its operators? This is precisely the problem Plasma was created to solve. Plasma was designed to minimize the trust required of sidechain operators. That is, Plasma prevents funds from being stolen even if operators (or a consensus majority) misbehave. But even if operators can’t outright steal funds, sidechains have another problem. What if the sidechain operators publish a block header but refuse to publish the underlying transaction data? That would prevent anyone from verifying the correctness of the sidechain. This concept is known as data unavailability. Plasma attempted to keep users safe even if operators withheld transaction data — in the event that an operator refuses to release data, all users would still be able to retrieve their funds and exit the sidechain. Plasma made big promises about its security and scalability. It’s no surprise, then, that during the bull run of 2017, it was widely believed that Plasma would solve Ethereum’s scaling problems. But as the market sobered up in 2018 and the blockchain hype collapsed, a more realistic picture of Plasma began to crystallize. When it came to real-world deployments, Plasma posed more problems than it solved. The first problem was that each user had to monitor and verify all transactions on the Plasma MVP chain in order to detect malicious operator behavior and exit in time. Transaction verification is expensive, however, and this monitoring requirement added significant overhead to participating in the Plasma chain. Researchers also realized that it’s difficult for users to exit a Plasma chain. 
When a user attempts to withdraw funds from a Plasma MVP chain, they must submit an exit transaction and then wait a set period of time. This is known as the challenge period. At any time during the challenge period, any user can challenge another user’s exit by providing a proof that the exit is invalid (such as that they’re minting fake coins or stealing someone else’s coins). Thus, all exits can only be processed after the challenge period is over, which takes up to 1 week in some proposals. But it gets worse. Remember, even if an operator withholds data, we want users to be able to withdraw their funds from the Plasma chain. MVP handled this in the following way: if Plasma transaction data was withheld, each user needed to individually exit their own money based on the Plasma chain’s last valid state. (Note: to avoid a malicious operator frontrunning honest users, exits are prioritized in order of how long ago they last transacted.) Growth of Ethereum storage. Credit: Alexey Akhunov In the worst case, if all users needed to exit a Plasma chain, the entire valid state of the chain would have to be posted on the Ethereum mainnet within a single challenge period. Given that Plasma chains can grow arbitrarily large, and that Ethereum blocks are already near capacity, it would be almost impossible to dump an entire Plasma chain onto the Ethereum mainnet. Thus, any stampede for the exits would almost certainly congest Ethereum itself. This is known as the mass exit problem. As prices began collapsing in 2018, Ethereum followers started to realize that Plasma MVP wouldn’t be the silver bullet scaling solution they’d hoped for. There was simply no way to overcome its weaknesses. Plasma MVP was a dead end. All the while, Ethereum continued to struggle under its transaction load, and Ethereum 2.0 was still many years away. 
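A back-of-the-envelope estimate makes the mass exit problem concrete. Every number below is an illustrative assumption (the per-exit gas cost in particular is a guess), but the conclusion is robust to the details:

```python
# All figures are illustrative assumptions, not measured values.
gas_per_exit = 500_000        # assumed cost of one on-chain Plasma exit
block_gas_limit = 10_000_000  # roughly Ethereum's block gas limit at the time
block_time_s = 13             # approximate block time, in seconds
challenge_period_s = 7 * 24 * 3600  # a 1-week challenge period

blocks_in_period = challenge_period_s // block_time_s  # ~46,523 blocks
exits_per_block = block_gas_limit // gas_per_exit      # 20 exits per block
max_exits = blocks_in_period * exits_per_block         # ~930K exits, at best

# And that cap holds only if Ethereum processes *nothing else* for a week.
# A Plasma chain aiming at "all financial computation worldwide" would have
# far more users than that, all competing with normal mainnet traffic.
print(max_exits)
```

Even under these generous assumptions, fewer than a million exits could clear in the window, which is why a stampede for the exits would congest Ethereum itself.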
The next generation of Plasma

In mid-2018, as prices continued to crash, Ethereum’s research community continued their attempts to improve on Plasma, iterating on the Plasma MVP design. The new version they came up with was termed Plasma Cash. According to Vitalik, who was one of its main designers, Plasma Cash would allow for arbitrarily high transactions per second and solve the problems that plagued its predecessor. Some even claimed that this new design would achieve hundreds of thousands of transactions per second. First, let’s remember the issues with Plasma MVP:
- In the case of operator misbehavior, there was a mass exit problem
- Users had to wait an entire challenge period before withdrawing
- Users had to monitor all transactions on the Plasma chain
Plasma Cash held one primary advantage over MVP: by using a different data model, Plasma Cash could avoid the mass exit problem entirely. In Plasma Cash, all coins are represented as non-fungible tokens (NFTs), which makes it much easier to prove ownership of a set of coins. Simply put, users are responsible for proving ownership over their own coins and no one else’s. As a result, users only need to monitor their own coins and not the entire Plasma chain. Plasma Cash also presented a new interactive challenge system that allowed users to easily withdraw funds in the case of operator misbehavior. Utilizing a new Merkle tree construction, known as a Sparse Merkle Tree, users could easily authenticate a coin’s history and ownership using inclusion proofs. In the case of operator misbehavior, a user would only need to post on-chain proof that they currently owned the coin (consisting of the 2 most recent transactions and their corresponding inclusion proofs). However, Plasma Cash introduced a whole new set of problems. Primarily, malicious users or past owners of a coin could issue faulty withdrawal attempts. 
Because users were required to prove ownership of their own coins, it was up to these users to actually catch and challenge fraudulent withdrawals of their money. As a result, Plasma Cash, like Plasma MVP, required users to remain online at least once every two weeks to catch faulty withdrawals during their challenge periods. Additionally, to prove ownership of a coin, users would have to maintain that coin’s entire history and corresponding inclusion/exclusion proofs, leading to ever-increasing storage requirements. By late 2018, the price of Ether had hit rock bottom, and the utopian crypto optimism had evaporated. Plasma Cash, while an improvement over MVP, was not the Visa-scale solution Ethereum was promised, and its MapReduce “tree of blockchains” was now little more than a pipe dream. Most companies developing clients for Plasma Cash halted work, and their implementations were archived in a half-finished state. The Ethereum community was in limbo. While new Plasma constructions continued to emerge and marginally improved on their predecessors, the Ethereum community failed to rally behind any of them. It seemed that Plasma was dead.

Enter Rollups

Just as confidence in layer-2 hit bottom, a GitHub repo named roll_up was made public by a pseudonymous user known as Barry Whitehat. This repo described a new type of layer-2 scaling solution: a Plasma-like construction with “bundled” up transactions, where instead of relying on operator trust, the correctness of the bundle could be attested to using an on-chain proof — a SNARK. This SNARK makes it impossible for an operator to post malicious or invalid transactions, and guarantees that all sidechain blocks are valid. Soon after, Vitalik released an improved version of Barry’s proposal, which he termed zk-Rollup. zk-Rollup became one of the highest-viewed posts ever on Ethereum’s research forums. 
Vitalik’s proposal introduced a solution to the data availability issues that plagued Plasma: posting sidechain transaction data on the Ethereum blockchain itself. Publishing transaction data as function arguments meant it could be verified at the time of publication and then thrown away (so that it did not bloat Ethereum’s storage). zk-Rollup could avoid Plasma’s exit games and challenge periods entirely without trading off affordability or security. With zk-Rollup, one could use novel cryptography to solve all of Plasma’s layer-2 scaling dilemmas in one fell swoop. Vitalik’s zk-Rollup post But zk-Rollup came with its own set of tradeoffs. Namely, validity proofs are computationally expensive to generate (details here). These zk-SNARKs are produced for every block and can take upwards of 10 minutes to generate, while costing up to 350,000 gas per verification (post-Istanbul). For reference, that’s about 3.5% of an entire block (it was 8% pre-Istanbul). Additionally, it is currently not possible to deploy general smart contracts on zk-Rollup sidechains. Proposals are under development for specialized zero-knowledge VMs that would enable this, such as zkVM and ZEXE, but they still require lots of specialized knowledge to interact with. For the most part, zk-Rollups limit general programmability. zk-Rollup visualization. Credit: Georgios Konstantopoulos By mid-2019, these new developments had re-energized the Ethereum research community. zk-Rollup seemed to solve many of the problems that had plagued the layer-2 narrative. Companies such as Matter Labs (one of our portfolio companies) and Loopring began actively developing zk-Rollups, and both have testnet implementations live today. With optimizations, Matter Labs believes that it can achieve upwards of 2,000 TPS on its ZK Sync network. Additionally, Starkware (also a portfolio company) is building a variation on zk-Rollup they call StarkExchange. 
StarkExchange uses a STARK to prove the validity of sidechain transactions, but delegates the problem of data hosting off-chain (if the sidechain ever halts, exits are guaranteed through on-chain checkpointing). StarkWare is implementing a DEX on this design in partnership with DeversiFi and will be launching on mainnet in the near future.

A dose of optimism

But not everyone was pinning their hopes on zk-Rollups. One year after the release of the first zk-Rollup spec, John Adler and Mikerah introduced a design they called Merged Consensus. Merged Consensus enables off-chain consensus systems that are entirely verifiable on Ethereum, without any fancy zero-knowledge cryptography. After its release, Plasma Group published an extended version of the Merged Consensus design under the now well-known title: Optimistic Rollup.

While zk-Rollup relies on zk-SNARKs to verify and finalize every block, Optimistic Rollups take a different approach: what if you just assumed every block was valid? This works great in the happy path, when everyone plays nice, but we know operators can misbehave. So how does an Optimistic Rollup handle operator misbehavior? The "optimistic" answer is fraud proofs. A fraud proof is a computational proof that an operator performed an invalid action. If the operator posts an invalid state transition, anyone can submit a proof that the transition was invalid and revert those transactions (within a challenge period of roughly one week). Since these proofs are non-interactive, they can be sent by anyone: users don't need to monitor their own coins for security.

Unlike zk-Rollups, however, Optimistic Rollups require 3–5x more transaction data to be posted on-chain (see this post by StarkWare for details). This data primarily consists of witnesses such as signatures, which zk-Rollups don't need to post since they are verified in zero knowledge.
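The fraud-proof flow above can be captured in a toy model: batches are accepted immediately with no proof, anyone can revert a bad batch (and everything built on it) during the challenge window, and a batch is only final once the window closes unchallenged. All names and the day-based clock below are illustrative assumptions, not an actual protocol.

```python
# Toy model of an Optimistic Rollup: accept batches optimistically,
# allow fraud proofs during a challenge window, finalize afterwards.

CHALLENGE_PERIOD_DAYS = 7  # the roughly one-week window described above

class OptimisticRollup:
    def __init__(self):
        self.batches = []  # each: {"root": str, "day": int, "reverted": bool}

    def submit_batch(self, state_root: str, day: int) -> int:
        # Optimistically accepted: no validity proof required at posting time.
        self.batches.append({"root": state_root, "day": day, "reverted": False})
        return len(self.batches) - 1

    def submit_fraud_proof(self, index: int, day: int, proof_valid: bool):
        batch = self.batches[index]
        if day - batch["day"] > CHALLENGE_PERIOD_DAYS:
            raise ValueError("challenge window has closed")
        if proof_valid:
            # Revert the fraudulent batch and every batch built on top of it.
            for later in self.batches[index:]:
                later["reverted"] = True

    def is_final(self, index: int, day: int) -> bool:
        batch = self.batches[index]
        return not batch["reverted"] and day - batch["day"] > CHALLENGE_PERIOD_DAYS

rollup = OptimisticRollup()
good = rollup.submit_batch("root_A", day=0)
bad = rollup.submit_batch("root_B_invalid", day=1)

rollup.submit_fraud_proof(bad, day=3, proof_valid=True)  # within the window
assert rollup.batches[bad]["reverted"]
assert rollup.is_final(good, day=8)  # window elapsed, never challenged
```

The key property the model illustrates is that fraud proofs are permissionless: `submit_fraud_proof` can be called by anyone, which is why users need not watch their own coins the way Plasma required.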
In the best case, Optimistic Rollup transactions never need to be verified on-chain at all, except when a fraud proof is submitted. Posting and verifying witnesses on-chain is expensive, however, and developers have explored aggregate-signature schemes that allow inexpensive large-scale verification and reduce transaction data requirements. This optimization could increase the theoretical throughput of Optimistic Rollups from its current numbers of ~450 TPS to potentially ~2,000 TPS.

Optimistic Rollups offer a very different set of tradeoffs from zk-Rollups. They are less expensive (assuming fraud challenges are rare), but they trade this off by being less safe: it's always possible that transactions are incorrectly applied and later reverted. This safety window can be as long as an entire week, and users cannot be allowed to exit the chain during that window (otherwise, they could run off with someone else's funds). However, it's possible to ameliorate these withdrawal delays by introducing a secondary market: a user could sell their exit rights to a third-party liquidity provider in exchange for a small fee, with the liquidity provider getting paid for taking on the week-long illiquidity of the exit. This allows immediate exits from the rollup chain.

While zk-Rollups require programmers to understand complex constraint systems and advanced cryptography, Optimistic Rollups allow general smart contracts (e.g., written in Solidity) to be deployed and executed. This means that smart contract-based protocols such as Uniswap can be built on top of Optimistic Rollup sidechains.

The rollup family of solutions takes similar approaches to solving Plasma's data-availability issues and exit complexity, but its members have the potential to go far beyond Plasma's constructions. IDEX, for example, has built and deployed its own version of Optimistic Rollups and runs a DEX on this construction.
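Why does shrinking witness data move TPS so much? Rollup throughput is bounded by how many bytes of transaction data fit in an Ethereum block. A back-of-envelope calculation makes the lever visible; the figures below (block gas limit, calldata gas cost, bytes per transaction) are illustrative assumptions chosen to land in the same ballpark as the article's numbers, not measured protocol parameters.

```python
# Back-of-envelope rollup TPS: throughput = (calldata bytes per block)
# divided by (bytes per transaction), spread over the block time.
# All constants are illustrative assumptions.
GAS_PER_BLOCK = 10_000_000   # approximate Ethereum block gas limit (2020)
GAS_PER_CALLDATA_BYTE = 16   # post-Istanbul cost of a non-zero calldata byte
BLOCK_TIME_SECONDS = 13

def rollup_tps(bytes_per_tx: int) -> float:
    txs_per_block = GAS_PER_BLOCK / (GAS_PER_CALLDATA_BYTE * bytes_per_tx)
    return txs_per_block / BLOCK_TIME_SECONDS

# With full witnesses posted (~128 bytes per tx, signatures included),
# throughput lands in the few-hundred-TPS range:
tps_with_witnesses = rollup_tps(128)

# With signatures aggregated off-chain (~12 bytes of essential data per tx),
# throughput climbs into the thousands:
tps_aggregated = rollup_tps(12)
```

The exact numbers depend heavily on transaction encoding, but the structure of the calculation shows why witness compression, not execution speed, is the dominant factor in rollup throughput.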
Similarly, Fuel Labs has built a version of Optimistic Rollups that allows for UTXO-style payments and ERC-20 token swaps. Plasma Group (now Optimism) recently announced its pivot to focus on Optimistic Rollups, aiming to offer general smart-contract capabilities on its platform (via its OVM construction).

Everything that rises must converge

Plasma was ultimately much more than just a protocol. In a time of irrational exuberance, Plasma was the story that Ethereum needed to believe in. But its claims of boundless scalability turned out to be, with the benefit of hindsight, technological hubris. Only in moving past Plasma have we been able to fully appreciate the tradeoffs inherent in layer-2 scaling.

As Ether prices have rebounded over the last year, so has optimism about Ethereum's future. After nearly three years of searching for a secure, extensible, and robust scalability solution, the Ethereum research community has finally converged around rollups. Plasma and its cousins were noble first attempts, but a select group of innovators eventually created more realistic layer-2 designs that seem to have solved Plasma's worst problems. Some Plasma-focused research groups, such as Plasma Group, have moved on to work on Optimistic Rollup solutions, but we believe the search for the final layer-2 scaling solution is just getting started. There are many contenders, and we expect the field to remain an active and exciting area of research and development.

Thanks to Tom Schmidt, Georgios Konstantopoulos, and Ivan Bogatyy for reviewing drafts of this post. For more of our writing, follow us on Twitter at @ashwinrz and @hosseeb.