gas_price attribute) are converted to a 1559-compatible format (with gas_premium := max_fee := gas_price).
In this notebook we model and simulate the transition period suggested in Pull Request 2924. In this proposal, while 1559 transactions are expected, legacy-style transactions are still accepted by clients, who convert them to 1559 format using default parameters based on the declared gas price.
Under EIP 1559, users are expected to quote their gas_premium, the maximum amount received by a miner who includes their transaction, as well as a max_fee, which caps the maximum price paid by the user. Meanwhile, current transactions (which we call "legacy" in this notebook) only declare a single gas_price parameter, which the user pays in all circumstances (see our general introduction to EIP 1559 and its changes to the existing fee market here).
Michał Komorowski analysed for Nethermind the impact of this transition period on the user behaviour in a notebook using the abm1559 library. His takeaways include a comparison of the average fee paid by legacy users vs 1559 users, as well as their inclusion rates and eviction from the transaction pool. Notably, Michał ends with an important open question:
Percent of legacy vs. 1559 users in the blocks diagrams show us that transactions submitted by all types of users are included in blocks when basefee reaches stablization. It's good. What is very interesting is the fact that around 40% of transactions in blocks (after stablization of basefee) are transactions submitted by naive users. I would expect more transactions from legacy or from clever users and I'm not sure how to exlain this behaviour. It is also worth pointing that in the in the beginning of the simulation, when basefee grows, mainly transactions of clever EIP-1559 users are included in blocks.
We make the hypothesis here that legacy users benefit from the actions of 1559 users, who implicitly align the oracles used by legacy users with the true market price, allowing all users to be included. In this notebook, we'll expand on this idea by comparing two boundary scenarios and their synthesis.
We first import the classes from the abm1559 library we'll use in this notebook. If this is your first notebook, you can also check out the README at the root of this repo for installation details.
%config InlineBackend.figure_format = 'svg'
import os, sys
sys.path.insert(1, os.path.realpath(os.path.pardir))
from typing import Sequence, Dict
from abm1559.utils import (
    constants,
    basefee_from_csv_history,
    get_basefee_bounds,
    flatten
)
constants["SIMPLE_TRANSACTION_GAS"] = 210000
from abm1559.txpool import TxPool
from abm1559.users import (
    User1559,
    AffineUser,
    User
)
from abm1559.config import rng
from abm1559.txs import Transaction, Tx1559, TxLegacy
from abm1559.userpool import UserPool
from abm1559.chain import (
    Chain,
    Block1559,
    Block
)
from abm1559.simulator import (
    spawn_poisson_heterogeneous_demand,
    update_basefee,
    generate_gbm,
)
import matplotlib.pyplot as plt
import pandas as pd
pd.set_option('display.max_rows', 1000)
import numpy as np
import time
import seaborn as sns
from tqdm import tqdm
We assume that during the transition period, part of the ecosystem will keep using legacy tooling and creating legacy-format transactions. To deal with this, the Pull Request suggests normalising legacy transactions to 1559 transactions using a simple normalisation function.
Whenever a legacy transaction declares some gas_price, a 1559 transaction is created with gas_premium = gas_price and max_fee = gas_price. In other words, gas_price is still the fee that the user will pay, but the miner only receives the part of it that exceeds the basefee, i.e., gas_price - basefee.
Sender | gas_price | gas_premium | max_fee | basefee | User pays | Miner receives |
---|---|---|---|---|---|---|
Legacy user | 100 | 100 | 100 | 40 | 100 | 60 |
1559 user | x | 1 | 100 | 40 | 41 | 1 |
The intuition is that legacy users end up consistently overpaying compared to what 1559 users pay.
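To make the table above concrete, here is a minimal sketch of how the two payment columns can be computed. It assumes the usual EIP 1559 rule that the miner's tip is the smaller of gas_premium and max_fee - basefee; the helper name fee_split is ours for illustration and is not part of abm1559.

```python
def fee_split(gas_premium: int, max_fee: int, basefee: int):
    """Return (user_pays, miner_receives) per unit of gas, or None if invalid.

    Assumption: a transaction is only valid when max_fee covers the basefee.
    """
    if max_fee < basefee:
        return None
    tip = min(gas_premium, max_fee - basefee)
    return basefee + tip, tip

# Legacy transaction with gas_price = 100, normalised to premium = max_fee = 100
legacy = fee_split(gas_premium=100, max_fee=100, basefee=40)
# Native 1559 transaction with a small premium and the same fee cap
native = fee_split(gas_premium=1, max_fee=100, basefee=40)
print(legacy, native)  # (100, 60) (41, 1)
```

Both rows of the table are recovered: the legacy user pays its full gas_price, while the 1559 user pays only the basefee plus its premium.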
We now provide the function that casts legacy transactions into the 1559 format.
def normalize_from_legacy(legacy: TxLegacy) -> Tx1559:
    """
    Normalized according to
    https://github.com/ethereum/EIPs/blob/541c8be92fe759aa602b7d06a088ab1a139e37ce/EIPS/eip-1559.md
    """
    normalized_params = {
        "max_fee": legacy.gas_price(),
        "gas_premium": legacy.gas_price(),
        "start_block": legacy.start_block,
    }
    return Tx1559(
        sender=legacy.sender,
        gas_used=constants["SIMPLE_TRANSACTION_GAS"],
        tx_params=normalized_params
    )
Next we introduce Legacy agents with strategic behaviours that are analogous to different gas price estimation methods currently being used.
Before we are able to simulate the legacy market together with the new proposed 1559 mechanism, we must understand how legacy users will interpret the new environment. We make the assumption that the legacy users live in a world where they are fully unaware that the transition to 1559 took place. They refer to price oracles (given below) for their fee estimation, send transactions that contain their gas_price and are oblivious to the current level of the basefee.
MIN_ACCEPTABLE_TIP = 1 * (10 ** 9)
ORACLE_SLOW = "slow"
ORACLE_MEDIUM = "medium"
ORACLE_FAST = "fast"
class LegacyBidder(AffineUser):
    def __init__(self, wakeup_block, **kwargs):
        super().__init__(wakeup_block, cost_per_unit = 0, **kwargs)
        self.value = (1 + self.rng.pareto(2) * 20) * (10 ** 9)
    # Subclasses override `oracle` to pick which quote they follow
    def oracle(self):
        raise NotImplementedError("Not implemented, try using a subclass.")
    def query_oracle(self, env: dict):
        fee_oracles = env["fee_oracles"]
        return fee_oracles[self.oracle()]
    def decide_parameters(self, env: dict):
        gas_price = self.query_oracle(env)
        if self.oracle() == ORACLE_SLOW:
            scaling = 0.5
        elif self.oracle() == ORACLE_MEDIUM:
            scaling = 0.75
        else:
            scaling = 0.9
        # We add a normal error scaled by the user type
        gas_price += scaling * self.rng.normal() * (10 ** 9)
        return {
            "gas_price": int(max(gas_price, MIN_ACCEPTABLE_TIP)), # in wei
            "start_block": self.wakeup_block,
        }
    # This function is the main user entry point
    def create_transaction(self, env: dict):
        tx_params = self.decide_parameters(env)
        # If gas price is higher than 1.2 * my value, I balk
        if tx_params["gas_price"] >= 1.2 * self.value:
            return None
        # Otherwise, never bid more than my value
        tx_params["gas_price"] = min(self.value, tx_params["gas_price"])
        tx = TxLegacy(
            sender = self.pub_key,
            gas_used=constants["SIMPLE_TRANSACTION_GAS"],
            tx_params = tx_params,
        )
        return tx
In contrast with previous notebooks, users here do not experience a cost for waiting, so it is set to zero in the __init__ constructor. We also change the distribution of values, from a simple uniform distribution over the interval $[1, 20]$ to a Pareto distribution, a.k.a. the 80/20 law. Strictly speaking, we use the Lomax distribution, which is simply a Pareto distribution shifted to start at zero; we scale it and shift it again by 1, so that our users have a value of at least 1 Gwei.
The choice is made to reflect the actual distribution of bids more realistically, with most users having relatively low value and few users having relatively high value. Note that the scaling of the distribution is rather irrelevant, since we focus on relative differences between users. We scale this distribution to match observed prices on Ethereum and provide intuition. Given here is the distribution of a thousand samples from our value distribution (in Gwei).
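As a rough illustration of this value distribution, the sketch below draws values using only the standard library. It relies on the fact that random.paretovariate returns a classical Pareto variate (at least 1), so subtracting 1 gives a Lomax variate, which is what NumPy's pareto sampler returns. The helper name sample_value and the sample size are our own choices.

```python
import random
import statistics

GWEI = 10 ** 9

def sample_value(rng: random.Random, shape: float = 2.0, scale: float = 20.0) -> float:
    """Draw one user value in wei: (1 + scale * Lomax(shape)) Gwei."""
    lomax = rng.paretovariate(shape) - 1.0  # classical Pareto (>= 1) shifted to start at 0
    return (1.0 + scale * lomax) * GWEI

rng = random.Random(4)
values = [sample_value(rng) for _ in range(10_000)]
print(min(values) / GWEI)                # never below 1 Gwei
print(statistics.median(values) / GWEI)  # theory: 1 + 20 * (sqrt(2) - 1), about 9.3 Gwei
print(sum(v > 50 * GWEI for v in values) / len(values))  # share of high-value users
```

Most draws sit close to the lower bound, while a small share of users have values several times the median, which is the heavy tail the text describes.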
rng = np.random.default_rng(4)
pd.DataFrame({ "value": 1 + rng.pareto(2, 1000) * 20 }).hist(log=True, bins = 100)
[Figure: histogram of the 1,000 sampled values (panel title: value), with counts on a log scale.]
Note also how our legacy users set their bids.
gas_price = self.query_oracle(env)
gas_price += scaling * self.rng.normal() * (10 ** 9)
The LegacyBidder user class is subclassed below. Each subclass defines the oracle method, which tells query_oracle which quote to read; this quote is the user's initial estimate of the current price of gas. The oracle is part of the simulation environment, i.e., contained in the env variable.
We then allow the user to modify the price given by this oracle. The increment is drawn from a normal distribution, so that the gas price chosen by the user is a noisy version of the quoted oracle price. This represents the agency of users who tune the gas price themselves, taking the oracle as a reference, with most users bidding close to the oracle quote regardless. The standard deviation of the noise depends on the type of the user, upon which we expand now.
We instantiate three types of users, one for each oracle level ("slow", "medium" and "fast"). Concretely, this represents the different preferences of users faced with the three price levels. We have assumed the cost for waiting of all users to be zero, but we can imagine most users being relatively neutral towards their inclusion speed while few would always hit the fast oracle default for inclusion at maximum speed. The type of the user mirrors their position on this distribution of preferences, with neutral users content to use the "slow" oracle recommendation while hurried users follow the "fast" oracle, and "medium" users in-between.
The observations in this notebook are fairly robust to the levels of these parameters or the proportion of users between these three classes ("slow", "medium" and "fast"), which we'll define later on.
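The bidding rule of legacy users can be isolated from the class machinery. The following sketch restates it as a pure function, with the normal draw passed in as an argument so that each branch can be checked by hand. The function name legacy_bid and the NOISE_SCALING table are ours; the numbers are those used in LegacyBidder.

```python
GWEI = 10 ** 9
MIN_ACCEPTABLE_TIP = 1 * GWEI
NOISE_SCALING = {"slow": 0.5, "medium": 0.75, "fast": 0.9}

def legacy_bid(oracle_quote, value, oracle, noise):
    """Return the gas price (wei) a legacy user posts, or None if they balk.

    `noise` stands for one standard normal draw, passed in explicitly so the
    rule can be inspected without randomness.
    """
    gas_price = oracle_quote + NOISE_SCALING[oracle] * noise * GWEI
    gas_price = int(max(gas_price, MIN_ACCEPTABLE_TIP))
    if gas_price >= 1.2 * value:   # far above the user's value: no transaction
        return None
    return min(value, gas_price)   # never bid more than the user's value

print(legacy_bid(10 * GWEI, 30 * GWEI, "slow", noise=1.0))    # 10.5 Gwei
print(legacy_bid(10 * GWEI, 9 * GWEI, "fast", noise=0.0))     # capped at 9 Gwei
print(legacy_bid(10 * GWEI, 5 * GWEI, "medium", noise=0.0))   # balks: None
```

The three calls cover the three outcomes: a noisy bid near the quote, a bid capped at the user's value, and a user who balks because the quote is far above their value.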
class LegacyBidderSlow(LegacyBidder):
    def oracle(self):
        return ORACLE_SLOW
    def export(self):
        return {
            **super().export(),
            "user_type": "user_slow",
        }
class LegacyBidderMedium(LegacyBidder):
    def oracle(self):
        return ORACLE_MEDIUM
    def export(self):
        return {
            **super().export(),
            "user_type": "user_medium",
        }
class LegacyBidderFast(LegacyBidder):
    def oracle(self):
        return ORACLE_FAST
    def export(self):
        return {
            **super().export(),
            "user_type": "user_fast",
        }
We use in this notebook a simplified version of existing oracle models. EthGasStation (EGS) estimates three oracle levels using intricate analysis of the transaction pool and previous blocks. In particular, EGS monitors whether a transaction sent at the lowest suggested price is eventually included, updating its safe low amount if it isn't.
Our oracle model is admittedly less sophisticated: we only take the percentiles of gas prices observed in previous blocks. Since the quotes are entirely determined by the distribution of gas prices in previous blocks, we call these oracles distributional.
Given a window of the N most recent blocks, we collect the series of gas prices of all transactions included in these N blocks. These gas prices are ordered from lowest to highest and percentiles are extracted. In the example below, suppose we take N = 1 and blocks include 8 transactions. The percentile levels are 50 (the median), 75 and 90 respectively for the "slow", "medium" and "fast" oracles.
Gas prices | Median ("slow" oracle) | 75th-percentile ("medium" oracle) | 90th-percentile ("fast" oracle) |
---|---|---|---|
10, 20, 30, 40, 60, 100, 150, 200 | 50 | 112.5 | 165 |
This estimation may be biased when blocks aren't full. Suppose a single block includes only a 2000 Gwei transaction, while all other N-1 blocks are empty. Should our oracles return 2000 Gwei as their estimation of the current market price? Probably not. To avoid such situations, we define the block target as the targeted block size under EIP 1559 and pad empty transaction slots with zeros to ensure oracles return coherent prices.
Gas prices | Median ("slow" oracle) | 75th-percentile ("medium" oracle) | 90th-percentile ("fast" oracle) |
---|---|---|---|
0, 0, 0, 0, 0, 0, 150, 200 | 0 | 37.5 | 165 |
def update_oracles(env: dict, block_target: int) -> None:
    blocks_window = len(env["recent_blocks"])
    # Gas prices paid by all transactions in the window of recent blocks
    fees = flatten([
        [tx.gas_price({ "basefee": block.basefee }) for tx in block.txs]
        for block in env["recent_blocks"]
    ])
    # Pad unused transaction slots (relative to the block target) with zeros
    fees += [0.0] * (blocks_window * block_target - len(fees))
    env["fee_oracles"] = {
        ORACLE_SLOW: np.percentile(fees, 50) if len(fees) > 0 else 0,
        ORACLE_MEDIUM: np.percentile(fees, 75) if len(fees) > 0 else 0,
        ORACLE_FAST: np.percentile(fees, 90) if len(fees) > 0 else 0
    }
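The padding-then-percentile logic can be tried out on the two example blocks without the rest of the simulation. The sketch below is a dependency-free approximation: the percentile helper uses linear interpolation between neighbouring ranks, which is our understanding of NumPy's default behaviour, and the names percentile and oracle_quotes are illustrative. Other percentile conventions (for instance, taking the midpoint of the two neighbouring prices) give slightly different quotes, such as 125 and 175 for the first example.

```python
def percentile(sorted_values, q):
    """q-th percentile with linear interpolation between neighbouring ranks."""
    rank = (len(sorted_values) - 1) * q / 100
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = rank - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight

def oracle_quotes(fees, slots):
    """Pad `fees` with zeros up to `slots` entries, then read the three oracles."""
    padded = sorted(list(fees) + [0.0] * max(slots - len(fees), 0))
    if not padded:
        return {"slow": 0, "medium": 0, "fast": 0}
    return {
        "slow": percentile(padded, 50),
        "medium": percentile(padded, 75),
        "fast": percentile(padded, 90),
    }

full_block = oracle_quotes([10, 20, 30, 40, 60, 100, 150, 200], slots=8)
sparse_block = oracle_quotes([150, 200], slots=8)
print(full_block)    # roughly slow 50, medium 112.5, fast 165
print(sparse_block)  # roughly slow 0, medium 37.5, fast 165
```

Padding pulls the "slow" and "medium" quotes down sharply when the block is mostly empty, while the "fast" quote stays anchored to the few high bids that were included.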
A special class of high-value users exists on the network too. These users post transactions closing collateralised positions, arbitraging fees on DEXes or bidding for first inclusion. This is where the highest fees are observed, as the largest admissible fee for these transactions is the size of their profit opportunity, which in the three cases mentioned above can be very large. (For more on Miner-Extractable Value, or MEV, and bidding wars, check out the Flash Boys 2.0 paper by Phil Daian et al.)
Recently, a lower bound of about 10k out of 443k blocks was estimated to have been used for MEV-related activity, with a large amount of the profits (~18.7%) paid back to the miners who included these high-fee, MEV-extracting transactions. Although the behaviour of bots acting on behalf of MEV-chasing users is complex, involving bidding wars and real-time analysis of the transaction pool, we'll use a simple model of an extremely risk-averse user who wants their transaction included at any cost (as long as it is lower than the value it can extract).
The risk-averse user observes the "fast" oracle and posts a gas price somewhere between that oracle and its own (high) value. More data on the outcomes of such bidding wars could be used to tune this assumption, but we reason that any factor such as latency or adverse behaviour may be responsible for stopping the bidding war at any point between its lower bound (the fast oracle) and its upper bound (the user value, i.e., whatever MEV there is to extract).
class BiddingBot(AffineUser):
    def __init__(self, wakeup_block, **kwargs):
        super().__init__(wakeup_block, cost_per_unit = 0, **kwargs)
        self.value = (1 + self.rng.pareto(2) * 40) * (10 ** 9)
    def percentile(self):
        raise Exception("Not implemented, try using a subclass.")
    def decide_parameters(self, env: dict):
        oracle_price = env["fee_oracles"][ORACLE_FAST]
        # Bid uniformly at random between the fast oracle quote and the bot's value
        gas_price = oracle_price + self.rng.random() * (self.value - oracle_price)
        return {
            "gas_price": int(gas_price), # in wei
            "start_block": self.wakeup_block,
        }
    # This function is the main user entry point
    def create_transaction(self, env: dict):
        tx_params = self.decide_parameters(env)
        expected_payoff = self.value - tx_params["gas_price"]
        if expected_payoff <= 0:
            return None
        tx = TxLegacy(
            sender = self.pub_key,
            gas_used=constants["SIMPLE_TRANSACTION_GAS"],
            tx_params = tx_params,
        )
        return tx
    def export(self):
        return {
            **super().export(),
            "user_type": "bidding_bot",
        }
Mirroring the work of Michał Komorowski for Nethermind, we cap the transaction pool length to represent the bounded memory and computation of mining clients. The transaction pool otherwise still orders transactions by their effective tip (the amount received by the miner for inclusion).
MAX_TRANSACTIONS_IN_POOL = 2000
class MixedTxPool(TxPool):
    def add_txs(self, txs: Sequence[Transaction], env: dict) -> Sequence[Transaction]:
        for tx in txs:
            if type(tx) is TxLegacy:
                tx = normalize_from_legacy(tx)
            self.txs[tx.tx_hash] = tx
        self.pool_length += len(txs)
        if self.pool_length > MAX_TRANSACTIONS_IN_POOL:
            # Keep only the transactions with the highest tips, return the evicted ones
            sorted_txs = sorted(self.txs.values(), key = lambda tx: -tx.tip(env))
            self.empty_pool()
            self.add_txs(sorted_txs[0:MAX_TRANSACTIONS_IN_POOL], env)
            return sorted_txs[MAX_TRANSACTIONS_IN_POOL:]
        return []
    def select_transactions(self, env, user_pool=None, rng=rng):
        # Miner side
        max_tx_in_block = int(constants["MAX_GAS_EIP1559"] / constants["SIMPLE_TRANSACTION_GAS"])
        valid_txs = [tx for tx in self.txs.values() if tx.is_valid(env) and tx.tip(env) >= MIN_ACCEPTABLE_TIP]
        rng.shuffle(valid_txs)
        sorted_valid_demand = sorted(
            valid_txs,
            key = lambda tx: -tx.tip(env)
        )
        selected_txs = sorted_valid_demand[0:max_tx_in_block]
        return selected_txs
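The eviction step of the capped pool boils down to a sort followed by a cut. Here is a small stand-alone sketch of that idea, working on a plain dictionary of tips rather than on transaction objects. The function cap_pool and the example hashes are hypothetical and only meant to show which transactions survive.

```python
def cap_pool(tips, max_size):
    """Keep the `max_size` highest tips; return (kept, evicted) as dicts.

    `tips` maps a transaction hash to the tip its miner would receive.
    """
    ranked = sorted(tips.items(), key=lambda item: -item[1])
    kept = dict(ranked[:max_size])
    evicted = dict(ranked[max_size:])
    return kept, evicted

pool = {"0xa": 5, "0xb": 1, "0xc": 9, "0xd": 3}
kept, evicted = cap_pool(pool, max_size=2)
print(sorted(kept))     # ['0xa', '0xc']
print(sorted(evicted))  # ['0xb', '0xd']
```

Since tips depend on the current basefee, the ranking is recomputed with the environment at the time of insertion, so a transaction kept in one round may be evicted in a later one.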
Block space allocation is a dynamic process with a queue of users competing for scarce resources. Although first price auctions are an imprecise mechanism to allocate the resources to users efficiently, they provide an estimator of some true market price. We define market price as the actual gas price that ought to be quoted to exactly equalise demand with supply. How can we find it in our setting?
We target the inclusion of some number of users $T$ in the block (recall that all users have the same, constant demand for gas). New users come in between each block, while the pool holds at most MAX_TRANSACTIONS_IN_POOL (2,000) pending transactions, ranked by their tip amount. Before a new block is created, we take a "snapshot" of the current state: a set of new users with values for gas $(v_i)_i$, and a set of pending transactions with gas prices $(g_j)_j$. We now sort the values and gas prices in decreasing order, obtaining a sequence of "prices" $(p_k)_k$. Taking $p_T$, the $T$-th price in the sequence, we estimate the true market price that equalises supply with the demand given the snapshot $(v, g)$.
There are at least two other ways to estimate the true market price.
In a sense, we look for a value that answers the question: "In a world where the market has full information about user values, which uniform price should the market quote to equalise supply and demand?"
We give in get_market_price the implementation of this estimator. Note that when we have fewer users and transactions than the target amount, the price is set to 0, since there is no scarcity. In practice, we could set this no-scarcity price to the minimum fee miners expect from a transaction, although this makes no difference in the following.
def get_market_price(users: Sequence[User], txs: Sequence[Transaction], env: dict, target: int) -> float:
    values = sorted(
        [user.value for user in users] + [tx.gas_price(env) for tx in txs],
        key = lambda v: -v
    )
    return values[target] if target < len(values) else 0
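A toy snapshot helps to see what this estimator returns. The sketch below applies the same sort-and-pick rule to plain numbers (in Gwei); market_price is an illustrative stand-in for get_market_price and the figures are made up. With zero-based indexing, the returned price is the highest one that does not make the cut when exactly target slots are filled.

```python
def market_price(values, pending_prices, target):
    """Price at position `target` once values and pending bids are sorted high to low."""
    prices = sorted(list(values) + list(pending_prices), reverse=True)
    return prices[target] if target < len(prices) else 0

# Three new users (values) and three pending transactions (gas prices), in Gwei
print(market_price([50, 12, 8], [30, 20, 5], target=3))  # 12
print(market_price([50, 12], [30], target=3))            # 0: no scarcity
```

In the first call, the three highest prices (50, 30 and 20) fill the block and 12 is the first price left out; in the second, demand does not fill the target, so the estimate falls back to 0.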
Finally, here is the simulation loop! Most of the lines below should be familiar if you've followed our notebooks until now.
def simulate(demand_scenario, shares_scenario, extra_metrics = None, rng = rng):
    # Instantiate a couple of things
    txpool = MixedTxPool()
    chain = Chain()
    metrics = []
    user_pool = UserPool()
    blocks_window = 30
    start_time = time.time()
    block_target = int(constants["MAX_GAS_EIP1559"] / constants["SIMPLE_TRANSACTION_GAS"] / 2.0)
    # `env` is the "environment" of the simulation
    env = {
        # we start with the historic basefee based on previous txs
        "basefee": constants["INITIAL_BASEFEE"],
        # "basefee": initial_historic_basefee,
        "current_block": None,
        "min_premium": 2 * (10 ** 9),
        "recent_blocks": [],
        "fee_oracles": {
            ORACLE_SLOW: 2 * (10 ** 9),
            ORACLE_MEDIUM: 2 * (10 ** 9),
            ORACLE_FAST: 2 * (10 ** 9)
        }
    }
    for t in tqdm(range(len(demand_scenario))):
        # Sets current block
        env["current_block"] = t
        # Reset the random number generator with new seed to generate users with same values across runs
        rng = np.random.default_rng(t)
        ### SIMULATION ###
        # We return some demand which on expectation yields `demand_scenario[t]` new users per round
        users = spawn_poisson_heterogeneous_demand(t, demand_scenario[t], shares_scenario[t], rng=rng)
        # Record the market price given the current snapshot
        market_price = get_market_price(
            users,
            txpool.txs.values(),
            env,
            block_target
        )
        # Add new users to the pool
        # We query each new user with the current basefee value
        # Users either return a transaction or None if they prefer to balk
        decided_txs = user_pool.decide_transactions(users, env)
        # New transactions are added to the transaction pool
        # `evicted_txs` holds the transactions removed from the pool for lack of space
        evicted_txs = txpool.add_txs(decided_txs, env)
        # The best valid transactions are taken out of the pool for inclusion
        selected_txs = txpool.select_transactions(env)
        txpool.remove_txs([tx.tx_hash for tx in selected_txs])
        # We create a block with these transactions
        block = Block1559(
            txs = selected_txs, parent_hash = chain.current_head,
            height = t, basefee = env["basefee"]
        )
        # Keep a rolling window of the `blocks_window` most recent blocks
        env["recent_blocks"] = env["recent_blocks"][-(blocks_window-1):] + [block]
        # Record the min premium in the block
        env["min_premium"] = block.min_premium()
        # The block is added to the chain
        chain.add_block(block)
        ### METRICS ###
        row_metrics = {
            "block": t,
            "users": len(users),
            "decided_txs": len(decided_txs),
            "included_txs": len(selected_txs),
            "basefee": env["basefee"] / (10 ** 9), # to Gwei
            "slow_oracle": env["fee_oracles"][ORACLE_SLOW] / (10 ** 9), # to Gwei
            "medium_oracle": env["fee_oracles"][ORACLE_MEDIUM] / (10 ** 9), # to Gwei
            "fast_oracle": env["fee_oracles"][ORACLE_FAST] / (10 ** 9), # to Gwei
            "market_price": market_price / (10 ** 9), # to Gwei
            "blk_min_premium": block.min_premium() / (10 ** 9), # to Gwei
            "blk_max_premium": block.max_premium() / (10 ** 9), # to Gwei
            "blk_min_tip": block.min_tip(env) / (10 ** 9), # to Gwei
            "blk_max_tip": block.max_tip(env) / (10 ** 9), # to Gwei
        }
        if extra_metrics is not None:
            row_metrics = {
                **row_metrics,
                **extra_metrics(env, users, user_pool, txpool),
            }
        metrics.append(row_metrics)
        ### ORACLES ###
        update_oracles(env, block_target)
        # Finally, basefee is updated and a new round starts
        env["basefee"] = update_basefee(block, env["basefee"])
    return (pd.DataFrame(metrics), user_pool, chain)
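The last step of the loop delegates the basefee adjustment to update_basefee from the library. For readers who have not seen the rule before, here is a minimal sketch of the standard EIP 1559 update, under the assumption that the library follows it: the basefee moves by at most 1/8 per block, in proportion to how far gas used is from the target. The function name next_basefee and the 12.5M gas target are illustrative choices, not values read from abm1559.

```python
def next_basefee(basefee: int, gas_used: int, target_gas: int,
                 max_change_denominator: int = 8) -> int:
    """Standard EIP 1559 update (assumed here, the library may differ in details)."""
    delta = gas_used - target_gas
    return basefee + basefee * delta // target_gas // max_change_denominator

GWEI = 10 ** 9
target = 12_500_000  # illustrative gas target, half of a 25M gas limit
print(next_basefee(10 * GWEI, 2 * target, target) / GWEI)  # full block: 11.25
print(next_basefee(10 * GWEI, target, target) / GWEI)      # on target: 10.0
print(next_basefee(10 * GWEI, 0, target) / GWEI)           # empty block: 8.75
```

Legacy users never look at this quantity directly, but it decides which of their normalised transactions are valid, since a transaction needs max_fee at least equal to the basefee.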
The simulation loop above defines a 1559-environment. There is a basefee determining transaction validity and blocks are created with 1559-formatted transactions (with max_fee and gas_premium fields). In this environment, we start by looking at two scenarios.
gas_price field to the amount desired by the user.
max_fee and gas_premium attributes.
Before launching into these scenarios, we discuss the demand process in this notebook.
In previous notebooks, we've always used a stationary demand to simulate the behaviour of the mechanism. This is a good assumption to obtain the stationary behaviour of the basefee and 1559 users, but is obviously a gross simplification of the real world. Demand can be made non-stationary in two ways:
While 2 is probably a better representation of systemic shocks (e.g., the introduction of a new token or contract), 1 appears to better reflect natural chain activity, with alternating high- and low-demand regimes. This is the choice we make in this notebook.
We generate a sample path from a geometric Brownian motion (GBM) with $\mu = \frac{\sigma^2}{2}$ to remove any trend. The path has some value $d_t$ at each time step $t$. We then sample from a Poisson distribution of mean $d_t$ to obtain the actual number of users spawned at time step $t$. Note that $d_t$ may be a decimal number, but the Poisson sample will always be integer-valued.
blocks = 6000
rng = np.random.default_rng(4)
sigma = 0.05
mu = 0.5 * sigma**2
gbm = list(generate_gbm(200, blocks, paths=1, mu=mu, sigma=sigma, rng=rng).flatten())
# Plot demand
f, ax = plt.subplots()
plt.title('Demand size over time', color='black')
plt.plot(range(blocks), gbm)
ax.set_xlabel("Block height")
ax.set_ylabel("Number of new users")
plt.show()
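The two-stage demand process described above (a GBM path for the expected demand, then a Poisson draw for the realised number of users) can be sketched with the standard library alone. This is not the code of generate_gbm or spawn_poisson_heterogeneous_demand; it assumes the usual exact discretisation of a GBM, in which the log of the path drifts at rate $\mu - \sigma^2/2$, so that choosing $\mu = \sigma^2/2$ leaves a driftless random walk in logs. The names driftless_gbm and poisson are ours.

```python
import math
import random

def driftless_gbm(initial, steps, sigma, rng):
    """GBM path whose drift mu = sigma**2 / 2 cancels the Ito correction,
    so log-demand follows a plain Gaussian random walk."""
    mu = 0.5 * sigma ** 2
    path = [float(initial)]
    for _ in range(steps - 1):
        shock = rng.gauss(0.0, 1.0)
        path.append(path[-1] * math.exp((mu - 0.5 * sigma ** 2) + sigma * shock))
    return path

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for means of a few hundred,
    inaccurate once exp(-mean) underflows (means above roughly 700)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(4)
demand = driftless_gbm(initial=200, steps=100, sigma=0.05, rng=rng)
new_users = [poisson(d, rng) for d in demand]
print(demand[:3])
print(new_users[:3])
```

The expected demand $d_t$ stays a positive real number at every step, while the realised number of new users is always a non-negative integer scattered around it.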
In this scenario, only legacy users interact in a 1559-environment. We instantiate constant fractions of each user type each step, while the absolute number of users follows the demand process generated by the GBM sample path.
In particular, following a rough intuition of Pareto-law distributed users, we imagine that most users are relatively unhurried, while some users are willing to pay greater amounts for faster inclusion. This assumption is implemented with the following shares of user types:
def extra_metrics_fpa(env, users, user_pool, txpool):
    pool_legacy_users = len(
        [tx for tx in txpool.txs.values() if isinstance(user_pool.users[tx.sender], LegacyBidder)])
    return {
        "legacy": len([user for user in users if isinstance(user, LegacyBidder)]),
        "pool_legacy": pool_legacy_users
    }
# Number of new users per time step
demand_scenario = [gbm[t] for t in range(blocks)]
# Shares of new users per time step
shares_scenario_fpa = [{
    LegacyBidderSlow: 0.50,
    LegacyBidderMedium: 0.30,
    LegacyBidderFast: 0.15,
    BiddingBot: 0.05,
} for t in range(blocks)]
(df, user_pool, chain) = simulate(demand_scenario, shares_scenario_fpa, extra_metrics_fpa)
ax = df.plot("block", ["basefee", "market_price", "slow_oracle", "medium_oracle", "fast_oracle"])
ax.set_xlabel("Block height")
ax.set_ylabel("Gas price (Gwei)")
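[Figure: basefee, market price and the slow, medium and fast oracle quotes over the full simulation; x-axis: Block height, y-axis: Gas price (Gwei).]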
Let's zoom in on a particularly interesting pattern.
ax = df[(df.block >= 400) & (df.block < 1750)].plot("block", ["basefee", "market_price", "slow_oracle", "medium_oracle", "fast_oracle"])
ax.set_xlabel("Block height")
ax.set_ylabel("Gas price (Gwei)")
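[Figure: basefee, market price and the slow, medium and fast oracle quotes between blocks 400 and 1750; x-axis: Block height, y-axis: Gas price (Gwei).]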