Shapley, Weyl, Buterin, Freedman, Spinoza, Leibniz, and Kant



I walk into a House of Worship and the guy up front is in the middle of an oration. I don't expect to stay but catch a few words, “... so my friends, reach down and uplift thy less fortunate neighbor—to the $\frac{1}{2}$-power, for so it is written, and so may it be derived.” Startled, I take a seat; this sounds like theology on a wavelength I can receive.

This is the opening sentence to the best white paper I've ever read, published by the Fields medalist Michael Freedman, in which he explains how Quadratic Funding mechanisms for public goods uniquely solve Immanuel Kant's Categorical Imperative from Groundwork of the Metaphysics of Morals:1

"Act only according to that maxim whereby you can at the same time, will that it should become universal law."

To back up and explain how we got here: it's been nearly two years since this blog's first foray into voting power in game theoretic contexts, and I am here to present another exercise in pedantry amongst friends due to a number of goodhearted disputes which arose during a cannonball run road trip from various corners and edges of the east coast to and from Baton Rouge. Violating the Nordic principle of perpetual-favor-owing till death do us part and true to form, I will browbeat my friends into submission by citing not one source, not two, but three white papers saying "I'm right, actually" in a full tour de force of literature spanning mathematics, economics, and moral philosophy, all so that I can dredge up old beef about "who left the cigs in Verdun" in the groupchat months later.

In this post, I'm going to try to synthesize the takeaways from three papers: Glen Weyl et al's Quadratic Voting,2 Buterin et al's addendum on Quadratic Funding,3 and Michael Freedman's pristine riff on the previous paper, titled Spinoza, Kant, Buterin.4

And keeping score on Weyl's outfits because that guy knows how to dress.

1 | Shapley and his Values

Suppose two players $p_1$ and $p_2$ can cooperate to achieve some goal and be rewarded by prizes:


By teaming up and forming coalitions (ordered sets of agents), the players can receive the following payouts:

| Coalition | Payout |
| --- | --- |
| $\{p_1\}$ | \$7,500 |
| $\{p_2\}$ | \$5,000 |
| $\{p_1, p_2\}$ | \$10,000 |

As we can see, these coalition payout values seem to indicate that player $p_1$ should receive more prize money based on his individual contribution to the coalition. How, then, can we equitably divide prize money?

We compute the expected marginal contribution of each player in a coalition: the increase in the coalition's total payout due to the addition of that player. For example, the marginal contributions of player $p_1$ to each possible coalition (with subscripts indicating player membership) are enumerated as:

$$\begin{aligned} C_{1,2} - C_2 &= \text{\textdollar}5,000 \\ C_{1} - C_0 &= \text{\textdollar}7,500 \end{aligned}$$

Therefore, the expected marginal contribution is just the average of these two outcomes:

$$\mathbb{E}[p_1] = \frac{5,000 + 7,500}{2} = \text{\textdollar}6,250$$

Similarly, the expected marginal contributions of $p_2$ are given by:

$$\begin{aligned} \mathbb{E}[p_2] &= \frac{(C_{1,2} - C_1) + (C_2 - C_0)}{2} \\ \\ &= \frac{2,500 + 5,000}{2} = \text{\textdollar}3,750 \end{aligned}$$

These expected marginal contributions are called Shapley values.

Note that the sum of every player's expected marginal contribution necessarily equals the expected value of the maximal coalition $C_{1,2}$:

$$\begin{aligned} \mathbb{E}[p_1] + \mathbb{E}[p_2] &= \mathbb{E}[C_{1,2}] \\ \\ \text{\textdollar}6,250 + \text{\textdollar}3,750 &= \text{\textdollar}10,000 \end{aligned}$$

Keeping with conventional notation, an expected value is the weighted sum of possible outcomes, or in our case, coalitions. So, for more permutations of coalitions where players may not participate, due to arbitrary constraints (like being a party pooper), we have to introduce weights corresponding to the frequency of coalitions.

For example, given the following coalition payouts (denoted as utilities $U$):


We compute the following marginal contributions for $p_1$:

$$\begin{aligned} \mathbb{E}[p_1] = \; &w_1 (C_{1,2,3} - C_{2,3}) + w_2 (C_{1,2} - C_{2}) + \\ &w_3 (C_{1,3} - C_{3}) + w_4 (C_{1} - C_{0}) \\ \end{aligned}$$

where the weights $w_i$ are given by the fraction of join orders in which the given player makes that marginal contribution:

$$P\big((C \cup \{i\}) - C\big)$$

e.g. $w_1 = P\big(C_{1,2,3} - C_{2,3} \big) = 1/3$ since there are $3! = 6$ ways to form a coalition of all three players, and in 2/6 of them ($C_{2,3,1}$ and $C_{3,2,1}$) $p_1$ makes the final, marginal contribution.

Repeating this computation of the weights for all coalitions where $p_1$ is the pivotal contributor, we get

$$\begin{aligned} \mathbb{E}[p_1] = \; &w_1 (C_{1,2,3} - C_{2,3}) + w_2 (C_{1,2} - C_{2}) + \\ &w_3 (C_{1,3} - C_{3}) + w_4 (C_{1} - C_{0}) \\ = \; &\frac{1}{3} \text{\textdollar}5,000 + \frac{1}{6} \text{\textdollar}2,500 + \frac{1}{6} \text{\textdollar}7,500 + \frac{1}{3} \text{\textdollar}5,000 \\ = \; &\text{\textdollar}5,000 \end{aligned}$$
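As a sanity check on those weights, here is a minimal Python sketch (the function and variable names are mine, not from any of the papers) that enumerates all $3!$ join orders and tallies how often player 1 arrives right after each possible set of predecessors.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

def predecessor_weights(player, players):
    """Fraction of join orders in which `player` arrives right after each coalition."""
    orders = list(permutations(players))
    counts = Counter()
    for order in orders:
        # Everyone who joined before `player` in this ordering.
        before = frozenset(order[:order.index(player)])
        counts[before] += 1
    return {coalition: Fraction(n, len(orders)) for coalition, n in counts.items()}

weights = predecessor_weights(1, [1, 2, 3])
for coalition, w in sorted(weights.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(coalition), w)
```

The empty coalition and $\{2, 3\}$ each come out at 1/3, while $\{2\}$ and $\{3\}$ each come out at 1/6, matching the weights used above.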

In general, for a $P$-player game where we want to compute the marginal contribution of player $i$ to coalition $C$, the Shapley value is:

$$\phi_i = \sum_{C \subseteq \{1, ..., P\} \setminus \{i\}} \frac{|C|!(P - |C| -1)!}{P!} \Big[V(C \cup \{i\}) - V(C) \Big]$$


where:

  • $P!$ is the number of ways to form a coalition of $P$ players
  • $|C|$ is the number of players in a coalition $C$
  • $|C|!$ is the number of ways that coalition $C$ can be formed
  • $(P - |C| - 1)!$ is the number of ways that the remaining players can join after player $i$ joins
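The general formula translates almost line for line into code. The sketch below is a hedged illustration rather than a reference implementation: `shapley_value` and the `payouts` dictionary are names I made up, and the singleton payouts (\$7,500 and \$5,000) are read off the two-player marginal contributions earlier in this section, assuming the empty coalition is worth \$0.

```python
from itertools import combinations
from math import factorial

def shapley_value(i, players, value):
    """Shapley value of player i; `value` maps a frozenset coalition to its payout."""
    others = [p for p in players if p != i]
    n = len(players)
    total = 0.0
    for size in range(len(others) + 1):
        # Weight |C|! (P - |C| - 1)! / P! for every coalition C of this size.
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for coalition in combinations(others, size):
            c = frozenset(coalition)
            total += weight * (value[c | {i}] - value[c])
    return total

# Two-player game from the start of the section (empty coalition assumed worth 0).
payouts = {
    frozenset(): 0,
    frozenset({1}): 7_500,
    frozenset({2}): 5_000,
    frozenset({1, 2}): 10_000,
}
phi = {p: shapley_value(p, [1, 2], payouts) for p in [1, 2]}
print(phi)
```

This reproduces the \$6,250 / \$3,750 split computed by hand above.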

## Ass, Grass, or Cash (The Taxi Problem)

Rather than studying payouts, we can also apply Shapley values to split weird checks. Suppose three friends, call them say ... Howard, Trevor and Peter want to split the cost of gas (and other supplies) for the return trip from Baton Rouge to their respective headquarters on the east coast. We'll say the total cost of gas and cigarettes is $300, but –crucially– the amount of gas and cigarettes they each consume is proportionate to how much time they spend on the road. Howard is located in Virginia Beach, Trevor in Smithfield, and Peter in Charlotte. So the total distance traveled, gas consumed, and darts chuffed is not equal!!

We can flatten this out and visualize the shared cost of resources (⛽🚬) between the burgeoning coalition as follows:

Intuitively, we might have each participant pay the cost of their leg of the trip:

Now, while I did work out the exact numbers5 which were included in the preprint of this post which I sent straight to their dumbasses a few months ago, for the sake of readability, we can use round-er numbers to illustrate the point.

  • $P = \text{\textdollar} 180$
  • $T = \text{\textdollar} 60$
  • $H = \text{\textdollar} 60$

Under this division, however, absent the incentive of 15 hours of banter with his two better halves, Peter has no reason to participate in the coalition and could just drive himself home (hates fun). Howard, on the other hand, is already going past Charlotte and would much rather pay $60 than $300 (and just think about having to smoke three people's worth of cigarettes by your lonesome in that UHaul). Who could blame him for trying to rope in some companionship!

Peter might argue that, since Howard's already going to V.B., and Charlotte is on the way, the distribution should in fact be:

  • $P = \text{\textdollar} 0$
  • $T = \text{\textdollar} 120$
  • $H = \text{\textdollar} 180$

Trevor likes this line of reasoning and also argues that Howard should just facetank the cost and drop us both off since we're on the way, advocating for the following cost distribution:

  • $P = \text{\textdollar} 0$
  • $T = \text{\textdollar} 0$
  • $H = \text{\textdollar} 300$

But Howard, having been rebuked by his so-called "friends" stands to gain nothing from such an arrangement at this point, and doesn't wish to share his UHaul with these freeloading clowns.

By introducing the notion of order to the cannonball campaign, we can address this nonsense about "well, since you're already on the way..."

Suppose Peter taps into the group chat announcing that his return trip from Baton Rouge will cost him a mere $180, and then brother Trevor decides to join the coalition. Here, his contribution is necessarily an additional $60 to get from Charlotte to Smithfield as he joins Peter's existing trip. Lastly, Howard stops pouting and decides to tag along at the cost of the remaining $60. Only for this order of coalition formation do the initial naive marginal contribution amounts check out.

Once again, the idea underlying Shapley values is that each participant pays their average marginal contribution over all possible orderings.

Characteristic Function of a Cooperative Game

The characteristic function of a cooperative game takes a subset of players as input, and maps them to a cost value:

$$V : 2^N \rightarrow \mathbb R$$

We have $2^N$ possible coalitions of $N$ people since each person is either in or out of a coalition. So, the characteristic function of the three champions of the Pain Trust sojourning to6 –and, more importantly, from– Red Stick would be:

$$V(P, T, H) = \text{\textdollar}300$$

And if Howard for some reason decides to sit out (how're ya gonna get home bucko?) we'd have:

$$V(P, T) = \text{\textdollar}240$$

The marginal contribution of Howard joining an existing coalition is given by taking the difference between the characteristic functions:

$$V(P, T, H) - V(P, T) = \text{\textdollar}60$$

Alternatively, if Trevor bows out for some reason or another, the corresponding marginal contribution of Howard sucking it up and joining Peter is:

$$V(P, H) - V(P) = \text{\textdollar}300 - \text{\textdollar}180 = \text{\textdollar}120$$

So we can reframe the Shapley value as:

$$\phi_i = \frac{1}{N!} \sum_{\pi \in \Pi_N} \Big[ V(C_\pi \cup \{i\}) - V(C_\pi) \Big]$$

once again:

  • $\phi_i$ is the amount player $i$ pays
  • averaged over all $N!$ possible orderings $\pi \in \Pi_N$ of players
  • summing the marginal contributions given by the difference in characteristic functions $V(C_\pi \cup \{i\}) - V(C_\pi)$
  • where $C_\pi$ is the coalition of players who joined before player $i$ in ordering $\pi$

We can tabulate the marginal contributions of each player $i$ and compute $\phi_i$ as the row average:

| $C$: | $\{P,T,H\}$ | $\{T,P,H\}$ | $\{P,H,T\}$ | $\{T,H,P\}$ | $\{H,P,T\}$ | $\{H,T,P\}$ | $\phi_i$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Peter | 180 | 0 | 180 | 0 | 0 | 0 | \$360/6 = \$60 |
| Trevor | 60 | 240 | 0 | 240 | 0 | 0 | \$540/6 = \$90 |
| Howard | 60 | 60 | 120 | 60 | 300 | 300 | \$900/6 = \$150 |

Demonstrating that Howard needs to pipe down
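The table can be reproduced by brute force over all six join orders. This is a sketch under one labeled assumption: I model the characteristic function as "the car only has to reach the farthest rider in the coalition" (the classic airport-problem form), using the round numbers above (\$180 to Charlotte, then \$60 more to Smithfield, then \$60 more to Virginia Beach). `SOLO_COST`, `trip_cost`, and `shapley_by_orderings` are illustrative names of my own.

```python
from itertools import permutations

# Cost of driving each rider home alone (round numbers from the post):
# Charlotte is $180 out, Smithfield $240, Virginia Beach $300.
SOLO_COST = {"Peter": 180, "Trevor": 240, "Howard": 300}

def trip_cost(coalition):
    """Characteristic function: the car only has to reach the farthest rider."""
    return max((SOLO_COST[r] for r in coalition), default=0)

def shapley_by_orderings(riders):
    """Average each rider's marginal cost over every possible join order."""
    totals = {r: 0 for r in riders}
    orderings = list(permutations(riders))
    for order in orderings:
        joined = []
        for rider in order:
            # Marginal cost of adding this rider to whoever joined before.
            totals[rider] += trip_cost(joined + [rider]) - trip_cost(joined)
            joined.append(rider)
    return {r: totals[r] / len(orderings) for r in riders}

shares = shapley_by_orderings(list(SOLO_COST))
print(shares)
```

The shares come out to \$60 for Peter, \$90 for Trevor, and \$150 for Howard, which sum to the full \$300.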


How do we prove fairness? For problems like these, four axioms are usually invoked: efficiency, symmetry, null-player invariance, and linearity.


Efficiency

A method is said to be efficient if people don't end up paying more than the total cost of the effort. Shapley values are efficient since

$$\sum_{i=1}^N \phi_i = V(\{1, \dots, N\})$$

That is, the sum of all Shapley values is not more (or less) than the characteristic function of an exhaustive coalition.


Symmetry

If two players are indistinguishable, then their costs should be the same.

If $\forall C, V(C \cup \{i\}) = V(C \cup \{j\})$, then $\phi_i = \phi_j$

Null Player

A player who stands to gain nothing from joining a coalition should have zero marginal contribution cost:

If $V(C \cup \{i\}) = V(C) \; \forall C$, then $\phi_i = 0$


Linearity

Multistage cost functions should combine linearly:

If $V(C) = V_1(C) + V_2(C)$ then $\phi_i = \phi_{i_1} + \phi_{i_2}$


Theorem: The Shapley value is the unique value that satisfies all four fairness axioms.
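Checking the axioms numerically is not a proof of the theorem, but it makes them concrete. The toy games below are hypothetical (my own construction, not from the papers): players "a" and "b" are interchangeable, and "c" never changes a coalition's value.

```python
from fractions import Fraction
from itertools import permutations

def shapley(players, value):
    """Average marginal contribution over all join orders; `value` takes a frozenset."""
    orders = list(permutations(players))
    phi = {p: Fraction(0) for p in players}
    for order in orders:
        joined = frozenset()
        for p in order:
            phi[p] += Fraction(value(joined | {p}) - value(joined), len(orders))
            joined = joined | {p}
    return phi

players = ["a", "b", "c"]

def v1(s):
    # Worth 100 only once both "a" and "b" are on board.
    return 100 if {"a", "b"} <= s else 0

def v2(s):
    # Worth 30 for each of "a" and "b" present.
    return 30 * len(s & {"a", "b"})

def v_sum(s):
    return v1(s) + v2(s)

phi1, phi2, phi_sum = shapley(players, v1), shapley(players, v2), shapley(players, v_sum)
print({p: float(x) for p, x in phi1.items()})
print({p: float(x) for p, x in phi_sum.items()})
```

In the first game "a" and "b" split the 100 evenly (symmetry), "c" gets nothing (null player), the shares add up to the grand coalition's value (efficiency), and the shares in the summed game equal the sums of the shares in each game (linearity).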

This is good and fair and probably enough, but I'm not done throwing the book at Howard. No no no.

2 | Weyl & Quadratic Voting

In the first2 of his many papers7 on quadratic mechanisms, Glen Weyl proposes a pragmatic mechanism for democratic reform relative to the current ineffective system of 1-person-1-vote (1p1v). Under 1p1v, each voter receives just a single unit of influence on any collective decision, which prevents Pareto improvements because the degree of preference or knowledge on a given issue cannot be expressed. 1p1v is therefore a low-bandwidth channel of democratic expression. It ignores the fact that some voters are willing to utterly forfeit their voice on some issues to gain influence on others.

The basic idea (which is proved ad nauseam by Weyl et al, and then also generalized to funding mechanisms by Buterin et al) is:

$$\phi = V^2$$

that is, the optimal cost to each voter on a given issue is equal to the number of votes they wish to spend, squared. As we'll see, this mechanism vastly favors breadth of participation over individual depth of contribution. However, QV shines through in its tolerance of several very realistic violations of its assumptions, violations under which several other (arguably more-optimal) voting mechanisms crumble.
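To see the "breadth over depth" claim in numbers, here is a tiny sketch with a toy budget of 100 voice credits (my number, not Weyl's). `vote_cost` and `max_votes` are illustrative helpers, and votes are restricted to whole numbers purely for readability.

```python
from math import isqrt

def vote_cost(votes):
    """Quadratic pricing: casting v votes on one issue costs v**2 credits."""
    return votes ** 2

def max_votes(credits):
    """Most whole votes a single voter can afford on one issue."""
    return isqrt(credits)

budget = 100
# Depth: one voter dumps the whole budget on one issue.
one_whale = max_votes(budget)
# Breadth: the same budget split evenly across 100 voters, one credit each.
hundred_minnows = 100 * max_votes(budget // 100)
print(one_whale, hundred_minnows)  # 10 100
```

The same 100 credits buy ten times as many votes when they are spread across a hundred people, because the marginal cost of one voter's $v$-th vote, $v^2 - (v-1)^2 = 2v - 1$, keeps climbing.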

Weyl's thesis for QV, then, is that the set of robustly optimal Vote Pricing Rules is precisely the set of quadratic rules.


We'll spend a bit more time delving into his model and the assumptions/features that are baked into the nuances, which Buterin appropriates and Freedman also builds on top of.

  • We denote $N$ citizens indexed $i = 1, ..., N$,
  • A set of $R$ binary referenda,
  • Each voter is allocated some finite amount of voter ("voice") credits $v_i$ that they can trade and distribute across a measure $r$,
  • We/Weyl assume that $|R|$ is large enough and that the impact of each individual measure $r \in R$ is sufficiently inconsequential such that each citizen has a quasi-linear continuation value for retaining voice credits for future votes8
  • Assume credits have been initially distributed according to some "fair" mechanism such that maximizing total equivalent continuation value defines social optimality.
    • Shrouded references to Rawls' veil of ignorance abound
  • For some measure $r$, suppose voter $i$ receives $2u_i$ if it passes and $-2u_i$ if it fails (utility values), casting a proportionate amount of votes $v_i$ to support/oppose.
  • The community votes on the referendum, choosing a continuous number of votes $\pm v_i$ depending on support or opposition. A measure passes and is implemented iff $\sum_i v_i \geq 0$, and each voter pays $c(v_i)$ for her vote credits, where $c$ is a differentiable, convex, even, and strictly increasing cost function called the pricing rule.
    • See another one of Weyl's papers on Price Theory for analysis of the distribution of voter beliefs and equilibrium strategy9
  • Assume players weigh the marginal cost of an additional vote against the perceived chance that their potential vote will be pivotal in deciding the outcome of the ballot.
    • All voters agree on the marginal pivotality $p$ of votes $v_r$ for any given issue
    • This assumption implies a rational voter will choose the $v_i$ which maximizes $2u_i p v_i - c(v_i)$
    • Note most voters are not actually rational!10

  • Per the assumption of fair initial distribution conditions –for some arbitrary definition of "fair"– society collectively wishes to implement a measure when $\sum_i u_i \geq 0$

  • A vote pricing mechanism is said to be robustly optimal if, $\forall p > 0, N, \bar{u}$, each price-taking voter $i$ chooses the optimal amount of votes $v_i^*$ s.t. $\sum_i v_i^*$ has the same sign as $\sum_i u_i$

    • Weyl and Lalley prove this in one of the appendices11 which is far more cerebral than the rest of the expressions which appear.12,13 Economists love to bastardize notation and only use "rigorous" definitions when they get to use cool symbols
    • This admittedly seems like a really low bar to cross, but bear in mind that each rational voter also takes into account their own pivotality to the referendum in the context of full set of issues to be voted on, and again the beauty of this paper (similarly to assumptions made by solutions to Byzantine Problem, solved by Strong Eventual Consistency14) is that competing mechanisms fold under even the slightest modifications to the assumptions, whereas QV alone is robust according to the perturbations of Weyl.

QV: The Quadratic Premise

Quadratic functions are the only ones with linear derivatives for which a citizen can equate marginal benefits and costs at a number of votes proportional to her perceived utility gained from the passage of a ballot which said votes are expended upon.15

Consider the class of vote pricing rules:

$$\mathcal C(c) = \{ c(x) = x^a \mid a > 1\}$$

The voter's first-order condition (this is where differentiability of the allegedly-optimal cost function comes in) equates marginal benefit with marginal cost:

$$\begin{aligned} 2pu_i &= a (v_i)^{a-1} \\ &\implies v_i = \text{sign}(u_i)\Big( \frac{2p}{a} \Big)^{\frac{1}{a-1}} |u_i|^{\frac{1}{a-1}} \end{aligned}$$

If $a = 2$, this leads to $v_i^*$ being proportional to $u_i$ and thus robustly optimal:

$$\begin{aligned} 2pu_i &= 2(v_i)^{2-1} \\ &\implies v_i = \text{sign}(u_i)\Big( \frac{2p}{2} \Big)^{\frac{1}{2-1}} |u_i|^{\frac{1}{2-1}} \\ &= \text{sign}(u_i) \, p \, |u_i| \end{aligned}$$

Formally, Weyl's claim is that for all other values of $a$, the optimal number of votes $v_i^*$ cast by citizen $i$ is not proportional to the utility $u_i$ they gain from casting those votes, and thus the costly voting rule will be sub-optimal for some arrangements of social values (preferences), voters, and pivotality $p$. Readers are encouraged to convince themselves of this truth in Desmos.
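For anyone who would rather not open Desmos, here is a rough numerical check. It is only a sketch: the two utility profiles and the pivotality value `p = 0.01` are made up for illustration, and `optimal_votes` simply solves the first-order condition above for a price-taking voter.

```python
def optimal_votes(u, p, a):
    """Votes solving the first-order condition 2*p*u = a*v**(a-1) for cost |v|**a."""
    sign = (u > 0) - (u < 0)
    return sign * (2 * p * abs(u) / a) ** (1 / (a - 1))

def outcome(utilities, p, a):
    """Return (sum of utilities, sum of optimal votes) for a price-taking electorate."""
    return sum(utilities), sum(optimal_votes(u, p, a) for u in utilities)

p = 0.01  # assumed common marginal pivotality
intense_minority = [10] + [-1] * 12   # total utility is -2: should fail
mild_majority = [10, -3, -3, -3]      # total utility is +1: should pass

for a in (1.5, 2, 3):
    for profile in (intense_minority, mild_majority):
        total_u, total_v = outcome(profile, p, a)
        agrees = (total_u > 0) == (total_v > 0)
        print(f"a={a}: sum u = {total_u:+d}, sum v = {total_v:+.4f}, agrees={agrees}")
```

With $a = 1.5$ the single intense voter overrides twelve mild opponents, and with $a = 3$ three lukewarm opponents override a voter who cares more than all of them combined. Only $a = 2$ gets the sign right on both profiles.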

Possible price mechanism parameters fall somewhere on a spectrum between linear cost pricing, $\lim \limits_{a \rightarrow 1}$, and the other extremum of $\lim \limits_{a \rightarrow \infty}$. For the linear case, as $a$ approaches $1$, the power on $|u_i|$ which determines $v_i$ goes to infinity, and so voters with only slightly greater preference values will be infinitely more influential, leading to a dictatorship of the most intense voter.

This pitfall is reflective of the intuitive rationale against vote trading, whereby the most-special interests can capture the whole populace. Thus, Weyl reasons that QV (or any robust voting mechanism, for that matter) should have marginal-at-best incentives for trading.

On the other end of the spectrum, in the case where $a \rightarrow \infty$, $|u_i|^{\frac{1}{a-1}}$ goes to 1 as its power goes to 0, so we end up with 1p1v (monogamous) voting.

QV, then, is the optimal intermediate between the extrema via handwavey invocation of the Central Limit Theorem.
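The two limits are easy to eyeball numerically. Under the power-rule first-order condition, a voter whose $|u_i|$ is twice as large buys $2^{1/(a-1)}$ times as many votes; the helper name `vote_ratio` below is my own, and the sketch assumes every voter faces the same $p$.

```python
def vote_ratio(a, utility_ratio=2.0):
    """How many times more votes a voter buys if their |u| is `utility_ratio` times larger."""
    return utility_ratio ** (1 / (a - 1))

for a in (1.01, 1.5, 2, 3, 10, 1000):
    print(f"a = {a:>7}: vote ratio = {vote_ratio(a):.4g}")
```

Near $a = 1$ the ratio explodes (dictatorship of the most intense voter), at $a = 2$ it is exactly the utility ratio, and for very large $a$ it collapses toward 1, which is effectively 1p1v.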


Under the appropriate conditions, in all symmetric Bayes-Nash equilibria in large populations, the price taking assumption approximately holds for almost all voters whose preferences are drawn i.i.d. from a known distribution of preferences, acting as rational and risk-neutral E.V. maximizers.16 Therefore, welfare losses from QV decay at a rate inversely proportional to the size of the population: welfare loss $\propto \frac{1}{N}$.

In discussion of the pragmatism of QV, Weyl poses a few broader questions about the nature of the question of optimal voting: Are these common assumptions inherent to why QV works? Or can they be relaxed and QV still work...

Fundamentally, QV works because of the following theorem:

$$v = \epsilon u, \; \epsilon \perp u$$

– that in order to be efficient, the number of votes cast needs to be proportional to the utility gained by the measure in question, and that the degree of linear factor relating those two values be independent of utility itself.

This efficiency is only tangentially related to the population size $N$, and rationality. The underlying optimality of QV is invariant to all or most other variables and assumptions which are quickly made in most other propositions.

The proof is like 40 pages of supplemental appendices, and when presenting at the Becker Friedman Institute (where he slayed),17 Weyl underscored that QV's optimality hinges largely on the fair assumption that we have large $N$.

For small $N$, welfare lost by QV relative to the optimum is very small, while the loss under other methods such as 1p1v may easily be 100%. The "optimum" here is the one defined by Vickrey, Clarke, and Groves in the 70s,18 which is extremely sensitive to collusion, even by small groups, and which requires large & highly uncertain real world costs like dollars, rather than voice credits.

Weyl points out that the Federalist papers urge that democracy be a mechanism to ensure maximal utility; an instrument to augment culture, not the reverse we live in today, let alone the serpentine dystopia required to satiate the problematic assumptions of other mechanisms.

The Problematic Assumptions

Next, we'll study how these assumptions and violations thereof affect various models including QV.

The minimal constraints that we're concerned with for QV to be effective are:

  1. Society is not collusive
  2. Voter preferences are IID values drawn from a known distribution
  3. Homo economicus - that voters are perfectly rational and instrumental in their motivations

Immediately, we have complications for there will always be some degree of collusion. One could imagine even encoding more preference over the total continuous utility via votes. That's not even a stretch, that's literally the view of the opposition in politics and your votes express that to others, even with atomic citizens.

As for the idealistic notion that a mechanism operates on complete information about the voter demographics which are furthermore independent and identically distributed about some axes – this would be uncharacteristic of any election.

And finally, the ever-troublesome assumption about rational behavior. Most behaviors are not rational under these various simple mechanisms.

Conceding these points, we can analyze how other mechanisms perform under identical perturbations of these assumptions:

  1. Vickrey-Clarke-Groves yields full efficiency even with finite populations
  2. The Expected-Externality19 mechanism yields full efficiency with budget balance
  3. And with large populations, the simplest solution with these mechanisms is just a costly 1p1v. Meaning, if it's known to everybody what the distribution of values is, we/society/the governing authority can just implement the mean – provided it's non-zero, which is basically efficient for large $N$.
    • (This works because, under this assumption, an infinitesimally small fraction of the population will vote, which proxies QV: conditional on every utility, the number of people that will vote would be the number of votes that someone with that utility would've voted with under QV.)

All of these are either better or simpler than QV, so why pursue Yet Another Voting Mechanism?

Weyl argues that a violation of any of the three basic assumptions unravels any of the other mechanisms whereas QV is more or less invariant to even these base assumptions.

The Other Models (Jay Pow, Nate Silver, and Thomas Jefferson all Wept)20

Let's take a quick peek at how the other mechanisms work to understand why they fail.

1. Vickrey-Clarke-Groves

VCG's mechanism essentially poses the question "How much am I willing to pay to guarantee that Gore gets elected over Bush?"

If the amount that I say ends up changing the outcome (that is, I am the pivotal voter), I have to pay the amount wagered by the opposition (those damn Bush voters).

E.g. if $\sum c(v_B) =$ \$1M vs. $\sum c(v_G) =$ \$500k, then Bush gets elected and I receive \$500k.

Alternatively, if $\sum c(v_B) =$ \$1M vs. $\sum c(v_G) =$ \$1,000,001, then I pay out \$1M to the aggregate opposition.21

The problem: two people can demonstrate adept political acumen and wager \$10 gazillion dollars for Gore, but since neither of the two conspirators is pivotal, neither one is on the hook to pay out the Bush supporters. This is basically how superPACs work IRL btw.

And so this system is highly susceptible to collusion, as it crumbles under even two rational actors who both select $\Big(\sum c(v_{\text{opp}})\Big) + 1$, which is basically not even seditious, it's rational. If more than one person per party has a brain (so, minimum $N = 4$) then both sides can "collude" to run up the cost function of the opposition to infinity without risking any exposure to having to come up with infinity money.
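The collusion hole is easy to reproduce in a few lines. Note the swap: the sketch below uses the textbook Clarke pivot rule for a yes/no measure (a pivotal voter pays the net harm imposed on everyone else) rather than the looser "pay what the opposition wagered" framing above, and the three-voter electorate is hypothetical.

```python
def pivot_mechanism(reports):
    """Clarke pivot rule for a yes/no measure.

    `reports` maps voter -> reported value (positive = wants it to pass).
    Returns (passes, payments).
    """
    total = sum(reports.values())
    passes = total >= 0
    payments = {}
    for voter, report in reports.items():
        others = total - report
        # A voter is pivotal if the outcome would flip without their report.
        pivotal = (others >= 0) != passes
        payments[voter] = abs(others) if pivotal else 0
    return passes, payments

true_values = {"A": 3, "B": 3, "C": -10}      # hypothetical electorate
honest = pivot_mechanism(true_values)
colluding = pivot_mechanism({**true_values, "A": 1000, "B": 1000})
print(honest)
print(colluding)
```

Reporting honestly, the measure fails (total value is $-4$) and C pays 6 for being pivotal. When A and B both wildly overstate, the measure passes, neither of them is pivotal on their own, and nobody pays a cent.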

2. Expected-Externality

The underlying idea behind this mechanism is that the governing authority charges everyone a price equal to the expected value of the payments that they would make under VCG. That expected value is, of course, known by everyone according to the assumption that preferences are IID draws from a known distribution.

Furthermore, in a world with non-zero uncertainty, if the IID distribution is not known, which it's not, no matter how many interactive maps FiveThirtyEight cooks up, then this mechanism is not even defined.

3. Costly 1p1v

Everyone in the contemporary literature pretty much just takes for granted that monogamous voting doesn't work in a world where people have any motivation for voting other than implementation of policy which aligns with their values. It relies on infinitesimally small numbers of people voting, which is only going to happen if the only motivation for voting is to be instrumental. However, in the real world, other motivations do exist (like getting the sticker so that you can be a shit on social media), and still turnout among the VEP (voting-eligible population) is like 66%.22

4. QV supremacy

Weyl's argument is therefore structured around QV being robust when scrutinized against the same criticisms that political economists normally just eat/ignore when theory crafting.

Either the Central Limit Theorem applies, or we can apply laws of large numbers in conjunction with large deviations. These are taken for fact by every serious statistician in like 99% of the literature. That's not to say it's a field lacking rigor, but rather that everyone starts from square 1, rather than square 0 for the sake of brevity in their findings. But because Weyl's argument hinges on the soundness of his argument as it pertains to square 0, he tacked it on as a supplement. Again, see the 40 pages of proof in the appendices of (Weyl, 2017)7 to establish the baseline case.

Calculations are based on the idea that (at least) one of these two statistical approximations holds.

QV vs. Problematic Assumptions

We can gauge QV's robustness by inquiring about how large a conspiracy needs to become in order to impact social efficiency.

  • Collusion: We get together & vote more than we otherwise would unilaterally to take advantage of the fact that I haven't exhaustively tit-for-tatted my way up another voter's quadratic function (as I might under VCG)

  • Fraud: Doing the same as above, as a lone actor by pretending that I'm multiple people in order to distribute the quadratic penalty to get around the fact that QV makes it increasingly costly for an individual to blast off their vote stack against a single referendum.
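The fraud incentive falls straight out of the quadratic price. A minimal sketch, assuming votes can be split evenly and fractionally across fake identities (`cost_with_identities` is an illustrative name, and the 20-vote target is arbitrary):

```python
def cost_with_identities(votes, identities):
    """Total credits paid when `votes` are split evenly across fake identities."""
    per_identity = votes / identities
    return identities * per_identity ** 2

for k in (1, 2, 5, 10):
    print(k, cost_with_identities(20, k))
```

Casting 20 votes costs 400 credits as one honest person but only 40 when spread across ten sock puppets: splitting across $k$ identities cuts the bill by a factor of $k$. Collusion among real voters exploits the same arithmetic.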

Experiments were conducted under a range of assumptions, yielding a combinatorial intersection of validity and soundness. They show that in the worst case, the colluders consist of a subset of the VEP at the bottom (derogatory) of the population and in the tails of the value distribution. These people are the most extreme and therefore contribute the most inefficiency to social welfare as a result of their collusion.

We compare the worst case against:

  • randomly sampled colluders, average Joe Shmo who dabbles in election fraud
  • fraud – which is like when Howard says "actually it's my three votes against your two"

The 2nd dimension of model perturbation is quality of collusion. The three notable degrees of conspiracy are:

  • Perfect and undetected,
  • Imperfect - where conspirators might defect from the collusive agreement and therefore need to be monitored by the group,
  • Perfect, but detectable – collusive efforts might be perfect internally, but nevertheless suspected by the larger voting populace on the whole.

The third dimension is the mean-zero case versus the extremum cases. Elsewhere, Lalley shows that those regimes behave very differently.23 In the $\mu \neq 0$ case, the key threat to democratic efficiency is that a small group, or a single extremist, will buy enough votes to overturn the will of the people. In the $\mu = 0$ case, the threat comes from extremists buying too few votes because it then becomes easier to become an accidental median voter (who otherwise dies in democracy).

So what are the good cases in these perturbations:

  1. All average cases with colluders randomly sampled. It turns out they have low impact
  2. Even if the random sample of colluders happens to be a subset of the worst case extremists from 8chan, so long as society even suspects a possibility of collusion or fraud, it's no big deal: those extremists' participation dramatically increases the likelihood that an election is tied, so rational non-conspirators buy more votes as insurance. Since the number of conspirators is necessarily less than the number of non-conspirators (otherwise they wouldn't be deemed seditious), and the quadratic mechanism favors breadth of participation rather than depth, the non-conspirators run up the linear part of the cost rather than the quadratic part.
  3. In the $\mu = 0$ case, if the conspirators have any internal coordination issues, that removes the possibility of effective collusion. Collusion is not possible unless your firm controls a large share of the market economy of votes, which again they necessarily do not, otherwise it wouldn't be considered a collusive firm, but simply the Republican or Democratic party.

And then the bad cases:

  1. $\mu \neq 0$ case without suspicion. Everyone thinks the chance of a tied election is tiny, so fewer votes are purchased, and even the smallest amount of interference from a collusive group can easily sway the outcome of the election.
  2. The other troublesome case is not intuitive. For $\mu = 0$ elections with perfect internal agreement, a sensitive measure is subject to disproportionate interference from a relatively small group of colluders because there are fewer votes that need to be bought, since everyone else already assumes there's a good chance that they're pivotal.

Voter Motivation and Rationality

Even in highly stylized lab experiments, with small groups of calibrated participants where the chance of being pivotal is higher than it would be for large $N$, people still do not behave rationally, but instead buy way more votes than equilibrium dictates. Despite the overextension, people still vote quite closely to their assigned preferences.

These deviations from optimality are explained by a number of possible factors:

  1. Expressive motive: people gain utility from expressing their preferences proportionately to their value.
  2. Expressive motive (to influence policy tho): The idea that the margin of victory might influence out-of-distribution policy initiatives by conferring some mandate to rule upon the victor, but this dies with large $N$ (whether or not voters realize this is tbd)
  3. Erroneous estimation of pivotal likelihood: this was especially prevalent for small $N$, but this is also just hard to know

Each of these signals is muddied by noise, but the noise is orthogonal to individual voter preference which still correlates to the magnitude of their assigned vote credits.

In the $\mu \neq 0$ case, these irrational behaviors actually help society converge on efficiency by blunting extremists whose preferences run contrary to the will of a public that doesn't vote very much, which would otherwise leave the door open to deep-pocketed psychos. These other motivations to vote –which on paper seem irrational– cause people to vote more, which solves the extremist problem in large $N$ and suppresses it for small $N$ (whereas noise further obliterates voting mechanisms in VCG and expected externality, and 1p1v is already obliterated).

In the μ=0\mu = 0, there is limiting inefficiency caused by the noise. Recycling the "democracy as a market economy" analogy used thus far with feasible behavioral agents, then things can be inefficienct. In a market economy, we might say "let's just ration all goods and disallow trade because these numb-nuts will engage in trade that will actually lead to a scenario that's worse for themselves."24

If the variance of the noise exceeds the variance of the underlying value distribution, then 1p1pv is better. Democratic society tends to prefer markets to rationing though, go figure, because we believe that heterogeneity in preferences is greater than heterogeneity in noise (Rock Flag and Eagle meme).And QV holds under this same assumption too, so Weyl argues that we mustn't throw the bather water out lest we part ways with the baby too.

The last assumption which QV must withstand is that values are IID, so Weyl presents a model where people don't know the exact distribution of values in the population. This also has limiting inefficiency because of the Bayesian underdog effect. E.g. a Mitt Romney supporter who ardently believes that the polls haven't accounted for him and maybe many people like him, thus his vote is "secretly" pivotal, and therefore the election is closer to being tied than pollsters make it out to be. A real "don't get out of line" typa guy.

Conversely, an Obama supporter makes the exact opposite inference, and concludes that the election is a done deal, pack it up, I've seen enough. It's less likely that the vote is tied because they haven't even accounted for my weird preference distribution yet and I'm voting for Obama, and he's ahead anyways, I'll just stay at home.

So, the underdog gets too many votes relative to the expected favorite because of people's estimates of being pivotal. Intuitively, this can't cause much inefficiency because it relies on the underdog remaining the underdog!

This is hard to translate into a formal result though. Experimentally, QV cedes about 4% inefficiency in the calibrated scenario, while 1p1v buckles under 47%.


VCG fails under collusion and fraud, whereas QV is tolerant in the average case. 1p1v fails too, for what it's worth, via vote-buying and coercive tactics.

Voluntary voting is inefficient when voters are subject to motivations other than being instrumental, whereas QV actually converges on efficiency faster in the average case with believably irrational voters.

Expected externality is not even definable outside of a vacuum, and even theoretical constructions are highly sensitive to collusion and fraud.

So, at the very least, QV > 1p1v under all reasonable assumptions. Additionally, it stays realistic under the complexity of real-world constraints: for a large set of specific examples, the mechanism fits reasonably well across constraints under which other mechanisms fail.

3 | Buterin et al.: Quadratic Funding

Quadratic Funding3 (which also features Weyl) extends ideas from Quadratic Voting to a funding mechanism for endogenous community formation. The amount of funding received by a project is proportional to the square of the sum of the square roots of contributions received.

$$\phi = V_i^p \Big(\Big(\sum_j \sqrt{C_j^p}\Big)^2\Big) - C_j^p$$

The effect is similar to QV's which rewards breadth of participation rather than depth.
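To see the breadth-over-depth effect in numbers, here's a minimal Python sketch of the funding rule described above. The function name `qf_funding` and the dollar amounts are made up for illustration; they are not from the paper.

```python
import math


def qf_funding(contributions):
    """Funding level for one project: the square of the sum of square roots."""
    return sum(math.sqrt(c) for c in contributions) ** 2


# The same $100 raised two different ways (hypothetical numbers).
broad = [1.0] * 100  # 100 people chip in $1 each
deep = [100.0]       # 1 person chips in $100

print(qf_funding(broad))  # 10000.0
print(qf_funding(deep))   # 100.0
```

A lone contributor gets no match at all, while the same total spread across 100 people gets funded at 100x.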

The Problem

Simple private contributory systems famously lead to the under-provision of public goods that benefit many people because of the free-rider problem. Conversely, a system based purely on membership or on some other one-person-one-vote (1p1v) system cannot reflect how important various goods are to individuals and will tend to suppress smaller organizations of great value.

This is naively circumvented by e.g. “matching” by some larger institution. E.g. “many corporations use similar rules, matching charitable contributions by all full-time employees up to some annual amount. Doing so amplifies small contributions, incents more contributions and greater diversity in potential contributors, and confers a greater degree of influence on stakeholders in determining ultimate funding allocations.”

“Tax deductibility for charitable contributions is a form of governmental matching”

Unlike the sacrosanctity of 1p1v in democracy, the existence of matching programs in many realms of public goods lends more credence to the admissibility of QF to this domain!


I gloss over the model here and go into greater depth in the recap of Freedman’s summary of the elegance of the solution:

  • $1, \dots, N$ citizens,
  • Public goods $p \in P$ which can be proposed by any citizen at any time
  • $V_i^p(F^p)$ is the currency-equivalent utility citizen $i$ receives if the funding level of good $p$ is $F^p$.
    • The value derived by citizen $i$ from public good $p$ is independent between goods
  • Each citizen $i$ can make contributions $c_i^p$ to the funding of each public good $p$ out of her own pocket
  • The total utility of a citizen $i$ is the sum of utilities across all public goods minus their individual contributions and some tax $t_i$:
$$\sum_p \Big( V_i^p(F^p) - c_i^p \Big) - t_i$$
  • Similarly, total societal welfare is the sum over all public goods and citizens of the utilities $V_i^p$ gained by each citizen were each good to be funded, minus the actual cost $F^p$ of funding that good:
$$\sum_p \Big( \sum_i V_i^p(F^p) - F^p \Big)$$

The optimal mechanism is

$$\phi^{QF}(c_i^p) = \Big\{ \Big( \sum_i \sqrt{c_i^p} \Big)^2 \Big\}_{p \in P}$$

The effect is again an elegant mechanism which democratically rewards breadth of participation over depth.
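A toy example makes the "optimal" claim a bit more tangible. The sketch below assumes log utilities $V_i(F) = a_i \ln F$ with made-up weights `a` (my assumption, not something from Buterin et al), which happen to give closed-form answers. It compares the socially optimal funding level, what purely private contributions would raise, and what QF raises when each citizen selfishly maximizes her own utility.

```python
import math


def qf_funding(contributions):
    return sum(math.sqrt(c) for c in contributions) ** 2


# Assumed utilities: V_i(F) = a_i * ln(F), so V_i'(F) = a_i / F.
a = [1.0, 2.0, 3.0]

# Social optimum: total marginal utility sum_i a_i / F equals 1, so F* = sum(a).
f_star = sum(a)

# Private contributions (F = sum_i c_i): a citizen only pays while a_i / F >= 1,
# so the biggest fan funds the good alone and everyone else free-rides.
f_private = max(a)

# QF: citizen i's first-order condition (a_i / F) * sqrt(F) / sqrt(c_i) = 1
# gives c_i = a_i^2 / F, evaluated here at F = F*.
c_qf = [a_i ** 2 / f_star for a_i in a]
f_qf = qf_funding(c_qf)


def marginal_gain(i, eps=1e-6):
    """Finite-difference derivative of citizen i's utility at c_qf."""
    def utility(c_i):
        c = list(c_qf)
        c[i] = c_i
        return a[i] * math.log(qf_funding(c)) - c_i
    return (utility(c_qf[i] + eps) - utility(c_qf[i] - eps)) / (2 * eps)


print(f_star, f_private, f_qf)
print([round(marginal_gain(i), 6) for i in range(len(a))])
```

Under these assumptions the private system stalls at the largest single valuation, while the selfish QF contributions land on the social optimum and nobody gains by deviating (the marginal gains are approximately zero).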

Play around with it on the website.25

4 | Freedman: Spinoza, Kant, Weyl

Recapitulating the model presented by Buterin et al, Freedman shows via differentiation how QF uniquely satisfies Kant's Categorical Imperative.


Should be familiar by now, but repeated for completeness since these are derivations worth tracing, unlike –for our purposes– Weyl's appendix.

  • Society consists of $N$ well-defined citizens $1, \dots, N$
  • $p \in P$ are public goods requiring funding
  • $V_i^p(F^p)$ is the utility function quantifying the utility that citizen $i$ receives if the funding of $p$ is $F^p$
    • this seems like an ass backwards way to measure total societal welfare, but in fact, out of lots of indirection, smoke and mirrors, emerges beauty
    • Must be smooth, increasing, concave – though simple monotonicity suffices (Freedman includes this as such a cute & meek lil footnote, Weyl provides the appendix gratia, and Buterin et al spend a lot of time in the derivations)
    • All utilities are independent
  • $C, F$ are the vector spaces of contributions and funding, respectively:
    • $\overrightarrow{c} = \{c_i^p\}$ is the vector of individual contributions of the $i$-th citizen towards the $p$-th public good, assumed to be non-negative (no bandits, sry)
    • $\overrightarrow{F}$ is the funding vector with components $F^p$ for each good

The goal of QF is to find a funding mechanism $\phi: C \rightarrow F$ which maximizes the total societal welfare:

$$W = \sum_{i,p} V_i^p(F^p) - \sum_p F^p$$

Here, Freedman hand waves away the externalities of taxation, equity, perception, etc., which are all covered by Buterin et al, who show how they perturb behavior at the extremum (turns out, not a whole lot, inherited from the robustness of QV!).

Taxation $\{t_i\}$, governed by some arbitrary mechanism we don't really care about, is required to balance the budget, constrained by:

$$\sum_i t_i = \sum_p \Big(F^p - \sum_i c_i^p\Big)$$

So, an individual's tax-corrected utility is

$$U_i^t = \sum_p \Big( V_i^p(F^p) - c_i^p \Big) - t_i$$

which is included for completeness, but $t_i$ is just some constant which can effectively be omitted since it's not relevant when differentiating.
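For a feel of what that budget constraint amounts to, here's a small Python sketch that computes the gap between QF funding and raw contributions, i.e. the total that taxes $\sum_i t_i$ would have to cover. The goods, amounts, and function names are hypothetical, and the funding rule plugged in is the QF rule $(\sum_i \sqrt{c_i^p})^2$.

```python
import math


def qf_funding(contributions):
    return sum(math.sqrt(c) for c in contributions) ** 2


def required_taxes(contributions_by_good):
    """sum_p (F^p - sum_i c_i^p): the match that taxation must balance."""
    return sum(
        qf_funding(c) - sum(c) for c in contributions_by_good.values()
    )


# contributions_by_good[p] lists what each citizen gave to good p (made up).
contributions_by_good = {
    "park": [1.0, 4.0, 9.0],  # F = (1 + 2 + 3)^2 = 36, raised 14
    "library": [25.0],        # F = 25, raised 25, so no match needed
}

print(required_taxes(contributions_by_good))  # 22.0
```

Since $(\sum_i \sqrt{c_i})^2 \geq \sum_i c_i$ for non-negative contributions, the required match is never negative.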

Fixing $p$ and differentiating societal welfare $W^p$ w.r.t. $F^p$ shows that, at an interior optimum ($F^p > 0$), the total marginal utility derived from good $p$ should equal 1:

$$\begin{aligned} W^p &= \sum_i V_i^p(F^p) - F^p \\ (W^p)^\prime &= \sum_i (V_i^p)^\prime - 1 = 0 \\ \implies \sum_i (V_i^p)^\prime &= 1 \end{aligned}$$
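As a sanity check on that first-order condition, the sketch below brute-forces the welfare-maximizing funding level for a single good. It assumes log utilities $V_i(F) = a_i \ln F$ with made-up weights (an illustrative choice of mine, not one from either paper), then confirms that total marginal utility is 1 at the maximizer.

```python
import math

a = [1.0, 2.0, 3.0]  # assumed weights in V_i(F) = a_i * ln(F)


def welfare(F):
    """W^p(F) = sum_i V_i^p(F) - F for the assumed log utilities."""
    return sum(a_i * math.log(F) for a_i in a) - F


def total_marginal_utility(F):
    """sum_i V_i'(F) = sum_i a_i / F."""
    return sum(a_i / F for a_i in a)


# Brute-force the maximizer over funding levels 0.01, 0.02, ..., 20.00.
grid = [k / 100 for k in range(1, 2001)]
best_F = max(grid, key=welfare)

print(best_F, total_marginal_utility(best_F))
```

For these utilities the condition $\sum_i a_i / F = 1$ says the optimum sits at $F = \sum_i a_i$, which is where the grid search ends up.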

Individual utility is given by:

$$\begin{aligned} U_i &= \sum_p \Big( V_i^p(F^p) - c_i^p \Big) \\ &= \sum_p \Big[ V_i^p\Big(g\Big(\sum_j h(c_j^p)\Big)\Big) - c_i^p \Big] \end{aligned}$$

which assumes $F^p$ are built from internal analytic functions $h, g$ on the positive reals. $h$ is the weight of a contribution $c$, and $g$ converts total weight into funding. Both $h$ and $g$ scale reciprocally, so funding choices are independent of currency (which seems like a bit of a non-sequitur, but is elucidated a bit more when analyzing $g$).

A refined goal, therefore, becomes to find a funding mechanism $\phi$ composed of $h, g$ s.t. $U_i$ is maximized under the constraint that the funding mechanism is democratic: $F$ is assumed to be symmetric in its $N$ variables, and Freedman also injects his own simplifying linearity constraint that:

$$\phi(\overrightarrow{c}) = \Big\{ F^p(\overrightarrow{c}) \Big\} = \Big\{ g\Big(\sum_i h(c_i^p)\Big) \Big\}$$

We can analyze $g, h$ by partially differentiating $U_i^p$ w.r.t. $c_i^p$ and setting the resultant derivative to 0.

$$\begin{aligned} U_i^p &= V_i^p(F^p) - c_i^p \\ &= V_i^p\Big(g\Big(\sum_j h(c_j^p)\Big)\Big) - c_i^p \\ (U_i^p)' &= \frac{\partial V_i^p}{\partial g} \frac{\partial g}{\partial h(c_i^p)} \frac{d h(c_i^p)}{d c_i^p} - 1 = 0 \end{aligned}$$

$$\frac{\partial V_i^p}{\partial g} \frac{\partial g}{\partial h(c_i^p)} \frac{d h(c_i^p)}{d c_i^p} = 1 \tag{1}$$

Recalling that QF's mechanism $\phi$ is given by:

$$\phi_{c^p}^{QF} = \Big(\sum_i\sqrt{c_i^p}\Big)^2$$

That is, for every good $p$, its level of funding is the square of the sum of the square roots of individual contributions. Here, Freedman explains how Kant's CI ~implies QF:

CI implies that if citizen $j$ deems it proper to perturb her weighted contribution $h(c_j)$, say 1%, she should be following, not her limited self-interest, but be justified in expecting all her peers to also see the virtue of such a similar proportional increase in their weighted contribution—“act ... whereby ... it should become universal law.” So, mathematically we may write:

$$\frac{\partial g\big(\sum_i h(c_i)\big)}{\partial c_j} = \sum_i \frac{h(c_i)}{h(c_j)}$$

Funding should respond to the imputed, community-wide judgment that additional matched resources are required for this good. It follows then, that changes to funding weight impact funding amount. Applying the chain rule to the left-hand side of the previous equation:

$$g'\Big(\sum_i h(c_i)\Big) h'(c_j) = \sum_i \frac{h(c_i)}{h(c_j)} \tag{3}$$

Separating and solving the independent equations comprising (3) for some positive constant $k$, we get:

$$g'(x) = kx, \quad h'(y) = \frac{1}{k\,h(y)}$$

And by integrating (the second relation via $(h^2)' = 2hh' = \frac{2}{k}$), we get:

$$g(x) = \frac{k}{2}x^2 + m, \quad h(y) = \sqrt{\frac{2}{k}y + n}$$

Imposing the reasonable boundary condition $\phi(0) = 0$ (no contributions imply no funding) implies $m = n = 0$, and taking $k = 2$, which gives $F(c) = c$ for a society of $N = 1$, we get:

$$g(x) = x^2, \quad h(y) = \sqrt{y}$$

How quadratic! That was the proof, that's it.
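If you'd rather not take the calculus on faith, here's a quick numerical check (a sketch with made-up contributions and my own helper names) that $g(x) = x^2$, $h(y) = \sqrt{y}$ really does satisfy the Kantian condition $\frac{\partial F}{\partial c_j} = \sum_i \frac{h(c_i)}{h(c_j)}$, using a central finite difference for the left-hand side:

```python
import math


def qf_funding(c):
    """F = g(sum_i h(c_i)) with g(x) = x^2 and h(y) = sqrt(y)."""
    return sum(math.sqrt(x) for x in c) ** 2


def dF_dcj(c, j, eps=1e-6):
    """Central finite difference of F with respect to contribution j."""
    up = list(c)
    up[j] += eps
    down = list(c)
    down[j] -= eps
    return (qf_funding(up) - qf_funding(down)) / (2 * eps)


def kant_rhs(c, j):
    """sum_i h(c_i) / h(c_j) with h = sqrt."""
    return sum(math.sqrt(x) for x in c) / math.sqrt(c[j])


c = [1.0, 4.0, 9.0]  # hypothetical contributions
for j in range(len(c)):
    print(j, dF_dcj(c, j), kant_rhs(c, j))
```

The two columns agree up to finite-difference error, and the smallest contributor moves the funding level the most per marginal dollar.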

We can extremize social utility $U = \sum_i U_i$ by throwing the book (Transcendentals, 5e) at em by tweaking $\phi$ via $h, g$. Rewriting (1) in terms of the partials of $V_i$ w.r.t. $g$, we get:

$$\frac{\partial V_i}{\partial g} = \frac{1}{\frac{\partial g(\Sigma)}{\partial h(c_i)} \frac{dh(c_i)}{dc_i}} = \frac{1 / \frac{dh(c_i)}{dc_i}}{\frac{\partial g(\Sigma)}{\partial h(c_i)}}$$

where $\Sigma$ is shorthand for $\sum_{i=1}^N h(c_i)$, which is just the total weight.

Summing over $i$, applying the optimality condition (the sum equals 1, a constant, so its derivative vanishes), and differentiating w.r.t. any given citizen's contributions (since they're symmetric, any citizen will do, so let's just choose $c_1$), we get:

$$\frac{\partial}{\partial c_1} \Bigg( \sum_{i=1}^N \frac{\partial V_i}{\partial g} \Bigg) = \frac{\partial}{\partial c_1} \Bigg( \sum_{i=1}^N \frac{1 / \frac{dh(c_i)}{dc_i}}{\frac{\partial g(\Sigma)}{\partial h(c_i)}} \Bigg) = 0$$

We can expand the outer partial via the quotient rule $(\frac{u}{v})' = \frac{u'v - uv'}{v^2}$, keeping only the numerator and breaking the first citizen out of the summation:

$$0 = \Big(\frac{1}{h'}\Big)'(c_1) \cdot g'(\Sigma) - \frac{g''(\Sigma)\, h'(c_1)}{h'(c_1)} + 0 - \sum_{i=2}^N \frac{1}{h'(c_i)}\, g''(\Sigma)\, h'(c_1)$$

After collecting terms, we get any of the following, equivalent relations:

$$\begin{aligned} \Big(\frac{1}{h'}\Big)'(c_1) \cdot g'(\Sigma) &= g''(\Sigma)\Big( \sum_{i=1}^N \frac{h'(c_1)}{h'(c_i)} \Big), \\ \frac{g'(\Sigma)}{g''(\Sigma)} &= \frac{\sum_{i=1}^N \frac{h'(c_1)}{h'(c_i)}}{(\frac{1}{h'})'(c_1)}, \\ \big((\log g')'\big)^{-1} &= \frac{h'(c_1)}{-\frac{h''(c_1)}{(h'(c_1))^2}} \Big(\sum_{i=1}^N \frac{1}{h'(c_i)} \Big) \end{aligned}$$

$$\big((\log g')'\big)^{-1} = \Big[ -\frac{(h')^3}{h''}(c_1) \Big]\Big(\sum_{i=1}^N \frac{1}{h'(c_i)}\Big) \tag{13}$$

The factor in the brackets in (13) depends only on our select citizen's contribution $c_1$, whereas

$$\frac{\big((\log g')'\big)^{-1}(\Sigma)}{\sum_{i=1}^N \frac{1}{h'(c_i)}}$$

is symmetric in all $N$ citizens. So, assuming $N > 1$, it's constant! implying

$$(h')^3 = -kh''$$

for some constant $k$. And we can recursively solve this relation via Taylor series expansion around any positive value of the contribution, yielding the following radical solution:

$$h(y) = a\sqrt{y} + b$$

and we know that $b = 0$ per the boundary conditions, so:

$$\frac{1}{h'(c_i)} = \frac{2}{a}\sqrt{c_i}$$

$$\sum_{i=1}^N \frac{1}{h'(c_i)} = \frac{2}{a^2}\Sigma$$

and thus (13) becomes:

$$\frac{g'(\Sigma)}{g''(\Sigma)} = \text{const.}\ \Sigma$$

which, per the Taylor series expansion, yields

$$g(x) = \text{const.}\ x^2 + \text{const.}'$$

with $\text{const.}' = 0$ per the boundary condition, and $\text{const.} = 1$ to match self-funding in the limit of a single, positive contribution.
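The two differential relations are easy to spot-check numerically. This sketch (the scale `a = 1.5`, the sample points, and the helper names are arbitrary choices of mine) uses finite differences to confirm that $h(y) = a\sqrt{y}$ satisfies $(h')^3 = -kh''$ with $k = a^2/2$, and that $g(x) = x^2$ satisfies $g'/g'' \propto x$:

```python
import math

a = 1.5         # arbitrary positive scale for h (assumption)
k = a ** 2 / 2  # the constant that makes (h')^3 = -k h'' hold for this h


def h(y):
    return a * math.sqrt(y)


def g(x):
    return x ** 2


def d1(f, x, eps=1e-4):
    """Central finite-difference first derivative."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def d2(f, x, eps=1e-4):
    """Central finite-difference second derivative."""
    return (f(x + eps) - 2 * f(x) + f(x - eps)) / eps ** 2


def h_ode_sides(y):
    """Both sides of (h')^3 = -k h'' evaluated at y."""
    return d1(h, y) ** 3, -k * d2(h, y)


def g_ratio(x):
    """g'(x) / g''(x), which should be linear in x."""
    return d1(g, x) / d2(g, x)


for y in (0.5, 1.0, 4.0):
    print(y, h_ode_sides(y))
for x in (1.0, 3.0):
    print(x, g_ratio(x))
```

Both sides of the $h$ relation agree at every sample point, and the $g$ ratio comes out equal to $x$ itself.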

Thus, Freedman recovers $g, h \in \text{QF}$:

$$g(x) = x^2, \quad h(y) = \sqrt{y}$$

concluding that:

We can use both reason and love, faith for those who possess it, and the calculus of Leibniz and Newton for those who possess it, to navigate to a fairer, less contentious world.

Such a chad to sign off his paper the way I might've in a bad high school English essay overselling the sub-grandiose point I just made (poorly) in 250 words or less, but when he does it – it's based, actually.


It was never about the voting mechanisms or the cigarettes, but the friendship strengthened along the way 🫶



  1. Kant, Immanuel. "Groundwork of the Metaphysics of Morals." 1785.

  2. Weyl, Glen E. "The robustness of quadratic voting." Public Choice, Vol 172, July 2017.

  3. Buterin, Vitalik, Hitzig, Zoë, and Glen Weyl. "A Flexible Design for Funding Public Goods." arXiv, 16 August 2020.

  4. Freedman, Michael. "Spinoza, Leibniz, Kant, and Weyl." arXiv, 4 July 2022.

  5. the exact numbers are:

    • peter mi / total = 770 mi / 1,109 mi = 0.694
    • trevor mi / total = 1,078 mi / 1,109 mi = 0.972
    • howard mi / total = 1,109 mi / 1,109 mi = 1
  6. Canonically, the good natured disputes originated on the trip to Baton Rouge, where we then left brother Howard and flew back to our respective abodes. However, as pointed out in the ensuing debates over the preprint of this post, that setup where we all share a destination rather than an origin is not actually an instance of the Taxi Problem since the rationale for “tagging along” where someone is already headed does not hold. Notably, this actually makes my whole argument crumble, since I wind up owing the most and my motivations for Shapley values fall apart lol. But what was I going to do, not shitpost 35 pages of LaTeX? No.

  7. Steven P. Lalley and E. Glen Weyl. "Quadratic Voting: How Mechanism Design Can Radicalize Democracy." American Economic Association Vol 108, May 2018.

  8. The thought experiment strictly exists in a pre or post-Brexit universe of measures which would not alter the existence of RR itself

  9. Weyl, Glen E. "Price Theory." Journal of Economic Literature, June 2019.

  10. I'm so irrational it will get me killed

  11. Steven P. Lalley and E. Glen Weyl. An Online Appendix to “Quadratic Voting: How Mechanism Design Can Radicalize Democracy.” American Economic Association, 24 December 2017.

  12. +1 to Weyl for his sick ass outfit

  13. -1 for not open sourcing his paper, that’s not very democratically efficient of you, Glen, but fret not dear reader, I’ve taken liberties

  14. Strong Eventual Consistency

  15. I do love that it assumes a pseudo-rational voter who also doesn't understand non-linear proportionality, especially given the myriad empirical examples of calibrated lab participants demonstrating an utter inability to behave rationally. This is not the right hill to crucify this paper on as it actually aids the efficient convergence of price-taking voters for reasons which Weyl covers later

  16. EV maximizer merch here

  17. The Robustness of Quadratic Voting, Becker Friedman Institute University of Chicago

  18. Groves, Theodore, "Incentives in Teams." Econometrica, Vol 41, July 1973.

  19. Gorelkina, Olga. "The Expected Externality Mechanism in a Level-k Environment." Max Planck Institute for Research on Collective Goods, March 2015.

  20. A coworker said "[so an so] wept" after something inconsequential and now I can't stop saying it

  21. The actual amount is a bit more involved than this, as the winner pays out the per-person social welfare delta, which actually solves the Knapsack problem along the way

  22. Voter Turnout, 2018-2022. Pew Research Center.

  23. Steven P. Lalley and E. Glen Weyl., "Nash Equilbria26 for Quadratic Voting." arXiv, 18 July, 2019.

  24. Maybe I like dirt

  25. wtfisqf

  26. arXiv is also a typo factory it seems