This is a redux of the blog post I wrote on Democracy almost two years ago. The need to write about democracy arose in the context of the #Brexit referendum. The debate around Brexit, and perhaps a second “corrective” referendum, is again at an all-time high. Therefore, revisiting the first principles may be of some use. In this post I’ve tried to distill my own understanding of the concept and have included the results of a numerical experiment I ran to quantify some ideas around it. [Please note that the second half of this blog post is fairly technical]
Democracy is best understood as an algorithm to correct political error. In that respect Democracy belongs to a special class of algorithms, with Darwinian evolution and scientific peer review as other notable members of the same class. The kinship between these disparate processes is not coincidental. The analogy can be explained in terms of Popperian epistemology, also known as the philosophy of Science, which posits the existence and mitigation of error as central to creating new knowledge.
Any discussion of the process of knowledge creation may seem like a digression at this point. However, please persevere as setting this context is important for the central thesis on Democracy. Popper’s epistemology implies that any agent must create knowledge in exactly the following manner: creatively produce guesses or conjectures, and criticize them to remove those that are erroneous. Two immediate corollaries of this theory arise: a) existence of error is a permanent feature of any form of knowledge, i.e. claims of error-free knowledge (perfect revelation etc) are aphysical. b) boundless knowledge-generation must require the ability or enabling culture to make and correct error as fast as possible.
we can afford many mistakes in the search. The main thing is to make them as fast as possible.
The above meta-theory explains why Darwinian evolution works at all – because mutation takes the role of, as it were, guesswork and natural selection acts as the trenchant critic of those guesses, inexorably optimizing on some measure of fitness to local environment. The actual error-correction itself happens at the level of the DNA molecule, which is where the knowledge created by selection pressure is stored for all life on Earth. The same is true of the growth of scientific (or mathematical) knowledge – a result of human creativity generating the conjectures and peer-review providing the criticism. The same predictor-corrector epistemology has been formalised in various successful machine learning algorithms e.g. actor-critic reinforcement learning. In short, Popper compliance is expected of any type of knowledge creator, sentient or otherwise.
Democracy is another avatar of the same underlying idea applied to politics. It does superficially look as if “Democracy” is the answer to the age-old Platonic question: who should rule? This is based on a very pervasive, but mistaken, assumption that the collective will of people is somehow sensible/rational guidance to choose a good leader. In truth, it almost never is due to very low signal-to-noise ratio. The label “Democracy” itself helps to reinforce the confusion further. Nonetheless, equating Democracy with literal rule-of-the-people is a completely mistaken and often dangerous assumption (cf. populism). Democracy works not because popular opinions are better than those of rulers, but because convincing some people of (real/imagined) shortcomings of the ruler is easy.
Human beings are innately risk-averse and familiarity seeking agents. It takes very long for humans to agree what is objectively good for them, even when that good should be obvious (herein lies the root of all tyranny!). Yet in light of the epistemology described above, an “objectively good” political idea has no meaning if the bad ones weren’t tried out and discarded. We have cultural concepts of academic freedom or the freedom of speech (at least functioning in some Western societies) to generate all kinds of good/bad ideas that are then open to scrutiny and review – where both the idea generator and the reviewer agree to abjure violence when playing the actor-critic game. Democracy is the same game to try out political ideas and consign bad ones to the dustbin of history without violence. Note that the act of consigning bad ideas to some “dustbin” is not by fiat. So, binned politics can (and do) get refurbished and replayed. Nonetheless our priors about their badness are updated and their efficacy grows less with each replay (e.g. German or East European Neo-Nazism is a mere shadow of NSDAP’s politics). Also in a functioning democracy, no single person/group actually sits in judgement on what constitutes objectively good or bad politics – though some people (say populist ideologues or utopia-seekers) may think they do. The system on the whole is rigged to be more sensible than the sum of its parts.
In well-designed democratic systems (more on this later) one does not even need to convince the entire population but only a fraction of it (the swing voters), and nor is good reasoning required to convince them. Emotional appeals work just as well. Such a system ensures that any leader, including the worst one (which is what really concerns us) is susceptible to swings of opinion for rational and purely emotional reasons. The better any voting system translates that swing into gain/loss of power, the better it is for hedging against downside political risk. Therefore reducing information asymmetry makes for better democracies, no matter how noisy, chaotic or opinionated people get.
Where probabilities are unknowable (which is often the case in, say, political decisions), it's not the case that reason is ineffective. It merely entails a methodology very different from utility theory—among other things focused on institutional rules not individual decisions.
— David Deutsch (@DavidDeutschOxf) January 4, 2019
The primary function of chosen representatives in a Democracy is not to carry out what some vaguely-defined popular will delegates to them, any more than it is an airline pilot’s job to fly the plane on passengers’ instructions. Nonetheless, populists take the idea of the will-of-the-people seriously and are perfectly happy to crowd-source solutions to intricate questions of constitutional law and political organization via referendums. Referendums are therefore (paradoxically) antidemocratic, because they confound the whole point of the institution of seeking votes from people, namely to remove bad leaders.
If their pointlessness weren’t bad enough, referendums create a terrible precedent in a parliamentary political culture to settle debates by popular voting. They encourage rank populism, incentivize politicians to escape responsibility/blame by outsourcing important decisions to the prevailing whims and fancies of the public, and diminish the historically constructive role of the Parliament. Yet, it seems that the UK is sleepwalking into another disastrous referendum to “correct” the result of the first one. This sort of unbridled populism in the UK (and the West generally) is rather frightening and can cause the unravelling of an almost thousand-year-old political culture with error-correction at its heart.
In the spirit of putting ideas to the test, I created a simulation in Python of a political electoral system in a (hypothetical) country with 500 (this is a model input) constituencies. The toy model is based on the following quasi-realistic assumptions:
- The political spectrum/opinion can be represented on an axis, with normally distributed weights/frequencies corresponding to different intervals. For example, -0.5 to 0.5 is Centrist, 0.5 to 1.5 is Centre-Right and 1.5 and beyond is Right, with symmetric negative intervals for the Left.
- Each constituency is divided into core voters, who are ideologically committed and not swayed by political headwinds, and swing voters, who change opinions based on the political climate. The core opinions are distributed normally across the country on the whole, yet the number of core voters varies randomly (again normally distributed) across constituencies. E.g. some constituencies are traditionally swing constituencies and others are core. The mean percentage of swing voters is a model input (set at 25% based on UK’s example).
- The swing in political opinion is modelled as a mean-reverting process (cf. Ornstein-Uhlenbeck process), which is a mathematical representation of a random quantity that has the property of reverting to its long-term average over some time-scale (another model input). Here the long-term mean of the swing is 0, i.e. if we wait long enough swing voters concur with core voters even though they may drift away in the short-term. The time-scale is generational, i.e. around 7 election time-periods.
- The swing happens similarly across all constituencies, i.e. all constituencies are causally aware of each other (no information asymmetry across constituencies). The swings from election-to-election can be very large or rather tame – a feature controlled by yet another volatility input to the model.
- I assume that the larger the swing, the more concentrated voting is around that opinion. This feature is a simplistic way to represent herd-mentality in swing voters, especially when political swings are extremal. So, a swing of zero is as diffuse as the core-voter political distribution (in the first assumption), but a swing of 2.0 (well into Right-wing territory) implies that swing voters across constituencies will tend to vote right-wing. In mathematical terms, the variance of the swing voter distribution shrinks as the swing magnitude grows.
- Finally, the model is agnostic to actual truth values (assuming such an evaluation is possible at all) of political claims by Leftists or Right-wingers or, for that matter, Centrists. E.g. the political centre of 1930s Weimar Germany was well to the right of modern German politics, even by the most conservative Bavarian-belt standards of today. The assumption here is that whatever the Centre may represent, its relative frequency of core support versus the fringes is a stable normalish distribution. Note that it doesn’t have to be the case and actual distributions may be rather skewed.
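To make the swing assumptions above a little more concrete, here is a minimal sketch of a mean-reverting swing and the herding rule. It is not the original simulation code: the function names, the one-step-per-election Euler discretisation, the default volatility and the specific `1 / (1 + |swing|)` narrowing rule are all assumptions chosen purely for illustration.

```python
import numpy as np

def simulate_swing(n_elections, reversion_time=7.0, volatility=1.0,
                   start=0.0, seed=0):
    """Discretised Ornstein-Uhlenbeck path with a long-term mean of 0.

    One time step per election (an assumed discretisation).
    reversion_time is the mean-reversion time-scale in elections.
    """
    rng = np.random.default_rng(seed)
    theta = 1.0 / reversion_time  # pull-back speed per election
    swing = np.empty(n_elections)
    swing[0] = start
    for t in range(1, n_elections):
        # Drift back towards 0, plus a Gaussian shock scaled by volatility.
        shock = volatility * rng.normal()
        swing[t] = swing[t - 1] - theta * swing[t - 1] + shock
    return swing

def swing_voter_std(swing, base_std=1.0):
    """Spread of swing-voter opinion around the current swing.

    Equals base_std at zero swing (as diffuse as the core voters) and
    narrows as |swing| grows. The functional form is an assumption.
    """
    return base_std / (1.0 + abs(swing))

# Example: 30 elections of national swing and the matching spreads.
path = simulate_swing(30, volatility=0.8, seed=42)
spreads = [swing_voter_std(s) for s in path]
```

Setting `volatility=0.0` with a non-zero `start` shows the mean reversion on its own: the swing simply decays towards zero.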
One of the sanity checks for this toy model for me was to numerically test out Popper’s theory that a democracy with a First Past the Post (FPTP) system is better designed for its political error-correction function, as opposed to systems like Proportional Representation (PR). PR cannot avoid coalitions where fringe parties still form the government (even when the mandate is against them) and can still remain in position to affect government policy by acting as king-makers. In other words, PR is not the best democratic system to remove bad leaders. Turns out the guy was making sense and my numerical experiment, at least, bears this out quite nicely.
The above simulations show how the Left, Right and Centre parties’ seats evolve across elections for Proportional Representation (PR) and First Past the Post (FPTP), where the average swing vote-share is set to 25% and the volatility is high. The results compare the final seats in the two systems for exactly the same voting across all 500 constituencies. It is immediately obvious that under PR, no party ever gets an absolute majority (>250 seats), which implies a polity hobbled with coalition politics for generations. Secondly, signals from voter swings barely register in PR, as opposed to FPTP where the effect is dramatic. This is evident in the results of the last few elections, which represent a clear swing for the Left: an increase of seats from 150 to over 250, whereas the Right is decimated in FPTP. The same swing is also somewhat visible in PR, but the Right still retain around 120 seats, leaving lots of room for a Centre-Right coalition to form the government even though the mandate was to deprive the Right of power. Finally, for most elections where the swings are indecisive, the power remains well with the Centrists with full majority in FPTP, with very little need to share it with the political fringes. In sharp contrast, PR systems tend to minimize centrist power in times of high political volatility, leading to minimal seats for the centrist party in almost all simulated elections.
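For readers who want to see mechanically how the same votes can turn into different seat counts, here is a rough sketch of a single election counted both ways. It is an illustration under stated assumptions, not the code behind the results above: the three-party split at ±0.5, the spread of swing-voter fractions (`frac_std`), the `1 / (1 + |swing|)` herding rule, equal-sized constituencies and the largest-remainder method for PR are all my own hypothetical choices.

```python
import math
import numpy as np

PARTIES = ("Left", "Centre", "Right")

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def band_shares(mean, std, cut=0.5):
    """Vote shares (Left, Centre, Right) for opinions ~ Normal(mean, std).

    Assumed split: Left below -cut, Right above +cut, Centre in between.
    """
    left = normal_cdf((-cut - mean) / std)
    right = 1.0 - normal_cdf((cut - mean) / std)
    return np.array([left, 1.0 - left - right, right])

def election(swing, n_seats=500, mean_swing_frac=0.25, frac_std=0.1, seed=0):
    """Count one election under FPTP and PR from the same votes."""
    rng = np.random.default_rng(seed)
    # Fraction of swing voters per constituency, normally distributed.
    frac = np.clip(rng.normal(mean_swing_frac, frac_std, n_seats), 0.0, 1.0)
    core = band_shares(0.0, 1.0)
    # Herding: swing voters cluster more tightly as |swing| grows.
    swingers = band_shares(swing, 1.0 / (1.0 + abs(swing)))
    # Per-constituency vote shares, shape (n_seats, 3).
    shares = (1.0 - frac)[:, None] * core + frac[:, None] * swingers

    # FPTP: the largest share in each constituency takes the seat.
    fptp = np.bincount(shares.argmax(axis=1), minlength=3)

    # PR: seats follow national vote share (largest-remainder rounding).
    quota = shares.mean(axis=0) * n_seats
    pr = np.floor(quota).astype(int)
    leftover = n_seats - pr.sum()
    biggest_remainders = np.argsort(quota - pr)[::-1][:leftover]
    pr[biggest_remainders] += 1

    return dict(zip(PARTIES, fptp.tolist())), dict(zip(PARTIES, pr.tolist()))

# Example: a strong rightward swing counted under both systems.
fptp_seats, pr_seats = election(swing=2.0)
```

In this sketch a large swing flips most constituencies under FPTP, while the PR seat count moves only as far as the national vote share does. That is the qualitative contrast described above, though the exact numbers depend entirely on the assumed parameters.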
This toy model is quite simplistic and makes some questionable assumptions. But there are a lot of ways to play around with it, not least by extending it to calibrate to real training datasets, i.e. constituency-level historic voting patterns and polling data. I am even tempted to have a stab at it, as most of these data are publicly available for the UK. Not sure how much time I’ll get to devote to this.