A Brief Review of Democratic and Authoritarian Datasets

April 13, 2021

Researchers and practitioners in the natural sciences are blessed with the ability to directly observe many measures of interest, but social and political scientists are cursed with the unobservable: the latent variable. Data products measuring civil liberties and levels of national democracy and authoritarianism have become a core component of social and political science. Comparing democracy and civil liberties at the extremes is simple; France is clearly more democratic than North Korea. Comparing slight deviations across like-minded countries is more nuanced. Is Canada more democratic than Norway? Conversely, is Yemen more authoritarian than Chad? How do these relationships change over time, and what are effective ways to quantify inherently qualitative concepts of freedom, civil liberties, and governance?

Multiple datasets attempt to answer these questions. Some of the most widely used include Varieties of Democracy (Coppedge et al. 2020), Freedom House’s Freedom in the World (Freedom House 2021), the Economist Intelligence Unit’s Democracy Index (Economist Intelligence Unit 2020), and Polity V (Marshall and Gurr 2020). Although all of these datasets grapple with similar concepts, they vary in their transparency of methods, level of aggregation, temporal coverage, frequency of updates, and consistency of data availability across countries and time. Moreover, their definitions of democracy and civil liberties may differ greatly (Cheibub, Gandhi, and Vreeland 2010). The primary disadvantage of all of these indices is a reliance on subjective judgments by teams of experts, which may be exacerbated by vague questions and inconsistent coding between team members (Cheibub, Gandhi, and Vreeland 2010; Michalik 2015; Wig, Hegre, and Regan 2015). When combined with a lack of transparency in methods, this makes assessing uncertainty and making informed comparisons across datasets increasingly difficult.

Freedom House’s Freedom in the World offers a moderate temporal record (1972-) and somewhat disaggregated indicators of specific forms of civil liberties such as the electoral process, government functionality, and rights to assembly. The limited number of indicators constituting its primary composite indices of Civil Liberties and Political Rights makes navigating the raw data easier, but the components are very abstract and some of the methodology is not transparent, which makes interpretation difficult. Similarly, the Polity V dataset takes a moderately disaggregated approach, but its concepts are also very abstract. The primary composite indices for both datasets employ ordinal scoring systems that span 13-point (Freedom House) and 21-point (Polity V) scales. This allows users to make more granular distinctions between levels of democracy across a set of nation-states when compared to datasets that rely on binary flags for autocracies, dictatorships, or democracies. That said, Polity V and Freedom House have some level of obfuscation in their methodologies and do not provide internal measures of inter-coder reliability. The two datasets’ primary metrics are highly correlated (0.88) (Coppedge et al. 2017). Despite this high correlation, Casper and Tufis (2003) reported large shifts in statistical significance when substituting key metrics from Polity, Polyarchy (Vanhanen 2000), and Freedom House indices. Högström (2013) found similar results comparing statistical significance and variable effects when exchanging Polity and Freedom House indices. These studies are now roughly one to two decades old, but users should still proceed with caution before treating the indices as interchangeable.
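To make the comparison between a 13-point and a 21-point scale concrete, the sketch below rescales both onto a common 0-1 range before correlating them. It assumes the combined Freedom House rating runs from 1 (most free) to 7 (least free) in half-point steps and that the Polity score runs from -10 to 10. The scores in the lists are made up purely for illustration and are not real country data, and the function names are hypothetical.

```python
from math import sqrt


def rescale_freedom_house(rating):
    """Map a 1-7 rating (1 = most free) onto 0-1, where 1 = most free."""
    return (7.0 - rating) / 6.0


def rescale_polity(score):
    """Map a -10..10 score onto 0-1, where 1 = most democratic."""
    return (score + 10.0) / 20.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Illustrative, made-up scores for six hypothetical countries.
freedom_house = [1.0, 1.5, 3.0, 4.5, 6.0, 7.0]
polity = [10, 9, 6, 1, -6, -9]

fh_scaled = [rescale_freedom_house(r) for r in freedom_house]
polity_scaled = [rescale_polity(s) for s in polity]

r = pearson(fh_scaled, polity_scaled)
# Largest disagreement for any single country after rescaling.
largest_gap = max(abs(a - b) for a, b in zip(fh_scaled, polity_scaled))

print(f"correlation: {r:.3f}")
print(f"largest per-country gap: {largest_gap:.3f}")
```

Even in this toy example the correlation is very high while individual countries still differ noticeably between the two rescaled measures, which is one way highly correlated indices can produce different results when swapped into the same model.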

Global map of Varieties of Democracy’s 2021 ‘v2x_libdem’ composite metric. Raw V-Dem data is tabular, but is depicted here as a choropleth.

In contrast with Polity V and Freedom House’s Freedom in the World, the Economist Intelligence Unit’s Democracy Index presents a more disaggregated approach to assessing global democracy. The Democracy Index consists of 60 indicators across 5 categories that measure pluralism, civil liberties, and political culture. These indicators are used to rank countries and place them into 4 categorical regime types: full democracies, flawed democracies, hybrid regimes, and authoritarian regimes. The Democracy Index is a more flexible framework due to its higher level of disaggregation; however, it is limited in that it only provides data from 2006 to the present. Moreover, the Democracy Index is behind a paywall and relies heavily on polling data that is highly intermittent for many countries. This introduces bias and limits comparisons across countries (Wig, Hegre, and Regan 2015; Coppedge et al. 2017). Because of its limited temporal coverage, its potential sources of bias, and its development and maintenance by The Economist, the Democracy Index is more widely used in journalistic reporting than in academic research, where long-established datasets like Polity and Freedom House reign supreme. That said, the Democracy Index still maintains a moderate presence across modern academic investigations of nation-state democracy and should not be dismissed outright.
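As a rough illustration of how a continuous index is turned into the four regime types, the sketch below averages five category scores and applies cut-offs. It assumes a 0-10 overall score computed as a simple mean of the five categories, with thresholds at 8, 6, and 4; these reflect my understanding of the EIU's published methodology and should be checked against the current report. The category values are made up and the function names are hypothetical.

```python
def overall_score(category_scores):
    """Average the category scores (each assumed to be on a 0-10 scale)."""
    return sum(category_scores) / len(category_scores)


def classify_regime(score):
    """Assign a regime type from a 0-10 overall score.

    Assumed cut-offs: above 8 is a full democracy, above 6 a flawed
    democracy, above 4 a hybrid regime, and 4 or below authoritarian.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError("score must be between 0 and 10")
    if score > 8.0:
        return "full democracy"
    if score > 6.0:
        return "flawed democracy"
    if score > 4.0:
        return "hybrid regime"
    return "authoritarian regime"


# Made-up category scores for one hypothetical country.
example_categories = [9.0, 7.5, 6.5, 6.0, 8.5]
example_overall = overall_score(example_categories)
print(example_overall, classify_regime(example_overall))
```

Hard cut-offs like these mean that two countries with nearly identical scores can land in different categories, which is worth keeping in mind when using the regime type rather than the underlying score.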

The Varieties of Democracy (V-Dem) dataset puts forth an even greater collection of disaggregated indicators (400+) depicting wide-ranging measures of democracy dating back to 1789. The size of V-Dem’s database is staggering, and even a bit intimidating, but it does offer a handful of high-level composite indices more analogous to Polity and Freedom House. Additionally, the vdemdata R package (V-Dem Institute 2020) provides search functionality and direct access to the most recent version of the database from the V-Dem servers. One point of contrast for V-Dem is that it does not set out to define democracy for the user. Instead, V-Dem sets out to measure 7 core properties of democracy. These include highly disaggregated and clear assessments of electoral, liberal, majoritarian, consensual, participatory, deliberative, and egalitarian properties of democracy (Coppedge et al. 2020). This allows users to select the individual components most important to them and create composite variables that directly relate to their analysis. V-Dem is also transparent in its methodologies. It employs multiple independent coding teams and inter-coder reliability tests, reports confidence intervals for point estimates, and provides access to individual coder-level scoring (Coppedge et al. 2017). V-Dem requires some initial overhead to navigate the database for use in research or production pipelines, but many of my projects have been adopting the database in favor of Polity and Freedom House.
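Building a custom composite from selected components can be as simple as a weighted average over the columns of interest. The sketch below assumes each component is already scaled to 0-1 and uses the column names v2x_polyarchy, v2x_liberal, and v2x_partip, which should be verified against the codebook for the release in use. The record, its values, the weights, and the function name are all hypothetical.

```python
def composite_index(row, weights):
    """Weighted average of the selected component columns in one record."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(row[name] * weight for name, weight in weights.items())
    return weighted_sum / total_weight


# One made-up country-year record; values are illustrative, not real data.
row = {
    "country_name": "Country A",
    "year": 2019,
    "v2x_polyarchy": 0.80,  # electoral component (assumed column name)
    "v2x_liberal": 0.70,    # liberal component (assumed column name)
    "v2x_partip": 0.50,     # participatory component (assumed column name)
}

# Hypothetical weights: the electoral component counts double.
weights = {"v2x_polyarchy": 2.0, "v2x_liberal": 1.0, "v2x_partip": 1.0}

score = composite_index(row, weights)
print(f"{row['country_name']} {row['year']}: {score:.2f}")
```

A plain weighted average is only one possible aggregation rule, and the choice of rule and weights is itself a substantive modeling decision that should be reported alongside any results.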

None of these datasets is unequivocally superior to the others. In some instances, highly focused binary flags for aspects of democracy and governance will be sufficient. In others, more detailed ordinal measures may be required to illuminate nuanced differences across target nations. Likewise, users may have specific requirements for target metrics, temporal coverage, and included nation-states. As always, the analyst must decide what’s appropriate for their intended use case.


Casper, Gretchen, and Claudiu Tufis. 2003. “Correlation Versus Interchangeability: The Limited Robustness of Empirical Findings on Democracy Using Highly Correlated Data Sets.” Political Analysis 11 (2): 196–203. https://www.jstor.org/stable/25791723.
Cheibub, José Antonio, Jennifer Gandhi, and James Raymond Vreeland. 2010. “Democracy and Dictatorship Revisited.” Public Choice 143 (1–2): 67–101. https://doi.org/10.1007/s11127-009-9491-2.
Coppedge, Michael, John Gerring, Carl Henrik Knutsen, Staffan I. Lindberg, Jan Teorell, David Altman, Michael Bernhard, et al. 2020. “V-Dem Codebook V10.” SSRN Scholarly Paper ID 3557877. Rochester, NY: Social Science Research Network. https://doi.org/10.2139/ssrn.3557877.
Coppedge, Michael, John Gerring, Staffan I. Lindberg, Svend-Erik Skaaning, and Jan Teorell. 2017. “V-Dem Comparisons and Contrasts with Other Measurement Projects.” SSRN Scholarly Paper ID 2951014. Rochester, NY: Social Science Research Network. https://doi.org/10.2139/ssrn.2951014.
Economist Intelligence Unit. 2020. “Democracy Index 2019. A Year of Democratic Setbacks and Popular Protest.” https://www.eiu.com/n/campaigns/democracy-index-2019/.
Freedom House. 2021. “Freedom in the World 2021.” Washington DC: Freedom House. https://freedomhouse.org/report/freedom-world.
Högström, John. 2013. “Does the Choice of Democracy Measure Matter? Comparisons Between the Two Leading Democracy Indices, Freedom House and Polity IV.” Government and Opposition 48 (2): 201–21. https://www.jstor.org/stable/26347393.
Marshall, Monty G., and Ted R. Gurr. 2020. “Polity5: Political Regime Characteristics and Transitions, 1800-2018 (Dataset Users’ Manual).” Vienna, VA.
Michalik, Susanne. 2015. “Measuring Authoritarian Regimes with Multiparty Elections.” In Multiparty Elections in Authoritarian Regimes: Explaining Their Introduction and Effects, edited by Susanne Michalik, 33–45. Studien Zur Neuen Politischen Ökonomie. Wiesbaden: Springer Fachmedien. https://doi.org/10.1007/978-3-658-09511-6_3.
Vanhanen, Tatu. 2000. “A New Dataset for Measuring Democracy, 1810-1998.” Journal of Peace Research 37 (2): 251–65. https://doi.org/10.1177/0022343300037002008.
V-Dem Institute. 2020. Vdemdata. V-Dem Institute. https://github.com/vdeminstitute/vdemdata.
Wig, Tore, Håvard Hegre, and Patrick M. Regan. 2015. “Updated Data on Institutions and Elections 1960–2012: Presenting the IAEP Dataset Version 2.0.” Research & Politics 2 (2): 2053168015579120. https://doi.org/10.1177/2053168015579120.