Jury verdicts: evidence from eighteenth century London of the dangers of sequential decision-making

Summary

When juries in criminal courts determine whether a defendant is guilty or not guilty, are their decisions at all influenced by the verdict and characteristics of the previous trial on which they were sitting? In an ideal world, every case should be assessed solely on its own merits, but a new study suggests that there is a strong element of what economists call ‘path dependency’ in jury decision-making: a guilty verdict in one trial raises the likelihood of a guilty verdict in the next one.

The context for this evidence of biases in sequential decision-making is the Central Criminal Court at London’s Old Bailey in the eighteenth century. This was a place of extremely high stakes given the harsh punishments of the time: even offenses that today would be considered minor, such as pickpocketing, were eligible for capital punishment. The main alternative sentence was transportation to the Americas (until the Revolution), and then, on the founding of a new penal colony, to Australia.

The researchers empirically test for biases in sequential decisions using a dataset of more than 27,000 jury verdicts reached by over 900 juries between 1751 and 1808. They leverage a unique feature of the judicial system in England at the time: each jury unanimously decided the verdicts in multiple trials presented to them sequentially.

The analysis shows that a previous guilty verdict increased the chance of a subsequent guilty verdict by between 6.7% and 14.1%. In addition, there was an increased chance of a conviction for a lesser offense if the previous case was capital. It seems that juries may have tried to be internally consistent in their decision-making, particularly when comparing two similar cases one after the other.

Naturally, the context of this study might suggest caution about extrapolating precisely to today. But the fact that the researchers find decision-making biases in these high-stakes situations, with explicit deliberations at the group level, raises the possibility that similar outcomes would be found in lower-stakes settings too. Examples of contemporary situations that involve sequential decision-making include teachers marking examinations, caseworkers deciding on benefit applications, or at the group level, hiring committees, groups of lay judges, or grand juries.

The research also speaks to the policy question of whether juries should decide single cases only or sit for consecutive cases. It has been argued that criminal justice systems in which juries only decide on single cases may lead to incoherent decisions. But the new findings suggest that while sequential jury decisions may overcome that danger, they may also lead to path dependency in convictions, which is likely to be undesirable.

Main article

There is growing evidence of the decisions made by judges and juries being influenced by factors not directly related to the cases under consideration, including media exposure and the demographics of the jury. This column reports on a study focusing on another possible factor: the verdict and characteristics of the previous case. Analyzing data from the Central Criminal Court at London’s Old Bailey in the eighteenth century, the researchers find that a previous guilty verdict increased the chances of a subsequent guilty verdict by between 6.7% and 14.1%. These historical results are relevant to many contemporary situations that involve sequential decision-making.

Every day, people make numerous decisions and judgments. Although these decisions differ in many dimensions – including the degree of deliberation, the number of people involved, and the stakes or consequences – many are sequential in nature.

Research in a wide range of contexts has documented potential biases that can arise with such sequential decision-making. The decision-makers studied include loan officers, asylum court judges, and baseball umpires (Chen et al, 2016), investors (Hartzmark and Shue, 2018), speed dating participants (Bhargava and Fisman, 2014), commuters (Simonsohn, 2006), and home buyers (Simonsohn and Loewenstein, 2006).

Although decisions made by many agents in the criminal justice system – ranging from police officers and prosecutors to judges and juries – are also sequential in nature, much less is known about whether there are biases due to their consecutive ordering. This is particularly important in such a high-stakes environment as the courtroom: judges and/or juries hold the decision to convict or acquit defendants in their hands; and their decisions can have severe consequences, including a criminal record and potentially harsh punishment.

Besides the high-stakes nature of the decision, sequential jury (or judge) decisions are of interest for two further reasons. First, in contrast to job interviewing, for example, where the characteristics of the pool and not just the individual applicant matter, court decisions should be based only on the characteristics and evidence of the case – and they explicitly should not be affected by factors that are extraneous to the case.

Second, although there is growing evidence of the effects on jury decisions of external factors, such as media exposure (Ouss and Philippe, 2018; Lim et al, 2015), jury race, gender, or political affiliation (see Anwar et al, 2012, 2019a, 2019b), and emotional shocks (Eren and Mocan, 2018), the characteristics of previous criminal cases have not been studied.

Do the verdict and characteristics of the previous case matter for jury decisions in criminal courts? In recent work (Bindler and Hjalmarsson, 2019), we test for such biases of sequential decision-making in one of the highest stakes circumstances possible: jury verdicts in the Central Criminal Court at London’s Old Bailey during the eighteenth century.

This period was the height of capital punishment in England – termed the ‘Bloody Code’. Even minor offenses, such as pickpocketing, were considered capital. The primary alternative to capital punishment was transportation to the Americas (until the American Revolution), and then, on the founding of a new penal colony, to Australia.

Compared with previous research, there are two unique features of this environment: the extremely severe nature of the decisions; and the fact that they were made by a group (the jury) rather than by an individual.

The potential for biases in sequential jury decisions

On theoretical grounds, one may expect biases in sequential jury decisions. The ‘gambler’s fallacy’, first discussed by Tversky and Kahneman (1971, 1974), could lead to ‘negative autocorrelation’ in decisions if decision-makers underestimate the chance of a randomly occurring streak. For example, in the context of criminal court verdicts, a jury that has just convicted five defendants in a row may, a priori, expect the next defendant to be innocent.

Such negative autocorrelation may also arise if juries compare case characteristics to ‘benchmark’ their decisions by focusing on differences between cases (‘sequential contrast effects’). If, instead, they focus on similarities, one may expect ‘positive autocorrelation’ in decisions (‘sequential assimilation effects’).

Although these mechanisms may underlie decision-making in many contexts, jury decisions (especially in a setting like the eighteenth century Old Bailey) differ in that the gory details of a case and/or the consequences of the jury’s actions (for example, public executions) can also have an emotional impact on the jury.

These emotional effects may lead to autocorrelation in sequential verdicts that is either negative (due, for example, to ‘moral cleansing’ – see West and Zhong, 2015) or positive (due, for example, to updated beliefs about criminals or to a change in the jury’s mood; see DellaVigna, 2009).

Finally, does the fact that the decision is taken by a group (the jury) instead of an individual change anything? If biases at the individual level simply aggregate to the group level, one would expect them to persist or even be amplified. But if individual biases are cancelled out – for example, through group discussions – biases may disappear.

Jury trials in eighteenth century London

We empirically test for biases in sequential decisions using a dataset of more than 27,000 jury verdicts reached by over 900 juries between 1751 and 1808. We leverage a unique feature of the judicial system in England at the time: each jury (unanimously) decided the verdicts in multiple trials presented to them sequentially.

We extracted the data from a digitized version of The Proceedings of the Old Bailey (Hitchcock et al, 2013). This document, which dates back to 1674 and runs through to the early 1900s, was published after each (monthly) session at the Old Bailey. It includes an account of all criminal cases from London and the surrounding county of Middlesex that went on trial at the Old Bailey.

The information we extracted identifies the unique case, session date, defendant’s name, gender and age (for convicts), as well as the offense (31 categories). The digitized data also contain information about the juries’ verdicts (guilty, guilty of a lesser charge, guilty with recommendation to mercy, acquitted) as well as the judges’ sentence. At this time, the main sentences were death, transportation, and corporal punishment; imprisonment, though recorded, did not become a common sentence until the early 1800s.

While information about the (unique) jury and jurors was reported in The Proceedings of the Old Bailey for these years, it was not tagged in the digital files and so we had to code it manually. This allows us to create unique jury identifiers for all juries during our sample period.

To reduce concerns about finite sample bias, we restrict our analysis to those juries that faced at least 20 trials. On average, juries in our sample each saw 42 trials sequentially over a period of a few days. It was not uncommon for jurors to be called for jury duty multiple times, such that there was typically at least one ‘experienced’ juror on the jury. We use the full names and jurisdictions (London or Middlesex) to create measures of jury experience for each juror.

In our final sample, 29% of defendants were female (a much higher share than seen in the courts today) and approximately 43% of the trials were for capital offenses. The prevalence of capital punishment, even for offenses that would be considered minor today, is apparent in the fact that 84% of the (felony) trials were for property offenses and 8% for violent offenses. There was a guilty verdict in 63% of the trials.

Case order at the Old Bailey

To test for ‘path dependency’ in sequential jury decisions, there are two essential ingredients. First, we must know the actual order of cases as faced by the jury. We use the order of cases as presented in The Proceedings of the Old Bailey and cross-validate this order (out-of-sample) using corresponding records in the Central Criminal Court: Court Books held in the National Archives (London).

Second, a key identifying assumption is that the cases are presented to the jury in an order that is not sorted on unobserved characteristics that predict conviction. Intuitively, if such sorting occurred, then one would expect the verdicts of sequential cases to be positively correlated with each other, due to this common unobserved factor. But the context of the eighteenth century criminal courts makes it unlikely that much information (unobserved by us) could have gone into sorting trials.

Nonetheless, we use a non-parametric runs test to investigate carefully whether sorting of trials can be observed in the data. Specifically, we test whether there are more or fewer streaks in case characteristics (actual data) than would be observed in randomly ordered cases (simulated).

The results show that the vast majority of juries face a set of cases that are randomly ordered in terms of observable case characteristics (offense type, defendant gender, eligibility for capital punishment). These conclusions are essential to interpret our estimates of path dependency in jury verdicts as causal.
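As an illustration only (this is not the code used in the study), the logic of a simulation-based runs test can be sketched in Python. The function names, the number of simulations, and the toy sequence are assumptions made for this sketch: for one jury, a binary case characteristic (say, 1 for a capital case and 0 otherwise) is listed in case order, the number of runs is counted, and that count is compared with the counts obtained from random reorderings of the same cases.

```python
import random

def count_runs(seq):
    """Number of runs: maximal blocks of identical consecutive values."""
    if not seq:
        return 0
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)

def permutation_runs_test(seq, n_sim=2000, seed=0):
    """Compare the observed run count with run counts from random reorderings.

    Returns (observed runs, mean simulated runs, two-sided p-value).
    Hypothetical helper for illustration; n_sim and seed are arbitrary choices.
    """
    rng = random.Random(seed)
    observed = count_runs(seq)
    shuffled = list(seq)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(shuffled)          # one random ordering of the same cases
        sims.append(count_runs(shuffled))
    mean_sim = sum(sims) / n_sim
    # Share of random orderings at least as far from the simulated mean
    # as the observed ordering is.
    extreme = sum(1 for r in sims
                  if abs(r - mean_sim) >= abs(observed - mean_sim))
    return observed, mean_sim, extreme / n_sim

# Toy example: a perfectly sorted docket (all capital cases first).
sorted_docket = [1] * 20 + [0] * 20
observed, expected, p_value = permutation_runs_test(sorted_docket)
print(observed, round(expected, 1), p_value)
```

A sorted docket has far fewer runs than random orderings produce, so the p-value is small; a docket whose run count is close to the simulated mean gives no sign of sorting on that characteristic.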

Path dependency in jury decisions

Simple tabulations of sequential verdicts indicate that a defendant had a 10 percentage point higher chance of being convicted if their case followed a defendant who was convicted (69% conviction rate) versus one who was acquitted (59% conviction rate). Similar raw gaps are seen regardless of whether the case was capital or non-capital, was decided by a London or Middlesex jury, or involved a male or female defendant.
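A tabulation of this kind can be sketched as follows. This is a minimal illustration under assumed inputs, not the study's code: the tuple layout `(jury_id, order, guilty)` and the function name are hypothetical, and the toy records are invented. The key step is that a case is paired only with the case that immediately preceded it before the same jury.

```python
from collections import defaultdict

def conviction_rate_by_previous_verdict(trials):
    """Share of guilty verdicts, split by the same jury's previous verdict.

    trials: iterable of (jury_id, order, guilty) with guilty in {0, 1}.
    Returns {previous_verdict: conviction rate in the following case}.
    """
    by_jury = defaultdict(list)
    for jury_id, order, guilty in trials:
        by_jury[jury_id].append((order, guilty))

    counts = {0: [0, 0], 1: [0, 0]}  # previous verdict -> [guilty, cases]
    for cases in by_jury.values():
        cases.sort()  # put each jury's cases in the order they were heard
        for (_, prev), (_, cur) in zip(cases, cases[1:]):
            counts[prev][0] += cur
            counts[prev][1] += 1
    return {prev: g / n for prev, (g, n) in counts.items() if n}

# Invented toy records for two juries.
toy_trials = [
    ("A", 1, 1), ("A", 2, 1), ("A", 3, 0), ("A", 4, 0),
    ("B", 1, 0), ("B", 2, 1), ("B", 3, 1),
]
rates = conviction_rate_by_previous_verdict(toy_trials)
print(rates[1] - rates[0])  # raw gap in conviction rates
```

The first case heard by each jury has no predecessor and so drops out of the comparison.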

To be able to draw causal conclusions, we move to a more formal regression analysis of within-jury decisions. That is, we regress our dependent variable (a dummy that equals one if a given jury found the defendant of the current case guilty) on the lagged dependent variable (a dummy equal to one if the same jury found the defendant of the previous case guilty), conditional on observable case characteristics.

To account for unobserved heterogeneity across juries and to exploit within-jury variation, we include jury fixed effects. As including them may bias our fixed effects estimator downwards (so that it provides a lower bound on the true effect), we also present the OLS estimate (likely to be biased upwards in our set-up) as an upper bound. We find that alternative dynamic panel estimators generally yield results within these two bounds.
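The contrast between the two estimators can be illustrated with a stripped-down sketch. It is an assumption-laden toy, not the study's specification: it omits the case-level controls, uses invented verdict sequences for two hypothetical juries, and implements the fixed effects estimator simply by subtracting each jury's own means before running the regression.

```python
from collections import defaultdict

def lagged_pairs(verdicts_by_jury):
    """{jury_id: [0/1 verdicts in case order]} -> [(jury_id, previous, current)]."""
    return [(jury, prev, cur)
            for jury, seq in verdicts_by_jury.items()
            for prev, cur in zip(seq, seq[1:])]

def slope(xs, ys):
    """OLS slope of y on x, with an intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def pooled_ols(pairs):
    """Regress the current verdict on the previous one, pooling all juries."""
    return slope([p for _, p, _ in pairs], [c for _, _, c in pairs])

def within_jury(pairs):
    """Jury fixed effects: demean both variables within each jury first."""
    groups = defaultdict(list)
    for jury, prev, cur in pairs:
        groups[jury].append((prev, cur))
    xs, ys = [], []
    for obs in groups.values():
        mean_prev = sum(p for p, _ in obs) / len(obs)
        mean_cur = sum(c for _, c in obs) / len(obs)
        xs += [p - mean_prev for p, _ in obs]
        ys += [c - mean_cur for _, c in obs]
    return slope(xs, ys)

# Invented data: one jury that mostly convicts, one that mostly acquits.
toy = {
    "harsh":   [1, 1, 1, 0, 1, 1, 1, 1],
    "lenient": [0, 0, 0, 1, 0, 0, 0, 0],
}
pairs = lagged_pairs(toy)
print(round(pooled_ols(pairs), 3))   # 0.429
print(round(within_jury(pairs), 3))  # -0.167
```

In this toy example the pooled estimate is positive only because one jury convicts more often than the other; once each jury's own average is removed, the estimate falls. With few cases per jury, the demeaned estimate is itself pushed downwards, which is why the two numbers are best read as bracketing the effect.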

Our main findings are as follows:

  • A previous guilty verdict significantly increased the chance of a subsequent guilty verdict: by between 6.7% and 14.1%.
  • Over and above the positive autocorrelation between verdicts, there was an increased chance of a conviction for a lesser offense if the previous case was capital.
  • The positive autocorrelation is robust to multiple estimation strategies, independent of juror experience, and driven by the most recent cases as well as pairs of similar cases.

Linking our findings back to theoretical considerations of sequential decision-making, our results are consistent with sequential assimilation effects: juries may attempt to be internally consistent in their decision-making and particularly so when comparing two similar cases one after the other.

While this does not rule out alternative explanations for the positive autocorrelation, such as ‘common shocks’ affecting a jury’s mood, these are less likely as we find the effects to persist throughout a session and to be driven by the most recent lag. It is, however, possible that the characteristics of a specific case affect the mood of the jury in the short run – that is, while deciding on the next case.

Such potential ‘emotional’ bias is in line with our finding that a lagged capital case matters over and above the positive autocorrelation between the current verdict and the previous verdict, and with the fact that we find positive autocorrelation for dissimilar cases when a non-capital case follows a capital case.

Contemporary insights

Despite the historical context, our findings are relevant to many situations today that involve sequential decision-making. Examples include teachers marking examinations, caseworkers deciding on benefit applications, or at the group level, hiring committees, groups of lay judges, or grand juries.

Naturally, the context of our study is one of extremely high stakes (given the harsh punishments of the time) and one might want to be cautious about extrapolating precisely to today’s various settings. But the fact that we find decision-making biases in these high-stakes situations, with explicit deliberations at the group (jury) level, raises the possibility that these would be found in lower-stakes settings too.

Furthermore, our study speaks to the policy question of whether juries should decide single cases only or sit for consecutive cases. As raised by Sunstein et al (2002), criminal justice systems in which juries decide on single cases only may lead to incoherent decisions. Our findings suggest that while the opposite case of sequential jury decisions may overcome that issue, it may also lead to path dependency in convictions, which is likely to be undesirable.

This article summarizes ‘Path Dependency in Jury Decision-Making’ by Anna Bindler and Randi Hjalmarsson, published in the Journal of the European Economic Association in December 2019.

Anna Bindler is at the University of Gothenburg. Randi Hjalmarsson is at the University of Gothenburg and CEPR.

Further reading

Anwar, Shamena, Patrick Bayer, and Randi Hjalmarsson (2012) ‘Jury Discrimination in Criminal Trials’, Quarterly Journal of Economics 127(2): 1017-55.

Anwar, Shamena, Patrick Bayer, and Randi Hjalmarsson (2019a) ‘Politics in the Courtroom: Political Ideology and Jury Decision Making’, Journal of the European Economic Association 17(3): 834-75.

Anwar, Shamena, Patrick Bayer, and Randi Hjalmarsson (2019b) ‘A Jury of Her Peers: The Impact of the First Female Jurors on Criminal Convictions’, Economic Journal 129: 603-50.

Bhargava, Saurabh, and Ray Fisman (2014) ‘Contrast Effects in Sequential Decisions: Evidence from Speed Dating’, Review of Economics and Statistics 96(3): 444-57.

Bindler, Anna, and Randi Hjalmarsson (2019) ‘Path Dependency in Jury Decision-Making’, Journal of the European Economic Association 17(6): 1971-2017.

Chen, Daniel L, Tobias J Moskowitz, and Kelly Shue (2016) ‘Decision-Making under the Gambler's Fallacy: Evidence from Asylum Judges, Loan Officers, and Baseball Umpires’, Quarterly Journal of Economics 131(3): 1181-1241.

DellaVigna, Stefano (2009) ‘Psychology and Economics: Evidence from the Field’, Journal of Economic Literature 47(2): 315-72.

Eren, Ozkan, and Naci Mocan (2018) ‘Emotional Judges and Unlucky Juveniles’, American Economic Journal: Applied Economics 10(3): 171-205.

Hartzmark, Samuel M, and Kelly Shue (2018) ‘A Tough Act to Follow: Contrast Effects in Financial Markets’, Journal of Finance 73(4): 1567-1613.

Hitchcock, Tim, Robert Shoemaker, Clive Emsley, Sharon Howard, Jamie McLaughlin et al (2013) ‘The Old Bailey Proceedings Online, 1674–1913’ (www.oldbaileyonline.org version 7.1, retrieved April 2013).

Lim, Claire S, James M Snyder Jr, and David Strömberg (2015) ‘The Judge, the Politician, and the Press: Newspaper Coverage and Criminal Sentencing Across Electoral Systems’, American Economic Journal: Applied Economics 7(4): 103-35.

Ouss, Aurelie, and Arnaud Philippe (2018) ‘No Hatred or Malice, Fear or Affection: Media and Sentencing’, Journal of Political Economy 126(5): 2134-78.

Simonsohn, Uri (2006) ‘New Yorkers Commute More Everywhere: Contrast Effects in the Field’, Review of Economics and Statistics 88: 1-9.

Simonsohn, Uri, and George Loewenstein (2006) ‘Mistake #37: The Effect of Previously Encountered Prices on Current Housing Demand’, Economic Journal 116: 175-99.

Sunstein, Cass R, Daniel Kahneman, Ilana Ritov, and David Schkade (2002) ‘Predictably Incoherent Judgments’, Stanford Law Review 54: 1153-1216.

Tversky, Amos, and Daniel Kahneman (1971) ‘Belief in the Law of Small Numbers’, Psychological Bulletin 76: 105-10.

Tversky, Amos, and Daniel Kahneman (1974) ‘Judgment under Uncertainty: Heuristics and Biases’, Science 185: 1124-31.

West, Colin, and Chen-Bo Zhong (2015) ‘Moral Cleansing’, Current Opinion in Psychology 6: 221-25.