Disaggregating Repression: Identifying Physical Integrity Rights Allegations in Human Rights Reports

Most cross-national human rights datasets rely on human coding to produce yearly, country-level indicators of state human rights practices. Hand-coding the documents that contain the information on which these scores are based is tedious and time-consuming, but has been viewed as necessary given the complexity and detail of the information contained in the text. However, advances in automated text analysis have the potential to streamline this process without sacrificing accuracy. In this research note, we take the first step in creating this streamlined process by employing a supervised machine learning automated coding method that extracts specific allegations of physical integrity rights violations from the original text of country reports on human rights. This method produces a dataset including 163,512 unique abuse allegations in 196 countries between 1999 and 2016. The dataset and method will assist researchers of physical integrity rights abuse by allowing them to produce allegation-level human rights measures that have not previously existed, and they provide a jumping-off point for future projects aimed at using supervised machine learning to create global human rights metrics.

Cordell, Rebecca et al. (2022). Disaggregating Repression: Identifying Physical Integrity Rights Allegations in Human Rights Reports. International Studies Quarterly. https://doi.org/10.1093/isq/sqac016
