Download the SHRUG

Cite the data you use, and contribute back what you can.

Contributing to SHRUG

What is the SHRUG?

The SHRUG is an easily linkable dataset covering a wide range of socioeconomic variables in India. Some highlights:

  • Open-source geometries at the village- and town-level based on 2011 Census polygons.
  • Socioeconomic data covering a huge range of dimensions, all linked with the same identifiers.
  • All data is available at the subdistrict, district, constituency, town, and village level.
  • The only large-scale socioeconomic data at the level of assembly constituencies.
  • Everything merges cleanly and easily.

The essential unit of observation in the SHRUG is the shrid, a village or town unit whose boundaries have been consistent since 1991. We put in a ton of work to reconcile boundary changes across censuses. Villages and towns merge and split; shrids are aggregated units that keep the same boundaries across all periods. You can read more about how we did this in the SHRUG paper.

The variables are grouped into different modules based on the data source and subject matter, and each module can be independently downloaded from links provided below. Each download link contains flat data tables at all available geographic levels (e.g. village, town, constituency). You can merge at any geographic level you wish using the appropriate key in the core keys module. Many SHRUG variables can be interactively visualized using the SHRUG Atlas.
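As a rough illustration of merging through a key table, the sketch below joins a module table to a keys table on a shared identifier. The file contents and column names (`shrid_id`, `district_id`, `population`) are hypothetical placeholders, not the actual SHRUG schema, so check the downloaded files for the real names. It uses only the Python standard library; a data-frame library's merge function would serve the same purpose.

```python
import csv
import io

# Hypothetical stand-ins for two downloaded flat tables. The column names
# (shrid_id, district_id, population) are illustrative assumptions only.
KEYS_CSV = """shrid_id,district_id
s001,d01
s002,d01
s003,d02
"""

MODULE_CSV = """shrid_id,population
s001,1200
s002,800
s004,950
"""


def read_rows(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_on_key(module_rows, keys_rows, key):
    """Left-join module rows to key rows on `key`.

    Returns the merged rows and the key values that found no match,
    so a failed link is reported instead of being dropped silently.
    """
    # Index the key table by identifier; refuse duplicates, since a
    # one-to-one link is assumed here.
    lookup = {}
    for row in keys_rows:
        if row[key] in lookup:
            raise ValueError(f"duplicate key in key table: {row[key]}")
        lookup[row[key]] = row

    merged, unmatched = [], []
    for row in module_rows:
        match = lookup.get(row[key])
        if match is None:
            unmatched.append(row[key])
            continue
        merged.append({**match, **row})
    return merged, unmatched


merged, unmatched = merge_on_key(
    read_rows(MODULE_CSV), read_rows(KEYS_CSV), key="shrid_id"
)
for row in merged:
    print(row)
print("unmatched:", unmatched)
```

Tracking the unmatched identifiers is a deliberate choice: it makes it obvious when two tables come from releases whose identifiers do not line up.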

If you are using an earlier version, we recommend downloading version 2.0 or newer. Shrid IDs in v1.5 (Samosa) do not match those in v2.0 (Pakora), and many other improvements have been made.

For more information, please see:

Bugs, feedback, or requests? Let us know!

Visit the SHRUG GitHub repository for release notes and to file a bug report or feature request. For older versions of the data, visit the Harvard Dataverse. Ignore the version header shown on the Harvard Dataverse; it is Harvard’s own numbering system.

By accepting data provided by the Development Data Lab (DDL), you agree to the following conditions of release and acknowledge the following disclaimers: Neither DDL nor the contributors of data to DDL shall be held liable for any improper or incorrect use or application of the data provided, and assume no responsibility for the use or application of the data or information derived from interpretation of the data. In no event shall DDL or its collaborators be liable for any direct, indirect, or incidental damages arising from the use or application of these data. This disclaimer of liability applies to any damages or injury, including but not limited to those caused by any failure of performance, error, omission, defect, delay in operation or transmission, computer virus, alteration, use, application, analysis, or interpretation of data. No warranty, expressed or implied, is made regarding the accuracy, adequacy, completeness, reliability, or usefulness of any data provided. These data are provided on an “as is” basis. All warranties of any kind, express or implied, including but not limited to fitness for a particular use, freedom from computer viruses, and non-infringement of proprietary rights, are disclaimed. Data are added and changed periodically, and data may become out-of-date quickly. It is recommended that the user not let a significant period of time elapse between obtaining and using the data.

© Development Data Lab, 2024
