Open-source COVID-19 data to parameterize epidemiological models, assess tradeoffs between policy alternatives, and track the spread of the disease in India.

Subdistrict-level hospital and clinic beds, population age distributions and pre-existing conditions (soon), urbanization, population density, infections and deaths, with new fields added daily.

Download the data or explore the documentation.

The COVID-19 pandemic has generated an enormous demand for data in India and around the world. Policymakers responding to the crisis need data on the spread of the disease, the healthcare resources at their disposal and the economic hardships being faced by the population. Epidemiologists need data to parameterize their models and estimate the dangers. Economists need data to assess the tradeoffs between policy alternatives.

DDL COVID India comprises an administrative data backbone with local estimates of health system capacity and local economic conditions. All data will be provided with consistent location identifiers (state, district, etc.) so that datasets can be easily merged and analyzed together. We will supplement what is here with additional information such as real-time COVID-19 case data and further demographic and economic data; we have identified many variables and data sources for inclusion, but are limited by manpower.
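As a rough illustration of why consistent location identifiers matter, the sketch below joins two tables on a shared key and derives a per-capita measure. The column names (`state_id`, `district_id`, `public_beds`, `population`), the helper `merge_on_location`, and all values are hypothetical placeholders, not the dataset's actual schema; the real variable names are given in the documentation. It also assumes each location appears at most once in the right-hand table.

```python
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, object]


def merge_on_location(
    left: Iterable[Row],
    right: Iterable[Row],
    keys: Tuple[str, ...] = ("state_id", "district_id"),  # hypothetical key names
) -> List[Row]:
    """Inner-join two tables (lists of dicts) on shared location identifiers."""
    # Index the right-hand table by its location key for constant-time lookups.
    # Assumes each location key appears at most once in `right`.
    right_index = {tuple(row[k] for k in keys): row for row in right}
    merged = []
    for row in left:
        match = right_index.get(tuple(row[k] for k in keys))
        if match is not None:
            # Keep only locations present in both tables.
            merged.append({**row, **match})
    return merged


# Made-up rows for illustration only; these are not values from the dataset.
hospitals = [
    {"state_id": "01", "district_id": "001", "public_beds": 120},
    {"state_id": "01", "district_id": "002", "public_beds": 45},
]
demographics = [
    {"state_id": "01", "district_id": "001", "population": 250000},
    {"state_id": "02", "district_id": "001", "population": 90000},
]

combined = merge_on_location(hospitals, demographics)
for row in combined:
    row["beds_per_1000"] = 1000 * row["public_beds"] / row["population"]
print(combined)
```

Because both tables carry the same identifiers, no name matching or manual lookup is needed; only the location present in both inputs survives the join.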

Moving forward, this effort will integrate three types of data in a common geographic frame: (i) baseline data on health, economic conditions, demographics, and state capacity; (ii) surveys collecting data on the rapidly evolving conditions on the ground; and (iii) real-time data on cases, policy responses, etc.

For the source code, please see the GitHub repository.

To contribute your data, please get in touch, and view the format requirements here.

Interactive Plots

Hospital beds: per 1000 people

Estimated Case Fatality Rate Distribution: based on age distribution alone

Agricultural commodity market volumes and COVID-19 case count: across 3000 mandis

What's in the data?

| Data | Description | Geographic Level |
| --- | --- | --- |
| Public Hospital Capacity | Facilities, doctors, and beds. Sources: 2011 Population Census and DLHS-4 (2012-14). | District, Subdistrict (PC only) |
| Private Hospitals | Public and private hospital employment from the 2013 Economic Census. Private system beds can be estimated from public employment:bed ratios. | District |
| Predicted COVID-19 mortality rates | Predictions based strictly on local age distributions, which create substantial risk differences across locations. | Subdistrict |
| District correspondences | Keys linking current districts to 2011 Population Census districts, which are the basis of many datasets. | District |
| Agricultural commodities | Quantity of arrivals, and maximum, minimum, and modal prices traded for a specific commodity. | Mandi (Market) |
| Migration | District-level short-term and long-term in- and out-migration. | District |
| Demographics | 2011 Population Census (most recent): population, density, literacy rate, urbanization. | State / District / Subdistrict / Town / Village |
| HMIS | Child immunisations, maternal health, hospitalisations, and lab testing, among others, across all years of available data. | District |
| NFHS | Fourth round of the National Family Health Survey (NFHS-IV) with district-level health infrastructure. | District |
| Air Pollution | Average ground-level fine particulate matter (PM2.5), with dust and sea salt removed, for 2016. | District |
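Two of the derived quantities in the table lend themselves to a short worked example: a mortality prediction driven only by the local age distribution, and private beds estimated from the public employment:bed ratio. The sketch below shows one plausible way to compute each. It is not the project's actual model; the function names, age bins, and every number are illustrative placeholders rather than epidemiological estimates or values from the dataset.

```python
from typing import Dict


def age_weighted_fatality_rate(
    age_shares: Dict[str, float],
    fatality_by_age: Dict[str, float],
) -> float:
    """Average the age-specific fatality rates, weighted by population share."""
    total = sum(age_shares.values())  # normalise in case shares do not sum to 1
    return sum(
        share / total * fatality_by_age[age] for age, share in age_shares.items()
    )


def estimate_private_beds(
    public_beds: float,
    public_employment: float,
    private_employment: float,
) -> float:
    """Apply the public beds-per-worker ratio to private hospital employment."""
    beds_per_worker = public_beds / public_employment
    return private_employment * beds_per_worker


# Placeholder age-specific fatality rates (NOT real estimates).
placeholder_rates = {"0-39": 0.001, "40-59": 0.01, "60+": 0.05}

# Two hypothetical subdistricts that differ only in age structure.
younger_place = {"0-39": 0.7, "40-59": 0.2, "60+": 0.1}
older_place = {"0-39": 0.5, "40-59": 0.3, "60+": 0.2}

rate_younger = age_weighted_fatality_rate(younger_place, placeholder_rates)
rate_older = age_weighted_fatality_rate(older_place, placeholder_rates)

# Hypothetical district: 200 public beds, 400 public and 100 private workers.
private_beds = estimate_private_beds(200, 400, 100)

print(rate_younger, rate_older, private_beds)
```

Holding the age-specific rates fixed, the location with the older population gets the higher predicted rate, which is the sense in which age distributions alone create risk differences across locations. The bed estimate assumes private facilities have the same beds-per-worker ratio as public ones.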

While some of the input data cannot be shared in its raw format, all processing and data construction steps are reproducible. You can also directly access the data files here:

Metadata describing each component of the DDL COVID India dataset and how it was built are available in the GitHub repository.


Development Data Lab works with governments and private firms to generate bespoke insights from our data platform or your own data. For more information, send us an email.