The IndiGen Genome Project is a whole-genome sequencing programme of India's Council of Scientific and Industrial Research (CSIR), formally launched in April 2019 and announced as completed in October 2019. It was executed jointly by two CSIR laboratories, the Institute of Genomics and Integrative Biology (CSIR-IGIB) in Delhi and the Centre for Cellular and Molecular Biology (CSIR-CCMB) in Hyderabad, under the council's mission-mode framework. The scientific rationale rests on the observation that global genomic resources, from the original Human Genome Project assembly (2003) to variant databases such as gnomAD and the 1000 Genomes Project, under-represent South Asian genetic diversity. India's population, structured by endogamy, geography, and the practice of consanguineous marriage in some communities, harbours founder mutations and disease-allele frequencies that are poorly captured by datasets built mainly from non-Indian cohorts. IndiGen was conceived to supply an indigenous baseline of genetic variation as a public-health and precision-medicine resource.
Procedurally, the project recruited 1,008 self-declared healthy adult volunteers drawn from across India, with substantial outreach through college and university campuses to span the country's linguistic and geographic spread. Participants provided informed consent and a blood sample; DNA was extracted and subjected to whole-genome sequencing, in which the entire roughly three-billion base-pair genome is read, rather than only protein-coding regions. The raw sequence reads were aligned to a human reference genome, variants were called and annotated, and quality-control filters were applied to distinguish genuine variants from sequencing artefacts. The aggregated, de-identified variant data were then compiled into a searchable resource. CSIR-IGIB released the IndiGenomes database, a public web portal cataloguing the genetic variants identified, their frequencies, and their clinical annotations, enabling clinicians and researchers to query whether a given variant is common or rare in the sampled Indian cohort.
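The "common or rare" query described above comes down to an allele-frequency calculation over the cohort's genotype counts. The following is a minimal sketch of that arithmetic, not the IndiGenomes portal's actual implementation: the class and function names, the genotype counts, and the 1 percent rarity cutoff are all illustrative assumptions. Only the cohort size of 1,008 comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenotypeCounts:
    """Hypothetical per-variant genotype tally for a diploid cohort."""
    hom_ref: int  # individuals carrying two reference alleles
    het: int      # individuals carrying one alternate allele
    hom_alt: int  # individuals carrying two alternate alleles


def allele_frequency(counts: GenotypeCounts) -> float:
    """Alternate-allele frequency: alternate alleles / total alleles."""
    individuals = counts.hom_ref + counts.het + counts.hom_alt
    if individuals == 0:
        raise ValueError("no genotyped individuals")
    alt_alleles = counts.het + 2 * counts.hom_alt
    return alt_alleles / (2 * individuals)


def classify(frequency: float, rare_threshold: float = 0.01) -> str:
    """Label a variant; the 1% cutoff is an assumed convention."""
    if frequency == 0:
        return "absent"
    return "rare" if frequency < rare_threshold else "common"


# Illustrative counts that sum to the 1,008-person cohort.
example = GenotypeCounts(hom_ref=990, het=17, hom_alt=1)
freq = allele_frequency(example)
print(f"{freq:.4f} {classify(freq)}")  # prints "0.0094 rare"
```

Because each person carries two copies of an autosomal locus, the denominator is twice the number of individuals (2,016 alleles for 1,008 people). A real pipeline would also need to handle missing genotypes and sex chromosomes, which this sketch ignores.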
A distinctive deliverable of the project was the return of individual results to participants through a "genomics for public health" model. Volunteers received a personalised report and, via a companion mobile application, could access information on their genetic predispositions, carrier status for recessive conditions, and pharmacogenomic markers—variants that influence how an individual metabolises specific drugs. This emphasis on pharmacogenomics reflected one of IndiGen's stated translational aims: reducing adverse drug reactions and tailoring dosage by identifying actionable variants in the population. The project also served as a proof-of-concept for scaling genomic infrastructure, validating the sequencing pipeline, data-storage architecture, and bioinformatic capacity that larger national programmes would subsequently require.
IndiGen sits within a wider Indian genomics landscape that took shape around the same period. In January 2020 the Department of Biotechnology (DBT) launched the far larger Genome India Project (GIP), coordinated by the Centre for Brain Research at the Indian Institute of Science (IISc), Bengaluru, with a consortium of around twenty institutions. GIP set out to sequence 10,000 genomes spanning roughly 99 population groups, and completion of that initial target was announced in early 2024. The Ministry of Science and Technology framed these programmes as foundational to India's biotechnology ambitions, including the BioE3 policy direction announced in 2024. State-level efforts, such as Andhra Pradesh's regional genome initiatives, complemented the national picture.
IndiGen must be distinguished from the Genome India Project, the adjacent concept most frequently conflated with it. IndiGen is a CSIR programme of 1,008 genomes intended primarily as a rapid, scalable demonstration and a clinical-variant baseline; Genome India is a DBT-led, multi-institution effort an order of magnitude larger, designed to build a comprehensive reference panel of Indian genetic diversity. IndiGen should also not be confused with whole-exome sequencing, which reads only the coding fraction (about 1–2 percent) of the genome, nor with the global 1000 Genomes Project, an international consortium whose South Asian sample was comparatively thin. Each programme answers a different scale of question, but together they aim to correct the structural under-representation of Indian populations in world genomic reference data.
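The scale differences in this comparison can be checked with simple arithmetic. The sketch below uses only figures quoted in the text (a genome of roughly three billion base pairs, a coding fraction of 1 to 2 percent, and cohorts of 1,008 and 10,000); the function and constant names are illustrative, and the results are rough orders of magnitude rather than measured values.

```python
GENOME_BP = 3_000_000_000  # approximate genome size quoted in the text
INDIGEN_GENOMES = 1_008
GENOME_INDIA_TARGET = 10_000


def coding_span_bp(genome_bp: int, percent: int) -> int:
    """Base pairs covered when only `percent` of the genome is read."""
    return genome_bp * percent // 100  # integer maths avoids float rounding


low = coding_span_bp(GENOME_BP, 1)
high = coding_span_bp(GENOME_BP, 2)
print(f"exome span: {low:,} to {high:,} bp")  # 30,000,000 to 60,000,000 bp

ratio = GENOME_INDIA_TARGET / INDIGEN_GENOMES
print(f"cohort ratio: {ratio:.1f}x")  # prints "cohort ratio: 9.9x"
```

On these figures an exome covers tens of millions of base pairs against billions for a whole genome, and the Genome India target is about ten times the IndiGen cohort, which is the "order of magnitude" gap described above.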
The project also surfaced the controversies that attend population genomics in any jurisdiction. Concerns include the adequacy of informed consent for the secondary research use of genetic data, the security and sovereignty of genomic information, the risk of genetic discrimination by insurers or employers, and the ethical handling of incidental findings disclosed to healthy volunteers. India's data-protection framework matured only later, with the Digital Personal Data Protection Act, 2023, leaving early genomics projects to operate under institutional ethics committees and ICMR's ethical guidelines rather than a dedicated genetic-data statute. Questions of equitable benefit-sharing—whether communities whose DNA enriches a database see returns in the form of affordable diagnostics—remain unresolved policy debates.
For the working practitioner, policy analyst, or civil-services aspirant, IndiGen is significant as the demonstrator that operationalised India's entry into population-scale genomics and seeded the human capital and infrastructure later leveraged by Genome India. It is frequently cited in UPSC General Studies Paper III in the context of biotechnology, indigenous scientific capacity, and precision medicine, and it illustrates a recurring governance theme: that scientific capability now outpaces the legal and ethical frameworks meant to regulate genetic data. Understanding the distinction between IndiGen and its larger successor, and the data-sovereignty questions both raise, is essential for anyone tracking India's biotechnology policy and its intersection with privacy law and public health.
Example
In October 2019, CSIR announced the completion of the IndiGen Genome Project, sequencing the whole genomes of 1,008 healthy Indian volunteers across CSIR-IGIB, Delhi, and CSIR-CCMB, Hyderabad.
Frequently asked questions
How does IndiGen differ from the Genome India Project?
IndiGen is a CSIR initiative that sequenced 1,008 genomes in 2019 as a scalable demonstration and clinical-variant baseline. The Genome India Project, launched by the Department of Biotechnology in 2020 and coordinated through IISc Bengaluru, is far larger, targeting 10,000-plus genomes across roughly 99 population groups to build a comprehensive Indian reference panel.