raysas/genetic-linkage-gwas

group project - application of family-based and population-based genetic analyses to both a monogenic disease (Clouston disease) and a complex multifactorial disease (Rheumatoid Arthritis)

NGS Linkage and GWAS Analysis

This repository contains the analyses performed as part of a Next-Generation Sequencing (NGS) methodologies project, focusing on the genetic investigation of two diseases with distinct genetic architectures:

  • Clouston disease (monogenic, autosomal dominant)
  • Rheumatoid arthritis (complex, multifactorial)

The project combines genetic linkage analysis and association studies using standard statistical genetics tools.

tools

  • R (paramlink, qqman)
  • PLINK
  • MERLIN
  • FBAT

๐Ÿ“ repository structure

Simplified structure for project submission; the report is in report/.

.
├── code/
│   └── main.ipynb
├── assets/
├── report/
│   └── NGS_Methodologies_Report.pdf
└── README.md

Project Overview

Clouston Disease (Monogenic Disorder)

For Clouston disease, family-based analyses were performed to identify the disease-associated locus.

1. Genetic Linkage Analysis (LOD score)

  • Parametric linkage analysis using the LOD score method
  • Analysis conducted on chromosome 13 using microsatellite markers
  • Evaluation of different recombination fractions (θ values)
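
To illustrate how a LOD score depends on θ, here is a minimal Python sketch for the simplest textbook case: phase-known, fully informative meioses, where the likelihood is θ^R (1 − θ)^(N − R) compared against θ = 0.5. It is not the pedigree-likelihood calculation that paramlink performs in the project, and the counts (0 recombinants out of 10 meioses) are hypothetical values chosen only for illustration.

```python
import math


def lod_score(theta, recombinants, non_recombinants):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( theta^R * (1 - theta)^(N - R) / 0.5^N )
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    n = recombinants + non_recombinants
    log_l_theta = 0.0
    if recombinants:
        if theta == 0.0:
            # Any observed recombinant is impossible under complete linkage.
            return float("-inf")
        log_l_theta += recombinants * math.log10(theta)
    if non_recombinants:
        log_l_theta += non_recombinants * math.log10(1.0 - theta)
    # Likelihood under the null hypothesis of no linkage (theta = 0.5).
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical data: 10 informative meioses, none recombinant.
thetas = [0.0, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
curve = {t: lod_score(t, recombinants=0, non_recombinants=10) for t in thetas}
best_theta = max(curve, key=curve.get)

for t in thetas:
    print(f"theta={t:.2f}  LOD={curve[t]:.3f}")
print("maximum LOD at theta =", best_theta)
```

With no recombinants the curve peaks at θ = 0 and falls to 0 at θ = 0.5, which is the general shape expected for a marker tightly linked to the disease locus.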

Example family structure used in the analysis:

Pedigree example for Clouston disease

Figure 1: Example pedigree used for linkage analysis. Affected individuals are shown in black.

LOD score curve for a representative marker:

LOD score curve

Figure 2: LOD score as a function of recombination fraction (θ). The maximum LOD score is observed at θ ≈ 0.

2. Familial Association Analysis (TDT)

Following linkage analysis, candidate variants within the GJB6 gene were tested using the Transmission Disequilibrium Test (TDT).

TDT results for GJB6 SNPs

Figure 3: TDT results for SNPs within the GJB6 gene. SNP rs76179836 shows significant association.
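
For readers unfamiliar with the TDT, the statistic can be sketched in a few lines of Python. It compares how often heterozygous parents transmit the tested allele (b) versus the other allele (c) to affected offspring, using (b − c)² / (b + c) as a chi-square with 1 degree of freedom. The counts below are hypothetical and are not the project's results; the function name is also only illustrative.

```python
import math


def tdt(transmitted, untransmitted):
    """Transmission Disequilibrium Test from heterozygous-parent counts.

    transmitted:   times the tested allele was passed to an affected child
    untransmitted: times the other allele was passed instead
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts for one SNP.
chi2, p = tdt(transmitted=30, untransmitted=10)
print(f"chi2={chi2:.2f}  p={p:.4g}")
```

Under the null hypothesis of no linkage or association, each allele is transmitted with probability 0.5, so b and c are expected to be roughly equal.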

Linkage disequilibrium verification using Ensembl:

LD plot around rs76179836

Figure 4: Linkage disequilibrium (r²) between rs76179836 and surrounding variants, indicating the SNP is likely causal.
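
As a reminder of what the r² values in such a plot measure, the following Python sketch computes r² for two biallelic variants from haplotype and allele frequencies, using D = p_AB − p_A p_B and r² = D² / (p_A (1 − p_A) p_B (1 − p_B)). The frequencies are made-up illustrations, not values retrieved from Ensembl.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic variants.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first variant
    p_b:  frequency of allele B at the second variant
    """
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("both variants must be polymorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


# Hypothetical frequencies.
print(ld_r2(p_ab=0.30, p_a=0.30, p_b=0.30))  # alleles always travel together
print(ld_r2(p_ab=0.06, p_a=0.20, p_b=0.30))  # alleles are independent
```

Values near 1 mean the two variants carry almost the same information, while values near 0 mean they are close to independent.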

Rheumatoid Arthritis (Complex Disease)

For Rheumatoid Arthritis (RA), both linkage and genome-wide association analyses were performed.

1. Non-Parametric Linkage Analysis (sib-pairs)

  • Analysis conducted using MERLIN
  • Identification of suggestive and significant linkage regions
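
The idea behind affected sib-pair linkage can be sketched with the classical mean IBD-sharing test. Under no linkage, a sib pair shares 0, 1, or 2 alleles identical by descent with probabilities 1/4, 1/2, and 1/4, so the shared proportion has mean 0.5 and variance 1/8. The Python sketch below assumes IBD states are known exactly and uses the common conversion LOD = Z² / (2 ln 10). MERLIN instead estimates sharing from multipoint marker data and applies its own scoring statistics, so this is an illustration of the principle rather than a reimplementation. The pair counts are hypothetical.

```python
import math


def sib_pair_mean_test(ibd_counts):
    """Mean IBD-sharing test for affected sib pairs at one locus.

    ibd_counts: one entry per pair, the number of alleles (0, 1, or 2)
    shared identical by descent. Returns (mean proportion, Z, LOD).
    """
    n = len(ibd_counts)
    if n == 0:
        raise ValueError("at least one sib pair is required")
    if any(k not in (0, 1, 2) for k in ibd_counts):
        raise ValueError("IBD counts must be 0, 1, or 2")
    mean_share = sum(k / 2.0 for k in ibd_counts) / n
    # Null distribution of the mean: expectation 0.5, variance 1 / (8 n).
    z = (mean_share - 0.5) / math.sqrt(1.0 / (8.0 * n))
    # One-sided: only excess sharing counts as evidence for linkage.
    lod = z * z / (2.0 * math.log(10)) if z > 0 else 0.0
    return mean_share, z, lod


# Hypothetical data: 100 affected sib pairs with excess sharing.
pairs = [0] * 15 + [1] * 45 + [2] * 40
mean_share, z, lod = sib_pair_mean_test(pairs)
print(f"mean sharing={mean_share:.3f}  Z={z:.2f}  LOD={lod:.2f}")
```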

LOD score plot for chromosome X:

NPL linkage plot chromosome X

Figure 5: Non-parametric linkage (NPL) analysis for chromosome X. The red line indicates the significance threshold.

LOD score plot for chromosome 6:

NPL linkage plot chromosome 6

Figure 6: Suggestive linkage region on chromosome 6 around 50–53 cM.

2. Genome-Wide Association Study (GWAS)

  • GWAS performed using PLINK
  • Allelic and genotypic association tests
  • Bonferroni correction applied for multiple testing
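
The steps listed above can be illustrated with a small Python sketch of a single-SNP allelic test followed by a Bonferroni threshold. It is a simplified stand-in for what PLINK computes across all markers, not PLINK's code. The allele counts, the number of tested SNPs, and the 0.05 significance level are hypothetical values used only for the example.

```python
import math


def allelic_chi2(case_a1, case_a2, ctrl_a1, ctrl_a2):
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    Rows are cases and controls; columns are allele 1 and allele 2.
    Returns (chi-square statistic, p-value).
    """
    n = case_a1 + case_a2 + ctrl_a1 + ctrl_a2
    row_cases, row_ctrls = case_a1 + case_a2, ctrl_a1 + ctrl_a2
    col_a1, col_a2 = case_a1 + ctrl_a1, case_a2 + ctrl_a2
    denom = row_cases * row_ctrls * col_a1 * col_a2
    if denom == 0:
        raise ValueError("table has an empty row or column")
    chi2 = n * (case_a1 * ctrl_a2 - case_a2 * ctrl_a1) ** 2 / denom
    # Survival function of a chi-square with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


# Hypothetical allele counts for one SNP and a hypothetical number of SNPs.
chi2, p = allelic_chi2(case_a1=60, case_a2=40, ctrl_a1=40, ctrl_a2=60)
threshold = bonferroni_threshold(alpha=0.05, n_tests=500_000)
print(f"chi2={chi2:.2f}  p={p:.4g}  -log10(p)={-math.log10(p):.2f}")
print("passes Bonferroni:", p < threshold)
```

The genotypic test follows the same logic on a 2x3 table of genotype counts with 2 degrees of freedom. A nominally small p-value such as the one in this example can still fall well short of the corrected threshold once hundreds of thousands of SNPs are tested.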

Manhattan plots of GWAS results:

GWAS Manhattan plots

Figure 7: Manhattan plots for allelic (left) and genotypic (right) association tests. No variants surpass the Bonferroni threshold.
