How to read a Manhattan plot

That "gap" or "dip" in the middle of Chromosome 1 (and of other chromosomes on the plot) is where the centromere is located. Here is a breakdown of why it looks that way on a Manhattan plot:

1. The Centromere is "Gene-Poor"
The centromere is made of highly repetitive DNA sequences (satellite DNA). Unlike the "arms" of the chromosome, it contains very few actual genes. Each point on a Manhattan plot is a SNP (Single Nucleotide Polymorphism) tested for association with a trait or disease, and there are simply far fewer SNPs that can be measured in that region.

2. Difficulty in Sequencing
Because the centromere is so repetitive, it is very difficult for standard short-read sequencing to "read" the region and map the reads to a unique position. Imagine trying to assemble a puzzle where 1,000 pieces are exactly the same shade of blue: that is what the centromere is like for a computer. Most researchers filter out these regions because the data there is often unreliable or "noisy."

3. Biological "Silent Zones"
In genome-wide association studies (GWAS), we test sites where DNA varies between people for association with a trait. The centromere is a structural anchor used for cell division (it is where spindle fibers attach). With few genes and few reliably measured variants there, it does not produce the "skyscraper" peaks of association that you see in the gene-rich arms.

4. Physical Gap in the Reference Genome
In older versions of the human reference genome (like hg19), these centromeric regions were often represented as runs of "N" (unknown bases) or large gaps because the repetitive sequences could not yet be bridged. No variants can be placed there, which leaves a literal empty space on the X-axis of the graph.

Some chromosomes (like 13, 14, and 15) have even larger gaps at the very beginning of their plots.

https://www.youtube.com/watch?v=OezG73EN_kg
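The empty stretch on the X-axis and the "N" runs from point 4 can be illustrated with a minimal Python sketch. The function names (`largest_gap`, `n_runs`) and the toy data are hypothetical and made up for illustration; they are not real chromosome coordinates and not part of any GWAS tool.

```python
import re


def largest_gap(positions):
    """Return (start, end) of the widest stretch with no tested variants."""
    ordered = sorted(positions)
    if len(ordered) < 2:
        raise ValueError("need at least two positions")
    # Pair each position with its right-hand neighbour and keep the widest pair.
    return max(zip(ordered, ordered[1:]), key=lambda pair: pair[1] - pair[0])


def n_runs(sequence, min_length=1):
    """Return 0-based, half-open (start, end) spans of consecutive 'N' bases."""
    return [
        match.span()
        for match in re.finditer(r"N+", sequence.upper())
        if match.end() - match.start() >= min_length
    ]


# Hypothetical toy data: SNP positions on one chromosome, with nothing
# measured between 40 and 90 (standing in for the centromere).
toy_positions = [10, 20, 30, 40, 90, 100, 110]
print(largest_gap(toy_positions))  # (40, 90)

# Hypothetical toy reference sequence: unknown bases are stored as 'N'.
toy_reference = "ACGTNNNNNNACGT"
print(n_runs(toy_reference))  # [(4, 10)]
```

On real data, the widest gap between neighbouring SNP positions on a chromosome would be expected to sit over the centromere, and in an older reference build it should overlap a long run of "N" bases.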

Channel: Nikolay's Genetics Lessons

