Principal Investigator

Ellen McRae Greytak, Ph.D.
Parabon NanoLabs

Background

While a number of risk factors associated with Alzheimer's Disease have been identified, such as the E4 allele of the Apolipoprotein E (APOE) gene, much of the underlying genetics of this terrible disease remains a mystery. We and others hypothesize that this is because Alzheimer's Disease is caused not by single genes, but by interactions among many genes (also known as epistasis). Studies of candidate genes have confirmed that there are many significant interactions among genes that affect an individual's probability of developing the disease. Here, we extend this work to examine epistasis on a genome-wide scale.

For this project, we are using the wealth of data made available by the Alzheimer's Disease Neuroimaging Initiative (ADNI). The ADNI team has collected detailed information about the precise changes that occur in Alzheimer's patients' brains, alongside in-depth genetic data, allowing us to search for associations between the two. With this data, we are able to look at not just the outcome (affected vs. unaffected), but also many of the specific changes that take place during disease development (also known as endophenotypes). These include specific measures of cognitive decline, atrophy of particular regions of the brain, and levels of proteins that accumulate during disease progression, such as amyloid beta and tau.

The challenge of searching for combinations of genetic markers

[Figure: Example SNP interaction. Comparison of interactions between multiple SNPs.]

Previously, it has been impossible to study interactions among genetic markers known as single-nucleotide polymorphisms (SNPs) at a genome-wide scale because of the large number of calculations required. For 1 million SNPs, a typical number for this kind of study, there are nearly 500 billion possible two-SNP combinations and more than 10^17 three-SNP combinations. The significant interactions may involve even more SNPs, and the number of possible combinations grows explosively with each SNP added.
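The counts quoted above follow from the standard "N choose k" formula. The short Python sketch below reproduces them; the variable names are illustrative only, and the four-SNP case is included purely to show how quickly the numbers grow.

```python
from math import comb

N_SNPS = 1_000_000  # study size quoted in the text

# Number of distinct k-SNP combinations: "N choose k".
counts = {k: comb(N_SNPS, k) for k in (2, 3, 4)}

for k, n in counts.items():
    print(f"{k}-SNP combinations: {n:.2e}")
```

For 1 million SNPs this gives 499,999,500,000 two-SNP combinations (nearly 500 billion) and about 1.67 x 10^17 three-SNP combinations, consistent with the figures above.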

Even with high-powered supercomputers running for hundreds of years, it is impossible to exhaustively test that many combinations. This is why previous studies have been limited to individual candidate genes. Here, we are applying newly developed search algorithms across the whole genome to examine far more SNP combinations than has previously been possible, with the goal of discovering new interactions that affect the development of Alzheimer's Disease.

Distributed computing and evolutionary search

With funding from the National Institutes of Health (NIH) Small Business Innovation Research (SBIR) program, we have implemented sophisticated new algorithms in our DNA search application, Parabon Crush. We are now able to undertake this work using an evolutionary search algorithm that explores the space of possible interactions selectively, rather than testing every single combination of SNPs. This drastically reduces the amount of time needed to find significant results. Nevertheless, the space of possible SNP combinations is vast, and the quality of our findings improves with each additional computer on which we can execute Crush tasks.
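To give a feel for what an evolutionary search over SNP combinations can look like, here is a minimal Python sketch. It is not the algorithm used in Parabon Crush, whose details are not described here. Everything in it is an assumption made for illustration: the toy random genotypes, the phenotype defined as an AND of two "causal" SNPs, the use of mutual information as the fitness score, and the function names make_toy_data, mutual_information, mutate and evolve.

```python
import random
from collections import Counter
from math import log2

def make_toy_data(n_people=300, n_snps=50, causal=(3, 17), seed=0):
    """Random 0/1 genotypes; the toy phenotype is 1 only if both causal SNPs are 1."""
    rng = random.Random(seed)
    genotypes = [[rng.randint(0, 1) for _ in range(n_snps)] for _ in range(n_people)]
    phenotype = [row[causal[0]] & row[causal[1]] for row in genotypes]
    return genotypes, phenotype

def mutual_information(combo, genotypes, phenotype):
    """Fitness: information (in bits) the joint genotype of `combo` carries about the phenotype."""
    n = len(phenotype)
    joint, geno, pheno = Counter(), Counter(), Counter()
    for row, y in zip(genotypes, phenotype):
        key = tuple(row[i] for i in combo)
        joint[(key, y)] += 1
        geno[key] += 1
        pheno[y] += 1
    return sum((c / n) * log2(c * n / (geno[key] * pheno[y]))
               for (key, y), c in joint.items())

def mutate(combo, n_snps, rng):
    """Swap one SNP in the combination for a SNP not already in it."""
    combo = list(combo)
    pos = rng.randrange(len(combo))
    combo[pos] = rng.choice([s for s in range(n_snps) if s not in combo])
    return tuple(sorted(combo))

def evolve(genotypes, phenotype, k=2, pop_size=30, generations=40, seed=1):
    """Keep the fittest third of each generation and refill with mutated copies."""
    rng = random.Random(seed)
    n_snps = len(genotypes[0])

    def score(combo):
        return mutual_information(combo, genotypes, phenotype)

    population = [tuple(sorted(rng.sample(range(n_snps), k))) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(set(population), key=score, reverse=True)
        survivors = ranked[:pop_size // 3]
        children = [mutate(rng.choice(survivors), n_snps, rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = max(population, key=score)
    return best, score(best)

genotypes, phenotype = make_toy_data()
best, best_score = evolve(genotypes, phenotype)
print("best combination found:", best, "score:", round(best_score, 3))
```

The search scores only a few hundred of the 1,225 possible pairs in this toy data set, and it is a heuristic: it is not guaranteed to recover the causal pair. The point of the sketch is the shape of the loop (score, select, mutate, repeat), which avoids enumerating every combination.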

If you are volunteering computation to this project, we greatly appreciate your help! With each additional machine connected to the Compute Against Alzheimer's Disease program, we are able to examine millions more SNP combinations to find those that are most predictive of one's risk for Alzheimer's Disease, bringing us closer to the development of diagnostic tests, interventions, treatments and someday a cure.