On assigning individuals from cryptic population structures to optimal predicted subpopulations: An empirical evaluation of non-parametric population structure analysis techniques

Conference proceedings article




Publication Details

Author list: Deejai P., Assawamakin A., Wangkumhang P., Poomputsa K., Tongsima S.

Publisher: Springer Verlag (Germany): Computer Proceedings

Publication year: 2010

Volume number: 115 CCIS

Start page: 58

End page: 70

Number of pages: 13

ISBN: 3642167497; 9783642167492

ISSN: 1865-0929

URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-78649501046&doi=10.1007%2f978-3-642-16750-8_6&partnerID=40&md5=b9c78d055fec446c275a65f0dd159bfc

Languages: English-Great Britain (EN-GB)




Abstract

Many algorithms have been proposed to analyze population structure from the single nucleotide polymorphism (SNP) genotyping data of a set of individuals and to assign those individuals to genetically similar groups. These algorithms fall into two computational paradigms: parametric and non-parametric approaches. Although the parametric approach is the gold standard for population structure analysis, the computational burden of running these algorithms is unacceptable for large, complex datasets. As genotyping platforms incorporate more SNPs, analyzing ever larger and more complex datasets is becoming standard practice. Hence, computationally efficient non-parametric methods are needed to reveal the population structure in genotypic datasets. In this study, we evaluated two leading non-parametric population structure analysis techniques, ipPCA and AWclust, on their ability to characterize the genetic diversity and population structure of two complex SNP genotype datasets (with as many as 243,855 SNPs). Head-to-head comparisons were conducted on two major aspects: the ability to infer the number of genetically related subpopulations (K) and the ability to correctly assign individuals to these subpopulations. The experimental results suggested that AWclust could be more suitable when applied to a small, less complex dataset. With a large and more complex dataset, however, ipPCA is a much better choice, yielding higher accuracy in assigning genetically similar individuals to the inferred groups. © 2010 Springer-Verlag Berlin Heidelberg.


Keywords

non-parametric-based method; parametric-based method; Population genetic; population genetic structure


Last updated on 2023-03-10 at 07:35