Issue:ISSN 1000-7083
          CN 51-1193/Q
Director:Sichuan Association for Science and Technology
Sponsored by:Sichuan Society of Zoologists; Chengdu Giant Panda Breeding Research Foundation; Sichuan Association of Wildlife Conservation; Sichuan University
Address:College of Life Sciences, Sichuan University, No.29, Wangjiang Road, Chengdu, Sichuan Province, 610064, China
Fax:+86-28-85410485

Clustering mitochondrial DNA sequences experienced tandem duplication based on alignment-free comparison in the frog Quasipaa boulengeri
Author of the article:Yue Cao, Yun Xia, Yuchi Zheng
Author's Workplace:Department of Herpetology, Chengdu Institute of Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences
Key Words:Quasipaa boulengeri; mitochondrial DNA; alignment-free comparison; clustering; duplication region; Robinson-Foulds distance; protein-coding gene; maximum likelihood tree
Abstract:Animal mitochondrial genome regions that have experienced tandem duplication followed by random loss are often hypervariable and hence challenging for alignment algorithms. In theory, alignment-free comparison methods (AFM) can be used to summarize and visually present the relationships and similarities among such sequences. To our knowledge, however, relevant evaluations and applications are lacking. We evaluated three types of commonly used k-mer-based AFM using a system of intraspecific sequence variation in one such region, located around the origin of light-strand replication. From the frog species Quasipaa boulengeri (Dicroglossidae), 19 sequences ranging from 583 to 695 bp were clustered using 28 AFM. For each method, substring lengths of k = 4, 6, 8, 10, 12, 14, 16, 18, and 20 bp were tried. From the same individuals, mitochondrial protein-coding sequences 1518 bp in length were used to reconstruct a maximum likelihood tree as the reference topology. The Robinson-Foulds distance between the reference topology and each AFM topology was calculated, and the major topological differences were recorded. With a k value of typically eight, half of the methods produced a tree that differed from the reference tree by only two nodes (11.8%). However, some methods consistently performed poorly. A small k value of four was found to be suitable for inferring the relationships among sequence groups. These findings support the successful application of AFM to animal mitochondrial tandem duplication regions. The combinations of methods and k values that performed well here may be applied to similar systems. For different systems, similar evaluations will be helpful.
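As a rough illustration of what a k-mer-based alignment-free comparison involves, the sketch below builds relative k-mer frequency profiles and computes pairwise Euclidean distances between them. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not name the 28 methods evaluated, so the Euclidean distance is only one common choice, and the function names (`kmer_profile`, `euclidean_distance`, `distance_matrix`) and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def kmer_profile(seq, k):
    """Return the relative frequency of each overlapping k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def euclidean_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles.

    A k-mer absent from a profile is treated as having frequency 0.
    """
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys))


def distance_matrix(seqs, k):
    """Pairwise distances for a dict of name -> sequence, keyed by name pairs."""
    profiles = {name: kmer_profile(s, k) for name, s in seqs.items()}
    return {
        (a, b): euclidean_distance(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Toy sequences (illustrative only, not data from the study).
toy = {
    "s1": "ACGTACGTACGTAAGT",
    "s2": "ACGTACGTACGTAAGT",
    "s3": "TTTTGGGGCCCCAAAA",
}
d = distance_matrix(toy, k=4)
```

A distance matrix of this kind could then be passed to a clustering or tree-building step, and the resulting topology compared with a reference tree, as the abstract describes.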
CopyRight©2018 Editorial Office of Sichuan Journal of Zoology