Genomic enrichment methods and next-generation sequencing produce uneven coverage for the portions of the genome (the loci) they target; this information is essential for ascertaining the suitability of each locus for further analysis. lociNGS is a user-friendly accessory program that takes multi-FASTA formatted loci, next-generation sequence alignments and demographic data as input and collates, displays and outputs information about the data. Summary information includes coverage per locus, coverage per individual and number of polymorphic sites, among other parameters. The program can output the raw sequences used to call loci from next-generation sequencing data. lociNGS also reformats subsets of loci in three commonly used formats for multi-locus phylogeographic and population genetics analyses: NEXUS, IMa2 and Migrate. lociNGS is available at and is dependent on installation of MongoDB (freely available at ). lociNGS is written in Python and is supported on MacOSX and Unix; it is distributed under a GNU General Public License.

Citation: Hird SM (2012) lociNGS: A Lightweight Alternative for Assessing Suitability of Next-Generation Loci for Evolutionary Analysis.

University of California Riverside, United States of America

Received: J; Accepted: Septem; Published: October 10, 2012

Copyright: © Hird. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was supported by Google Inc. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The author was funded by Google Inc. through the 2011 Google Summer of Code™ Program ( ). This does not alter the author's adherence to all the PLOS ONE policies on sharing data and materials.

To apply the immense sequencing capabilities of next-generation sequencing (NGS) technologies to population-level questions (i.e., those that require multi-locus, multi-individual data), genome enrichment methods are frequently employed. These methods aim to sample the genome at a reproducible subset of markers that can be obtained from many individuals and reduced to genotype (i.e., a set of phased alleles). Examples of these methods include amplicon sequencing, RAD-tags, complexity reduction of polymorphic sequences (or CRoPS) and sequence capture; for a review of NGS methods suitable for multi-locus studies, see. Genome enrichment methods often utilize a known or constructed reference to ease alignment of sequencing reads. Genotypes can then be called from the alignments using a variety of bioinformatics methods (e.g., ). This results in next-generation alignments to a reference and a set of loci for the individuals in the study; the loci can then be used in standard phylogeographic, phylogenetic or population genetic studies or other multi-locus analyses (e.g., ).

Prior to analysis, however, researchers must determine which loci are suitable for the questions being asked by assessing key parameters such as coverage and number of polymorphic sites, or whether all populations are represented. Current NGS file types are efficient at manipulating and storing alignment data, but the parameters of interest are difficult to extract and can require custom bioinformatics scripts. Additionally, these file types are not usable in downstream analyses.
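
To make the per-locus parameters concrete, the following is a minimal sketch of how the number of sequences and the number of polymorphic sites might be computed for a single multi-FASTA locus. It is not taken from lociNGS: the function names (`read_fasta`, `count_polymorphic_sites`, `summarize_locus`) and the example sequences are hypothetical, and the sketch assumes the sequences in a locus are already aligned to equal length and that only unambiguous bases (A, C, G, T) count toward polymorphism.

```python
from io import StringIO


def read_fasta(handle):
    """Return a list of (name, sequence) tuples from a FASTA-formatted handle."""
    records, name, chunks = [], None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def count_polymorphic_sites(sequences):
    """Count alignment columns holding more than one distinct base.

    Assumption of this sketch: gaps and ambiguity codes (e.g. '-' or 'N')
    are ignored, so only A, C, G and T are compared.
    """
    if len({len(seq) for seq in sequences}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    count = 0
    for column in zip(*sequences):
        bases = {base for base in column if base in "ACGT"}
        if len(bases) > 1:
            count += 1
    return count


def summarize_locus(handle):
    """Collect simple summary parameters for one multi-FASTA locus."""
    records = read_fasta(handle)
    sequences = [seq for _, seq in records]
    return {
        "n_sequences": len(records),
        "length": len(sequences[0]) if sequences else 0,
        "polymorphic_sites": count_polymorphic_sites(sequences),
    }


# Hypothetical locus: two alleles for each of two individuals.
example_locus = (
    ">ind1_a\nACGTACGT\n"
    ">ind1_b\nACGTACGA\n"
    ">ind2_a\nACCTACGT\n"
    ">ind2_b\nACNTACG-\n"
)
summary = summarize_locus(StringIO(example_locus))
print(summary)
```

A locus could then be kept or discarded by comparing such a summary against thresholds chosen for the question at hand, for example a minimum number of sequences or at least one polymorphic site.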