Authors: Juan Falgueras, Antonio J Lara, Noe Fernandez-Pozo, Francisco R. Canton, Guillermo Perez-Trabado
Keywords: Web service, Workflow, Genetics, Data mining, Software, Throughput (business), Line (text file), Pipeline (software), Sequence (medicine), Interface (computing), Biology
Abstract: High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing enhances this problem and necessitates customisable pre-processing algorithms. SeqTrim has been implemented both as a Web application and as a standalone command line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low quality, vector, adaptor, low complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads, and does not lead to over-trimming. SeqTrim is an efficient pipeline designed for pre-processing any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify the pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts.