An optimistic protein assembly from sequence reads salvaged an uncharacterized segment of mouse picobirnavirus
Gonzalez, Gabriel; Sasaki, Michihito; Burkitt-Gray, Lucy; Kamiya, Tomonori; Tsuji, Noriko M.; Sawa, Hirofumi
7, p. 40447, 2017-01-10, Nature Publishing Group
Advances in next-generation sequencing technologies have enabled the generation of millions of sequences from microorganisms. However, distinguishing the sequence of a novel species from sequencing errors remains a technical challenge when the novel species is highly divergent from the closest known species. To address this problem, we developed a new method called Optimistic Protein Assembly from Reads (OPAR). This method is based on the assumption that protein sequences can be more conserved than the nucleotide sequences encoding them. By combining metagenomics, bioinformatics and conventional Sanger sequencing, our method successfully identified all coding regions of the mouse picobirnavirus for the first time. The salvaged sequences indicated that segment 1 of this virus was more divergent from its homologues in other Picobirnaviridae species than segment 2. For this reason, only segment 2 of mouse picobirnavirus had been detected in previous studies.
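The assumption stated in the abstract, that protein sequences can be more conserved than the nucleotide sequences encoding them, can be illustrated with a small sketch. This is not the OPAR implementation: the two sequences below are made-up toy examples rather than picobirnavirus data, and the helper names `translate` and `identity` are hypothetical. The sketch only shows how synonymous codon changes can lower nucleotide identity while leaving the translated protein unchanged.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate a DNA string in frame 0; codons with unknown bases become 'X'."""
    seq = nucleotides.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy sequences (assumed for illustration): they differ only by
# synonymous substitutions, so they encode the same peptide.
seq_a = "ATGGCTCTGAAACGTTCT"
seq_b = "ATGGCCTTAAAGAGATCA"

if __name__ == "__main__":
    print(f"nucleotide identity: {identity(seq_a, seq_b):.2f}")
    print(f"protein identity:    {identity(translate(seq_a), translate(seq_b)):.2f}")
```

In this toy case the two sequences differ at 7 of 18 nucleotide positions yet translate to the same six-residue peptide, which is the kind of signal a protein-level comparison could recover when a nucleotide-level comparison against known species fails.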