Speaker:   Solon Pissis
  Department of Informatics
  King's College, London


Title:  Aligning next-generation sequencing reads to multiple reference genomes

Motivation: Constant advances in sequencing technology have redefined the way genome sequencing is performed. Current sequencing platforms can produce tens of millions of short sequences (reads) during a single experiment, at a much lower cost than previously possible. Due to this massive amount of data, efficient algorithms for mapping these reads to reference genomic sequences are in great demand, and a substantial body of work on such algorithms has recently been published. In this paper, we study a different version of this problem: mapping these reads to multiple related reference genomes (e.g., individuals of the same species). A simple method would be to map a set of reads against all known genomes for a species separately. However, this procedure incurs the overhead of redundant alignments in conserved regions.
Results: We propose a new practical algorithm that employs a suitable data structure to take into account the inherent genomic variability (replacements, insertions, deletions) that may exist between related genomic sequences. As a result, if a small number of differences occur within a genomic sequence, the reads already mapped can be updated dynamically. The experimental results presented demonstrate that the proposed approach addresses this problem efficiently and accurately.
Joint work with Costas S. Iliopoulos and Tomas Flouri
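
Illustration: The simple method mentioned in the motivation, mapping the reads against each genome separately, can be sketched roughly as follows. This is only a toy example and not the algorithm presented in the talk: it assumes exact substring matching (real read mappers tolerate mismatches and use indexes), and the function names, genome names, and sequences are all hypothetical. It shows how reads that fall in conserved regions are aligned once per genome, which is the redundancy the abstract refers to.

```python
from collections import defaultdict


def map_reads_naively(reads, genomes):
    """Map every read against every genome separately (exact matching only).

    Returns a dict: read -> {genome name -> list of 0-based start positions}.
    """
    hits = defaultdict(dict)
    for name, seq in genomes.items():
        for read in reads:
            positions = []
            start = seq.find(read)
            while start != -1:
                positions.append(start)
                # Continue searching one position further to catch overlaps.
                start = seq.find(read, start + 1)
            if positions:
                hits[read][name] = positions
    return dict(hits)


def count_redundant(hits):
    """Count how many times a read was placed in a second or later genome."""
    return sum(len(per_genome) - 1 for per_genome in hits.values())


# Hypothetical toy data: two related genomes differing by one replacement.
genomes = {
    "genome_a": "ACGTACGGTTCA",
    "genome_b": "ACGTACGCTTCA",
}
reads = ["ACGTA", "GGTTC", "TTCA"]

hits = map_reads_naively(reads, genomes)
for read, per_genome in hits.items():
    print(read, per_genome)
print("redundant placements:", count_redundant(hits))
```

In this toy input, the reads lying in the conserved regions are found in both genomes, while the read covering the replacement is found in only one. An approach that accounts for the variability between the genomes would aim to avoid repeating the work for the conserved reads.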