Freie Universität Berlin has been selected as a new Intel® Parallel Computing Center (Intel® PCC)

The Intel® PCC program supports universities, institutions, and labs identified as leaders in their fields. It focuses on modernizing applications or software libraries to increase performance on modern microprocessors and coprocessors using parallel computing.

Parallel computing runs parts of a computer program on multiple processors at the same time, which can be much faster than executing the operations sequentially, one after another. It requires routines that make proper use of the hardware's cores, caches, threads, and vector capabilities, as well as synchronization routines for parallel computations.
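As a rough illustration of the idea, the following C++ sketch sums a vector on several threads. The function name `parallel_sum` and the chunking scheme are assumptions made for this example; they are not taken from SeqAn or from any Intel tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sum `data` using `num_threads` worker threads (hypothetical helper).
// Each worker sums one contiguous chunk into its own slot, so no locking is
// needed; joining the threads is the only synchronization point.
long long parallel_sum(const std::vector<int>& data, std::size_t num_threads)
{
    assert(num_threads > 0);
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    // Chunk size rounded up so that all elements are covered.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (std::size_t t = 0; t < num_threads; ++t)
    {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
            const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
            partial[t] = std::accumulate(first, last, 0LL);
        });
    }
    for (std::thread& w : workers)
        w.join();

    // Combine the per-thread results sequentially.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Calling `parallel_sum(values, 4)` splits `values` into four chunks. Whether this is faster than a sequential loop depends on the input size and the hardware.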

The latest Intel® PCC will be led by Knut Reinert, professor of bioinformatics at the Department of Mathematics and Computer Science at Freie Universität (FU) Berlin and a fellow at the Max Planck Institute for Molecular Genetics. Professor Reinert’s research focuses on providing efficient tools for the analysis of Next Generation Sequencing (NGS) data, which stem from a technology breakthrough several years ago that enables the cheap sequencing of terabytes of genomic sequence. The Reinert lab bases its application development on well-designed algorithmic components and their implementation in the SeqAn C++ software library.

The SeqAn library is well established and is used worldwide in numerous tools for NGS analysis. Since this year, it has also been supported in the CIBI center as part of the German bioinformatics infrastructure network (de.NBI), demonstrating its leading role in the development of new analysis tools for biomedical applications.

“We are very glad that Intel supports our vision for moving biomedical software development forward. SeqAn is a software library containing well designed key components for sequence analysis and we always thought that parallelizing and vectorizing key components of SeqAn will have a big impact on the field by accelerating many applications that make use of those components.”

Knut Reinert, PI

The center will work on abstracting primitives in SeqAn’s template-based core to offer a unified interface to multicore and SIMD vector units, including Intel® Xeon processors and Intel® Xeon Phi™ coprocessors, and then accelerate key routines such as alignment algorithms and the traversal of data-parallel containers. The generic design of SeqAn makes it well suited to this approach.
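One way to picture such a unified interface is tag dispatch, where the caller picks an execution policy through a tag type and the algorithm keeps a single name. The sketch below is only an illustration under that assumption: the namespace `sketch`, the tags `Serial` and `Parallel`, and the function `for_each_element` are hypothetical and do not reproduce SeqAn's actual interface.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

namespace sketch {  // hypothetical namespace, not part of SeqAn

struct Serial {};                       // tag: run on the calling thread
struct Parallel { unsigned threads; };  // tag: split the work across threads

// Serial overload: apply `fn` to every element in order.
template <typename T, typename Fn>
void for_each_element(std::vector<T>& items, Fn fn, Serial)
{
    for (T& item : items)
        fn(item);
}

// Parallel overload: same call signature apart from the tag. The elements
// are divided into contiguous blocks and each block runs on its own thread.
template <typename T, typename Fn>
void for_each_element(std::vector<T>& items, Fn fn, Parallel policy)
{
    assert(policy.threads > 0);
    const std::size_t n = items.size();
    const std::size_t chunk = (n + policy.threads - 1) / policy.threads;
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < n; begin += chunk)
    {
        const std::size_t end = (begin + chunk < n) ? begin + chunk : n;
        workers.emplace_back([&items, fn, begin, end] {
            for (std::size_t i = begin; i < end; ++i)
                fn(items[i]);
        });
    }
    for (std::thread& w : workers)
        w.join();
}

}  // namespace sketch
```

Client code then switches between `sketch::for_each_element(reads, fn, sketch::Serial{})` and `sketch::for_each_element(reads, fn, sketch::Parallel{4})` without changing the algorithm it calls. In this sketch `fn` must be safe to run on different elements concurrently.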

“Intel regards SeqAn as a very promising software package that has all the right ingredients to considerably speed up Next Generation Sequencing analysis on modern Intel processors. We are looking forward to collaborating with Professor Reinert and his team to add our technical know-how about Intel® Architecture and combine it with his algorithmic expertise, and in this way turn SeqAn into a premier software tool in this domain of rapidly growing importance.”

Kristina Kermanshahche
Chief Architect, Intel® Health & Life Sciences

The above strategy will accelerate existing and future applications based on the free SeqAn library (under BSD license) and make hardware acceleration easily available to developers. FU Berlin will benefit from its role as an Intel® PCC by working with Intel experts, their software tools, and advanced technologies. The Reinert lab will incorporate its work into advanced tutorials and courses at FU Berlin and looks forward to sharing its findings at conferences such as the International Supercomputing Conference and the Intel® Xeon Phi™ Coprocessor User Group (IXPUG) meetings.

FU Berlin is a partner in the German Network for Bioinformatics Infrastructure (de.NBI)

Starting in March, the German Federal Ministry of Education and Research (BMBF) will fund the German Network for Bioinformatics Infrastructure (de.NBI) for five years. One of the eight service centers in this network, the Center for Integrative Bioinformatics (CIBi), will receive two million euros over the next five years. CIBi is a joint center of the Universities of Tübingen and Konstanz and Freie Universität Berlin.

In biomedical research, the introduction of new sequencing methods and high-resolution mass spectrometry has enabled a paradigm shift. The high-throughput methods built on them, such as genomics, transcriptomics, proteomics, and metabolomics (also called omics methods), give very comprehensive and deep insights into cellular systems, but the data they generate are extremely large (in the terabyte range) and very complex. Increasingly, data from several technologies are now generated in parallel, for example data on the genome and on the protein concentrations in a cell. Analyzing and interpreting such data sets therefore requires innovative algorithms. Because a single algorithm is no longer sufficient to analyze these data, the tools are integrated into complex data analysis workflows. This makes automated evaluation possible even for the most complex data. Within the network, the BMBF funds the further development of algorithms for the analysis of proteome and metabolome data (Tübingen, software package OpenMS, Prof. Oliver Kohlbacher) and of genome and transcriptome data (Berlin, software package SeqAn, Prof. Knut Reinert), as well as the integration of these tools into workflows (Konstanz, software package KNIME, Prof. Michael Berthold).

The Center for Integrative Bioinformatics is closely linked to the other seven service centers in the German Network for Bioinformatics Infrastructure. The network as a whole is funded for five years and will undergo an interim evaluation after three years. The network is coordinated by Bielefeld University.

SeqAn 2.0 released

We are happy to announce the release of SeqAn 2.0.0.

We have many new features and applications for you and have improved many parts in terms of usability, performance, and stability.

For example, we have improved the usability and the performance of the I/O modules. BAM I/O now supports parallel read and write operations. We also improved automatic read/write operations for compressed file formats such as gzip and bzip2.

We improved the performance of several data structures, such as the FMIndex, which now runs up to 4 times faster than the old version.

We extended the library with new features such as X-drop extensions for alignments that allow affine gap costs. We added a new realignment module as well as a translation module that translates DNA sequences into amino acid sequences.
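For readers unfamiliar with the X-drop idea, here is a deliberately simplified sketch. It extends a seed to the right without gaps and stops once the running score falls more than a threshold below the best score seen so far. It does not use the SeqAn API and does not handle gaps or affine gap costs; the names `Extension` and `extend_right` and the scoring parameters are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Result of extending a seed to the right (hypothetical type).
struct Extension
{
    std::size_t length;  // number of extended positions kept
    int score;           // best score reached within those positions
};

// Ungapped X-drop extension: walk right from (pos_a, pos_b), add `match` or
// `mismatch` for each column, and stop once the running score has fallen
// more than `x_drop` below the best score seen so far.
Extension extend_right(const std::string& a, std::size_t pos_a,
                       const std::string& b, std::size_t pos_b,
                       int match, int mismatch, int x_drop)
{
    assert(x_drop >= 0);
    Extension best{0, 0};
    int score = 0;
    for (std::size_t k = 0; pos_a + k < a.size() && pos_b + k < b.size(); ++k)
    {
        score += (a[pos_a + k] == b[pos_b + k]) ? match : mismatch;
        if (score > best.score)
            best = Extension{k + 1, score};  // new best: remember where it ends
        else if (best.score - score > x_drop)
            break;                           // dropped too far: give up
    }
    return best;
}
```

A larger `x_drop` lets the extension run through short stretches of mismatches, while a smaller value ends it sooner.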

We implemented many new apps like ANISE and BASIL for insert assembly, Fiona for read error correction, Yara, an enhanced read aligner replacing Masai, and many many more.

With SeqAn 2.0.0 we moved the complete sources to GitHub, which greatly improves the development cycle. We improved our build system and added Continuous Integration builds with Travis CI. We also updated and improved the API documentation (docs.seqan.de) and switched the tutorials to seqan.readthedocs.org.

You can download the new release and the updated apps from www.seqan.de in the downloads section. You can also get the complete sources of SeqAn 2.0.0 from https://github.com/seqan/seqan.

Simply run:

git clone -b seqan-v2.0.0 https://github.com/seqan/seqan.git seqan-src

and start developing and having fun with SeqAn 2.0.0.

Enjoy!

The SeqAn Team


Papers at ECCB 2014

The Reinert lab and collaborators presented two papers at the 13th European Conference on Computational Biology in Strasbourg.

Hannes Hauswedell from our group presented “Lambda: the local aligner for massive biological data”, while Marcel Schulz from the Max Planck Institute in Saarbrücken presented “Fiona: a parallel and automatic strategy for read error correction”.
