Crossroads The ACM Magazine for Students

Association for Computing Machinery

Magazine: December 2006 | Volume 13, No. 2

Achieving I/O improvements in a mass spectral database

Research in proteomics has created two significant needs: an accurate public database of empirically derived mass spectrum information, and a way to manage the I/O and organization of mass spectrometry data in files and structures. The lack of an empirically derived database limits the ability of proteomic researchers to identify and study proteins. Managing the I/O and organization of mass spectrometry data is often time-consuming because of the many fields that must be set and retrieved, and when each programmer handles this in his or her own way, incompatibilities and inefficiencies result. Until recently, storage space and computing power were the limiting factors in developing tools to handle the vast amount of mass spectrometry information. The resources are now available to store, organize, and analyze this information.

The Illinois Bio-Grid Mass Spectrometry Database is a database of empirically derived tandem mass spectra of peptides. It provides researchers with an organized, searchable collection of curated spectrum information to allow more accurate protein identification. The Mass Spectrometry I/O Project creates a framework that handles mass spectrometry data I/O and data organization, allowing researchers to concentrate on data analysis rather than I/O. The project also leverages several cross-platform and portability-enhancing technologies, so it can be used on a variety of hardware and operating systems.

By Eric Puryear, Jennifer Van Puymbrouck, David Sigfredo Angulo, Kevin Drew, Lee Ann Hollenbeck, Dominic Battre, Alex Schilling, David Jabon, Gregor von Laszewski

Tags: Computational biology, Design, Distributed retrieval, Distributed storage, Genetics, Information systems applications, Management, Peer-to-peer retrieval, Scientific visualization, Storage network architectures, Systems biology