Bioinformatics for Glycan Expression

Integrated Technology Resource for Biomedical Glycomics: Technological Research and Development Project IV, a project funded by NIH

Integrated Technology Resource for Biomedical Glycomics at the UGA Complex Carbohydrate Research Center

NCRR

Introduction

The goal of Bioinformatics Project IV of this glycomics resource is to develop a suite of databases, along with computational tools that facilitate efficient acquisition, description, analysis, sharing, and dissemination of the data contained therein. This represents a major challenge, as the potential of these data to explain important biological phenomena will only be fully realized if they are examined in the context of the vast amounts of other data that are becoming available. Therefore, a major emphasis will be placed on data structures and tools that have a high degree of interoperability with the computational infrastructure now being developed for the storage and analysis of genomics and proteomics data. The specific aims of this research are as follows.

  • Develop and implement efficient workflow tools for tracking physical samples and for automating data collection, data verification, compression, and storage.  These will include tools for automatic identification of glycan structures and/or glycan structural families from mass spectral data.
  • Build an integrated database termed GlycomBin that describes the populations of specific glycan structures and structural families of glycopeptides and glycolipids in different cell lines. For example, the database will provide the foundation for a detailed, quantitative understanding of the (N- and O-linked) glycosylation patterns of specific glycoproteins. In this context, it will include both structural and quantitative information (i.e., the identities and relative populations of the various glycan structures and/or glycan structural families that are attached to each glycopeptide). Similar information regarding the identities and populations of glycosphingolipids will also be included, as will biochemical information obtained by diverse techniques, such as the distribution of glycan epitopes in different cell lines.
  • Develop tools that facilitate interoperability of the databases with existing proteomics tools that can be used, for example, to identify and quantitate the expression level of each glycopeptide's parent protein and the expression levels of the proteins involved in glycan biosynthesis. Also support open, standards-based access to GlycomBin and its interoperability with external databases. This will include Web Services-enabled access to data and computational resources.
  • Develop tools that facilitate the description, classification, and clustering of glycopeptides and glycolipids, including ontology-based semantic descriptions of glycan structure, biosynthesis, and biological context.
  • Develop tools for semantic data analysis and discovery, including tools for finding correlations between glycosylation patterns and patterns of gene expression within a cell line or between different cell lines.  These will include a blended ontology-supported browsing and querying interface.

This approach will provide a highly flexible environment for the development of distributed and semantic bioinformatics approaches for analysis of glycosylation patterns and their biological relevance.

The current status of the project is described on the Status page, which is frequently updated. The Demos page links to various demonstration tools on the web. Downloadable code can be found on the Downloads page.


Project related links

This page is a working document. It claims neither to be correctly ordered or complete, nor to contain the most relevant links.
If you have suggestions as to which additional links should be on this page, please send an email to Christopher Thomas.


Funding: Bioinformatics of Glycan Expression (one of the four components of the "Integrated Technology Resource for Biomedical Glycomics," approx. $6 million+), National Institutes of Health, July 1, 2003 - June 30, 2008.