Management of Microarray Data
One of the major challenges in the microarray field is managing and storing the
data. The Microarray Gene Expression Data (MGED) Society has initiated three major
complementary projects that are working to set standards in the field.
Software for Management of Microarray Data at UAMS
It is important to install a database management system in order to manage microarray data at a central location in a university or institute. We have installed the following two public domain software packages to manage data at the University of Arkansas for Medical Sciences (UAMS), Little Rock, AR, USA.
- BASE (BioArray Software Environment): BASE is a comprehensive database server for managing the massive amounts of data generated by microarray analysis. It manages biomaterial information, raw data and images, and provides integrated and "plug-in"-able normalization, data viewing and analysis tools.
- AMAD: AMAD is a flat-file, web-driven database system written entirely in Perl and JavaScript, intended for use with microarray-generated data.
Gene Annotation:
- GeneCruiser: A gene annotation tool that allows users to annotate their genomic data in several ways: i) mapping genes in gene databases; ii) finding Affymetrix probes, etc.
- Onto-Express: Onto-Express (OE) translates lists of differentially regulated genes into functional profiles characterizing the impact of the condition studied.
- CGAP (The Cancer Genome Anatomy Project): The goal of the NCI's Cancer Genome Anatomy Project is to determine the gene expression profiles of normal, precancer, and cancer cells, leading eventually to improved detection, diagnosis, and treatment for the patient. The project works in collaboration with scientists worldwide.
- HPRD (Human Protein Reference Database): All the information in HPRD has been manually extracted from the literature by expert biologists who read, interpret and analyze the published data.
- SOURCE: SOURCE is a unification tool which dynamically collects and compiles data from
many scientific databases, and thereby attempts to encapsulate the genetics and
molecular biology of genes from the genomes of Homo sapiens, Mus musculus, and
Rattus norvegicus into easy-to-navigate GeneReports.
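The idea of turning a gene list into a functional profile, as described for Onto-Express above, can be illustrated with a minimal sketch. This is not Onto-Express's actual implementation or interface: the function name, gene identifiers, and annotation table below are hypothetical, and the sketch only counts how often each functional category occurs, without any statistical assessment.

```python
from collections import Counter

def functional_profile(genes, annotations):
    """Count how many genes in the list fall into each functional category.

    `annotations` is a hypothetical mapping: gene ID -> list of categories.
    Genes without annotation are silently skipped.
    """
    profile = Counter()
    for gene in genes:
        for category in annotations.get(gene, ()):
            profile[category] += 1
    return profile

# Toy annotation table (made-up gene IDs and categories, for illustration only)
annotations = {
    "geneA": ["apoptosis", "signal transduction"],
    "geneB": ["apoptosis"],
    "geneC": ["cell cycle"],
}

# Hypothetical list of differentially regulated genes; geneX has no annotation
regulated = ["geneA", "geneB", "geneC", "geneX"]
profile = functional_profile(regulated, annotations)

for category, count in profile.most_common():
    print(category, count)
```

A real tool would draw the annotations from a curated source and test whether each category is over-represented relative to the genes on the array; the counting step shown here is only the starting point.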
Public Repositories:
A number of projects are in progress worldwide to store microarray data in public repositories. The following are the major efforts:
- ArrayExpress from EBI: ArrayExpress is a public repository for microarray-based gene expression data.
- Gene Expression Omnibus (GEO): This microarray database is developed by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health.
- Stanford Microarray Database (SMD): SMD stores raw and normalized data from microarray experiments, as well as their corresponding image files. In addition, SMD provides interfaces for data retrieval, analysis and visualization. Data is released to the public at the researcher's discretion or upon publication.
- RNA Abundance Database (RAD): RAD is a public gene expression database designed to hold data from array-based (microarrays, high-density oligo arrays, macroarrays) and nonarray-based (SAGE) experiments. The ultimate goal is to allow comparative analysis of experiments performed by different laboratories using different platforms and investigating different biological systems.
- ChipDB: This database provides an interface for users to search through the collection of strains, experiments, and genetic features (read: genes) in the database.
- ExpressDB: ExpressDB is a relational database containing yeast and E. coli RNA expression data. It contains more than 20 million pieces of information loaded from numerous published and in-house expression studies.
- Gene Expression Atlas: A database of gene expression profiles from 91 normal human and mouse samples across a diverse array of tissues, organs, and cell lines. Reference: [PubMed] [pdf]
- Gene Expression Database (GXD): A database of Mouse Genome Informatics at the Jackson Laboratory.
- GeneX: The National Center for Genome Resources' initiative to provide an Internet-available repository of gene expression data.
- Human Gene Expression Index (HuGE Index): Aims to provide a comprehensive database to understand the expression of human genes in normal human tissues. Reference: [PubMed]
- M-CHiPS (Multi-Conditional Hybridization Intensity Processing System): M-CHiPS, a data warehousing concept, focuses on providing a structure suitable for statistical analysis of all components of a microarray database, including the experiment annotations. As of 1/1/03, 3498 hybridizations are stored.
- READ (RIKEN cDNA Expression Array Database): READ is an integrated system for microarray data that works like a "glue" in post-sequence and post-hybridization analysis. Frequently referenced data are stored in an RDBMS (Relational Database Management System). It has been renewed with TiO (Tissue Ontology) slim. Similar tissues (e.g. heart and muscle) are clustered in the gene expression viewer. Users can query by GenBank/EMBL/DDBJ accession.
- Yale Microarray Database: The Yale Microarray Database Project is a collaborative effort between several laboratories and centers for i) managing two-channel experiment and analysis data from hundreds of investigators; ii) linking with the NCBI sequence database; and iii) integrating statistical tools for analysis.
- yeast Microarray Global Viewer (yMGV): A database for yeast gene expression data maintained by the Laboratoire de genetique moleculaire, Ecole Normale Superieure.
- 3D-GeneExpression Database: Preliminary structure for a database of 3D visualization of developmental gene expression.
- BODYMAP: A databank of gene expression information for human and mouse genes, created by random sequencing of clones in 3'-directed cDNA libraries.
- Gene Resource Locator: The goal is to map millions of ESTs to the human genome for the study of the exon-intron structures of genes, the alternative splicing of pre-mRNAs, the promoter regions of full-length-enriched cDNA sequences, and the gene-expression patterns associated with ESTs.
- TissueInfo: An online database that determines the tissue expression profile of a sequence by comparing the given sequence against the EST database. Each EST comes from a library derived from a specific tissue type.
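The EST-based approach described for TissueInfo can be sketched roughly as follows. This is an illustration of the general idea only, not TissueInfo's actual code or interface: it assumes the sequence comparison has already been done and produced a list of matching EST identifiers, and all names, identifiers, and lookup tables are hypothetical.

```python
from collections import Counter

def tissue_profile(est_hits, est_to_library, library_to_tissue):
    """Tally the tissue of origin for each EST that matched a query sequence.

    est_hits          : EST IDs returned by a (separately run) sequence search
    est_to_library    : hypothetical mapping, EST ID -> cDNA library ID
    library_to_tissue : hypothetical mapping, library ID -> tissue type
    ESTs or libraries missing from the tables are ignored.
    """
    counts = Counter()
    for est_id in est_hits:
        library = est_to_library.get(est_id)
        tissue = library_to_tissue.get(library)
        if tissue is not None:
            counts[tissue] += 1
    return counts

# Toy lookup tables (made-up identifiers, for illustration only)
est_to_library = {"EST1": "libA", "EST2": "libA", "EST3": "libB", "EST4": "libC"}
library_to_tissue = {"libA": "liver", "libB": "brain", "libC": "liver"}

# Hypothetical search result: EST9 is not in the tables and is skipped
hits = ["EST1", "EST3", "EST4", "EST9"]
profile = tissue_profile(hits, est_to_library, library_to_tissue)

for tissue, count in profile.most_common():
    print(tissue, count)
```

Raw counts like these depend on how deeply each tissue's libraries were sequenced, so a fuller treatment would normalize by library size before comparing tissues.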