Nucleotide Sequence Databases (the principal ones)
- NCBI - National Center for Biotechnology Information
- EBI - European Bioinformatics Institute
- DDBJ - DNA Data Bank of Japan
Protein Sequence Databases
- SWISS-PROT & TrEMBL - protein sequence database and its computer-annotated supplement
- UniProt - Universal Protein Resource, a comprehensive central repository of protein sequence and function, created by joining the information contained in Swiss-Prot, TrEMBL, and PIR
- PIR - Protein Information Resource
- MIPS - Munich Information centre for Protein Sequences
- HUPO - HUman Proteome Organization
Database Searching by Sequence Similarity
Sequence Alignment
- USC Sequence Alignment Server - align 2 sequences with all possible varieties of dynamic programming
- T-COFFEE - multiple sequence alignment
- ClustalW @ EBI - multiple sequence alignment
- MSA 2.1 - optimal multiple sequence alignment using the Carrillo-Lipman method
- BOXSHADE - pretty printing and shading of multiple alignments
- Splign - a utility for computing cDNA-to-genomic, or spliced, sequence alignments. At the heart of the program is a global alignment algorithm that specifically accounts for introns and splice signals.
- Spidey - an mRNA-to-genomic alignment program
- SIM4 - a program to align cDNA and genomic DNA (my personal favorite!)
- Wise2 - align a protein or profile HMM against genomic sequence to predict a gene structure, and related tools
- PipMaker - computes alignments of similar regions in two (long) DNA sequences (yet another of my favorites!)
- VISTA - align + detect conserved regions in long genomic sequences
- myGodzilla - align a sequence to its ortholog in the human genome
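Several of the pairwise tools above are described as dynamic-programming aligners. As a rough illustration of the idea only, here is a minimal global alignment sketch in the Needleman-Wunsch style. It is not the implementation used by any server listed here; the function name `global_align` and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for the example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return (score, aligned_a, aligned_b).

    Scoring values are illustrative assumptions, not taken from any listed tool.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the table: each cell is the best of diagonal, up, or left moves.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = global_align("ACGT", "ACT")
    print(s)
    print(x)
    print(y)
```

The table costs time and memory proportional to the product of the two sequence lengths, which is one reason the genomic-scale tools in this list rely on heuristics or specialised models (for example, explicit handling of introns) rather than a plain table like this one.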
Human Genome Databases
Databases of other Organisms
Genome-wide Analysis
- MBGD - comparative analysis of completely sequenced microbial genomes
- COGs - phylogenetic classification of orthologous proteins from complete genomes
- STRING - detect whether a given query gene occurs repeatedly with certain other genes in potential operons
- Pedant - automatic whole genome annotation
- GeneCensus - various whole genome comparisons
Protein Domains: Databases and Search Tools
- InterPro - integration of Pfam, PRINTS, PROSITE, SWISS-PROT + TrEMBL
- PROSITE - database of protein families and domains
- Pfam - alignments and hidden Markov models covering many common protein domains
- SMART - analysis of domains in proteins
- ProDom - protein domain database
- PRINTS Database - groups of conserved motifs used to characterise protein families
- Blocks - multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins
- Protein Domain Profile Analysis @ BMERC - search a library of profiles with a protein sequence
- TIGRFAMs - yet more protein families based on Hidden Markov Models
Motif and Pattern Search in Sequences
- Gibbs Motif Sampler - identification of conserved motifs in DNA or protein sequences
- AlignACE Homepage - gene regulatory motif finding
- MEME - motif discovery and search in protein and DNA sequences
- SAM - tools for creating and using Hidden Markov Models
- Pratt - discover patterns in unaligned protein sequences
- Motivated Proteins - a web facility for exploring small hydrogen-bonded motifs
Protein 3D Structure
Phylogeny & Taxonomy
Gene Prediction
Gene Expression Databases
Gene Regulation
- TRAFAC - identifies conserved and shared cis-regulatory elements between a pair of genes
- CisMols - identifies conserved and shared cis-regulatory elements among a set of co-expressed genes
- TRANSFAC - database of eukaryotic cis-acting regulatory DNA elements and trans-acting factors
- EPD - eukaryotic promoter database
- DBTSS - DataBase of Transcriptional Start Sites (human)
- SCPD - Saccharomyces cerevisiae promoter database
- DCPD - Drosophila Core Promoter Database
- RegulonDB - a database on transcriptional regulation in E. coli
- DPInteract - protein binding sites on E. coli DNA
- PromoterInspector - prediction of promoter regions in mammalian genomic sequences
- MatInspector - search for transcription factor binding sites
- Cister - cis-element cluster finder
- Gene regulatory Tools
- microRNA.org: microRNA Targets & Expression Profiles
- miRBase
- TarBase - a means of searching through a comprehensive set of experimentally supported microRNA targets in at least 8 organisms
- microRNA resource - a gateway to all types of information about microRNAs, including articles, products, news, events, and other websites
Metabolic, Gene Regulatory & Signal Transduction Network Databases
- KEGG - Kyoto Encyclopedia of Genes and Genomes
- BioCarta
- DAVID - Database for Annotation, Visualization and Integrated Discovery - a useful server for annotating microarray and other genetic data
- STKE - Signal Transduction Knowledge Environment
- BIND - Biomolecular Interaction Network Database
- EcoCyc
- WIT
- PathGuide - a very useful collection of resources dealing primarily with pathways
- SPAD - Signaling Pathway Database
- CSNDB - Cell Signalling Networks Database
- PathDB
- Transpath
- DIP - Database of Interacting Proteins
- PFBP - Protein Function and Biochemical Networks
- Alliance for Cellular Signalling
Systems Biology
Other Databases (Annotations, Ontologies, Consortia, etc.)
- Entrez Gene - a unified query environment for genes defined by sequence and/or in NCBI's Map Viewer. You can query on names, symbols, accessions, publications, GO terms, chromosome numbers, E.C. numbers, and many other attributes associated with genes and the products they encode. Replaces LocusLink.
- Cancer Genome Anatomy Project
- HUGO's Human Gene Nomenclature
- Gene Ontology Consortium - a controlled vocabulary of eukaryotic gene roles
- Open Biological Ontologies - an umbrella web address for well-structured controlled vocabularies for shared use across different biological domains
- ACUTS - compilation of Ancient Conserved UnTranslated Sequences
- UTR database
- ENZYME - enzyme nomenclature database
- BRENDA - enzyme database
- TC-DB - comprehensive classification of membrane transport proteins
- The SNP Consortium
- HGBASE - database of sequence variations in the human genome
- MethDB - DNA methylation database
- SpliceDB - canonical and non-canonical splice site sequences in mammalian genes
- SpliceOme - database of intron-exon boundaries
- InBase - intein database
- The I.M.A.G.E. Consortium
- The Kabat Database of Sequences of Proteins of Immunological Interest
- Nelson Lab: Cytochrome C
- REBASE - restriction enzyme database
- Chemfinder.com - molecule database
- Genomics Institute of the Novartis Research Foundation
- Mouse SNPs Database - 670,000+ SNP records, 8.0+ million allele calls. Allele tables are provided by investigators or retrieved from public sources. All SNPs are mapped to NCBI Mouse Genome build 33 (C57BL/6J assembly). Most are linked to NCBI dbSNP build 123.
- MetaBase - a user-contributed database of databases, listing all the biological databases currently available on the internet
- Bio-computing.org - bioinformatics, databases and software for medicine
Miscellaneous Tools
Computational Resources
Bioinformatics on-line course materials and tutorials (not an exhaustive collection)
Intro to bioinformatics and computational biology:
Algorithms:
Miscellaneous:
Web Sites for Background Information & News
Other Collections of Bioinformatics Resources
Suggestions and comments: Anil Jegga
This page was last updated on August 3, 2010