GENOMICS I


Genome and genomic variation projects in many organisms have generated a wealth of data that can be analyzed only by computer-based methods.

Genomes and genome projects:

HUMAN

Genomes Guide at NCBI - National Center for Biotechnology Information
Ensembl - EBI/Sanger
Human, Mouse and Rat Genome Browsers at UC Santa Cruz
NHGRI - National Human Genome Research Institute
Human Genome Project Information at Oak Ridge National Lab


MOUSE

Mouse Genome Resources at NCBI
NIH Mouse Initiative Web site
Mouse Genome Sequencing Consortium
Celera Genomics
Mouse Genome Informatics (MGI) at the Jackson Laboratory


RAT

Rat Genome Browser at UC Santa Cruz and at Jackson Labs.

FUGU (Salt Water) and TETRAODON (Fresh Water) Pufferfishes

Compact genomes due to near absence of repetitive sequences

Fugu rubripes - 365 Mb genome; many gene clusters are conserved with mammals

BLASTable genome at DOE Joint Genome Institute


AMPHIOXUS (Lancelet) and STRONGYLOCENTROTUS (Sea Urchin)

Amphioxus, Branchiostoma floridae, a cephalochordate (an invertebrate chordate) with a genome much simpler than that of vertebrates (about 500 Mb), is valuable for studies of gene evolution

e.g. a single HOX gene cluster compared to 4 in mammals

BLASTable genome at Max Planck Institute, Berlin


CIONA (Ascidian, Sea Squirt)

Ciona intestinalis - a primitive (uro)chordate, genome size 155 Mb.

Larval stage resembles a tadpole.

Has many genes orthologous to mammals, some to plants.

Also useful for gene evolution and function studies.

Draft genome described in Science 298: 2157-2167, 2002

Genome with BLAST capability at DOE Joint Genome Institute


D. MELANOGASTER

BDGP - Berkeley Drosophila Genome Project
NCBI - Genome Biology Site


C. ELEGANS

The C. elegans Genome Project at the Sanger Centre, UK


MICROBIAL GENOMES

TIGR - The Institute for Genomic Research


ORGANELLE GENOMES

 
Mitochondrial and plastid genomes are derived from eubacteria via endosymbiosis.

The endosymbionts have lost the bulk of their genomes and acquired many host-derived properties.

Recent review: Ancient Invasions: From Endosymbionts to Organelles. Science (2004) 304: 253-257


GENERAL AND COMPENDIA

The Genome Atlas at the Technical University of Denmark

NCBI Entrez Complete Genomes

Related Genome Databases at U.Oregon

Celera Genomics

Genomics - a Global Resource at PHRMA

NCGR - National Center for Genome Resources

Genome Mapping Links at Roslin Institute, Edinburgh, Scotland


GENOME MINING GUIDE:

Current Topics in Genome Analysis 2003 and 2005 Webvideo courses at NHGRI


PROBLEMS:

1. For the Mus musculus cadherin 11 (Cdh11) gene, RefSeq accession number NM_009866.3:

a. What is the location of the gene on the genetic and physical maps of the mouse chromosomes, and what is its orientation? What are some neighboring genes? Is there a cluster of related genes?

b. What is the size of the genomic sequence?

c. How many exons are there in the gene?

d. What kind of exon is exon 1? What characteristic verifies it as a genuine exon?

e. What are the principal repeat elements?

f. Using the answer to part (a) and the Human-Mouse Homology Map, find the chromosomal location of the human CDH11 gene.

g. How much sequence similarity is there between the mouse and human genes and proteins?

h. What are the suggested functions of the Cdh11 gene product (e.g. at OMIM)?

i. Is there significant homology between the 5' UTRs of the human, mouse and rat transcripts?
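Part (g) asks for a quantitative comparison of the mouse and human sequences. One common measure is percent identity over a pairwise alignment. The following is a minimal sketch, assuming the two sequences have already been aligned by some other tool and are supplied as equal-length strings with "-" for gaps. The function name `percent_identity` and the short example sequences are illustrative only and are not real Cdh11 data. Conventions differ on how gaps are counted; this sketch ignores any column that contains a gap.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: both inputs are rows of the same alignment, so they have
    equal length and use '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where neither sequence has a gap
    matches = 0   # compared columns where the residues are identical
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns (one of several conventions)
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, invented for illustration (not real gene sequence).
    toy_mouse = "ATGCCA-T"
    toy_human = "ATGACAGT"
    print(f"{percent_identity(toy_mouse, toy_human):.1f}% identity")
```

The same function works for aligned protein sequences, since it only compares characters column by column.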
 

Genetic variation databases:

1. SNP databases:

dbSNP at NCBI

EntrezSNP at NCBI

The mouse SNP database at Roche

PROBLEM 1:

What nonsynonymous SNPs have been recorded for the human CDH11 gene?

(Hint: use EntrezSNP, LocusLink, Molecular Variation Database option (VAR).)
 

PROBLEM 2:

Identify the SNP in the coding sequence of the Nramp gene between the C57BL/6 and A/J mouse strains.

Does the SNP result in a change in the amino acid sequence? In what part of the protein?

Do you predict that the change will cause a change in protein function?
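Whether a coding SNP changes the amino acid sequence can be checked by translating the affected codon before and after the substitution. The following is a minimal sketch, assuming the standard genetic code, a coding sequence that starts in frame, and a 0-based SNP position within that sequence. The function name `classify_coding_snp` and the toy sequence are hypothetical and are not taken from the Nramp gene.

```python
from itertools import product

# Standard genetic code, built with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snp(cds: str, position: int, alt_base: str):
    """Classify a single-base substitution in an in-frame coding sequence.

    position is a 0-based offset into cds (an assumption of this sketch).
    Returns (kind, codon_number, ref_aa, alt_aa), with codon_number 1-based.
    """
    cds = cds.upper()
    alt_base = alt_base.upper()
    if alt_base not in BASES:
        raise ValueError("alt_base must be one of A, C, G, T")
    # Only positions inside a complete codon can be classified.
    if not 0 <= position < len(cds) - len(cds) % 3:
        raise ValueError("position is outside the complete codons of cds")

    start = position - position % 3  # first base of the affected codon
    offset = position % 3            # SNP position within that codon
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    if ref_codon not in CODON_TABLE:
        raise ValueError(f"cannot translate codon {ref_codon!r}")

    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return kind, start // 3 + 1, ref_aa, alt_aa


if __name__ == "__main__":
    toy_cds = "ATGGGTGACTAA"  # invented sequence: Met-Gly-Asp-stop
    print(classify_coding_snp(toy_cds, 5, "C"))  # third base of codon 2
    print(classify_coding_snp(toy_cds, 6, "A"))  # first base of codon 3
```

A nonsynonymous result only shows that the residue changes. Predicting an effect on protein function still requires looking at where the residue lies in the protein and how different the two amino acids are.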
 

2. Haplotype

The pattern of SNPs in a genomic region of an individual or animal strain.

Genomes appear to consist of long stretches of low interindividual variability (haplotype blocks, or haploblocks) separated by recombination hotspots.

Haplotype analysis is more powerful than single-SNP analysis for identifying disease and susceptibility traits.

Example: Haplotype Mapping and Sequence Analysis of the Mouse Nramp Gene Predict Susceptibility to Infection with Intracellular Parasites. Genomics (1994) 23: 51-61
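Since a haplotype is the pattern of SNP alleles across a region, strains or individuals can be compared by grouping those that share the same pattern. The following is a minimal sketch, assuming each strain has been genotyped at the same ordered set of SNP positions and that its alleles are written as one string. The function name `group_by_haplotype`, the strain labels, and the allele strings are invented for illustration.

```python
from collections import defaultdict


def group_by_haplotype(genotypes: dict[str, str]) -> dict[str, list[str]]:
    """Group strains that carry the same alleles at every SNP in a region.

    genotypes maps a strain name to a string of alleles, one character
    per SNP, with SNPs in the same order for every strain.
    """
    lengths = {len(alleles) for alleles in genotypes.values()}
    if len(lengths) > 1:
        raise ValueError("all strains must be typed at the same SNPs")

    groups = defaultdict(list)
    for strain, alleles in genotypes.items():
        groups[alleles.upper()].append(strain)
    # Sort the strain names so the output is easy to compare.
    return {hap: sorted(strains) for hap, strains in groups.items()}


if __name__ == "__main__":
    # Hypothetical strains typed at four SNPs in one region.
    toy_genotypes = {
        "strain_1": "ACGT",
        "strain_2": "ACGT",
        "strain_3": "ATGA",
    }
    for haplotype, strains in group_by_haplotype(toy_genotypes).items():
        print(haplotype, strains)
```

If a trait such as susceptibility to infection tracks with one haplotype group rather than with any single SNP, that is the kind of association haplotype analysis is designed to detect.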

HapMap now under construction

International HapMap Project

NHGRI HapMap Page

RESOURCES:

"Studying Genetic Variation I and II" lectures in Current Topics in Genome Analysis 2003 and 2005 Webvideo courses at NHGRI

The International HapMap Consortium. A haplotype map of the human genome. Nature 437: 1299-1320, 2005.