Fast and systematic genome-wide discovery of conserved regulatory elements using a non-alignment based approach
Olivier Elemento and Saeed Tavazoie
Lewis-Sigler Institute for Integrative Genomics
Update 04/20/2007: new, improved Fastcompare distribution, following the publication of "Fastcompare: a non-alignment approach for genome-scale discovery of DNA and mRNA regulatory elements using network-level conservation" (book chapter in the Methods in Molecular Biology series on Comparative Genomics, Humana Press, edited by Nick Bergman).
Binaries and source code
- Fastcompare C source code
- Fastcompare executable for Linux
- REcompare C source code
- Comprehensive tutorial on using FastCompare
Sequence data (FASTA format)
- S. cerevisiae / S. bayanus, 4,358 orthologous 1,000 bp upstream regions (source: SGD)
- S. cerevisiae / S. paradoxus, 4,695 orthologous 1,000 bp upstream regions (source: SGD)
- S. cerevisiae / S. castellii, 4,113 orthologous 1,000 bp upstream regions (source: SGD)
- C. elegans / C. briggsae, 10,894 orthologous 2,000 bp upstream regions (source: ENSEMBL)
- D. melanogaster / D. pseudoobscura, 11,306 orthologous 2,000 bp and 5,000 bp upstream regions (source: ENSEMBL, Baylor College)
- H. sapiens / M. musculus, 15,983 orthologous 2,000 bp and 5,000 bp upstream regions (source: ENSEMBL)
Conserved regulatory elements between S. cerevisiae and S. bayanus
- raw sorted lists of k-mers (and gapped k-mers)
- highest scoring 379 k-mers (k=7,8,9), with support by independent biological data and orientation/position biases
- highest scoring interactions, with support by independent biological data (functional categories) and median distances
Conserved regulatory elements between S. cerevisiae and S. paradoxus
- raw sorted lists of k-mers (and gapped k-mers)
- highest scoring 400 k-mers (k=7,8,9), with support by independent biological data
Conserved regulatory elements between S. cerevisiae and S. castellii
- raw sorted lists of k-mers
- highest scoring 376 k-mers (k=7,8,9), with support by independent biological data and orientation/position biases
Conserved regulatory elements between C. elegans and C. briggsae
- raw sorted lists of k-mers (and gapped k-mers)
- highest scoring 375 k-mers (k=7,8,9), with support by independent biological data and orientation/position biases
- highest scoring interactions, with support by independent biological data (functional categories) and median distances
Conserved regulatory elements between D. melanogaster and D. pseudoobscura, 2,000 bp upstream regions
- raw sorted lists of k-mers (and gapped k-mers)
- highest scoring 371 k-mers (k=7,8,9), with support by independent biological data and orientation/position biases
- highest scoring interactions, with support by independent biological data (functional categories) and median distances
Conserved regulatory elements between D. melanogaster and D. pseudoobscura, 5,000 bp upstream regions
- raw sorted lists of k-mers (k=7,8,9)
Conserved regulatory elements between H. sapiens and M. musculus, 2,000 bp upstream regions
- raw sorted lists of k-mers (and gapped k-mers)
- highest scoring 272 k-mers (k=7,8,9), with support by independent biological data and orientation/position biases
- highest scoring interactions, with support by independent biological data (functional categories) and median distances
Conserved regulatory elements between H. sapiens and M. musculus, 5,000 bp upstream regions
- raw sorted lists of k-mers (and gapped k-mers)
Conserved regulatory elements between H. sapiens and R. norvegicus, 2,000 bp upstream regions
- raw sorted lists of k-mers (and gapped k-mers)
Conserved regulatory elements between M. musculus and R. norvegicus, 2,000 bp upstream regions
- raw sorted lists of k-mers (and gapped k-mers)