Description

The RepMiner package takes a graph theory approach to the classification and assembly of the repetitive fraction of genomic sequence data. Sequences analyzed by RepMiner can range from full-length transposable elements in well-characterized genomes to short sequence reads from low-coverage sample sequencing. RepMiner uses transposable elements identified in model species to map the location of putative transposable elements onto homology-based networks derived from comparing the sequences of the query genome to one another. Individual clusters, referred to as Pseudo Assembly Networks (PANs), may be selected and assembled using the TGICL/CAP3 program.
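The clustering idea can be illustrated with a minimal Python sketch. It is not RepMiner's own code: it assumes the all-by-all comparison is available as BLAST tabular output (12 tab-separated columns, e-value in column 11), and the function name cluster_reads and the e-value cutoff are hypothetical choices made for illustration. Each sequence is a node, each significant non-self hit is an edge, and each connected component is a candidate PAN.

```python
from collections import defaultdict


def cluster_reads(blast_lines, max_evalue=1e-10):
    """Group sequence IDs into connected components of a homology graph.

    blast_lines: iterable of BLAST tabular lines (assumed 12-column layout).
    max_evalue:  hypothetical cutoff; hits weaker than this add no edge.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        # Register both sequences as nodes even when no edge is drawn.
        find(query)
        find(subject)
        if query != subject and evalue <= max_evalue:
            union(query, subject)

    clusters = defaultdict(set)
    for node in parent:
        clusters[find(node)].add(node)
    # Largest clusters first, ties broken by member names.
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))
```

Each returned list holds the read names of one cluster, which could then be written to a FASTA file and passed to an assembler.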

This package is currently under heavy development.

Features

Fully implemented features:

Future directions:

Screenshots

Example Set of Pseudo Assembly Networks (PANs)

The 'constellations' below represent networks of homology within the query genome. Each node in the network represents a short sequence read. The color of each node indicates its source BAC. The shape of the node indicates homology to known transposable elements, and the color of the node's perimeter indicates homology to known transposable element proteins. The color of the line connecting two nodes indicates the degree of homology between the two sequences.
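The visual encoding described above can be summarized as a small mapping from sequence annotations to display attributes. The Python sketch below is only illustrative: the palettes, the shape names, the TE class labels, the percent-identity bins and the function names node_style and edge_color are all assumptions, not values taken from RepMiner.

```python
# Hypothetical palettes; RepMiner's actual colors and shapes may differ.
BAC_COLORS = ["red", "blue", "green", "orange", "purple"]
TE_SHAPES = {"LTR": "triangle", "MITE": "diamond", "CACTA": "hexagon"}
PROTEIN_BORDERS = {"gag": "yellow", "pol": "cyan", "transposase": "magenta"}


def node_style(bac_index, te_hit=None, protein_hit=None):
    """Return display attributes for one sequence read.

    bac_index:   integer index of the source BAC (sets the node color).
    te_hit:      class of the best TE hit, or None (sets the node shape).
    protein_hit: best TE protein hit, or None (sets the perimeter color).
    """
    return {
        "color": BAC_COLORS[bac_index % len(BAC_COLORS)],
        "shape": TE_SHAPES.get(te_hit, "ellipse"),
        "border": PROTEIN_BORDERS.get(protein_hit, "black"),
    }


def edge_color(percent_identity):
    """Bin the degree of homology (here, assumed percent identity) into a color."""
    if percent_identity >= 95.0:
        return "red"
    if percent_identity >= 80.0:
        return "orange"
    return "gray"
```

Reads with no hit to a known element fall back to a plain ellipse with a black perimeter, so unannotated nodes remain visible in the network.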

HTML Output Header

Each PAN is assembled with TGICL and BLASTed against databases of known transposable elements. The variables used for the assembly and for the BLAST-based homology comparison are indicated in the header of the HTML output.


HTML Output Record

For each of the PANs, the assembled molecules are BLASTed against a set of known transposable elements and compared to a database of hidden Markov model profiles for MITEs. For each TE database that returned a significant hit, the best BLAST hit is shown. The entire BLAST report is available by clicking on the database name under each contig name. The parsed hidden Markov model output can be accessed by clicking on the hmm_mite link under each contig name. The PAN shown below contains a Stowaway MITE.
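Selecting the best hit per database can be sketched in Python as follows. This is an illustration under stated assumptions rather than RepMiner's implementation: hits are assumed to be already parsed into (contig, database, subject, evalue, bitscore) tuples, "best" is taken to mean the highest bit score, and the function name best_hits and the significance cutoff are hypothetical.

```python
def best_hits(hits, max_evalue=1e-5):
    """Keep the highest-scoring significant hit for each (contig, database).

    hits:       iterable of (contig, database, subject, evalue, bitscore).
    max_evalue: hypothetical significance cutoff; weaker hits are ignored.
    """
    best = {}
    for contig, database, subject, evalue, bitscore in hits:
        if evalue > max_evalue:
            continue  # not significant, so this database shows no hit
        key = (contig, database)
        if key not in best or bitscore > best[key][2]:
            best[key] = (subject, evalue, bitscore)
    return best
```

A database with no significant hit for a contig simply has no entry in the result, which matches a report that lists only the databases with a significant hit.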


Author: James Estill
Last Updated: Thursday, 19 April 2007
