Genome Browser (JBrowse)

JBrowse is a genome browser being developed as the successor to GBrowse. It is very fast and scales well to large datasets.

We provide JBrowse instances for Strongylocentrotus purpuratus, Lytechinus variegatus and Patiria miniata.

Strongylocentrotus purpuratus genome Tracks

The following tracks can be viewed in the genome browser for the Strongylocentrotus purpuratus genome:

Assembly contig

This track shows the position of a contig within the scaffold. A contig is a continuous sequence of DNA that has been assembled from overlapping cloned DNA fragments. It may include draft and finished sequence. It may contain sequence gaps (within a clone), but it does not include gaps between clones (http://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/).

eBACs

This track displays the position of the deconvoluted individual BAC sequences enriched with WGS sequences. These sequences were generated during the genome sequencing project using pools of arrayed BACs from a whole genome tiling path.

BAC-END

This track shows the location and coordinates of the sequenced ends of clones from the sperm genomic BAC library. These sequences provided the first virtual map of the genome (http://www.spbase.org/SpBase/resources/index.php).

EST

This track shows the position of expressed sequence tags (ESTs) in the genome. An EST is a short sub-sequence of a transcribed cDNA sequence produced by one-shot sequencing of a cloned mRNA.

GLEAN-UTR:Prediction

This track shows the location and coordinates of predicted genes based on the Spur0.5 genome assembly (Sodergren ref). GLEAN refers to a program that combined data from several diverse predictive methods (ab initio, homology-based or empirical) to reach a consensus prediction. The consensus gene set contains 28,944 unique genes. The added UTR predictions were identified from a whole genome tiling array hybridized with embryo mRNA (samanta science ref). SpBase converted those 28,944 GLEAN identifiers to SPU numbers by adding a zero and changing the prefix; for example, GLEAN3_00001 is the same as SPU_000001. SPU identifiers over SPU_030000 denote gene sequences derived from sources other than the predicted set.
The 3' UTR of a GLEAN gene is also displayed; a vertical line separates the gene from its 3' UTR.
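
The identifier mapping described above can be sketched in a few lines of Python. This is only an illustration, assuming that every GLEAN identifier has the form GLEAN3_ followed by five digits, as in the example given; the function name glean_to_spu is hypothetical and is not part of SpBase.

```python
import re

# Assumed pattern: "GLEAN3_" followed by exactly five digits (as in GLEAN3_00001).
_GLEAN_RE = re.compile(r"^GLEAN3_(\d{5})$")


def glean_to_spu(glean_id: str) -> str:
    """Convert a GLEAN identifier to the corresponding SPU identifier.

    Hypothetical helper: prepends a zero to the numeric part and swaps the prefix.
    """
    match = _GLEAN_RE.match(glean_id)
    if match is None:
        raise ValueError(f"not a recognised GLEAN identifier: {glean_id!r}")
    return "SPU_0" + match.group(1)


print(glean_to_spu("GLEAN3_00001"))  # SPU_000001
```

Because the consensus set stops at 28,944 genes, every identifier produced this way falls below SPU_030000, which matches the convention that higher SPU numbers come from other sources.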

Transcriptome

This track shows the position and structure of expressed gene sequences read from cDNA samples collected from 10 different embryonic stages, 6 feeding larval and metamorphosed juvenile stages, and 6 adult tissues. The reads were produced on an Illumina sequencer using a paired-end strategy and assembled using the Bowtie-TopHat-Cufflinks software package.

Transcriptome Read Coverage

This track shows an X-Y plot of the coverage of the 621 million transcriptome reads mapped to the genome.
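
To illustrate what such a coverage plot represents, the sketch below computes per-base read depth from a list of mapped read intervals. It is a toy example, assuming reads are given as 0-based, half-open (start, end) pairs on a single scaffold; the function name and input format are hypothetical, and this is not the pipeline that was used to build the track.

```python
from typing import List, Tuple


def coverage(reads: List[Tuple[int, int]], length: int) -> List[int]:
    """Return the per-base read depth for a scaffold of the given length."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        # Clip reads that hang over either end of the scaffold.
        start = max(start, 0)
        end = min(end, length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    # A running sum of the differences gives the depth at each position.
    depth = []
    running = 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


print(coverage([(0, 4), (2, 6), (5, 8)], 8))  # [1, 1, 2, 2, 1, 2, 1, 1]
```

Plotting the returned list against position gives the X-Y view: the X axis is the scaffold coordinate and the Y axis is the number of reads covering that base.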

Lytechinus variegatus

This track shows the position of conserved non-coding sequence patches between the S. purpuratus genome and the L. variegatus genome. These two genomes shared a common ancestor 50 million years ago. In the time since divergence, only the functional sequences are likely to have remained conserved, as the rest changes due to genetic drift and small insertions and deletions.

AF reads

This track shows the position of Roche 454 reads from Allocentrotus fragilis (AF) mapped to the S. purpuratus genome. This comparison conforms to a rule that modestly distant species comparisons reveal regulatory modules because large indels (>20 bp) are statistically almost absent within the regulatory modules, although they are common in flanking intergenic or intronic sequence.

SF reads

This track shows the position of Roche 454 reads from Strongylocentrotus franciscanus (SF) mapped to the S. purpuratus genome. This comparison conforms to a rule that modestly distant species comparisons reveal regulatory modules because large indels (>20 bp) are statistically almost absent within the regulatory modules, although they are common in flanking intergenic or intronic sequence.

Inhouse Repeats

In-house repeats are predicted using our custom RepeatMasker pipeline. We have identified and annotated 3,019 transposable elements using this pipeline.

Repbase repeats

These are transposable elements identified by Repbase in the Strongylocentrotus purpuratus genome.

Low complexity repeats

We have identified low-complexity repeats in the genome using our custom RepeatMasker pipeline.

Lytechinus variegatus genome Tracks

The following tracks can be viewed in the genome browser for the Lytechinus variegatus genome:

Assembly contig

This track shows the position of a contig within the scaffold. A contig is a continuous sequence of DNA that has been assembled from overlapping cloned DNA fragments. It may include draft and finished sequence. It may contain sequence gaps (within a clone), but it does not include gaps between clones (http://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/).

MAKER2 predicted genes

We used the MAKER2 pipeline to predict genes, yielding 28,094 predicted genes. Briefly, all repeats were masked, a training set was generated from EST (isotig) and protein evidence alignments to the genome (scaffolds), and genes were then predicted using the ab initio predictors SNAP and AUGUSTUS.

Inhouse Repeats

In-house repeats are predicted using our custom RepeatMasker pipeline. We have identified and annotated 3,019 transposable elements using this pipeline.

Repbase repeats

These are transposable elements identified by Repbase in the Strongylocentrotus purpuratus genome.

Low complexity repeats

We have identified low-complexity repeats in the genome using our custom RepeatMasker pipeline.

Patiria miniata genome Tracks

The following tracks can be viewed in the genome browser for the Patiria miniata genome:

Assembly contig

This track shows the position of a contig within the scaffold. A contig is a continuous sequence of DNA that has been assembled from overlapping cloned DNA fragments. It may include draft and finished sequence. It may contain sequence gaps (within a clone), but it does not include gaps between clones (http://www.ncbi.nlm.nih.gov/genome/annotation_euk/process/).

MAKER2 predicted genes

We used the MAKER2 pipeline to predict genes, yielding 29,697 predicted genes. Briefly, all repeats were masked, a training set was generated from EST (isotig) and protein evidence alignments to the genome (scaffolds), and genes were then predicted using the ab initio predictors SNAP and AUGUSTUS.

Inhouse Repeats

In-house repeats are predicted using our custom RepeatMasker pipeline. We have identified and annotated 3,019 transposable elements using this pipeline.

Repbase repeats

These are transposable elements identified by Repbase in the Strongylocentrotus purpuratus genome.

Low complexity repeats

We have identified low-complexity repeats in the genome using our custom RepeatMasker pipeline.

Visualizing data

Users can view genome annotations against a reference “ruler,” with an overhead bar giving a visual indication of scaffold position. The user can navigate by dragging the display left or right (which creates a smooth panning effect, with no page reload) or by clicking on navigation buttons; analogous buttons allow the display to be zoomed in or out (again, this is a smooth effect, with no page reload). Alternatively, users can navigate directly to a region (or feature) of interest by typing the region coordinates (or feature name) into a search box. Additional annotation tracks can be added to the current display by dragging them from a reservoir bar on the left of the screen, and can be removed by dragging them back off the main display. Tracks can similarly be reordered by dragging. All track manipulation, as with navigation, is live and requires no page reloads.
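
For readers who want to prepare region coordinates programmatically, the sketch below parses a region string and builds a link to the browser. It rests on several assumptions: that regions are written as scaffold:start..end, that the instance is a JBrowse 1 style installation honouring the loc and tracks query parameters, and that the base URL and track labels shown are placeholders. The function names are hypothetical.

```python
import re
from urllib.parse import urlencode

# Assumed region syntax: "<scaffold>:<start>..<end>", e.g. "Scaffold1:1000..2000".
_REGION_RE = re.compile(r"^(?P<ref>[^:\s]+):(?P<start>\d+)\.\.(?P<end>\d+)$")


def parse_region(text):
    """Split a region string into (scaffold, start, end)."""
    # Thousands separators such as "1,000" are tolerated and removed.
    match = _REGION_RE.match(text.strip().replace(",", ""))
    if match is None:
        raise ValueError(f"cannot parse region: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError("start must not exceed end")
    return match["ref"], start, end


def browser_url(base_url, region, tracks):
    """Build a link that opens the browser at a region with the given tracks."""
    ref, start, end = parse_region(region)
    query = urlencode({"loc": f"{ref}:{start}..{end}", "tracks": ",".join(tracks)})
    return f"{base_url}?{query}"


print(parse_region("Scaffold1:1,000..2,000"))  # ('Scaffold1', 1000, 2000)
# Placeholder base URL and track labels, for illustration only.
print(browser_url("https://example.org/jbrowse/index.html",
                  "Scaffold1:1000..2000", ["EST", "Transcriptome"]))
```

Validating the string before building the link catches malformed coordinates early, which is useful when regions come from a spreadsheet or another tool's output.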

Screenshot of JBrowse, illustrating the various parts of the screen. The navigation panel (a) includes an overview of the genome and the current location, together with navigation buttons for panning and zooming, a menu for selecting the current chromosome, and a text box for navigating directly to coordinates or named features. Below this are the currently active tracks, which can include feature tracks (b) or bar graph image tracks for displaying quantitative information (c). Inactive tracks are held in the reservoir bar on the left and can be dragged to the active track area, upon which they will be expanded.