



Input files

Put the following files in the directory from which you run the OligoViz jar.


An output file from the Java program OligoCounter. Be sure to put only one file per RefSeq identifier in the working directory.

Recommended: The corresponding .GFF file from the NCBI RefSeq collection. It contains the positions of coding regions, which allows OligoViz to create two-colour graphics. If no GFF file is found, a black and white graph is drawn instead.
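As an illustration of the coding positions a GFF file carries, the following minimal Python sketch reads the start and end of each CDS feature from the standard nine tab-separated GFF columns. It is not OligoViz's own parser: the function name `read_cds_intervals`, the choice of the `CDS` feature type, the default of keeping only `+` strand features, and the example records are all assumptions made for this sketch.

```python
import io

def read_cds_intervals(handle, strand="+"):
    """Return sorted (start, end) pairs, 1-based and inclusive, for CDS features.

    Assumption: coding regions are the rows whose third column is "CDS".
    Pass strand=None to keep features on both strands.
    """
    intervals = []
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip comment/directive lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GFF record
        if fields[2] == "CDS" and (strand is None or fields[6] == strand):
            intervals.append((int(fields[3]), int(fields[4])))
    return sorted(intervals)

# Made-up records for illustration only (not taken from a real RefSeq file).
example = io.StringIO(
    "##gff-version 3\n"
    "NC_000000.1\tRefSeq\tgene\t100\t400\t.\t+\t.\tID=gene0\n"
    "NC_000000.1\tRefSeq\tCDS\t100\t400\t.\t+\t0\tID=cds0\n"
    "NC_000000.1\tRefSeq\tCDS\t900\t1200\t.\t-\t0\tID=cds1\n"
)
cds = read_cds_intervals(example)
print(cds)
```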


Running OligoViz

Run OligoViz from the command line with 400 megabytes of memory:

java -Xmx400m -jar OligoViz.jar

You can run the jar file by double-clicking it on some Windows PCs; however, the memory assigned by default is not sufficient to run the program properly. It is therefore best to run it from the command line with the -Xmx400m switch, which raises the maximum Java heap size to 400 megabytes.

Using OligoViz 

A drop down list selects the genome by its RefSeq identifier.

A dot is printed in each 10 kbp region if an oligo is present there. A black dot indicates that the instance of the oligo lies in a non-coding region, while a green dot is painted for instances in coding regions.

If the figure is clicked, the genome position, to the nearest 10 kbp, is output to the top-right dialog box.
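The 10 kbp regions and dot colours described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: regions are taken to start at multiples of 10,000 and positions are rounded down into them, which may differ from how OligoViz itself rounds. The helper names `bin_start`, `occupied_bins`, `in_coding` and `dots` and the example positions are hypothetical.

```python
BIN_SIZE = 10_000  # 10 kbp, the region size stated in the text

def bin_start(position, bin_size=BIN_SIZE):
    """Start of the region containing `position` (assumption: round down)."""
    return (position // bin_size) * bin_size

def occupied_bins(positions, bin_size=BIN_SIZE):
    """Sorted starts of all regions that contain at least one instance."""
    return sorted({bin_start(p, bin_size) for p in positions})

def in_coding(position, cds_intervals):
    """True if `position` falls inside any (start, end) coding interval."""
    return any(start <= position <= end for start, end in cds_intervals)

def dots(positions, cds_intervals, bin_size=BIN_SIZE):
    """One (region start, colour) pair per instance: green if coding, else black."""
    return [
        (bin_start(p, bin_size), "green" if in_coding(p, cds_intervals) else "black")
        for p in positions
    ]

# Made-up instance positions for illustration only.
positions = [1_234, 9_999, 10_000, 25_431, 2_222_000]
print(occupied_bins(positions))  # [0, 10000, 20000, 2220000]
```

With an empty list of coding intervals every dot comes out black, which mirrors the black and white fallback used when no GFF file is present.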

Divergent regions tend to be occupied by fewer or no overrepresented oligos and appear as horizontal lines.

Two different tab-separated files can be created. These are best opened with WordPad or MS Word (not Notepad!) under Windows, or with any text editor under Linux.

The "Save distances" button saves a data level summary to the working directory, listing each oligo and the average distance between its instances, i.e. the positions of the oligo. Pushing this button creates a text file like this


in the current directory. This file contains all overrepresented oligos, the log (base 10) of the average distance between all instances of each oligo, and the average distance in nucleotides between all instances. These distances allow an estimate of how much space lies between instances of an oligo in the genome: repeats tend to have large average distances, while coding oligos tend to be more evenly distributed, i.e. with a smaller average distance, as in the images.

>gi|18311643|ref|NC_003364.1| Pyrobaculum aerophilum str. IM2, complete genome
Genome size: 2222430
Oligo LogAverageDist AverageDist
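To make the two distance columns concrete, here is a minimal Python sketch of one possible reading of them. The text does not say exactly how OligoViz computes the average, so two assumptions are made: the average is the mean gap between consecutive instances, and LogAverageDist is the base-10 logarithm of that mean. The function name `average_distance` and the example positions are hypothetical.

```python
import math

def average_distance(positions):
    """Mean gap between consecutive instances (assumed definition).

    Requires at least two instances of the oligo.
    """
    ordered = sorted(positions)
    if len(ordered) < 2:
        raise ValueError("need at least two instances")
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)

# Made-up instance positions for illustration only.
positions = [1_000, 4_000, 10_000, 31_000]
avg = average_distance(positions)   # gaps are 3000, 6000 and 21000
log_avg = math.log10(avg)           # assumed meaning of LogAverageDist
print(f"{log_avg:.2f}\t{avg:.1f}")
```

Under this reading the mean gap always equals (last position - first position) / (number of instances - 1). If the program instead averages over all pairs of instances, the values would differ.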

The "Save coding" button works only if a .GFF file containing information on coding regions is in the same directory as the current files. Pushing this button creates a text file called


or similar. Parameters in this file include the oligo, the total number of instances of this oligo found in the genome, and the number of instances in coding regions. From these the percent coding is calculated. The percentage of coding oligos that are out of frame is also derived, using the start codon positions of the coding regions; reading frames are thus all relative to the respective start codon. The number of coding oligos found in each of the three forward reading frames is also presented. OligoCounter only deals with the direct DNA strand. Frame1 indicates "in frame", i.e. the oligos which appear to be directly translated.

>gi|20088899|ref|NC_003552.1| Methanosarcina acetivorans C2A, complete genome
Genome size: 5751492
Oligo totalOligos codingOligos pcCoding pcCodingOutOfFrame Frame1 Frame2 Frame3
TTCTTTTT 1241 713 57.5 45.7 387 204 122
AAAAAGAA 1134 623 54.9 68.9 194 297 132
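The percentage columns in the two example rows above can be reproduced from the counts. The Python sketch below does so, assuming that codingOligos is the sum of the three frame counts and that "out of frame" means Frame2 plus Frame3; both assumptions agree with the two rows shown. The `reading_frame` helper is a guess at how a frame could be assigned relative to a start codon and is not taken from OligoViz. All function names are hypothetical.

```python
def coding_summary(total, frame1, frame2, frame3):
    """Return (codingOligos, pcCoding, pcCodingOutOfFrame) from raw counts."""
    coding = frame1 + frame2 + frame3
    pc_coding = 100.0 * coding / total
    # Guard against oligos with no coding instances.
    pc_out = 100.0 * (frame2 + frame3) / coding if coding else 0.0
    return coding, round(pc_coding, 1), round(pc_out, 1)

def reading_frame(oligo_start, cds_start):
    """Frame 1, 2 or 3 of an oligo relative to a start codon (assumed rule)."""
    return (oligo_start - cds_start) % 3 + 1

# Counts copied from the two example rows above.
print(coding_summary(1241, 387, 204, 122))  # (713, 57.5, 45.7)
print(coding_summary(1134, 194, 297, 132))  # (623, 54.9, 68.9)
```

A check like this can be useful for confirming that a saved file was read with the columns in the expected order.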