org.biojava.bio.alignment
Class SubstitutionMatrix

java.lang.Object
  extended by org.biojava.bio.alignment.SubstitutionMatrix

public class SubstitutionMatrix
extends java.lang.Object

This class reads a substitution matrix file and builds a matrix of short values in memory. Each element of the matrix can be retrieved with the method getValueAt, which takes two BioJava symbols as parameters, so there is no need to access the matrix directly. If the matrix holds no value for the two specified symbols, getValueAt throws a BioException.

Substitution matrix files are available from the NCBI FTP directory.

Author:
Andreas Dräger

Field Summary
protected  FiniteAlphabet alphabet
           
protected  java.util.Map<Symbol,java.lang.Integer> colSymbols
           
protected  java.lang.String description
           
protected  short[][] matrix
           
protected  short max
           
protected  short min
           
protected  java.lang.String name
           
protected  java.util.Map<Symbol,java.lang.Integer> rowSymbols
           
 
Constructor Summary
SubstitutionMatrix(java.io.File file)
          Reads a substitution matrix from the given file and guesses its alphabet.
SubstitutionMatrix(FiniteAlphabet alpha, java.io.File matrixFile)
          Constructs a SubstitutionMatrix object from a matrix file; the object holds two Map data structures that map BioJava symbols to the row and column indices of the matrix containing the substitution scores.
SubstitutionMatrix(FiniteAlphabet alpha, short match, short replace)
          Constructs a SubstitutionMatrix in which every match has the same score and every replacement has the same score, as given by the parameters.
SubstitutionMatrix(FiniteAlphabet alpha, java.lang.String matrixString, java.lang.String name)
          Constructs a SubstitutionMatrix object from a String holding the contents of a substitution matrix file.
 
Method Summary
 FiniteAlphabet getAlphabet()
          Gives the alphabet used by this matrix.
 java.lang.String getDescription()
          This gives you the description of this matrix if there is one.
 short getMax()
          The maximum score in this matrix.
 short getMin()
          The minimum score of this matrix.
 java.lang.String getName()
          Every substitution matrix has a name like "BLOSUM30" or "PAM160".
static SubstitutionMatrix getSubstitutionMatrix(java.io.BufferedReader reader)
          Reads a substitution matrix from the given reader and guesses its alphabet.
 short getValueAt(Symbol row, Symbol col)
          Returns the substitution score for the given pair of symbols.
 SubstitutionMatrix normalizeMatrix()
          With this method you can get a “normalized” SubstitutionMatrix object; however, since this implementation uses a short matrix, the normalized matrix will be scaled by ten.
 void printMatrix()
          Prints the matrix to the screen; intended for testing only.
 void setDescription(java.lang.String desc)
          Sets the description to the given value.
 java.lang.String stringnifyDescription()
          Converts the description of the matrix to a String.
 java.lang.String stringnifyMatrix()
          Creates a String representation of this matrix.
 java.lang.String toString()
          Overrides the inherited method.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

rowSymbols

protected java.util.Map<Symbol,java.lang.Integer> rowSymbols

colSymbols

protected java.util.Map<Symbol,java.lang.Integer> colSymbols

matrix

protected short[][] matrix

min

protected short min

max

protected short max

alphabet

protected FiniteAlphabet alphabet

description

protected java.lang.String description

name

protected java.lang.String name
Constructor Detail

SubstitutionMatrix

public SubstitutionMatrix(FiniteAlphabet alpha,
                          java.io.File matrixFile)
                   throws BioException,
                          java.lang.NumberFormatException,
                          java.io.IOException
Constructs a SubstitutionMatrix object from a matrix file. The object holds two Map data structures that map BioJava symbols to the row and column indices of the matrix containing the substitution scores.

Parameters:
alpha - the alphabet of the matrix (e.g., DNA, RNA, PROTEIN, or PROTEIN-TERM)
matrixFile - the file containing the substitution matrix. Lines starting with '#' are comments. The line starting with whitespace is the table header. Every other line has to start with the one-letter representation of a Symbol, followed by the substitution values for that symbol.
Throws:
java.io.IOException
BioException
java.lang.NumberFormatException
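
The file layout described for matrixFile can be illustrated with a minimal, self-contained sketch. It does not use BioJava and is not the implementation of this class: the class name MatrixFormatSketch, its fields, and the use of plain characters in place of BioJava Symbol objects are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the parsing step; not part of BioJava. */
public class MatrixFormatSketch {
    // Symbol (here: a plain char) -> row or column index, mirroring rowSymbols/colSymbols.
    public final Map<Character, Integer> rowIndex = new LinkedHashMap<>();
    public final Map<Character, Integer> colIndex = new LinkedHashMap<>();
    public short[][] scores;

    public static MatrixFormatSketch parse(String text) {
        MatrixFormatSketch m = new MatrixFormatSketch();
        List<short[]> rows = new ArrayList<>();
        for (String line : text.split("\r?\n")) {
            if (line.startsWith("#") || line.trim().isEmpty()) {
                continue; // comment or blank line
            }
            String[] tokens = line.trim().split("\\s+");
            if (Character.isWhitespace(line.charAt(0))) {
                // Table header: one column symbol per token.
                for (String t : tokens) {
                    m.colIndex.put(t.charAt(0), m.colIndex.size());
                }
            } else {
                // Data row: one-letter symbol followed by its scores.
                m.rowIndex.put(tokens[0].charAt(0), m.rowIndex.size());
                short[] row = new short[tokens.length - 1];
                for (int i = 1; i < tokens.length; i++) {
                    row[i - 1] = Short.parseShort(tokens[i]);
                }
                rows.add(row);
            }
        }
        m.scores = rows.toArray(new short[0][]);
        return m;
    }

    public short valueAt(char row, char col) {
        return scores[rowIndex.get(row)][colIndex.get(col)];
    }

    public static void main(String[] args) {
        String text = "# toy matrix, made up for this example\n"
                    + "   A  C\n"
                    + "A  1 -1\n"
                    + "C -1  1\n";
        MatrixFormatSketch m = parse(text);
        System.out.println(m.valueAt('A', 'C'));
    }
}
```

A malformed number in a data row surfaces here as a NumberFormatException from Short.parseShort, which matches the exception declared by the constructor.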

SubstitutionMatrix

public SubstitutionMatrix(FiniteAlphabet alpha,
                          java.lang.String matrixString,
                          java.lang.String name)
                   throws BioException,
                          java.lang.NumberFormatException,
                          java.io.IOException
Constructs a SubstitutionMatrix object from a String holding the contents of a substitution matrix file. The given String contains a number of lines separated by System.getProperty("line.separator"). Everything else is the same as for the constructor above.

Parameters:
alpha - the FiniteAlphabet to use
matrixString - the substitution matrix as a String, in the same format as a matrix file
name - the name of the matrix
Throws:
BioException
java.io.IOException
java.lang.NumberFormatException

SubstitutionMatrix

public SubstitutionMatrix(FiniteAlphabet alpha,
                          short match,
                          short replace)
Constructs a SubstitutionMatrix in which every match has the same score and every replacement has the same score, as given by the parameters. Ambiguous symbols are not considered because there might be too many of them (for proteins).

Parameters:
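alpha - the alphabet of the matrix
match - the score assigned to every match
replace - the score assigned to every replacement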
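
The resulting matrix can be pictured with a small stand-alone sketch. It assumes that a "match" means two identical symbols (the diagonal) and a "replace" means any other pair; UniformMatrixSketch and uniform are hypothetical names, and the alphabet is reduced to its size.

```java
import java.util.Arrays;

/** Hypothetical illustration of a uniform match/replace matrix; not part of BioJava. */
public class UniformMatrixSketch {
    /** Builds an n x n matrix with match on the diagonal and replace everywhere else. */
    public static short[][] uniform(int alphabetSize, short match, short replace) {
        short[][] m = new short[alphabetSize][alphabetSize];
        for (int i = 0; i < alphabetSize; i++) {
            for (int j = 0; j < alphabetSize; j++) {
                m[i][j] = (i == j) ? match : replace;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        // Four unambiguous symbols, as in a DNA alphabet.
        short[][] dna = uniform(4, (short) 1, (short) -1);
        System.out.println(Arrays.deepToString(dna));
    }
}
```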

SubstitutionMatrix

public SubstitutionMatrix(java.io.File file)
                   throws java.lang.NumberFormatException,
                          java.util.NoSuchElementException,
                          BioException,
                          java.io.IOException
Reads a substitution matrix from the given file and guesses its alphabet. If the alphabet is known, it is recommended to use another constructor instead.

Parameters:
file - A file containing a substitution matrix.
Throws:
java.lang.NumberFormatException
java.util.NoSuchElementException
BioException
java.io.IOException
Method Detail

getSubstitutionMatrix

public static SubstitutionMatrix getSubstitutionMatrix(java.io.BufferedReader reader)
                                                throws java.lang.NumberFormatException,
                                                       BioException,
                                                       java.io.IOException
Reads a substitution matrix from the given reader and guesses its alphabet. If the alphabet is known, it is recommended to use a constructor that takes the alphabet instead.

Parameters:
reader - a reader over the contents of a substitution matrix file
Throws:
java.lang.NumberFormatException
BioException
java.io.IOException

getValueAt

public short getValueAt(Symbol row,
                        Symbol col)
                 throws BioException
Returns the substitution score for the given pair of symbols. Some substitution matrices contain more columns than rows because of the ambiguous symbols. The rows are always complete, whereas the columns might not contain all of the information. Because the matrix is supposed to be symmetric anyway, you can always pass the ambiguous symbol as the first argument.

Parameters:
row - Symbol of the row
col - Symbol of the column
Returns:
the score for exchanging symbol row with symbol col.
Throws:
BioException
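
The advice about symmetry can be sketched as a lookup that retries with swapped arguments when a pair is missing. This is only an illustration under assumptions: LookupSketch and symmetricValueAt are hypothetical names, plain characters stand in for Symbol objects, IllegalArgumentException stands in for BioException, and the toy layout (a symbol N that appears as a row but not as a column) is made up.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lookup illustration; not part of BioJava. */
public class LookupSketch {
    private final Map<Character, Integer> rows = new HashMap<>();
    private final Map<Character, Integer> cols = new HashMap<>();
    private final short[][] scores;

    public LookupSketch(String rowSymbols, String colSymbols, short[][] scores) {
        for (int i = 0; i < rowSymbols.length(); i++) {
            rows.put(rowSymbols.charAt(i), i);
        }
        for (int j = 0; j < colSymbols.length(); j++) {
            cols.put(colSymbols.charAt(j), j);
        }
        this.scores = scores;
    }

    /** Direct lookup; fails if either symbol has no index in its dimension. */
    public short valueAt(char row, char col) {
        Integer r = rows.get(row);
        Integer c = cols.get(col);
        if (r == null || c == null) {
            throw new IllegalArgumentException("No score for " + row + "/" + col);
        }
        return scores[r][c];
    }

    /** Assumes a symmetric matrix: if (a, b) is missing, try (b, a) instead. */
    public short symmetricValueAt(char a, char b) {
        try {
            return valueAt(a, b);
        } catch (IllegalArgumentException e) {
            return valueAt(b, a);
        }
    }

    public static void main(String[] args) {
        short[][] toy = {
            { 1, -1 },  // A
            { -1, 1 },  // C
            { 0, 0 }    // N (ambiguous), present as a row only
        };
        LookupSketch m = new LookupSketch("ACN", "AC", toy);
        System.out.println(m.symmetricValueAt('A', 'N'));
    }
}
```

With the real class, the same pattern would amount to catching the BioException thrown by getValueAt and calling it again with the arguments reversed.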

getDescription

public java.lang.String getDescription()
This gives you the description of this matrix if there is one. Substitution matrix files such as BLOSUM normally contain a few lines of description.

Returns:
the description of the matrix

getName

public java.lang.String getName()
Every substitution matrix has a name like "BLOSUM30" or "PAM160". This will be returned by this method.

Returns:
the name of the matrix.

getMin

public short getMin()
The minimum score of this matrix.

Returns:
minimum of the matrix.

getMax

public short getMax()
The maximum score in this matrix.

Returns:
maximum of the matrix.

setDescription

public void setDescription(java.lang.String desc)
Sets the description to the given value.

Parameters:
desc - a description. This doesn't have to start with '#'.

getAlphabet

public FiniteAlphabet getAlphabet()
Gives the alphabet used by this matrix.

Returns:
the alphabet of this matrix.

stringnifyMatrix

public java.lang.String stringnifyMatrix()
Creates a String representation of this matrix.

Returns:
a string representation of this matrix without the description.

stringnifyDescription

public java.lang.String stringnifyDescription()
Converts the description of the matrix to a String.

Returns:
the description with approximately 60 characters per line, the lines being separated by System.getProperty("line.separator"). Every line starts with '#'.
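
The shape of the returned text can be sketched as follows. The sketch assumes that lines are broken at word boundaries, which the description above does not state; DescriptionWrapSketch and wrap are hypothetical names.

```java
/** Hypothetical illustration of the '#'-prefixed, wrapped description; not part of BioJava. */
public class DescriptionWrapSketch {
    /** Wraps the text at word boundaries so that no line exceeds width, unless a single word does. */
    public static String wrap(String description, int width) {
        if (description.trim().isEmpty()) {
            return "";
        }
        String sep = System.getProperty("line.separator");
        StringBuilder out = new StringBuilder();
        StringBuilder line = new StringBuilder("#");
        for (String word : description.trim().split("\\s+")) {
            // Start a new comment line when the next word would not fit.
            if (line.length() > 1 && line.length() + 1 + word.length() > width) {
                out.append(line).append(sep);
                line = new StringBuilder("#");
            }
            line.append(' ').append(word);
        }
        out.append(line).append(sep);
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "A made-up description that is long enough to need more "
                    + "than one comment line when wrapped at sixty characters.";
        System.out.print(wrap(text, 60));
    }
}
```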

toString

public java.lang.String toString()
Overrides the inherited method.

Overrides:
toString in class java.lang.Object
Returns:
a string representation of the SubstitutionMatrix, including the description of the matrix if there is one. The result is valid input for the constructor that takes a matrix string.

printMatrix

public void printMatrix()
Prints the matrix to the screen. Intended for testing only.


normalizeMatrix

public SubstitutionMatrix normalizeMatrix()
                                   throws BioException,
                                          java.lang.NumberFormatException,
                                          java.io.IOException
With this method you can get a “normalized” SubstitutionMatrix object; however, since this implementation uses a short matrix, the normalized matrix will be scaled by ten. If you need values between zero and one, you have to divide every value returned by getValueAt by ten.

Returns:
a new, normalized SubstitutionMatrix object derived from this substitution matrix. Because it uses a short matrix, all values are scaled by 10.
Throws:
BioException
java.io.IOException
java.lang.NumberFormatException
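
The scaling by ten can be illustrated with a stand-alone sketch. How this class normalizes is not stated above, so the sketch assumes min-max normalization, (value - min) / (max - min), followed by rounding to the nearest integer; NormalizeSketch and normalizeTimesTen are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical illustration of a normalized matrix scaled by ten; not part of BioJava. */
public class NormalizeSketch {
    /** Maps every score to the range 0..10 (assumed min-max normalization, times ten, rounded). */
    public static short[][] normalizeTimesTen(short[][] m) {
        short min = Short.MAX_VALUE;
        short max = Short.MIN_VALUE;
        for (short[] row : m) {
            for (short v : row) {
                if (v < min) min = v;
                if (v > max) max = v;
            }
        }
        short[][] out = new short[m.length][];
        for (int i = 0; i < m.length; i++) {
            out[i] = new short[m[i].length];
            for (int j = 0; j < m[i].length; j++) {
                // A constant matrix has no range; map it to zero to avoid dividing by zero.
                double unit = (max == min) ? 0.0 : (m[i][j] - min) / (double) (max - min);
                out[i][j] = (short) Math.round(unit * 10);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        short[][] toy = { { 4, -2 }, { -2, 1 } };
        short[][] scaled = normalizeTimesTen(toy);
        System.out.println(Arrays.deepToString(scaled));
        // Dividing by ten recovers a value between zero and one.
        System.out.println(scaled[1][1] / 10.0);
    }
}
```

Storing the scaled values as short keeps one decimal digit of the normalized score, which is why callers divide by ten afterwards.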