RUNE LYNGSOE |
Domain Finding in the Human Genome |
Keywords: Human genome, protein domains, clustering, hidden Markov models |
With the human genome project (as well as other genome
projects) nearing completion, the biomolecular community will soon
have huge amounts of raw sequence data on its hands. Some of this data
will consist of known and well-studied genes and sequences. Some of it
can be classified based on similarity to other well-known sequences. But
a significant part of the genome will be of unknown content. At UCSC, the
task of predicting genes in the human genome has recently been undertaken.
The aim of this project is to further process the newly predicted
genes by clustering genes with shared protein domains. For each cluster
discovered, a hidden Markov model representation will be constructed, and
its validity will be assessed by hidden Markov model comparison techniques. |