The HMMerThread database stores remotely conserved domains for entire genomes. By combining sensitive domain searches (HMMer) against Pfam domains that have associated 3D structures with a subsequent threading step (Threader 3.5), the HMMerThread tool can identify domain hits that show only very remote sequence conservation. By applying HMMerThread as a batch system to entire genomes, we are able to provide remote domain information for all proteins within an organism. Currently available genomes include human, mouse, fly, worm and the fungi S. cerevisiae and S. pombe.
Reference: Bradshaw CR & Habermann B (in preparation).



Pipeline of HMMerThread:

  • HMMerThread starts by performing an HMMer search against those Pfam domains that have an associated 3D structure
  • domains identified in this search are then selected for processing by the threading pipeline
  • pre-processing for threading includes masking of low-complexity regions (SEG) and coiled-coil regions (COILS), followed by secondary structure prediction (PSIPRED); for accuracy, all pre-processing steps are executed on the full-length sequence
  • pre-processed conserved domains are then passed to the threading step, which uses Threader 3.5
  • finally, results are collected and stored in the HMMerThread database.
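
The pre-processing order described above (mask the full-length sequence first, then cut out the domain) can be illustrated with a minimal sketch. This is not the HMMerThread implementation: the function names `mask_regions` and `extract_masked_domain`, the 1-based inclusive coordinates, and the use of `X` as the mask character are all assumptions made for illustration, and the masked intervals are assumed to have already been obtained from tools such as SEG and COILS.

```python
from typing import List, Tuple


def mask_regions(sequence: str, regions: List[Tuple[int, int]], mask_char: str = "X") -> str:
    """Replace residues in the given 1-based, inclusive intervals with mask_char.

    The intervals stand in for low-complexity or coiled-coil regions
    (hypothetical input; no external tool is called here).
    """
    residues = list(sequence)
    for start, end in regions:
        # Clamp each interval to the sequence bounds before masking.
        for i in range(max(start, 1) - 1, min(end, len(residues))):
            residues[i] = mask_char
    return "".join(residues)


def extract_masked_domain(sequence: str, domain: Tuple[int, int],
                          regions: List[Tuple[int, int]]) -> str:
    """Mask the full-length sequence first, then slice out the domain.

    `domain` is the 1-based, inclusive (start, end) of a domain hit on
    the full-length protein (assumed coordinate convention).
    """
    masked_full = mask_regions(sequence, regions)
    start, end = domain
    return masked_full[start - 1:end]


# Toy example: a 10-residue sequence, a domain hit at positions 3-8,
# and one masked interval at positions 5-6.
example = extract_masked_domain("ABCDEFGHIJ", (3, 8), [(5, 6)])
print(example)  # CDXXGH
```

Masking before slicing keeps all coordinates in the frame of the full-length protein, so intervals reported for the whole sequence can be applied without being shifted to domain-relative positions.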







HMMerThread protein entry page.
Alongside bona fide domains (blue bars and CDD Domains table), the HMMerThread database shows those with low sequence conservation that were identified only by HMMerThread (green bar and HMMerThread Hits table). In this case, the protein Sorting Nexin 1 (SNX1) is shown, which contains a Sorting_nexin and a PX domain, followed by a C-terminal BAR domain (weak domain hit). To evaluate the quality of the domain hit, the HMMerThread alignment is presented (based on the HMMer search). In addition to the domain information, the HMMerThread database stores literature and GeneRIF information, as well as the Summary taken from the NCBI RefSeq database (middle). The entry is directly linked to NCBI (human, mouse, S. pombe), FlyBase (fly), WormBase (worm) and SGD (yeast). Orthology information (based on HomoloGene and extended by reciprocal BLASTP searches) is shown with links to HomoloGene and the respective protein entry in the HMMerThread database (left). Gene Ontology information, including Process, Function and Compartment, as well as Synonyms for the protein/gene, is given on the right.