<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1745-7580-3-4</ui>
   <ji>1745-7580</ji>
   <fm>
      <dochead>Database</dochead>
      <bibl>
         <title>
            <p>ImmTree: Database of evolutionary relationships of genes and proteins in the human immune system</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Ortutay</snm>
               <fnm>Csaba</fnm>
               <insr iid="I1"/>
               <email>csaba.ortutay@uta.fi</email>
            </au>
            <au id="A2">
               <snm>Siermala</snm>
               <fnm>Markku</fnm>
               <insr iid="I1"/>
               <email>markku.siermala@luukku.com</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Vihinen</snm>
               <fnm>Mauno</fnm>
               <insr iid="I1"/>
               <insr iid="I2"/>
               <email>mauno.vihinen@uta.fi</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Institute of Medical Technology, FI-33014 University of Tampere, Finland</p>
            </ins>
            <ins id="I2">
               <p>Research Unit, Tampere University Hospital, FI-33520 Tampere, Finland</p>
            </ins>
         </insg>
         <source>Immunome Research</source>
         <issn>1745-7580</issn>
         <pubdate>2007</pubdate>
         <volume>3</volume>
         <issue>1</issue>
         <fpage>4</fpage>
         <url>http://www.immunome-research.com/content/3/1/4</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17376226</pubid>
               <pubid idtype="doi">10.1186/1745-7580-3-4</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>12</day>
               <month>2</month>
               <year>2007</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>21</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>21</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Ortutay et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>The immune system, which is a complex machinery, is based on the highly coordinated expression of a wide array of genes and proteins. The evolutionary history of the human immune system is not well characterised. Although several studies related to the development and evolution of immunological processes have been published, a full-scale genome-based analysis is still missing. A database focused on the evolutionary relationships of immune related genes would contribute to and facilitate research on immunology and evolutionary biology.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>An Internet resource called ImmTree <url>http://bioinf.uta.fi/ImmTree</url> was constructed for studying the evolution and evolutionary trees of the human immune system. ImmTree contains information about orthologs in 80 species collected from the HomoloGene, OrthoMCL and EGO databases. In addition to phylogenetic trees, the service provides data for the comparison of human-mouse ortholog pairs, including synonymous and non-synonymous mutation rates, Z values, and K<sub>a</sub>/K<sub>s </sub>quotients. A versatile search engine allows complex queries from the database. Currently, data is available for 847 human immune system related genes and proteins.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>ImmTree provides a unique data set of genes and proteins from the human immune system, their phylogenetics, and information for comparisons of human-mouse ortholog pairs, synonymous and non-synonymous mutation rates, as well as other statistical information.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The immune system is a very complex machinery that has evolved and diversified over time. Numerous processes are necessary for mounting adaptive and innate immune responses to protect an individual from invading organisms and molecules. Acquired and congenital problems in almost any part of the immune system can lead to diseases, many of which are very severe or even life threatening. The different processes and pathways of the immune system have evolved gradually and become increasingly complex. More ancient innate or intrinsic immunity has been further complemented by adaptive processes, which provide a specific response when required.</p>
         <p>Although intensively studied, the evolutionary history of this system is not well known. The evolution of certain immunological protein groups of the human immunome has already been studied. For example, five gene groups of the NF-&#954;B signaling pathway in vertebrates and insects <abbrgrp><abbr bid="B1">1</abbr></abbrgrp> and the evolution of the interleukin-1 protein family in vertebrates <abbrgrp><abbr bid="B2">2</abbr></abbrgrp> have been studied extensively. To explore the molecular evolution of the human immune system, a reference set of genes and proteins needs to be defined <abbrgrp><abbr bid="B3">3</abbr></abbrgrp>. We have identified and collected genes and proteins essential for human immunity, and a genome-wide investigation of the evolution of these genes has been carried out <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. Here, we describe a database for the evolutionary trees of proteins in the human immune system (ImmTree) <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. ImmTree contains information for orthologs of the human genes in 80 species, including all the major model organisms from Eukaryota. The evolutionary relationships of the orthologs are presented as phylogenetic trees. Further, ImmTree provides a unique data set for comparison of human-mouse ortholog pairs by the presented synonymous and non-synonymous mutation rates of the genes.</p>
      </sec>
      <sec>
         <st>
            <p>Construction and content</p>
         </st>
         <sec>
            <st>
               <p>Collecting human immune system related genes and proteins and their orthologs</p>
            </st>
            <p>From articles, textbooks and electronic sources we collected altogether 847 human genes that are involved in immunology related processes, or which are essential for the life of immunological cells and organs <abbrgrp><abbr bid="B4">4</abbr></abbrgrp>. The variable chains of the immunoglobulins (Igs), B and T cell receptors (BCRs and TCRs) and major histocompatibility complexes (MHCs) were not included since these proteins are not coded by conventionally structured genes but by gene fragments. These gene fragments and their products are already exclusively collected and listed in IMGT, the international ImMunoGeneTics information system at National Computer Centre of Higher Education <abbrgrp><abbr bid="B6">6</abbr></abbrgrp> and European Bioinformatics Institute <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>. ImmTree contains the genes and proteins that are required for processing these gene fragments. In the ImmTree database Entrez Gene <abbrgrp><abbr bid="B8">8</abbr></abbrgrp> identifiers were used to refer to genes. Protein sequences were downloaded from NCBI GenBank <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>. Ortholog sequences are from the Eukaryotic Gene Orthologs (EGO) <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, HomoloGene <abbrgrp><abbr bid="B11">11</abbr></abbrgrp> and OrthoMCL <abbrgrp><abbr bid="B12">12</abbr></abbrgrp> databases. HomoloGene contains groups of homologs for completely sequenced eukaryotic genomes, while EGO has (tentative) ortholog groups of the eukaryotic sequences in the TIGR sequence database. OrthoMCL contains sequences exclusively from 55 complete genomes and therefore the number of sequences from the different branches is limited. The releases used were EGO version 9.0, released 15 February 2005; HomoloGene build 50.1, released 25 July 2006; and OrthoMCL version 1.0, released 19 October 2005.</p>
            <p>The nucleotide sequences of ortholog groups were taken from EGO and the protein sequences from HomoloGene and OrthoMCL. The sequences were aligned using ClustalW <abbrgrp><abbr bid="B13">13</abbr></abbrgrp> with the default parameters. Phylogenetic trees were reconstructed for all three types of ortholog groups using the PAUP* program package <abbrgrp><abbr bid="B14">14</abbr></abbrgrp> when the group contained at least three sequences. We thus created three trees for most of the ortholog groups for the data from the three independent databases. A simple neighbour-joining method was applied if the ortholog group contained only three taxa, otherwise bootstrap analysis was applied with parsimony method, heuristic tree search, and 1000 replications. The number of bootstrap replicates was reduced to 100 in the case of OrthoMCL ortholog groups where more than 50 sequences were in the group. Similarly, the number of replicates was reduced even further, to 50, where the number of sequences exceeded 100. This was necessary due to computational time requirements, since some OrthoMCL groups contain numerous paralogs. In these cases, tree construction becomes very CPU intensive without any further phylogenetic advantage.</p>
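            <p>The rules above for choosing the tree-building method and the number of bootstrap replicates can be summarised in a short Python sketch. The function name, the source labels and the returned dictionary are hypothetical and only restate the thresholds given in the text; the trees themselves were built with PAUP*, which this sketch does not call.</p>

```python
def tree_settings(n_sequences, source):
    """Return the tree-building settings described in the text.

    n_sequences: number of sequences in the ortholog group.
    source: "EGO", "HomoloGene" or "OrthoMCL" (labels assumed for illustration).
    """
    if 3 > n_sequences:
        # Groups with fewer than three sequences received no tree.
        return None
    if n_sequences == 3:
        # Exactly three taxa: simple neighbour-joining, no bootstrap.
        return {"method": "neighbour-joining", "replicates": None}
    replicates = 1000
    if source == "OrthoMCL":
        # Large OrthoMCL groups get fewer replicates to limit CPU time.
        if n_sequences > 100:
            replicates = 50
        elif n_sequences > 50:
            replicates = 100
    return {"method": "parsimony", "search": "heuristic", "replicates": replicates}
```

            <p>For example, under these assumptions an OrthoMCL group of 60 sequences would be bootstrapped with 100 replicates, whereas an EGO group of the same size would keep the full 1000.</p>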
            <p>For a general overview of the ortholog groups, we generated a fourth tree. This tree represents protein sequences from all the species in any of the three datasets. Moreover, each species is represented by just one sequence, preventing the accumulation of identical sequences from multiple data sources. This way the large paralog groups from the OrthoMCL database are represented by just a single sequence.</p>
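            <p>The reduction to one sequence per species can be illustrated with a minimal Python sketch. The text does not state which sequence is kept when a species occurs in several datasets, so the priority order used here (HomoloGene, then OrthoMCL, then EGO) is purely an assumption for illustration, as are the function name and the record layout.</p>

```python
def one_sequence_per_species(records, source_order=("HomoloGene", "OrthoMCL", "EGO")):
    """Keep a single sequence per species.

    records: iterable of (source, species, sequence) tuples.
    source_order: assumed priority, earlier sources win; not taken from the paper.
    """
    rank = {name: i for i, name in enumerate(source_order)}
    chosen = {}
    # sorted() is stable, so the first record of the best-ranked source is kept.
    for source, species, sequence in sorted(records, key=lambda r: rank[r[0]]):
        chosen.setdefault(species, (source, sequence))
    return chosen
```

            <p>With this rule a large OrthoMCL paralog group contributes at most one sequence for each species to the overview tree.</p>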
            <p>The nucleotide sequences from the EGO database were translated to amino acids to align the representative protein sequences from the three databases. The translation was done in all six frames, and all six transcripts were aligned with the human protein sequence using bl2seq from the BLAST package <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>. Only the transcript with the longest identical stretch with the human ortholog was retained for further analysis. The protein sequences collected this way were aligned and phylogenetic trees were constructed as described above.</p>
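            <p>The six-frame translation and frame selection can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline used for ImmTree: the original comparison was made with bl2seq, whereas the sketch approximates the "longest identical stretch" by the longest common substring found with the standard library module difflib. All function names are hypothetical, and the standard genetic code is assumed.</p>

```python
from difflib import SequenceMatcher
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna, offset):
    """Translate dna starting at offset; unknown codons become 'X'."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frames(dna):
    """Yield (strand, offset, protein) for all six reading frames."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    for strand, seq in (("+", dna), ("-", reverse)):
        for offset in range(3):
            yield strand, offset, translate(seq, offset)


def longest_identical_stretch(a, b):
    """Length of the longest substring shared by a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def best_frame(dna, human_protein):
    """Return (stretch, strand, offset, protein) for the best-matching frame."""
    return max(
        (longest_identical_stretch(protein, human_protein), strand, offset, protein)
        for strand, offset, protein in six_frames(dna)
    )
```

            <p>Calling best_frame with an EGO nucleotide sequence and the human protein sequence returns the single translation that would be carried forward; ties are broken arbitrarily by tuple ordering in this sketch.</p>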
         </sec>
         <sec>
            <st>
               <p>Comparison of the human-mouse ortholog pairs</p>
            </st>
            <p>In 603 cases orthologs were present both in the mouse and human genome in the HomoloGene database. These pairs were further analysed in detail. The cDNA sequences of the human and mouse genes were translated to protein sequences and then aligned using the bl2seq program. The corresponding cDNA sequences were aligned based on the amino acid sequence alignment with proprietary Perl scripts, some of which utilize modules from the Bioperl Project <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>. The estimates of synonymous mutations per synonymous sites (K<sub>s </sub>or dS) and of non-synonymous mutations per non-synonymous sites (K<sub>a </sub>or dN) values were calculated <abbrgrp><abbr bid="B17">17</abbr></abbrgrp>. Z values and the K<sub>a</sub>/K<sub>s </sub>quotients describe the conservation of given genes since the human-mouse divergence.</p>
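            <p>Aligning cDNA sequences on the basis of a protein alignment amounts to replacing every aligned residue with its codon and every gap with three gap characters. The original work used Perl scripts; the Python sketch below is only an illustration of the idea. The function name is hypothetical, and the sketch assumes that the cDNA starts in frame, has no trailing stop codon, and encodes exactly the residues in the alignment row.</p>

```python
def thread_codons(aligned_protein, cdna):
    """Project a gapped protein alignment row onto its coding sequence.

    Every residue is replaced by its codon and every gap '-' by '---'.
    """
    codons = [cdna[i:i + 3] for i in range(0, len(cdna) - 2, 3)]
    n_residues = len(aligned_protein) - aligned_protein.count("-")
    if n_residues != len(codons):
        raise ValueError("protein row and cDNA disagree in length")
    remaining = iter(codons)
    row = []
    for residue in aligned_protein:
        row.append("---" if residue == "-" else next(remaining))
    return "".join(row)
```

            <p>Applied to both rows of a human-mouse protein alignment, this yields two codon-aligned cDNA rows of equal length, which is the form of input that methods for estimating K<sub>a</sub> and K<sub>s</sub> typically expect.</p>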
         </sec>
      </sec>
      <sec>
         <st>
            <p>Utility and Discussion</p>
         </st>
         <sec>
            <st>
               <p>Database access and search</p>
            </st>
            <p>The ImmTree database can be accessed online <abbrgrp><abbr bid="B5">5</abbr></abbrgrp>. The service provides two search modes. The first search page is an interface for finding human genes by GenBank gi numbers, GenBank accession numbers, or UniProt <abbrgrp><abbr bid="B18">18</abbr></abbrgrp> accession numbers. The other engine is for searching ortholog groups by using more complex criteria (Fig <figr fid="F1">1A</figr>). The first options concentrate on features of human genes and proteins. One can search for protein domains either by InterPro <abbrgrp><abbr bid="B19">19</abbr></abbrgrp> id or name of the domain. Ontology queries are based either on GeneOntology <abbrgrp><abbr bid="B20">20</abbr></abbrgrp> ids or ontology keywords. In addition, keyword searches are possible for gene identification. Also, some predefined categories like 'CD molecules', 'complement system' or 'inflammation' can be searched.</p>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Examples of ImmTree search functions and data presentation</p>
               </caption>
               <text>
                  <p><b>Examples of ImmTree search functions and data presentation</b>. A) Search form for a gene group. B) Top of the result list for query 'genes which appeared earlier than group Coelomata in the EGO database'. C) Information in ImmTree for the human membrane alanine aminopeptidase precursor and its orthologs.</p>
               </text>
               <graphic file="1745-7580-3-4-1"/>
            </fig>
            <p>The second group of search options helps to identify features common for ortholog groups. The most basic option is to search for organisms within an ortholog group either by NCBI's Taxonomy <abbrgrp><abbr bid="B21">21</abbr></abbrgrp> id or by the name of the taxon. Ortholog groups can be searched also by their ancestor taxa. Such complex searches can be performed, for example, only for 'genes which appeared earlier than Coelomata according to the EGO database' (94 result groups) or 'genes which emerged in the Bilateria group according to the HomoloGene database' (41 result groups).</p>
            <p>The third type of search option is based on the statistical information of human-mouse ortholog pair comparisons. Gene pairs can be found by the K<sub>a</sub>/K<sub>s </sub>quotient value or by the Z value. Both these parameters refer to the conservation of sequences <abbrgrp><abbr bid="B22">22</abbr><abbr bid="B23">23</abbr></abbrgrp>. It is also possible to combine search options, for example, to search for the 'genes related to the complement system which have a K<sub>a</sub>/K<sub>s </sub>value less than 0.15' (5 result groups) or 'genes with the keyword lectin which have a K<sub>a</sub>/K<sub>s </sub>value greater than 0.6' (4 result groups).</p>
         </sec>
         <sec>
            <st>
               <p>Reports of results</p>
            </st>
            <p>All the search results are displayed in an interactive list from which one can investigate details for each of the identified ortholog groups (Fig <figr fid="F1">1B</figr>). Similarly to the gene group search page, the results for a single ortholog group are divided into three main parts (Fig <figr fid="F1">1C</figr>). The header of the page presents details of the human gene. Sequences are available via links to GenBank and UniProt. Evolutionary levels denoting the appearance of the gene are shown based on the EGO, HomoloGene and OrthoMCL databases and combined data. Then, the results of the human-mouse ortholog comparison, including the values for the number of synonymous and nonsynonymous substitutions per site (K<sub>s</sub>, K<sub>a</sub>), their quotient value (K<sub>a</sub>/K<sub>s</sub>) and Z value, are presented. The evolutionary trees for the combined, EGO, HomoloGene and OrthoMCL datasets are in the third section. Links for the trees for the four datasets are also provided. The multiple sequence alignments and the evolutionary trees are available in nexus format <abbrgrp><abbr bid="B24">24</abbr></abbrgrp> for download and can be visualized with the ATV (A Tree Viewer) Java Applet <abbrgrp><abbr bid="B25">25</abbr></abbrgrp>.</p>
            <p>Figure <figr fid="F2">2</figr> presents the four phylogenetic trees for the orthologs of the human membrane alanine aminopeptidase precursor. The differences of the ortholog definitions in the different databases are clearly visible. The strictest definition of an ortholog group is in the HomoloGene database (Fig <figr fid="F2">2A</figr>). There are sequences just from a few species, and just a few paralogs in the dataset. In contrast, the tree for the EGO data (Fig <figr fid="F2">2B</figr>) contains sequences from more species. EGO's definition of an ortholog group is less strict and therefore the groups are called tentative ortholog groups. Consequently, the sequences are usually more distant. Many OrthoMCL groups (Fig <figr fid="F2">2C</figr>) contain lots of paralogs. OrthoMCL includes proteins from only 55 selected genomes. Paralogs are presented if they appeared after the most recent divergence of the included genomes. EGO and HomoloGene have sequences from a much broader species spectrum, and in addition they try to avoid the inclusion of paralogs. In ImmTree all three datasets with the corresponding trees are provided, and the user can use any of them according to their needs. For a more general overview, ImmTree provides a fourth tree (Fig <figr fid="F2">2D</figr>) to combine the data from the three databases. In this tree, only one sequence from each species is included. ImmTree thus allows one to investigate how broadly spread genes are among the taxa.</p>
            <fig id="F2">
               <title>
                  <p>Figure 2</p>
               </title>
               <caption>
                  <p>Phylogenetic trees for the orthologs of the human membrane alanine aminopeptidase precursor in the different ortholog databases</p>
               </caption>
               <text>
                  <p><b>Phylogenetic trees for the orthologs of the human membrane alanine aminopeptidase precursor in the different ortholog databases</b>. The trees were constructed using parsimony method, a heuristic tree search, and 1000 replications. Bootstrap values are shown at the nodes. The trees and multiple sequence alignments can be downloaded from the ImmTree database. A) Orthologs in HomoloGene, B) EGO, and C) OrthoMCL (Note the number of paralogs.) D) Overview tree presenting one sequence for each species in any of the databases.</p>
               </text>
               <graphic file="1745-7580-3-4-2"/>
            </fig>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>ImmTree is a new and unique data resource for exploring the molecular evolution of the immune system. Although excellent databases, such as The Adaptive Evolution Database (TAED) <abbrgrp><abbr bid="B26">26</abbr></abbrgrp> or the Database of Evolutionary Distances (DED) <abbrgrp><abbr bid="B27">27</abbr></abbrgrp>, are available for studying molecular evolution, they are general systems for all genes. It would be hard to collect molecular evolution related data for the immune system from them. ImmTree is a dedicated resource considering the special needs of researchers of evolution of the immune system. ImmTree facilitates queries according to the classic groupings of immune functions, such as humoral immunity, cellular immunity, and the complement system. The database will be continuously updated.</p>
      </sec>
      <sec>
         <st>
            <p>Availability and requirements</p>
         </st>
         <p>The ImmTree database is freely available for academic use from the URL: <url>http://bioinf.uta.fi/ImmTree</url></p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The author(s) declare that they have no competing interests.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>CO and MS collected the sequences of the immunome genes. CO carried out the phylogenetic analysis and MS collected the identification numbers connected to the immunome genes. MV designed and coordinated the project and compiled the list of genes and proteins. All authors drafted the manuscript and approved its content.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>We thank the Medical Research Fund of Tampere University Hospital and the CAMKIN Research Network of the European Commission for financial support.</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Molecular evolution of the NF-&#954;B signaling system</p>
            </title>
            <aug>
               <au>
                  <snm>Friedman</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Hughes</snm>
                  <fnm>AL</fnm>
               </au>
            </aug>
            <source>Immunogenetics</source>
            <pubdate>2002</pubdate>
            <volume>53</volume>
            <fpage>964</fpage>
            <lpage>974</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00251-001-0399-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">11862396</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>The molecular evolution of the interleukin-1 family of cytokines; IL-18 in teleost fish</p>
            </title>
            <aug>
               <au>
                  <snm>Huising</snm>
                  <fnm>MO</fnm>
               </au>
               <au>
                  <snm>Stet</snm>
                  <fnm>RJ</fnm>
               </au>
               <au>
                  <snm>Savelkoul</snm>
                  <fnm>HF</fnm>
               </au>
               <au>
                  <snm>Verburg-van Kemenade</snm>
                  <fnm>BM</fnm>
               </au>
            </aug>
            <source>Dev Comp Immunol</source>
            <pubdate>2004</pubdate>
            <volume>28</volume>
            <fpage>395</fpage>
            <lpage>413</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.dci.2003.09.005</pubid>
                  <pubid idtype="pmpid" link="fulltext">15062640</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>A systems approach to dissecting immunity and inflammation</p>
            </title>
            <aug>
               <au>
                  <snm>Aderem</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Smith</snm>
                  <fnm>KD</fnm>
               </au>
            </aug>
            <source>Semin Immunol</source>
            <pubdate>2004</pubdate>
            <volume>16</volume>
            <fpage>55</fpage>
            <lpage>67</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.smim.2003.10.002</pubid>
                  <pubid idtype="pmpid" link="fulltext">14751764</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Molecular characterization of the immune system: Emergence of proteins, processes and domains</p>
            </title>
            <aug>
               <au>
                  <snm>Ortutay</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Siermala</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Vihinen</snm>
                  <fnm>M</fnm>
               </au>
            </aug>
            <source>Immunogenetics</source>
            <pubdate>2007</pubdate>
            <note>DOI: 10.1007/s00251-007-0191-0</note>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">17294181</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>ImmTree</p>
            </title>
            <url>http://bioinf.uta.fi/ImmTree</url>
         </bibl>
         <bibl id="B6">
            <title>
               <p>IMGT, the international ImMunoGeneTics information system</p>
            </title>
            <aug>
               <au>
                  <snm>Lefranc</snm>
                  <fnm>MP</fnm>
               </au>
               <au>
                  <snm>Giudicelli</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Kaas</snm>
                  <fnm>Q</fnm>
               </au>
               <au>
                  <snm>Duprat</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Jabado-Michaloud</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Scaviner</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ginestoux</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Clement</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Chaume</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lefranc</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D593</fpage>
            <lpage>597</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540019</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608269</pubid>
                  <pubid idtype="doi">10.1093/nar/gki065</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>The IMGT/HLA and IPD databases</p>
            </title>
            <aug>
               <au>
                  <snm>Robinson</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Waller</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Fail</snm>
                  <fnm>SC</fnm>
               </au>
               <au>
                  <snm>Marsh</snm>
                  <fnm>SG</fnm>
               </au>
            </aug>
            <source>Hum Mutat</source>
            <pubdate>2006</pubdate>
            <volume>27</volume>
            <issue>12</issue>
            <fpage>1192</fpage>
            <lpage>9</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1002/humu.20406</pubid>
                  <pubid idtype="pmpid" link="fulltext">16944494</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>Entrez Gene: gene-centered information at NCBI</p>
            </title>
            <aug>
               <au>
                  <snm>Maglott</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Ostell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Pruitt</snm>
                  <fnm>KD</fnm>
               </au>
               <au>
                  <snm>Tatusova</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <fpage>D26</fpage>
            <lpage>31</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1761442</pubid>
                  <pubid idtype="pmpid" link="fulltext">17148475</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl993</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B9">
            <title>
               <p>GenBank</p>
            </title>
            <aug>
               <au>
                  <snm>Benson</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Karsch-Mizrachi</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
               <au>
                  <snm>Ostell</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <fpage>D21</fpage>
            <lpage>25</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1781245</pubid>
                  <pubid idtype="pmpid" link="fulltext">17202161</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl986</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Cross-referencing eukaryotic genomes: TIGR Orthologous Gene Alignments (TOGA)</p>
            </title>
            <aug>
               <au>
                  <snm>Lee</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sultana</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Pertea</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Cho</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Karamycheva</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Tsai</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Parvizi</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Cheung</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Antonescu</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>White</snm>
                  <fnm>J</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>493</fpage>
            <lpage>502</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">155294</pubid>
                  <pubid idtype="pmpid" link="fulltext">11875039</pubid>
                  <pubid idtype="doi">10.1101/gr.212002</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>NCBI genetic resources supporting immunogenetic research</p>
            </title>
            <aug>
               <au>
                  <snm>Feolo</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Helmberg</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Sherry</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Maglott</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Rev Immunogenet</source>
            <pubdate>2000</pubdate>
            <volume>2</volume>
            <fpage>461</fpage>
            <lpage>467</lpage>
            <xrefbib>
               <pubid idtype="pmpid">12361089</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups</p>
            </title>
            <aug>
               <au>
                  <snm>Chen</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Mackey</snm>
                  <fnm>AJ</fnm>
               </au>
               <au>
                  <snm>Stoeckert</snm>
                  <fnm>CJ</fnm>
                  <suf>Jr</suf>
               </au>
               <au>
                  <snm>Roos</snm>
                  <fnm>DS</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D363</fpage>
            <lpage>368</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347485</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381887</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj123</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice</p>
            </title>
            <aug>
               <au>
                  <snm>Thompson</snm>
                  <fnm>JD</fnm>
               </au>
               <au>
                  <snm>Higgins</snm>
                  <fnm>DG</fnm>
               </au>
               <au>
                  <snm>Gibson</snm>
                  <fnm>TJ</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1994</pubdate>
            <volume>22</volume>
            <fpage>4673</fpage>
            <lpage>4680</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308517</pubid>
                  <pubid idtype="pmpid" link="fulltext">7984417</pubid>
                  <pubid idtype="doi">10.1093/nar/22.22.4673</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods)</p>
            </title>
            <aug>
               <au>
                  <snm>Swofford</snm>
                  <fnm>DL</fnm>
               </au>
            </aug>
            <source>Version 4</source>
            <publisher>Sinauer Associates</publisher>
            <pubdate>2003</pubdate>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Basic local alignment search tool</p>
            </title>
            <aug>
               <au>
                  <snm>Altschul</snm>
                  <fnm>SF</fnm>
               </au>
               <au>
                  <snm>Gish</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Miller</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Myers</snm>
                  <fnm>EW</fnm>
               </au>
               <au>
                  <snm>Lipman</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>J Mol Biol</source>
            <pubdate>1990</pubdate>
            <volume>215</volume>
            <fpage>403</fpage>
            <lpage>410</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">2231712</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>The Bioperl toolkit: Perl modules for the life sciences</p>
            </title>
            <aug>
               <au>
                  <snm>Stajich</snm>
                  <fnm>JE</fnm>
               </au>
               <au>
                  <snm>Block</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Boulez</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Brenner</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Chervitz</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Dagdigian</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Fuellen</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Gilbert</snm>
                  <fnm>JG</fnm>
               </au>
               <au>
                  <snm>Korf</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Lapp</snm>
                  <fnm>H</fnm>
               </au>
               <etal/>
            </aug>
            <source>Genome Res</source>
            <pubdate>2002</pubdate>
            <volume>12</volume>
            <fpage>1611</fpage>
            <lpage>1618</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">187536</pubid>
                  <pubid idtype="pmpid" link="fulltext">12368254</pubid>
                  <pubid idtype="doi">10.1101/gr.361602</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions</p>
            </title>
            <aug>
               <au>
                  <snm>Nei</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Gojobori</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>Mol Biol Evol</source>
            <pubdate>1986</pubdate>
            <volume>3</volume>
            <fpage>418</fpage>
            <lpage>426</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">3444411</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>The Universal Protein Resource (UniProt)</p>
            </title>
            <aug>
               <au>
                  <cnm>The UniProt Consortium</cnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <fpage>D193</fpage>
            <lpage>197</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1669721</pubid>
                  <pubid idtype="pmpid" link="fulltext">17142230</pubid>
                  <pubid idtype="doi">10.1093/nar/gkl929</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>New developments in the InterPro database</p>
            </title>
            <aug>
               <au>
                  <snm>Mulder</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Apweiler</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Attwood</snm>
                  <fnm>TK</fnm>
               </au>
               <au>
                  <snm>Bairoch</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Bateman</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Binns</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Bork</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Buillard</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Cerutti</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Copley</snm>
                  <fnm>R</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2007</pubdate>
            <volume>35</volume>
            <fpage>D224</fpage>
            <lpage>228</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/nar/gkl841</pubid>
                  <pubid idtype="pmpid" link="fulltext">17202162</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>The Gene Ontology (GO) database and informatics resource</p>
            </title>
            <aug>
               <au>
                  <snm>Harris</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Clark</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ireland</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Lomax</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Ashburner</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Foulger</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Eilbeck</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Lewis</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Marshall</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Mungall</snm>
                  <fnm>C</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2004</pubdate>
            <volume>32</volume>
            <fpage>D258</fpage>
            <lpage>261</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">308770</pubid>
                  <pubid idtype="pmpid" link="fulltext">14681407</pubid>
                  <pubid idtype="doi">10.1093/nar/gkh066</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Database resources of the National Center for Biotechnology Information</p>
            </title>
            <aug>
               <au>
                  <snm>Wheeler</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Barrett</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Benson</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Bryant</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Canese</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Chetvernin</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Church</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>DiCuccio</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Edgar</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Federhen</snm>
                  <fnm>S</fnm>
               </au>
               <etal/>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2006</pubdate>
            <volume>34</volume>
            <fpage>D173</fpage>
            <lpage>180</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1347520</pubid>
                  <pubid idtype="pmpid" link="fulltext">16381840</pubid>
                  <pubid idtype="doi">10.1093/nar/gkj158</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>The Ka/Ks ratio: diagnosing the form of sequence evolution</p>
            </title>
            <aug>
               <au>
                  <snm>Hurst</snm>
                  <fnm>LD</fnm>
               </au>
            </aug>
            <source>Trends Genet</source>
            <pubdate>2002</pubdate>
            <volume>18</volume>
            <fpage>486</fpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0168-9525(02)02722-1</pubid>
                  <pubid idtype="pmpid" link="fulltext">12175810</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B23">
            <title>
               <p>Widespread selection for local RNA secondary structure in coding regions of bacterial genes</p>
            </title>
            <aug>
               <au>
                  <snm>Katz</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Burge</snm>
                  <fnm>CB</fnm>
               </au>
            </aug>
            <source>Genome Res</source>
            <pubdate>2003</pubdate>
            <volume>13</volume>
            <fpage>2042</fpage>
            <lpage>2051</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">403678</pubid>
                  <pubid idtype="pmpid" link="fulltext">12952875</pubid>
                  <pubid idtype="doi">10.1101/gr.1257503</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>NEXUS: an extensible file format for systematic information</p>
            </title>
            <aug>
               <au>
                  <snm>Maddison</snm>
                  <fnm>DR</fnm>
               </au>
               <au>
                  <snm>Swofford</snm>
                  <fnm>DL</fnm>
               </au>
               <au>
                  <snm>Maddison</snm>
                  <fnm>WP</fnm>
               </au>
            </aug>
            <source>Syst Biol</source>
            <pubdate>1997</pubdate>
            <volume>46</volume>
            <fpage>590</fpage>
            <lpage>621</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.2307/2413497</pubid>
                  <pubid idtype="pmpid">11975335</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>ATV: display and manipulation of annotated phylogenetic trees</p>
            </title>
            <aug>
               <au>
                  <snm>Zmasek</snm>
                  <fnm>CM</fnm>
               </au>
               <au>
                  <snm>Eddy</snm>
                  <fnm>SR</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2001</pubdate>
            <volume>17</volume>
            <fpage>383</fpage>
            <lpage>384</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/17.4.383</pubid>
                  <pubid idtype="pmpid" link="fulltext">11301314</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>The Adaptive Evolution Database (TAED): a phylogeny based tool for comparative genomics</p>
            </title>
            <aug>
               <au>
                  <snm>Roth</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Betts</snm>
                  <fnm>MJ</fnm>
               </au>
               <au>
                  <snm>Steffansson</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Saelensminde</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Liberles</snm>
                  <fnm>DA</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D495</fpage>
            <lpage>497</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540044</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608245</pubid>
                  <pubid idtype="doi">10.1093/nar/gki090</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>DED: Database of Evolutionary Distances</p>
            </title>
            <aug>
               <au>
                  <snm>Veeramachaneni</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Makalowski</snm>
                  <fnm>W</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>D442</fpage>
            <lpage>446</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">540048</pubid>
                  <pubid idtype="pmpid" link="fulltext">15608234</pubid>
                  <pubid idtype="doi">10.1093/nar/gki094</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>

