<?xml version='1.0'?>
<!DOCTYPE art SYSTEM 'http://www.biomedcentral.com/xml/article.dtd'>
<art>
   <ui>1745-7580-3-5</ui>
   <ji>1745-7580</ji>
   <fm>
      <dochead>Research</dochead>
      <bibl>
         <title>
            <p>Strength in numbers: achieving greater accuracy in MHC-I binding prediction by combining the results from multiple prediction tools</p>
         </title>
         <aug>
            <au id="A1">
               <snm>Trost</snm>
               <fnm>Brett</fnm>
               <insr iid="I1"/>
               <email>brett.trost@usask.ca</email>
            </au>
            <au id="A2">
               <snm>Bickis</snm>
               <fnm>Mik</fnm>
               <insr iid="I1"/>
               <email>bickis@math.usask.ca</email>
            </au>
            <au id="A3" ca="yes">
               <snm>Kusalik</snm>
               <fnm>Anthony</fnm>
               <insr iid="I1"/>
               <email>kusalik@cs.usask.ca</email>
            </au>
         </aug>
         <insg>
            <ins id="I1">
               <p>Departments of Computer Science and Mathematics &amp; Statistics, University of Saskatchewan, Saskatchewan, Canada</p>
            </ins>
         </insg>
         <source>Immunome Research</source>
         <issn>1745-7580</issn>
         <pubdate>2007</pubdate>
         <volume>3</volume>
         <issue>1</issue>
         <fpage>5</fpage>
         <url>http://www.immunome-research.com/content/3/1/5</url>
         <xrefbib>
            <pubidlist>
               <pubid idtype="pmpid">17381846</pubid>
               <pubid idtype="doi">10.1186/1745-7580-3-5</pubid>
            </pubidlist>
         </xrefbib>
      </bibl>
      <history>
         <rec>
            <date>
               <day>14</day>
               <month>11</month>
               <year>2006</year>
            </date>
         </rec>
         <acc>
            <date>
               <day>24</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </acc>
         <pub>
            <date>
               <day>24</day>
               <month>3</month>
               <year>2007</year>
            </date>
         </pub>
      </history>
      <cpyrt>
         <year>2007</year>
         <collab>Trost et al; licensee BioMed Central Ltd.</collab>
         <note>This is an Open Access article distributed under the terms of the Creative Commons Attribution License (<url>http://creativecommons.org/licenses/by/2.0</url>), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</note>
      </cpyrt>
      <abs>
         <sec>
            <st>
               <p>Abstract</p>
            </st>
            <sec>
               <st>
                  <p>Background</p>
               </st>
               <p>Peptides derived from endogenous antigens can bind to MHC class I molecules. Those which bind with high affinity can invoke a CD8<sup>+ </sup>immune response, resulting in the destruction of infected cells. Much work in immunoinformatics has involved the algorithmic prediction of peptide binding affinity to various MHC-I alleles. A number of tools for MHC-I binding prediction have been developed, many of which are available on the web.</p>
            </sec>
            <sec>
               <st>
                  <p>Results</p>
               </st>
               <p>We hypothesize that peptides predicted by a number of tools are more likely to bind than those predicted by just one tool, and that the likelihood of a particular peptide being a binder is related to the number of tools that predict it, as well as the accuracy of those tools. To this end, we have built and tested a heuristic-based method of making MHC-binding predictions by combining the results from multiple tools. The predictive performance of each individual tool is first ascertained. These performance data are used to derive weights such that the predictions of tools with better accuracy are given greater credence. The combined tool was evaluated using ten-fold cross-validation and was found to significantly outperform the individual tools when a high specificity threshold is used. It performs comparably well to the best-performing individual tools at lower specificity thresholds. Finally, it also outperforms the combination of the tools resulting from linear discriminant analysis.</p>
            </sec>
            <sec>
               <st>
                  <p>Conclusion</p>
               </st>
               <p>A heuristic-based method of combining the results of the individual tools better facilitates the scanning of large proteomes for potential epitopes, yielding more actual high-affinity binders while reporting very few false positives.</p>
            </sec>
         </sec>
      </abs>
   </fm>
   <bdy>
      <sec>
         <st>
            <p>Background</p>
         </st>
         <p>The major histocompatibility complex (MHC) is a set of genes whose products play a crucial role in immune response. Peptides derived from the proteasomal degradation of intracellular proteins are presented by MHC class I molecules to cytotoxic T lymphocytes (CTL) <abbrgrp><abbr bid="B1">1</abbr><abbr bid="B2">2</abbr><abbr bid="B3">3</abbr></abbrgrp>, and recognition of a non-self peptide by a CTL can result in the destruction of an infected cell. Peptides that can complete this pathway are called T cell epitopes.</p>
         <p>Only 0.5% of peptides are estimated to bind to a given MHC-I molecule, making this the most selective step in the recognition of intracellular antigens <abbrgrp><abbr bid="B4">4</abbr><abbr bid="B5">5</abbr></abbrgrp>. Given the large size of many viral and bacterial proteomes, it is prohibitive in terms of time and money to test every possible peptide for immunogenicity; thus, tools for the computational prediction of peptides that are likely to bind to a given MHC-I allele are invaluable in facilitating the identification of T cell epitopes.</p>
         <p>Many tools for performing such predictions, of varying quality, are available. We hypothesize that greater predictive accuracy can be achieved by combining the predictions from several of these tools rather than using just one tool. Further, contributions from individual tools should be related to their accuracy. To test this hypothesis, we have built a prediction tool which assigns a "combined score" to each peptide in a given protein by taking into account the predictive performance of each tool, and the score given by that same tool to a given peptide. We also compare our technique with combined predictions made using linear discriminant analysis, a standard statistical method for combining variables to distinguish two groups (in this case, "binder" and "non-binder"). In this paper, the acronym "HBM" will refer to our heuristic-based method and "LDA" will refer to the predictor built using linear discriminant analysis.</p>
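         <p>The idea of weighting each tool's contribution by its accuracy can be illustrated with a minimal sketch. This is not the HBM formula itself; it is a simple accuracy-weighted vote written under stated assumptions: each tool has a hypothetical weight (for example, its measured sensitivity at a chosen specificity) and a hypothetical score threshold, and higher scores indicate stronger predicted binding. The function name, tool names, and all numbers are illustrative only.</p>

```python
def combined_score(tool_scores, tool_weights, tool_thresholds):
    """Accuracy-weighted vote over several prediction tools (illustrative only).

    tool_scores:     dict mapping tool name to the score it gave one peptide
    tool_weights:    dict mapping tool name to a weight reflecting its accuracy
    tool_thresholds: dict mapping tool name to the score at which that tool
                     is taken to predict "binder" (higher = stronger binding)

    Returns a value in [0, 1]: the share of total weight held by the tools
    that predict the peptide to be a binder.
    """
    total_weight = sum(tool_weights.values())
    if total_weight == 0:
        raise ValueError("at least one tool must have a non-zero weight")
    voted_weight = sum(
        weight
        for tool, weight in tool_weights.items()
        if tool_scores[tool] >= tool_thresholds[tool]
    )
    return voted_weight / total_weight


# Hypothetical tools and numbers, chosen only to show the calculation.
weights = {"tool_a": 0.3, "tool_b": 0.2, "tool_c": 0.1}
thresholds = {"tool_a": 10.0, "tool_b": 0.5, "tool_c": 25.0}
peptide_scores = {"tool_a": 12.0, "tool_b": 0.4, "tool_c": 30.0}
score = combined_score(peptide_scores, weights, thresholds)
```

         <p>In this example the first and third hypothetical tools predict a binder, so the peptide receives two thirds of the available weight. A peptide predicted by many accurate tools scores close to 1, while one predicted only by a single weak tool scores close to 0.</p>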
      </sec>
      <sec>
         <st>
            <p>Results and discussion</p>
         </st>
         <sec>
            <st>
               <p>Performance of the individual tools</p>
            </st>
            <p>Table <tblr tid="T1">1</tblr> shows the ability of each individual tool to discriminate between the binders and nonbinders to HLA-A*0201 derived from the community binding database <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>. As we are interested in good sensitivity at high specificity, the sensitivities of each tool at 0.99 and 0.95 specificity are shown. The <it>A</it><sub><it>ROC </it></sub>value for each tool is also given; these values are very similar, but not completely identical, to those given by the authors of the community binding resource; the small discrepancies are likely due to the use of differing methods of calculating the area under the ROC curve. Individual tool performance data for the HLA-B*3501 and H-2Kd peptides from the community binding database, as well as for the HLA-A*0201 peptides gathered from the literature, are shown in Tables <tblr tid="T2">2</tblr>, <tblr tid="T3">3</tblr>, and <tblr tid="T4">4</tblr>, respectively.</p>
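            <p>The two performance measures reported in the tables can be sketched as follows. This is an illustrative calculation, not the code used in this study. It assumes that higher scores mean stronger predicted binding (scores such as predicted IC50 values, where lower is stronger, would need to be negated first), and it uses the rank-based (Mann-Whitney) estimate of the area under the ROC curve, which is only one of several ways of calculating that area. The function names are hypothetical.</p>

```python
import math


def sensitivity_at_specificity(binder_scores, nonbinder_scores, specificity):
    """Return (sensitivity, threshold) at no less than the given specificity.

    A peptide is called a binder when its score is strictly greater than
    the threshold. Assumes higher scores mean stronger predicted binding.
    """
    ranked = sorted(nonbinder_scores, reverse=True)
    # Number of nonbinders that may be called binders (false positives).
    # The small epsilon guards against floating-point error in 1 - specificity.
    allowed_fp = math.floor((1.0 - specificity) * len(ranked) + 1e-9)
    if allowed_fp >= len(ranked):
        threshold = float("-inf")
    else:
        # Only nonbinders scoring above this value become false positives.
        threshold = ranked[allowed_fp]
    true_positives = sum(1 for s in binder_scores if s > threshold)
    return true_positives / len(binder_scores), threshold


def roc_area(binder_scores, nonbinder_scores):
    """Rank-based estimate of the area under the ROC curve.

    Equals the fraction of (binder, nonbinder) pairs in which the binder
    scores higher, with ties counted as one half.
    """
    wins = 0.0
    for b in binder_scores:
        for n in nonbinder_scores:
            if b > n:
                wins += 1.0
            elif b == n:
                wins += 0.5
    return wins / (len(binder_scores) * len(nonbinder_scores))
```

            <p>For example, with four binders and ten nonbinders, a specificity of 0.9 permits one false positive; the threshold is then set at the second-highest nonbinder score, and the sensitivity is the fraction of binders scoring above it.</p>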
            <tbl id="T1">
               <title>
                  <p>Table 1</p>
               </title>
               <caption>
                  <p>Performances of the individual prediction tools on the HLA-A*0201 peptides from the community binding resource.</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="center">
                        <p>Tool</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>A</it>
                           <sub>
                              <it>ROC</it>
                           </sub>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <it>0.99 Specificity</it>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <it>0.95 Specificity</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Rank<sup>1</sup></p>
                     </c>
                     <c ca="center">
                        <p>Sensitivity</p>
                     </c>
                     <c ca="center">
                        <p>Threshold<sup>2</sup></p>
                     </c>
                     <c ca="center">
                        <p>Rank</p>
                     </c>
                     <c ca="center">
                        <p>Sensitivity</p>
                     </c>
                     <c ca="center">
                        <p>Threshold</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ARB Matrix</p>
                     </c>
                     <c ca="center">
                        <p>0.935</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.188</p>
                     </c>
                     <c ca="center">
                        <p>2.190</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.601</p>
                     </c>
                     <c ca="center">
                        <p>42.950</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NetMHC 2.0 ANN</p>
                     </c>
                     <c ca="center">
                        <p>0.932</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.286</p>
                     </c>
                     <c ca="center">
                        <p>153.000</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.611</p>
                     </c>
                     <c ca="center">
                        <p>920.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SMM</p>
                     </c>
                     <c ca="center">
                        <p>0.922</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.201</p>
                     </c>
                     <c ca="center">
                        <p>38.092</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.543</p>
                     </c>
                     <c ca="center">
                        <p>454.865</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Bimas</p>
                     </c>
                     <c ca="center">
                        <p>0.920</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0.198</p>
                     </c>
                     <c ca="center">
                        <p>324.068</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0.552</p>
                     </c>
                     <c ca="center">
                        <p>47.991</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SYFPEITHI</p>
                     </c>
                     <c ca="center">
                        <p>0.885</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0.170</p>
                     </c>
                     <c ca="center">
                        <p>27.000</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.421</p>
                     </c>
                     <c ca="center">
                        <p>24.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Multipred ANN</p>
                     </c>
                     <c ca="center">
                        <p>0.884</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>0.140</p>
                     </c>
                     <c ca="center">
                        <p>5.820</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>0.373</p>
                     </c>
                     <c ca="center">
                        <p>5.560</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NetMHC 2.0 Matrix</p>
                     </c>
                     <c ca="center">
                        <p>0.872</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>0.177</p>
                     </c>
                     <c ca="center">
                        <p>24.329</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.358</p>
                     </c>
                     <c ca="center">
                        <p>20.129</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SVMHC MHCPEP</p>
                     </c>
                     <c ca="center">
                        <p>0.870</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>0.115</p>
                     </c>
                     <c ca="center">
                        <p>1.000</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>0.334</p>
                     </c>
                     <c ca="center">
                        <p>0.520</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Logistic Regression</p>
                     </c>
                     <c ca="center">
                        <p>0.862</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>0.101</p>
                     </c>
                     <c ca="center">
                        <p>0.364</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>0.364</p>
                     </c>
                     <c ca="center">
                        <p>0.108</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SVMHC SYFPEITHI</p>
                     </c>
                     <c ca="center">
                        <p>0.854</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>0.176</p>
                     </c>
                     <c ca="center">
                        <p>0.950</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0.367</p>
                     </c>
                     <c ca="center">
                        <p>0.490</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>HLA Ligand</p>
                     </c>
                     <c ca="center">
                        <p>0.825</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.137</p>
                     </c>
                     <c ca="center">
                        <p>141.000</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>0.274</p>
                     </c>
                     <c ca="center">
                        <p>127.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Rankpep</p>
                     </c>
                     <c ca="center">
                        <p>0.822</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>0.077</p>
                     </c>
                     <c ca="center">
                        <p>103.000</p>
                     </c>
                     <c ca="center">
                        <p>13</p>
                     </c>
                     <c ca="center">
                        <p>0.306</p>
                     </c>
                     <c ca="center">
                        <p>83.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MHCPred (Interactions)</p>
                     </c>
                     <c ca="center">
                        <p>0.818</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.182</p>
                     </c>
                     <c ca="center">
                        <p>43.350</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>0.377</p>
                     </c>
                     <c ca="center">
                        <p>99.080</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MHCPred (position only)</p>
                     </c>
                     <c ca="center">
                        <p>0.814</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>0.116</p>
                     </c>
                     <c ca="center">
                        <p>21.330</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>0.311</p>
                     </c>
                     <c ca="center">
                        <p>65.310</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Multipred HMM</p>
                     </c>
                     <c ca="center">
                        <p>0.798</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>0.090</p>
                     </c>
                     <c ca="center">
                        <p>7.530</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>0.244</p>
                     </c>
                     <c ca="center">
                        <p>7.090</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Predep</p>
                     </c>
                     <c ca="center">
                        <p>0.788</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>0.045</p>
                     </c>
                     <c ca="center">
                        <p>-6.000</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>0.217</p>
                     </c>
                     <c ca="center">
                        <p>-5.130</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The predictive performance of each tool for the HLA-A*0201 community binding data is shown using two measures: <it>A</it><sub><it>ROC </it></sub>score, and sensitivity when specificity is 0.99 and 0.95. <sup>1</sup>Indicates how the sensitivity of each tool compares to that of the other tools at the indicated specificity; the tool with rank 1 has the highest sensitivity. <sup>2</sup>The scoring threshold corresponding to the indicated specificity.</p>
               </tblfn>
            </tbl>
            <tbl id="T2">
               <title>
                  <p>Table 2</p>
               </title>
               <caption>
                  <p>Individual tool <it>A</it><sub><it>ROC </it></sub>values and sensitivity data for HLA-B*3501 using binders and nonbinders from the community binding resource.</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="center">
                        <p>Tool</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>A</it>
                           <sub>
                              <it>ROC</it>
                           </sub>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <it>0.99 Specificity</it>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <it>0.95 Specificity</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Rank</p>
                     </c>
                     <c ca="center">
                        <p>Sensitivity</p>
                     </c>
                     <c ca="center">
                        <p>Threshold</p>
                     </c>
                     <c ca="center">
                        <p>Rank</p>
                     </c>
                     <c ca="center">
                        <p>Sensitivity</p>
                     </c>
                     <c ca="center">
                        <p>Threshold</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ARB Matrix</p>
                     </c>
                     <c ca="center">
                        <p>0.849</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.242</p>
                     </c>
                     <c ca="center">
                        <p>12.890</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.422</p>
                     </c>
                     <c ca="center">
                        <p>296.090</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Bimas</p>
                     </c>
                     <c ca="center">
                        <p>0.808</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>0.047</p>
                     </c>
                     <c ca="center">
                        <p>60.000</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0.166</p>
                     </c>
                     <c ca="center">
                        <p>24.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NetMHC 2.0 Matrix</p>
                     </c>
                     <c ca="center">
                        <p>0.789</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.137</p>
                     </c>
                     <c ca="center">
                        <p>32.386</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.336</p>
                     </c>
                     <c ca="center">
                        <p>28.404</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>HLA Ligand</p>
                     </c>
                     <c ca="center">
                        <p>0.786</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0.043</p>
                     </c>
                     <c ca="center">
                        <p>162.000</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.237</p>
                     </c>
                     <c ca="center">
                        <p>145.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Rankpep</p>
                     </c>
                     <c ca="center">
                        <p>0.769</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.071</p>
                     </c>
                     <c ca="center">
                        <p>124.000</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0.256</p>
                     </c>
                     <c ca="center">
                        <p>107.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Logistic Regression</p>
                     </c>
                     <c ca="center">
                        <p>0.764</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>0.024</p>
                     </c>
                     <c ca="center">
                        <p>0.655</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>0.123</p>
                     </c>
                     <c ca="center">
                        <p>0.259</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SVMHC SYFPEITHI</p>
                     </c>
                     <c ca="center">
                        <p>0.742</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0.118</p>
                     </c>
                     <c ca="center">
                        <p>0.680</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>0.204</p>
                     </c>
                     <c ca="center">
                        <p>0.420</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SVMHC MHCPEP</p>
                     </c>
                     <c ca="center">
                        <p>0.733</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0.043</p>
                     </c>
                     <c ca="center">
                        <p>1.560</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>0.204</p>
                     </c>
                     <c ca="center">
                        <p>1.140</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MHCPred (position only)</p>
                     </c>
                     <c ca="center">
                        <p>0.692</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>0.057</p>
                     </c>
                     <c ca="center">
                        <p>140.600</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>0.156</p>
                     </c>
                     <c ca="center">
                        <p>210.860</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MHCPred (interactions)</p>
                     </c>
                     <c ca="center">
                        <p>0.683</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.090</p>
                     </c>
                     <c ca="center">
                        <p>52.240</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.209</p>
                     </c>
                     <c ca="center">
                        <p>179.470</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Predep</p>
                     </c>
                     <c ca="center">
                        <p>0.587</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.038</p>
                     </c>
                     <c ca="center">
                        <p>-6.020</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.128</p>
                     </c>
                     <c ca="center">
                        <p>-5.090</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>For details, see the caption for Table 1.</p>
               </tblfn>
            </tbl>
            <tbl id="T3">
               <title>
                  <p>Table 3</p>
               </title>
               <caption>
                  <p>Individual tool <it>A</it><sub><it>ROC </it></sub>values and sensitivity data for H-2Kd using binders and nonbinders from the community binding resource.</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="center">
                        <p>Tool</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>A</it>
                           <sub>
                              <it>ROC</it>
                           </sub>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <it>0.99 Specificity</it>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <it>0.95 Specificity</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Rank</p>
                     </c>
                     <c ca="center">
                        <p>Sensitivity</p>
                     </c>
                     <c ca="center">
                        <p>Threshold</p>
                     </c>
                     <c ca="center">
                        <p>Rank</p>
                     </c>
                     <c ca="center">
                        <p>Sensitivity</p>
                     </c>
                     <c ca="center">
                        <p>Threshold</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SYFPEITHI</p>
                     </c>
                     <c ca="center">
                        <p>0.918</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0.300</p>
                     </c>
                     <c ca="center">
                        <p>27.000</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.483</p>
                     </c>
                     <c ca="center">
                        <p>25.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ARB Matrix</p>
                     </c>
                     <c ca="center">
                        <p>0.899</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.500</p>
                     </c>
                     <c ca="center">
                        <p>15.470</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.583</p>
                     </c>
                     <c ca="center">
                        <p>40.800</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Bimas</p>
                     </c>
                     <c ca="center">
                        <p>0.888</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.183</p>
                     </c>
                     <c ca="center">
                        <p>4800.000</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0.533</p>
                     </c>
                     <c ca="center">
                        <p>2880.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Rankpep</p>
                     </c>
                     <c ca="center">
                        <p>0.886</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.317</p>
                     </c>
                     <c ca="center">
                        <p>107.000</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.567</p>
                     </c>
                     <c ca="center">
                        <p>98.000</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>For details, see the caption for Table 1.</p>
               </tblfn>
            </tbl>
            <tbl id="T4">
               <title>
                  <p>Table 4</p>
               </title>
               <caption>
                  <p>Individual tool <it>A</it><sub><it>ROC </it></sub>values and sensitivity data for HLA-A*0201 using binders and nonbinders gathered from the literature</p>
               </caption>
               <tblbdy cols="8">
                  <r>
                     <c ca="center">
                        <p>Tool</p>
                     </c>
                     <c ca="center">
                        <p>
                           <it>A</it>
                           <sub>
                              <it>ROC</it>
                           </sub>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <it>0.99 Specificity</it>
                        </p>
                     </c>
                     <c cspan="3" ca="center">
                        <p>
                           <it>0.95 Specificity</it>
                        </p>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c cspan="6">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c>
                        <p/>
                     </c>
                     <c>
                        <p/>
                     </c>
                     <c ca="center">
                        <p>Rank</p>
                     </c>
                     <c ca="center">
                        <p>Sensitivity</p>
                     </c>
                     <c ca="center">
                        <p>Threshold</p>
                     </c>
                     <c ca="center">
                        <p>Rank</p>
                     </c>
                     <c ca="center">
                        <p>Sensitivity</p>
                     </c>
                     <c ca="center">
                        <p>Threshold</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="8">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Multipred ANN</p>
                     </c>
                     <c ca="center">
                        <p>0.772</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>0.083</p>
                     </c>
                     <c ca="center">
                        <p>5.830</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.278</p>
                     </c>
                     <c ca="center">
                        <p>5.600</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NetMHC 2.0 ANN</p>
                     </c>
                     <c ca="center">
                        <p>0.772</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>0.139</p>
                     </c>
                     <c ca="center">
                        <p>99.000</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>0.231</p>
                     </c>
                     <c ca="center">
                        <p>300.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SYFPEITHI</p>
                     </c>
                     <c ca="center">
                        <p>0.762</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.194</p>
                     </c>
                     <c ca="center">
                        <p>27.000</p>
                     </c>
                     <c ca="center">
                        <p>2</p>
                     </c>
                     <c ca="center">
                        <p>0.278</p>
                     </c>
                     <c ca="center">
                        <p>25.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SVMHC MHCPEP</p>
                     </c>
                     <c ca="center">
                        <p>0.745</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.194</p>
                     </c>
                     <c ca="center">
                        <p>0.910</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.269</p>
                     </c>
                     <c ca="center">
                        <p>0.740</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Logistic Regression</p>
                     </c>
                     <c ca="center">
                        <p>0.743</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.093</p>
                     </c>
                     <c ca="center">
                        <p>0.424</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>0.167</p>
                     </c>
                     <c ca="center">
                        <p>0.281</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>ARB Matrix</p>
                     </c>
                     <c ca="center">
                        <p>0.742</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0.102</p>
                     </c>
                     <c ca="center">
                        <p>1.860</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>0.139</p>
                     </c>
                     <c ca="center">
                        <p>3.550</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Bimas</p>
                     </c>
                     <c ca="center">
                        <p>0.722</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>0.074</p>
                     </c>
                     <c ca="center">
                        <p>437.482</p>
                     </c>
                     <c ca="center">
                        <p>9</p>
                     </c>
                     <c ca="center">
                        <p>0.213</p>
                     </c>
                     <c ca="center">
                        <p>159.970</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>NetMHC 2.0 Matrix</p>
                     </c>
                     <c ca="center">
                        <p>0.719</p>
                     </c>
                     <c ca="center">
                        <p>14</p>
                     </c>
                     <c ca="center">
                        <p>0.046</p>
                     </c>
                     <c ca="center">
                        <p>27.193</p>
                     </c>
                     <c ca="center">
                        <p>10</p>
                     </c>
                     <c ca="center">
                        <p>0.194</p>
                     </c>
                     <c ca="center">
                        <p>23.585</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SMM</p>
                     </c>
                     <c ca="center">
                        <p>0.716</p>
                     </c>
                     <c ca="center">
                        <p>4</p>
                     </c>
                     <c ca="center">
                        <p>0.157</p>
                     </c>
                     <c ca="center">
                        <p>100.484</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>0.250</p>
                     </c>
                     <c ca="center">
                        <p>262.434</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Rankpep</p>
                     </c>
                     <c ca="center">
                        <p>0.708</p>
                     </c>
                     <c ca="center">
                        <p>3</p>
                     </c>
                     <c ca="center">
                        <p>0.176</p>
                     </c>
                     <c ca="center">
                        <p>89.000</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.259</p>
                     </c>
                     <c ca="center">
                        <p>83.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MHCPred (interactions)</p>
                     </c>
                     <c ca="center">
                        <p>0.707</p>
                     </c>
                     <c ca="center">
                        <p>6</p>
                     </c>
                     <c ca="center">
                        <p>0.139</p>
                     </c>
                     <c ca="center">
                        <p>61.660</p>
                     </c>
                     <c ca="center">
                        <p>1</p>
                     </c>
                     <c ca="center">
                        <p>0.287</p>
                     </c>
                     <c ca="center">
                        <p>100.690</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>SVMHC SYFPEITHI</p>
                     </c>
                     <c ca="center">
                        <p>0.706</p>
                     </c>
                     <c ca="center">
                        <p>5</p>
                     </c>
                     <c ca="center">
                        <p>0.148</p>
                     </c>
                     <c ca="center">
                        <p>0.970</p>
                     </c>
                     <c ca="center">
                        <p>11</p>
                     </c>
                     <c ca="center">
                        <p>0.185</p>
                     </c>
                     <c ca="center">
                        <p>0.820</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>HLA Ligand</p>
                     </c>
                     <c ca="center">
                        <p>0.705</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>0.074</p>
                     </c>
                     <c ca="center">
                        <p>147.000</p>
                     </c>
                     <c ca="center">
                        <p>12</p>
                     </c>
                     <c ca="center">
                        <p>0.167</p>
                     </c>
                     <c ca="center">
                        <p>138.000</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>MHCPred (position only)</p>
                     </c>
                     <c ca="center">
                        <p>0.700</p>
                     </c>
                     <c ca="center">
                        <p>8</p>
                     </c>
                     <c ca="center">
                        <p>0.102</p>
                     </c>
                     <c ca="center">
                        <p>28.250</p>
                     </c>
                     <c ca="center">
                        <p>7</p>
                     </c>
                     <c ca="center">
                        <p>0.231</p>
                     </c>
                     <c ca="center">
                        <p>67.300</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Multipred HMM</p>
                     </c>
                     <c ca="center">
                        <p>0.695</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>0.009</p>
                     </c>
                     <c ca="center">
                        <p>7.570</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>0.102</p>
                     </c>
                     <c ca="center">
                        <p>7.290</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>Predep</p>
                     </c>
                     <c ca="center">
                        <p>0.627</p>
                     </c>
                     <c ca="center">
                        <p>15</p>
                     </c>
                     <c ca="center">
                        <p>0.019</p>
                     </c>
                     <c ca="center">
                        <p>-6.450</p>
                     </c>
                     <c ca="center">
                        <p>16</p>
                     </c>
                     <c ca="center">
                        <p>0.093</p>
                     </c>
                     <c ca="center">
                        <p>-5.330</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>For details, see the caption for Table 1. The peptides in this literature-derived dataset are available in Additional File <supplr sid="S1">1</supplr>.</p>
               </tblfn>
            </tbl>
         </sec>
         <sec>
            <st>
               <p>Performance of the combined methods</p>
            </st>
            <p>The HBM and LDA were evaluated using ten-fold cross-validation on the same four datasets (the HLA-A*0201, HLA-B*3501, and H-2Kd datasets from the community binding resource, and the HLA-A*0201 dataset from the literature) as the individual tools.</p>
            <p>The HBM requires that an individual tool specificity parameter be chosen such that the tools' sensitivities at that specificity can be used as the weights in equation 1. The performance of the HBM was determined using individual tool specificities of 0.99, 0.95, 0.90, and 0.80. In general, an individual tool specificity of 0.99 gave the best performance, while lower values gave somewhat weaker performance. Thus, all of the HBM performance data described below were obtained using an individual tool specificity of 0.99. Table <tblr tid="T5">5</tblr> shows the performance of the HBM on all four datasets.</p>
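            <p>The sensitivity-at-fixed-specificity values that serve as weights, and that are reported in the individual tool tables, can be illustrated with a minimal sketch. It assumes that higher scores indicate stronger predicted binding, which does not hold for every tool (scores for which lower is better would need to be negated first). The function names and example scores are hypothetical and are not taken from the study.</p>

```python
import math


def cutoff_at_specificity(nonbinder_scores, specificity):
    """Smallest observed cutoff whose specificity on the nonbinders is at
    least the requested value, when a peptide is called a binder only if
    its score is strictly greater than the cutoff."""
    if not nonbinder_scores or not (1.0 >= specificity > 0.0):
        raise ValueError("scores must be non-empty and specificity in (0, 1]")
    ranked = sorted(nonbinder_scores)
    # Number of nonbinders that must score at or below the cutoff.
    # The small epsilon guards against floating-point round-up.
    needed = max(1, math.ceil(specificity * len(ranked) - 1e-9))
    return ranked[needed - 1]


def sensitivity_at_cutoff(binder_scores, cutoff):
    """Fraction of known binders scoring strictly above the cutoff."""
    hits = sum(1 for score in binder_scores if score > cutoff)
    return hits / len(binder_scores)


# Hypothetical scores for one tool (illustrative only).
nonbinders = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
binders = [5, 9.5, 11, 12]

cutoff = cutoff_at_specificity(nonbinders, 0.90)
print(cutoff, sensitivity_at_cutoff(binders, cutoff))
```

            <p>In this sketch the cutoff is restricted to observed nonbinder scores, so the achieved specificity can exceed the requested value when scores are tied; other tie-handling conventions are possible.</p>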
            <tbl id="T5">
               <title>
                  <p>Table 5</p>
               </title>
               <caption>
                  <p>Performance of the heuristic-based method on all four datasets</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="center">
                        <p>Specificity</p>
                     </c>
                     <c ca="center">
                        <p>HLA-A*0201 (comm)</p>
                     </c>
                     <c ca="center">
                        <p>HLA-B*3501 (comm)</p>
                     </c>
                     <c ca="center">
                        <p>H-2Kd (comm)</p>
                     </c>
                     <c ca="center">
                        <p>HLA-A*0201 (lit)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>0.99</p>
                     </c>
                     <c ca="center">
                        <p>0.404</p>
                     </c>
                     <c ca="center">
                        <p>0.313</p>
                     </c>
                     <c ca="center">
                        <p>0.467</p>
                     </c>
                     <c ca="center">
                        <p>0.271</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>0.95</p>
                     </c>
                     <c ca="center">
                        <p>0.618</p>
                     </c>
                     <c ca="center">
                        <p>0.393</p>
                     </c>
                     <c ca="center">
                        <p>0.617</p>
                     </c>
                     <c ca="center">
                        <p>0.475</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The sensitivity of the HBM is shown at 0.99 specificity and 0.95 specificity for all four of the datasets used in this study. All values were obtained using a value of 0.99 for the individual tool specificity parameter. The abbreviation "comm" refers to peptides derived from the community binding database, while "lit" refers to peptides gathered from the literature.</p>
               </tblfn>
            </tbl>
            <p>For two of the three alleles, the HBM showed marked improvements in sensitivity at high specificity compared with the best-performing individual tools. The sensitivity of the HBM at 0.99 specificity for HLA-A*0201 was 0.40, a large increase over NetMHC ANN, whose sensitivity of 0.29 was the best among the individual tools. For HLA-B*3501, the HBM sensitivity was 0.31 at a specificity of 0.99, while the highest sensitivity obtained by an individual tool was 0.24. The HBM showed similarly strong performance when tested using the literature-derived HLA-A*0201 data, achieving a sensitivity of 0.27, compared with 0.19 for the best-performing individual tool. For H-2Kd, however, the HBM was outperformed at 0.99 specificity by the ARB Matrix tool, which had a sensitivity of 0.50 versus 0.47 for the HBM. We note, however, that ARB Matrix was trained using binders from the community binding database, so its performance on the community datasets is likely inflated <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>.</p>
            <p>At lower specificity thresholds, the advantage of the HBM was only marginal. For instance, the sensitivity of the HBM at 0.95 specificity for the HLA-A*0201 community dataset was almost identical to that of the best individual tool; for HLA-B*3501, the sensitivity of the HBM at 0.95 specificity was slightly lower than that of the individual tool with the highest sensitivity at that specificity. Interestingly, however, the HBM outperformed the individual tools at 0.95 specificity for H-2Kd.</p>
            <p>The linear discriminant scores displayed approximately normal distributions, with moderate separation between binders and non-binders. The distributions were closer to normality for the HLA-A*0201 dataset from the literature and the H-2Kd dataset, with more systematic deviations for the other two datasets. While the nominal sensitivity and specificity of the LDA agreed reasonably well with the actual and cross-validated values, we used the cross-validated values for comparison purposes (Table <tblr tid="T6">6</tblr>). The distinction between nominal and actual specificity is illustrated in Figure <figr fid="F1">1</figr>.</p>
            <tbl id="T6">
               <title>
                  <p>Table 6</p>
               </title>
               <caption>
                  <p>Performance of linear discriminant analysis on all four datasets</p>
               </caption>
               <tblbdy cols="5">
                  <r>
                     <c ca="center">
                        <p>Specificity</p>
                     </c>
                     <c ca="center">
                        <p>HLA-A*0201 (comm)</p>
                     </c>
                     <c ca="center">
                        <p>HLA-B*3501 (comm)</p>
                     </c>
                     <c ca="center">
                        <p>H-2Kd (comm)</p>
                     </c>
                     <c ca="center">
                        <p>HLA-A*0201 (lit)</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>0.99</p>
                     </c>
                     <c ca="center">
                        <p>0.324</p>
                     </c>
                     <c ca="center">
                        <p>0.213</p>
                     </c>
                     <c ca="center">
                        <p>0.417</p>
                     </c>
                     <c ca="center">
                        <p>0.102</p>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>0.95</p>
                     </c>
                     <c ca="center">
                        <p>0.718</p>
                     </c>
                     <c ca="center">
                        <p>0.436</p>
                     </c>
                     <c ca="center">
                        <p>0.633</p>
                     </c>
                     <c ca="center">
                        <p>0.333</p>
                     </c>
                  </r>
                  <r>
                     <c cspan="5">
                        <hr/>
                     </c>
                  </r>
                  <r>
                     <c ca="center">
                        <p>
                           <it>A</it>
                           <sub>
                              <it>ROC</it>
                           </sub>
                        </p>
                     </c>
                     <c ca="center">
                        <p>0.956</p>
                     </c>
                     <c ca="center">
                        <p>0.885</p>
                     </c>
                     <c ca="center">
                        <p>0.935</p>
                     </c>
                     <c ca="center">
                        <p>0.828</p>
                     </c>
                  </r>
               </tblbdy>
               <tblfn>
                  <p>The sensitivity of the combined tool is shown at 0.99 specificity and 0.95 specificity for all four of the datasets used in this study. The abbreviation "comm" refers to peptides derived from the community binding database, while "lit" refers to peptides gathered from the literature.</p>
               </tblfn>
            </tbl>
            <fig id="F1">
               <title>
                  <p>Figure 1</p>
               </title>
               <caption>
                  <p>Q-Q plot showing distribution of LDA scores for the HLA-A*0201 community data set</p>
               </caption>
               <text>
                  <p><b>Q-Q plot showing distribution of LDA scores for the HLA-A*0201 community data set</b>. The horizontal axis has been scaled according to normal probabilities, so that points from a normally distributed variable would fall along a straight line (shown in blue). Scores lying above a threshold indicated by a horizontal line would be classified as epitopes. A level exceeding 99% of a normal distribution defines a nominal specificity of 0.99, whereas an actual specificity of 0.99 requires a threshold meeting the actual distribution of points at the 0.99 vertical line. The realized sensitivity of 0.32 for a specificity of 0.99 is indicated as the proportion of epitopes whose scores lie above the threshold of 0.99.</p>
               </text>
               <graphic file="1745-7580-3-5-1"/>
            </fig>
            <p>LDA displayed an improvement over the individual tools for the HLA-A*0201 community dataset, attaining a sensitivity of 0.32 at 0.99 specificity &#8211; higher than that of all the individual tools, but lower than that of the HBM. The performance of the LDA on the other datasets was less impressive. Its sensitivity on the HLA-B*3501 community data at 0.99 specificity was 0.21, compared to 0.24 for ARB Matrix and 0.31 for the HBM. However, we note again that the ARB Matrix sensitivity is probably inflated, especially considering that the sensitivity for the second-best tool at 0.99 specificity (NetMHC 2.0 Matrix) was 0.14. The performance of LDA on the H-2Kd dataset was fairly strong, but still lower than that of both ARB Matrix and the HBM. Finally, the performance of LDA on the literature-derived HLA-A*0201 dataset was fairly weak at both 0.99 specificity and 0.95 specificity.</p>
            <p>Purely in terms of the <it>A</it><sub><it>ROC </it></sub>value, however, LDA outperforms the individual tools on all four datasets. This suggests that while LDA provides strong "overall" performance across the entire spectrum of specificities, it achieves less improvement in the region of the ROC curve that is of interest in this study &#8211; namely, the region of very high specificity.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Discussion</p>
         </st>
         <p>In this paper, results are given only for the three alleles HLA-A*0201, HLA-B*3501, and H-2Kd. The approach can be easily extended to any arbitrary MHC-I allele, provided that a sufficient number of tools make predictions for that allele, and that there exists an adequate number of known binding and non-binding peptides that can be used to test the individual tools on that allele. The effects of these conditions are borne out in our results for H-2Kd versus HLA-A*0201.</p>
         <p>We have used our HBM tool for the prediction of binders from bench-lab experiments, with positive results. For instance, in predicting binders for influenza virus in mice, the best two 9-mers predicted by HBM turned out to generate the strongest responses in immunoassays <abbrgrp><abbr bid="B8">8</abbr></abbrgrp>.</p>
         <p>Some comparative studies of binding prediction tools use randomly-generated nonbinders. This study used known nonbinders only. We contend that the use of known nonbinders contributes to a stronger practical assessment of each tool's utility. Such nonbinders may have been selected by an experimenter for binding-affinity testing because of the presence of good anchor residues. Randomly-generated nonbinders tend to have anchor residues that poorly match established motifs, and thus are typically very easy to classify; in contrast, nonbinders reported in the literature frequently have anchor residues that do conform to an established motif, making them more difficult to classify. For a tool to be truly useful, it must be able to differentiate between peptides that all have good anchor residues, but whose non-anchor residues confer different degrees of binding affinity.</p>
         <sec>
            <st>
               <p>Availability</p>
            </st>
            <p>The authors have elected not to make the HBM available online, for two reasons: first, frequent server outages and other problems with individual web-based tools often prevent acquisition of all the requisite scores. Automatic operation is therefore not possible. Second, the querying of all the web-based tools can take a long time, making the tool inconvenient for real-time web-based access. Interested researchers may, however, contact the authors regarding obtaining the scripts implementing the HBM.</p>
         </sec>
      </sec>
      <sec>
         <st>
            <p>Conclusion</p>
         </st>
         <p>We have built a tool that heuristically combines the output of several individual MHC-binding prediction tools, and have shown that it achieves substantially improved sensitivity at high specificity compared to the best individual tools, and is also superior to linear discriminant analysis at high specificity. This technique is very general, and can be updated as new prediction tools become available. Given this, the HBM should be extremely valuable for researchers wishing to scan large proteomes for potential epitopes. Additionally, the combination of the tools using linear discriminant analysis consistently displays improved overall operating characteristics (as measured by the <it>A</it><sub><it>ROC </it></sub>value) over the individual tools, and thus would be useful for researchers desiring to identify a large number of the potential binders in a smaller dataset, such as a single protein.</p>
         <p>The success of our heuristic-based tool substantiates the hypothesis that peptides predicted by a number of tools are more likely to bind than those predicted by just one tool, and that the likelihood of a particular peptide being a binder is related to the number of tools that predict it, as well as the accuracy of those tools. In the same vein, our data suggest that the performance of the heuristic-based approach improves when more individual prediction tools are available. The fact that combining the output of several tools results in increased performance indicates that, as of now, no single tool is able to extract all the information inherent in the data currently available. Thus, continued work on improved MHC-binding prediction is necessary.</p>
      </sec>
      <sec>
         <st>
            <p>Methods</p>
         </st>
         <sec>
            <st>
               <p>Determination of prediction tools</p>
            </st>
            <p>We have identified a total of 16 different prediction tools from 12 different research groups. Where there are two tools from the same group, they differ either in the method used to predict binding affinity or in the data used to train the model. The tools tested are as follows: Bimas <abbrgrp><abbr bid="B9">9</abbr></abbrgrp>, Rankpep <abbrgrp><abbr bid="B10">10</abbr></abbrgrp>, SYFPEITHI <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, NetMHC 2.0 ANN and NetMHC 2.0 Matrix <abbrgrp><abbr bid="B5">5</abbr><abbr bid="B12">12</abbr><abbr bid="B13">13</abbr></abbrgrp>, SVMHC SYFPEITHI and SVMHC MHCPEP <abbrgrp><abbr bid="B14">14</abbr></abbrgrp>, HLA Ligand <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, Predep <abbrgrp><abbr bid="B16">16</abbr></abbrgrp>, SMM <abbrgrp><abbr bid="B2">2</abbr></abbrgrp>, MHCPred (position only) and MHCPred (interactions) <abbrgrp><abbr bid="B17">17</abbr><abbr bid="B18">18</abbr></abbrgrp>, Multipred HMM and Multipred ANN <abbrgrp><abbr bid="B19">19</abbr><abbr bid="B20">20</abbr><abbr bid="B21">21</abbr></abbrgrp>, ARB Matrix <abbrgrp><abbr bid="B7">7</abbr></abbrgrp>, and a locally implemented logistic regression-based tool <abbrgrp><abbr bid="B22">22</abbr></abbrgrp>.</p>
         </sec>
         <sec>
            <st>
               <p>Creating a collection of peptides for evaluating the predictive performance of each tool</p>
            </st>
            <p>Prediction of peptide binding was evaluated for three different alleles: HLA-A*0201, HLA-B*3501, and H-2Kd. These alleles differ substantially in the number of available tools that make predictions for them: all of the aforementioned tools predict for HLA-A*0201, eleven make predictions for HLA-B*3501, and just four predict for H-2Kd. Thus, these alleles were chosen so that the performance of our combined tool (HBM) and linear discriminant analysis (LDA) could be evaluated when different numbers of individual tools are employed.</p>
            <p>Two sources of data were used for comparative analysis of prediction tools in this study. The first was the community binding resource <abbrgrp><abbr bid="B6">6</abbr></abbrgrp>, a large, recently published database containing experimentally determined affinity values for the binding of peptides to many different MHC-I alleles. This dataset of testing peptides could potentially be expanded further by incorporating peptides from such online databases as SYFPEITHI <abbrgrp><abbr bid="B11">11</abbr></abbrgrp>, MHCPEP <abbrgrp><abbr bid="B23">23</abbr></abbrgrp>, HLA Ligand <abbrgrp><abbr bid="B15">15</abbr></abbrgrp>, and EPIMHC <abbrgrp><abbr bid="B24">24</abbr></abbrgrp>. However, the use of the latter online databases presents a problem for the current study. As the models underlying many existing prediction tools were trained using data from these latter databases, the subsequent testing of the individual tools with these same peptides may result in an inaccurate estimation of each tool's predictive performance. For instance, tool A may be judged better than tool B merely because tool A was trained using the same peptides with which it was tested, while tool B was not. As combining the scores of the individual tools relies on an accurate appraisal of the performance of each tool, it is necessary to avoid the use of peptides with which the individual tools have been trained. Thus, we used only the community binding resource as our source of binding-affinity data. Only peptides of length 9 were considered, because all tools make predictions for peptides of this length. Peptides with <it>IC</it><sub>50 </sub>&lt; 500 nM were classified as binders, while those having <it>IC</it><sub>50 </sub>> 500 nM were classified as nonbinders. In total, there were 1184 binders and 1905 nonbinders to HLA-A*0201, 211 binders and 525 nonbinders to HLA-B*3501, and 60 binders and 116 nonbinders to H-2Kd.</p>
            <p>For comparison purposes, the tools were also tested using an independent dataset consisting of peptides gathered only from published literature <abbrgrp><abbr bid="B25">25</abbr><abbr bid="B26">26</abbr><abbr bid="B27">27</abbr><abbr bid="B28">28</abbr><abbr bid="B29">29</abbr><abbr bid="B30">30</abbr><abbr bid="B31">31</abbr><abbr bid="B32">32</abbr><abbr bid="B33">33</abbr></abbrgrp>. Again, only nonamers were chosen. Classifying a given peptide as a binder or a nonbinder was performed as follows: if <it>IC</it><sub>50 </sub>values were reported (as in the community binding database and most literature sources), then the standard binding threshold of 500 nM was used; where some other type of assay was done to determine binding affinity, the classification given by the authors was used. In the latter case, if no classification was given by the authors, the peptides were not used. Finally, to avoid bias in the data, peptides were filtered such that where two peptides differed at fewer than two residues, one peptide was randomly removed. The resultant dataset consisted of 108 binders and 108 nonbinders to HLA-A*0201, and is given in Additional File <supplr sid="S1">1</supplr>. Due to scarcity of published data, it was not possible to construct similar datasets for HLA-B*3501 or H-2Kd.</p>
            <suppl id="S1">
               <title>
                  <p>Additional File 1</p>
               </title>
               <text>
                  <p>Literature-derived HLA-A*0201 binders and non-binders. List of HLA-A*0201 binding and non-binding peptides gathered from the literature. The papers from which these peptides were derived are cited in the text.</p>
               </text>
               <file name="1745-7580-3-5-S1.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
         </sec>
         <sec>
            <st>
               <p>Performance measures</p>
            </st>
            <p>Binding prediction programs give a numeric score to each considered peptide. Each score can be converted to a binary prediction by comparing it against a tool-specific threshold &#8211; if the score is greater than or equal to the threshold, then the peptide is a predicted binder; otherwise, it is a predicted nonbinder.</p>
            <p>Sensitivity is the proportion of experimentally determined binders that are predicted as binders and is defined as <it>true positives</it>/(<it>true positives + false negatives</it>). Specificity is the proportion of experimentally determined nonbinders that are predicted as nonbinders, and is defined as <it>true negatives</it>/(<it>true negatives + false positives</it>). The traditional way to measure the performance of a classifier is to use a receiver operating characteristic (ROC) curve. However, ROC curves do not always give a good measure of practical utility. For a researcher scanning a large proteome for potential epitopes, specificity may be much more important than sensitivity. Imagine scanning a proteome consisting of 10,000 overlapping nonamers, 50 of which (unbeknownst to the experimenter) are good binders to the MHC-I allele of interest. Consider further that prediction tool A has 0.70 sensitivity at 0.80 specificity and 0.05 sensitivity at 0.99 specificity.</p>
            <p>Tool B has 0.50 sensitivity at 0.80 specificity and 0.20 sensitivity at 0.99 specificity. While tools A and B might have the same area under the ROC curve (<it>A</it><sub><it>ROC</it></sub>), tool A is superior at 0.80 specificity and tool B is superior at 0.99 specificity. If tool A is used at a threshold corresponding to 0.80 specificity, then approximately 2000 peptides must be tested in order to find 35 of the high-affinity binders. In contrast, if tool B is used at a threshold corresponding to 0.99 specificity, only about 100 peptides would have to be tested in order to find 10 of the high-affinity binders. Due to the high cost of experimental testing, and because knowledge of all the binders in a given proteome is usually not needed, the latter scenario would be preferable. We therefore conclude that good sensitivity at very high specificity is a more practical measure of a tool's usefulness than the <it>A</it><sub><it>ROC </it></sub>value, and have thus used sensitivity at high values of specificity as the primary assessor of the practical utility of each tool. For completeness, however, we also include each tool's <it>A</it><sub><it>ROC </it></sub>value.</p>
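            <p>The arithmetic behind this comparison can be reproduced with a short script. The following is a minimal sketch, not part of the published analysis; the function name <it>screening_workload</it> is hypothetical, and the inputs are the illustrative figures from the scenario above (10,000 nonamers, 50 of them binders).</p>

```python
def screening_workload(n_peptides, n_binders, sensitivity, specificity):
    """Return (expected true positives, expected false positives, peptides to test).

    Hypothetical helper for illustration only.
    """
    n_nonbinders = n_peptides - n_binders
    true_pos = sensitivity * n_binders               # binders predicted as binders
    false_pos = (1.0 - specificity) * n_nonbinders   # nonbinders predicted as binders
    return true_pos, false_pos, true_pos + false_pos


# Tool A operated at 0.80 specificity, tool B at 0.99 specificity.
tool_a = screening_workload(10000, 50, sensitivity=0.70, specificity=0.80)
tool_b = screening_workload(10000, 50, sensitivity=0.20, specificity=0.99)

for name, (tp, fp, total) in (("A", tool_a), ("B", tool_b)):
    print("Tool %s: test about %.0f peptides to find about %.0f binders" % (name, total, tp))
```

            <p>Under these assumptions, tool A requires roughly 2025 tests to recover 35 binders (about 58 tests per binder), whereas tool B requires roughly 110 tests to recover 10 binders (about 11 tests per binder).</p>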
         </sec>
         <sec>
            <st>
               <p>Combining the scores of the individual tools</p>
            </st>
            <p>We propose a heuristic-based method (HBM) for combining scores from individual prediction tools to make a better prediction. This method takes advantage of the observation that most of the individual tools make very few false positive predictions when the classification threshold is set sufficiently high, but correspondingly make few predictions of positives. If the tools identify different actual binders, combining such predictions may result in a greater number of true positives. The method also tries to take advantage of the "collective wisdom" of a group of predictive tools. The individual tools are based on a variety of techniques. Instead of trying to find the "best" technique, we try to combine the best that each technique has to offer. This is an extension of the idea used by prediction tools such as MULTIPRED <abbrgrp><abbr bid="B19">19</abbr></abbrgrp>, which combine predictions made by a few methods.</p>
            <p>Our proposed combined prediction tool ("HBM") takes a protein sequence as input, queries all of the individual prediction tools, obtaining from each the predicted binding affinity for all nonamers in the protein, computes a combined score for each nonamer, and finally predicts binders based on the combined scores for all nonamers. The tool is implemented as a Perl script.</p>
            <p>The first step in our HBM is to select a specificity for the individual tools. Each tool is then weighted according to its sensitivity at that specificity. Next, the score given to each peptide by a given prediction tool is compared to the tool-specific threshold value for that specificity. If the score is better than or equal to the threshold score, then that tool predicts the peptide as a binder, and the weight (sensitivity at the chosen specificity) for that tool is added to the total score for the peptide. Otherwise, the peptide's total score remains unchanged. For peptide <it>x </it>and each prediction tool <it>t</it>, we have</p>
            <p>
               <m:math name="1745-7580-3-5-i1" xmlns:m="http://www.w3.org/1998/Math/MathML">
                  <m:semantics>
                     <m:mrow>
                        <m:mtext>CombinedScore</m:mtext>
                        <m:mo stretchy="false">(</m:mo>
                        <m:mi>x</m:mi>
                        <m:mo stretchy="false">)</m:mo>
                        <m:mo>=</m:mo>
                        <m:mstyle displaystyle="true">
                           <m:munder>
                              <m:mo>&#8721;</m:mo>
                              <m:mi>t</m:mi>
                           </m:munder>
                           <m:mrow>
                              <m:msub>
                                 <m:mi>B</m:mi>
                                 <m:mi>t</m:mi>
                              </m:msub>
                              <m:mo stretchy="false">(</m:mo>
                              <m:mi>x</m:mi>
                              <m:mo stretchy="false">)</m:mo>
                              <m:msub>
                                 <m:mi>W</m:mi>
                                 <m:mi>t</m:mi>
                              </m:msub>
                           </m:mrow>
                        </m:mstyle>
                        <m:mtext>&#160;&#160;&#160;&#160;&#160;</m:mtext>
                        <m:mrow>
                           <m:mo>(</m:mo>
                           <m:mn>1</m:mn>
                           <m:mo>)</m:mo>
                        </m:mrow>
                     </m:mrow>
                     <m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaacqqGdbWqcqqGVbWBcqqGTbqBcqqGIbGycqqGPbqAcqqGUbGBcqqGLbqzcqqGKbazcqqGtbWucqqGJbWycqqGVbWBcqqGYbGCcqqGLbqzcqGGOaakcqWG4baEcqGGPaqkcqGH9aqpdaaeqbqaaiabdkeacnaaBaaaleaacqWG0baDaeqaaOGaeiikaGIaemiEaGNaeiykaKIaem4vaC1aaSbaaSqaaiabdsha0bqabaaabaGaemiDaqhabeqdcqGHris5aOGaaCzcaiaaxMaadaqadaqaaiabigdaXaGaayjkaiaawMcaaaaa@51F5@</m:annotation>
                  </m:semantics>
               </m:math>
            </p>
            <p>where <it>B</it><sub><it>t</it></sub>(<it>x</it>) is 1 if peptide <it>x </it>is predicted to bind by tool <it>t </it>and 0 otherwise, and <it>W</it><sub><it>t </it></sub>is the weight of tool <it>t</it>. CombinedScore(<it>x</it>) is then compared to a threshold in order to classify <it>x </it>as either a predicted binder or a predicted nonbinder.</p>
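            <p>Equation 1 can be illustrated with a small sketch. This is not the authors' Perl implementation: the function names <it>calibrate</it> and <it>combined_score</it>, the tools "tool_x" and "tool_y", and all scores are hypothetical, and the sketch assumes that a higher score always indicates stronger predicted binding (tools that report affinities, where lower is better, would need their comparison reversed).</p>

```python
def calibrate(binder_scores, nonbinder_scores, specificity):
    """Return (threshold, weight) for one tool.

    The threshold is the lowest candidate score at which the tool reaches the
    requested specificity on known nonbinders; the weight W_t is the tool's
    sensitivity on known binders at that threshold.
    """
    all_scores = sorted(set(binder_scores) | set(nonbinder_scores))
    # One extra candidate above the maximum guarantees the loop terminates.
    candidates = all_scores + [all_scores[-1] + 1.0]
    n_non = len(nonbinder_scores)
    for threshold in candidates:
        false_pos = sum(1 for s in nonbinder_scores if s >= threshold)
        if (n_non - false_pos) >= specificity * n_non:
            true_pos = sum(1 for s in binder_scores if s >= threshold)
            return threshold, true_pos / len(binder_scores)


def combined_score(peptide_scores, calibration):
    """Equation 1: add W_t for every tool t whose score reaches its threshold."""
    total = 0.0
    for tool, score in peptide_scores.items():
        threshold, weight = calibration[tool]
        if score >= threshold:   # B_t(x) = 1
            total += weight
    return total


# Invented toy data for two hypothetical tools.
calibration = {
    "tool_x": calibrate([0.9, 0.8, 0.25, 0.15], [0.5, 0.2, 0.1, 0.1], 0.75),
    "tool_y": calibrate([7, 6, 2, 1], [5, 3, 2, 1], 0.75),
}
print(combined_score({"tool_x": 0.6, "tool_y": 6}, calibration))
```

            <p>In this toy example, a peptide predicted by both tools receives the sum of both weights, a peptide predicted by one tool receives only that tool's weight, and a peptide predicted by neither receives zero.</p>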
            <p>The performance of the HBM was determined using 10-fold cross-validation: in each fold, 90% of the peptides (the "training peptides") were used to determine the performances of the individual tools, and these performance data were used by the HBM as described above to make predictions for the remaining 10% (the "testing peptides"). Each peptide was used as a testing peptide exactly once. The scores given to each testing peptide were then used to calculate specificity and sensitivity values for the HBM in the same manner as was described for the individual tools. To minimize experimental error due to the random partitioning of the peptides into training and testing sets, the entire process described above was repeated ten times, and the HBM sensitivity at each specificity was taken to be the average of its sensitivity in the ten trials. While <it>A</it><sub><it>ROC </it></sub>values are shown for the individual tools and for the LDA, no such values could be computed for the HBM. The reason for this is that, at high individual tool specificity parameters, most nonbinding peptides get an HBM score of zero, and therefore the ROC curve contains no points for specificities between 0 and approximately 0.85&#8211;0.90.</p>
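            <p>The partitioning step of this procedure might be sketched as follows. The generator name <it>ten_fold_splits</it> and the use of a seeded shuffle are assumptions made for illustration; the original scripts may partition the peptides differently.</p>

```python
import random


def ten_fold_splits(peptides, n_folds=10, seed=0):
    """Yield (training, testing) lists so that each peptide is tested exactly once."""
    rng = random.Random(seed)
    shuffled = list(peptides)
    rng.shuffle(shuffled)
    # Deal the shuffled peptides into n_folds roughly equal folds.
    folds = [shuffled[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        testing = folds[i]
        training = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield training, testing


# Repeating the whole procedure with different seeds mirrors the ten repeated trials.
for trial_seed in range(10):
    for training, testing in ten_fold_splits(range(25), seed=trial_seed):
        pass  # calibrate the tools on `training`, score the peptides in `testing`
```

            <p>Tool thresholds and weights are derived from the training peptides only, so the sensitivity measured on the testing peptides is not biased by the calibration step.</p>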
         </sec>
         <sec>
            <st>
               <p>Comparison technique</p>
            </st>
            <p>A standard method for combining variables to distinguish two categories is linear discriminant analysis (LDA) <abbrgrp><abbr bid="B34">34</abbr></abbrgrp>. If <it>y </it>is the vector of scores from all the tools for a particular peptide, it is classified according to the value of the linear discriminant</p>
            <p>(<it>&#956;</it><sub>1 </sub>- <it>&#956;</it><sub>0</sub>)'&#8721;<sup>-1</sup><it>y</it>,</p>
            <p>where <it>&#956;</it><sub>0 </sub>and <it>&#956;</it><sub>1 </sub>are the vectors of means for non-epitopes and epitopes, respectively, and &#8721; is the average covariance matrix of the scores within the two groups. This method is optimal (in the sense of minimizing the probability of misclassification) if the scores have a multivariate normal distribution with the same covariance matrix for epitopes and non-epitopes. More sophisticated methods have been developed without the normality assumption, but doubts have been expressed about their advantage <abbrgrp><abbr bid="B35">35</abbr></abbrgrp>. The separation between the groups can then be quantified by</p>
            <p><it>&#948;</it><sup>2 </sup>= (<it>&#956;</it><sub>1 </sub>- <it>&#956;</it><sub>0</sub>)'&#8721;<sup>-1</sup>(<it>&#956;</it><sub><it>1 </it></sub>- <it>&#956;</it><sub>0</sub>).</p>
            <p>Under the normality assumption, if the specificity is fixed at 1 - <it>&#945;</it>, then the sensitivity will be</p>
            <p>&#934; (<it>&#948; </it>+ &#934;<sup>-1</sup>(<it>&#945;</it>)),</p>
            <p>where &#934; is the cumulative distribution function (cdf) of the standard normal distribution. <it>A</it><sub><it>ROC </it></sub>can be calculated as &#934; (<it>&#948;</it>/<m:math name="1745-7580-3-5-i2" xmlns:m="http://www.w3.org/1998/Math/MathML"><m:semantics><m:mrow><m:msqrt><m:mn>2</m:mn></m:msqrt></m:mrow><m:annotation encoding="MathType-MTEF">
 MathType@MTEF@5@5@+=feaafiart1ev1aaatCvAUfKttLearuWrP9MDH5MBPbIqV92AaeXatLxBI9gBaebbnrfifHhDYfgasaacH8akY=wiFfYdH8Gipec8Eeeu0xXdbba9frFj0=OqFfea0dXdd9vqai=hGuQ8kuc9pgc9s8qqaq=dirpe0xb9q8qiLsFr0=vr0=vr0dc8meaabaqaciaacaGaaeqabaqabeGadaaakeaadaGcaaqaaiabikdaYaWcbeaaaaa@2DB9@</m:annotation></m:semantics></m:math>). The threshold for classification is determined by the prior probability <it>p</it><sub>1 </sub>that a peptide is an epitope, which is related to the specificity by</p>
            <p><it>p</it><sub>1 </sub>= [1 + exp (-<it>&#948;</it><sup>2</sup>/2 - <it>&#948;</it>&#934;<sup>-1</sup>(<it>&#945;</it>))]<sup>-1</sup>.</p>
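            <p>The three closed-form expressions above are straightforward to evaluate numerically. The sketch below uses Python's standard-library normal distribution; the function names are hypothetical, and the value of the separation passed in is purely illustrative rather than one estimated in this study.</p>

```python
from math import exp, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def nominal_sensitivity(delta, specificity):
    """Sensitivity Phi(delta + Phi^-1(alpha)) at specificity 1 - alpha."""
    alpha = 1.0 - specificity
    return STD_NORMAL.cdf(delta + STD_NORMAL.inv_cdf(alpha))


def nominal_a_roc(delta):
    """Area under the ROC curve, Phi(delta / sqrt(2))."""
    return STD_NORMAL.cdf(delta / sqrt(2.0))


def prior_for_specificity(delta, specificity):
    """Prior probability p1 whose classification threshold yields the given specificity."""
    alpha = 1.0 - specificity
    exponent = -delta ** 2 / 2.0 - delta * STD_NORMAL.inv_cdf(alpha)
    return 1.0 / (1.0 + exp(exponent))


illustrative_delta = 2.0  # assumed value, for demonstration only
print(nominal_sensitivity(illustrative_delta, 0.99))
print(nominal_a_roc(illustrative_delta))
print(prior_for_specificity(illustrative_delta, 0.99))
```

            <p>As a sanity check on the formulas, a separation of zero gives a sensitivity equal to the false positive rate, an area under the ROC curve of 0.5, and a prior of 0.5, which is the behaviour expected of an uninformative classifier.</p>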
            <p>The scores of a number of the tools displayed notably non-normal distributions. Most of these were highly skewed, but became close to normal when transformed to logarithms. The scores of three tools (NetMHC 2.0 ANN, Multipred ANN, and the logistic regression-based tool) had sigmoidal distributions. These became approximately normal when converted to scaled logits. A "logit" is a transformation of a probability <it>p </it>(between 0 and 1) to log(<it>p</it>/(1 - <it>p</it>)). For a variable <it>y </it>which is restricted between <it>a </it>and <it>b</it>, a "scaled logit" can be calculated via log((<it>y </it>- <it>a </it>+ <it>&#949;</it>)/(<it>b </it>- <it>y </it>+ <it>&#948;</it>)), where <it>&#949; </it>and <it>&#948; </it>are small adjustments to avoid zeros (this <it>&#948; </it>is unrelated to the separation defined above). Here <it>&#949; </it>= (<it>y</it><sub>- </sub>- <it>a</it>)/2 and <it>&#948; </it>= (<it>b </it>- <it>y</it><sub>+</sub>)/2, where <it>y</it><sub>- </sub>is the smallest observed value greater than <it>a </it>and <it>y</it><sub>+ </sub>is the largest observed value less than <it>b</it>. The actual performance of the linear discriminant on the transformed scores was estimated using ten-fold cross-validation. Computations were done using S-PLUS version 7.0.0. Figures were created with MATLAB 7.</p>
            <p>Except for the H-2Kd dataset, the cross-validated specificities fell short of the nominal ones. To realize specificities of 0.99 and 0.95, the threshold was adjusted to a nominal specificity such that the cross-validated values were as close as possible to the target values. Figure <figr fid="F1">1</figr> shows the distributions of the LDA scores for the community HLA-A*0201 data set. The diagonal lines indicate where the points are expected to fall for perfectly normal data. A specificity of 0.99 corresponds to a horizontal line such that 99% of the non-epitopes fall below this line. Because of the slight upward curvature of the non-epitope distribution, a nominal specificity of 0.99 falls short of this goal, but the larger nominal value of 0.9975 gives the correct threshold. About 32% of the epitopes give LDA scores above this value. Distributions of LDA scores for the other datasets are given in Additional Files <supplr sid="S2">2</supplr>, <supplr sid="S3">3</supplr> and <supplr sid="S4">4</supplr>.</p>
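            <p>A scaled logit of this kind might be computed as in the sketch below. The function name <it>scaled_logit</it> and the sample scores are hypothetical, and the sketch assumes that both adjustments are added (to the numerator and to the denominator, respectively) so that neither can reach zero when a score lies exactly on a bound.</p>

```python
from math import log


def scaled_logit(values, a, b):
    """Map scores restricted to [a, b] onto the real line.

    eps and delta_adj are half the gap between each bound and the nearest
    observed score strictly inside the interval, so scores equal to a or b
    still give a finite result.
    """
    interior_low = min(v for v in values if v > a)    # smallest value above a
    interior_high = max(v for v in values if b > v)   # largest value below b
    eps = (interior_low - a) / 2.0
    delta_adj = (b - interior_high) / 2.0
    return [log((v - a + eps) / (b - v + delta_adj)) for v in values]


# Invented scores on [0, 1], including both bounds.
transformed = scaled_logit([0.0, 0.2, 0.5, 0.8, 1.0], a=0.0, b=1.0)
print(transformed)
```

            <p>The transformation is monotonic, so the ranking of peptides by a tool is preserved; only the shape of the score distribution changes.</p>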
            <suppl id="S2">
               <title>
                  <p>Additional File 2</p>
               </title>
               <text>
                  <p>HLA-B*3501 LDA Q-Q plot. Q-Q plot showing distribution of LDA scores for the HLA-B*3501 dataset from the community binding resource. The horizontal axis has been scaled according to normal probabilities, so that points from a normally distributed variable would fall along a straight line (shown in blue). Scores lying above the thresholds indicated would be classified as epitopes. The realized sensitivity of 0.44 for a specificity of 0.95 is indicated as the proportion of epitopes whose scores lie above the threshold of 0.95. Of the four datasets used, this one deviates most strongly from normality.</p>
               </text>
               <file name="1745-7580-3-5-S2.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S3">
               <title>
                  <p>Additional File 3</p>
               </title>
               <text>
                  <p>H-2Kd LDA Q-Q plot. Q-Q plot showing distribution of LDA scores for the H-2Kd dataset from the community binding resource. The horizontal axis has been scaled according to normal probabilities, so that points from a normally distributed variable would fall along a straight line (shown in blue). Scores lying above the thresholds indicated would be classified as epitopes. The realized sensitivity of 0.42 for a specificity of 0.99 is indicated as the proportion of epitopes whose scores lie above the threshold of 0.99. Only the nominal values for specificity are used, since the actual ones coincide or are better.</p>
               </text>
               <file name="1745-7580-3-5-S3.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
            <suppl id="S4">
               <title>
                  <p>Additional File 4</p>
               </title>
               <text>
                  <p>HLA-A*0201 (literature) LDA Q-Q plot. Q-Q plot showing distribution of LDA scores for the HLA-A*0201 dataset derived from literature. The horizontal axis has been scaled according to normal probabilities, so that points from a normally distributed variable would fall along a straight line (shown in blue). Scores lying above the thresholds indicated would be classified as epitopes. The realized sensitivity of 0.33 for a specificity of 0.95 is indicated as the proportion of epitopes whose scores lie above the threshold of 0.95. Of the four datasets used, this one best fits the normality assumption.</p>
               </text>
               <file name="1745-7580-3-5-S4.pdf">
                  <p>Click here for file</p>
               </file>
            </suppl>
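            <p>The relationship between a specificity target, the score threshold, and the realized sensitivity can be sketched in Python as follows. This is a simplified illustration that takes the threshold directly from the empirical distribution of non-epitope scores; it is not the cross-validated adjustment of the nominal specificity described above. The function names and the example scores are hypothetical.</p>

```python
import math


def threshold_for_specificity(non_epitope_scores, target):
    """Smallest observed score with at least `target` of non-epitopes at or below it."""
    ordered = sorted(non_epitope_scores)
    n = len(ordered)
    # The small tolerance guards against floating-point round-up in target * n.
    k = min(n, max(1, math.ceil(target * n - 1e-9)))
    return ordered[k - 1]


def sensitivity(epitope_scores, threshold):
    """Fraction of epitopes whose score lies strictly above the threshold."""
    above = sum(1 for s in epitope_scores if s > threshold)
    return above / len(epitope_scores)


if __name__ == "__main__":
    # Hypothetical LDA scores, for illustration only.
    non_epitopes = [0.1, 0.4, 0.2, 0.3]
    epitopes = [0.15, 0.25, 0.5, 0.9]
    cutoff = threshold_for_specificity(non_epitopes, 0.5)
    print(cutoff, sensitivity(epitopes, cutoff))
```

            <p>A peptide is called an epitope only when its score lies strictly above the threshold, so the chosen fraction of non-epitopes is always classified correctly on the data used to set the threshold.</p>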
         </sec>
      </sec>
      <sec>
         <st>
            <p>Abbreviations</p>
         </st>
         <p>LDA &#8211; linear discriminant analysis</p>
         <p>HBM &#8211; heuristic-based method</p>
      </sec>
      <sec>
         <st>
            <p>Competing interests</p>
         </st>
         <p>The authors have no conflicts of interest with respect to this work. In particular, they have no direct connection with any of the researchers involved with the binding prediction tools studied.</p>
      </sec>
      <sec>
         <st>
            <p>Authors' contributions</p>
         </st>
         <p>BT performed the design, programming work, and evaluation of the HBM. MB performed the linear discriminant analysis work. AK proposed the original idea, provided bioinformatics expertise, contributed to the methodology, and supervised the work. All three authors contributed to the paper, with the majority written by BT.</p>
      </sec>
   </bdy>
   <bm>
      <ack>
         <sec>
            <st>
               <p>Acknowledgements</p>
            </st>
            <p>Funding was provided by the Natural Sciences and Engineering Research Council of Canada (NSERC).</p>
         </sec>
      </ack>
      <refgrp>
         <bibl id="B1">
            <title>
               <p>Virtual models of the HLA class I antigen processing pathway</p>
            </title>
            <aug>
               <au>
                  <snm>Petrovsky</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Brusic</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Methods</source>
            <pubdate>2004</pubdate>
            <volume>34</volume>
            <issue>4</issue>
            <fpage>429</fpage>
            <lpage>435</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/j.ymeth.2004.06.005</pubid>
                  <pubid idtype="pmpid" link="fulltext">15542368</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B2">
            <title>
               <p>Examining the independent binding assumption for binding of peptide epitopes to MHC-I molecules</p>
            </title>
            <aug>
               <au>
                  <snm>Peters</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Tong</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Sidney</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Sette</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Weng</snm>
                  <fnm>Z</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2003</pubdate>
            <volume>19</volume>
            <issue>14</issue>
            <fpage>1765</fpage>
            <lpage>1772</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/btg247</pubid>
                  <pubid idtype="pmpid" link="fulltext">14512347</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B3">
            <title>
               <p>Human CD8+ T cells recognize epitopes of the 28-kDa hemolysin and the 38-kDa antigen of Mycobacterium tuberculosis</p>
            </title>
            <aug>
               <au>
                  <snm>Shams</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Barnes</snm>
                  <fnm>PF</fnm>
               </au>
               <au>
                  <snm>Weis</snm>
                  <fnm>SE</fnm>
               </au>
               <au>
                  <snm>Klucar</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Wizel</snm>
                  <fnm>B</fnm>
               </au>
            </aug>
            <source>J Leukoc Biol</source>
            <pubdate>2003</pubdate>
            <volume>74</volume>
            <issue>6</issue>
            <fpage>1008</fpage>
            <lpage>1014</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1189/jlb.0403138</pubid>
                  <pubid idtype="pmpid" link="fulltext">12972510</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B4">
            <title>
               <p>Immunodominance in major histocompatibility complex class-I restricted T lymphocyte responses</p>
            </title>
            <aug>
               <au>
                  <snm>Yewdell</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Bennink</snm>
                  <fnm>JR</fnm>
               </au>
            </aug>
            <source>Annu Rev Immunol</source>
            <pubdate>1999</pubdate>
            <volume>17</volume>
            <fpage>51</fpage>
            <lpage>88</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1146/annurev.immunol.17.1.51</pubid>
                  <pubid idtype="pmpid" link="fulltext">10358753</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B5">
            <title>
               <p>Sensitive quantitative predictions of peptide-MHC binding by a 'Query by Committee' artificial neural network approach</p>
            </title>
            <aug>
               <au>
                  <snm>Buus</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lauemoller</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Worning</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Kesmir</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Frimurer</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Corbet</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fomsgaard</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Hilden</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Holm</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Brunak</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Tissue Antigens</source>
            <pubdate>2003</pubdate>
            <volume>62</volume>
            <issue>5</issue>
            <fpage>378</fpage>
            <lpage>384</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1034/j.1399-0039.2003.00112.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">14617044</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B6">
            <title>
               <p>A community resource benchmarking predictions of peptide binding to MHC-I molecules</p>
            </title>
            <aug>
               <au>
                  <snm>Peters</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Bui</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Frankild</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Nielsen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lundegaard</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Kostem</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Basch</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Lamberth</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Harndahl</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Fleri</snm>
                  <fnm>W</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>SS</fnm>
               </au>
               <au>
                  <snm>Sidney</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Lund</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Buus</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Sette</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>PLoS Comput Biol</source>
            <pubdate>2006</pubdate>
            <volume>2</volume>
            <issue>6</issue>
            <fpage>e65</fpage>
            <url>http://mhcbindingpredictions.immuneepitope.org/dataset.html</url>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1475712</pubid>
                  <pubid idtype="pmpid" link="fulltext">16789818</pubid>
                  <pubid idtype="doi">10.1371/journal.pcbi.0020065</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B7">
            <title>
               <p>Automated generation and evaluation of specific MHC binding predictive tools: ARB matrix applications</p>
            </title>
            <aug>
               <au>
                  <snm>Bui</snm>
                  <fnm>HH</fnm>
               </au>
               <au>
                  <snm>Sidney</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Peters</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Sathiamurthy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sinichi</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Purton</snm>
                  <fnm>KA</fnm>
               </au>
               <au>
                  <snm>Mothe</snm>
                  <fnm>BR</fnm>
               </au>
               <au>
                  <snm>Chisari</snm>
                  <fnm>FV</fnm>
               </au>
               <au>
                  <snm>Watkins</snm>
                  <fnm>DI</fnm>
               </au>
               <au>
                  <snm>Sette</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>Immunogenetics</source>
            <pubdate>2005</pubdate>
            <volume>57</volume>
            <issue>5</issue>
            <fpage>304</fpage>
            <lpage>314</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00251-005-0798-y</pubid>
                  <pubid idtype="pmpid" link="fulltext">15868141</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B8">
            <title>
               <p>personal communication</p>
            </title>
            <aug>
               <au>
                  <snm>Kobinger</snm>
                  <fnm>G</fnm>
               </au>
            </aug>
            <pubdate>2007</pubdate>
         </bibl>
         <bibl id="B9">
            <title>
               <p>Scheme for ranking potential HLA-A2 binding peptides based on independent binding of individual peptide side-chains</p>
            </title>
            <aug>
               <au>
                  <snm>Parker</snm>
                  <fnm>KC</fnm>
               </au>
               <au>
                  <snm>Bednarek</snm>
                  <fnm>MA</fnm>
               </au>
               <au>
                  <snm>Coligan</snm>
                  <fnm>JE</fnm>
               </au>
            </aug>
            <source>J Immunol</source>
            <pubdate>1994</pubdate>
            <volume>152</volume>
            <fpage>163</fpage>
            <lpage>175</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">8254189</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B10">
            <title>
               <p>Prediction of MHC class I binding peptides using profile motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Reche</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Glutting</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Reinherz</snm>
                  <fnm>EL</fnm>
               </au>
            </aug>
            <source>Hum Immunol</source>
            <pubdate>2002</pubdate>
            <volume>63</volume>
            <issue>9</issue>
            <fpage>701</fpage>
            <lpage>709</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0198-8859(02)00432-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">12175724</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B11">
            <title>
               <p>SYFPEITHI: database for MHC ligands and peptide motifs</p>
            </title>
            <aug>
               <au>
                  <snm>Rammensee</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Bachmann</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Emmerich</snm>
                  <fnm>NP</fnm>
               </au>
               <au>
                  <snm>Bachor</snm>
                  <fnm>OA</fnm>
               </au>
               <au>
                  <snm>Stevanovic</snm>
                  <fnm>S</fnm>
               </au>
            </aug>
            <source>Immunogenetics</source>
            <pubdate>1999</pubdate>
            <volume>50</volume>
            <issue>3</issue>
            <fpage>213</fpage>
            <lpage>219</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s002510050595</pubid>
                  <pubid idtype="pmpid" link="fulltext">10602881</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B12">
            <title>
               <p>Reliable prediction of T-cell epitopes using neural networks with novel sequence representations</p>
            </title>
            <aug>
               <au>
                  <snm>Nielsen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lundegaard</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Worning</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Lauemoller</snm>
                  <fnm>SL</fnm>
               </au>
               <au>
                  <snm>Lamberth</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Buus</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brunak</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lund</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>2003</pubdate>
            <volume>12</volume>
            <issue>5</issue>
            <fpage>1007</fpage>
            <lpage>1017</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1110/ps.0239403</pubid>
                  <pubid idtype="pmpid" link="fulltext">12717023</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B13">
            <title>
               <p>Improved prediction of MHC class I and class II epitopes using a novel Gibbs sampling approach</p>
            </title>
            <aug>
               <au>
                  <snm>Nielsen</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Lundegaard</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Worning</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Hvid</snm>
                  <fnm>CS</fnm>
               </au>
               <au>
                  <snm>Lamberth</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Buus</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Brunak</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Lund</snm>
                  <fnm>O</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>9</issue>
            <fpage>1388</fpage>
            <lpage>1397</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bth100</pubid>
                  <pubid idtype="pmpid" link="fulltext">14962912</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B14">
            <title>
               <p>Prediction of MHC class I binding peptides, using SVMHC</p>
            </title>
            <aug>
               <au>
                  <snm>D&#246;nnes</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Elofsson</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>BMC Bioinformatics</source>
            <pubdate>2002</pubdate>
            <volume>3</volume>
            <issue>25</issue>
         </bibl>
         <bibl id="B15">
            <title>
               <p>Population of the HLA ligand database</p>
            </title>
            <aug>
               <au>
                  <snm>Sathiamurthy</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hickman</snm>
                  <fnm>HD</fnm>
               </au>
               <au>
                  <snm>Cavett</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Zahoor</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Prilliman</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Metcalf</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Fernandez Vina</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Hildebrand</snm>
                  <fnm>WH</fnm>
               </au>
            </aug>
            <source>Tissue Antigens</source>
            <pubdate>2003</pubdate>
            <volume>61</volume>
            <fpage>12</fpage>
            <lpage>19</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1034/j.1399-0039.2003.610102.x</pubid>
                  <pubid idtype="pmpid" link="fulltext">12622773</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B16">
            <title>
               <p>Structure-based prediction of binding peptides to MHC class I molecules: application to a broad range of MHC alleles</p>
            </title>
            <aug>
               <au>
                  <snm>Schueler-Furman</snm>
                  <fnm>O</fnm>
               </au>
               <au>
                  <snm>Altuvia</snm>
                  <fnm>Y</fnm>
               </au>
               <au>
                  <snm>Sette</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Margalit</snm>
                  <fnm>H</fnm>
               </au>
            </aug>
            <source>Protein Sci</source>
            <pubdate>2000</pubdate>
            <volume>9</volume>
            <issue>9</issue>
            <fpage>1838</fpage>
            <lpage>1846</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11045629</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B17">
            <title>
               <p>MHCPred: A server for quantitative prediction of peptide-MHC binding</p>
            </title>
            <aug>
               <au>
                  <snm>Guan</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Doytchinova</snm>
                  <fnm>IA</fnm>
               </au>
               <au>
                  <snm>Zygouri</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Flower</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2003</pubdate>
            <volume>31</volume>
            <issue>13</issue>
            <fpage>3621</fpage>
            <lpage>3624</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">168917</pubid>
                  <pubid idtype="pmpid" link="fulltext">12824380</pubid>
                  <pubid idtype="doi">10.1093/nar/gkg510</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B18">
            <title>
               <p>Quantitative online prediction of peptide binding to the major histocompatibility complex</p>
            </title>
            <aug>
               <au>
                  <snm>Hattotuwagama</snm>
                  <fnm>CK</fnm>
               </au>
               <au>
                  <snm>Guan</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>Doytchinova</snm>
                  <fnm>IA</fnm>
               </au>
               <au>
                  <snm>Zygouri</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Flower</snm>
                  <fnm>DR</fnm>
               </au>
            </aug>
            <source>J Mol Graph Model</source>
            <pubdate>2004</pubdate>
            <volume>22</volume>
            <issue>3</issue>
            <fpage>195</fpage>
            <lpage>207</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S1093-3263(03)00160-8</pubid>
                  <pubid idtype="pmpid" link="fulltext">14629978</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B19">
            <title>
               <p>MULTIPRED: a computational system for prediction of promiscuous HLA binding peptides</p>
            </title>
            <aug>
               <au>
                  <snm>Zhang</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Khan</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Srinivasan</snm>
                  <fnm>KN</fnm>
               </au>
               <au>
                  <snm>August</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Brusic</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>2005</pubdate>
            <volume>33</volume>
            <fpage>W172</fpage>
            <lpage>W179</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">1160213</pubid>
                  <pubid idtype="pmpid" link="fulltext">15980449</pubid>
                  <pubid idtype="doi">10.1093/nar/gki452</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B20">
            <title>
               <p>Prediction of promiscuous peptides that bind HLA class I molecules</p>
            </title>
            <aug>
               <au>
                  <snm>Brusic</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Petrovsky</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Bajic</snm>
                  <fnm>VB</fnm>
               </au>
            </aug>
            <source>Immunol Cell Biol</source>
            <pubdate>2002</pubdate>
            <volume>80</volume>
            <issue>3</issue>
            <fpage>280</fpage>
            <lpage>285</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1046/j.1440-1711.2002.01088.x</pubid>
                  <pubid idtype="pmpid">12067415</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B21">
            <title>
               <p>Prediction of class I T-cell epitopes: evidence of presence of immunological hot spots inside antigens</p>
            </title>
            <aug>
               <au>
                  <snm>Srinivasan</snm>
                  <fnm>KN</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>GL</fnm>
               </au>
               <au>
                  <snm>Khan</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>August</snm>
                  <fnm>JT</fnm>
               </au>
               <au>
                  <snm>Brusic</snm>
                  <fnm>V</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2004</pubdate>
            <volume>20</volume>
            <issue>S1</issue>
            <fpage>i297</fpage>
            <lpage>i302</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1093/bioinformatics/bth943</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B22">
            <title>
               <p>Leveraging information across HLA alleles/supertypes improves epitope prediction</p>
            </title>
            <aug>
               <au>
                  <snm>Heckerman</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Kadie</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Listgarten</snm>
                  <fnm>J</fnm>
               </au>
            </aug>
            <source>RECOMB '06</source>
            <pubdate>2006</pubdate>
            <fpage>296</fpage>
            <lpage>308</lpage>
         </bibl>
         <bibl id="B23">
            <title>
               <p>MHCPEP, a database of MHC-binding peptides: update 1997</p>
            </title>
            <aug>
               <au>
                  <snm>Brusic</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Rudy</snm>
                  <fnm>G</fnm>
               </au>
               <au>
                  <snm>Harrison</snm>
                  <fnm>LC</fnm>
               </au>
            </aug>
            <source>Nucleic Acids Res</source>
            <pubdate>1998</pubdate>
            <volume>26</volume>
            <fpage>368</fpage>
            <lpage>371</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="pmcid">147255</pubid>
                  <pubid idtype="pmpid" link="fulltext">9399876</pubid>
                  <pubid idtype="doi">10.1093/nar/26.1.368</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B24">
            <title>
               <p>EPIMHC: a curated database of MHC-binding peptides for customized computational vaccinology</p>
            </title>
            <aug>
               <au>
                  <snm>Reche</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Zhang</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Glutting</snm>
                  <fnm>JP</fnm>
               </au>
               <au>
                  <snm>Reinherz</snm>
                  <fnm>EL</fnm>
               </au>
            </aug>
            <source>Bioinformatics</source>
            <pubdate>2005</pubdate>
            <volume>21</volume>
            <issue>9</issue>
            <fpage>2140</fpage>
            <lpage>2141</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1093/bioinformatics/bti269</pubid>
                  <pubid idtype="pmpid" link="fulltext">15657103</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B25">
            <title>
               <p>Identification of multiple HLA-A*0201-restricted cruzipain and FL-160 CD8+ epitopes recognized by T cells from chronically Trypanosoma cruzi-infected patients</p>
            </title>
            <aug>
               <au>
                  <snm>Fonseca</snm>
                  <fnm>SG</fnm>
               </au>
               <au>
                  <snm>Moins-Teisserenc</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Clave</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Ianni</snm>
                  <fnm>B</fnm>
               </au>
               <au>
                  <snm>Nunes</snm>
                  <fnm>VL</fnm>
               </au>
               <au>
                  <snm>Mady</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Iwai</snm>
                  <fnm>LK</fnm>
               </au>
               <au>
                  <snm>Sette</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Sidney</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Marin</snm>
                  <fnm>ML</fnm>
               </au>
               <au>
                  <snm>Goldberg</snm>
                  <fnm>AC</fnm>
               </au>
               <au>
                  <snm>Guilherme</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Charron</snm>
                  <fnm>D</fnm>
               </au>
               <au>
                  <snm>Toubert</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kalil</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Cunha-Neto</snm>
                  <fnm>E</fnm>
               </au>
            </aug>
            <source>Microbes Infect</source>
            <pubdate>2005</pubdate>
            <volume>7</volume>
            <issue>4</issue>
            <fpage>688</fpage>
            <lpage>697</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15848276</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B26">
            <title>
               <p>Role of HLA-A motifs in identification of potential CTL epitopes in human papillomavirus type 16 E6 and E7 proteins</p>
            </title>
            <aug>
               <au>
                  <snm>Kast</snm>
                  <fnm>WM</fnm>
               </au>
               <au>
                  <snm>Brandt</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Sidney</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Drijfhout</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Kubo</snm>
                  <fnm>RT</fnm>
               </au>
               <au>
                  <snm>Grey</snm>
                  <fnm>HM</fnm>
               </au>
               <au>
                  <snm>Melief</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Sette</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Immunol</source>
            <pubdate>1994</pubdate>
            <volume>152</volume>
            <issue>8</issue>
            <fpage>3904</fpage>
            <lpage>3912</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">7511661</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B27">
            <title>
               <p>Mapping and binding analysis of peptides derived from the tumor-associated antigen survivin for eight HLA alleles</p>
            </title>
            <aug>
               <au>
                  <snm>Bachinsky</snm>
                  <fnm>MM</fnm>
               </au>
               <au>
                  <snm>Guillen</snm>
                  <fnm>DE</fnm>
               </au>
               <au>
                  <snm>Patel</snm>
                  <fnm>SR</fnm>
               </au>
               <au>
                  <snm>Singleton</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Chen</snm>
                  <fnm>C</fnm>
               </au>
               <au>
                  <snm>Soltis</snm>
                  <fnm>DA</fnm>
               </au>
               <au>
                  <snm>Tussey</snm>
                  <fnm>LG</fnm>
               </au>
            </aug>
            <source>Cancer Immun</source>
            <pubdate>2005</pubdate>
            <volume>5</volume>
            <fpage>6</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15779886</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B28">
            <title>
               <p>Spontaneous T-cell responses against peptides derived from the Taxol resistance-associated gene-3 (TRAG-3) protein in cancer patients</p>
            </title>
            <aug>
               <au>
                  <snm>Meier</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Reker</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Svane</snm>
                  <fnm>IM</fnm>
               </au>
               <au>
                  <snm>Holten-Andersen</snm>
                  <fnm>L</fnm>
               </au>
               <au>
                  <snm>Becker</snm>
                  <fnm>JC</fnm>
               </au>
               <au>
                  <snm>Sondergaard</snm>
                  <fnm>I</fnm>
               </au>
               <au>
                  <snm>Andersen</snm>
                  <fnm>MH</fnm>
               </au>
               <au>
                  <snm>Thor Straten</snm>
                  <fnm>P</fnm>
               </au>
            </aug>
            <source>Cancer Immunol Immunother</source>
            <pubdate>2005</pubdate>
            <volume>54</volume>
            <issue>3</issue>
            <fpage>219</fpage>
            <lpage>228</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1007/s00262-004-0578-9</pubid>
                  <pubid idtype="pmpid" link="fulltext">15580499</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B29">
            <title>
               <p>Recognition of variant HIV-1 epitopes from diverse viral subtypes by vaccine-induced CTL</p>
            </title>
            <aug>
               <au>
                  <snm>McKinney</snm>
                  <fnm>DM</fnm>
               </au>
               <au>
                  <snm>Skvoretz</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Livingston</snm>
                  <fnm>BD</fnm>
               </au>
               <au>
                  <snm>Wilson</snm>
                  <fnm>CC</fnm>
               </au>
               <au>
                  <snm>Anders</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Chesnut</snm>
                  <fnm>RW</fnm>
               </au>
               <au>
                  <snm>Sette</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Essex</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Novitsky</snm>
                  <fnm>V</fnm>
               </au>
               <au>
                  <snm>Newman</snm>
                  <fnm>MJ</fnm>
               </au>
            </aug>
            <source>J Immunol</source>
            <pubdate>2004</pubdate>
            <volume>173</volume>
            <issue>3</issue>
            <fpage>1941</fpage>
            <lpage>1950</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">15265928</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B30">
            <title>
               <p>Identification of potential HLA-A *0201 restricted CTL epitopes derived from the epithelial cell adhesion molecule (Ep-CAM) and the carcinoembryonic antigen (CEA)</p>
            </title>
            <aug>
               <au>
                  <snm>Ras</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>van der Burg</snm>
                  <fnm>SH</fnm>
               </au>
               <au>
                  <snm>Zegveld</snm>
                  <fnm>ST</fnm>
               </au>
               <au>
                  <snm>Brandt</snm>
                  <fnm>RM</fnm>
               </au>
               <au>
                  <snm>Kuppen</snm>
                  <fnm>PJ</fnm>
               </au>
               <au>
                  <snm>Offringa</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Warnaar</snm>
                  <fnm>SO</fnm>
               </au>
               <au>
                  <snm>van de Velde</snm>
                  <fnm>CJ</fnm>
               </au>
               <au>
                  <snm>Melief</snm>
                  <fnm>CJ</fnm>
               </au>
            </aug>
            <source>Hum Immunol</source>
            <pubdate>1997</pubdate>
            <volume>53</volume>
            <fpage>81</fpage>
            <lpage>89</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1016/S0198-8859(97)00032-3</pubid>
                  <pubid idtype="pmpid" link="fulltext">9127151</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B31">
            <title>
               <p>Vaccination with predesignated or evidence-based peptides for patients with recurrent gynecologic cancers</p>
            </title>
            <aug>
               <au>
                  <snm>Tsuda</snm>
                  <fnm>N</fnm>
               </au>
               <au>
                  <snm>Mochizuki</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Harada</snm>
                  <fnm>M</fnm>
               </au>
               <au>
                  <snm>Sukehiro</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Kawano</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Yamada</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Ushijima</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Sugiyama</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Nishida</snm>
                  <fnm>T</fnm>
               </au>
               <au>
                  <snm>Yamana</snm>
                  <fnm>H</fnm>
               </au>
               <au>
                  <snm>Itoh</snm>
                  <fnm>K</fnm>
               </au>
               <au>
                  <snm>Kamura</snm>
                  <fnm>T</fnm>
               </au>
            </aug>
            <source>J Immunother</source>
            <pubdate>2004</pubdate>
            <volume>27</volume>
            <fpage>60</fpage>
            <lpage>72</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1097/00002371-200401000-00006</pubid>
                  <pubid idtype="pmpid" link="fulltext">14676634</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B32">
            <title>
               <p>Identification of new epitopes from four different tumor-associated antigens: recognition of naturally processed epitopes correlates with HLA-A*0201-binding affinity</p>
            </title>
            <aug>
               <au>
                  <snm>Keogh</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Fikes</snm>
                  <fnm>J</fnm>
               </au>
               <au>
                  <snm>Southwood</snm>
                  <fnm>S</fnm>
               </au>
               <au>
                  <snm>Celis</snm>
                  <fnm>E</fnm>
               </au>
               <au>
                  <snm>Chesnut</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Sette</snm>
                  <fnm>A</fnm>
               </au>
            </aug>
            <source>J Immunol</source>
            <pubdate>2001</pubdate>
            <volume>167</volume>
            <issue>2</issue>
            <fpage>787</fpage>
            <lpage>796</lpage>
            <xrefbib>
               <pubid idtype="pmpid" link="fulltext">11441084</pubid>
            </xrefbib>
         </bibl>
         <bibl id="B33">
            <title>
               <p>Efficient identification of novel HLA-A(*)0201-presented cytotoxic T lymphocyte epitopes in the widely expressed tumor antigen PRAME by proteasome-mediated digestion analysis</p>
            </title>
            <aug>
               <au>
                  <snm>Kessler</snm>
                  <fnm>JH</fnm>
               </au>
               <au>
                  <snm>Beekman</snm>
                  <fnm>NJ</fnm>
               </au>
               <au>
                  <snm>Bres-Vloemans</snm>
                  <fnm>SA</fnm>
               </au>
               <au>
                  <snm>Verdijk</snm>
                  <fnm>P</fnm>
               </au>
               <au>
                  <snm>van Veelen</snm>
                  <fnm>PA</fnm>
               </au>
               <au>
                  <snm>Kloosterman-Joosten</snm>
                  <fnm>AM</fnm>
               </au>
               <au>
                  <snm>Vissers</snm>
                  <fnm>DC</fnm>
               </au>
               <au>
                  <snm>ten Bosch</snm>
                  <fnm>GJ</fnm>
               </au>
               <au>
                  <snm>Kester</snm>
                  <fnm>MG</fnm>
               </au>
               <au>
                  <snm>Sijts</snm>
                  <fnm>A</fnm>
               </au>
               <au>
                  <snm>Drijfhout</snm>
                  <fnm>JW</fnm>
               </au>
               <au>
                  <snm>Ossendorp</snm>
                  <fnm>F</fnm>
               </au>
               <au>
                  <snm>Offringa</snm>
                  <fnm>R</fnm>
               </au>
               <au>
                  <snm>Melief</snm>
                  <fnm>CJ</fnm>
               </au>
            </aug>
            <source>J Exp Med</source>
            <pubdate>2001</pubdate>
            <volume>193</volume>
            <fpage>73</fpage>
            <lpage>88</lpage>
            <xrefbib>
               <pubidlist>
                  <pubid idtype="doi">10.1084/jem.193.1.73</pubid>
                  <pubid idtype="pmpid" link="fulltext">11136822</pubid>
               </pubidlist>
            </xrefbib>
         </bibl>
         <bibl id="B34">
            <aug>
               <au>
                  <snm>Lachenbruch</snm>
                  <fnm>PA</fnm>
               </au>
            </aug>
            <source>Discriminant Analysis</source>
            <publisher>New York: Hafner</publisher>
            <pubdate>1975</pubdate>
         </bibl>
         <bibl id="B35">
            <title>
               <p>Classifier technology and the illusion of progress</p>
            </title>
            <aug>
               <au>
                  <snm>Hand</snm>
                  <fnm>DJ</fnm>
               </au>
            </aug>
            <source>Statist Sci</source>
            <pubdate>2006</pubdate>
            <volume>21</volume>
            <fpage>1</fpage>
            <lpage>14</lpage>
            <xrefbib>
               <pubid idtype="doi">10.1214/088342306000000060</pubid>
            </xrefbib>
         </bibl>
      </refgrp>
   </bm>
</art>
