Celiac Disease (CD) Novel Protein Risk Assessment Tool
The Food Allergy Research & Resource Program (FARRP) in the Department of Food Science & Technology, University of Nebraska, has added a new bioinformatics tool to identify Exact Peptide matches between the amino acid sequence of a query protein and the 1,016 naturally occurring, mutated or deamidated (Gln converted to Glu by tissue transglutaminase) peptides from wheat and wheat relatives (barley, rye and two proteins from oats) that have been demonstrated to elicit celiac disease or to activate MHC class II restricted T cells of subjects with celiac disease. The specificity arises from presentation of these peptides by genetically inherited Major Histocompatibility Complex class II receptor variants, HLA DQ2.5 or DQ8, which activate T cells in affected individuals. Proteins derived from the wheat subfamily (Pooideae) of the grass family (Poaceae) that are considered for use as novel food ingredients, or that are introduced into other species of food crops through genetic engineering, may pose a risk for those with celiac disease if they contain celiac-active peptides. The database provides a simple screening tool to identify proteins that might pose a risk of eliciting celiac disease, or that are sufficiently similar to CD-eliciting proteins or peptides that further testing would be reasonable to demonstrate safety for consumption by affected individuals. In addition to the Exact Peptide match, the linked Celiac Disease database includes a FASTA algorithm to compare the query protein against the 68 celiac-inducing proteins that are the sources of the peptides, and a list of 53 published references supporting the inclusion of peptides and proteins in the database. Proteins lacking any identity match to the 1,016 peptides are not likely to trigger celiac disease.
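The idea behind an exact peptide match can be sketched in a few lines of Python. This is only an illustration of the concept, not the FARRP implementation: the function name `find_exact_peptide_matches` is hypothetical, and the peptide and query strings are made-up placeholders rather than entries from the database.

```python
from typing import Dict, List


def find_exact_peptide_matches(query: str, peptides: List[str]) -> Dict[str, List[int]]:
    """Return each peptide found verbatim in the query, with 1-based start positions."""
    query = query.upper()
    hits: Dict[str, List[int]] = {}
    for peptide in peptides:
        peptide = peptide.upper()
        positions = []
        start = query.find(peptide)
        while start != -1:
            positions.append(start + 1)  # report 1-based residue positions
            # Advance by one residue so overlapping occurrences are also found.
            start = query.find(peptide, start + 1)
        if positions:
            hits[peptide] = positions
    return hits


# Placeholder sequences for illustration only (not taken from the database).
example_peptides = ["PQQPF", "LPYPQ"]
example_query = "MKTPQQPFAALPYPQGGPQQPF"
print(find_exact_peptide_matches(example_query, example_peptides))
```

An empty result corresponds to a query with no exact peptide match. Deamidated variants would simply appear as additional peptide strings in the list, since the database stores them as separate entries.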
The FASTA comparison to the 68 proteins has not yet been validated sufficiently to set absolute thresholds of concern for celiac disease. However, based on preliminary searches with proteins from rice, sorghum, maize and other food sources that are considered safe for those with celiac disease, a match is unlikely to present a risk of inducing celiac disease if the identity over the FASTA alignment is less than 45 percent, or the alignment covers less than one-half of the CD protein, or the E score is greater than 1 x 10^-15 using this database. Certainly, query sequences lacking an Exact Peptide match and lacking this defined identity match of >45% to the 68 proteins in this database should be viewed as representing a low risk to those with celiac disease.
Note: This database became available for public use on 14 February, 2012. The peptide and protein entries as well as alignment tools will be evaluated further before May 2013.
Celiac Disease, also known as gluten-sensitive enteropathy or celiac sprue, is a genetically linked inflammatory immune disease with varying severity in an estimated 0.5% to 1.5% of the population in various geographies. (A, B) Affected individuals experience symptoms after the consumption of food containing proteins from wheat, barley, rye and possibly oats and other grass family grains closely related to wheat.(C) The primary target organ is the upper small intestine and symptoms are usually associated with the digestive tract with chronic diarrhea, abdominal pain, cramping, bloating or irritable bowel syndrome.(D) However, general nutritional deficiency, failure to thrive, mouth ulcers, and fatigue are experienced by many subjects. Continuing exposure to glutens leads to increased immune response, increased expression of tissue transglutaminase and inflammation that leads to flattening of the villi in the small intestine, erosion of the mucosal epithelium and loss of absorptive capacity.(E) Vitamin deficiency is common. Loss of calcium density in bones is associated with the disease and there is an increased risk of developing adenocarcinoma of the small intestine and T-cell lymphoma.
The specificity of the disease is determined by T lymphocytes that bind to specific native or deamidated peptides of certain wheat-family glutens (glutenins and gliadins) presented in the antigen-presenting groove of the MHC class II molecules HLA DQ2.5 or DQ8. This leads to a CD4 T cell response that drives inflammation involving macrophages, NK cells and other inflammatory cells that cause tissue destruction.(F) Interestingly, while nearly 20% of individuals in North America and Europe express HLA DQ2, and ~95% of those with celiac disease have HLA DQ2.5 (and the others DQ8), only about 1% of the population has been diagnosed with celiac disease. Thus, other unknown factors are also very important determinants of celiac disease or tolerance. Many speculate that a much higher percentage of the population simply has subtle symptoms or is undiagnosed, but there is little hard evidence to support a prevalence much greater than 1% of the global population.
Avoiding the proteins that stimulate the immune response is the only effective treatment for those with celiac disease. Dietary avoidance is complex because these cereal grains are commonly used not only as major carbohydrate and protein food sources in breads and pasta; processed wheat and wheat relatives are also used as functional food ingredients in many restaurant and processed foods. Protection of those with celiac disease requires separation of commodities that are intended for “gluten-free” foods, from the source of the commodity through processing and packaging. Gluten-free foods must also be labeled clearly and accurately in order to protect the most sensitive affected consumers. Food companies that produce gluten-free foods work hard to source commodities from suppliers with minimal (no) contamination. Interestingly, there is evidence that gluten-like proteins in oats do not affect most celiac patients if the oats are pure and free from wheat, barley and rye grain. However, oats are often produced on farms that also grow wheat, or wheat is grown on neighboring farms, creating a potential source of contamination. Further, farming equipment and shipping containers (trucks, trains and ships) often carry wheat, barley or rye and can serve as sources of contamination. In addition, commodity processing and food manufacturing facilities are often used for products that contain wheat, so accurate segregation is difficult. Consumers with celiac disease must trust food producers to accurately represent foods as being free from gluten, and there are stringent standards for claiming “gluten-free” in many industrialized countries.
Genetically Modified Organisms and Novel Food Ingredients
To help ensure that those with gluten sensitivity are not placed at greater risk of exposure, regulatory guidelines for genetically modified crops recommend that proteins encoded by genes transferred from wheat and wheat relatives into different food sources (e.g. rice, maize, potato) be evaluated regarding their capacity to elicit celiac disease.(G) FARRP believes that the current Celiac Disease Peptide and Protein searchable database provides an efficient screening tool to determine whether additional tests (e.g. laboratory T cell activation tests using samples from individuals with CD, tissue biopsy challenge, or clinical challenge of volunteers with CD) would need to be undertaken to demonstrate the safety of a new protein. Proteins isolated from wheat and wheat relatives for use as novel food ingredients could also be assessed using this computer comparison.
Our opinion is that proteins that do not contain an exact peptide match to those identified in this database are unlikely to induce symptoms in those with celiac disease. A FASTA search of the query protein against the 68 proteins in the Celiac Database may also prove predictive, primarily as a negative screen, or as a safety check to evaluate overall protein sequence similarity to CD-eliciting proteins. Query sequences that show no identical match to a celiac peptide, and that also have no FASTA match of at least 45% identity in an alignment covering at least one-half of the length of a celiac-inducing protein in our celiac database with an E score smaller than 1 x 10^-15, are highly unlikely to elicit symptoms in those with celiac disease. Proteins with matches more significant than those criteria (>45% identity in an alignment of at least 50% of the length of the celiac protein, with an E score less than 1 x 10^-15) should probably be tested to demonstrate a lack of eliciting potential even if they do not contain an exact match to a celiac peptide. The additional testing might include in vitro lymphocyte activation tests using whole white-cell preparations from a number of individuals with celiac disease, or in vitro challenges of cultured small intestinal biopsy samples from a number of celiac-affected individuals, in lieu of in vivo challenges of CD volunteers.
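The screening logic described above can be written out as a small decision rule. The following Python sketch is one possible reading of the criteria, not FARRP's code: the `FastaHit` record, the function names and the result labels are hypothetical, and the boundaries are assumed to be inclusive for identity (at least 45%) and coverage (at least one-half), since the text uses both "at least 45%" and ">45%".

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FastaHit:
    """One FASTA alignment of the query to a celiac protein (hypothetical record)."""
    identity_pct: float      # percent identity over the alignment
    alignment_len: int       # aligned length, in residues
    celiac_protein_len: int  # full length of the matched celiac protein
    e_value: float           # FASTA E score


def exceeds_fasta_criteria(hit: FastaHit) -> bool:
    """True only if identity, coverage and E score criteria are all met."""
    if hit.celiac_protein_len <= 0:
        raise ValueError("celiac_protein_len must be positive")
    coverage = hit.alignment_len / hit.celiac_protein_len
    # Assumption: identity and coverage boundaries are treated as inclusive.
    return hit.identity_pct >= 45.0 and coverage >= 0.5 and hit.e_value < 1e-15


def screen(has_exact_peptide_match: bool, hits: Iterable[FastaHit]) -> str:
    """Combine the exact peptide result with the FASTA hits into one label."""
    if has_exact_peptide_match:
        return "exact peptide match"
    if any(exceeds_fasta_criteria(hit) for hit in hits):
        return "further testing suggested"
    return "low risk"


# Made-up alignment values for illustration only.
strong = FastaHit(identity_pct=62.0, alignment_len=200, celiac_protein_len=300, e_value=1e-40)
short = FastaHit(identity_pct=80.0, alignment_len=60, celiac_protein_len=300, e_value=1e-20)
print(screen(False, [short]))
print(screen(False, [short, strong]))
```

Because all three conditions must hold together, a short but highly identical alignment such as `short` does not by itself trigger the recommendation for further testing.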
Compilation and Review of FARRP Celiac Disease Peptide and Protein database
A graduate student, Plaimein Amnuaycheewa, MS, compiled the set of celiac active peptides from his review of approximately 100 publications describing proteins and peptides (1,016) that have been tested for T cell activation potential or celiac enteropathy in peer reviewed publications. The list of 68 celiac associated wheat-related proteins was compiled as representing proteins containing one or more of the peptides. Our bioinformatician, John Wise, compiled the data and structured the database and search routines. The dataset was reviewed by Postdoctoral fellow Afua Ofori-Anti (Afua Tetteh), PhD and Richard Goodman, PhD, manager of AllergenOnline.org. The Celiac Disease database was first released on 14 February, 2012. The effort was funded primarily by FARRP and partly by the six biotechnology companies that fund the FARRP AllergenOnline.org database. The database and bioinformatics methods will be reviewed further and updated by May, 2013.
- A. Biagi F, Klersy C, Balduzzi D, Corazza GR. 2010. Are we not over-estimating the prevalence of coeliac disease in the general population? Annals of Medicine 42:557-561. PMID:20883139
- B. Abadie V, Sollid LM, Barreiro LB, Jabri B. 2011. Integration of genetic and immunological insights into a model of celiac disease pathogenesis. Annual Review of Immunology 29:493-525. PMID:21219178
- C. Tye-Din JA, Stewart JA, Dromey JA, Beissbarth T, van Heel DA, Tatham A, Henderson K, Mannering SI, Gianfrani C, Jewell DP, Hill AVS, McCluskey J, Rossjohn J, Anderson RP. 2010. Comprehensive, quantitative mapping of T cell epitopes in gluten in celiac disease. Science Translational Medicine 2(41):41ra51. PMID:20650871
- D. Scanlon SA, Murray JA. 2011. Update on celiac disease—etiology, differential diagnosis, drug targets, and management advances. Clinical and Experimental Gastroenterology. PMID:22235174
- E. Sollid LM, Jabri B. 2011. Celiac disease and transglutaminase 2: a model for posttranslational modification of antigens in HLA association in the pathogenesis of autoimmune disorders. Current Opinion in Immunology 23:732-738. PMID:21917438
- F. Kagnoff MF. 2007. Celiac disease: pathogenesis of a model immunogenetic disease. Journal of Clinical Investigation 117:41-49. PMID:17200705
- G. Codex (2003). Codex Alimentarius Guidelines. Alinorm 03/34, Joint FAO/WHO Food Standards Programme, Twenty-Fifth Session (FA), Rome, Italy