GENE EXPRESSION ANALYSIS OF THE MHC GENE COMPLEX AFTER γ-IFN INDUCTION

Abrar Khan

Genomics and Bioinformatics

Fall 2000

            The rapid increase in computer processing power over the last decade has allowed genomics and bioinformatics to develop into tools that can address previously unanswerable questions in biology.  One question that this author’s laboratory has been working on is the molecular mechanism of antigen processing and presentation by MHC class II.  More specifically, as protein antigens enter the endocytic pathway, they must be degraded to peptide fragments suitable for association with the peptide-binding groove of MHC class II (1).  Many proteases, including cathepsins B, D, S, and L as well as non-cysteine proteases, have been implicated in the degradation of intact protein antigens (2).  Recently, however, a novel protein involved in protein antigen degradation was isolated and sequenced: gamma-interferon-inducible lysosomal thiol reductase (GILT) (3).  Because many protein antigens contain disulfide bonds, the existence of such a reductase had long been inferred, but only recently was it identified and characterized.  GILT differs from other known thiol reductases in three important ways.  First, members of the thioredoxin family usually share a common active-site motif (WCGH/PCK); GILT does not.  Second, known members of the family function optimally at about pH 7, whereas GILT functions optimally at an acidic pH of 4.5 (3).  Third, it is the first thiol reductase found in the endocytic pathway; the others are found in the cytosol or endoplasmic reticulum.  Thioredoxin, a member of the thiol reductase family, depends on accessory molecules, such as thioredoxin reductase and NADPH, to carry out its function.  
The fact that known thioredoxin family members require accessory molecules for optimal function, together with the major differences between GILT and those family members, suggests that other, uncharacterized proteins may function in association with GILT and may be up-regulated concomitantly with it.  Such proteins can now potentially be identified by analysis of the transcriptome using bioinformatics. 

 

             Two proteins up-regulate components of the antigen processing system: the class II transactivator (CIITA) and gamma interferon (γ-IFN) (3).  GILT is up-regulated by γ-IFN, as are class II MHC molecules, the associated invariant chain, and HLA-DM, which catalyzes peptide loading in the MIIC, the putative peptide-loading compartment (4).  Because the up-regulation of these proteins is already established, they can serve as positive controls in the experiments proposed below. 

 

            It is also important to understand the MHC gene complex in humans in order to plan and interpret the proposed experiments accurately.  The MHC gene complex, found on chromosome 6, extends over approximately 4 centimorgans of DNA, or about 4 x 10^6 base pairs, although recent experiments have suggested that it may extend over approximately 7 x 10^6 bp (1).  Although the overwhelming majority of the genes involved in antigen processing and presentation lie on this chromosome within the defined MHC gene complex, certain crucial proteins, such as beta-2-microglobulin and invariant chain, are encoded on other chromosomes (1).  The experiments proposed below, which attempt to identify proteins up-regulated in association with GILT, will nevertheless focus primarily on the MHC gene complex.  If this approach yields no candidates, the experimental paradigm can be repeated with the regions surrounding the beta-2-microglobulin and invariant chain genes. 

 

            The experiments proposed below would not be possible without high-density synthetic oligonucleotide arrays, on which oligonucleotides of any desired length and sequence can be synthesized (5).  The probes can be designed from sequence information alone, in this case the sequence of the MHC region on chromosome 6 obtained from the Human Genome Project.  Non-overlapping oligonucleotides that span the length of the MHC gene complex, approximately 7 x 10^6 bp, will be placed on the chip.  The minimum length of an oligonucleotide that can serve as a unique identifier can be estimated from the formula 10^(0.6x) = 10^y, where 10^y is the number of base pairs and x is the number of nucleotides needed to uniquely identify a sequence within 10^y base pairs.  The factor 0.6 approximates log10(4), since there are 4^x = 10^(0.6x) possible sequences of length x.  For this experiment, 7 x 10^6 bp can be rounded up to 10^7 bp (rounding should always be upward, to the next order of magnitude).  So 0.6x = 7, or x ≈ 11.7, which rounds up to 12 (again, rounding upward favors unique identification).  Thus the chip will need non-overlapping 12-mers that span the entire MHC gene complex.  This will of course be done in all six reading frames.  In addition, a degree of redundancy will be included to improve the accuracy of the results (5): each target sequence will be represented by multiple independent detectors on the chip.  These consist of one perfect-match probe and multiple mismatch probes, which allows real signals to be distinguished from those due to non-specific or semi-specific hybridization.  This redundancy also improves the signal-to-noise ratio and the overall accuracy.  
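
            The probe-length estimate and the non-overlapping tiling described above can be sketched in Python.  This is only a minimal illustration under stated assumptions, not the chip design procedure itself: the function names are hypothetical, the example sequence is made up, and the mismatch probe is built by substituting the complementary base at the central position, a convention assumed here for illustration and not specified in this proposal.

```python
import math

# Assumed convention for illustration: a mismatch probe differs from the
# perfect-match probe by the complementary base at the central position.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def min_unique_length(n_bases: int) -> int:
    """Estimate the shortest oligo length x such that 4**x >= 10**y,
    where 10**y is n_bases rounded up to the next order of magnitude."""
    y = math.ceil(math.log10(n_bases))   # 7 x 10^6 bp is treated as 10^7 bp
    x = y / math.log10(4)                # log10(4) is about 0.6
    return math.ceil(x)                  # round upward to favor uniqueness


def tile_non_overlapping(sequence: str, k: int) -> list[str]:
    """Cut a sequence into consecutive, non-overlapping k-mers.
    A trailing fragment shorter than k is dropped."""
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, k)]


def mismatch_probe(perfect_match: str) -> str:
    """Return a probe with the central base replaced by its complement."""
    mid = len(perfect_match) // 2
    return perfect_match[:mid] + COMPLEMENT[perfect_match[mid]] + perfect_match[mid + 1:]


k = min_unique_length(7_000_000)
print(k)  # 12

toy_region = "ACGTACGTACGT" * 2 + "AC"   # made-up 26-base stand-in for genomic sequence
probes = tile_non_overlapping(toy_region, k)
print(probes)                      # two 12-mers; the final 2 bases are dropped
print(mismatch_probe(probes[0]))   # ACGTACCTACGT
```

            Tiling the reverse-complement strand, or shifting the starting offset, would reuse the same functions on a transformed input sequence.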

 

            Once the chip has been manufactured, the experiment can begin.  The overall experimental approach is as follows (the detailed experimental procedure is beyond the scope of this presentation, which is limited to 1000 words).  The first assumption behind this project is that GILT and the proteins that assist its function are concomitantly induced by γ-IFN.  The second assumption is that mRNA will in fact hybridize to the single-stranded exon sequences in genomic DNA.  We will use human B-lymphoblastoid cell lines (B-LCL), which were originally used for the isolation and identification of GILT (3).  The cells will be incubated with medium alone or with γ-IFN as described (3), and messenger RNA will be isolated at 0, 6, 12, and 24 hours.  Following analysis of polyadenylated and total RNA, the RNA will be amplified, labeled, and hybridized to the custom-built oligonucleotide arrays described above.  mRNA expression will then be quantitated as described (6), and the oligonucleotides that are positive in γ-IFN-cultured cells but not in medium-cultured cells will be listed.  Known oligos, such as those covering the MHC class II alpha and beta sequences, will serve as positive controls.  The increase in mRNA level observed for these proteins, which are known to be crucial for antigen processing and presentation, will indicate what level of increase should be considered significant.  From the list of oligos positive only in the γ-IFN-treated cells, those showing an increase in mRNA similar to that of the positive controls (MHC class II alpha and beta, invariant chain, HLA-DM) will be considered candidates for unknown proteins induced in conjunction with GILT.  It would be somewhat difficult, but possible, to deduce the precise amino acid sequence of these proteins from genomic DNA because of the widely spaced exons.  
However, analysis of contiguous positive oligos can determine the approximate location of the corresponding gene within the MHC gene complex.  Using this location and its known sequence, the gene can be cloned and the protein synthesized and sequenced in vitro.  By applying this procedure repeatedly, proteins that are synthesized in high quantities in association with GILT following γ-IFN induction can be identified.  Whether these newly identified proteins actually assist GILT in its enzymatic function will have to be determined subsequently, but that is not part of this experimental paradigm. 
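
            The selection of candidate oligos and the grouping of contiguous positives can be sketched as follows.  This is a hedged illustration only: the signal values are invented, the function names are hypothetical, and taking the lowest fold change among the positive-control probes as the cutoff is one possible reading of "similar to that of the positive controls", not a method prescribed by reference (6).

```python
def fold_changes(ifn_signal, medium_signal, floor=1.0):
    """Per-probe ratio of gamma-IFN signal to medium-only signal.
    Signals below `floor` are raised to it to avoid division by zero."""
    return [max(a, floor) / max(b, floor) for a, b in zip(ifn_signal, medium_signal)]


def control_threshold(fc, control_indices):
    """Assumed cutoff: the lowest fold change seen among positive-control probes."""
    return min(fc[i] for i in control_indices)


def contiguous_runs(flags, min_len=2):
    """Return inclusive (start, end) probe-index pairs for runs of True
    that are at least `min_len` probes long."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    if start is not None and len(flags) - start >= min_len:
        runs.append((start, len(flags) - 1))
    return runs


# Invented intensities for ten consecutive non-overlapping 12-mer probes.
ifn = [10, 12, 80, 90, 85, 11, 160, 168, 9, 70]
medium = [10, 12, 10, 10, 10, 11, 20, 20, 9, 10]
controls = [6, 7]   # hypothetical probes covering a known induced gene

fc = fold_changes(ifn, medium)
cutoff = control_threshold(fc, controls)
positive = [value >= cutoff for value in fc]
runs = contiguous_runs(positive)
print(runs)  # [(2, 4), (6, 7)]

# Convert probe indices to approximate base coordinates (probe i covers
# bases i*12 .. i*12 + 11), skipping runs that are just the controls.
k = 12
for start, end in runs:
    if not set(range(start, end + 1)) <= set(controls):
        print("candidate region:", start * k, "to", end * k + k - 1)
```

            In this toy data the run of probes 2 through 4 is reported as a candidate region, while the isolated probe 9 is ignored because it falls below the control-derived cutoff.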

           

            The methods made available to the basic scientist by the merging of computing and genomics are very powerful, and allow vast amounts of information to be examined at the same time.  This was not possible just a few years ago, and as the pace continues to increase, major leaps in molecular biology can be expected over the next few decades.

Reference List

 

    1.    Janeway Jr., C. A., P. Travers, M. Walport, and J. D. Capra. Immunobiology: the immune system in health and disease. Elsevier Science Ltd/Garland Publishing, New York.

    2.    Villadangos, J. A., R. A. Bryant, J. Deussing, C. Driessen, A. M. Lennon-Dumenil, R. J. Riese, W. Roth, P. Saftig, G. P. Shi, H. A. Chapman, C. Peters, and H. L. Ploegh. 1999. Proteases involved in MHC class II antigen presentation. Immunol. Rev. 172:109-120.

    3.    Arunachalam, B., U. T. Phan, H. J. Geuze, and P. Cresswell. 2000. Enzymatic reduction of disulfide bonds in lysosomes: characterization of a gamma-interferon-inducible lysosomal thiol reductase (GILT). Proc. Natl. Acad. Sci. U.S.A. 97:745-750.

    4.    Mach, B., V. Steimle, E. Martinez-Soria, and W. Reith. 1996. Regulation of MHC class II genes: lessons from a disease. Annu. Rev. Immunol. 14:301-331.

    5.    Lipshutz, R. J., S. P. Fodor, T. R. Gingeras, and D. J. Lockhart. 1999. High density synthetic oligonucleotide arrays. Nat. Genet. 21:20-24.

    6.    Harkin, D. P., J. M. Bean, D. Miklos, Y. H. Song, V. B. Truong, C. Englert, F. C. Christians, L. W. Ellisen, S. Maheswaran, J. D. Oliner, and D. A. Haber. 1999. Induction of GADD45 and JNK/SAPK-dependent apoptosis following inducible expression of BRCA1. Cell 97:575-586.