In all theories of Junk one must tread carefully and differentiate between the cause and the consequence of its existence. The theory that assigns Junk DNA the least function claims that it is just that, a generous juxtaposition of non-functional junk. This useless DNA accumulates in the genome until the cost of replicating it becomes too great to bear. Thus organisms that develop at a slower rate tolerate more junk and use it to their advantage, slowing down the rate of development via increased cell cycle length.
Some scientists have posited that Junk has only a passive purpose. Total genome size is related to a number of organismal and cellular traits, which suggests that there is a selective advantage in larger genomes, including those that result from Junk DNA filler. One theory claims that Junk acts as a bodyguard, absorbing harmful chemicals that could otherwise affect genuine genes. This has been refuted; in fact, larger genomes are subjected to more physical and chemical damage, which outweighs the bodyguard function. Moreover, it has been shown that most mutations that reduce viability occur in non-coding DNA, possibly indicating that Junk plays an active role in the genome.
Another hypothesis casts Junk as a sink for DNA-tropic proteins, thereby buffering the effect of intracellular solute concentrations on the nuclear machinery. This energy-independent function of Junk could allow for a reduced basal metabolic rate and therefore access to an evolutionary niche.
Researchers studying Junk tend to favour searching through repeat elements for function. Repeat elements are thought to be involved in chromosomal integrity. Many are also found in heterochromatin and may be involved in centromeric activity and chromosome pairing, both of which are relevant to evolutionary divergence and speciation through the manipulation of chromosomes. The Alu sequence, which comprises the largest class of SINEs, is just one of the many transposable elements that make up roughly 35% of our genome. Past studies have indicated no selective pressure on the supposedly non-functional Alu repeats. A recent theory suggests that the hypomethylation of Alus in sperm, compared with the female germ line, implies that Alus (and their mRNAs, through the action of PKR) may be involved in signaling events in early embryogenesis. Using computers it is possible to determine consensus sequences for Alu insertion sites, which would help resolve their functions. Moreover, one could use phylogenetic tools to confirm the selective placement of Alu repeats across species.
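As a rough illustration of the consensus idea, the sketch below derives a majority-rule consensus from a handful of pre-aligned insertion-site sequences. The function name, the 50% threshold and the toy sequences are assumptions made for this example only; they are not real Alu flanking data, and real analyses would first need a proper alignment step.

```python
from collections import Counter


def consensus(aligned_sites, min_fraction=0.5):
    """Return a majority-rule consensus for equal-length, pre-aligned sequences.

    A column contributes its most common base if that base appears in at
    least min_fraction of the sequences; otherwise the column becomes 'N'.
    """
    if not aligned_sites:
        raise ValueError("need at least one sequence")
    length = len(aligned_sites[0])
    if any(len(s) != length for s in aligned_sites):
        raise ValueError("sequences must be pre-aligned to equal length")

    result = []
    for column in zip(*aligned_sites):
        base, count = Counter(column).most_common(1)[0]
        if count / len(aligned_sites) >= min_fraction:
            result.append(base)
        else:
            result.append("N")
    return "".join(result)


# Hypothetical insertion-site sequences, invented for illustration only.
sites = ["TTAAAAGC", "TTAAAGGC", "TTAAAAGT", "CTAAAAGC"]
print(consensus(sites))  # prints TTAAAAGC
```

Raising min_fraction makes the consensus stricter, so weakly conserved positions are reported as 'N' rather than as a misleading base.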
Additionally, non-repetitive Junk may also have functions that can be elucidated. It has been determined that non-coding DNA has a high GC content. Theoretically, ORFs within these regions would have higher adaptive plasticity, as they could use GC-rich DNA to vary their final products for selective advantage via alternative splicing. Junk sequences may also function as cis-acting transcriptional regulators. One theory posits that base pair distribution in non-coding DNA affects gene transcription through a thermodynamic process. Moreover, the movement of transposable Junk results in a dynamic system of gene activation, which allows the organism to adapt to its environment without redesigning its hardwired system of gene activators.
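The GC content mentioned above is simple to measure. The following is a minimal sketch, assuming the sequence is available as a plain string; the function names, window size and step are illustrative choices and do not come from any of the cited studies.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def gc_windows(seq, window, step):
    """Return (start, gc_fraction) pairs for windows slid along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence, invented for illustration only.
print(gc_windows("GGGGAAAA", window=4, step=4))
```

Comparing window profiles of annotated coding and non-coding stretches would be one way to check the claim on a given genome.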
Even random Junk may not be just random junk. Powerful algorithms are currently searching for homologous sequences in Junk to find promoter functions in our DNA netherworlds. These algorithms are designed to detect sequences that are highly conserved across evolution and are thus probably functional. Statistical models have also been used to study the randomness of the spread and loss of repeat sequences throughout the genome.
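The conservation criterion can be sketched very simply: given the same region aligned across several species, report the stretches where every sequence carries the same base. This is a deliberately naive stand-in for the published search algorithms, which are far more sophisticated; the function name, the minimum block length and the toy alignment are assumptions for illustration.

```python
def conserved_blocks(alignment, min_length=6):
    """Return half-open (start, end) spans of perfectly conserved columns.

    alignment is a list of equal-length strings; '-' marks a gap and a
    column containing a gap is never counted as conserved.
    """
    length = len(alignment[0])
    if any(len(s) != length for s in alignment):
        raise ValueError("sequences must be aligned to equal length")

    blocks = []
    start = None
    for i, column in enumerate(zip(*alignment)):
        identical = column[0] != "-" and all(b == column[0] for b in column)
        if identical:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_length:
                blocks.append((start, i))
            start = None
    # Close a block that runs to the end of the alignment.
    if start is not None and length - start >= min_length:
        blocks.append((start, length))
    return blocks


# Toy three-species alignment, invented for illustration only.
toy = ["ACGTATAAAGGC", "TCGTATAAAGTC", "GCGTATAAA-GC"]
print(conserved_blocks(toy))  # prints [(1, 9)]
```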
Similarly, researchers using a variety of statistical techniques have deduced long-range correlations in Junk DNA. These may be attributable to the newfound functions of many intergenic non-coding regions, which include replication, chromosome segregation, recombination, chromosome stability and interaction with the nuclear matrix. All of these require the ‘high redundancy low information’ sequence inherent in Junk.
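One elementary way to look for correlations of this kind is to map each base to a number and compute the autocorrelation at increasing lags. The sketch below uses a purine/pyrimidine mapping, which is an assumption of this example; the cited studies use more elaborate techniques, and a plain autocorrelation is offered only to make the idea concrete.

```python
def to_steps(seq):
    """Map purines (A, G) to +1 and pyrimidines (C, T) to -1; skip other symbols."""
    mapping = {"A": 1, "G": 1, "C": -1, "T": -1}
    return [mapping[b] for b in seq.upper() if b in mapping]


def autocorrelation(steps, lag):
    """Normalised autocorrelation of a numeric series at the given lag."""
    n = len(steps)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(steps) - 1")
    mean = sum(steps) / n
    variance = sum((x - mean) ** 2 for x in steps) / n
    if variance == 0:
        return 0.0
    covariance = sum(
        (steps[i] - mean) * (steps[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return covariance / variance


# A strictly alternating toy sequence, invented for illustration only.
steps = to_steps("AC" * 50)
print(autocorrelation(steps, 1), autocorrelation(steps, 2))
```

For a sequence with no memory the values should hover near zero at every lag; correlations that decay slowly with lag are what the long-range claims refer to.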
Obviously, there are functions hidden in Junk DNA. While genetics and biochemistry have previously been the main thrust of this research, I believe that bioinformatic resources will be much more powerful. The same tools that have been used in the past to delve into the secrets of coding DNA can be used on Junk. A database of Junk sequences should be set up so that one can perform informatics experiments on the data, as there should be few biases and redundancies in a database of Junk. Such experiments could include BLAST searches, sequence analysis, phylogenetic analysis and even secondary structure analysis. Molecular modeling may define docking sites in Junk secondary structure for the binding of DNA-associated proteins. Junk-specific alignment programs will have to be written to take into account the high mutation rate of Junk. Furthermore, other statistical analyses along the same lines as the failed Zipf’s law experimentation will definitely prove extremely useful in uncovering Junk’s hidden treasures. Moreover, computational analysis of Junk will reveal other useful information regarding evolutionary progress, and evolutionary knowledge is the key to understanding our biology.
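The Zipf and Shannon analyses referred to in this essay can be sketched in a few lines. The example below treats overlapping k-mers as "words", which is an assumption of this sketch (the cited papers may define words differently), fits the slope of the rank-frequency plot on a double logarithmic scale, and computes Shannon redundancy as one minus the ratio of observed entropy to the maximum of 2k bits for k-mers over a four-letter alphabet. All function names are illustrative.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count overlapping words of length k in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank)."""
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def shannon_redundancy(counts, k):
    """1 - H / H_max, where H_max = 2k bits for k-mers over A, C, G, T."""
    total = sum(counts.values())
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts.values()
    )
    return 1 - entropy / (2 * k)


# Toy sequence, invented for illustration only.
counts = kmer_counts("ACGTACGTTTGACA", 2)
print(zipf_slope(counts), shannon_redundancy(counts, 2))
```

As the critics cited in the endnotes point out, random sequences can also produce a Zipf-like slope, so any such fit needs a control sequence with matched nucleotide frequencies before it is interpreted.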
 Theodosius Dobzhansky (as heard at the Genetics Graduate Student Seminar)
 This pejorative name for the silent majority of DNA was coined by Susumu Ohno in the early 70s. See:
Kuska B Journal of the National Cancer Institute 90, 1032 (1998)
 See: Brosius J, Gould SJ PNAS 89, 10706 (1992) for a failed attempt to conjure up new nomenclature
 As opposed to non-coding RNA which is another topic unto itself.
See for example Askew DS, Xu F Histology and Histopathology 14, 235 (1999)
 These are not the same. See: Ohno S Yomo T Electrophoresis 12, 103 (1991)
 Orgel et al Nature 288, 645 (1980)
 Dover G Doolittle WF Nature 288, 646 (1980)
 Tycowski KT et al Genes Development 7A, 1176 (1993) for an example of a coding gene found in an intronic
region of a different gene
 Hall DL et al Canadian Journal of Statistics 26, 455 (1998) found evidence of non-randomness in introns, using sequence alignment
programs and clustering algorithms, but do not speculate as to the function
 Introns may have a second gene regulatory control mechanism that has yet to be worked out. See Mattick
J Current Opinion in Genetics and Development 4, 823 (1994)
 Moore, MJ Nature 379, 402 (1996) snoRNAs are encoded by introns. Also see the article by Tycowski et al
Nature 379, 464 (1996)
 Kuska B Journal of the National Cancer Institute 90, 1125 (1998) Introns are 33% repetitive
 Gardiner K Gene 205, 39 (1997) Many lower eukaryotes have the same gene with the same function
 Nowak R Science 263, 608 (1994)
Satellites: repeats at the ends and centers of chromosomes
UTR: untranslated regions, DNA that is transcribed into RNA but not translated
SINEs: short interspersed elements, e.g. Alu
LINEs: long interspersed elements
HnRNA: heterogeneous nuclear RNA; 25% is immature mRNA, the other 75% is a mystery
 The ORFans of the yeast genome are thought to be non-coding DNA as well.
See Mackiewicz et al Nucleic Acids Research 27, 3503 (1999)
 Nowak R Science 263, 608 (1994)
 Provata A Almirantis Y Physica A 247, 482 (1997) (emphasis mine)
 In Viruses see for example: Maki et al Journal of General Virology 77, 453 (1996)
 In Bacteria see for example: Higgins CF et al Gene 72, 3 (1988)
 In Plants see for example: Kubis S Annals of Botany 82, 45 (1998)
 See editorial content of Koonin who puts it on the top ten list of things to do for bioinformatics.
Koonin Bioinformatics 15, 265 (1999)
 See talk next week EM Rubin (Dec 14, 1999)
 Edgell et al Current Biology 6, 385 (1996)
 via mechanisms such as transposition, slippage, gene conversion, unequal crossing over etc. See
Vinogradov AE Journal of Theoretical Biology, 193, 197 (1993)
 Pagel M et al Proceedings of the Royal Society of London Biological Sciences 249, 119 (1992)
 See Orgel LE et al Nature 288, 645 (1980) who claim the opposite effect on cell development
 There are many papers showing that a lack of Junk, or defective Junk, actively results in disease. See for
example Epplen JT et al Cytogenetics and Cell Genetics 80, 75 (1998)
 Hsu, TS Bioessays 14, 785 (1992)
 Tachida H Japanese Journal of Genetics 68, 549 (1993)
 Vinogradov AE Journal of Theoretical Biology 193, 197 (1993)
 Dimitri P, Junakovic N Trends in Genetics 15, 123 (1999)
And thus an important function that must be conserved.
 A 282 nt consensus sequence. See: Schmid CW Progress in Nucleic Acid Research and Molecular
Biology 53, 283 (1996)
 Other types include the Mariner, which is many orders of magnitude lower in copy number. See:
Robertson HM Martos R Gene 205, 219 (1997)
 They are present in primates at 5x10^5 to 1x10^6 copies per cell. See: Vansent G, Reynolds WF PNAS, 92, \
 Deininger PL, Batzer MA Molecular Genetics and Metabolism 67, 183 (1999)
 Schmid CW Nucleic Acids Research 26, 4541 (1998)
 Raghavan S et al Journal of Molecular Evolution 45, 485 (1997)
 Guigo R, Fickett JW Journal of Molecular Biology 13, 51 (1995)
 Possibly what is meant in Jain HK Nature 288, 647 (1980)
 Brahmachari SK et al Gene 190, 17 (1997)
 Lipman DJ Nucleic Acids Research 15, 3580 (1997) Non-coding regions may be involved in mRNA
stability, again influencing the function of genes post-transcriptionally.
 Sandler U, Wyler A Journal of Theoretical Biology 193, 85 (1998)
 See Zuckerkandl E Gene 205, 323 (1997) who, as a major proponent of functional junk DNA, also
proposes that Junk is involved in sectorial gene repression
 Ohler U et al Bioinformatics 15, 362 (1999) as an example of such a search
 Duret L, Bucher P Current Opinion in Structural Biology 7, 399 (1997)
 Donehower LA et al Nucleic Acids Research 17, 699 (1989)
 Ohta T Nature 292, 648 (1981) occur randomly (Cell control independent?)
 Charlesworth B et al Nature 371, 215 (1994) non-random loss (Cell control?)
 Mantegna RN et al Physical Review Letters 73, 3169 (1994)
 Flam F Science 266, 1320 (1994) Many papers on the subject cite this news article.
 i.e. Shannon’s redundancy function. This theory states that a language can lose words or letters and still
be decipherable. Shannon computed this redundancy using the concept of entropy.
 Simply that, if one were to create a histogram containing all the words in a language and
their occurrence, the arrangement in rank order would be linear on a double logarithmic scale with a slope of -z. This is the case for all natural languages.
 See for interesting usage of this phenomenon, S Singh The Code Book : The Evolution of Secrecy from Mary, Queen of Scots to Quantum Cryptography, Doubleday 1999
 Konopka AK, Martindale C Science 268, 789 (1995)
 Bonhoeffer et al Science 271, 14 (1996) and Bonhoeffer et al Physical Review Letters 76, 1977 (1996)
Claims that the results are due solely to unequal nucleotide frequencies in coding vs. non-coding DNA
 Israeloff NE et al Physical Review Letters 76, 1976 (1996) claimed that there was no control study
backing up their results. They did their own study and found that they could not differentiate, using Zipf, between language and power-law noise
 Voss RF Physical Review Letters 76, 1978 (1996) Claims that the paper ignores the fact that while
Zipf's law exists, it provides no useful information about a language. Random sequences are also found to follow Zipf’s law.
 Tsonis AA et al Journal of Theoretical Biology 184, 25 (1997) As opposed to the other short letters, this
paper is a little more in depth, claiming both biological and statistical proof
 Attard GS et al Europhysics Letters 36, 391 (1996) The paper is somewhat misleading, as the reader
might mistake it for supporting Mantegna et al. But the paper’s conclusions are that any language
found in Junk DNA is that of opportunistic elements that exploit the structure of Junk DNA and
not the sequences themselves.
 Chatzidimitriou-Dreismann CA et al Nucleic Acids Research 24, 1676 (1996) used computer
simulations on both natural and artificial sequences to arrive at their conclusions
 See also Stanley et al Nuovo Cimento Della Societa Italiana Di Fisica D 16, 1339 (1994) for support of
the Zipf law theory.
 Mantegna et al Physical Review Letters 76, 1980 (1996)
 Chechetkin VR, Lobzin VV Physics Letters A 222, 354 (1996) Supports the usage of the Shannon
Redundancy Function as proof that there is information in Junk DNA although what it is, is unknown
 for example the expansion modification system
 Li W, Kaneko K Europhysics Letters 17, 655 (1992) and Li W et al Physica D 75, 392 (1994)
 Frontali C, Pizzi E Gene 232, 87 (1999)
 This is a major problem in coding databases. See Altschul, SF et al Nature Genetics 6, 119 (1994)
 PSI-BLAST is probably more useful when dealing with Junk as it allows for gaps, something that is
likely, as polymerase slippage is responsible for a lot of Junk. Altschul SF Nucleic Acids Research 25, 3389 (1997)
 See McMurray CT PNAS 96, 1823 (1999) for the current debate on different possibilities in DNA
 Lakhotia SC Indian Journal of Biochemistry and Biophysics 33, 93 (1996)
 Morozov Syu et al Journal of Biomolecular Structural Dynamics 11, 837 (1994) for effects of DNA
structure on replication
 I would assume that harnessing the power of the NSA’s cipher-cracking computers and analysts would
be very useful in this field. See http://www.nsa.gov:8080/programs/msp/grants.html
 See for example Almirantis Y Journal of Theoretical Biology 196, 297 (1999) and Elder JF, Turner BJ
Quarterly Review of Biology 70, 297 (1995)
 See first endnote