
“In Biology nothing makes sense except in the light of evolution.”[1] What about Junk DNA[2]? Non-coding DNA[3] [4] (also known as selfish[5], ignorant, parasitic[6] and incidental DNA[7]) includes introns [8] [9] [10] [11] [12] [13], transposable elements, pseudogenes, repeat elements, satellites, UTRs, hnRNAs, LINEs and SINEs,[14] as well as unidentified junk,[15] and makes up approximately 97% of the human genome.[16] Some scientists were so overwhelmed by the amount of non-coding DNA that they referred to the genome as “…a collection of non-coding regions interrupted by small coding regions.” [17] Junk DNA is ubiquitous, extending to all forms of life, which makes it an exciting evolutionary phenomenon.[18] [19] [20] With the heralded human genome project on the horizon and the powerful tools of the bioinformatician at hand, the prospect of shedding light on this problem is very appealing. [21] [22]


In all theories of Junk one must tread carefully and distinguish the cause of its existence from the consequences.[23] The simplest theory of Junk DNA claims that it is just that: an accumulation of non-functional sequence. This useless DNA grows[24] in the genome until the cost of replicating it becomes too great to bear. Thus organisms that develop at a slower rate tolerate more junk and use it to their advantage, slowing the rate of development via increased cell cycle length. [25] [26]


Some scientists have posited that Junk has only a passive purpose. [27] Total genome size is related to a number of organismal and cellular traits, suggesting that there is a selective advantage to larger genomes, including those that result from junk DNA filler. One theory claims that Junk acts as a bodyguard, absorbing harmful chemicals that could otherwise affect genuine genes. This has been refuted; in fact, larger genomes are subjected to more physical and chemical damage, which outweighs any bodyguard function.[28] Moreover, it has been shown that most mutations that reduce viability occur in non-coding DNA,[29] possibly indicating that Junk plays an active role in the genome.


Another hypothesis casts Junk as a sink for DNA-tropic proteins, buffering the effect of intracellular solute concentrations on the nuclear machinery. This energy-independent function of Junk could allow for a reduced basal metabolic rate and therefore an evolutionary niche.[30]


Researchers studying Junk tend to favour searching through repeat elements for function. Repeat elements are thought to be involved in chromosomal integrity. Many are also found in the heterochromatin and may be involved in centromeric activity and chromosome pairing, both of which are relevant to evolutionary divergence and speciation through manipulation of chromosomes.[31] The Alu sequence,[32] which comprises the largest class of SINEs,[33] [34] is just one of the many transposable elements that make up roughly 35% of our genome. Past studies have indicated no selective pressure on the non-functional Alu repeats.[35] A recent theory suggests that the hypomethylation of Alus in sperm, compared with the female germ line, implies that Alus (and their mRNAs, through the action of PKR) may be involved in signaling events in early embryogenesis.[36] Using computers, it is possible to determine consensus sequences for Alu insertion sites, which would help resolve their functions. Moreover, one could use phylogenetic tools to confirm the selective placement of Alu repeats across species.
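
As a rough illustration of the consensus idea, the sketch below takes a set of equal-length, pre-aligned sequences flanking hypothetical insertion sites and reports the majority base at each position. The function name `consensus`, the `min_fraction` cutoff and the example sequences are all assumptions made for illustration; real insertion-site analysis would start from a proper multiple alignment of curated genomic data.

```python
from collections import Counter

def consensus(aligned_seqs, min_fraction=0.5):
    """Return a majority-rule consensus for equal-length, pre-aligned sequences.

    A column whose most common base does not reach min_fraction is
    reported as 'N' (no clear consensus).
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned_seqs):
        base, count = Counter(column).most_common(1)[0]
        out.append(base if count / len(column) >= min_fraction else "N")
    return "".join(out)

# Hypothetical flanking sequences, invented for illustration only.
flanks = ["TTAAAAGC", "TTAAAAGG", "TTAAATGC", "CTAAAAGC"]
print(consensus(flanks))
```

A majority rule is only one possible choice; a position weight matrix would keep the per-column frequencies instead of collapsing them to a single letter.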


Additionally, non-repetitive Junk may also have functions that can be elucidated. It has been determined that non-coding DNA has a high GC content[37]. Theoretically, ORFs within these regions would have higher adaptive plasticity, as they could use GC-rich DNA to vary their final products for selective advantage via alternative splicing.[38] [39] Junk sequences may also function as cis-acting transcriptional regulators.[40] [41] One theory posits that base pair distribution in non-coding DNA affects gene transcription through a thermodynamic process. Moreover, the movement of transposable Junk results in a dynamic system of gene activation, which allows the organism to adapt to its environment without redesigning its hardwired system of gene activators.[42][43]
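
GC content is simple to measure, and a minimal sketch of the calculation follows. The helper names `gc_content` and `sliding_gc`, the window settings and the example sequence are assumptions for illustration, and ambiguous bases such as N are simply left out of the denominator.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # nothing countable, e.g. a run of N
    return (seq.count("G") + seq.count("C")) / counted

def sliding_gc(seq, window, step):
    """GC content of each full window, as (start_index, fraction) pairs."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

# Hypothetical sequence, invented for illustration only.
example = "GGGCGCGCATATATATGCGCGGCC"
for start, fraction in sliding_gc(example, window=8, step=8):
    print(start, round(fraction, 2))
```

Scanning in windows rather than computing a single figure makes it possible to compare GC-rich and GC-poor stretches within the same region.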


Even random Junk may not be just random junk. Powerful algorithms are currently searching for homologous sequences in Junk to find promoter functions in our DNA netherworlds[44]. These algorithms are designed to detect sequences that are highly conserved across evolution and are therefore probably functional. [45] [46] Statistical models have also been used to study the randomness of the spread and loss of repeat sequences throughout the genome.[47] [48]
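
The conservation criterion can be sketched in a few lines. The code below is not the algorithm used in the cited work; it is a toy version that assumes the homologous sequences have already been aligned to equal length, scores each column by the share of sequences carrying the most common character, and reports runs of fully conserved columns. The names `column_identity` and `conserved_runs`, the thresholds and the example alignment are hypothetical.

```python
def column_identity(aligned_seqs):
    """Per-column fraction of sequences sharing the most common character.

    Gap characters, if present, are counted like any other character.
    """
    scores = []
    for column in zip(*aligned_seqs):
        most_common = max(column.count(char) for char in set(column))
        scores.append(most_common / len(column))
    return scores

def conserved_runs(scores, threshold=1.0, min_len=5):
    """Half-open (start, end) intervals of at least min_len consecutive
    columns whose score is >= threshold."""
    runs, start = [], None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_len:
        runs.append((start, len(scores)))
    return runs

# Hypothetical alignment of one region from three species.
alignment = ["ACGTACGTAA", "ACGTACGTCC", "ACGTACGTGG"]
print(conserved_runs(column_identity(alignment)))
```

Practical tools weight the score by evolutionary distance, since identity between closely related species says little about function.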



In 1994 it was proposed[49] [50] that Junk might be similar to natural languages since it follows, among other things,[51] Zipf’s law.[52] [53] The researchers cited this as possible evidence that one or more structured languages exist in our Junk. This idea was refuted in many letters that claimed, among other things, that Junk does not fit Zipf’s law any better than coding DNA does.[54][55][56][57][58][59][60] Nevertheless, supporters of Zipf’s law maintain their stance.[61] [62] [63]
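
Zipf’s law states that the frequency of a word is roughly proportional to 1/rank^a, so a plot of log frequency against log rank is close to a straight line with slope -a. The sketch below shows one way such a test might be set up for DNA; treating overlapping k-mers as “words”, the choice of k, the least-squares fit and the example sequence are all assumptions made here, not details taken from the cited studies.

```python
import math
from collections import Counter

def kmer_counts(seq, k):
    """Count overlapping k-mers ('words') in seq."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 corresponds to the classic Zipf form.
    """
    freqs = sorted(counts.values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical sequence, invented for illustration only.
example = "ATATATGCGCATATTTAGCGATATATCGCGATAT"
print(zipf_slope(kmer_counts(example, k=3)))
```

A straight line on such a plot is weak evidence on its own, which is the point the critics made: the same fit should be run on coding DNA and on shuffled sequence before drawing any conclusion.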


Similarly, researchers using a variety of statistical techniques[64] have found long-range correlations in Junk DNA.[65] These may be attributable to the newfound functions of many intergenic non-coding regions, which include replication, chromosome segregation, recombination, chromosome stability and interaction with the nuclear matrix. All of these require the ‘high redundancy, low information’ sequence inherent in Junk.[66]
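
One common way to look for such correlations is a “DNA walk”: each base moves a walker one step up or down, and the spread of the walker’s displacement is measured over windows of increasing length. The sketch below is a minimal version of that idea and is not claimed to be the method of the cited work; the purine/pyrimidine step rule, the function names and the example sequence are assumptions. For a sequence with no correlations the fluctuation grows roughly as the square root of the window length, so faster growth would hint at long-range structure.

```python
import math

# Assumed convention: purines (A, G) step down, pyrimidines (C, T) step up.
STEP = {"A": -1, "G": -1, "C": 1, "T": 1}

def dna_walk(seq):
    """Cumulative position of the walker after each base, starting at 0."""
    position, walk = 0, [0]
    for base in seq.upper():
        position += STEP.get(base, 0)  # unknown bases do not move the walker
        walk.append(position)
    return walk

def rms_fluctuation(walk, lag):
    """Standard deviation of the displacement over all windows of length lag."""
    if not 0 < lag < len(walk):
        raise ValueError("lag must be between 1 and len(walk) - 1")
    diffs = [walk[i + lag] - walk[i] for i in range(len(walk) - lag)]
    mean = sum(diffs) / len(diffs)
    variance = sum((d - mean) ** 2 for d in diffs) / len(diffs)
    return math.sqrt(variance)

# Hypothetical sequence, invented for illustration only.
walk = dna_walk("ACGTTGCA" * 8)
for lag in (1, 2, 4, 8):
    print(lag, rms_fluctuation(walk, lag))
```

Fitting a line to log fluctuation against log window length gives the scaling exponent; real analyses need sequences of many thousands of bases for that fit to mean anything.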


Obviously, there are functions hidden in Junk DNA. While genetics and biochemistry have previously been the main thrust of this research, I believe that bioinformatic resources will be much more powerful. The same tools that have been used in the past to delve into the secrets of coding DNA can be used on Junk. A database of Junk sequences should be set up so that one can perform informatics experiments on the data; such a database should contain few biases and redundancies.[67] Candidate experiments include BLAST searches,[68] sequence analysis, phylogenetic analysis and even secondary structure analysis.[69] [70] [71] Molecular modeling may define docking sites in Junk secondary structure for the binding of DNA-associated proteins. Junk-specific alignment programs will have to be written to take into account the high mutation rate of these sequences. Furthermore, other statistical analyses[72] along the same lines as the failed Zipf’s law experimentation will definitely prove extremely useful in uncovering Junk’s hidden treasures. Moreover, computational analysis of Junk will reveal other useful information regarding evolutionary progress,[73] and evolutionary knowledge is the key to understanding our biology.[74]


