Yale CEGS publications: Phase I: 2001-2006


Urban AE, Korbel JO, Selzer R, Richmond T, Hacker A, Popescu GV, Cubells JF, Green R, Emanuel BS, Gerstein M, Weissman SM, Snyder M. High-resolution mapping of DNA copy alterations in human chromosome 22 using high-density tiling oligonucleotide arrays. Proc Natl Acad Sci U S A. 2006 Mar 21;103(12):4534-9.

Stolc V, Li L, Wang X, Li X, Su N, Tongprasit W, Han B, Xue Y, Li J, Snyder M, Gerstein M, Wang J, Deng XW. A pilot study of transcription unit analysis in rice using oligonucleotide tiling-path microarray. Plant Mol Biol. 2005 Sep;59(1):137-49.

Hartman SE, Bertone P, Nath AK, Royce TE, Gerstein M, Weissman S, Snyder M. Global changes in STAT target selection and transcription regulation upon interferon treatments. Genes Dev. 2005 Dec 15;19(24):2953-68.

Gilad Y, Rifkin SA, Bertone P, Gerstein M, White KP. Multi-species microarrays reveal the effect of sequence divergence on gene expression profiles. Genome Res. 2005 May;15(5):674-80.

Bertone P, Stolc V, Royce TE, Rozowsky JS, Urban AE, Zhu X, Rinn JL, Tongprasit W, Samanta M, Weissman S, Gerstein M, Snyder M. Global identification of human transcribed sequences with genome tiling arrays. Science. 2004 Dec 24;306(5705):2242-6.

White EJ, Emanuelsson O, Scalzo D, Royce T, Kosak S, Oakeley EJ, Weissman S, Gerstein M, Groudine M, Snyder M, Schubeler D. DNA replication-timing analysis of human chromosome 22 at high resolution and different developmental states. Proc Natl Acad Sci U S A. 2004 Dec 21;101(51):17771-6.

Rinn JL, Rozowsky JS, Laurenzi IJ, Petersen PH, Zou K, Zhong W, Gerstein M, Snyder M. Major molecular differences between mammalian sexes are involved in drug metabolism and renal function. Dev Cell. 2004 Jun;6(6):791-800.

Euskirchen G, Royce TE, Bertone P, Martone R, Rinn JL, Nelson FK, Sayward F, Luscombe NM, Miller P, Gerstein M, Weissman S, Snyder M. CREB binds to multiple loci on human chromosome 22. Mol Cell Biol. 2004 May;24(9):3804-14.

Martone R, Euskirchen G, Bertone P, Hartman S, Royce TE, Luscombe NM, Rinn JL, Nelson FK, Miller P, Gerstein M, Weissman S, Snyder M. Distribution of NF-kappaB-binding sites across human chromosome 22. Proc Natl Acad Sci U S A. 2003 Oct 14;100(21):12247-52.

Rinn JL, Euskirchen G, Bertone P, Martone R, Luscombe NM, Hartman S, Harrison PM, Nelson FK, Miller P, Gerstein M, Weissman S, Snyder M. The transcriptional activity of human Chromosome 22. Genes Dev. 2003 Feb 15;17(4):529-40.

Lian Z, Kluger Y, Greenbaum DS, Tuck D, Gerstein M, Berliner N, Weissman SM, Newburger PE. Genomic and proteomic analysis of the myeloid differentiation program: global analysis of gene expression during induced differentiation in the MPRO cell line. Blood. 2002 Nov 1;100(9):3209-20.

Horak CE, Mahajan MC, Luscombe NM, Gerstein M, Weissman SM, Snyder M. GATA-1 binding sites mapped in the beta-globin locus by using mammalian chIp-chip analysis. Proc Natl Acad Sci U S A. 2002 Mar 5;99(5):2924-9.

Zheng D, Zhang Z, Harrison PM, Karro J, Carriero N, Gerstein M. Integrated pseudogene annotation for human chromosome 22: evidence for transcription. J Mol Biol. 2005 May 27;349(1):27-45.

Harrison PM, Zheng D, Zhang Z, Carriero N, Gerstein M. Transcribed processed pseudogenes in the human genome: an intermediate form of expressed retrosequence lacking protein-coding ability. Nucleic Acids Res. 2005;33(8):2374-83.

Balasubramanian S, Xia Y, Freinkman E, Gerstein M. Sequence variation in G-protein-coupled receptors: analysis of single nucleotide polymorphisms. Nucleic Acids Res. 2005;33(5):1710-21.

Zhang Z, Carriero N, Gerstein M. Comparative analysis of processed pseudogenes in the mouse and human genomes. Trends Genet. 2004 Feb;20(2):62-7.

Zhang Z, Harrison PM, Liu Y, Gerstein M. Millions of years of evolution preserved: a comprehensive catalog of the processed pseudogenes in the human genome. Genome Res. 2003 Dec;13(12):2541-58.

Harrison PM, Carriero N, Liu Y, Gerstein M. A "polyORFomic" analysis of prokaryote genomes using disabled-homology filtering reveals conserved but undiscovered short ORFs. J Mol Biol. 2003 Nov 7;333(5):885-92.

Qian J, Lin J, Luscombe NM, Yu H, Gerstein M. Prediction of regulatory networks: genome-wide identification of transcription factor targets from gene expression data. Bioinformatics. 2003 Oct 12;19(15):1917-26.

Zhang Z, Gerstein M. Identification and characterization of over 100 mitochondrial ribosomal protein pseudogenes in the human genome. Genomics. 2003 May;81(5):468-80.

Snyder M, Gerstein M. Genomics. Defining genes in the genomics era. Science. 2003 Apr 11;300(5617):258-60.

Harrison PM, Milburn D, Zhang Z, Bertone P, Gerstein M. Identification of pseudogenes in the Drosophila melanogaster genome. Nucleic Acids Res. 2003 Feb 1;31(3):1033-7.

Harrison PM, Kumar A, Lang N, Snyder M, Gerstein M. A question of size: the eukaryotic proteome and the problems in defining it. Nucleic Acids Res. 2002 Mar 1;30(5):1083-90.

Zhang Z, Harrison P, Gerstein M. Identification and analysis of over 2000 ribosomal protein pseudogenes in the human genome. Genome Res. 2002 Oct;12(10):1466-82.

Balasubramanian S, Harrison P, Hegyi H, Bertone P, Luscombe N, Echols N, McGarvey P, Zhang Z, Gerstein M. SNPs on human chromosomes 21 and 22 -- analysis in terms of protein features and pseudogenes. Pharmacogenomics. 2002 May;3(3):393-402.

Bertone P, Trifonov V, Rozowsky JS, Schubert F, Emanuelsson O, Karro J, Kao M-Y, Snyder M, Gerstein M. Design optimization methods for genomic DNA tiling arrays. Genome Res. 2006 Feb;16(2):271-81.

Sayward F, Yang J, Nelson FK, Euskirchen G, Urban A, Bertone P, Luscombe N, Echols N, McGarvey P, Zhang Z, Gerstein M. Design Issues in Implementing a Portable Sample Tracking and Analysis Research Support (STARS) System for PCR Based Microarray Research. Proceedings IEEE 40th Annual Conference on Information Sciences and Systems, Princeton, NJ 2006; 1578-1598.

Royce TE, Rozowsky JS, Bertone P, Samanta M, Stolc V, Weissman S, Snyder M, Gerstein M. Issues in the analysis of oligonucleotide tiling microarrays for transcript mapping. Trends Genet. 2005 Aug;21(8):466-75.

Smith A, Greenbaum D, Douglas SM, Long M, Gerstein M. Network security and data integrity in academia: an assessment and a proposal for large-scale archiving. Genome Biol. 2005;6(9):119.

Greenbaum D, Douglas SM, Smith A, Lim J, Fischer M, Schultz M, Gerstein M. Computer security in academia-a potential roadblock to distributed annotation of the human genome. Nat Biotechnol. 2004 Jun;22(6):771-2.

Carriero N, Osier MV, Cheung K, Miller PL, Gerstein M, Zhao H, Wu B, Rifkin S, Chang J, Zhang H, White K, Williams K, Schultz M. A high productivity/low maintenance approach to high-performance computation for biomedicine: four case studies. J Am Med Inform Assoc. 2005;12(1):90-8.

Berman P, Bertone P, Dasgupta B, Gerstein M, Kao M, Snyder M. Fast optimal genome tiling with applications to microarray design and homology search. J Comput Biol. 2004;11(4):766-85.

Qian J, Kluger Y, Yu H, Gerstein M. Identification and correction of spurious spatial correlations in microarray data. Biotechniques. 2003 Jul;35(1):42-4, 46, 48.

Kluger Y, Yu H, Qian J, Gerstein M. Relationship between gene co-expression and probe localization on microarray slides. BMC Genomics. 2003 Dec 10;4(1):49.

Luscombe NM, Royce TE, Bertone P, Echols N, Horak CE, Chang JT, Snyder M, Gerstein M. ExpressYourself: A modular platform for processing and visualizing microarray data. Nucleic Acids Res. 2003 Jul 1;31(13):3477-82.

Kluger Y, Basri R, Chang JT, Gerstein M. Spectral biclustering of microarray data: coclustering genes and conditions. Genome Res. 2003 Apr;13(4):703-16.

Cheung K, White K, Hager J, Gerstein M, Reinke V, Nelson K, Snyder M. YMD: a microarray database for large-scale gene expression analysis. Proc AMIA Symp. 2002;140-4.

Cheung KH, Hager J, Nelson K, White K, Li Y, Snyder M, Williams K, Miller P. A dynamic approach to mapping coordinates between microplates and microarrays. J Biomed Inform. 2002;35:306-12.

Cheung KH, Deshpande AM, Tosche N, Nath S, Agrawal A, Miller P, Kumar A, Snyder M. A metadata framework for interoperating heterogeneous genome data using XML. Proc. American Medical Informatics Assoc. Annual Symp 2001; 110-114.

Cheung KH, Liu Y, Kumar K, Snyder M, Gerstein M, Miller P. An XML application for genomic data interoperation. IEEE International Symposium on Bio-informatics and Biomedical Engineering (BIBE) 2001; 97-103.



Royce TE, Rozowsky JS, Luscombe NM, Emanuelsson O, Yu H, Zhu X, Snyder M, Gerstein M. Extrapolating traditional DNA microarray statistics to tiling and protein microarray technologies. Methods Enzymol. 2006;411:282-311.

Martone R, Snyder M. Mapping transcription factor binding sites using ChIP Chip - general considerations. DNA Microarrays 13. 2005; In press.

Rinn JL, Snyder M. Sexual dimorphism in mammalian gene expression. Trends Genet. 2005;21:298-305.

Bertone P, Gerstein M, Snyder M. Applications of DNA tiling arrays to experimental genome annotation and regulatory pathway discovery. Chromosome Res. 2005;13(3):259-74.

Zhang Z, Gerstein M. Large-scale analysis of pseudogenes in the human genome. Curr Opin Genet Dev. 2004 Aug;14(4):328-35.

Euskirchen G, Snyder M. A plethora of sites. Nat Genet. 2004;36:325-326.

Zhang Z, Gerstein M. Of mice and men: phylogenetic footprinting aids the discovery of regulatory elements. J Biol. 2003; 2: 11.

Lian Z, Euskirchen G, Rinn JL, Martone R, Bertone P, Hartman S, Royce TE, Nelson FK, Sayward F, Luscombe NM, Lang Y, Li J, Miller P, Urban AE, Gerstein M, Weissman S, Snyder M. Identification of novel functional elements in the human genome. Cold Spring Harb Symp Quant Biol 2003; 68:317-22.

Harrison PM, Gerstein M. Studying genomes through the aeons: protein families, pseudogenes and proteome evolution. J Mol Biol. 2002 May 17;318(5):1155-74.


Yale CEGS publications: Phase II: 2007-2008


Pan X, Urban AE, Palejev D, Schulz V, Grubert F, Hu Y, Snyder M, Weissman SM. A new procedure for highly specific, hypersensitive and unbiased whole genome amplification. Proc Natl Acad Sci U S A. (in press)

Hasin Y, Olender T, Khen M, Gonzaga-Jauregui C, Kim P, Eckehart A, Snyder M, Gerstein M, Lancet D, Korbel J. High-resolution Copy-Number Variation Map Reflects Human Olfactory Receptor Diversity and Evolution. PLoS Genet. (in press)

Lian Z, Karpikov A, Lian J, Mahajan MC, Hartman S, Gerstein M, Snyder M, Weissman SM. A genomic analysis of RNA polymerase II modification and chromatin architecture related to 3' end RNA polyadenylation. Genome Res. 2008 Aug;18(8):1224-37.

Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 2008 Jun 6;320(5881):1344-9.

Wu J, Du J, Rozowsky J, Zhang Z, Urban A, Euskirchen G, Weissman S, Gerstein M, Snyder M. Systematic analysis of transcribed loci in ENCODE regions using RACE sequencing reveals extensive transcription in the human genome. Genome Biol. 2008 Jan 3;9(1):R3.

Korbel JO, Urban AE, Affourtit JP, Godwin B, Grubert F, Simons JF, Kim PK, Palejev D, Carriero N, Du L, Taillon B, Tanzer A, Chi J, Yang F, Carter N, Hurles ME, Weissman S, Harkins T, Gerstein M, Egholm M, Snyder M. Paired-end mapping reveals extensive structural variation in the human genome. Science. 2007 Oct 19;318(5849):420-6.

Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A, Thiessen N, Griffith OL, He A, Marra M, Snyder M, Jones S. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods. 2007;4: 651-7.

Rozowsky J, Wu J, Lian Z, Nagalakshmi U, Korbel JO, Kapranov PD, Zheng D, Dyke S, Newburger P, Miller P, Gingeras T, Weissman S, Gerstein M, Snyder M. Novel transcribed regions in the human genome. Cold Spring Harb Symp Quant Biol. 2007;71: 111-116.

Euskirchen GM, Rozowsky JS, Wei C, Lee WH, Zhang ZD, Hartman S, Emanuelsson O, Stolc V, Weissman S, Gerstein M, Ruan Y, Snyder M. Mapping of transcription factor binding regions in mammalian cells by ChIP: comparison of array- and sequencing-based technologies. Genome Res. 2007 Jun;17(6):898-909.

Smith MG, Gianoulis TA, Pukatzki S, Mekalanos JJ, Ornston LN, Gerstein M, Snyder M. New insights into Acinetobacter baumannii pathogenesis revealed by high-density pyrosequencing and transposon mutagenesis. Genes Dev. 2007 Mar 1;21(5):601-14.

Dewan A, Liu M, Hartman S, Zhang SS, Liu DT, Zhao C, Tam PO, Chan WM, Lam DS, Snyder M, Barnstable C, Pang CP, Hoh J. HTRA1 promoter polymorphism in wet age-related macular degeneration. Science. 2006;314: 989-92.


Informatics tools and databases

Korbel JO, Urban AE, Grubert F, Du J, Royce TE, Starr P, Zhong G, Emanuel B, Weissman S, Snyder M, Gerstein M. Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome. Proc Natl Acad Sci U S A. 2007 Jun 12;104(24):10110-5. [TOOL] WEBSITE:

Royce TE, Rozowsky JS, Gerstein MB. Assessing the need for sequence-based normalization in tiling microarray experiments. Bioinformatics. 2007 Apr 15;23(8):988-97. [TOOL] WEBSITE:

Yu H, Nguyen K, Royce T, Qian J, Nelson K, Snyder M, Gerstein M. Positional artifacts in microarrays: experimental verification and construction of COP, an automated detection tool. Nucleic Acids Res. 2007;35(2):e8. [TOOL] WEBSITE: (COP submodule)

Zhang Z, Pang AWC, Gerstein M. Comparative analysis of genome tiling array data reveals many novel primate-specific functional RNAs in human. BMC Evol Biol. 2007;7 Suppl 1:S14.

Karro JE, Yan Y, Zheng D, Zhang Z, Carriero N, Cayting P, Harrison P, Gerstein M. Pseudogene.org: a comprehensive database and comparison platform for pseudogene annotation. Nucleic Acids Res. 2007 Jan;35(Database issue):D55-60. [TOOL] WEBSITE:

Yu H, Gerstein M. Genomic analysis of the hierarchical structure of regulatory networks. Proc Natl Acad Sci U S A. 2006 Oct 3;103(40):14724-31.

Kim PM, Korbel JO, Gerstein MB. Positive selection at the protein network periphery: evaluation in terms of structural constraints and cellular context. Proc Natl Acad Sci U S A. 2007 Dec 18;104(51):20274-9.

Zhang ZD, Weinstock G, Gerstein M. Rapid evolution by positive Darwinian selection in T-cell antigen CD4 in primates. J Mol Evol. 2008 May;66(5):446-56.

Yip KY, Patel P, Kim PM, Engelman DM, McDermott D, Gerstein M. An integrated system for studying residue coevolution in proteins. Bioinformatics. 2008 Jan 15;24(2):290-2. [TOOL] WEBSITE:

Royce TE, Rozowsky JS, Gerstein MB. Toward a universal microarray: prediction of gene expression through nearest-neighbor probe sequence identification. Nucleic Acids Res. 2007;35(15):e99.

Royce TE, Carriero NJ, Gerstein MB. An efficient pseudomedian filter for tiling microarrays. BMC Bioinformatics. 2007;8:186. [TOOL] WEBSITE:

Zheng D, Gerstein MB. The ambiguous boundary between genes and pseudogenes: the dead rise up, or do they? Trends Genet. 2007 May;23(5):219-24.

Zhang ZD, Rozowsky J, Snyder M, Chang J, Gerstein M. Modeling ChIP sequencing in silico with applications. PLoS Comput Biol. 2008;4(8):e1000158. [TOOL] WEBSITE:



Wu JQ, Snyder M. RNA polymerase II stalling: loading at the start prepares genes for a sprint. Genome Biol. 2008 May 2;9:220. PMID: 18466645

Korbel JO, Kim PM, Chen X, Urban AE, Weissman S, Snyder M, Gerstein MB. The current excitement about copy-number variation: how it relates to gene duplications and protein families. Curr Opin Struct Biol. 2008 Jun;18(3):366-74.

Gerstein M, Zheng D. The real life of pseudogenes. Sci Am. 2006 Aug;295(2):48-55.


