Bioinformatics @ NYGC

Our team of bioinformaticians develops, maintains, and improves our analysis pipelines by leveraging the large amounts of sequencing data we produce. We estimate the sources of error and variability in the data and define methods to correct them, both computationally and in the lab. We also continually evaluate and benchmark available tools, refine best practices for analyzing and combining results, and develop novel tools and methods.

We also support our CLEP lab by providing expertise in the clinical interpretation of constitutional and cancer genomics.


Our diverse team of bioinformatics scientists has expertise in:

  • Statistical and population genetics
  • Cancer genomics
  • Expression analysis
  • Epigenomics and functional genomics
  • de novo genome assembly
  • Metagenomics
  • Clinical interpretation


A typical project is initiated with one of the sequencing project managers. Our bioinformatics scientists are consulted to further refine the experimental design, analytic plan, and project deliverables.

The bioinformatics team performs standard and project-specific quality control and analysis of sequencing data (e.g., differential expression and functional enrichment for RNA-Seq, variant annotation and interpretation for genome and exome sequencing, and somatic variant calling, covering both SNVs and structural variants, for cancer). Results are delivered via our web interface or APIs and are stored and accessible for a period of time as part of NYGC’s Integrated Genomics.

Clinical Interpretation

As exome and genome sequencing data are processed and genomic variation between the sample and a reference is defined, annotated, and compared to existing databases, our bioinformatics scientists contribute to the last step of the analysis: clinical interpretation.

This usually requires ranking and filtering of putative candidates, manual curation, and functional validation (when possible) of our findings. NYGC’s analysis alleviates the need for the investigator to perform the standard computationally intensive analysis steps, thus freeing up time to focus on the biology.

Toby Bloom

Deputy Scientific Director, Informatics

Michael Zody

Research Director, Computational Biology

Dayna Oschwald

Senior Director, Informatics Program Management

Nicolas Robine

Assistant Director, Computational Biology

Avinash Abhyankar

Manager, Clinical Informatics

Kazimierz Wrzeszczynski

Bioinformatics Scientist

Giuseppe Narzisi

Senior Bioinformatics Scientist

Will Liao

Senior Bioinformatics Scientist

Phaedra Agius

Senior Bioinformatics Scientist

Andre Corvelo

Senior Bioinformatics Scientist

Kanika Arora

Senior Bioinformatics Analyst

Minita Shah

Senior Bioinformatics Analyst

Caitlin McHugh

Senior Bioinformatics Analyst, Statistical Genetics

Wayne Clarke

Senior Bioinformatics Analyst

Sadia Rahman

Biocurator, Molecular Diagnostics

Rajeeva Musunuri

Bioinformatics Programmer

Alice Fang


Molly Johnson

Bioinformatics Analyst

David Lin

Bioinformatics Analyst

Jennifer Shelton

Bioinformatics Programmer

Amrita Kar

Bioinformatics Analyst, Metagenomics

Heather Geiger

Bioinformatics Analyst

Genomic Patterns of De Novo Mutation in Simplex Autism

To further our understanding of the genetic etiology of autism, we generated and analyzed genome sequence data from 516 idiopathic autism families (2,064 individuals). This resource includes >59 million single-nucleotide variants (SNVs) and 9,212 private copy number variants (CNVs), of...

Authors:  Michael Zody  

Detection of long repeat expansions from PCR-free whole-genome sequence data

Identifying large expansions of short tandem repeats (STRs) such as those that cause amyotrophic lateral sclerosis (ALS) and fragile X syndrome is challenging for short-read whole-genome sequencing (WGS) data. A solution to this problem is an important step towards integrating...

Authors:  Giuseppe Narzisi  

Pancreatic intraductal tubulopapillary neoplasm is genetically distinct from intraductal papillary mucinous neoplasm and ductal adenocarcinoma

Intraductal tubulopapillary neoplasm is a relatively recently described member of the pancreatic intraductal neoplasm family. The more common member of this family, intraductal papillary mucinous neoplasm, often carries genetic alterations typical of pancreatic infiltrating ductal adenocarcinoma (KRAS, TP53, and CDKN2A)...

Authors:  Kazimierz Wrzeszczynski  

Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma

Objective: To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each.

Methods: Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was...

Authors:  Michael Zody   Nicolas Robine   Kazimierz Wrzeszczynski   Kanika Arora   Minita Shah  

The methyltransferase SETDB1 regulates a large neuron-specific topological chromatin domain

We report locus-specific disintegration of megabase-scale chromosomal conformations in brain after neuronal ablation of Setdb1 (also known as Kmt1e; encodes a histone H3 lysine 9 methyltransferase), including a large topologically associated 1.2-Mb domain conserved in humans and mice that encompasses...

Authors:  Will Liao  

Indel variant analysis of short-read sequencing data with Scalpel

As the second most common type of variation in the human genome, insertions and deletions (indels) have been linked to many diseases, but the discovery of indels of more than a few bases in size from short-read sequencing data remains...

Authors:  Michael Zody   Giuseppe Narzisi   Kanika Arora