
Nicolas Robine Reflects on the NYGC Collaborative Initiative Polyethnic-1000

World Cancer Day is observed annually on February 4 to raise awareness and promote education about cancer, its prevention, detection, and treatment. It has been organized by the Union for International Cancer Control since its conception in 2000 at the World Summit Against Cancer for the New Millennium in Paris. According to the World Health Organization, cancer is the second leading cause of death globally, resulting in approximately 10 million deaths every year. In the U.S. alone, more than 600,000 cancer deaths are projected for 2021 (source: National Cancer Institute).

Nevertheless, a lot of progress has been made in our understanding of cancer, its detection, and prevention through innovative research in science and medicine. The more we invest and learn about various cancers, the more progress we will make in our battle to defeat cancer. Likewise, it is imperative that the growing body of scientific knowledge is shared with the general public to increase awareness about cancer risks, prevention, diagnosis, and treatment and to eliminate social stigma around the disease. Observing World Cancer Day is one way for the general public to become more informed and educated about cancer.


On this front, the New York Genome Center (NYGC) is playing an important role in improving our understanding of cancer through genomic research. One pivotal example of NYGC’s impact, both scientific and social, is an ongoing project called the Polyethnic-1000. The vision of Polyethnic-1000 is to deepen our understanding of how ethnicity contributes to the incidence and behavior of cancers, thereby improving outcomes for many patients, especially those who currently lack access to the most recent advances in medical science. This is an important step towards making cancer research more equitable and universal, since most of the advances in our understanding of cancer have been centered on patients of European ancestry. The lack of studies on patients of non-European descent has limited our knowledge about the multiethnic complexities of cancer and has hindered diagnosis, prevention, and treatment of such patients. Polyethnic-1000 aims to address this knowledge gap in hopes of reducing ethnic health disparities in the future.

We observe this year’s World Cancer Day by discussing the NYGC’s efforts in fighting cancer with Nicolas Robine, PhD, Director of Computational Biology, who is co-Principal Investigator of the Polyethnic-1000 project. [To learn more about all areas of NYGC cancer research, visit our Cancer Research Areas page.]

How and why was the Polyethnic-1000 project conceived?
The New York Genome Center is engaged in multiple collaborations to apply genomic technologies to cancer research. In the last two decades, the scientific community has characterized tumors at the genetic and molecular levels and discovered recurrent mutations, driver events and subtypes of common cancers, establishing the “mutational landscape” of these cancer types and defining potential “vulnerabilities” in certain tumors. These findings are shared with the clinical and research community in large databases. We are now in the era of “personalized medicine,” which is the realization that all tumors are genetically unique and that profiling tumors and comparing them with existing profiles in public databases can lead to improved therapeutic options. Many of the cancer centers affiliated with NYGC are applying this concept, and we often support them with sequencing, analysis, research, etc.

However, a major issue is the lack of representative racial and ethnic diversity in the public databases. Indeed, most tumors that have been profiled so far are from patients of European ancestry, mostly as a result of disparities in the U.S. healthcare system. As an example, The Cancer Genome Atlas, which was launched 15 years ago and contains 10,000 tumors from 33 cancer types, is around 80% self-declared “Non-Hispanic White.” Most large public genomics databases are affected by the same bias. As a consequence, using the databases for “personalized medicine” efforts risks being less fruitful for populations that are already underserved in the current healthcare system.

Thanks to the incredible ethnic diversity in New York City, the Genome Center Cancer Group (GCCG) at NYGC, led by Harold Varmus, MD (NYGC and Weill Cornell Medicine) and Charles Sawyers, MD (Memorial Sloan Kettering Cancer Center), decided to tackle this issue by enrolling patients from diverse origins, collecting and sequencing tumor samples, and forming a consortium for the analyses. The first aim is to diversify the public databases used in cancer genomics (in accordance with standards respecting privacy and confidentiality of the patients, but allowing researchers to access the data). The second aim is to test hypotheses around the existence of molecular differences explaining some of the disparities between populations regarding cancer risk, disease evolution, response to treatments, and mortality. For many cancer types, these disparities are well-documented by epidemiological data and the causes are often due to differences in the environment (including access to healthcare), but some findings at the molecular level could have profound implications, in terms of risk assessment, treatment options, or even drug discovery. The third aim is to translate these findings into clinical practice and have more patients benefiting from personalized medicine approaches.

How will this project impact patients from underrepresented communities?
Many people hear about genomics and “personalized medicine” and about these technologies being offered to patients at elite medical centers. Though the field is not yet at the stage of being able to offer clinical sequencing to all cancer patients in New York and elsewhere, the first step towards achieving health equity in cancer care is to broaden the Cancer Atlas to include a much more diverse set of patients and to make sure that all the variability is represented in these databases. This is critical in order to progress toward a state in which all patients regardless of their ancestry can have their tumor profiled and their treatment informed by an appropriate comparison of their tumor to other profiles.

We also intend to engage with patient communities directly. Many communities are struggling to understand why they suffer disproportionately from certain cancer types and we plan to engage with these communities, in the form of informational sessions and joint research projects.

What are some of the challenges you currently face (and expect to face) during this project?
The first challenge is administrative. It is not trivial to coordinate a multi-institutional network. While we are benefiting greatly from the collaborative experience of NYGC, there are often significant administrative hurdles in negotiating signed agreements related to sample sharing and data access.

The second challenge is communication. While our colleagues in the scientific community are excited about the prospect of exploring new data and improving our collective understanding of cancer, we need to adequately engage with the communities, listen, understand, and address their potential fear or reluctance to participate and include them in the design of our project. This takes time and resources and we need to strike the right balance among all aspects of the project, under a very ambitious but constrained budget.

The third challenge is technical: to obtain 1,000 tumor samples (with their matched “normal samples”), extract DNA and RNA, sequence them, process the data, and analyze the results. This represents close to a petabyte of storage and approximately 50 years of computation (but much less using parallelized computing!). I am very confident that this challenge will be the easiest one, thanks to the excellent team at NYGC.

What can be done to disseminate the scientific knowledge we gain from Polyethnic-1000 to the general public?
The first way is peer-reviewed publication and deposition of the data in access-controlled databases, for the benefit of the scientific community. This ensures that the data and the results we obtained are solid and validated. Obviously, we also need to engage a larger non-scientific audience, for which we will work with journalists in various media to propagate our message. Finally, I am looking forward to having a direct and deep engagement with the patient communities to present and explain findings, and how our project could benefit them and their community.

In your opinion, what should be done to encourage other projects (related to cancer or other diseases) to become as inclusive as the Polyethnic-1000 project?
The biases in databases and sources of knowledge are progressively being recognized by the scientific community, and there are several groups forming projects similar to P-1000 for different diseases. However, I think the most consequential transformation would be the diversification of the scientific workforce. If we had scientific leaders truly representing the diversity of the general population in all aspects (gender, ethnicity, socio-economic background, etc.), science would greatly benefit and its progress would be translated to a much larger share of the population, possibly contributing to reducing disparities, instead of increasing them.
