CPTAC and the Impact of Proteogenomics on Cancer Research
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) is an effort spearheaded by the National Cancer Institute (NCI) to advance cancer research with proteogenomic data. CPTAC contributes genomic and proteomic data from more than 1,500 patients across many cancer types, available through the NCI’s Genomic Data Commons and Proteomic Data Commons.
What is proteogenomics, and what is its role in cancer research?
Proteogenomics integrates proteomic, genomic, and transcriptomic data to build a more comprehensive understanding of how different biological processes and components (DNA, RNA, and proteins) interrelate, particularly in the context of producing a certain disease phenotype. In cancer research, analyzing each of these data types alone has shown limited impact on understanding the molecular basis of cancer and developing treatments. Gene expression is not a simple unidirectional process, and genetic mutations don’t map directly to tumor behaviors. In a 2022 Nature review, Dr. D.R. Mani et al. wrote, “...genomics and epigenomics provide the cellular blueprint for what may happen, [while] proteomics provides a determination of what has happened.” Proteogenomic strategies allow researchers to more effectively distinguish “driver” mutations, or those that contribute to tumor malignancy, from “passenger” mutations that don’t ultimately contribute to the cancer phenotype.
Furthermore, cancer is not just one disease – dysregulated cell proliferation has a variety of causes and consequences, resulting in a complex web of distinct cancer types and subtypes with widely varying prognoses. Being able to differentiate these types with increasing granularity is critical for effectively intervening in these unique disease processes and developing precision medicines. With proteogenomic approaches, researchers can home in on the proteomic, transcriptional, and genomic signatures of different tumors, enabling more targeted drug development and more accurate patient segmentation.
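To make the idea of integrating data layers concrete, here is a minimal sketch of one common proteogenomic sanity check: asking, gene by gene, how well mRNA abundance tracks protein abundance across the same set of tumors. This is an illustration only, not a method taken from any of the studies discussed here. The data layout (nested dictionaries keyed by gene and sample), the function names, and the toy values are all assumptions made for the example; real CPTAC data would need to be downloaded, normalized, and matched by sample first.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if either is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)


def mrna_protein_correlation(mrna, protein, min_samples=3):
    """Correlate mRNA and protein abundance per gene over shared samples.

    mrna, protein: dict mapping gene -> {sample_id: abundance}
    (a hypothetical layout chosen for this sketch).
    Genes missing from either layer, or with too few shared samples, are skipped.
    """
    result = {}
    for gene in sorted(mrna.keys() & protein.keys()):
        shared = sorted(mrna[gene].keys() & protein[gene].keys())
        if len(shared) < min_samples:
            continue
        r = pearson(
            [mrna[gene][s] for s in shared],
            [protein[gene][s] for s in shared],
        )
        if r is not None:
            result[gene] = r
    return result


# Toy values, invented purely for illustration.
toy_mrna = {
    "GENE_A": {"S1": 1.0, "S2": 2.0, "S3": 3.0, "S4": 4.0},
    "GENE_B": {"S1": 1.0, "S2": 2.0, "S3": 3.0, "S4": 4.0},
    "GENE_C": {"S1": 5.0, "S2": 6.0, "S3": 7.0, "S4": 8.0},  # no protein data
}
toy_protein = {
    "GENE_A": {"S1": 2.0, "S2": 4.0, "S3": 6.0, "S4": 8.0},  # tracks mRNA
    "GENE_B": {"S1": 4.0, "S2": 3.0, "S3": 2.0, "S4": 1.0},  # opposes mRNA
}

correlations = mrna_protein_correlation(toy_mrna, toy_protein)
for gene, r in correlations.items():
    print(f"{gene}: r = {r:.2f}")
```

Genes whose protein levels diverge from their mRNA levels are the cases where a transcript-only analysis would be misleading, which is the motivation for adding the proteomic layer in the first place.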
How are CPTAC and proteogenomics improving our understanding of cancer and how to treat it?
Uncovering signaling pathways and molecular characteristics
Even if they occur in the same tissue type, tumors can be highly heterogeneous. There are numerous changes in signaling pathways, gene expression profiles, protein features, and so on that could result in tumor malignancy. Pinpointing molecular subtypes of tumors is crucial for understanding how to halt or reverse their growth.
Breast cancer is one example of a cancer that exhibits a wide range of potential molecular characteristics and avenues for druggability. In a 2020 study, Dr. Karsten Krug et al. leveraged CPTAC and other datasets to examine the proteogenomic landscape of breast cancer tumors and home in on clinically relevant features. Proteogenomic analyses revealed modifications to the ERBB2 receptor and RB pathways, which are both ongoing targets for therapeutics. The authors also uncovered novel phosphorylation and acetylation patterns on key proteins involved in tumor suppression and DNA damage response. These insights provide a more nuanced understanding of the mechanisms involved in different breast cancer cases, allowing for further segmentation and more precise treatment options.
Overcoming treatment resistance
While resistance to chemotherapy, known as refractoriness, is a well-known phenomenon that has been studied for decades, predicting tumor refractoriness prior to treatment is extremely difficult, if not impossible, for most cancer patients. Proteogenomic tools are revealing insights that can improve the accuracy of diagnoses and offer better treatment routes for patients who may otherwise suffer through multiple rounds of ineffective therapies.
In a 2023 study, Dr. Shrabanti Chowdhury et al. used proteogenomic analyses to characterize chemotherapy-resistant tumors in patients with high-grade serous ovarian cancer (HGSOC). Using data from CPTAC and other sources, Chowdhury et al. identified a 64-protein signature that predicts refractoriness in a subset of HGSOC patients, which they subsequently validated in two independent cohorts. The authors also identified five molecular subtypes of HGSOC. These data can lead to the development of more effective diagnostic tools and treatment options for patients with historically poor prognoses.
Improving diagnostics
Proteogenomics can reveal a wealth of information in a research setting, but to be viable as a clinical tool, these analyses need to be effectively performed on small, biopsy-sized patient samples. Researchers are developing microscaled proteogenomics techniques to help bring the promise of precision oncology to the clinic. In a study focusing on triple-negative breast cancer, a diagnosis associated with high mortality and resistance to chemotherapy, Dr. Meenakshi Anurag et al. successfully leveraged microscaled proteogenomics to uncover new biomarkers of chemotherapy response in this tumor type. The results from these analyses showed several metabolic pathways, as well as a deletion impacting LIG1, POLD1, and XRCC1 gene expression, associated with treatment resistance. Bringing this level of nuanced analysis to biopsy samples can equip clinicians and patients with more informative diagnoses and, ultimately, more effective treatment plans.
Revealing pan-cancer insights
Delving into tumor types and subtypes on a more granular level is important for providing meaningful diagnoses and tailoring treatments to specific patients. However, analyzing proteogenomic data across cancer types can provide a critical understanding of patterns common to tumors originating in different tissues.
In a 2022 study, Dr. Yiqun Zhang et al. performed proteogenomic characterization of just over 2,000 tumors across 14 cancer types in order to discern pan-cancer molecular subtypes and their associated signaling pathways. They found 11 subtypes spanning tumor lineages from different tissues, each associated with enrichment of different pathways. Notably, many cancers showed increased activity of the MYC pathway without having mutations in the MYC gene itself, but rather in other genes with non-canonical roles. Carefully looking at the interplay between genome, transcriptome, and proteome enabled the authors to discover unexpected relationships between genetic mutations and oncogenic pathway enrichment.
Beyond enabling better molecular characterization of the tumor itself, proteogenomics also offers an arsenal of tools for describing the tumor’s immune microenvironment (TME). Understanding how the TME influences cancer malignancy is crucial to developing targeted immunotherapies. In a 2023 study of pan-cancer CPTAC data, Dr. Francesca Petralia et al. investigated the TME of >1000 tumors, representing 10 different types of cancer. The authors integrated cell type and molecular pathway data to sort tumors into seven distinct immune subtypes, going on to characterize them by their unique genomic, transcriptomic, proteomic, and epigenetic features. These data – along with deeper analyses revealing distinct kinase activities in each subtype – present an opportunity for tailoring therapies to the tumor’s immune environment.
How will proteogenomic insights translate into medical advances?
The studies highlighted above are only a few examples of the groundbreaking proteogenomics work occurring in cancer research and immunology. By combining rich resources like CPTAC with rigorous multi-omics analytical tools, researchers are consistently revealing previously unknown relationships between genes, proteins, immune response, and tumor behavior. Many of these results are already being explored for diagnostic and therapeutic applications. Subtyping cancers by key characteristics like immune microenvironment, enrichment for certain pathways, or specific protein signatures is helping clinicians identify which patients will benefit from which treatments. These novel subdivisions also enable drug developers to focus on interrupting or reversing specific oncogenic mechanisms and immune responses. By hastening progress in both the diagnostic and therapeutic arenas, proteogenomics is helping us inch closer to a future where precision oncology is a reality rather than a promising concept.
How can I use CPTAC in my own research?
While CPTAC is an incredible resource for proteogenomic data, effectively utilizing it to derive meaningful insights is no small task. Watershed can help your team gain access to CPTAC’s protected datasets, efficiently develop and execute powerful workflows tailored to your research goals, integrate complex analyses with top-tier biological and computational expertise, and more. Get started by reaching out to our team of bioinformaticians and engineers at contact@watershed.bio.