Scientists developed an improved method for finding genetic alterations that are difficult to identify by other methods and are key in various cancers.
St. Jude Children’s Research Hospital scientists have developed a significantly better computing tool for finding genetic alterations that play an important role in many cancers but were difficult to identify with whole-genome sequencing. The tool is an algorithm called CONSERTING, which stands for Copy Number Segmentation by Regression Tree in Next Generation Sequencing.
The algorithm was created to improve identification of copy number alterations (CNAs) in the billions of pieces of genetic information generated by next-generation, whole-genome sequencing techniques. CNAs involve the gain or loss of DNA segments. Depending on the size of the DNA segments, the alterations can affect anywhere from a few genes to many hundreds.
In a study published in the journal Nature Methods, researchers showed that the CONSERTING algorithm identified such alterations with dramatically better accuracy and sensitivity than other techniques, including four published algorithms used to recognize CNAs in whole-genome sequencing data. The study involved the normal and tumor genomes from 43 children and adults with various cancers, including brain tumors, leukemia, melanoma and the pediatric eye tumor retinoblastoma.
“CONSERTING has helped us harness the power of next-generation, whole-genome sequencing to better understand the genetic landscape of cancer genomes and lay the foundation for the next era of cancer therapy,” said corresponding author Jinghui Zhang, PhD, a member of the St. Jude Department of Computational Biology. “In this study of the tumor and normal genomes of 43 patients, CONSERTING identified copy number alterations in children with 100 times greater precision and 10 times greater precision in adults.”
According to first author Xiang Chen, PhD, a senior research scientist at St. Jude, the algorithm helped the researchers identify alterations that other algorithms missed. These include undetected chromosomal rearrangements and copy number alterations present in a small percentage of tumor cells.
The CONSERTING algorithm allowed researchers to discover genetic alterations driving pediatric leukemia, the pediatric brain tumor low-grade glioma, the adult brain tumor glioblastoma and retinoblastoma. Additionally, CONSERTING helped identify genetic changes that are present in a small percentage of a tumor’s cells. These alterations could play a key role in understanding why tumors sometimes return after treatment.
Zhang said that using CONSERTING should make it easier to track the evolution of tumors with complex genetic rearrangements, which sometimes involve multiple chromosomes that swap pieces when broken and reassembled.
St. Jude has made the CONSERTING tool available free of charge to researchers around the world. The software, user manual and related data can be downloaded from St. Jude’s website. Researchers have also developed a cloud version of CONSERTING and related tools, so scientists can upload data for analysis instead of downloading the software.
CONSERTING has now been used to analyze next-generation, whole-genome sequencing data for the Pediatric Cancer Genome Project, which includes the normal and cancer genomes of 700 pediatric cancer patients with 21 different subtypes of cancer. The algorithm combines a method of data analysis called regression tree, which is a machine learning algorithm, with next-generation, whole-genome sequencing. Machine learning capitalizes on advances in computing to design algorithms that repeatedly and rapidly analyze large, complex data sets and unearth unexpected insights.
“This combination has provided us with a powerful tool for recognizing copy number alterations, even those present in relatively few cells or in tumor samples that include normal cells along with tumor cells,” said Zhang.
Next-generation, whole-genome sequencing involves breaking down the human genome into about one billion pieces that are copied and reassembled using the normal genome as a template. The CONSERTING software compensates for gaps and variations in sequencing data. The sequencing data is integrated with information about the chromosomal rearrangements to find CNAs and identify their origins in the genome.
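For readers who want a feel for the regression-tree idea described above, the following is a minimal sketch in Python and not the CONSERTING implementation. It assumes the input is a list of per-window read-depth ratios (for example, log2 of tumor depth over normal depth), and it recursively splits that list wherever a split most reduces the squared error around each segment's mean. The function name segment_read_depth and the parameters min_size and min_gain are hypothetical and chosen for illustration only. The sketch leaves out the integration with chromosomal rearrangement data that the article describes.

```python
from typing import List, Tuple

# (start index, end index exclusive, mean depth ratio of the segment)
Segment = Tuple[int, int, float]


def segment_read_depth(values: List[float],
                       min_size: int = 3,
                       min_gain: float = 1.0) -> List[Segment]:
    """Split a 1-D depth-ratio signal into piecewise-constant segments.

    Illustrative regression-tree segmentation only; the parameter names
    and defaults are assumptions, not taken from CONSERTING.
    """
    if not values:
        return []

    # Prefix sums let us compute the squared error of any range in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for v in values:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    def sse(lo: int, hi: int) -> float:
        """Sum of squared deviations from the mean over values[lo:hi]."""
        n = hi - lo
        s = prefix[hi] - prefix[lo]
        return (prefix_sq[hi] - prefix_sq[lo]) - s * s / n

    segments: List[Segment] = []

    def split(lo: int, hi: int) -> None:
        total = sse(lo, hi)
        best_gain, best_cut = 0.0, None
        # Try every cut that leaves at least min_size windows on each side.
        for cut in range(lo + min_size, hi - min_size + 1):
            gain = total - sse(lo, cut) - sse(cut, hi)
            if gain > best_gain:
                best_gain, best_cut = gain, cut
        if best_cut is None or best_gain < min_gain:
            # No worthwhile split: emit this range as one segment (a leaf).
            mean = (prefix[hi] - prefix[lo]) / (hi - lo)
            segments.append((lo, hi, mean))
        else:
            split(lo, best_cut)
            split(best_cut, hi)

    split(0, len(values))
    return segments


# Toy signal: a gained region (ratio 1.0) flanked by unchanged regions.
signal = [0.0] * 10 + [1.0] * 10 + [0.0] * 10
print(segment_read_depth(signal))
# prints [(0, 10, 0.0), (10, 20, 1.0), (20, 30, 0.0)]
```

In this toy example the two breakpoints recovered by the tree correspond to the edges of the gained segment. Real sequencing data is far noisier, which is why thresholds such as min_gain would need tuning and why additional evidence is needed in practice.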
Source: St. Jude Children’s Research Hospital
Last updated: 5/4/15; 1:00pm EST