A bioinformatic pipeline for NGS data analysis and mutation calling in human solid tumors


1. “Genotek Ltd”, Moscow, Russia
2. Vavilov Institute of General Genetics, Moscow, Russia
3. Institute of Biology of Karelian Research Centre, Petrozavodsk, Russia
Type: OMICS-technologies
DOI: 10.18097/PBMC20176305413
PubMed Id: 29080873
Year: 2017 vol: 63 issue: 5 pages: 413-417
Abstract: We aimed to develop a pipeline for the bioinformatic analysis and interpretation of NGS data and for the detection of a wide range of somatic single-nucleotide mutations in tumor DNA. First, the NGS reads underwent quality control with Cutadapt, and low-quality 3′ nucleotides were removed. The reads were then mapped to the reference genome hg19 (GRCh37.p13) with BWA. SAMtools was used to exclude duplicates, and MuTect was used for SNV calling. The functional effect of SNVs was evaluated with an algorithm that included annotation and assessment of SNV pathogenicity by SnpEff and analysis of the COSMIC, dbNSFP, ClinVar, and OMIM databases. The effect of each SNV on protein function was estimated by SIFT and PolyPhen2. Mutation frequencies were obtained from the 1000 Genomes and ExAC projects, as well as from our own frequency databases. To evaluate the pipeline, we used 18 breast cancer tumor biopsies. The MYbaits Onconome KL v1.5 Panel (“MYcroarray”) was used for targeted enrichment. NGS was performed on the Illumina HiSeq 2500 platform. As a result, we identified alterations in the BRCA1, BRCA2, ATM, CDH1, CHEK2, and TP53 genes that affected the sequence of the encoded proteins. Our pipeline can be used for effective search and annotation of tumor SNVs. In this study we tested the pipeline for the first time on NGS data from patients from the Russian population. However, its efficiency and accuracy still need to be confirmed on larger NGS datasets and on data from several types of solid tumors.
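The order of the preprocessing and calling steps named in the abstract (Cutadapt, BWA, SAMtools, MuTect) can be illustrated with a minimal sketch that only assembles command lines and does not run them. It is not the authors' implementation. Everything not stated in the abstract is an assumption: the file names, the quality cutoff of 20, single-end reads (hence `samtools rmdup -s`), the read-group string, the `mutect.jar` path, and tumor-only calling with MuTect 1.x options.

```python
import shlex
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    argv: List[str]               # command and its arguments
    stdout: Optional[str] = None  # file that captures stdout, if any


def build_steps(fastq: str, reference: str, sample: str,
                quality_cutoff: int = 20,           # assumed cutoff
                mutect_jar: str = "mutect.jar"      # hypothetical path
                ) -> List[Step]:
    """Assemble the trim -> map -> deduplicate -> call sequence."""
    trimmed = f"{sample}.trimmed.fastq"
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    # Read group for downstream tools; ID and SM values are placeholders.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}"
    return [
        # Trim low-quality 3' ends.
        Step(["cutadapt", "-q", str(quality_cutoff), "-o", trimmed, fastq]),
        # Map to the reference (hg19 in the abstract); SAM goes to stdout.
        Step(["bwa", "mem", "-R", read_group, reference, trimmed], stdout=sam),
        # Coordinate-sort, then remove duplicates (single-end assumed).
        Step(["samtools", "sort", "-o", sorted_bam, sam]),
        Step(["samtools", "rmdup", "-s", sorted_bam, dedup_bam]),
        Step(["samtools", "index", dedup_bam]),
        # SNV calling; MuTect 1.x style options, tumor-only (assumed).
        Step(["java", "-jar", mutect_jar, "--analysis_type", "MuTect",
              "--reference_sequence", reference,
              "--input_file:tumor", dedup_bam,
              "--out", f"{sample}.call_stats.txt"]),
    ]


def render(step: Step) -> str:
    """Format a step as a shell command line."""
    line = " ".join(shlex.quote(arg) for arg in step.argv)
    if step.stdout is not None:
        line += " > " + shlex.quote(step.stdout)
    return line


if __name__ == "__main__":
    for step in build_steps("tumor.fastq", "hg19.fa", "sample1"):
        print(render(step))
```

Keeping the steps as data rather than executing them directly makes the order easy to inspect or log before anything is run. Annotation with SnpEff and the database lookups would follow the calling step and are not sketched here.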
Reference: Tsukanov K.Yu., Krasnenko A.Yu., Plakhina D.A., Korostin D.O., Churov A.V., Druzhilovskaya O.S., Rebrikov D.V., Ilinsky V.V., A bioinformatic pipeline for NGS data analysis and mutation calling in human solid tumors, Biomeditsinskaya khimiya, 2017, vol: 63(5), 413-417.