A bioinformatic pipeline for NGS data analysis and mutation calling in human solid tumors

Tsukanov K.Yu.1, Krasnenko A.Yu.1, Plakhina D.A.1, Korostin D.O.2, Churov A.V.3, Druzhilovskaya O.S.2, Rebrikov D.V.2, Ilinsky V.V.1

1. “Genotek Ltd”, Moscow, Russia
2. Vavilov Institute of General Genetics, Moscow, Russia
3. Institute of Biology of Karelian Research Centre, Petrozavodsk, Russia
Section: OMICS-Technologies
DOI: 10.18097/PBMC20176305413      PubMed Id: 29080873
Year: 2017  Volume: 63  Issue: 5  Pages: 413-417
We aimed to develop a pipeline for the bioinformatic analysis and interpretation of NGS data and for the detection of a wide range of single-nucleotide somatic mutations in tumor DNA. First, the NGS reads underwent a quality control check with Cutadapt, and low-quality 3′-nucleotides were removed. The reads were then mapped to the reference genome hg19 (GRCh37.p13) with BWA, duplicates were excluded with SAMtools, and SNVs were called with MuTect. The functional effect of SNVs was evaluated using an algorithm that included annotation and pathogenicity assessment with SnpEff and analysis of databases such as COSMIC, dbNSFP, ClinVar, and OMIM. The effect of each SNV on protein function was estimated with SIFT and PolyPhen2. Mutation frequencies were obtained from the 1000 Genomes and ExAC projects, as well as from our own frequency databases. To evaluate the pipeline, we used 18 breast cancer tumor biopsies. The MYbaits Onconome KL v1.5 Panel (“MYcroarray”) was used for targeted enrichment, and NGS was performed on the Illumina HiSeq 2500 platform. As a result, we identified alterations in the BRCA1, BRCA2, ATM, CDH1, CHEK2, and TP53 genes that affected the sequence of the encoded proteins. Our pipeline can be used for effective search and annotation of tumor SNVs. In this study, we tested the pipeline for the first time on NGS data from patients of the Russian population. However, its efficiency and accuracy require further confirmation on larger NGS datasets and on data from several types of solid tumors.
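The annotation and interpretation stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The record layout (`AnnotatedSNV`), the function names, the set of protein-altering effect labels, and all three cutoffs are assumptions made for illustration; the abstract does not state thresholds or decision rules. The demo records are toy values with placeholder gene names, not results from the study.

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative cutoffs only: the abstract does not report any thresholds.
MAX_POPULATION_AF = 0.01      # assumed: variants above 1% are treated as common
SIFT_DELETERIOUS_MAX = 0.05   # conventional SIFT cutoff (lower = more damaging)
POLYPHEN_DAMAGING_MIN = 0.85  # assumed PolyPhen2 cutoff (higher = more damaging)

# Assumed subset of Sequence Ontology effect terms that change the protein.
PROTEIN_ALTERING = {"missense_variant", "stop_gained", "start_lost", "stop_lost"}


@dataclass
class AnnotatedSNV:
    """Hypothetical record holding the annotations named in the abstract."""
    gene: str
    effect: str                  # effect label from the annotation step
    af_1000g: Optional[float]    # 1000 Genomes allele frequency, if known
    af_exac: Optional[float]     # ExAC allele frequency, if known
    sift: Optional[float]
    polyphen2: Optional[float]
    in_cosmic: bool = False


def is_rare(v: AnnotatedSNV) -> bool:
    """True when every available population frequency is at or below the cutoff."""
    freqs = [f for f in (v.af_1000g, v.af_exac) if f is not None]
    return all(f <= MAX_POPULATION_AF for f in freqs)


def predicted_damaging(v: AnnotatedSNV) -> bool:
    """True when at least one predictor flags the variant as damaging."""
    sift_hit = v.sift is not None and v.sift <= SIFT_DELETERIOUS_MAX
    polyphen_hit = v.polyphen2 is not None and v.polyphen2 >= POLYPHEN_DAMAGING_MIN
    return sift_hit or polyphen_hit


def prioritize(variants: List[AnnotatedSNV]) -> List[AnnotatedSNV]:
    """Keep rare, protein-altering SNVs that are in COSMIC or predicted damaging."""
    return [
        v for v in variants
        if v.effect in PROTEIN_ALTERING
        and is_rare(v)
        and (v.in_cosmic or predicted_damaging(v))
    ]


# Toy records with placeholder gene names (not data from the study).
demo = [
    AnnotatedSNV("GENE_A", "missense_variant", None, 0.0001, 0.01, 0.97),
    AnnotatedSNV("GENE_B", "missense_variant", 0.12, 0.10, 0.02, 0.90),
    AnnotatedSNV("GENE_C", "synonymous_variant", None, None, None, None),
    AnnotatedSNV("GENE_D", "stop_gained", None, None, None, None, in_cosmic=True),
]
selected = prioritize(demo)
print([v.gene for v in selected])
```

In this sketch a variant with no recorded population frequency counts as rare, which suits a somatic-mutation search but is a design choice rather than something the abstract specifies.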
Keywords: mutation, bioinformatic pipeline, high-throughput sequencing, bioinformatic analyses of NGS data

Tsukanov, K. Yu., Krasnenko, A. Yu., Plakhina, D. A., Korostin, D. O., Churov, A. V., Druzhilovskaya, O. S., Rebrikov, D. V., Ilinsky, V. V. (2017). A bioinformatic pipeline for NGS data analysis and mutation calling in human solid tumors. Biomeditsinskaya Khimiya, 63(5), 413-417.