HPC-based genome variant calling workflow (HPC-GVCW)

Reference
Zhou, Y., Kathiresan, N., Yu, Z., Rivera, L. F., Thimma, M., Manickam, K., Chebotarov, D., Mauleon, R., Chougule, K., Wei, S., Gao, T., Green, C. D., Zuccolo, A., Ware, D., Zhang, J., McNally, K. L., & Wing, R. A. (2023). HPC-based genome variant calling workflow (HPC-GVCW). https://doi.org/10.1101/2023.06.25.546420
Abstract

A high-performance computing (HPC) genome variant calling workflow, HPC-GVCW, was designed to run the Genome Analysis Toolkit (GATK) on HPC platforms. This workflow efficiently called an average of 27.3 M, 32.6 M, 168.9 M, and 16.2 M SNPs for rice, sorghum, maize, and soybean, respectively, on the most recently released high-quality reference sequences. Analysis of a rice pan-genome reference panel revealed 2.1 M novel SNPs that have yet to be publicly released.