Benjamin Harvey

Departmental Affiliations

Experiences & Accomplishments
Bowie State University
Bowie State University
Mississippi Valley State University
I graduated from Mississippi Valley State University (MVSU) in 2008 with a B.S. in Pre-Medicine & Computer Science. I received a Master of Science in Computer Science from Bowie State University (BSU) in 2011 and a Doctor of Science in Computer Science from BSU in 2015. I’m currently an assistant scientist in the Biostatistics Department at Johns Hopkins University (JHU), supporting Dr. Chatterjee as a data scientist in the Statistical Genetics Lab as well as the JHU Data Science Lab. I previously served as a Data Scientist and Solutions Architect at Databricks, working with federal customers and developing genomics pipelines for Public Sector and Health & Life Sciences (HLS) organizations. I've worked as a Research Professor in the Computer Science department at BSU and at George Washington University (GWU), in the joint Data Analytics graduate program of the Department of Engineering Management and Systems Engineering (EMSE) and the Department of Computer Science, where I taught Data Science and Big Data Analytics courses. I joined the National Security Agency (NSA) in 2009 and worked there for nearly a decade; my final position was Chief of Operations Data Science. I was hired into the Cryptologic Computer Science Development Program (CDP), graduated from the CDP in 2012, and was the first African American to be accepted into and to finish the program. I was a research fellow at the Harvard-Massachusetts Institute of Technology (MIT) Division of Health Sciences and Technology (HST) in the Bioinformatics and Integrative Genomics (BIG) program in 2008. I was a Post-Graduate Research Fellow with i2b2, National Center for Biomedical Computing, Brigham and Women’s Hospital, and the Children’s Hospital Boston Informatics Program (CHIP), in conjunction with Harvard Medical School.
Honors & Awards
2021 Board of Advisors, Bowie State University, Department of Computer Science
2021 Board of Advisors, Virginia Tech, Department of Electrical and Computer Engineering
2020 NSF National I-Corps, Principal Investigator, Securing the ML Lifecycle
2019 NSF National I-Corps, Principal Investigator, AI Augmentation and Integration
2017 Intelligence Community and Counter-Intelligence Security Professional Award
2016 Office of the Director of National Intelligence (ODNI) Award for Human Capital
2015 Bowie State University Dissertation of the Year Award
2015 Bowie State University Computer Science Chair’s Award Dissertation of the Year
2011 NSA/CSS Cryptologic Computer Science Development Program
2010 National Institutes of Health Research Fellowship, Clinical Center
2009 Brigham and Women’s Hospital, Harvard Medical School (i2b2)
2009 National Institutes of Health Clinical Center Research Fellow
2009 Harvard Medical School-Brigham and Women’s Hospital Post-Baccalaureate
2008 Harvard-MIT Health Sciences and Technology (HST) Internship Program
2008 National Association of Mathematicians Summer Research Grant, ECSU
2007 Ronald McNair Post-Baccalaureate Program Grant, University of Tennessee
Select Publications
Dr. Benjamin Harvey is currently an Assistant Scientist at the Johns Hopkins Bloomberg School of Public Health in Baltimore, Maryland. He has a B.S. in Computer Science from Mississippi Valley State University and an M.S. and D.Sc. in Computer Science from Bowie State University. As a data scientist, Dr. Harvey specializes in assisting researchers and universities in preparing, processing, and analyzing genomic data by implementing scalable systems, including Apache Spark, and tools like Hail, GATK4, ADAM, SparkSeq, and VariantSpark. Dr. Harvey has used these tools to provide high-level APIs that simplify implementing algorithms for analyzing large genomic datasets, including GATK pipelines hosted in the cloud on AWS and Azure at scale. His work has enabled researchers and universities to process data 15x faster with workflows optimized to run in parallel, and to launch and scale pipelines easily with a few clicks. Dr. Harvey has also developed capabilities that enable researchers to interactively explore and classify data with prepackaged genomic analytics (e.g., Joint Variant Calling, GWAS, eQTL) and machine learning. He has developed capabilities to analyze hundreds of thousands of genomes while minimizing costs with autoscaling on AWS and Azure. This includes seamlessly connecting processed genomic data with downstream analytics for faster results. He has also developed solutions that enable researchers, computational biologists, and bioinformaticians to iterate in real time with collaborative workspaces in Databricks. This includes exploring data efficiently with familiar languages (SQL, R, Python, Java, and Scala) and standardizing genomic workflows across teams to improve reproducibility.
  • Jin, J., Agarwala, N., Kundu, P., Harvey, B., Zhang, Y., Wallace, E., & Chatterjee, N. (2020). Individual and community-level risk for COVID-19 mortality in the United States. Nature Medicine, 1-6.
  • B. Harvey and S. Y. Ji, "Cloud-Scale Genomic Signals Processing for Robust Large-Scale Cancer Genomic Microarray Data Analysis," in IEEE Journal of Biomedical and Health Informatics, vol. PP, no. 99, pp. 1-1, November 2015.
  • Harvey, B., Ji, S., "Cloud-Scale Genomic Signal Processing Classification Analysis for Gene Expression Microarray Data," 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 7152-7155, 26-30 August 2014.
  • Kato Mivule, Benjamin Harvey, Crystal Cobb, and Hoda El Sayed, "A Review of CUDA, MapReduce, and Pthreads Parallel Computing Models," IJISET - International Journal of Innovative Science, Engineering & Technology, Vol. 1, Issue 8, October 2014, pp. 208-217.