Computational Scientist I - Genome Wide Assoc. Studies (GWAS)
Job ID: req1355
Employee Type: exempt full-time
Facility: Rockville: 9615 MedCtrDr
Location: 9615 Medical Center Drive, Rockville, MD 20850 USA
The Frederick National Laboratory is a Federally Funded Research and Development Center (FFRDC) sponsored by the National Cancer Institute (NCI) and operated by Leidos Biomedical Research, Inc. The lab addresses some of the most urgent and intractable problems in the biomedical sciences in cancer and AIDS, drug development and first-in-human clinical trials, applications of nanotechnology in medicine, and rapid response to emerging threats of infectious diseases.
Our core values of accountability, compassion, collaboration, dedication, integrity, and versatility serve as a guidepost for how we do our work every day in serving the public's interest.
Position Overview:
PROGRAM DESCRIPTION
Join our talented team of bioinformaticians dedicated to understanding the genetics of cancer. We are seeking an enthusiastic, creative, and collaborative bioinformatics scientist to support our broad portfolio of genome-wide association studies (GWAS).
The Cancer Genomics Research Laboratory (CGR) investigates the contribution of germline and somatic genetic variation to cancer susceptibility and outcomes in support of the NCI's Division of Cancer Epidemiology and Genetics (DCEG), the world's most comprehensive cancer epidemiology research group. CGR is located at the NCI-Shady Grove campus in Rockville, MD and operated by Leidos Biomedical Research, Inc. We care deeply about discovering the genetic and environmental determinants of cancer, and new approaches to cancer prevention, through our contributions to the molecular, genetic, and epidemiologic research of the 70+ investigators in DCEG. Our bioinformaticians have both the passion to learn and the opportunity to apply their skills to our rich and varied genotyping and sequencing datasets, generated in support of DCEG's multidisciplinary family- and population-based studies. Working in concert with the epidemiologists, biostatisticians, and basic research scientists in DCEG's intramural research program, CGR conducts genome-wide discovery studies and targeted regional approaches to identify the heritable determinants of various forms of cancer.
KEY ROLES/RESPONSIBILITIES
- Perform large-scale genotyping data QC, phasing and imputation, population structure testing, association studies, meta-analysis, and fine mapping
- Contribute to building, benchmarking, and maintaining bioinformatics pipelines to facilitate high-throughput genomic data analysis in HPC and cloud environments
- Harmonize and maintain diverse datasets and associated metadata, including performing meta-analyses of data run on multiple platforms and/or externally generated data
- Thoughtfully synthesize results into clear presentations (including QQ-plots, Manhattan plots) and concise summaries of work to support recommendations for next steps
- Perform advanced research including multiplicative interaction studies, pathway-based studies, and integrative analyses from multiple platforms and various data types
- Collaborate closely with DCEG PIs on scientific manuscript development, submission, and revision activities with significant co-authorship and potentially first authorship opportunities
- Possession of a doctoral degree from an accredited college/university in bioinformatics, statistics, genetics, computational biology, or a related field. Foreign degrees must be evaluated for U.S. equivalency.
- No experience required beyond a doctoral degree.
- In-depth knowledge of genome-wide association studies and interpretation, and applied computational research on large multivariate datasets
- Expertise in algorithmic implementation, statistical programming, and data manipulation using, e.g., R/Bioconductor, Python, or MATLAB, and a wide range of contemporary, open-source bioinformatics tools (e.g. PLINK, SNPTEST, IMPUTE2, BEAGLE, UCSC Genome Browser, Michigan Imputation Server)
- Proficiency with Bash, Python, Perl, R, C/C++, and/or Java
- Team-oriented with excellent written and verbal communication skills, organizational skills, and attention to detail; ability to organize and execute multiple projects in parallel
- Demonstrated ability to proactively remain up-to-date in current bioinformatics techniques and resources, and identify and benchmark novel software solutions against established reference datasets
- Experience in constructing practical computational tools/pipelines for data parsing, quality control, modelling, and analysis for large-scale genetic or genomics datasets
- Ability to obtain and maintain a security clearance
- Familiarity with publicly available data sources (such as dbGaP, GDC/TCGA, ENCODE, 1000 Genomes, gnomAD/ExAC, TARGET, GTEx) and diverse genomic annotations
- Experience managing large datasets and computational tasks in a Linux-based high-performance computing environment
- Pipeline development experience, including collaborative coding and use of source control (e.g. git)
- Experience with Snakemake, make, or other workflow management systems
- Experience with containerization (e.g. Singularity, Docker)
- Experience with Google Cloud, AWS, or managed cloud environments
- Experience in molecular and population genetics, with a strong publication record
Equal Opportunity Employer (EOE) | Minority/Female/Disabled/Veteran (M/F/D/V) | Drug Free Workplace (DFW)