Data Scientist 2

Pacific Northwest National Laboratory

Overview
Researchers in the Physical and Computational Sciences Directorate (PCSD) lead major R&D efforts in experimental and theoretical interfacial chemistry, chemical analysis, high energy physics, interfacial catalysis, multifunctional materials, and integrated high-performance and data-intensive computing.
PCSD is PNNL's primary steward for research supported by the Department of Energy's Offices of Basic Energy Sciences, Advanced Scientific Computing Research, and Nuclear Physics, all within the Department of Energy's Office of Science.
Additionally, Directorate staff perform research and development for private industry and other government agencies, such as the Department of Defense and NASA. The Directorate's researchers are members of interdisciplinary teams tackling challenges of national importance that cut across all missions of the Department of Energy.
Responsibilities
The successful candidate will have the opportunity to support the creation of, and access to, scientific data generated from computational chemistry and cheminformatics studies that advance drug discovery, molecular science, and materials science research, among others. The candidate will apply their data science and data engineering skills to autonomously design, develop, and support web-facing applications tailored to the needs of these research users. They will contribute to the design and development of tools that simplify access to and use of the chemical data, and will be expected to work both independently and within a team to support software design and development activities, including requirements analysis, code development, testing, documentation, and deployment support.
Designs, develops, and implements methods, processes, and systems to analyze diverse data. Applies knowledge of statistics, machine learning, advanced mathematics, simulation, software development, and data modeling to integrate and clean data, recognize patterns, address uncertainty, pose questions, and make discoveries from structured and/or unstructured data. Produces solutions driven by exploratory data analysis from complex and high-dimensional datasets. Designs, develops, and evaluates predictive models and advanced algorithms that lead to optimal value extraction from the data. Demonstrates ability to transfer skills across application domains.
Qualifications
Minimum Qualifications:
- BS/BA and 2 years of relevant experience -OR-
- MS/MA -OR-
- PhD
Preferred Qualifications:
The scientist is expected to have strong technical writing skills and be able to present complex ideas related to chemical data analysis and development at project reviews and conferences. The candidate will also support collaborative interaction through written communications, demonstrations, and presentations about technical activities related to computational chemistry and data science. The candidate will work collaboratively within a team to execute the full system development lifecycle, including analyzing user needs to determine technical requirements for chemical data analysis and visualization, and developing technical specifications based on conceptual design and requirements. The candidate will also identify and evaluate new technologies or methods in data science, machine learning, and high-performance computing for implementation and continuous improvement of computational chemistry and cheminformatics workflows.
- Extensive expertise in applying machine learning algorithms and packages to computational chemistry and cheminformatics problems, including but not limited to regression and classification algorithms, supervised/unsupervised learning techniques, Random Forest, SVM, and various neural networks.
- Proficient in using machine learning and deep learning frameworks such as scikit-learn, MATLAB, Theano, Torch, and TensorFlow for developing predictive models and analyzing chemical data.
- Solid experience working in Unix/Linux environments and in high-performance computing (HPC) and cloud computing settings, with a focus on running molecular modeling software packages.
- Proficiency in high-level programming languages such as Python, R, and MATLAB, with a strong emphasis on scientific computing libraries (e.g., NumPy, SciPy, Pandas) and cheminformatics toolkits (e.g., RDKit, OpenBabel, ChemoPy) for processing and analyzing chemical data.
- Strong background and hands-on experience in autonomous science, AI/ML, data science, and Natural Language Processing, along with a solid foundation in chemistry, materials science, cheminformatics, and computational chemistry/biology. Skilled in applying these techniques to accelerate drug discovery, materials design, and reaction prediction.
- Proficient in using cheminformatics databases and tools for managing, querying, and analyzing large chemical datasets. Skilled in applying data mining and machine learning techniques to extract meaningful insights and build predictive models from these datasets.
- Some grasp of software engineering concepts, including desi

Location: Cammack Village, AR

Posted: Aug. 7, 2024, 6:50 a.m.
