Scientific computing and data science.
(And sometimes both together)
Last updated: 2020.08.14
Broad research interests
- Predicting materials properties with ML
- Text mining scientific literature
- Adaptive design (black box optimization) for scientific computation
- High throughput and massively parallel methods
- Presenting these technologies via accessible, open interfaces (e.g., webpages)
- B.Sc. Chemical Engineering - UCLA (2014 - 2017)
- M.S. Materials Science - UC Berkeley (2017 - 2020)
- Ph.D. Materials Science - UC Berkeley (2017 - present)
- Research topic: Materials Design with Data Science and HPC
Dunn, A., Wang, Q., Ganose, A., Dopp, D., Jain, A. Benchmarking Materials Property Prediction Methods: The Matbench Test Set and Automatminer Reference Algorithm. npj Comput. Mater. TBD
Dylla, M., Dunn, A., Anand, S., Jain, A., Snyder, G. J. Machine Learning Chemical Guidelines for Engineering Electronic Structures in Half-Heusler Thermoelectric Materials. Research 2020, 6375171 (2020)
Bartel, C. J., Trewartha, A., Wang, Q., Dunn, A., Jain, A., Ceder, G. A critical examination of compound stability predictions from machine-learned formation energies. npj Comput. Mater. 6, 97 (2020)
Ricci, F., Dunn, A., Jain, A., Rignanese, G. M., Hautier, G. Gapped metals as thermoelectric materials revealed by high-throughput screening. J. Mater. Chem. A 8, 17579-17594 (2020)
Tshitoyan, V., Dagdelen, J., Weston, L., Dunn, A., Rong, Z., Kononova, O., Persson, K. A., Ceder, G., & Jain, A. Unsupervised word embeddings capture latent knowledge from materials science literature. Nature 571, 95-98 (2019)
Dunn, A., Brenneck, J., Jain, A. Rocketsled: a software library for optimizing high-throughput computational searches. J. Phys. Mater. 2, 034002 (2019).
Ward, L., Dunn, A., Faghaninia, A., Zimmermann, N. E. R., Bajaj, S., Wang, Q., Montoya, J. H., Chen, J., Bystrom, K., Dylla, M., Chard, K., Asta, M., Persson, K., Snyder, G. J., Foster, I., Jain, A. Matminer: An open source toolkit for materials data mining. Comput. Mater. Sci. 152, 60-69 (2018).
[Invited] Dunn, A., Jain, A. “Software tools for Accelerating Materials Discovery with Machine Learning” at Foundational and Applied Data Science for Molecular and Material Science Engineering (Lehigh I-DISC Institute for Data, Intelligent Systems, and Computation), Bethlehem, Pennsylvania. May 23, 2019.
Dunn, A., Wang, Q., Ganose, A., Faghaninia, A., Jain, A. “An Automatic Materials Science Machine Learning Tool for Benchmarking and Prediction” at AI-based Investigation of Material Properties (TMS 2019), San Antonio, Texas. March 12, 2019
Dunn, A., Faghaninia, A. “Matminer: Data Mining for Materials Science” at Materials Project Workshop 2018, Berkeley, California. August 10, 2018
Dunn, A., Bajaj, S., Jain, A. “Automatic Optimization Algorithms for Maximum-Throughput Materials Design and Discovery” at Science Undergraduate Laboratory Internship Program, Berkeley, California. August 5, 2016
Dunn, A., Ray, A., Daloglu, M.U., Ozcan, A. “The Development of Polymer-based Nanolenses Towards Enhanced Nanoparticle Imaging” at UCLA HHMI Day, Los Angeles, California. May 31, 2016
Ganose, A., Dunn, A. “Data Mining for Materials” at Materials Project Workshop 2019, Berkeley, California. August 2, 2019
Graduate Student Research Assistant @ LBNL. Using data mining to elucidate structure-property relationships and accelerate predictions of material properties. Running many thousands of density functional theory (DFT) calculations to evaluate candidate thermoelectrics and communicating results to experimental collaborators. Writing open-source software packages for data mining materials properties and running massively parallel calculations on supercomputers. (2017 - present)
Consultant @ MaterialsQM Consulting. High-throughput synthesis pathway screening using density functional theory and combinatorics. Communicating with clients, preparing reports, and helping guide discovery of novel semiconductor materials. (2018 - 2019)
Undergraduate Student Research Assistant @ LBNL. Remote position. Wrote a black-box Bayesian optimization (adaptive design) package for use with the workflow software FireWorks. Incorporated several machine learning algorithms as optimization engines, and tested the performance on two example use cases in materials science. (2016 - 2017)
Principal Web Developer @ RYE Limousine, Inc. Remote position. Designed and deployed a corporate website for a limousine service, serving hundreds of customers per month, using WordPress and LimoAnywhere. The website included live chat between RYE employees and customers and let customers interface with a remote scheduling system. (2018)
Howard Hughes Medical Institute Undergraduate Researcher @ UCLA. Studied on-chip microscopy at the Ozcan Lab. Investigated techniques for rapidly polymerizing nanolenses inside mobile microscopes to identify nanoparticles (such as viruses). (2015 - 2016)
Lead App Designer @ UCLA Dept. of Anesthesiology Mobile App Team. Led UX design for a mobile application for perioperative/anesthetic care for UCLA Health. Worked alongside the UCLA Health Center to develop a comprehensive program for wireless bioinformatics. (2015 - 2016)
Leadership, Memberships, and Awards
- NERSC User Group Executive Committee - Elected member of the executive committee that administers supercomputing policy at NERSC. (2019 - 2022)
- UC Berkeley Graduate Data Visualization Contest Overall Winner - Won a schoolwide competition by creating an interactive website for graduate financial data. (2019)
- Computational Materials Science at Berkeley - Co-Founder, officer (2018)
- Magna Cum Laude - UCLA (2017)
- Tau Beta Pi CA Epsilon - Distinguished member (2016)
- Edward and Doris Rhoad Scholarship - Selected recipient (2014)
- National AP Scholar with Distinction - (2014)
- Regent’s and Chancellor’s Scholarship at Berkeley - Selected but declined (2014)
- Python - 4+ years experience
- Bash scripting - 4+ years experience
- NoSQL (MongoDB) - 4+ years experience
- Julia - 1 year experience
- C++ - 1 year experience
- C - occasional use
- Go - occasional use
- Networking, file operations, process management
- Parallel CPU computing frameworks such as OpenMP and MPI
- GPU computing frameworks such as CUDA and Thrust
- Queue computing platforms such as SLURM and PBS
- DevOps tools such as Docker and Rancher
Data science libraries
- Machine learning - scikit-learn, keras, pytorch
- Numerical analysis - pandas, numpy, scipy
- Dashboarding - Plotly Dash
- Tools of the trade - ipython, jupyter notebook, matplotlib, plotly
Other software and frameworks (ordered by experience)
- HTML, CSS
- VASP - Ab-initio Density Functional Theory simulation software
Open source software
- rocketsled: (maintainer) Black box optimization framework for high-throughput computing.
- automatminer: (maintainer) An autoML tool for predicting materials properties.
- matminer: (primary developer) Data mining tools for materials science.
- matscholar: (developer) Text mining analysis of millions of materials science abstracts, including public API and website.
- pymatgen: (contributor) Python materials genomics.
- fireworks: (contributor) High throughput workflow management.
- atomate: (contributor) Pre-built workflows to calculate materials properties.
- among others…