TK. Mini-symposium: Machine Learning
Tuesday, 2022-06-21, 01:30 PM
Noyes Laboratory 217
SESSION CHAIR: Andrew White (University of Rochester, Rochester, NY)
|
|
|
TK01 |
Invited Mini-Symposium Talk |
30 min |
01:30 PM - 02:00 PM |
P6195: AN OVERVIEW OF MACHINE LEARNING IN ROTATIONAL SPECTROSCOPY |
STEVEN SHIPMAN, Department of Chemistry, New College of Florida, Sarasota, FL, USA; |
IDEALS Archive (Abstract PDF / Presentation File) |
DOI: https://dx.doi.org/10.15278/isms.2022.TK01 |
Over the last several years, particularly with the advent of well-documented open source libraries, it has become increasingly easy to apply machine learning techniques to a wide range of problems. Spectroscopy has not been immune to this, and literature searches for "machine learning" and "spectroscopy" return thousands of hits. However, these techniques have not yet found widespread use in the area of high-resolution rotational spectroscopy. In this talk, I will give an overview of the current work in the field and highlight some of the challenges that make this a difficult problem. Along the way, I also hope to provide a kind of "baseline", showing what can be done without the use of machine learning techniques and where they may be particularly applicable.
|
|
TK02 |
Contributed Talk |
15 min |
02:06 PM - 02:21 PM |
P6014: DEVELOPMENT OF HIGH-SPEED AB INITIO CCSD(T) LEVEL NEURAL NETWORK POTENTIAL ENERGY SURFACES FOR DIFFUSION MONTE CARLO |
FENRIS LU, ANNE B McCOY, Department of Chemistry, University of Washington, Seattle, WA, USA; |
IDEALS Archive (Abstract PDF / Presentation File) |
DOI: https://dx.doi.org/10.15278/isms.2022.TK02 |
Diffusion Monte Carlo (DMC) is a general statistical method that is capable of providing an accurate ground-state solution to the molecular Schrödinger equation of the system of interest. The approach is particularly well suited for systems like water clusters and CH5+ that undergo large amplitude vibrational motions, providing a way to gain insights into their vibrational and rotational spectra that are difficult to achieve by other methods. The ability to perform DMC simulations is predicated on the availability of a fast and reliable Potential Energy Surface (PES), as billions of structures with energies up to ten times the zero point energy will be evaluated in a typical DMC simulation. Such strenuous demands for speed, accuracy and extrapolability to high-energy regions of the potential pose major challenges to most current PESs developed by conventional methods.
To address these issues, we have developed a Neural Network (NN) architecture and training protocol to generate CCSD(T) level NN-PESs specifically to meet all the demands of DMC. We validated this approach with CH5+ and (H2O)2, and applied it to protonated ethylene (C2H5+). The proposed NN-PES is trained solely with ab initio data, and is versatile, so it can be applied to any small-to-medium-sized system. Powered by the robust parallel computing ability of Graphics Processing Units (GPUs), this approach can be used to evaluate the energy of a single geometry within a microsecond. Its architecture also ensures remarkable extrapolability and no unphysical energy predictions (e.g. 'holes' in the potential) even in high energy regions of the potential where training data are extremely scarce. In this talk we will focus on the procedures taken to develop such an NN-PES, and in the accompanying talk, we will share the results of DMC studies of C2H5+ that use an NN-PES developed using this approach.
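To make the role of the potential in DMC concrete, the following is a minimal sketch of an unguided DMC loop with birth/death branching. It is not the authors' code: the analytic 1D harmonic potential stands in for the NN-PES, and the function names, walker count, time step, and population-control constant `alpha` are illustrative assumptions. The point to notice is that `potential` is called once per walker per step, which is why evaluation speed of the PES dominates the cost.

```python
import math
import random


def potential(x):
    """Hypothetical stand-in for a PES: V(x) = x^2 / 2 in units where hbar = m = omega = 1."""
    return 0.5 * x * x


def dmc_ground_state_energy(n_target=500, n_steps=2500, n_equil=500,
                            dt=0.01, seed=0):
    """Estimate the ground-state energy with simple (unguided) diffusion Monte Carlo."""
    rng = random.Random(seed)
    sigma = math.sqrt(dt)   # diffusion step width for hbar = m = 1
    alpha = 0.5 / dt        # population-control strength (an assumed, commonly used choice)

    walkers = [rng.gauss(0.0, 1.0) for _ in range(n_target)]
    e_ref = sum(potential(x) for x in walkers) / len(walkers)
    samples = []

    for step in range(n_steps):
        survivors = []
        for x in walkers:
            # Diffusion: random Gaussian displacement of the walker.
            x_new = x + rng.gauss(0.0, sigma)
            # Branching: low-energy walkers replicate, high-energy walkers die.
            weight = math.exp(-dt * (potential(x_new) - e_ref))
            n_copies = int(weight + rng.random())
            survivors.extend([x_new] * n_copies)
        walkers = survivors

        # Adjust the reference energy to keep the population near n_target.
        v_avg = sum(potential(x) for x in walkers) / len(walkers)
        e_ref = v_avg + alpha * (1.0 - len(walkers) / n_target)
        if step >= n_equil:
            samples.append(e_ref)

    # The time average of the reference energy estimates the ground-state energy.
    return sum(samples) / len(samples)
```

Calling `dmc_ground_state_energy()` should give a value close to the exact harmonic-oscillator result of 0.5 in these units, up to statistical noise and a small time-step error. A production calculation would replace `potential` with a batched, multidimensional PES evaluation.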
|
|
TK03 |
Contributed Talk |
15 min |
02:24 PM - 02:39 PM |
P6211: DIFFUSION MONTE CARLO STUDY OF C2H5+ USING AN AB INITIO POTENTIAL ENERGY SURFACE |
PATTARAPON MOONKAEN, FENRIS LU, ANNE B McCOY, Department of Chemistry, University of Washington, Seattle, WA, USA; |
IDEALS Archive (Abstract PDF / Presentation File) |
DOI: https://dx.doi.org/10.15278/isms.2022.TK03 |
Carbocations are an important class of organic intermediates, which exist in hydrocarbon plasmas and are believed to play a role in the chemistry of the interstellar medium. Protonated ethylene (C2H5+) is one such carbocation, formed from the smallest member of the alkene family. It is also important in mass spectrometry, as it appears in the mass spectra of many organic molecules and is used as the protonating agent in chemical-ionization mass spectrometry. High-level electronic structure calculations predict that the minimum energy structure is the non-classical one in which the excess proton is equidistant from the two carbon atoms. This was confirmed by the IR spectrum of C2H5+ obtained by the Dopfer and Duncan groups.
In this work, the ground state wavefunction and structure of C2H5+ are obtained from Diffusion Monte Carlo (DMC) based on a potential with CCSD(T)-level accuracy, evaluated using several machine learning approaches. The effect of the shared proton motion on the IR spectrum, as well as the coupling between the vibration of the shared proton and other higher frequency motions, will be discussed. The impact of deuteration on these couplings will also be described. Lastly, the excited state for the shared proton motion can be obtained by fixed-node DMC, allowing us to explore the excited state wavefunctions, and particularly the possibility of accessing the classical carbocation structure through vibrational excitation.
|
|
|
|
|
02:42 PM |
INTERMISSION |
|
|
TK04 |
Invited Mini-Symposium Talk |
30 min |
03:03 PM - 03:33 PM |
P5938: PUTTING DENSITY FUNCTIONAL THEORY TO THE TEST WITH MACHINE LEARNING |
HEATHER J KULIK, Chemical Engineering, MIT, Cambridge, MA, USA; |
IDEALS Archive (Abstract PDF / Presentation File) |
DOI: https://dx.doi.org/10.15278/isms.2022.TK04 |
Accelerated simulation with machine learning (ML) has begun to provide the advances in efficiency to make property prediction tractable at an unprecedented scale. Nevertheless, ML-accelerated workflows both inherit the biases of training data derived from density functional theory (DFT) and lead to many attempted calculations that are doomed to fail. Many compelling molecular systems involve strained chemical bonds, open-shell radicals and diradicals, or metal–organic bonds to open-shell transition-metal centers. Although promising targets, these materials present unique challenges for electronic structure methods and combinatorial challenges for their discovery. I will describe some of my group’s recent advances in using artificial intelligence to address challenges in accuracy and efficiency beyond conventional DFT-based ML workflows. I will describe how we have developed ML models trained to predict the results of multiple methods or the differences between them, enabling quantitative sensitivity analysis. I will then describe ML models we have developed on a series of chemical and electronic structure descriptors that predict the likelihood of calculation success and detect the presence of strong correlation. Combining novel descriptors and developing consensus from multiple levels of theory empowers decision engines that represent the first steps toward autonomous workflows that avoid the need for expert determination of the robustness of DFT-based computational modeling.
|
|
TK05 |
Contributed Talk |
15 min |
03:39 PM - 03:54 PM |
P5962: PARTITION FUNCTION ESTIMATION FROM INCOMPLETE SPECTROSCOPIC GRAPHS |
KELVIN LEE, Accelerated Computing Systems and Graphics, Intel Corporation, Hillsboro, OR, USA; KYLE N. CRABTREE, Department of Chemistry, University of California, Davis, Davis, CA, USA; |
IDEALS Archive (Abstract PDF / Presentation File) |
DOI: https://dx.doi.org/10.15278/isms.2022.TK05 |
Statistical mechanical treatment of molecules is a crucial part of the analysis workflow for many fields, ranging from reaction dynamics and spectral intensity simulation to abundance characterization in the interstellar medium and materials research and simulation. At the heart of this is computation of the partition function (the statistical equivalent of the quantum mechanical wavefunction), which involves summation over thermally relevant energy levels. Despite being conceptually straightforward, calculation of the partition function can be a challenging task: at high temperatures, the number of contributing states grows exponentially, and often the list of states is truncated for computational and portability reasons.
Here, we propose the use of physics-informed graph neural networks to parameterize the partition function calculation based on incomplete spectroscopic graphs, and as a proof-of-concept, demonstrate its applicability and weaknesses through the study of pure rotational energy levels. In contrast to approximate analytical expressions based on the principal rotational constants, graph structures natively capture effects such as centrifugal distortion of varying degrees, which otherwise significantly undermine the accuracy of calculated partition functions at elevated temperatures. As part of our study, we discuss implications for computational performance, data requirements, and applicability in typical workflows.
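For context, the conventional baseline that the abstract contrasts against can be sketched in a few lines: a direct sum over the rotational levels of a linear molecule, compared with the analytic high-temperature estimate based on the rotational constant alone. This is not the graph neural network approach. The function names, the `j_max` cutoff, and the CO-like constants in the usage note are illustrative assumptions, and the sketch assumes a heteronuclear linear rotor with no nuclear spin statistics.

```python
import math

K_B_CM = 0.6950348  # Boltzmann constant in cm^-1 per kelvin


def rotational_energy(j, b_cm, d_cm):
    """Level energy in cm^-1: E_J = B J(J+1) - D [J(J+1)]^2."""
    x = j * (j + 1)
    return b_cm * x - d_cm * x * x


def rotational_partition_function(b_cm, d_cm, temperature_k, j_max=200):
    """Direct sum of (2J+1) exp(-E_J / kT) over J = 0..j_max.

    The quartic distortion term makes E_J turn over at very high J, so
    j_max must stay well below that unphysical region.
    """
    kt = K_B_CM * temperature_k
    return sum(
        (2 * j + 1) * math.exp(-rotational_energy(j, b_cm, d_cm) / kt)
        for j in range(j_max + 1)
    )


def high_temperature_estimate(b_cm, temperature_k):
    """Analytic rigid-rotor approximation: Q ~ kT/B + 1/3."""
    return K_B_CM * temperature_k / b_cm + 1.0 / 3.0
```

With approximate CO-like constants (B near 1.9225 cm^-1, D near 6.12e-6 cm^-1), comparing `rotational_partition_function(1.9225, 6.12e-6, T)` against the rigid-rotor sum and against a sum truncated at a small `j_max` shows the two effects named in the abstract: centrifugal distortion raises Q relative to the rigid-rotor value, and an incomplete level list lowers it, with both discrepancies growing as the temperature increases.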
|
|
TK06 |
Contributed Talk |
15 min |
03:57 PM - 04:12 PM |
P6335: ACCURATE PHOTOPHYSICS OF ORGANIC RADICALS FROM MACHINE LEARNED RANGE-SEPARATED FUNCTIONALS |
CHENGWEI JU, Pritzker School of Molecular Engineering, The University of Chicago, Chicago, IL, USA; YILI SHEN, College of Software Engineering, Tongji University, Shanghai, China; AARON TIAN, N/A, Massachusetts Academy of Math and Science, Worcester, MA, USA; ETHAN FRENCH, HONGSHAN BI, ZHOU LIN, Department of Chemistry, University of Massachusetts, Amherst, MA, USA; |
IDEALS Archive (Abstract PDF / Presentation File) |
DOI: https://dx.doi.org/10.15278/isms.2022.TK06 |
Luminescent doublet-spin organic semiconducting radicals are emergent and unique candidates for organic light-emitting diodes because their internal quantum efficiency is not limited by intersystem crossing into any non-emissive high-spin state. The multi-configurational nature of their electronic structures challenges the usage of single-reference density functional theory (DFT), but the problem can be mitigated by designing more powerful exchange-correlation (XC) functionals.
In an earlier study, we developed a molecule-dependent range-separated functional, referred to as ML-ωPBE, using a stacked ensemble machine learning framework (Ju et al., J. Phys. Chem. Lett., 2021, 12, 9516). In the present study, we assessed the performance of ML-ωPBE for 64 organic semiconducting radicals from four categories, when similar radicals are absent from the training set.
Compared to the first-principles OT-ωPBE functional, ML-ωPBE reproduced the molecule-dependent range-separation parameter, ω, with a small mean absolute error (MAE) of 0.0214 a₀⁻¹.
Using single-reference time-dependent DFT (TDDFT), ML-ωPBE performed well in predicting absorption and fluorescence energies for most radicals in question, with small MAEs of 0.22 and 0.12 eV relative to experimental values, and approached the accuracy of OT-ωPBE (0.22 and 0.11 eV).
Our results demonstrated excellent generalizability and transferability of our ML-ωPBE functional from closed-shell organic semiconducting molecules to open-shell doublet-spin organic semiconducting radicals.
Footnotes:
Ju et al., J. Phys. Chem. Lett., 2021, 12, 9516.
|
|