BiGmax Workshop 2019 on Big-data-driven Materials Science

For each poster contribution there will be one poster wall (width: 97 cm, height: 250 cm) available. Please do not feel obliged to fill the whole space. Posters can be put up for the full duration of the event.

First-Principles Thermodynamics of ZrO2 at Hybrid-DFT Level Using a Machine-Learned Potential

Ahmetcik, Emre

Due to their outstanding electronic and thermal properties, zirconia-based materials are used in a wide range of industrial applications, e.g., as catalyst supports, ionic conductors, and thermal barrier coatings [1]. Computational studies of ZrO2 thermodynamic properties have hitherto relied on LDA/GGA-type functionals. However, it is well known that the exchange-correlation functional significantly affects the outcome of the calculations for this material [2]. We overcome this limitation by building a machine-learned Gaussian Approximation Potential [3] from a small number of first-principles calculations performed with a hybrid exchange-correlation functional. This allows us to simulate the dynamics of zirconia in supercells containing several hundred atoms and for several nanoseconds. By this means, we are able to obtain the phase diagram of ZrO2 and to understand the mechanism that drives the monoclinic-tetragonal phase transition. [1] A. Evans, D. Clarke, and C. Levi, J. Eur. Ceram. Soc. 28, 1405 (2008). [2] C. Carbogno et al., Phys. Rev. B 90, 1441 (2014). [3] A. P. Bartok et al., Phys. Rev. Lett. 104, 136403 (2010).

Screening of small molecules with bilayer-modifying properties using coarse-grained simulations

Centi, Alessia

Small molecules, including alcohols and anesthetics, can alter the lateral organization of plasma membranes by preferentially partitioning between domains, thereby affecting lipid bilayer properties and stability. Although lipid segregation is key to many biological processes, a precise understanding of the physical and chemical properties governing membrane phase behaviour is still lacking. Gaining more fundamental insight into the underlying mechanisms is pivotal for developing enhanced drugs that can act through targeted domain phase separation. In this work, we employ coarse-grained simulations based on the MARTINI force field [1] as a screening tool to identify compounds which can affect phase separation in model membranes. This approach, combined with potential of mean force calculations, provides a rapid and affordable platform for gaining a better understanding of the driving forces of lipid domain stabilisation/destabilisation. [1] S. J. Marrink et al., J. Phys. Chem. B 111, 7812 (2007).

Convolutional Neural Networks for Near-field Spectroscopy

Eisfeld, Alexander

When molecules are assembled into an aggregate, their mutual dipole-dipole interaction leads to electronic eigenstates that are coherently delocalized over many molecules. Knowledge about these states is important to understand the optical and transfer properties of the aggregates. Optical spectroscopy, in principle, allows one to infer information on these eigenstates and about the interactions between the molecules. However, traditional optical techniques using an electromagnetic field which is uniform over the relevant size of the aggregate cannot access most of the excited states because of selection rules. We have demonstrated that by using localized fields one can obtain information about these otherwise inaccessible states [1]. As an example, we discuss in detail the case of local excitation via radiation from the apex of a metallic tip, which also allows scanning across the aggregate. The resulting spatially resolved spectra provide extensive information on the eigenenergies and wave functions. Extracting the information on the wave functions from the spatially resolved spectra is a non-trivial task, for which we use a convolutional neural network. [1] J. Phys. Chem. Lett. 9, 6003 (2018)

Denoising photoelectron spectra using autoencoder

Giri, Sajal Kumar

Noisy nonlinear responses, e.g. photoelectron spectra (PES), result from the interaction of matter with noisy free-electron laser pulses. We used an artificial neural network (ANN) to learn noise-free PES from noisy PES. The denoising ANN performs very well for a wide range of Hamiltonians and pulse parameters. This particular problem also allows us to estimate the prediction error without knowing the actual reference.

Predicting solute-grain boundary segregation energies

Huber, Liam

Solute-grain boundary (GB) interaction has a strong influence on the evolution and stabilization of a metal's grain structure, and thus a strong influence on macroscopic properties such as strength. Due to the disordered nature of GB structures, there is a wide variety of local environments for a segregating solute to encounter. This results in a distribution of solute segregation energies which is often ignored and replaced by a single effective term, e.g. in the Langmuir-McLean model. Using classical molecular statics, we perform over 1.4 million segregation energy calculations to build these distributions for six different segregating solutes at 38 GBs in aluminium. This data set is sufficiently rich to apply machine learning techniques for building predictive models of the per-solute-per-site segregation energy. We present the results of these models with special attention to their predictive accuracy and to the thermodynamics of solute-GB concentration which can be probed when these distributions are available.
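
The difference between a single effective segregation energy and a full distribution can be illustrated with a small sketch. It applies the Langmuir-McLean isotherm to every site and averages the occupations. The function names, the sign convention (negative energy favours segregation), the bulk concentration, the temperature, and the Gaussian site energies are all assumptions made for illustration; none of them are data or code from the study.

```python
import numpy as np

K_B = 8.617333262e-5  # Boltzmann constant in eV/K


def site_occupation(e_seg, c_bulk, temperature):
    """Langmuir-McLean occupation of a GB site.

    Assumed sign convention: negative e_seg (in eV) favours segregation.
    """
    ratio = (1.0 - c_bulk) / c_bulk
    return 1.0 / (1.0 + ratio * np.exp(np.asarray(e_seg) / (K_B * temperature)))


def gb_concentration(e_seg_sites, c_bulk, temperature):
    """Average solute occupation over all GB sites in the distribution."""
    return float(np.mean(site_occupation(e_seg_sites, c_bulk, temperature)))


# Made-up site energies for illustration only (not data from the study).
rng = np.random.default_rng(0)
site_energies = rng.normal(loc=-0.10, scale=0.15, size=1000)

c_distribution = gb_concentration(site_energies, c_bulk=0.01, temperature=600.0)
c_effective = gb_concentration([site_energies.mean()], c_bulk=0.01, temperature=600.0)
print(f"full distribution: {c_distribution:.3f}, single effective energy: {c_effective:.3f}")
```

Because the occupation depends nonlinearly on the segregation energy, averaging the occupations over sites generally differs from evaluating the isotherm at the mean energy.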

Structure-selection strategies in the cluster-expansion method

Hübner, Axel

Our work addresses structure selection strategies for the cluster expansion method (CE) of alloy theory. The selection strategies aim to optimize the construction of training data sets, leading to faster convergence and improved accuracy of the predictions. The latter is estimated through cross validation (CV). We analyze multiple calculation schemes of the CV score and show how these relate to structure selection. Examples of structure selection strategies, as implemented in the Python package CELL for cluster expansion, are presented.
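
As a rough illustration of what a CV score measures, the sketch below computes a leave-one-out score for a plain least-squares fit, treating the cluster expansion as a linear model in the cluster correlations. Leave-one-out is only one possible scheme and is not necessarily among those analyzed in this work. The function name and the synthetic data are hypothetical, and the code does not use the CELL package.

```python
import numpy as np


def loo_cv_score(correlations, energies):
    """Leave-one-out CV score: RMS error of predicting each left-out structure."""
    X = np.asarray(correlations, dtype=float)
    y = np.asarray(energies, dtype=float)
    n = len(y)
    squared_errors = []
    for i in range(n):
        keep = np.arange(n) != i  # train on all structures except structure i
        coeffs, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        squared_errors.append((X[i] @ coeffs - y[i]) ** 2)
    return float(np.sqrt(np.mean(squared_errors)))


# Synthetic stand-in data: 20 structures, 4 clusters (illustrative values only).
rng = np.random.default_rng(1)
cluster_correlations = rng.uniform(-1.0, 1.0, size=(20, 4))
true_eci = np.array([0.5, -0.2, 0.1, 0.05])
exact_energies = cluster_correlations @ true_eci
noisy_energies = exact_energies + rng.normal(scale=0.01, size=20)

cv_exact = loo_cv_score(cluster_correlations, exact_energies)
cv_noisy = loo_cv_score(cluster_correlations, noisy_energies)
print(f"CV (exact data): {cv_exact:.2e}, CV (noisy data): {cv_noisy:.2e}")
```

A structure selection strategy could be compared against random selection by tracking how such a score evolves as structures are added to the training set.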

Toward Generalised Subgroup Discovery

Kalofolias, Giannis

In subgroup discovery, we aim to find named subsets (subgroups) in data that exhibit exceptional behaviour on a property of interest, captured by the target variable. For each application, a new optimisation scheme must be developed for the efficient solution of the task at hand, e.g., for non-typical objectives or data types. We seek a general framework that guarantees results to be exact and is adaptable to the notion of entity distance, as captured by an appropriate kernel.

Nonlinear classification: A Kernelized Support Tensor Train Machine

Kour, Kirandeep

There are several applications, e.g., neuroscience, materials science, signal processing etc., where enormous amounts of data are being generated. Generally, these data depend on various parameters and can therefore be interpreted as multidimensional arrays. Although computational power has increased drastically over the last decade, a direct treatment involving such multidimensional arrays is still almost impossible due to the curse of dimensionality. This means that the memory required to store these multidimensional arrays increases exponentially with the dimensionality, and hence so does the computational cost involved. Tensors (i.e., multi-way arrays) can be considered an essential tool to mitigate the aforementioned issue. Tensor decomposition often provides a natural and compact representation for such massive multidimensional data. There has been a significant advancement in the field of tensor calculus over the last decade, with successful applications in various fields of science and engineering. Currently, fields such as Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) are becoming more and more advanced in solving real-world problems due to the availability of high computational capacities such as GPUs and TPUs. But all these techniques still lack efficiency in the multidimensional case. In our work, we investigate a multidimensional problem arising in machine learning. In particular, our focus lies on the nonlinear classification problem for a model with millions of parameters. In this talk, we will discuss how tensor techniques can be applied to such a problem and its related challenges. (P. Benner (Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany and Fakultät für Mathematik, Technische Universität Chemnitz, Germany), K. Kour (Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany) and M. Stoll (Fakultät für Mathematik, Technische Universität Chemnitz, Germany))
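
To make the compression argument concrete, the following sketch decomposes a small tensor into tensor-train format with sequential truncated SVDs (often called TT-SVD) and compares parameter counts. It only illustrates the tensor-train format named in the title. It is not the kernelized classifier itself, and the function names and the rank-one test tensor are assumptions chosen for illustration.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Decompose `tensor` into tensor-train cores via sequential truncated SVDs."""
    dims = tensor.shape
    cores = []
    rank = 1
    remainder = np.asarray(tensor, dtype=float)
    for n in dims[:-1]:
        # Unfold: rows combine the incoming rank with the current mode.
        remainder = remainder.reshape(rank * n, -1)
        u, s, vt = np.linalg.svd(remainder, full_matrices=False)
        new_rank = min(max_rank, len(s))
        cores.append(u[:, :new_rank].reshape(rank, n, new_rank))
        # Carry the truncated singular values and right factors to the next mode.
        remainder = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
    cores.append(remainder.reshape(rank, dims[-1], 1))
    return cores


def tt_to_full(cores):
    """Contract tensor-train cores back into the full array."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.squeeze(axis=(0, -1))


# Illustrative rank-one tensor of shape (4, 5, 6, 7).
rng = np.random.default_rng(2)
factors = [rng.normal(size=n) for n in (4, 5, 6, 7)]
rank_one = np.einsum("i,j,k,l->ijkl", *factors)

cores = tt_svd(rank_one, max_rank=1)
n_tt_params = sum(core.size for core in cores)
print(f"full entries: {rank_one.size}, tensor-train entries: {n_tt_params}")
```

For this rank-one example the tensor-train storage grows with the sum of the mode sizes, whereas the full array grows with their product.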

Towards an Accurate, High-throughput Framework For the Prediction of Anharmonic Free Energies in Molecular Crystals: Benchmarks

Krynski, Marcin

Organic molecular crystals are a vast group of compounds with undisputed industrial importance, known for their ability to form polymorphs with properties tied strongly to their crystallographic structure. A large body of theoretical research is centered on polymorph energy ranking [1], which is impacted by the (often neglected) thermodynamic conditions and anharmonicities of the potential energy surface (PES). We present a study of anharmonic contributions to free energies, starting by choosing a representative set of polymorphic crystals to benchmark different methodologies: naphthalene (simple and stiff crystal) and pimelic acid (soft). For the p21a14 and p21c14 molecular crystal polymorphs of naphthalene, we employ dispersion-corrected density-functional theory and compare full anharmonic free-energy evaluations [2] to more computationally tractable approximate methods (self-consistent phonons), gauging the effect of lattice expansion at different temperatures. We show that at the level of the PES, p21c14 is lower in energy than p21a14 by ca. 3 meV/molecule for any combination of PBE/PBE0/B3LYP with pairwise or many-body van der Waals (vdW) corrections. Without vdW corrections, the crystals are not stable. We assess whether temperature and lattice expansion explain the experimentally observed stability of the p21a14 polymorph and extend our methodology to polymorphs of pimelic acid, which shows a puzzling temperature-dependent lattice contraction along one axis. [1] A. Reilly et al., Acta Cryst. B 72, 439 (2016) [2] M. Rossi, P. Gasparotto, M. Ceriotti, PRL 117, 115702 (2016)

A framework for studying similarity: Recommending materials in the NOMAD Encyclopedia

Kuban, Martin

The recent development of large databases for computational materials science, like NOMAD [1], allows researchers to reuse data that was generated for different purposes. In this work, we make use of the data contained in NOMAD to find materials with similar properties. Similarity can be evaluated and quantified by comparing specialized representations of the materials properties, so-called fingerprints. We design a family of fingerprints derived from the electronic density-of-states (DOS), consisting of vectorial representations obtained from non-uniform scalings of the DOS. In contrast to previous works [2], our approach allows us to set the focus of searches for similar materials on special features of the DOS, for instance the band gap or the number of states close to the Fermi level. We present examples for several materials ranging from metals to insulators. To demonstrate the usefulness and applicability of our approach, we have devised a recommender system for the NOMAD Encyclopedia. [1] C. Draxl and M. Scheffler, MRS Bulletin, 43, 676, (2018). [2] O. Isayev et al., Chemistry of Materials 27, 735, (2015).
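
The idea of a fingerprint that focuses on one region of the DOS can be sketched as follows. The abstract specifies neither the grid nor the similarity measure, so everything here is an assumption for illustration: the sinh-stretched energy grid, the Tanimoto coefficient, the function names, and the two synthetic Gaussian DOS curves.

```python
import numpy as np


def dos_fingerprint(energies, dos, centre=0.0, n_bins=32, width=10.0, focus=3.0):
    """Integrate the DOS over bins that are narrowest around `centre`.

    The sinh mapping is an illustrative choice of non-uniform grid.
    """
    u = np.linspace(-1.0, 1.0, n_bins + 1)
    edges = centre + width * np.sinh(focus * u) / np.sinh(focus)
    # Cumulative trapezoidal integral of the DOS, interpolated at the bin edges.
    increments = 0.5 * (dos[1:] + dos[:-1]) * np.diff(energies)
    cumulative = np.concatenate(([0.0], np.cumsum(increments)))
    return np.diff(np.interp(edges, energies, cumulative))


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprint vectors (1.0 for identical vectors)."""
    dot = a @ b
    return float(dot / (a @ a + b @ b - dot))


# Two synthetic DOS curves on a common energy axis (eV relative to the Fermi level).
energy_axis = np.linspace(-10.0, 10.0, 2001)
dos_a = np.exp(-((energy_axis + 2.0) ** 2)) + np.exp(-((energy_axis - 3.0) ** 2))
dos_b = np.exp(-((energy_axis + 2.5) ** 2)) + np.exp(-((energy_axis - 3.0) ** 2))

fp_a = dos_fingerprint(energy_axis, dos_a)
fp_b = dos_fingerprint(energy_axis, dos_b)
similarity_ab = tanimoto(fp_a, fp_b)
print(f"similarity(a, a) = {tanimoto(fp_a, fp_a):.3f}, similarity(a, b) = {similarity_ab:.3f}")
```

Moving `centre` or changing `focus` shifts where the fingerprint resolves the DOS most finely, which is one way a search could be steered towards features near the Fermi level.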

Robust crystal-structure recognition using Bayesian deep learning

Leitherer, Andreas

Assigning the crystal structure to local regions of large atomic structures can reveal hidden patterns and thus interesting material properties. Available computational methods either support a large number of space groups but show critically limited robustness, or are very robust but can treat only a handful of classes. We use Bayesian neural networks to robustly assign the correct crystal-structure type to a given material while being able to treat numerous space groups and chemical species. To capture information about the local chemical environments, we apply the smooth-overlap-of-atomic-positions (SOAP) descriptor, serving as input to the deep-learning model. Since the neural network provides an intrinsic similarity metric, we are able to investigate structural transitions such as the Bain path between face-centered cubic and body-centered cubic structures. We also discuss the application of our framework to detect precipitates in Ni-based superalloys (materials used in aircraft engines), whose structure is usually experimentally investigated via atom probe tomography. Andreas Leitherer, Angelo Ziletti, Matthias Scheffler, Luca M. Ghiringhelli (Fritz Haber Institute of the Max Planck Society)

Machine Learning of Free Energies

Rauer, Clemens

Free energies are important molecular properties which can provide an insight into the thermodynamic state of the respective system. Accurate calculations of free energies are an important tool for many biophysical applications, ranging from protein-ligand binding [1] to the insertion of small molecules into a lipid [2]. However, computationally expensive high level simulations are necessary in order to obtain accurate free energy estimates, and therefore, only a small subset of chemical space can be accurately covered. We overcome this problem by building a Delta-machine learning [3] model. Using this approach we can use a "cheap" low level method to predict free energies and learn the correction to a higher level method or experimental value. Then, we can predict high level free energies for significantly larger compound sets than was used in the training of the model. We show that by using only limited high level data, highly accurate free energies can be calculated using this method. As a first system we apply this approach to the prediction of hydration free energies. [1] Mobley, D.L. & Gilson, M.K. Annu. Rev. Biophys. 2017, 46:531-58 [2] Menichetti, R. et al. J. Chem. Phys. 2017, 147, 125101 [3] Ramakrishnan et al. J. Chem. Theory Comput. 2015, 11, 2087-2096
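
The Delta-learning idea, learning only the correction from the low level to the high level, can be sketched with a small kernel ridge regression. This is a minimal illustration under stated assumptions: the one-dimensional descriptor, the two toy "methods", the Gaussian kernel, and its hyperparameters are hypothetical and are not the model or data used in this work.

```python
import numpy as np


def gaussian_kernel(a, b, length_scale):
    """Gaussian kernel matrix between two sets of feature vectors."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


class DeltaModel:
    """Kernel ridge regression on the difference high level minus low level."""

    def __init__(self, length_scale=1.0, regularisation=1e-6):
        self.length_scale = length_scale
        self.regularisation = regularisation

    def fit(self, features, low_level, high_level):
        self.train_features = features
        delta = high_level - low_level  # the correction to be learned
        kernel = gaussian_kernel(features, features, self.length_scale)
        kernel += self.regularisation * np.eye(len(delta))
        self.weights = np.linalg.solve(kernel, delta)
        return self

    def predict(self, features, low_level):
        kernel = gaussian_kernel(features, self.train_features, self.length_scale)
        return low_level + kernel @ self.weights  # cheap value plus learned correction


def cheap_method(x):
    """Hypothetical low-level estimate (toy function)."""
    return np.sin(x[:, 0])


def expensive_method(x):
    """Hypothetical high-level reference (toy function)."""
    return np.sin(x[:, 0]) + 0.3 * x[:, 0]


x_train = np.linspace(0.0, 5.0, 15)[:, None]   # few expensive reference points
x_test = np.linspace(0.0, 5.0, 200)[:, None]   # larger set, cheap method only

model = DeltaModel().fit(x_train, cheap_method(x_train), expensive_method(x_train))
prediction = model.predict(x_test, cheap_method(x_test))

error_low = np.mean(np.abs(cheap_method(x_test) - expensive_method(x_test)))
error_delta = np.mean(np.abs(prediction - expensive_method(x_test)))
print(f"mean error, low level only: {error_low:.3f}; with learned correction: {error_delta:.2e}")
```

The design choice is that the correction is typically smoother than the target itself, so fewer high-level reference points are needed than for learning the target directly.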

Machine Learning for Multidimensional Photoemission Spectroscopy

Stimper, Vincent

Multidimensional photoemission spectroscopy is a recently invented experimental technique for studying the structure and dynamics of electrons in solids. I will demonstrate how machine learning can be used to analyze the electron dynamics and reconstruct the electronic band structure.

Temperature-dependent properties of thermoelectric clathrates

Troppenz, Maria

Intermetallic clathrate compounds, among them the type-I clathrate Ba8AlxSi46-x, are promising candidates for high-efficiency thermoelectric applications. They exhibit a strong dependence of the electronic properties on the atomic arrangement of the substitutional Al atoms in the crystal framework [1]. At the charge-balanced composition (x=16), the ground-state configuration is semiconducting; however, configurations higher in energy are metallic. Understanding changes in electronic behavior with respect to temperature is essential, as semiconducting behavior is a prerequisite for thermoelectric applications. By employing the cluster expansion technique combined with Monte Carlo sampling and the Wang-Landau method [2], we find a semiconductor-to-metal transition at around 600 K which is driven by a partial order-disorder transition. Signatures of this phase transition are observed in the temperature-dependent band structure, specific heat, and partial occupations. [1] M. Troppenz, S. Rigamonti, and C. Draxl, Chem. Mater. 29, 2411 (2017). [2] S. Rigamonti, M. Troppenz, M. Kuban, A. Huebner, C. Sutton, L. Ghiringhelli, M. Scheffler, and C. Draxl. CELL: a python package for cluster expansions with a focus on complex alloys, in preparation.