|
Chair morning talks: Ulf Saalmann
|
09:00 - 10:00
|
Alpha Lee
(University of Cambridge)
Data-driven chemical discovery: Making sense of noisy and limited data
Although significant progress has been made in combining machine learning with theoretical calculations, translating machine learning predictions to experimentally-confirmed materials is still a significant challenge. What is often overlooked is that experimental chemistry and materials science are expensive and time-consuming, and that the data is not necessarily noise-free. This has two important consequences: First, datasets of experimentally-measured properties are often quite small. Second, a robust estimation of model uncertainty is needed to gauge the “risk” versus “return” of experimentally verifying a particular prediction. I will discuss our recent work on developing uncertainty-calibrated models from noisy and limited data in the context of drug discovery. The mathematics of random matrices, as well as Bayesian statistics and statistical physics, all come to our rescue. We have developed experimentally-validated models that suggest novel and potent organic molecules against therapeutically relevant receptors, as well as ways to chemically synthesise those novel molecules. I will also discuss how insights from drug discovery can be translated into materials science problems.
|
10:00 - 10:10
|
Group photo (to be published on the event's web page)
|
10:10 - 10:30
|
Coffee break
|
10:30 - 11:00
|
Benedikt Hoock
(Fritz-Haber-Institut der Max-Planck-Gesellschaft & Humboldt-Universität Berlin)
Feature construction and selection towards optimal descriptors for materials properties
Materials data contained in repositories like NOMAD [1] can be exploited in many useful ways, such as to better understand existing materials or to discover new materials with desired properties. A crucial step towards these goals is to find a set of meaningful descriptors, i.e. parameters based on computationally cheap input data that capture the physical mechanisms underlying certain material properties. In this work, we develop principles for constructing up to millions of candidate descriptors from simple physical properties. These principles involve mathematical operations [2] and different averaging procedures that take the local ordering into account. We compare two compressed sensing methods, LASSO+l0 [2] and SISSO [3], at identifying optimal descriptors out of all the candidates. Likewise, we introduce and compare cross-validation based model-selection strategies that use either the average training or the average test error as a criterion, aiming at increasing the descriptors’ generalizability. We use two ab initio data sets, comprising group-IV zincblende ternaries and transparent conducting oxides, to test these methodological approaches.
[1]: C. Draxl & M. Scheffler, MRS Bulletin, 43, 676 (2018).
[2]: L. M. Ghiringhelli et al., Phys. Rev. Lett. 114, 105503 (2015).
[3]: R. Ouyang et al., Phys. Rev. Mater. 2, 083802 (2018).
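As a rough illustration of the idea in this abstract, the sketch below builds candidate descriptors from two primary features using simple algebraic operations and then ranks them by absolute Pearson correlation with the target. This is only a correlation-based screening step under our own assumptions; it is not the LASSO+l0 or SISSO procedure of the talk. The feature names `a` and `b`, the helper functions, and the toy numbers are all hypothetical.

```python
import itertools
import math


def build_candidates(features):
    """Combine primary features pairwise with simple algebraic operations."""
    candidates = dict(features)  # keep the primary features themselves
    for (na, a), (nb, b) in itertools.combinations(features.items(), 2):
        candidates[f"({na}+{nb})"] = [x + y for x, y in zip(a, b)]
        candidates[f"|{na}-{nb}|"] = [abs(x - y) for x, y in zip(a, b)]
        candidates[f"({na}*{nb})"] = [x * y for x, y in zip(a, b)]
        if all(y != 0 for y in b):  # only form the ratio when it is defined
            candidates[f"({na}/{nb})"] = [x / y for x, y in zip(a, b)]
    return candidates


def pearson(u, v):
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((x - mu) * (y - mv) for x, y in zip(u, v))
    su = math.sqrt(sum((x - mu) ** 2 for x in u))
    sv = math.sqrt(sum((y - mv) ** 2 for y in v))
    return 0.0 if su == 0 or sv == 0 else cov / (su * sv)


def screen(candidates, target, top_k=3):
    """Return the names of the top_k candidates most correlated with target."""
    ranked = sorted(
        candidates,
        key=lambda name: abs(pearson(candidates[name], target)),
        reverse=True,
    )
    return ranked[:top_k]


# Toy data (made up for illustration): the target is exactly a/b.
features = {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [2.0, 1.0, 4.0, 3.0, 6.0]}
target = [x / y for x, y in zip(features["a"], features["b"])]

candidates = build_candidates(features)
best = screen(candidates, target, top_k=1)
print(best)
```

With more primary features and repeated application of the operations, the candidate space grows combinatorially, which is why a sparsity-promoting selection method is needed in practice.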
|
11:00 - 11:30
|
Markus Kühbach
(Max-Planck-Institut für Eisenforschung Düsseldorf)
Materials Science Examples for Structural Characterizing of Point Cloud Data at Scale
Distilling knowledge from the ever-growing output of experimental characterization or full-field simulations frequently requires handling point clouds with associated mark data. I will present recent examples from processing atom probe tomography data and full-field simulations of microstructure evolution during annealing, which detail how materials scientists can make their knowledge extraction process more efficient through rigorous use of high-performance computing paradigms.
|
11:30 - 12:00
|
Markus Rampp
(Max-Planck-Gesellschaft - Computing and Data Facility)
A new machine learning cluster at the MPCDF
The groups of M. Scheffler, K. Kremer, D. Raabe, and J. Neugebauer, in collaboration with the MPCDF, have designed and procured a dedicated machine-learning compute cluster, which is operated by the MPCDF (as of early 2019) and which will be open for collaborations arising in BiGmax. The talk will provide a brief overview of the cluster configuration (hardware and software environment), basic usage, and first experiences with machine-learning applications.
|
12:00 - 13:00
|
Lunch break
|
13:00 - 14:00
|
Discussions
|
|
Chair afternoon talks: Christopher Sutton
|
14:00 - 14:30
|
Benjamin Regler
(Fritz-Haber-Institut der Max-Planck-Gesellschaft Berlin)
Discovering functional relationships between atomic and materials properties: an information-theoretic machine learning approach
Machine learning predictively maps features to target properties without providing insights into why some features are more relevant than others. However, identifying feature relevance not only leads to useful knowledge about unknown outcomes but also improves the performance of learning algorithms. By means of feature selection and information theory, we quantify the relevance of features by estimating the information shared between features and target properties. More specifically, we aim to identify relationships between fundamental properties at the atomistic scale and materials properties at the macroscopic scale. We perform comparisons with existing feature selection methods and explore potential issues of our design. Further, we illustrate our approach with case studies and present first materials science applications, namely octet-binary compound semiconductors and the prediction of the compressive strength of concrete. Finally, we conclude that information-theoretic feature selection is a viable tool to explain feature relevance and to pre-process scientific data for machine-learning tasks.
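A minimal sketch of the general idea of scoring features by the information they share with a target: the snippet below computes the mutual information between discrete features and a discrete target and ranks the features accordingly. It is a textbook plug-in estimate on made-up toy data, not the estimator or selection scheme used in the talk; the feature names and values are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed pairs of p(x,y) * log2( p(x,y) / (p(x) p(y)) ).
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


# Toy data (made up): one feature determines the target, one is independent.
target = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "relevant": [0, 0, 1, 1, 0, 0, 1, 1],
    "irrelevant": [0, 1, 0, 1, 0, 1, 0, 1],
}

scores = {name: mutual_information(vals, target) for name, vals in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

Continuous properties would first need discretization or a different estimator, which is one of the design choices such an approach has to address.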
|
14:30 - 15:00
|
Leigh Stephenson
(Max-Planck-Institut für Eisenforschung Düsseldorf)
Big-data-enabled true analytical atomic-scale tomography
Field ion microscopy (FIM) provides a magnified image of a cryogenically-cooled, needle-like specimen subjected to a high electric field [1]. Projected gas ions form an image in which a specimen's surface atoms can be individually observed. A three-dimensional variant of FIM (3DFIM) stimulates incremental field evaporation of surface atoms so that a FIM image sequence can be converted into a 3D atomic reconstruction [2]. While 3DFIM can resolve both a crystalline lattice and its structural defects (such as vacancies and dislocations), the use of existing 3DFIM techniques is largely hampered by the enormous computational costs associated with acquiring and storing large quantities of image data and subsequently performing the required feature analysis and reconstruction. Its potential has also remained unexploited for two other reasons. First, modern FIM techniques have been spurned in favour of atom probe tomography (APT), a technique that, while giving lower spatial resolution, provides invaluable time-of-flight spectrometry that allows the chemical identification of atoms. Second, few modern commercial atom probes have FIM capabilities and there are only a few bespoke FIM machines.
Preliminary work at MPIE (performed jointly between the atom probe group and the computational department) has demonstrated that FIM-equipped atom probes can operate in a hybrid pulsed FIM/APT mode for a variety of materials applications. The resulting tof-FIM data is rich in nanostructural information and can be processed to resemble APT point cloud data or transformed to a versatile 3DFIM representation. Significantly, we demonstrated that our atom probes are able to perform time-of-flight mass spectrometry in the presence of an imaging gas without compromising instrument integrity and, furthermore, that the signal from field-evaporated specimen atoms can be easily differentiated from the field-ionised imaging signal.
Here we present the first steps of a collaboration between MPIE, MPI for Intelligent Systems (MPIIS) and the Max Planck Computing and Data Facility (MPCDF). Some of the experimental aspects will be mentioned, but the focus will be on the early challenges of inferring missing information and intelligently turning "bad" data into "good" data for the use of predictive methods.
|
15:00 - 15:30
|
Paolino De Falco
(Max-Planck-Institut für Kolloid- und Grenzflächenforschung Potsdam)
3D SAXS Tomography
The structural complexity of biological materials requires new methodologies of material characterization in three and even four dimensions, including time. We explore new 3D imaging methods based on X-ray scattering using synchrotron sources that can provide important information on the nanostructure of materials. In a tomography experiment at a synchrotron source, data collection requires several hours. Consequently, the amount of relevant data produced by this new approach grows tremendously, which in turn increases the computational power and time needed for 3D reconstruction of the data. The focus of our research is the development of a new methodology for fast characterization of the 3D nanostructure of bone. On the nanoscopic length scale, bone is a composite of a fibrous collagen matrix in which inorganic calcium phosphate particles are incorporated. The mineral particles decisively contribute to the high mechanical stiffness and strength of bone material. We aim to elucidate the 3D distribution of mineral particle sizes within a certain bone volume, which is a relevant parameter to characterize the influence of bone diseases on the bone’s mechanical properties. We present results from our approach, based on SAXS (small-angle X-ray scattering) tomography data collected at synchrotron sources and mathematical algorithms of image reconstruction.
|
15:30 - 16:00
|
Coffee break
|
16:00 - 16:30
|
Markus Scheidgen
(Fritz-Haber-Institut der Max-Planck-Gesellschaft & Humboldt-Universität Berlin)
FAIR experimental material science data with NOMAD
The NOMAD (Novel Materials Discovery) Center of Excellence (http://nomad-coe.eu) has developed the world's largest platform for sharing computational material science data over the past years. NOMAD integrates heterogeneous data from many computational material science codes and allows users to Find, Access, Interoperate, and Reuse (FAIR) data through a common (meta)data format. The experimental material science community faces similar challenges when trying to share data from various experimental methods, hardware, and labs. In this talk, we look at the first steps in opening the NOMAD platform for experimental material science data.
|
16:30 - 17:00
|
Ye Wei
(Max-Planck-Institut für Eisenforschung Düsseldorf)
Deploying machine learning for information extraction from atom probe datasets
In this work, we explore data extraction in atom probe tomography using machine learning techniques.
First, we applied unsupervised learning to atom probe datasets from a Zr-Al-Cu-Fe bulk metallic glass (BMG) in both an undeformed and a deformed state to detect amorphous nanospheres. More specifically, we implemented a clustering analysis using Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN).
Second, assigning an elemental identity to peaks or deconvoluting peak overlaps in the mass spectrum is a critical step in performing accurate microanalysis by atom probe and measuring a precise chemical composition. So far, this has largely relied on human expertise, which is time-consuming and often marred by manual errors. Therefore, we propose a new automatic approach to perform this step with the assistance of machine learning (mean-shift clustering and gradient-boosted decision trees).
Ye Wei, Ebrahim Norouzi, Michael Herbig, Shanoob Balachandran Nair, Baptiste Gault, Dierk Raabe (Max-Planck-Institut für Eisenforschung, Düsseldorf, Germany)
|
17:00 - 17:30
|
Arghya Dutta
(Max-Planck-Institut für Polymerforschung Mainz)
Application of data mining techniques in soft matter systems
Data mining helps in finding, and predicting, materials showing some desired property using statistical methods. It becomes very useful when the subset of materials showing the property becomes small or the pattern becomes too intricate for relatively easy modeling. In this talk, I will present results from our recent and ongoing work on how data mining can provide useful insights for complex soft matter systems.
|
17:30 - 18:00
|
Discussions
|
18:15
|
Departure to the Neustadt of Dresden by tram from the institute
(meeting point: reception of MPIPKS)
|
19:00 - 21:00
|
Conference dinner at the restaurant and pub Bautzner Tor
Hoyerswerdaer Straße 37, 01099 Dresden
|