Abstract
It has been a longstanding materials science challenge to establish structureproperty relations in amorphous solids. Here we introduce a rotationally noninvariant local structure representation that enables different predictions for different loading orientations, which is found essential for highfidelity prediction of the propensity for stressdriven shear transformations. This novel structure representation, when combined with convolutional neural network (CNN), a powerful deep learning algorithm, leads to unprecedented accuracy for identifying atoms with high propensity for shear transformations (i.e., plastic susceptibility), solely from the static structure in both two and threedimensional model glasses. The datadriven models trained on samples at one composition and a given processing history are found transferrable to glass samples with different processing histories or at different compositions in the same alloy system. Our analysis of the new structure representation also provides valuable insight into key atomic packing features that influence the local mechanical response and its anisotropy in glasses.
Introduction
The mechanical response of a crystalline metal to applied stresses can be quantitatively explained by monitoring the evolution of dislocations^{1,2} in the lattice. In stark contrast, amorphous metals do not have such welldefined structural defects, owing to the absence of longrange atomic packing order. As a result, in the static structure of a glass, it remains a grand challenge to identify a priori local “defects” that are vulnerable to rearrangements, even when the positions of all atoms are fully known^{3,4,5,6,7}. Over the years, a number of physicalpropertyinformed parameters have been used to serve as indicators^{8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31} to forecast fertile sites for shear transformations. Also, several macroscopic and experimental parameters^{28,29,30,31,32} have been shown to correlate with the properties/processes in glasses. Datadriven models^{33,34,35,36} have been put forward as well for the same goal. However, all these endeavors have only achieved moderate success: the correlations with the plastic susceptibility of particles/atoms, i.e., the propensity of each particle to experience plastic rearrangement, have not been sufficiently strong to establish structure–property relations in glassy solids.
A critical reason for this status quo is that the scalar or rotationinvariant quantities employed so far are inherently inadequate in capturing the anisotropic response of a local configuration. A given local environment of an atom (particle) can respond quite differently when the externally applied global force is imposed along different orientations^{37,38,39,40}. Recently, additional indicators have been invoked to take into account anisotropy in predicting the propensity for plastic activity in twodimensional (2D) model glasses^{38,39,40}. But it remains unclear how well these new indicators fare in threedimensional (3D) amorphous solids. Furthermore, these indicators are physicsinformed quantities that need to be evaluated with known interparticle interactions (potentials), every time the structure changes. In other words, one cannot simply monitor the spatial atomic positions alone, to explain property changes. In terms of physical insight, it is difficult to use these parameters to directly identify the critical structure features (i.e., the atomic packing environment per se) responsible for the anisotropic local mechanical response. This has been a puzzle existing for at least more than two decades since Falk and Langer pointed it out in 1998^{37}.
Recently, deep convolutional neural networks (CNNs) have enabled revolutionary breakthroughs in the field of computer vision and pattern recognition^{41,42,43,44}. Inspired by such successes, here we explore the potential of this deep learning approach via converting local configurations into images and constructing largescale training datasets (up to ~20 million images) from extensive atomic simulations. More importantly, we also devise a rotationally noninvariant structure representation, to capture the anisotropic responses of a local configuration and the intrinsic sensitivity to the loading orientation. We will demonstrate that this combination achieves unprecedented accuracy when it is used to separate out particles with high plastic susceptibility, unveiling the power of local static structure for predicting orientationdependent plastic events in both 2D and 3D glasses. In terms of new physical insight, by identifying a clear difference in the new atomic structure representation for contrasting particles, i.e., those with high versus low plastic susceptibility, we have contributed, from the structural perspective, to fully resolving the puzzle as to why the same local configuration could respond very differently to external loading applied in different orientations.
Results
Atomic structure representation and model architecture
We have developed CNN models in both a 2D binary LennardJones (LJ) glass system^{45} and a 3D binary embeddedatom method (EAM)based Cu–Zr metallic glass system^{46}. To represent the local packing environment of each particle i, we use a new multidimensional structural function, termed a Gaussianweighted spatial density map (SDM). For the 2D model, the SDM of each particle i is defined as
where the summation is performed over all particles satisfying these conditions: \({\mathrm{species}}\left( j \right) \in \beta ; {r_{ij,x}}  < r_c\) and \( {r_{ij,y}}  < r_c\), \(r_{ij,x}\) and \(r_{ij,y}\) are the components along x and y dimensions, respectively, of \({\mathbf{r}}_{i,j}\), which is the vector connecting the central particle i with particle j of species \(\beta\) within a cutoff \(r_c\) (particle j may be the central particle when the species of the central particle is that under consideration in the channel). Here, \(x,y \in \left[ {  r_c + 0.5\Delta ,r_c  0.5\Delta } \right]\) with a constant increment of \(\Delta\) and \(\beta \in \left\{ {0,1} \right\}\) where 0 represents small (S) particles, and 1 large (L) ones. We set \(r_c = 5\sigma\) and \(\Delta = 0.2\sigma\) after trials using many different values (see Supplementary Fig. 1a and Supplementary Notes 1 and 2, and \(\sigma\) is the characteristic length scale parameter in the potential, see details in “Methods”).
Similarly, for 3D glass systems, the SDM of each atom i is defined as
here the summation is performed over all particles satisfying these conditions: \({\mathrm{species}}\left( j \right) \in \beta ; {r_{ij,x}}  < r_c; {r_{ij,y}}  < r_c\) and \( {r_{ij,z}}  < r_c\), and \(\beta\) = 0 or 1, to represent Zr and Cu atoms, respectively. Using an SDM to represent a local configuration can be viewed as projecting the local configuration to 2D (or 3D) grids for different species, which results in a multidimensional numerical array, equivalent to a 2D (or 3D) image containing \(\left( {2r_c/\Delta } \right)^2\) or \(\left( {2r_c/\Delta } \right)^3\) pixels. And each pixel has channels equal to the number of components in the sample. These images can be used directly as input into the CNN model. For an adequately large \(r_c\), a sufficient number of surrounding particles would be included. And, when \(\Delta\) is small enough, any tiny variation of particle positions in the local configuration would lead to a corresponding variation in SDM. Therefore, the SDM can faithfully and accurately represent the topological structure feature of a local configuration. Obviously, when a larger \(r_c\) is used, an image would contain structural information across a larger range, and a smaller \(\Delta\) means a higher resolution of resultant images; the difference between two very similar local configurations would be more pronounced in the images. Although a larger \(r_c\) and a smaller \(\Delta\) are good for more faithful and accurate representation of the topological structure feature and elevated predictive performance, these also mean larger size of each image, requiring more computational resource and memory to generate, store, and use these images. In this work, we will choose optimized values of both \(r_c\) and \(\Delta\) for different glass systems after trying many different values. See more discussion about the optimization of both \(r_c\) and \(\Delta\) in Supplementary Notes 1 and 2. In addition, the contribution to the SDM from different species are separated into different channels, taking into account the influence arising from chemical affinity. Figure 1a shows a typical 2D model glass, with a closeup view (a local configuration) in Fig. 1b. The SDM of S and L particles corresponding to Fig. 1b are shown in Fig. 1c, d, respectively, where each image contains 50 × 50 pixels. The pixels with high intensity (darker color) correspond to the locations of particles. Obviously, the SDM will be different when the local configuration is rotated. Before computing SDMs, we will use different coordinate systems or rotate configurations for different loading orientations such that the shear direction and the normal direction of the shear plane are parallel to X and Y axes, respectively, of the chosen coordinate system (several specific examples are given in Supplementary Fig. 2 for choosing appropriate coordinate systems for different loading protocols). Therefore, the SDM is rotationally noninvariant, which is the key feature we need for predicting orientationdependent plastic events in amorphous solids. As will be shown later, the rotationdependent SDM is very sensitive to small changes in sampleloading alignment. The SDM can also be used to predict rotationinvariant properties of crystalline materials via data augmentation^{41}, i.e., arbitrarily rotating images each time they are fed into the network. The SDM is a substantial extension of the radial Gaussian function pioneered by Behler and Parrinello previously^{47}, to multidimensional space. Our extension is important in the following aspects. First, it ensures a complete structure representation. Second, it enables rotationally noninvariant prediction. Third, it results in images that can be interpreted by CNN which promises stateoftheart learning capability.
We then applied the CNN method^{44} to each input image to predict the propensity of each particle to experience plastic rearrangement, at different global shear strains when the sample is loaded along a specific orientation. See more details on the loading protocol in “Methods.” Note that the CNN model is taking one image per atom (each image represents the local atomic environment centered on the atom of interest), rather than taking an image of the entire simulation box. Figure 1e illustrates the architecture of the CNN model for the 2D glasses. Specifically, each CNN model contained 25 convolutional layers, where filters with a small receptive field of 3 × 3 were used. Zero padding was involved in all but not in the fifth and last convolutional layers. Batch normalization^{48} was adopted right after each convolution and before activation with the rectification (ReLU) nonlinearity^{41}. A maxpooling layer is periodically inserted in between the 25 successive convolutional layers. Maxpooling is performed over a 2 × 2 pixel window, with a stride of 2. The number of filters is 20 in the first five convolutional layers and then doubled after each maxpooling layer. And the last convolutional layer is directly followed by the output layer, which is a sigmoid neuron as we are performing binary classification tasks. In the classification tasks, each input image has a true positive label (y_{i} = 1) when the nonaffine squared displacements (D^{2}_{min})^{37} of the atom i represented by the image is greater than a specific threshold after straining to a given degree under a specific loading condition, and a negative label otherwise (y_{i} = 0). More details about the D^{2}_{min} threshold and the loading protocol will be presented later.
For 3D glasses, r_{c} = 9.6 Å and Δ = 0.6 Å were found to produce optimized results (see Supplementary Fig. 1b, Supplementary Notes 1 and 2), and the resultant input images contain 32 \(\times\) 32 \(\times\) 32 pixels. The corresponding CNN model contains 20 convolutional layers and three maxpooling layers, and both convolution and maxpooling were performed in 3D space, as illustrated in Supplementary Fig. 3. The number of filters in each 3D convolutional layer was 60. To accelerate training and avoid shortage of memory, we did not double the filter number after maxpooling. This is not expected to influence significantly the training results, based on our experience with the 2D glasses for which doubling the filter number increased the validation accuracy by less than 1%. For tutorials about deep learning, the readers are referred to refs. ^{41,42,43,44}.
Structural insight into the anisotropic local mechanical response
In the literature, free volume^{8,49} has been commonly used as a (macroscopic) structural parameter. However, it has been known for a long time that the correlation between local excess volume and dynamics in glasses is rather weak^{10,17}. We will further confirm later that there is indeed no obvious correlation between the plastic susceptibility and local volume. There have also been recent observations that atoms with fewer neighboring particles in the nearestneighbor shell, but more in the troughs in between the shells, would be more flexible^{33,35}. However, this does not explain why the same local configuration responds differently to applied global force along different directions. Now that we have converted local configurations into images, let us first examine, from this perspective, if we could identify a critical structure feature responsible for the anisotropic mechanical response. To this end, we seek valuable insight by contrasting the SDMs of particles with extremely high plastic propensity (i.e., the tail end of the highsusceptibility side) versus those on the other end.
We first sheared 5000 2D glass samples to a shear strain of 3.0% using the athermal quasi static (AQS) method^{50,51}, such that we can identify all of the particles with extreme (the highest or the lowest 0.5%) D^{2}_{min}, for each of the four different loading protocols (positive and negative simple shear and positive and negative pure shear). In positive (negative) simple shear, the applied strain increment is \(\Delta \gamma _{xy} = \Delta \gamma\) (\(\Delta \gamma _{xy} =  \Delta \gamma\)) and in positive (negative) pure shear, the applied strain increments are \(\Delta \varepsilon _{xx} =  \Delta \varepsilon _{yy} =  \Delta \gamma /2\) (\(\Delta \varepsilon _{xx} =  \Delta \varepsilon _{yy} = \Delta \gamma /2\)), see details in Supplementary Fig. 2 and “Methods”). In this work, following the pioneering work by Falk and Langer^{37} as well as recent machinelearning attempts^{33,34}, we assume that an atom has a high plastic susceptibility if it shows a large D^{2}_{min} (the position of the atom is then a fertile site for shear transformations). The plastic susceptibility is a scalar quantity, but exhibits a different value each time the sample is globally loaded along a different orientation. We then went back to the original configurations before deformation and calculated the SDMs for these two groups of extreme particles that we want to compare. For each group an average SDM was obtained, representing the averaged environment of a representative particle. In the main text, we show and discuss the difference, in terms of the spatial distribution of surrounding S particles, between central S particles having the highest versus the lowest plastic susceptibility. Other cases, including the distribution of L particles around central S, and the distribution of both S and L particles around central L, are displayed in Supplementary Figs. 4–6 only, as they show features similar to that observed for the distribution of S particles around the central S particle. In 3D Cu–Zr glasses, Cu and Zr species correspond to S and L particles, respectively.
It is interesting to observe from Fig. 2a–h that for a particle of highest 0.5% D^{2}_{min} (see Fig. 2a–d), hereafter referred to as “fertile site,” its neighboring shells are more diffuse (wider), and the intensity in the same shell is nonuniform. For the other extreme, the lowest D^{2}_{min} particle (shown in Fig. 2e–h), each shell is sharper with more uniform intensity. In other words, the images do reveal structural differences between particles with high versus low plastic susceptibility. This difference in local environment can be better appreciated by subtracting the maps in Fig. 2a–d with those in Fig. 2e–h. As can be seen from the resultant difference in the SDMs, Fig. 2i–l, for “fertile sites” the surrounding shells are ellipses. The long axis is parallel to the macroscopic elongation direction \({\boldsymbol{\zeta}}\)—the white dashed line in Fig. 2i–l, which is the sum of the unit vector \(\hat {\boldsymbol{\iota}}\) (effective shear direction) and \({\hat {{{\boldsymbol{\eta}}}}}\) (normal to the effective shear plane), and the short axis is perpendicular to \({\boldsymbol{\zeta}}\). In contrast, for particles on the other end of the plastic susceptibility spectrum, their neighboring shells are circles. A schematic highlighting our observation in Fig. 2i–l is included in Supplementary Fig. 7, to better illustrate the critical structural difference.
To further understand the structural difference, we also used an orientational pair correlation function, \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\), to characterize the local atomic packing environment. The \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\) for a single particle i is defined as
where \(\theta _{{\boldsymbol{\zeta}} ,{\mathbf{r}}_{ij}} = {\mathrm{arccos}}\left( { {\hat {\mathbf{{r}}}_{ij}\cdot \hat {\boldsymbol{\zeta}} } } \right)\) is the angle between the vectors \({\boldsymbol{\zeta}}\) and \({\mathbf{r}}_{ij}\) and has its values in the range of [0, \(\pi /2\)], \({r_{ij}} =  {{\mathbf{r}}_{ij}} \), \(\rho\) is the number density of atoms, and \(f_\beta\) is the composition fraction of species \(\beta\) in a sample. For 2D samples, \(\kappa = 4r\Delta r\Delta \theta\), and for 3D samples, \(\kappa = 4\pi r^2\Delta r\left[ {\cos \left( {\theta  0.5\Delta \theta } \right)  \cos \left( {\theta + 0.5\Delta \theta } \right)} \right]\). The \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\) measures the line density of particles/atoms in the region with radial distance of \(r\) from the central atom i, oriented relative to elongation direction \({\boldsymbol{\zeta}}\) at angle \(\theta\) (this can be compared with the usual pair correlation function, which can be regarded as the average of orientational correlations over all \(\theta\)). Again, our goal is to probe the local environment of particles identified to exhibit extreme D^{2}_{min} in 3.0%strained 2D samples. In order to accomplish this, we went back to the original unsheared sample to calculate the \(g_{{\boldsymbol{\zeta}} ,i}\left( {r,\theta } \right)\) for these particles of interest. This curve is then averaged over all such S (or L) particles from 5000 different samples for a given protocol (each sample was sheared four times, with the different protocols). The \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\) plots are shown in Supplementary Fig. 8 for S particles and Supplementary Fig. 9 for L particles. The \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\) curves obtained for particles identified in the various loading protocols show little difference, e.g., (a–d) in the Supplementary Fig. 8 are very similar to one another; they were hence averaged to the one shown in Fig. 2m.
As seen from Fig. 2m, for particles with low plastic susceptibility, the gray curves of \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\) for different \(\theta\) almost overlap on top of each other, which is consistent with the observation that around each of such particles the neighbors tend to form circularly symmetric shells. In stark contrast, for fertilesite particles, the curves of \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\) at different \(\theta\) are quite different. Specifically, their neighboring particles with \(\theta\) close to 0 have longer bond lengths, whereas those with \(\theta\) closer to \(\pi /2\) have shorter bond lengths, see Fig. 2m. Note that other peaks at larger radial distance behave similarly, so only the first peaks are compared in the figures. We carried out similar \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\) analysis for 3D Cu_{50}Zr_{50} model glasses. As shown in Fig. 2n, the structural features are similar to those in 2D glasses.
The analysis based on \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\) unveils that a local configuration would be vulnerable to plastic rearrangement when its longest axis of the elliptical neighboring shells is parallel to the elongation direction of global loading. This identifies the softest direction of a 2D local configuration. But for a 3D local configuration under this same condition, there is still one more degree of freedom: the local configuration can rotate arbitrarily around the elongation direction. Additional conditions are therefore needed to identify the softest direction of 3D local configurations, to take into account this angle of rotation. To this end, we employ another orientational pair correlation function, \(g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)\)
Here, we shift the elongation vector \({\boldsymbol{\zeta}}\) such that it goes through the central atom i of a local configuration and then define a new vector \({\boldsymbol{\chi}} _j\) for each surrounding atom j of the local configuration. The \({\boldsymbol{\chi}} _j\) is perpendicular to \({\boldsymbol{\zeta}}\) and passes through the surrounding atom j of interest. Then, \(\varphi _j\) is the angle between \({\boldsymbol{\chi}} _j\) and the plane (\({\boldsymbol{\iota}} \times {\boldsymbol{\eta}}\)) defined by the shear direction vector (\({\boldsymbol{\iota}}\)) and the normal vector of shear plane (\({\boldsymbol{\eta}}\)), for each surrounding atom j. Thus, the \(g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)_\beta\) depicts neighboring shells which pass through the vector \({\boldsymbol{\zeta}}\) but with different \(\varphi\) relative to the plane of \({\boldsymbol{\iota}} \times {\boldsymbol{\eta}}\). When \(\varphi = 0\), the shells are in the plane of \({\boldsymbol{\iota}} \times {\boldsymbol{\eta}}\), and when \(\varphi = \pi /2\), they are perpendicular to the plane of \({\boldsymbol{\iota}} \times {\boldsymbol{\eta}}\). Here, \(\kappa = 8r^2\Delta r\Delta \varphi\), different from that for \(g_{\boldsymbol{\zeta}} \left( {r,\theta } \right)\) in Eq. (3).
For the 3D glasses, Fig. 2o contrasts the partial \(g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)\) of the Cu atoms having the lowest 0.5% D^{2}_{min} with those having the 0.5% highest for Cu–Cu. These extreme particles were identified in the 1500 Cu_{50}Zr_{50} glasses deformed to 5% strain. For the lowest D^{2}_{min}, all partial \(g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)\) at different \(\varphi\) also overlap with one another, in conjunction with the overlapping partial \(g_{{\boldsymbol{\zeta}} ,i}\left( {r,\theta } \right)\) at different \(\theta\) shown in Fig. 2n. These unequivocally confirm that the neighboring shells of particles with lowest plastic susceptibility exhibit spherical symmetry. In contrast, for the fertile sites with the highest D^{2}_{min}, the surrounding particles inside the plane of \({\boldsymbol{\iota}} \times {\boldsymbol{\eta}}\) tend to have shorter bond lengths, as seen in all the \(g_{{\boldsymbol{\zeta}} ,i}\left( {r,\varphi } \right)\) at small \(\varphi\). Combining this with the insight from Fig. 2n, a 3D local configuration is expected to emerge as a fertile site when (i) the degree of asymmetry of its atomic packing is sufficiently large; (ii) its longest axis is parallel to the elongation direction of the global loading; and (iii) its orientation around the elongation direction axis is in such a way that the neighboring shell inside the plane of \({\boldsymbol{\iota}} \times {\boldsymbol{\eta}}\) coincides with the ellipse having the largest aspect ratio. Putting it another way, in the plane of \({\boldsymbol{\iota}} \times {\boldsymbol{\eta}}\), the larger the ratio of the bond length parallel to the elongation direction to that parallel to the contraction direction (the elongation and contraction directions are perpendicular to each other), the higher the susceptibility to plastic rearrangement.
We emphasize here that these features above provide new insight into the difference in local structure, between particles with high versus low plastic susceptibility. To recapitulate, in both the 2D and 3D glasses, for particles with low plastic susceptibility, their neighbors tend to form circular shells with almost uniform intensity (peak height) at different \(\theta\) and \(\varphi\). In other words, the bond lengths are very similar for each shell, in which the neighbors uniformly distribute. In contrast, packing tends to be more asymmetric surrounding fertile sites, the neighboring shells are ellipses, each with nonuniform particle distribution. For a fertile site to be activated upon a particular mechanical loading, the longest and shortest axes of neighboring shells should be parallel to the elongation and contraction directions, respectively. In other words, a fertile site is the most prone to be activated when its surrounding packing along the elongation direction is the loosest and packing along the contraction direction is the densest, as illustrated by the schematic in Supplementary Fig. 7. This is the structural origin responsible for the anisotropy of local mechanical response in amorphous solids.
With these insights in mind, we expect that the mechanical response of a local configuration will not show much difference when one mirrors the local configuration around the plane perpendicular to the plane of \({\boldsymbol{\iota}} \times {\boldsymbol{\eta}}\) and containing the vector \({\boldsymbol{\zeta}}\) or the vector perpendicular to \({\boldsymbol{\zeta}}\) or the plane of \({\boldsymbol{\iota}} \times {\boldsymbol{\eta}}\) (see details on these mirror operations in Supplementary Fig. 10), as these mirrors will not change the packing density of local configurations along the elongation or contraction direction. To verify this, we loaded three such mirrored configurations of a 2D LJ sample and seven mirrored Cu_{50}Zr_{50} configurations along a constant loading orientation. As expected, D^{2}_{min} of all particles in those mirrored samples are indeed almost identical to those in the corresponding original samples, when the strain is well below global yielding, as shown in Supplementary Fig. 11. In training our CNN model (next section), we will augment our training data with these mirror symmetries. This was found to improve accuracy by ~2%, lending further support to our insight above.
Training procedure and results
The insight from Fig. 2 discussed in the preceding section is important for our understanding of the structural origin of the anisotropy, as it provides a general picture of the main structural features/difference. But our trials found it difficult to translate the new insight into a simple handcrafted indicator that can predict plastic susceptibility with high accuracy. This underscores the need to take into account all the subtle but nontrivial structural features, when a concrete decision is to be made regarding whether a particular particle has high (or low) susceptibility. As will be demonstrated in the work that follows, deep learning is well suited for this purpose, in terms of providing adequate predictive power. Furthermore, the consistency between the CNN prediction with our expectation based on the insight in Fig. 2 will also be confirmed at the end of this subsection. To construct big datasets for the CNN models, we prepared 2D LJ (or 3D Cu_{50}Zr_{50}) glasses with constant a cooling time of 10^{6} t_{0} (or effective quench rate of 10^{10} K/s) from the corresponding wellequilibrium liquid. A total of 5000 (1500) samples were made, each containing 10,000 (32,000) particles. Each sample was then deformed under 4 (24) different loading protocols (see more details about sample preparation and deformation in “Methods”). These 2D (3D) glass models were divided into training, validation, and test datasets, each containing 4900 (1480), 50 (10), and 50 (10) samples, respectively. Since we are interested in identifying particles which will experience extreme (large) plastic rearrangement upon loading, the first step is to do a binary classification task. We will explore the relations between initial static structure and plastic responses at different degrees of strain before the yielding point. Thus, we constructed different datasets for the plastic responses at different degrees of strain. For a specific strain, a particle i is labeled y_{i} = 1 if its D^{2}_{min} is higher than a specific threshold and y_{i} = 0 otherwise. Here, we use f_{thres} to set the D^{2}_{min} threshold at a given shear strain. For example, f_{thres} = 0.5% means that we label particles in the top 0.5% D^{2}_{min} group as y_{i} = 1 and the remainder y_{i} = 0. The shear stress–strain curves and cumulative distribution function (CDF) of D^{2}_{min} are shown in Supplementary Figs. 12 and 13, respectively, for the 2D and 3D glasses. Our main purpose is to reveal how accurate it can be when identifying atoms with large plastic susceptibility a priori from initial static structure in glasses. We therefore choose f_{thres} = 0.5% to show the predictive performance (to ensure sufficiently large datasets, f_{thres} cannot be too small), after trying many different values of f_{thres}. Actually, the advantage of the CNN over other machinelearning methods is more obvious for larger f_{thres}, see more discussions and results in Supplementary Note 3 and Supplementary Fig. 14. Our training, validation, and test datasets were balanced, as in each dataset we included all the y_{i} = 1 particles in each sample deformed under each specific deformation protocol, with the same number of y_{i} = 0 particles selected at random from the rest. In contrast to previous works^{33,36}, we will train a single model for all species in a glass system to improve the user friendliness of machinelearning models; this does not sacrifice much predictive power compared to training different models for different species, as verified in Supplementary Fig. 15 and related discussion in Supplementary Note 4. Note again that the SDM was constructed for particles in the initial undeformed configurations; it is just that these particles were selected based on their D^{2}_{min} response upon straining.
The main interest in the glass community is to see how effective it could be when it comes to separating out fertile sites, i.e., atoms prone to large plastic rearrangement upon deformation. Classification suffices for this purpose. In performing binary classification tasks, the most commonly used metric to measure the predictive performance is “accuracy,” defined as the fraction of the number of images (local configurations) predicted correctly divided by the total number of images in a dataset. The image is predicted correctly, when the actual label y_{i} (1 or 0, determined by the value of D^{2}_{min} from simulations) of an image is identical to the label predicted by the machinelearning model. Since each of our datasets is balanced, i.e., the number of images with actual y_{i} = 0 is equal to that of images with actual y_{i} = 1, the lowest value of accuracy would therefore be 50%. We confirmed that, for each trained model, the validation accuracy and test accuracy are practically the same, suggesting that the trained models did not bias our validation datasets despite the hyperparameters involved, such as CNN architecture, \(r_c\) and Δ. We therefore only report the validation accuracy (or simply “accuracy”) in the following.
Figure 3a, b, for 2D and 3D glasses, respectively, shows the predictive accuracy achieved on validation datasets with our CNN method, the graph neural network (GNN)^{36} method, and the linear support vector machine (SVM)^{33} method, when differentiating particles/atoms with y_{i} = 1 from those with y_{i} = 0 at different strains with a constant f_{thres} = 0.5%. As seen from Fig. 3a for the 2D case, CNN exceeds SVM by far in performance, over the entire strain range we considered; the highest accuracy achieved was as high as 96.27%. This accuracy is very close to the ceiling (see “Methods” for details about the upper bound). Several advantages of CNN make it obviously superior to SVM: (i) CNN can capture more complicated relations between static structure and plastic activity compared to linear SVM, as the latter is merely a linear machinelearning algorithm; (ii) SVM loses some subtle but critical structural information during the conversion from atomic positions to the input features; and (iii) most importantly, for a given local configuration in different loading directions, we constructed images with different coordinate systems (in which the X axis is parallel to the shear direction (\({\boldsymbol{\iota}}\)) and the Y axis is the normal direction of the shear plane (\({\boldsymbol{\eta}}\))). This enabled orientationdependent predictions, whereas the input features to SVM are rotationinvariant, leading always to the same prediction. We observe that GNN produces slightly lower accuracy compared to the accuracy of our CNN when f_{thres} = 0.5% for 2D glasses, as shown in Fig. 3a. This is not surprising, as GNN used atomic positions directly to construct graphs as input, and in training GNN models we rotated edge vectors for different loading directions to take into account the orientation dependency of local plastic response, in contrast to the original procedure from ref. ^{36} (see details in “Methods”). The advantage of CNN over GNN is more obvious for larger f_{thres}, see Supplementary Fig. 14, and for all cases in 3D glasses (see discussion in next paragraph). In addition, the graph input to GNN cannot be used to provide the insights gained from Fig. 2 based on images. Supplementary Fig. 16 shows the particlelevel prediction results of a given 2D glass under different loading protocols for the three machinelearning methods (see more discussion in Supplementary Note 5).
When dealing with 3D Cu_{50}Zr_{50} glasses, the GNN method is also superior to SVM, consistent with the observations in ref. ^{36}. However, we found that our CNN, which has stronger learning capability, is far more powerful than GNN, as shown in Fig. 3b, over the entire strain range. This is because the relation between structure and plastic response in 3D glasses is much more complex than that in 2D glasses. The highest accuracy achieved with CNN is above 90%, in the strain range from 3 to 6%. At small strains, only some of the initial “defects” have been activated and thus D^{2}_{min} cannot reflect the system plastic response very well; on the other hand, at large strains close to global yielding, the role played by initial structural state becomes less dominant. As a result, the correlation between the initial structure and the plastic response peaks at an intermediate value of shear strain. Even at a strain of 10% where global yielding occurs (see Supplementary Fig. 12b), the accuracy is still >83%. This suggests our CNN model may be useful in predicting the location where shear band initiates. This point is beyond the scope of the current work and will be discussed elsewhere. We believe that the CNN method can achieve higher accuracy, especially for 3D glasses, if we use deeper networks. But it will require more computational resources, see more discussion in Supplementary Note 1. Our main purpose here is to demonstrate that the CNN is indeed a powerful method to meet the challenge we are dealing with. To benchmark the advantages of the CNN, a comparison is made against the reference—the fully connected neural networks, as shown in Supplementary Fig. 17.
One would be naturally curious as to how well our CNN models fare, when compared with the prediction based on some previously published physical indicators, such as the participation ratio in soft modes (SMs)^{17,18}, flexibility volume (v_{flex})^{19}, and atomic volume (\(\Omega _a\)). For binary classification tasks, the lowest accuracy of any machinelearning method is always 50%. But if the accuracy is used as the metric to assess the predictive performance of those physical parameters, it would drop below 50%. To avoid such an unfair comparison, we have chosen the following route. For the three machinelearning models, we use the class probability for CNN and GNN, and the distance to the separation boundary predicted by SVM, to denote the predicted plastic susceptibility of each particle in the samples constructed for the test dataset but never involved in the training dataset. Take CNN as an example, the class probability for each input image is a numerical value in the range of (0, 1) provided by the learning algorithm, to represent the probability of having a label of y_{i} = 1 for the input object. For binary classifications, class probability greater than 0.5 predicts y_{i} = 1 and y_{i} = 0 otherwise. For the three physical indicators, the value of each of these indicators is used directly to denote the predicted plastic susceptibility of each particle. Similar to ref. ^{22}, we define a cumulative rank correlation (C_{rank}) to compare the power of these six routes in predicting the particles which will have top 0.5% D^{2}_{min} when a sample is sheared to different strains in a given orientation. As the indicator based on each of the six methods are expected to correlate positively with the plastic response, i.e., larger indicator means larger D^{2}_{min}, here we define
where \(\rho _i\) is the plastic susceptibility of particle i predicted via one of the six routes, \({\mathrm{CDF}}\left( {\rho _i} \right)\) is the cumulative distribution function value for \(\rho _i\), and the bar on top represents the averaging of \({\mathrm{CDF}}\left( {\rho _i} \right)\) over all particles with top 0.5% D^{2}_{min}. The highest (lowest) value of \(C_{{\mathrm{rank}}}\)is around 0.995 (−0.995), which means perfect correlation (anticorrelation), and \(C_{{\mathrm{rank}}} = 0\) means no correlation.
As seen in Fig. 3c, d, the \(C_{{\mathrm{rank}}}\) achieved with our CNN method is almost equal to its highest possible value over a wide range of strain for both the 2D and 3D glasses. The worst case is \(\Omega _a\) for which \(C_{{\mathrm{rank}}}\) is always close to zero across the board, for either the 2D or 3D glasses. For the three datadriven methods, their relative predictive power (the magnitude of \(C_{{\mathrm{rank}}}\)) follows an order similar to that observed in Fig. 3a, b. The CNN prediction stands out to be the best. Note that the predictive power generally decreases with increasing strain for physical indicators. In this regard, our CNN has a major advantage over other methods at larger strains, which is important for monitoring the correlation between structure and mechanical response of amorphous solids. As SM analysis on large 3D samples is too computationally expensive, we used instead Cu_{50}Zr_{50} glasses containing 10,000 atoms. Such smaller samples were also analyzed via CNN, to allow headtohead comparison (included in Fig. 3d). The consistency in \(C_{{\mathrm{rank}}}\) obtained with our CNN models on smaller and larger samples (red circles almost overlap with white pentagons in Fig. 3d) indicates the robustness of our CNN models with no samplesize dependence. This is not a big surprise, as the r_{c} we used is sufficiently large, and so are our training samples.
In addition, we also used a different and extremely stringent criterion, an overlap ratio, \({\Re}\), to gauge the fraction of correctly predicted particles. \({\Re}\) is evaluated by dividing the number of particles belonging simultaneously to two groups that are intersecting (partially overlapping), with the number of particles known to have the top 0.5% D^{2}_{min}. Here, the intersection/overlapping particles are those that fall into not only the group known to have the top 0.5% D^{2}_{min}, but at the same time also the group having the top 0.5% plastic susceptibility predicted using that particular method/indicator. In the ideal case, the particles predicted to have the top 0.5% plastic susceptibility are expected to be the same particles known to have the top 0.5% D^{2}_{min} (i.e., all the particles in that pool). As such, the closer to 1.0 the \({\Re}\) ratio, the higher the predictive capability. The corresponding results are displayed in Supplementary Fig. 18, which shows a similar trend as in Fig. 3c, d. This further confirms the advantage of our CNN method as the most robust of the six, disregarding the metric adopted to characterize the strength of correlation. Under this extremely stringent criterion, the advantage of CNN over GNN is evident on the 2D glasses, see Supplementary Fig. 18a.
To demonstrate that the orientation dependence has been clearly captured, we show in Fig. 4 the CNNpredicted plastic susceptibility for a 2D local configuration when it is rotated gradually: the rotationally noninvariant SDM turns out to be highly sensitive to minor orientation change. In its initial orientation (\(\theta = 0^\circ\)), when its loosest and densest packing directions align with the elongation (red dashed line) and contraction (black dashed line) directions of mechanical loading, respectively, CNN predicts the highest plastic susceptibility (class probability very close to 1.0). With the counterclockwise rotation, these packing directions gradually misalign with the loading ones, and the predicted plastic susceptibility goes down. At \(\theta \cong 90^\circ\), the loosest packing of the configuration lines up with the contraction direction whereas the densest packing with the elongation direction, class probability drops to a value very close to zero. When \(\theta\) increases to ~180°, the favorable alignment comes back, such that the plastic susceptibility is predicted to be the highest again. Likewise, the lowest reemerges when \(\theta = 270^\circ\). This sensitivity to the coupling between the anisotropic local structure and the loading direction demonstrates that, when we feed the images (configurations) into CNN, the algorithm is very good at recognizing this coupling and churning out predictions consistent with the insight in Fig. 2. The blue dotted line in the polar plot of Fig. 4 represents D^{2}_{min} of the local configuration at different θ (see “Methods”), which almost matches the CNN prediction. This underscores the benefit and advance enabled by our approach to innovate a new structure representation and combine it with CNN.
Generalize to different processing history or compositions
Our previous work^{35} demonstrated that the SVM models, trained on samples having the same processing history at a single composition, can be generalized to samples with different processing histories or at different compositions in the same alloy system. To see if such a generalization is possible with our CNN models, 50 2D samples were quenched over different time periods in the range from 10^{0} to 10^{6} t_{0}. For 3D glasses, we also quenched 10 Cu_{50}Zr_{50} samples with effective cooling rates in the range from 10^{9} to 10^{13} K/s, and 10 Cu_{x}Zr_{100x} glasses with different compositions (x = 20, 25, …, 80) at the same effective cooling rate of 10^{10} K/s. Each of these 2D (3D) samples was sheared using 4 (24) different loading protocols with the AQS method. We then constructed test datasets for each processing history and each composition, by selecting all of the atoms with D^{2}_{min} above a threshold and a same number of atoms randomly from the rest. The accuracy achieved on these datasets, using the CNN model previously trained at a single composition for a particular processing history (see caption and “Methods”), is presented in Fig. 5a–c. We find a very high accuracy across a wide composition range (Fig. 5c), especially for samples with slower cooling (Fig. 5a, b). For example, for 3D Cu_{50}Zr_{50} quenched at cooling rate of 10^{9} K/s, the highest accuracy achieved is 92.40%. The decrease in accuracy for faster cooling rate was observed before^{22,35,38,52}, as the critical structural difference becomes increasingly difficult to distinguish^{38}. The more relaxed glasses, on the other hand, have a much smaller population of fertile sites that exhibit larger shear susceptibility, making the recognition easier and prediction more robust.
This transferability of the model offers the desirable applicability to glasses across different compositions and/or different processing histories, to compare plastic susceptibility. An example is shown in Supplementary Fig. 19a (for 2D) and b (for 3D). Here, the glass structure is generically characterized by the fraction of predicted “y_{i} = 1” particles out of all particles. This fraction, f_{py=1} (the subscript means “predicted (labeled) y = 1”), is given by the fraction of particles with class probability > 0.5, and scales almost linearly with the average class probability. These plots reveal a strong correlation between the glass structure and shear modulus (G), found to be robust at any strain as long as it is well below that corresponding to global yielding.
A possible reason for the transferability is that our training datasets, although from one composition and the same processing history, are large enough to include almost all of the possible local environments for samples with different processing histories and/or at different compositions within the same alloy system, as suggested in our previous work^{35}. Achieving transferability across different alloy systems while keeping high predictive capability requires more exploration but it is beyond the scope of the current work.
Discussion
Our results presented in this paper bring about several advances over previous efforts to identify structural “defects” in amorphous solids, i.e., particles that are most prone to rearrangements, in particular shear transformations. First of all, a qualitative but crucial structural difference, in terms of a more nonuniform and asymmetric packing environment, was discovered for atoms with high plastic susceptibility relative to those with low plastic susceptibility. This difference is found to be the origin of the anisotropic local mechanical response in amorphous solids, contributing to the full resolution of this longstanding puzzle. Our expectation based on this insight, starting from Fig. 2, is reflected in the eventual CNN predictions, as highlighted in Fig. 4. This new insight goes beyond that obtained based on the previous concept of geometrically unfavored motifs (GUMs)^{18}. The GUMs refer to coordination polyhedra, an elementary but incomplete structural representation, especially since the local property is known to be influenced by the surroundings beyond the first and second nearest neighbors. In addition, it is difficult to identify the softest shear direction merely based on the local GUMs. The insight presented in this work may be assessed from experiments if the orientational pair correlation functions can be mapped out in future experiments. Second, we achieved unprecedented accuracy in predicting orientationdependent local mechanical response over a wide range of shear strain, by designing a rotationally noninvariant structure representation coupled with a powerful deep learning algorithm, as highlighted in Fig. 3. This accomplishes what is left unsolved in recent attempts invoking vectors, symmetry, and orientational order^{53,54,55}, which only consider some aspects of the topological information in the first nearestneighbor shell and lack the capacity to handle the coupling with the direction of the imposed shear. Here, we have used the SDM and the orientational pair correlation functions to cover the entire local environment of an atom—the full gene of the glass structure beyond the first neighboring shell. Because our representation is no longer rotationinvariant, our approach unveils the predictive power of the static structure of amorphous solids in forecasting directiondependent local plastic response. We are aware that factors beyond static structures, including shortterm dynamics, are also relevant to relaxation responses at finite temperatures. However, there has been a pressing need to gauge how much predictive power can be garnered, from the initial static structure alone. This is a question that the community wants a clear answer for, as the static structure is usually the information that becomes available first. Third, once the optimized CNN model is in hand, all that is needed to predict plastic response is atomic positions, without relying on other knowledge such as interparticle interactions required by previous approaches using parameters based on physical properties. Fourth, we have demonstrated that the CNN models trained on a single glass (with one set of a particular composition and a specific processing history) can be generalized to samples with different processing histories or at different compositions in the same alloy system. This is important for probing into the effects of processing procedure or chemical composition on properties, enabling the comparison between different glasses. These four merits open new avenues to the understanding of the structure–property relations in amorphous solids. Fifth and finally, we anticipate that our novel structure representation (SDM), in combination with the powerful CNN method, would find use in the studies of other amorphous matter, beyond metallic glasses, as well as in predicting rotationinvariant properties of crystalline materials, via data augmentation^{41}.
Methods
The data used for training and validating deep learning models are from molecular dynamics simulations^{56}, which are implemented in the LAMMPS package^{57}.
Preparation of 2D model glasses
2D binary glasses are composed of equalmassed (m) small (S) and large (L) particles which interact via a standard 6–12 LJ potential. The full details of the potential are presented elsewhere^{45}. We chose our composition such that the ratio between the species is N_{L}:N_{S} = \(\left( {1 + \sqrt 5 } \right):4\) to be consistent with previous studies^{14,22,37} of this system and each of the sample contains N = 10,000 particles. All units will be expressed in terms of m as well as \({\it{\epsilon }}\) and \(\sigma\), the parameters describing the energy and length scale, respectively, of the interparticle interaction. The characteristic time is \(t_0 = \sigma \sqrt {m/\varepsilon }\). The glass transition temperature T_{g} of this system is known to be 0.325\(\varepsilon /k\)^{14,22,37}, where \(k\) is the Boltzmann constant. Periodic boundary conditions were imposed on square boxes of linear dimension 98.8045\(\sigma\). The density of the system (\(N/L^2 \approx 1.02\sigma ^{  2}\)) was kept constant. A total of 5000 2D glass samples were each obtained by continuously decreasing the temperature from a liquid state, well equilibrated at 1.08 T_{g} and quenched to a lowtemperature (0.092 T_{g}) solid state over a period of 10^{6} t_{0} using a Nose–Hoover thermostat. Then, a static relaxation using conjugate gradient algorithm was applied to bring the system to mechanical equilibrium before conducting deformation simulation^{22}. Particle positions in this state were used to construct structure representations. Our training, validation, and test datasets contained 4900, 50, and 50 samples, respectively. To demonstrate that the CNN models are valid for the glasses with different processing histories, 50 additional samples were quenched from 1.08 T_{g} to 0.092 T_{g} over different period in the range from 10^{5} t_{0} to 10^{0} t_{0}.
Deforming 2D model glasses
Each of the 2D glasses was deformed under four loading conditions (positive and negative simple shear and positive and negative pure shear) with an AQS method^{50,51}. In positive (negative) simple shear, the applied strain increment is \(\Delta \gamma _{xy} = \Delta \gamma\) (\(\Delta \gamma _{xy} =  \Delta \gamma\)), and in positive (negative) pure shear, the applied strain increments are \(\Delta \varepsilon _{xx} =  \Delta \varepsilon _{yy} =  \Delta \gamma /2\) (\(\Delta \varepsilon _{xx} =  \Delta \varepsilon _{yy} = \Delta \gamma /2\)). Here, \(\Delta \gamma\) = 10^{5}. After each deformation increment, the system was relaxed to its mechanical equilibrium.
Mechanical response of a local 2D configuration at arbitrary orientation
To probe the mechanical response of a local 2D configuration at arbitrary orientation, i.e., the blue dotted curve in the Fig. 4, we deleted particles^{58} outside the largest circular region from the original square box. Next, we rotated the circular configuration by various angles counterclockwise, and then conducted simple shear on it with the AQS method. During deformation, the particles in the outer annulus (wall) of width 5\(\sigma\) (two times of potential cutoff) only experience applied affine displacement, i.e., they are not allowed to relax such that the periodic boundary conditions are lost.
Preparation of 3D model glasses
The 3D Cu_{50}Zr_{50} metallic glasses were simulated with an optimized EAM potential adopted from ref. ^{46}. Each of the 1500 samples contained 32,000 atoms in a cubic box. After adequate equilibration at 2000 K, the liquid was first quenched to 1500 K at a cooling rate of 10^{13} K/s, followed by 10^{12} K/s to 1000 K, then at the desired rate of 10^{10} K/s (effective cooling rate, in the text) to 500 K, and finally at 10^{13} K/s to 50 K in the NPT ensemble using a Nose–Hoover thermostat with zero external pressure. The periodic boundary condition was applied in all three directions. The samples were then brought to mechanical equilibrium through a static relaxation via a conjugate gradient algorithm. Our training, validation, and test datasets for the 3D system contain 1480, 10, and 10 samples, respectively. To demonstrate that the CNN models are valid for glasses with different processing histories or at different compositions, 10 Cu_{50}Zr_{50} glasses were also quenched using different effective cooling rates in the range from 10^{9} to 10^{13} K/s, and at each different composition (Cu_{x}Zr_{100x,} x = 20, 25 …, 80, in step of 5), 10 metallic glasses were quenched with an effective cooling rate of 10^{10} K/s.
Deforming 3D model glasses
Before extracting atomic coordinates to construct structure representation and conducting deformation simulation in each of the 24 loading orientations listed in Supplementary Table 1, we rotated each configuration such that the shear direction and the normal direction of the shear plane were parallel to X and Y axes, respectively, of the coordinate system. For some loading orientations, appropriate atom replication was needed after rotation, which is illustrated in Supplementary Fig. 20. Each of the 3D glasses was deformed in simple shear along the 24 different loading orientations via an AQS method^{50,51}. After each applied strain increment of \(\Delta \gamma _{xy}\) = 10^{−4}, the system was relaxed to its mechanical equilibrium. We calculated D^{2}_{min} per definition in ref. ^{37}, implemented in the OVITO package^{59}.
Upper bound of the prediction accuracy
To assess this ceiling, we make use of the particles from the test dataset, which were already labeled after deformation. Another deformation simulation was then conducted, after shuffling the sequence of all particles, as a new job run on a different computer. The resultant new configuration was used to compute D^{2}_{min} of all the particles, which were labeled again. The number of particles always having the same label, determined based on both the old and the new D^{2}_{min}, was divided by the total number of particles in the dataset, to give the upper bound of accuracy that can be reached via any method. Note that numerical error, when the atom sequence in the initial configuration file is changed although all other deformation conditions are kept the same, renders deformation under the AQS not exactly reproducible. This error also grows with increasing strain.
Training procedure of CNN models
For the CNN models, we minimized the binary crossentropy loss between true labels and predicted labels, and used early stopping and selected the model with the highest accuracy in the validation dataset. The learning rate started from 0.001 and was then divided by \(\sqrt {10}\) once the validation accuracy plateaued. We chose RMSprop optimizer and minibatch size of 512 and 64, for the 2D and 3D glasses, respectively. We augmented the training data by applying randomly one of the 4 (or 8) mirror symmetries to the images of 2D (or 3D) glasses every time we fed the images into the CNN. The training was implemented in the TensorFlow package^{60}. And we conducted distributed deep learning with the Horovod package^{61} to accelerate the training for 3D glasses.
GNN methods
We followed the procedure in ref. ^{36} to construct graph with edge threshold of 2.0\(\sigma\) (and 4.0 Å) for 2D (and 3D) glasses. We used the same GNN architecture as in ref. ^{36}, except that we activated the output layer with a sigmoid function as we were performing binary classification tasks, and chose n_{rec} = 7 and 6 for 2D and 3D glasses, respectively, according to the optimization results shown in Supplementary Fig. 21. We also tried more neurons in all multilayer perceptors, but this did not further improve the validation accuracy.
For the 2D (3D) samples in both the test and validation datasets, we loaded each sample, and constructed corresponding graph, 4 (24) times, each graph corresponding to one of the 4 (24) loading protocols. And we updated edge vectors of those graphs using relative atomic positions in the new coordinate system, the X and Y axes of which were parallel to the shear direction and the normal of shear plane, respectively, of each specific loading protocol. Atoms were labeled based on the magnitude of D^{2}_{min} under the corresponding loading protocol. When evaluating the loss function and accuracy for the GNN models, only the atoms in the test and validation datasets were considered. Therefore, both test and validation datasets for the GNN are exactly the same as those for our CNN. For the 2D (3D) samples in training datasets, we constructed only one graph for each sample in order to relieve memory burden and save computation time. However, we took into account all loading directions for each of the 4900 (1480) training examples by updating at random their target labels with D^{2}_{min} from one of the 4 (24) loading protocols and by rotating edge vectors correspondingly every time we fed them (both targets and graphs) to the network (note that in our earlier attempts, we tried to directly construct and load multiple graphs for each training sample, but the results obtained were similar). Also, we augmented the training data by applying randomly one of the 4 (or 8) mirror symmetries to the edge vectors of graphs of 2D (or 3D) glasses every time we fed the graphs to the GNN. When optimizing GNN models and evaluating loss function and accuracy, only the atoms in the training datasets were used to enable balanced datasets and a fair comparison with our CNN.
To minimize the binary crossentropy loss between true labels and predicted labels, similarly to ref. ^{36}, we used early stopping and selected the model with the highest accuracy on the validation dataset after running enough epochs. We trained the GNN models with a learning rate of 10^{−4}, gradient clipping, Adam optimizer, and a TensorFlow implementation.
SVM methods
Same as in ref. ^{33}, we used the following radial and angular structural functions to construct input features to SVM models:
where \(r_{ij}\) denotes the distance between atoms i and j, and \(\alpha\) or \(\beta\) denote the species, \(\theta _{ijk}\) is the angle between vector \({\mathbf{{r}}}_{ij}\) and \({\mathbf{{r}}}_{ik}\), and \(r \in [r_0 + 0.5\Delta ,r_c)\) with an increment of \(\Delta\). We use \(r_0\) = 0.5\(\sigma\), \(r_c\) = 5.0\(\sigma\), and \(\Delta\) = 0.025\(\sigma\), and \(r_0\) = 2.0 Å, \(r_c\) = 8.0 Å, and \(\Delta\) = 0.050 Å for 2D and 3D glasses, respectively, based on the findings in Supplementary Fig. 22. For the parameters in the angular structural functions, similar to ref. ^{36}, we used \(\lambda \in \left\{ {  1,1} \right\},\varsigma \in \left\{ {1,2,4,8,16} \right\}\), and 16 values of \(\xi\) equally spaced between 0.75\(\sigma\) and 5.0\(\sigma\) (between 3.0 and 8.0 Å) for 2D (3D) glasses. We tried to use 4, 8, 16, or 32 values for \(\xi\) in the given region but found only a weak dependency. We also tried to use exactly the same angular parameters as in the original work^{33}, but the results came out slightly worse.
We conducted L2regularized L2loss support vector classifications with a linear kernel, which were executed in the LIBLINEAR package^{62} by solving the primal optimization problem. A regularization parameter of C = 100 was used, as we tried many different values over the range from 10^{−7} to 10^{5} and found only a weak dependency once C was above 10. We also tuned the termination criterion, which did not further improve the accuracy on validation datasets. Note that 500 (100) samples were sufficient to construct training datasets for 2D (3D) glasses, because training the SVM models does not require large datasets, as demonstrated in Supplementary Fig. 23.
Supplementary Tables 2 and 3, for 2D and 3D glasses, respectively, present a summary of the input features and trainable parameters of the three machinelearning models.
Calculating the flexibility volume
The flexibility volume is defined as the product of vibrational mean squared displacement (MSD) and atom spacing^{19}. We calculated MSD at 0.092 T_{g} for 2D glasses and at 50 K for 3D glasses, using the procedure in ref. ^{19}. The MSD was averaged over 100 independent runs, all starting from the same configuration but different initial velocities. And the atomic volume is calculated based on Voronoi analysis.
SM analysis
The normal mode analysis of the glasses was carried out by diagonalizing the dynamical matrix of the inherent structure of glasses. The participation fraction of atom i in eigenmode \(e_\omega\) is \(p_i = \left {\vec e_\omega ^i} \right^2\), where \(\vec e_\omega ^i\) is the corresponding polarization vector of atom i^{63}. Here, \(p_i\) was summed over a small fraction of 1% (same as that in ref. ^{18}) of the lowestfrequency normal modes. \(p_i\) measures the involvement of atom i in SMs.
Data availability
The data that support the findings of this study are available from the corresponding author on request.
Code availability
We used open source packages LAMMPS and TensorFlow for our molecular dynamics simulations and deep learning, respectively. All custom codes are available from the corresponding author upon request.
References
 1.
Cottrell, A. H. The Mechanical Properties of Matter (Wiley, 1964).
 2.
Hirth, J. & Lothe, J. Theory of Dislocations (John Wiley & Sons, 1982).
 3.
Schuh, C. A., Hufnagel, T. C. & Ramamurty, U. Mechanical behavior of amorphous alloys. Acta Mater. 55, 4067–4109 (2007).
 4.
Cheng, Y. Q. & Ma, E. Atomiclevel structure and structure–property relationship in metallic glasses. Prog. Mater. Sci. 56, 379–473 (2011).
 5.
Greer, A. L., Cheng, Y. Q. & Ma, E. Shear bands in metallic glasses. Mater. Sci. Eng.: R: Rep. 74, 71–132 (2013).
 6.
Ma, E. Tuning order in disorder. Nat. Mater. 14, 547–552 (2015). 2015.
 7.
Hufnagel, T. C., Schuh, C. A. & Falk, M. L. Deformation of metallic glasses: Recent developments in theory, simulations, and experiments. Acta Mater. 109, 375–393 (2016).
 8.
Spaepen, F. A microscopic mechanism for steady state inhomogeneous flow in metallic glasses. Acta Metall. 25, 407–415 (1977).
 9.
Srolovitz, D., Maeda, K., Vitek, V. & Egami, T. Structural defects in amorphous solids statistical analysis of a computer model. Philos. Mag. A 44, 847–866 (1981).
 10.
Egami, T. Atomic level stresses. Prog. Mater. Sci. 56, 637–653 (2011).
 11.
Johnson, W. L. & Samwer, K. A universal criterion for plastic yielding of metallic glasses with a (T/T_{g})^{2/3} temperature dependence. Phys. Rev. Lett. 95, 195501 (2005).
 12.
Johnson, W. L., Demetriou, M. D., Harmon, J. S., Lind, M. L. & Samwer, K. Rheology and ultrasonic properties of metallic glassforming liquids: a potential energy landscape perspective. MRS Bull. 32, 644–650 (2007).
 13.
Tsamados, M., Tanguy, A., Goldenberg, C. & Barrat, J.L. Local elasticity map and plasticity in a model LennardJones glass. Phys. Rev. E 80, 026112 (2009).
 14.
Shi, Y. & Falk, M. L. Strain localization and percolation of stable structure in amorphous solids. Phys. Rev. Lett. 95, 095502 (2005).
 15.
Shi, Y. & Falk, M. L. Atomicscale simulations of strain localization in threedimensional model amorphous solids. Phys. Rev. B 73, 214201 (2006).
 16.
Peng, H. L., Li, M. Z. & Wang, W. H. Structural signature of plastic deformation in metallic glasses. Phys. Rev. Lett. 106, 135503 (2011).
 17.
Manning, M. L. & Liu, A. J. Vibrational modes identify soft spots in a sheared disordered packing. Phys. Rev. Lett. 107, 108302 (2011).
 18.
Ding, J., Patinet, S., Falk, M. L., Cheng, Y. & Ma, E. Soft spots and their structural signature in a metallic glass. PNAS 111, 14052–14056 (2014).
 19.
Ding, J. et al. Universal structural parameter to quantitatively predict metallic glass properties. Nat. Commun. 7, 13733 (2016).
 20.
Fan, Z., Ding, J., Li, Q.J. & Ma, E. Correlating the properties of amorphous silicon with its flexibility volume. Phys. Rev. B 95, 144211 (2017).
 21.
Fan, Z., Ding, J. & Ma, E. Making glassy solids ductile at room temperature by imparting flexibility into their amorphous structure. Mater. Res. Lett. 6, 570–583 (2018).
 22.
Patinet, S., Vandembroucq, D. & Falk, M. L. Connecting local yield stresses with plastic activity in amorphous solids. Phys. Rev. Lett. 117, 045501 (2016).
 23.
Zylberg, J., Lerner, E., BarSinai, Y. & Bouchbinder, E. Local thermal energy as a structural indicator in glasses. PNAS 114, 7289–7294 (2017).
 24.
Xu, B., Falk, M. L., Li, J. F. & Kong, L. T. Predicting shear transformation events in metallic glasses. Phys. Rev. Lett. 120, 125503 (2018).
 25.
Liu, C., Guan, P. & Fan, Y. Correlating defects density in metallic glasses with the distribution of inherent structures in potential energy landscape. Acta Mater. 161, 295–301 (2018).
 26.
Peng, C.X. et al. Bond length deviation in CuZr metallic glasses. Phys. Rev. B 96, 174112 (2017).
 27.
Steinhardt, P. J., Nelson, D. R. & Ronchetti, M. Bondorientational order in liquids and glasses. Phys. Rev. B 28, 784–805 (1983).
 28.
Lewandowski, J. J., Wang, W. H. & Greer, A. L. Intrinsic plasticity or brittleness of metallic glasses. Philos. Mag. Lett. 85, 77–87 (2005).
 29.
Qiao, J. C. et al. Fast secondary relaxation and plasticity initiation in metallic glasses. Natl Sci. Rev. 5, 616–618 (2018).
 30.
Yang, Q., Peng, S.X., Wang, Z. & Yu, H.B. Shadow glass transition as a thermodynamic signature of β relaxation in hyperquenched metallic glasses. Natl Sci. Rev. 7, 1896–1905 (2020).
 31.
Kumar, G., Neibecker, P., Liu, Y. H. & Schroers, J. Critical fictive temperature for plasticity in metallic glasses. Nat. Commun. 4, 1536 (2013).
 32.
Zhu, F., Song, S., Reddy, K. M., Hirata, A. & Chen, M. Spatial heterogeneity as the structure feature for structure–property relationship of metallic glasses. Nat. Commun. 9, 3965 (2018).
 33.
Cubuk, E. D. et al. Identifying structural flow defects in disordered solids using machinelearning methods. Phys. Rev. Lett. 114, 108001 (2015).
 34.
Wang, Q. & Jain, A. A transferable machinelearning framework linking interstice distribution and plastic heterogeneity in metallic glasses. Nat. Commun. 10, 5537 (2019).
 35.
Fan, Z., Ding, J. & Ma, E. Machine learning bridges local static structure with multiple properties in metallic glasses. Mater. Today 40, 48–62 (2020).
 36.
Bapst, V. et al. Unveiling the predictive power of static structure in glassy systems. Nat. Phys. 16, 448–454 (2020).
 37.
Falk, M. L. & Langer, J. S. Dynamics of viscoplastic deformation in amorphous solids. Phys. Rev. E 57, 7192–7205 (1998).
 38.
Barbot, A. et al. Local yield stress statistics in model amorphous solids. Phys. Rev. E 97, 033001 (2018).
 39.
SchwartzmanNowik, Z., Lerner, E. & Bouchbinder, E. Anisotropic structural predictor in glassy materials. Phys. Rev. E 99, 060601 (2019).
 40.
Xu, B., Falk, M. L., Patinet, S. & Guan, P. Atomic Nonaffinity as a Predictor of Plasticity in Amorphous Solids. https://arxiv.org/abs/1905.12035 (2020).
 41.
Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. in Advances in Neural Information Processing Systems 25 (eds Pereira, F., Burges, C. J. C., Bottou, L. & Weinberger, K. Q.) 1097–061105 (Curran Associates, Inc., 2012).
 42.
Simonyan, K. & Zisserman, A. Very Deep Convolutional Networks for Largescale Image Recognition. https://arxiv.org/abs/1409.1556 (2015).
 43.
He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition 770–778 (2016).
 44.
LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
 45.
Lançon, F., Billard, L. & Chaudhari, P. Thermodynamical properties of a twodimensional quasicrystal from molecular dynamics calculations. EPL 2, 625–629 (1986).
 46.
Cheng, Y. Q., Ma, E. & Sheng, H. W. Atomic level structure in multicomponent bulk metallic glass. Phys. Rev. Lett. 102, 245501 (2009).
 47.
Behler, J. & Parrinello, M. Generalized neuralnetwork representation of highdimensional potentialenergy surfaces. Phys. Rev. Lett. 98, 146401 (2007).
 48.
Ioffe, S. & Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167 (2015).
 49.
Argon, A. S. Plastic deformation in metallic glasses. Acta Metall. 27, 47–58 (1979).
 50.
Malandro, D. L. & Lacks, D. J. Molecularlevel mechanical instabilities and enhanced selfdiffusion in flowing liquids. Phys. Rev. Lett. 81, 5576–5579 (1998).
 51.
Maloney, C. & Lemaître, A. Universal breakdown of elasticity at the onset of material failure. Phys. Rev. Lett. 93, 195501 (2004).
 52.
Richard, D. et al. Predicting Plasticity in Disordered Solids from Structural Indicators. https://arxiv.org/abs/2003.11629 (2020).
 53.
Rieser, J. M., Goodrich, C. P., Liu, A. J. & Durian, D. J. Divergence of voronoi cell anisotropy vector: a thresholdfree characterization of local structure in amorphous materials. Phys. Rev. Lett. 116, 088001 (2016).
 54.
Milkus, R. & Zaccone, A. Local inversionsymmetry breaking controls the boson peak in glasses and crystals. Phys. Rev. B 93, 094204 (2016).
 55.
Yang, J. et al. Structural parameter of orientational order to predict the boson vibrational anomaly in glasses. Phys. Rev. Lett. 122, 015501 (2019).
 56.
Brooks, C. L. Computer simulation of liquids. J. Solut. Chem. 18, 99–99 (1989).
 57.
Plimpton, S. Fast parallel algorithms for shortrange molecular dynamics. J. Comput. Phys. 117, 1–19 (1995).
 58.
Gendelman, O., Jaiswal, P. K., Procaccia, I., Gupta, B. S. & Zylberg, J. Shear transformation zones: state determined or protocol dependent? EPL 109, 16002 (2015).
 59.
Stukowski, A. Visualization and analysis of atomistic simulation data with OVITO—the open visualization tool. Model. Simul. Mater. Sci. Eng. 18, 015012 (2010).
 60.
Abadi, M. et al. Tensorflow: A system for largescale machine learning. In 12th {USENIX} symposium on operating systems design and implementation ({OSDI} 16) 265–283 (2016).
 61.
Sergeev, A. & Del Balso, M. Horovod: Fast and Easy Distributed Deep Learning in TensorFlow. https://arxiv.org/abs/1802.05799 (2018).
 62.
Fan, R.E., Chang, K.W., Hsieh, C.J., Wang, X.R. & Lin, C.J. LIBLINEAR: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008).
 63.
WidmerCooper, A., Perry, H., Harrowell, P. & Reichman, D. R. Irreversible reorganization in a supercooled liquid originates from localized soft modes. Nat. Phys. 4, 711–715 (2008).
Acknowledgements
The authors gratefully acknowledge Prof. M. L. Falk for discussions and his insightful comments on the manuscript, and V. Bapst for the useful discussions on training the GNN model. The work is supported at JHU by the U.S. Department of Energy (DOE) DOEBESDMSE under grant DEFG0219ER46056. This research used the CPU resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DEAC0205CH11231, and the Maryland Advanced Research Computing Center (MARCC). The authors also acknowledge the Texas Advanced Computing Center (TACC) at The University of Texas at Austin for providing GPU resources that have contributed to the research results reported within this paper. The early attempts of deep learning used the GPU resource from MARCC.
Author information
Affiliations
Contributions
Both authors conceived the research project. Z.F. designed the structure representation, conducted the MD simulations, and carried out the machine/deep learning analysis. Both authors contributed to the discussions and the writing of the paper. E.M. left the project in July 2020.
Corresponding author
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Communications thanks Victor Bapst, HaiBin Yu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Fan, Z., Ma, E. Predicting orientationdependent plastic susceptibility from static structure in amorphous solids via deep learning. Nat Commun 12, 1506 (2021). https://doi.org/10.1038/s4146702121806z
Received:
Accepted:
Published:
Further reading

Tension–compression asymmetry in amorphous silicon
Nature Materials (2021)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.