Machine learning in materials genome initiative: A review
Yingli Liua,b, Chen Niua, Zhuo Wang,c,f,a,*,1, Yong Gand, Yan Zhua, Shuhong Sune, Tao Shen,a,b,**,1
aFaculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, China
bComputer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650500, China
cLight Alloy Research Institute, Central South University, Changsha, 410083, China
dChinese Academy of Engineering, Beijing, 100088, China
eFaculty of Materials Science and Engineering, Kunming University of Science and Technology, Kunming, 650093, China
fChengdu MatAi Technology Co., Ltd, Chengdu, 610041, China
Corresponding authors: * Light Alloy Research Institute, Central South University, Changsha, 410083, China. E-mail address: wangzhuo@mat.ai (Z. Wang). ** Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, China. E-mail address: shentao@kust.edu.cn (T. Shen).
Discovering new materials with excellent performance is a central goal of the materials genome initiative. Traditional experiments and calculations often consume large amounts of time and money and are also limited by various conditions. Therefore, it is imperative to develop new methods to accelerate the discovery and design of new materials. In recent years, material discovery and design methods based on machine learning have attracted much attention from materials experts and have made considerable progress. This review first outlines available materials databases and material data analytics tools and then elaborates on the machine learning algorithms used in materials science. Next, the application fields of machine learning in materials science are summarized, focusing on structure determination, performance prediction, fingerprint prediction, and new material discovery. Finally, the review points out the problems of data and machine learning in materials science and outlines directions for future research. Using machine learning algorithms, the authors hope that remarkable results can be achieved in material discovery and design.
Yingli Liu, Chen Niu, Zhuo Wang, Yong Gan, Yan Zhu, Shuhong Sun, Tao Shen. Machine learning in materials genome initiative: A review. Journal of Materials Science & Technology, 2020, 57: 113-122. DOI: 10.1016/j.jmst.2020.01.067
1. Introduction
At the beginning of the 21st century, Ceder, who worked at the Massachusetts Institute of Technology (MIT; Cambridge, MA, USA), was influenced by high-throughput, data-driven material discovery methods and inspired by the Human Genome Project. He considered whether materials scientists could learn from the experience of geneticists. In 2006, Ceder launched the Materials Genomics project at MIT, using an improved data-mining algorithm to predict lithium-based materials for electric vehicle batteries. By 2010, the project had evolved to include approximately 20,000 predicted compounds [1]. At the same time, at Duke University (Durham, NC, USA), Curtarolo established the Materials Genomics Center, focusing on metal alloy research. His team gradually extended the 2003 algorithm and library [2] into AFLOW (Automatic Flow), a system that can calculate known crystal structures and automatically predict new structures [3]. Despite this, computational materials science was still not widely valued by experts until the U.S. White House announced hundreds of millions of dollars of funding for the Materials Genome Initiative (MGI) in June 2011 [1].
As highlighted by the MGI, theory and simulation can enhance the efficiency of materials discovery and design. On the other hand, a purely theoretical and simulation approach to materials discovery and design is often tricky to realize experimentally due to complications in synthesis, existence, and production of defects and possible metastable states. To bridge this gap, all aspects of theory and simulation must be tightly integrated with experiments and data [4]. High-throughput density functional theory has become a universal tool for material discovery [[5], [6], [7]], but its limitation is that computational costs are high [8]. The need now is to overcome this limitation to take advantage of the latest advances in data and information science [9].
Machine learning [10] is an interdisciplinary subject lying between computer science and statistics. It is the core of artificial intelligence and data science, and its rise has been driven by the increasing availability of online data and low-cost computation [11]. Although machine learning is a rapidly evolving technology and some of its algorithms are bound to become outdated, the importance of using scientific reasoning to find reliable structure, property, and processing models will never be outdated [12]. The types of materials are diverse, and the factors affecting the results are complex; the relationships in the data therefore need to be clarified, and machine learning methods are good at discovering and establishing connections among numerous data points. The introduction of machine learning methods is beneficial for understanding and discovering the correlations between experimental parameters and material performance.
The main processes of machine learning may be divided into data preparation, descriptor selection, algorithm/model selection, model prediction, and model application [13]. By applying this process to material discovery and design, a complete cycle is fulfilled, from experimental process data collection to performance prediction and finally to experimental validation. Machine learning methods provide an efficient toolset for extracting important correlations among material phenomena [14]. Machine learning can classify viral genomes [15] and identify type 2 diabetes [16]. In the field of chemoinformatics, machine learning models can detect outlier samples [17] and analyze chemical data [18]. It can even perform classification of potato bruise levels [19]. It is evident that machine learning methods have attracted the attention of researchers in many fields. The approach has been widely used in biology, medicine, and other research fields. In the field of materials science, some materials researchers have used machine learning methods to predict material properties.
Recently, material discovery and design using machine learning have received increasing attention and have been greatly improved from the standpoints of time efficiency and prediction accuracy [20,21]. Voyles [22] used machine learning algorithms to improve the quality of material microscope data and extract material information from the enhanced data. Raccuglia et al. [23] used the support vector machine (SVM) approach to predict the outcomes of untested reactions from data collected in failed experiments. Wicker et al. [24] used the SVM algorithm with an RBF kernel to predict the crystallinity of molecular materials. Rupp [25] found that combining quantum mechanics (QM) with machine learning could be expected to improve the accuracy of QM. Stanev et al. [26] proposed that a machine learning model could simulate the critical temperature of superconductors. A neural network can represent high-dimensional potential energy surfaces in atomistic simulations of materials, and machine-learned interpolation of atomic potential energy surfaces can automatically construct high-precision atomic interaction potentials [[27], [28], [29]]. The predictive power of machine learning algorithms is also reflected in many aspects of material properties, such as sintered density [30], atomization energies [31], band gaps [32,33], and phase transitions [34,35].
This review discusses the materials database, the material data analytics tools, the machine learning methods of the material genomic data section, and application fields, as well as common problems.
2. Materials database
Continuous advancement in science depends on shared and repeatable data [36]. Large-scale data analysis is a driving force for materials progress. Materials science is on the verge of adopting the Materials 4.0/fourth-paradigm [37,38] or data-driven [39] approach of discovery from large-scale data [40]. The quality of data analysis and data mining depends on the data quality of the materials database and directly affects how the mined results can be applied. Existing databases are far from including all known materials, not to mention all possible materials. Data-driven discovery applies to certain materials, but not to all. Even when an interesting material is identified on a computer, it can take years to synthesize it in the laboratory [1].
Data can come from calculations [41], the literature [42], hypotheses, experiments, and even failed experiments [23]. In addition, new data can be generated by interpolation of existing data [43]. The amount of materials science data is not particularly large, but the diversity of data types and the complexity of the objects make it challenging to explore [44].
2.1. Current main materials database
An early database, the Inorganic Crystal Structure Database [45], created in 2002 and covering the literature from 1915 to 2002, is a comprehensive collection of crystal structure information for non-organic compounds, including inorganics, minerals, ceramics, and metals [46]. In 2015, Wolverton [47] launched the Open Quantum Materials Database (OQMD), which is also an inorganic crystal structure database, with approximately 50,000 entries from a widely used experimental library [1]. The Cambridge Structural Database (CSD) [48] is a worldwide repository for small-molecule organic and metal-organic crystal structures hosted by the Cambridge Crystallographic Data Centre (CCDC) (https://www.lib.ncsu.edu/databases/cambridge-structural.html [49]). The Materials Project [50,51] is the core program of the Materials Genome Initiative, which uses high-throughput calculations to reveal the properties of all known inorganic materials. These open datasets can be accessed through multiple channels for interactive exploration and data mining. The Materials Project URL is https://www.materialsproject.org [52]. By September 2019, the database contained 120,612 inorganic compounds, 52,366 band structures, 35,336 molecules, 530,243 nanoporous materials, 13,621 elastic tensors, 3003 piezoelectric tensors, 4401 intercalation electrodes, and 16,128 conversion electrodes.
In 2003, Ceder et al. [1] first demonstrated how quantum mechanical computing databases could help predict the most likely crystal structure of metal alloys—a crucial step in the invention of new materials. Ceder introduced the concept of public libraries of material properties and used data mining to fill in missing data. In the second year after the genome project was launched, Curtarolo [53] posted a database based on his software, called AFLOWlib, which is a database with more than one million materials and approximately 100 million calculated properties. AFLOWlib contains such a large amount of data because it also includes thousands of hypothetical materials, many of which exist for only one second in the real world, but Curtarolo used these data to study why some alloys can form metallic glass [54]. OQMD is also a high-throughput database that includes a large number of hypothetical materials, and the entire database can be obtained for free at http://www.oqmd.org/download [47].
The maturity of high-throughput first-principles calculations, along with the rise of virtual material screening, has created many accessible databases for the materials science community, including Citrination Platform [55], large molecular libraries [56], and the Harvard Clean Energy Project Database (CEPDB) [57].
MatWeb is the world's largest database of material properties, with over 130,000 metals, plastics, ceramics, and composites [58]. Total Materia is the world’s most comprehensive materials database, which is based on 74 international standards and offers more than 15,000,000 materials [59]. ASM International is the world’s largest and most established materials information society, which can connect anyone to a global network of peers and provides access to trusted materials information [60].
In addition, there are the computational materials repository [61], computational materials property databases (www.synthesisproject.org) [42], MatNavi materials databases [62], the materials data facility [63], interatomic potential (force field) databases [64], the Superconducting Material Database (SuperCon) [65], the structural materials database (The Materials Commons) [66], and many others.
2.2. Future trends in materials databases
Currently, there are already a number of materials databases. However, due to non-standardization and fragmentation of material data, many materials databases still cannot satisfy professional needs. In the future, the development trend of materials databases should consider the following aspects:
(1) Diverse storage forms: Databases should be able to store structured, unstructured, and semi-structured data.
(2) Specialized storage categories: By taking specific classes of research materials as the object, professional databases can be established that are highly targeted and easy for industry personnel to use.
(3) Data mining: Using a large amount of stored data in the material database, machine learning can be carried out to discover the relationships among materials composition, process, and performance.
2.3. Material data analytics tools
The interoperability of material datasets and data analytics tools is a critical element of implementing MGI, and material data analytics tools are critical to predicting material performance or designing new materials. Some material data analysis tools are only used internally in the laboratory for reasons of confidentiality, but other data analysis tools are available for purchase.
The Air Force Research Laboratory has developed an integrated collaborative environment that connects laboratory equipment through a laboratory intranet to serve the internal needs of large material laboratories [67].
Microstructure-informed cloud computing provides a practical platform for interoperability of material datasets and data analytics tools [68]. An online computation platform for materials data mining can perform data mining on materials [69]. AiiDA can be used to calculate automatic interactions in materials science [70]. MatCloud is a high-throughput computing infrastructure for the comprehensive management of materials simulation, data, and resources [71]. In addition, the VNL software package provides a graphical environment for first-principles simulations [72].
3. Machine learning as a novel tool
The structure of Big Data in materials science cannot be seen directly using standard tools; therefore, it is necessary to discover the underlying laws of the data using new methods [73]. The new era of Big Data science and analytics offers a rather compelling approach: machine learning to help with the discovery, design, and optimization of novel materials. Machine learning has developed rapidly, and its long-term goal is algorithms that learn continuously and grow in proficiency. For example, without human knowledge, AlphaGo Zero defeated AlphaGo using a reinforcement learning algorithm, which proved the importance of continuous learning and progress by the algorithm [74].
Machine learning should be viewed as the sum of the organization of the initial datasets, the descriptor creation and algorithm learning steps, and the necessary subsequent steps for targeted new data entry. Ultimately, the expert recommendation system can be improved continuously and adaptively [43]. Material data are made up of datasets, and accounting for dataset uncertainties also benefits machine learning model predictions [75]. Therefore, Bhat et al. [76] established a set of general rules for describing material data that will aid machine learning and decision-making. Material performance can be predicted through machine learning: the material is first "fingerprinted" (also called generating a descriptor or feature vector), then a mapping is learned between the descriptor and the property of interest, and finally the performance is predicted [31,43]. Whatever material is studied, a precondition for machine learning is the existence of past data. In machine learning algorithms, the input is generally a component or process parameter of the material, and the output is generally the property or properties of interest [77]. Fig. 1 shows the specific process of machine learning [43]. In Fig. 1, N and M are, respectively, the number of training examples and the number of fingerprint components.
Fig. 1.
The process of specific machine learning in materials science. (a) Schematic view of an example data set; (b) statement of the learning problem; (c) creation of a surrogate prediction model via the fingerprinting and learning steps [43].
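As an illustration of the fingerprinting-and-learning steps above, the following minimal sketch fits a ridge-regression surrogate model that maps an N × M fingerprint matrix to a property vector. All descriptor values and the target property are synthetic placeholders, not data from any cited work.

```python
import numpy as np

# Synthetic example: N training materials, each with an M-component fingerprint
rng = np.random.default_rng(0)
N, M = 40, 5
X = rng.normal(size=(N, M))             # fingerprint (descriptor) matrix
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_w                          # target property (noise-free toy data)

# Learning step: ridge regression, w = (X^T X + lambda*I)^(-1) X^T y
lam = 1e-6
w = np.linalg.solve(X.T @ X + lam * np.eye(M), X.T @ y)

# Prediction step: map a new material's fingerprint to its property
x_new = rng.normal(size=M)
y_pred = x_new @ w
```

Because the toy data are noise-free and linear, the learned weights essentially recover the generating coefficients; real material fingerprints would of course require nonlinear models and validation against held-out data.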
As shown in Fig. 2, machine learning algorithms commonly used in materials science can be divided into four categories: probability estimation [20], regression, clustering, and classification [78]. The main machine learning methods for material performance prediction are regression, clustering, and classification algorithms [20]. Examples include kernel ridge regression [79], artificial neural networks (ANN) [[80], [81], [82]], SVM [83], Gaussian process regression [84], feature learning, clustering, matrix factorization, and constraint reasoning [85], Adaboost [86], and decision trees [87]. Machine learning can also be used to discover new materials, for example, using probability estimation algorithms [88] or principal component analysis [2,89].
Fig. 2.
Common machine learning algorithms in materials science.
Recently, machine learning algorithms and statistical inference [90] have often been used in combination. They are mainly used to predict various properties of material systems effectively and accurately [91]. In the future, they may build material knowledge bases through machine learning and statistical inference [92]. In addition, machine learning methods can be combined with various intelligent algorithms, such as genetic algorithms (GA) [77], which are mainly used to optimize the parameters of machine learning models. Tzuc et al. [93] developed a genetic programming model for optimizing zeolite materials adsorption. Furthermore, GA can explore large and complex search spaces very efficiently [94]. When the amount of data is limited, data mining [95] methods can be used to determine the significant correlations between the ab initio energies of different structures in different materials [2,21,29]. Recently, the method of material data mining has developed rapidly [13] and has produced an open-source toolkit for materials data mining called Matminer [96]. Lu et al. [97] have used data mining to design layered double hydroxides with the required specific surface area.
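As a hedged illustration of how a GA explores a search space, the sketch below evolves a single model hyperparameter toward the value minimizing a toy validation error. The fitness function and all parameter choices are hypothetical stand-ins, not taken from the cited studies.

```python
import random

random.seed(0)

# Toy fitness: negative validation error as a function of one hyperparameter h,
# maximized at h = 3.0 (a stand-in for e.g. a regularization strength)
def fitness(h):
    return -(h - 3.0) ** 2

def evolve(pop_size=20, generations=40, mutation=0.1):
    # Random initial population of candidate hyperparameter values
    pop = [random.uniform(0.0, 10.0) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]              # selection (keep the fittest half)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            child = (a + b) / 2.0                   # crossover (average of parents)
            child += random.gauss(0.0, mutation)    # mutation (Gaussian perturbation)
            children.append(child)
        pop = parents + children                    # elitism: parents survive unchanged
    return max(pop, key=fitness)

best = evolve()
```

Retaining the parents unchanged each generation (elitism) guarantees that the best solution found so far is never lost, which is why the search settles close to the optimum.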
4. Application fields
There are different types and application fields of machine learning algorithms. The challenge is to judge which machine learning algorithm best matches the application field and maximizes prediction accuracy; a suitable machine learning model that fits the application can then be created. The application fields of machine learning algorithms in materials science are generally identified as structure determination, performance prediction, fingerprint prediction, and new materials discovery, which are introduced in turn below.
4.1. Structure determination
Structure determination mainly involves phase diagram determination and crystal structure prediction.
4.1.1. Phase diagram determination
High-throughput experimentation studies have resulted in large, rapidly growing volumes of structured data, making expert manual analysis impractical. Therefore, the materials community has turned to fast machine learning data analysis techniques to convert large amounts of structured data into phase diagrams. Although machine learning promises to provide high-throughput phase diagram determination, its success depends on several factors: prior knowledge, data preprocessing, data representation, similarity or dissimilarity measures, and model choice [85]. Rajan et al. [98] selected principal component analysis, partial least-squares regression, and correlation function expansion to perform data mining on compound semiconductor property phase diagrams. Artrith et al. [99] used a combined ANN and GA algorithm to accelerate the sampling of amorphous and disordered materials and constructed a first-principles phase diagram of an amorphous LiSi alloy. That work showed that the combined machine learning model accelerates first-principles sampling of complex structure spaces, proved that the combined model is more efficient than an ANN alone, and demonstrated that the sampling successfully identified low-energy amorphous structures. Spellings et al. [100] used machine learning to discover interesting regions of the parameter space in colloid self-assembly: they created descriptors, located interesting regions of complex phase diagrams without a priori information, and finally used knowledge of the available structures to generate a phase diagram automatically. Fig. 3 shows the phase diagram generated by the Gaussian mixture model.
Fig. 3.
Icosahedral quasicrystal data set phase diagrams generated by unsupervised Gaussian Mixture Models (GMMs).
(a) Shannon entropy (blue line) of the quasicrystal data set as GMM components are successively merged from 15 clusters to one cluster. Merged cluster counts corresponding to (b-d) are indicated by black points. (b-d) Phase diagrams generated by taking the most common predicted cluster type for each parameter point, indicated by the black points in (a). For each selected cluster count, dark gray regions show a poor preference for any single structure among the samples for those parameters. Each type of system, as identified by the GMM, is assigned a different color, but this unsupervised algorithm clusters the distinct structures that it finds rather than labeling a previously identified set of known structures. Phase boundaries generated by manual analysis are included for reference as black lines [100].
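The kind of unsupervised GMM clustering used to generate such phase diagrams can be sketched as follows. The two-dimensional "state points" and three-phase layout are synthetic stand-ins, not the quasicrystal data of Ref. [100].

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic "state points": three well-separated phases in a 2-D parameter space
rng = np.random.default_rng(1)
phase_a = rng.normal([0.0, 0.0], 0.1, size=(100, 2))
phase_b = rng.normal([2.0, 2.0], 0.1, size=(100, 2))
phase_c = rng.normal([0.0, 2.0], 0.1, size=(100, 2))
X = np.vstack([phase_a, phase_b, phase_c])

# Fit a GMM; each mixture component plays the role of one phase region,
# and the predicted component label for each point defines the phase diagram
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
labels = gmm.predict(X)
```

As in the figure above, the GMM only assigns cluster labels; mapping those labels to physically named phases still requires reference structures or manual inspection.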
4.1.2. Crystal structure prediction
Predicting and characterizing the crystal structure of materials is a key issue in materials research and development [2,101,102]. However, the accuracy of the prediction depends largely on how the crystals are represented [103]. Faber et al. [79] introduced and evaluated a set of feature-vector representations of crystal structures for machine learning models of the formation energies of solids. Faber et al. [104] screened all possible elpasolite crystals using a machine learning model. Data mining has been used to establish structural design rules in crystal chemistry; combined with first-principles calculations, statistical inference can be used as a tool to significantly accelerate the prediction of unknown crystal structures [105]. Machine learning models can quickly predict the properties of crystalline materials [8]. Furthermore, a machine learning model can also predict point-defect properties of crystal structures, which was the first application of machine learning in this field [106].
4.2. Performance prediction
Finding new materials with targeted properties has traditionally been guided by intuition plus trial and error. At present, with the increasing amount of data available, using machine learning to accelerate the prediction of material properties and then to discover these new materials has become a new method. Research on material properties has focused on the relationship between the properties of materials and their microstructure [107,108]. Shi et al. [20] pointed out that machine learning applications to material property prediction can be divided into two categories: macroscopic performance prediction and microscopic property prediction.
4.2.1. Macroscopic performance prediction
Because ANN algorithms combined with related optimization algorithms perform outstandingly on regression and classification problems, they are widely used in performance prediction [20].
An ANN is a massively parallel distributed processor that stores experiential knowledge learned from data and makes it available for use [109]. ANN models learn from previously obtained data, called the training set, and the system's accuracy is then checked using test data [110]. ANNs are very powerful in solving nonlinear problems [111]. However, artificial neural networks have the drawback of over-fitting, which Bayesian regularization can mitigate [112].
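As a minimal sketch of an ANN property-prediction model, the example below fits a small feed-forward network to a toy nonlinear property. Scikit-learn's MLPRegressor does not offer Bayesian regularization, so an L2 weight penalty (alpha) is used here as a common stand-in for limiting over-fitting; the data are synthetic, not from the cited studies.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy data: a smooth nonlinear "property" as a function of one process parameter
rng = np.random.default_rng(2)
x_train = rng.uniform(-2, 2, size=(200, 1))
y_train = np.sin(x_train).ravel()

# Small feed-forward ANN; alpha is an L2 weight penalty that limits over-fitting
model = MLPRegressor(hidden_layer_sizes=(32, 32), alpha=1e-3,
                     max_iter=5000, random_state=0)
model.fit(x_train, y_train)

# Check the trained network on unseen inputs (the "test data")
x_test = np.linspace(-2, 2, 50).reshape(-1, 1)
y_pred = model.predict(x_test)
```

For real alloy datasets the inputs would be composition and processing parameters and the targets measured properties, with cross-validation used to tune the network size and penalty.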
ANNs have been widely used in many fields of research [113], perhaps because they attempt to model the functioning of the human brain [114]. For example, ANN has become a popular tool for Al-Si alloy performance prediction [[115], [116], [117]]. ANNs have been developed to predict the porosity percentage of Al-Si casting alloys and to correlate chemical composition and cooling rate with porosity [118,119]. An ANN has accurately predicted the corrosion resistance of Al-Si-Mg-based metal matrix composites reinforced with SiC particles, with the average square of the Pearson product-moment correlation coefficient (R2), the maximum mean square error, and the minimum root mean squared deviation calculated as 0.9904, 0.00002476, and 0.00157480, respectively; the experimental results are thus highly consistent with the ANN results [120]. ANNs have also been used to predict the mechanical properties of A356, including yield stress, ultimate tensile strength, maximum force, and elongation percentage; the predictions of the ANN model were found to be in good agreement with experimental data [121,122]. In addition, ANN models have been used to investigate the role of composition and processing parameters in the mechanical properties of microalloyed pipeline steel and to design steel with improved strength, impact toughness, and ductility. The models were then used as objective functions for multi-objective genetic algorithms to evolve tradeoffs among the conflicting objectives of improved strength, better ductility, and higher impact toughness, and the Pareto-optimal solutions were successfully analyzed to study the role of various parameters in designing pipeline steel with such improved performance [77]. Fig. 4 shows the specific flow chart.
Fig. 4.
Flow chart of the computational methodology adopted for designing pipeline steel [77].
Mannodi-Kanakkithodi et al. [123] used a combination of statistical learning and genetic algorithms to predict polymer dielectrics. Within the dataset, the model can not only determine the corresponding properties of any polymer but can also actively find specific polymers that suit given requirements. However, because the model was built from only seven necessary building blocks, the guidance it provides has certain limitations.
Other applications exist for macroscopic performance prediction. Seko et al. [124] used four regression algorithms to predict the melting temperatures of single and binary compounds. Jha et al. [75] predicted polymer glass transition temperatures. Mauro et al. [125] predicted elastic moduli and compressive stress of glass.
4.2.2. Microscopic property prediction
The microscopic performance of materials depends on their microscopic properties. The application of machine learning to predicting microscopic properties mainly focuses on lattice constants, atomic simulation, and molecular atomization energy [20].
Pilania et al. [126] used statistical learning methods to quickly and accurately predict the various properties of binary wurtzite superlattices. Toyoura et al. [127] used a machine-learning method called the Gaussian process to efficiently identify low-energy regions that characterize proton conduction in the host lattice.
To explore the potential thermal runaway problem of lithium-ion electrode particle microstructures, Petrich et al. [128] used a classification model to detect cracks in lithium-ion batteries and applied the method to real electrode data to test its effectiveness. Deringer et al. [129] developed a machine learning-based Gaussian approximation potential model for atomistic simulation of liquid and amorphous elemental carbon. Botu et al. [130] proposed a machine learning method called AGNI (an adaptive, generalizable, and neighborhood-informed methodology), which can perform fast, quantum-accurate prediction of atomic forces based on each atom's neighborhood environment.
Lilienfeld et al. [131] introduced a machine learning model based on nuclear charges and atomic positions to quickly and accurately predict the atomization energies of various organic molecules and proved its applicability to predicting molecular atomization potential energy curves. Hansen et al. [132] used machine learning to predict molecular atomization energies, which can significantly accelerate the calculation of quantum chemical properties while maintaining high prediction accuracy. Rupp [25] used kernel ridge regression to predict the atomization energies of small organic molecules. Hansen et al. [133] used machine learning (the so-called Bag of Bonds model, BoB) to estimate molecular atomization energies and predict the exact electronic properties of molecules; Fig. 5 shows that BoB is a strong machine learning model that, with proper regularization, improves in accuracy simply as the molecular database is expanded. Moreover, Pilania et al. [33] and Isayev et al. [134] predicted the bandgap energy.
Fig. 5.
Mean absolute error (MAE in kcal/mol) for BoB and polynomial models: training sets from N = 500 to 7000 data points were sampled identically for the different methods. The polynomial models of degree 10 and 18 exhibit high variances due to the random stratification, which for small N leads to nonrobust fits [133].
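The kernel ridge regression approach used for atomization-energy prediction [25] can be sketched with a Gaussian (RBF) kernel. The three-component "descriptors" and the smooth target function below are synthetic stand-ins for molecular fingerprints and atomization energies, not data from the cited works.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between descriptor sets A (n x d) and B (m x d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

rng = np.random.default_rng(3)
X_train = rng.uniform(-1, 1, size=(150, 3))   # stand-ins for molecular descriptors
y_train = np.sin(X_train.sum(axis=1))         # stand-in for atomization energy

# Kernel ridge regression: dual coefficients alpha = (K + lambda*I)^(-1) y
sigma, lam = 1.0, 1e-6
K = rbf_kernel(X_train, X_train, sigma)
alpha = np.linalg.solve(K + lam * np.eye(len(K)), y_train)

# Prediction for unseen descriptors: weighted sum of kernel similarities
X_test = rng.uniform(-1, 1, size=(20, 3))
y_pred = rbf_kernel(X_test, X_train, sigma) @ alpha
```

The kernel width sigma and the regularization strength lambda are the two hyperparameters that, as Fig. 5 suggests for BoB, must be tuned (e.g. by cross-validation) for the model to keep improving as the database grows.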
4.3. Fingerprint (descriptor) prediction
If people can effectively learn from available knowledge and data, the material discovery process can be significantly accelerated and simplified. Using a one-dimensional chain system family, Pilania et al. [91] proposed a general form that can discover decision rules and establish a mapping between attributes that are easily accessible for a system and its properties. The results show that fingerprints based on either chemical structure (compositional and configurational information) or electronic charge density distribution can be used for ultra-fast but accurate property predictions. Harnessing such learning paradigms extends recent efforts to explore huge chemical spaces systematically and can significantly accelerate the discovery of new application-specific materials. Curtarolo et al. [135] introduced new material fingerprint descriptors that produce a material mapping network: nodes represent compounds, and connections represent similarities. The mapping network can identify compounds with different physical and chemical properties. Recently, the t-distributed stochastic neighbor embedding (t-SNE) algorithm was used for the first time to extract feature spaces from ab initio band-structure calculations. This algorithm can design different spaces and map them to lower dimensions, enabling simultaneous analysis and exploration of previously unknown band structures for thousands of materials. The more information is available in the material database, the more space can be explored [136]. Ramprasad et al. [137] studied a series of motif-based topological fingerprints that can represent the major classes of crystals and molecules numerically. Using a learning algorithm, these fingerprints can be mapped to various properties of crystals and molecules, thereby accelerating property prediction.
Their paper demonstrates the approach by simultaneously optimizing two properties: the gap between the highest occupied and lowest unoccupied molecular orbitals, EHL, and the isotropic polarizability, α. The results, shown in Fig. 6 and Table 1, confirm that molecules with the required values of EHL and α were obtained.
Fig. 6.
(Color online) EHL-α log-log plot of the molecule dataset, shown by green symbols, with the predicted fingerprints shown by red diamonds within the regime of desired properties, i.e., 0.6 ≤ α ≤ 0.7 Å³/atom and EHL ≥ 7.0 eV. In the inset, the predicted and calculated properties of the molecules reconstructed from three predicted fingerprints, i.e., C, D, and E, are shown by solid and open symbols: triangles for C, circles for D, and squares for E. The dashed line sketches the limit α ∼ 1/EHL addressed in the text [137].
Table 1
Predicted and calculated values of α (in Å³/atom) and EHL (in eV) of the molecules designed from three predicted fingerprints, C, D, and E. Data from this table are also shown in the inset of Fig. 6 [137].
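As an illustration of the t-SNE feature-space mapping described above, a minimal sketch on synthetic fingerprint vectors might look like the following; the data here are random stand-ins, not real band structures, and all names are illustrative.

```python
# Minimal sketch: projecting high-dimensional material fingerprints to 2-D
# with t-SNE, in the spirit of the band-structure mapping described above.
# The fingerprint vectors below are synthetic stand-ins, not real band data.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Pretend each row is a fingerprint vector for one material; three synthetic
# "classes" of materials are generated with offset means.
fingerprints = np.vstack([
    rng.normal(loc=m, scale=0.5, size=(50, 32)) for m in (0.0, 2.0, 4.0)
])

# Map the 32-dimensional feature space to 2 dimensions for visual exploration.
embedding = TSNE(n_components=2, perplexity=30,
                 random_state=0).fit_transform(fingerprints)

print(embedding.shape)  # (150, 2)
```

In a real study, each fingerprint row would come from an ab initio calculation, and the 2-D embedding would be plotted and colored by a property of interest to reveal clusters of similar materials.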
An adaptive design strategy, tightly coupled with experiments, can accelerate the discovery process by sequentially identifying the next experiments or calculations needed to navigate a sophisticated search space effectively. The strategy uses inference and global optimization to balance exploitation and exploration of the search space. Oliynyk et al. [138] used real experimental data, selected a suitable machine learning model, and validated its predictions through experiments, ultimately discovering many kinds of intermetallic compounds. The most crucial finding was RhCd, the first new binary AB compound with a CsCl-type structure to be found in over 15 years. Xue et al. [139] demonstrated the approach by finding NiTi-based shape memory alloys with very low thermal hysteresis (ΔT), with Ti50.0Ni46.7Cu0.8Fe2.3Pd0.2 possessing the smallest ΔT (1.84 K). Wolverton et al. [101] used a general machine learning framework to predict the properties of various materials and discovered a metallic glass alloy. Mauro et al. [125] developed new damage-resistant glasses. Ren et al. [140] found metallic glasses through iterations of machine learning. Yuan et al. [141] used a combination of machine learning and optimization methods to accelerate the discovery of new Pb-free BaTiO3-based piezoelectrics with large electrostrains.
Harnessing Big Data, deep data, and smart data from state-of-the-art imaging will accelerate the design and discovery of advanced materials. This information is spatially distributed and often has a complex multidimensional nature; hence, the use of unsupervised statistical learning, decorrelation, clustering, and visualization techniques, generally referred to as Big Data approaches, is the first step toward harnessing these data. The correct use of Big Data, that is, of the large amounts of data that can be measured and simulated, can serve as a bridge between theory and functional imaging. It can be further extended to what is called deep data by fusing scientific knowledge of the physics and chemistry of the system into the Big Data analysis. Interactive knowledge discovery in big and deep data will be made possible by machine learning technologies applied interactively at all stages of the scientific discovery process, from instrument operation to data analysis [4]. Fig. 7 shows the specific process.
Fig. 7.
Imaging techniques (top panels) allow direct measurements of atomic positions, and hence bond lengths and angles, local functionalities including chemical states, dielectric properties, and the superconductive gap; they visualize atomic, molecular, and defect dynamics in real time and offer possibilities to control matter at the molecular and atomic level. In parallel, theoretical methods (bottom panels) allow detailed studies of the atomic and electronic structure and dynamics of matter along with prediction of their properties. However, pathways to bridge theory and experiment are almost entirely lacking. New advances in data analytics and scientific inference are capable of treating large volumes of data and information, and hence of linking theory to experiment via microscopic degrees of freedom (middle panels): for example, efficiently matching imaging information about static structure to theoretical simulations of the same material (1st column), or, similarly, the dynamics of oxygen vacancies in materials to corresponding molecular dynamics simulations (3rd column) [4].
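The unsupervised first step discussed above, decorrelation followed by clustering, might be sketched on synthetic per-pixel spectra (a stand-in for real hyperspectral imaging data; all names and values are illustrative) as follows:

```python
# Sketch: unsupervised decorrelation (PCA) followed by clustering (k-means),
# applied to a synthetic stack of per-pixel spectra standing in for
# hyperspectral imaging data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

# 1000 "pixels", each with a 64-point spectrum drawn from one of two phases.
grid = np.linspace(0, 6, 64)
phase_a = rng.normal(0.0, 1.0, size=(500, 64)) + np.sin(grid)
phase_b = rng.normal(0.0, 1.0, size=(500, 64)) + np.cos(grid)
spectra = np.vstack([phase_a, phase_b])

# Decorrelate: keep the components explaining most of the variance.
scores = PCA(n_components=5).fit_transform(spectra)

# Cluster pixels in the reduced space to recover the two phases.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)

print(scores.shape, np.unique(labels).size)  # (1000, 5) 2
```

In practice the cluster labels would be mapped back onto the image grid to reveal spatial phase distributions, the bridge between imaging and theory that the figure describes.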
To accelerate the design of new materials, recent studies have screened out many types of promising candidate materials [51], including perovskite compounds [83,142], metal-organic frameworks [143,144], catalysts [145,146], light-emitting molecules [147], and thermoelectric materials [[148], [149], [150], [151]]. Combining large-scale databases of quantum mechanical calculations on known materials with machine-learning-based screening enables combinatorial searches for new materials in an unconstrained composition space, applicable to an extensive range of critical material discoveries [87]. In addition, interface structures and energies can also be predicted through virtual screening [152].
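A surrogate-based virtual screen of the kind summarized above can be sketched as follows; the "expensive calculation", the threshold, and the candidate pool are all synthetic illustrations, not any specific published screen.

```python
# Sketch of surrogate-based virtual screening: train a fast model on a small
# set of "computed" properties, then screen a large candidate pool and keep
# only compositions whose predicted property passes a target threshold.
# All data here are synthetic; a real screen would use DFT-derived labels.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)

def expensive_calculation(x):
    # Stand-in for a quantum mechanical property calculation.
    return x @ np.array([1.5, -2.0, 0.5]) + 3.0

# Small labelled set (e.g. from high-throughput DFT)...
X_train = rng.uniform(size=(200, 3))
y_train = expensive_calculation(X_train)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# ...used to screen a much larger unlabelled candidate pool cheaply.
pool = rng.uniform(size=(20_000, 3))
predicted = model.predict(pool)
shortlist = pool[predicted > 3.5]  # keep promising candidates only

print(len(shortlist), "of", len(pool), "candidates pass the screen")
```

The shortlist, typically a small fraction of the pool, is then handed back to expensive first-principles calculations or experiments for verification.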
5. Conclusions
Shared and reproducible material data are the key to advancing materials science; the interoperability of material datasets and data analytics tools is a critical element in implementing the MGI. Given the complexity of material data, the MGI's goal is to predict the various properties of materials quickly and accurately and to discover new materials. Machine learning methods have taken an essential step toward this goal: machine learning acts as a bridge, using data to help people discover, design, and optimize new materials, and ultimately constitutes a material data analysis tool.
At present, the scale of data available for machine learning in the materials field is still very small, making rigorous predictions of material properties and discoveries of new materials through machine learning difficult. The predicted results often only approximate the true values and cannot yet be relied on for experimental guidance. Some experts have proposed methods to adapt machine learning to small-scale datasets; for example, adding rough property estimates to the feature space can improve prediction accuracy when building a model from a small material dataset [153]. However, even with the gradual development of machine learning and deep learning algorithms, practical experience with each model shows that dataset size remains the main factor limiting prediction accuracy. This problem affects not only traditional machine learning algorithms but also deep learning, which has attracted much attention in recent years. Over the last few decades, material data have been reported relatively completely in a large number of scientific papers, and literature data extraction has made progress in several specific fields, such as chemistry [154] and biomedicine [[155], [156], [157]]. Some studies using these literature-derived material data have already been carried out. For example, a paper-information-assisted extraction software package has been developed and is available online (https://www.mgedata.cn/) [158]. Xie et al. [159] designed compositions of high-performance copper alloys using machine learning based on data from the literature. Edward et al. [42] obtained material synthesis information from academic publications. Kononova et al. [160] obtained inorganic material synthesis recipes by text mining scientific publications.
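The literature-extraction step discussed here can be illustrated with a deliberately simple pattern for pulling alloy compositions out of sentence text; production pipelines are far more sophisticated, and this regex is only a sketch.

```python
# Sketch of the literature-mining step: pulling candidate alloy compositions
# out of raw sentence text with a simple pattern. Real extraction pipelines
# handle far more notation (parentheses, ranges, at.%); this is illustrative.
import re

sentence = ("The alloy Ti50.0Ni46.7Cu0.8Fe2.3Pd0.2 showed a thermal "
            "hysteresis of 1.84 K, smaller than that of Ni50Ti50.")

# One element symbol followed by an (optionally fractional) amount...
token = r"(?:[A-Z][a-z]?\d+(?:\.\d+)?)"
# ...repeated at least twice to form a composition string.
composition = re.compile(token + r"{2,}")

matches = composition.findall(sentence)
print(matches)  # ['Ti50.0Ni46.7Cu0.8Fe2.3Pd0.2', 'Ni50Ti50']
```

Extracted composition strings would then be normalized (element, fraction) and joined with processing and property mentions from the same paper to build a machine-learnable dataset.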
However, data in the literature are rarely used by materials scientists, and even when they are, the scale is small because no mature method exists for extracting data from the materials literature. So far, no relatively complete corpus has been developed in the materials field to support such applications. Therefore, in the materials genome initiative, material prediction through machine learning should not focus only on the machine learning algorithms themselves, but should also extract valuable material data from the materials science literature. This would involve first classifying and pre-processing the literature data, then using machine learning algorithms to predict the performance data of interest, and finally finding the correlations among material composition, process, structure, and performance. These steps will play a key role in discovering new properties of materials.
Acknowledgements
This work was financially supported by the National Natural Science Foundation of China (Nos. 61971208, 61671225 and 51864027), the Yunnan Applied Basic Research Projects (No. 2018FA034), the Yunnan Reserve Talents of Young and Middle-aged Academic and Technical Leaders (Shen Tao, 2018), the Yunnan Young Top Talents of Ten Thousands Plan (Shen Tao, Zhu Yan, Yunren Social Development No. 2018 73) and the Scientific Research Foundation of Kunming University of Science and Technology (No. KKSY201703016).
Harnessing big data, deep data, and smart data from state-of-the-art imaging might accelerate the design and realization of advanced functional materials. Here we discuss new opportunities in materials design enabled by the availability of big data in imaging and data analytics approaches, including their limitations, in material systems of practical interest. We specifically focus on how these tools might help realize new discoveries in a timely manner. Such methodologies are particularly appropriate to explore in light of continued improvements in atomistic imaging, modelling and data analytics methods.
Cathode degradation is a key factor that limits the lifetime of Li-ion batteries. To identify functional coatings that can suppress this degradation, we present a high-throughput density functional theory based framework which consists of reaction models that describe thermodynamic and electrochemical stabilities, and acid-scavenging capabilities of materials. Screening more than 130,000 oxygen-bearing materials, we suggest physical and hydrofluoric-acid barrier coatings such as WO3, LiAl5O8 and ZrP2O7 and hydrofluoric-acid scavengers such as Sc2O3, Li2CaGeO4, LiBO2, Li3NbO4, Mg3(BO3)2 and Li2MgSiO4. Using a design strategy to find the thermodynamically optimal coatings for a cathode, we further present optimal hydrofluoric-acid scavengers such as Li2SrSiO4, Li2CaSiO4 and CaIn2O4 for the layered LiCoO2, and Li2GeO3, Li4NiTeO6 and Li2MnO3 for the spinel LiMn2O4 cathodes. These coating materials have the potential to prolong the cycle-life of Li-ion batteries and surpass the performance of common coatings based on conventional materials such as Al2O3, ZnO, MgO or ZrO2.
Machine learning addresses the question of how to build computers that improve automatically through experience. It is one of today's most rapidly growing technical fields, lying at the intersection of computer science and statistics, and at the core of artificial intelligence and data science. Recent progress in machine learning has been driven both by the development of new learning algorithms and theory and by the ongoing explosion in the availability of online data and low-cost computation. The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing.
The ability to make rapid and accurate predictions on bandgaps of double perovskites is of much practical interest for a range of applications. While quantum mechanical computations for high-fidelity bandgaps are enormously computation-time intensive and thus impractical in high throughput studies, informatics-based statistical learning approaches can be a promising alternative. Here we demonstrate a systematic feature-engineering approach and a robust learning framework for efficient and accurate predictions of electronic bandgaps of double perovskites. After evaluating a set of more than 1.2 million features, we identify lowest occupied Kohn-Sham levels and elemental electronegativities of the constituent atomic species as the most crucial and relevant predictors. The developed models are validated and tested using the best practices of data science and further analyzed to rationalize their prediction performance.
A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo's own move selections and also the winner of AlphaGo's games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100-0 against the previously published, champion-defeating AlphaGo.
In the paradigm of materials informatics for accelerated materials discovery, the choice of feature set (i.e. attributes that capture aspects of structure, chemistry and/or bonding) is critical. Ideally, the feature sets should provide a simple physical basis for extracting major structural and chemical trends and furthermore, enable rapid predictions of new material chemistries. Orbital radii calculated from model pseudopotential fits to spectroscopic data are potential candidates to satisfy these conditions. Although these radii (and their linear combinations) have been utilized in the past, their functional forms are largely justified with heuristic arguments. Here we show that machine learning methods naturally uncover the functional forms that mimic most frequently used features in the literature, thereby providing a mathematical basis for feature set construction without a priori assumptions. We apply these principles to study two broad materials classes: (i) wide band gap AB compounds and (ii) rare earth-main group RM intermetallics. The AB compounds serve as a prototypical example to demonstrate our approach, whereas the RM intermetallics show how these concepts can be used to rapidly design new ductile materials. Our predictive models indicate that ScCo, ScIr, and YCd should be ductile, whereas each was previously proposed to be brittle.
F.Faber, A.Lindmaa, O.A.von Lilienfeld, R.Armiento, Int. J. Quantum Chem.115(2015) 1094-1101.
Structure quantification is key to successful mining and extraction of core materials knowledge from both multiscale simulations as well as multiscale experiments. The main challenge stems from the need to transform the inherently high dimensional representations demanded by the rich hierarchical material structure into useful, high value, low dimensional representations. In this paper, we develop and demonstrate the merits of a data-driven approach for addressing this challenge at the atomic scale. The approach presented here is built on prior successes demonstrated for mesoscale representations of material internal structure, and involves three main steps: (i) digital representation of the material structure, (ii) extraction of a comprehensive set of structure measures using the framework of n-point spatial correlations, and (iii) identification of data-driven low dimensional measures using principal component analyses. These novel protocols, applied on an ensemble of structure datasets output from molecular dynamics (MD) simulations, have successfully classified the datasets based on several model input parameters such as the interatomic potential and the temperature used in the MD simulations.
The materials discovery process can be significantly expedited and simplified if we can learn effectively from available knowledge and data. In the present contribution, we show that efficient and accurate prediction of a diverse set of properties of material systems is possible by employing machine (or statistical) learning methods trained on quantum mechanical computations in combination with the notions of chemical similarity. Using a family of one-dimensional chain systems, we present a general formalism that allows us to discover decision rules that establish a mapping between easily accessible attributes of a system and its properties. It is shown that fingerprints based on either chemo-structural (compositional and configurational information) or the electronic charge density distribution can be used to make ultra-fast, yet accurate, property predictions. Harnessing such learning paradigms extends recent efforts to systematically explore and mine vast chemical spaces, and can significantly accelerate the discovery of new application-specific materials.
F.Niu, C.Zhang, C.Ré, J.Shavlik, Int. J. Semant. Web Inf.Syst.8(2012) 42-73.
The atomistic modeling of amorphous materials requires structure sizes and sampling statistics that are challenging to achieve with first-principles methods. Here, we propose a methodology to speed up the sampling of amorphous and disordered materials using a combination of a genetic algorithm and a specialized machine-learning potential based on artificial neural networks (ANNs). We show for the example of the amorphous LiSi alloy that around 1000 first-principles calculations are sufficient for the ANN-potential assisted sampling of low-energy atomic configurations in the entire amorphous LixSi phase space. The obtained phase diagram is validated by comparison with the results from an extensive sampling of LixSi configurations using molecular dynamics simulations and a general ANN potential trained to approximately 45 000 first-principles calculations. This demonstrates the utility of the approach for the first-principles modeling of amorphous materials.
Elpasolite is the predominant quaternary crystal structure (AlNaK_{2}F_{6} prototype) reported in the Inorganic Crystal Structure Database. We develop a machine learning model to calculate density functional theory quality formation energies of all approximately 2x10^{6} pristine ABC_{2}D_{6} elpasolite crystals that can be made up from main-group elements (up to bismuth). Our model's accuracy can be improved systematically, reaching a mean absolute error of 0.1 eV/atom for a training set consisting of 10x10^{3} crystals. Important bonding trends are revealed: fluoride is best suited to fit the coordination of the D site, which lowers the formation energy whereas the opposite is found for carbon. The bonding contribution of the elements A and B is very small on average. Low formation energies result from A and B being late elements from group II, C being a late (group I) element, and D being fluoride. Out of 2x10^{6} crystals, 90 unique structures are predicted to be on the convex hull-among which is NFAl_{2}Ca_{6}, with a peculiar stoichiometry and a negative atomic oxidation state for Al.
In this work, it is shown that for the first time that, using information-entropy-based methods, one can quantitatively explore the relative impact of a wide multidimensional array of electronic and chemical bonding parameters on the structural stability of intermetallic compounds. Using an inorganic AB2 compound database as a template data platform, the evolution of design rules for crystal chemistry based on an information-theoretic partitioning classifier for a high-dimensional manifold of crystal chemistry descriptors is monitored. An application of this data-mining approach to establish chemical and structural design rules for crystal chemistry is demonstrated by showing that, when coupled with first-principles calculations, statistical inference methods can serve as a tool for significantly accelerating the prediction of unknown crystal structures.
Artificial neural network (ANN) is a nonlinear dynamic computational system suitable for simulations which are hard to be described by physical models where, rather than relying on a number of predetermined assumptions, data is used to form the model. In order to predict the mechanical properties of A356 including yield stress, ultimate tensile strength and elongation percentage, a relatively new approach that uses artificial neural network and finite element technique is presented which combines mechanical properties data in the form of experimental and simulated solidification conditions. It was observed that predictions of this study are consistent with experimental measurements for A356 alloy. The results of this research were also used for solidification codes of SUT CAST software. Crown Copyright (C) 2011 Published by Elsevier Inc.
M.O.Shabani, A.Mazahery, Metall. Mater. Trans. A43 (2012) 2158-2165.
The ability to efficiently design new and advanced dielectric polymers is hampered by the lack of sufficient, reliable data on wide polymer chemical spaces, and the difficulty of generating such data given time and computational/experimental constraints. Here, we address the issue of accelerating polymer dielectrics design by extracting learning models from data generated by accurate state-of-the-art first principles computations for polymers occupying an important part of the chemical subspace. The polymers are 'fingerprinted' as simple, easily attainable numerical representations, which are mapped to the properties of interest using a machine learning algorithm to develop an on-demand property prediction model. Further, a genetic algorithm is utilised to optimise polymer constituent blocks in an evolutionary manner, thus directly leading to the design of polymers with given target properties. While this philosophy of learning to make instant predictions and design is demonstrated here for the example of polymer dielectrics, it is equally applicable to other classes of materials as well.
The accurate and reliable prediction of properties of molecules typically requires computationally intensive quantum-chemical calculations. Recently, machine learning techniques applied to ab initio calculations have been proposed as an efficient approach for describing the energies of molecules in their given ground-state structure throughout chemical compound space (Rupp et al. Phys. Rev. Lett. 2012, 108, 058301). In this paper we outline a number of established machine learning techniques and investigate the influence of the molecular representation on the methods performance. The best methods achieve prediction errors of 3 kcal/mol for the atomization energies of a wide variety of molecules. Rationales for this performance improvement are given together with pitfalls and challenges when applying machine learning approaches to the prediction of quantum-mechanical observables.
Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the
Although historically materials discovery has been driven by a laborious trial-and-error process, knowledge-driven materials design can now be enabled by the rational combination of Machine Learning methods and materials databases. Here, data from the AFLOW repository for ab initio calculations is combined with Quantitative Materials Structure-Property Relationship models to predict important properties: metal/insulator classification, band gap energy, bulk/shear moduli, Debye temperature and heat capacities. The prediction's accuracy compares well with the quality of the training data for virtually any stoichiometric inorganic crystalline material, reciprocating the available thermomechanical experimental data. The universality of the approach is attributed to the construction of the descriptors: Property-Labelled Materials Fragments. The representations require only minimal structural input allowing straightforward implementations of simple heuristic design rules.
Intermetallic compounds are bestowed by diverse compositions, complex structures, and useful properties for many materials applications. How metallic elements react to form these compounds and what structures they adopt remain challenging questions that defy predictability. Traditional approaches offer some rational strategies to prepare specific classes of intermetallics, such as targeting members within a modular homologous series, manipulating building blocks to assemble new structures, and filling interstitial sites to create stuffed variants. Because these strategies rely on precedent, they cannot foresee surprising results, by definition. Exploratory synthesis, whether through systematic phase diagram investigations or serendipity, is still essential for expanding our knowledge base. Eventually, the relationships may become too complex for the pattern recognition skills to be reliably or practically performed by humans. Complementing these traditional approaches, new machine-learning approaches may be a viable alternative for materials discovery, not only among intermetallics but also more generally to other chemical compounds. In this Account, we survey our own efforts to discover new intermetallic compounds, encompassing gallides, germanides, phosphides, arsenides, and others. We apply various machine-learning methods (such as support vector machine and random forest algorithms) to confront two significant questions in solid state chemistry. First, what crystal structures are adopted by a compound given an arbitrary composition? Initial efforts have focused on binary equiatomic phases AB, ternary equiatomic phases ABC, and full Heusler phases AB2C. Our analysis emphasizes the use of real experimental data and places special value on confirming predictions through experiment. Chemical descriptors are carefully chosen through a rigorous procedure called cluster resolution feature selection. 
Predictions for crystal structures are quantified by evaluating probabilities. Major results include the discovery of RhCd, the first new binary AB compound to be found in over 15 years, with a CsCl-type structure; the connection between
With more than a hundred elements in the periodic table, a large number of potential new materials exist to address the technological and societal challenges we face today; however, without some guidance, searching through this vast combinatorial space is frustratingly slow and expensive, especially for materials strongly influenced by processing. We train a machine learning (ML) model on previously reported observations, parameters from physiochemical theories, and make it synthesis method-dependent to guide high-throughput (HiTp) experiments to find a new system of metallic glasses in the Co-V-Zr ternary. Experimental observations are in good agreement with the predictions of the model, but there are quantitative discrepancies in the precise compositions predicted. We use these discrepancies to retrain the ML model. The refined model has significantly improved accuracy not only for the Co-V-Zr system but also across all other available validation data. We then use the refined model to guide the discovery of metallic glasses in two additional previously unreported ternaries. Although our approach of iterative use of ML and HiTp experiments has guided us to rapid discovery of three new glass-forming systems, it has also provided us with a quantitatively accurate, synthesis method-sensitive predictor for metallic glasses that improves performance with use and thus promises to greatly accelerate discovery of many new metallic glasses. We believe that this discovery paradigm is applicable to a wider range of materials and should prove equally powerful for other materials and properties that are synthesis path-dependent and that current physiochemical theories find challenging to predict.
One of the main bottlenecks to deploying large-scale carbon dioxide capture and storage (CCS) in power plants is the energy required to separate the CO(2) from flue gas. For example, near-term CCS technology applied to coal-fired power plants is projected to reduce the net output of the plant by some 30% and to increase the cost of electricity by 60-80%. Developing capture materials and processes that reduce the parasitic energy imposed by CCS is therefore an important area of research. We have developed a computational approach to rank adsorbents for their performance in CCS. Using this analysis, we have screened hundreds of thousands of zeolite and zeolitic imidazolate framework structures and identified many different structures that have the potential to reduce the parasitic energy of CCS by 30-40% compared with near-term technologies.
Metal-organic frameworks (MOFs) are porous materials constructed from modular molecular building blocks, typically metal clusters and organic linkers. These can, in principle, be assembled to form an almost unlimited number of MOFs, yet materials reported to date represent only a tiny fraction of the possible combinations. Here, we demonstrate a computational approach to generate all conceivable MOFs from a given chemical library of building blocks (based on the structures of known MOFs) and rapidly screen them to find the best candidates for a specific application. From a library of 102 building blocks we generated 137,953 hypothetical MOFs and for each one calculated the pore-size distribution, surface area and methane-storage capacity. We identified over 300 MOFs with a predicted methane-storage capacity better than that of any known material, and this approach also revealed structure-property relationships. Methyl-functionalized MOFs were frequently top performers, so we selected one such promising MOF and experimentally confirmed its predicted capacity.
The pace of materials discovery for heterogeneous catalysts and electrocatalysts could, in principle, be accelerated by the development of efficient computational screening methods. This would require an integrated approach, where the catalytic activity and stability of new materials are evaluated and where predictions are benchmarked by careful synthesis and experimental tests. In this contribution, we present a density functional theory-based, high-throughput screening scheme that successfully uses these strategies to identify a new electrocatalyst for the hydrogen evolution reaction (HER). The activity of over 700 binary surface alloys is evaluated theoretically; the stability of each alloy in electrochemical environments is also estimated. BiPt is found to have a predicted activity comparable to, or even better than, pure Pt, the archetypical HER catalyst. This alloy is synthesized and tested experimentally and shows improved HER performance compared with pure Pt, in agreement with the computational screening results.
W.T. Hong, R.E. Welsch, S.-H. Yang, J. Phys. Chem. C 120 (2016) 78–86.
Virtual screening is becoming a ground-breaking tool for molecular discovery due to the exponential growth of available computer time and constant improvement of simulation and machine learning techniques. We report an integrated organic functional material design process that incorporates theoretical insight, quantum chemistry, cheminformatics, machine learning, industrial expertise, organic synthesis, molecular characterization, device fabrication and optoelectronic testing. After exploring a search space of 1.6 million molecules and screening over 400,000 of them using time-dependent density functional theory, we identified thousands of promising novel organic light-emitting diode molecules across the visible spectrum. Our team collaboratively selected the best candidates from this set. The experimentally determined external quantum efficiencies for these synthesized candidates were as large as 22%.
Interfaces markedly affect the properties of materials because of differences in their atomic configurations. Determining the atomic structure of an interface is therefore one of the most significant tasks in materials research. However, determining the interface structure usually requires extensive computation. If the interface structure could be efficiently predicted, our understanding of the mechanisms that give rise to interface properties would be significantly advanced, paving the way for the design of material interfaces. Using a virtual screening method based on machine learning, we demonstrate a powerful technique to determine interface energies and structures. On the basis of results obtained by nonlinear regression using training data from 4 interfaces, structures and energies for 13 other interfaces were predicted. Our method achieved an efficiency several hundred to several tens of thousands of times higher than that of previously reported methods. Because the present method uses geometrical factors, such as bond length and atomic density, as descriptors for the regression analysis, it is robust and general and is expected to be beneficial to understanding the nature of any interface.
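A minimal sketch of nonlinear regression on geometric descriptors, in the spirit of the approach above. The cited study's actual regression model is not specified here, so this uses Nadaraya-Watson kernel regression as a simple stand-in; the descriptor values and "interface energies" are invented placeholders.

```python
import math

# Hypothetical training data: (bond_length, atomic_density) -> interface energy.
# All values are illustrative placeholders, not from the cited study.
train_X = [(2.0, 0.8), (2.2, 0.7), (2.5, 0.6), (2.8, 0.5)]
train_y = [1.5, 1.2, 0.9, 0.7]

def predict_energy(x, bandwidth=0.15):
    """Nadaraya-Watson kernel regression: a simple nonlinear estimator
    that maps geometric descriptors to a predicted interface energy."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * bandwidth ** 2))
        for xi in train_X
    ]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)
```

Once fitted on a handful of computed interfaces, such a model can score many candidate interface geometries far faster than recomputing each one from first principles.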
Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.
Manually curating knowledge from biomedical literature into structured databases is highly expensive and time-consuming, making it difficult to keep pace with the rapid growth of the literature. There is therefore a pressing need to assist biocuration with automated text mining tools. Here, we describe PubTator, a web-based system for assisting biocuration. PubTator is different from the few existing tools by featuring a PubMed-like interface, which many biocurators find familiar, and being equipped with multiple challenge-winning text mining algorithms to ensure the quality of its automatic results. Through a formal evaluation with two external user groups, PubTator was shown to be capable of improving both the efficiency and accuracy of manual curation. PubTator is publicly available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/PubTator/.
TRIM-NHL proteins are conserved regulators of development and differentiation but their molecular function has remained largely elusive. Here, we report an as yet unrecognized activity for the mammalian TRIM-NHL protein TRIM71 as a repressor of mRNAs. We show that TRIM71 is associated with mRNAs and that it promotes translational repression and mRNA decay. We have identified Rbl1 and Rbl2, two transcription factors whose down-regulation is important for stem cell function, as TRIM71 targets in mouse embryonic stem cells. Furthermore, one of the defining features of TRIM-NHL proteins, the NHL domain, is necessary and sufficient to target TRIM71 to RNA, while the RING domain that confers ubiquitin ligase activity is dispensable for repression. Our results reveal strong similarities between TRIM71 and Drosophila BRAT, the best-studied TRIM-NHL protein and a well-documented translational repressor, suggesting that BRAT and TRIM71 are part of a family of mRNA repressors regulating proliferation and differentiation.
Paper Information Assisted Extraction Software, November 12, 2019. https://www.mgedata.cn.
Materials discovery has become significantly facilitated and accelerated by high-throughput ab initio computations. This ability to rapidly design interesting novel compounds has shifted the materials-innovation bottleneck to the development of synthesis routes for the desired material. As there is no fundamental theory of materials synthesis, one might attempt a data-driven approach to predicting inorganic materials synthesis, but this is impeded by the lack of a comprehensive database of synthesis processes. To overcome this limitation, we have generated a dataset of …
... At the beginning of the 21st Century, Ceder, who worked at the Massachusetts Institute of Technology (MIT; Cambridge, MA, USA), was influenced by high-throughput, data-driven material discovery methods and inspired by the Human Genome Project. He considered whether material scientists could learn from the experience of geneticists. In 2006, Ceder launched the Materials Genomics project at MIT, using an improved data-mining algorithm to predict lithium-based materials for electric vehicle batteries. By 2010, the project had evolved to include approximately 20,000 predicted compounds [1]. At the same time, at Duke University (Durham, NC, USA), Curtarolo established the Materials Genomics Center, focusing on metal alloy research. His team gradually extended the 2003 algorithm and library [2] into AFLOW (Automatic Flow), a system that can calculate known crystal structures and automatically predict new structures [3]. Despite this, computational materials science was still not valued by experts until the US White House announced hundreds of millions of dollars in funding for the Materials Genome Initiative (MGI) in June 2011 [1]. ...
... Continuous advancement in science depends on shared and repeatable data [36]. Large-scale data analysis is a driving force for progress in materials. Materials science is on the verge of adopting Materials 4.0/fourth-paradigm [37,38] or data-driven [39] discovery based on large-scale data [40]. The quality of data analysis and data mining depends on the data quality of the materials database and directly affects the application of the mined data. Existing databases are far from including all known materials, let alone all possible materials. Data-driven discovery applies to certain materials, but not to all materials. Even when researchers identify a promising material computationally, it can take years to synthesize it in the laboratory [1]. ...
... An early database, the Inorganic Crystal Structure Database [45], created in 2002 and covering the literature from 1915 to 2002, is a comprehensive collection of crystal structure information for non-organic compounds, including inorganics, minerals, ceramics, and metals [46]. In 2015, Wolverton [47] launched the Open Quantum Materials Database (OQMD), also an inorganic crystal structure database, with approximately 50,000 entries drawn from a widely used experimental library [1]. The Cambridge Structural Database (CSD) [48] is a worldwide repository for small-molecule organic and metal-organic crystal structures hosted by the Cambridge Crystallographic Data Centre (CCDC) (https://www.lib.ncsu.edu/databases/cambridge-structural.html [49]). The Materials Project [50,51] is the core program of the Materials Genome Initiative, which uses high-throughput calculations to reveal the properties of all known inorganic materials. These open datasets can be accessed through multiple channels for interactive exploration and data mining. The Materials Project URL is https://www.materialsproject.org [52]. By September 2019, the database contained 120,612 inorganic compounds, 52,366 band structures, 35,336 molecules, 530,243 nanoporous materials, 13,621 elastic tensors, 3003 piezoelectric tensors, 4401 intercalation electrodes, and 16,128 conversion electrodes. ...
... In 2003, Ceder et al. [1] first demonstrated how quantum mechanical computing databases could help predict the most likely crystal structure of metal alloys—a crucial step in the invention of new materials. Ceder introduced the concept of public libraries of material properties and used data mining to fill in missing data. In the second year after the genome project was launched, Curtarolo [53] posted a database based on his software, called AFLOWlib, which is a database with more than one million materials and approximately 100 million calculated properties. AFLOWlib contains such a large amount of data because it also includes thousands of hypothetical materials, many of which exist for only one second in the real world, but Curtarolo used these data to study why some alloys can form metallic glass [54]. OQMD is also a high-throughput database that includes a large number of hypothetical materials, and the entire database can be obtained for free at http://www.oqmd.org/download [47]. ...
... As shown in Fig. 2, machine learning algorithms commonly used in materials science can be divided into four categories: probability estimation [20], regression, clustering, and classification [78]. The main machine learning methods for material performance prediction are regression, clustering, and classification algorithms [20]. Examples include kernel ridge regression [79], artificial neural networks (ANN) [[80], [81], [82]], SVM [83], Gaussian process regression [84], feature learning, clustering, matrix factorization, and constraint reasoning [85], Adaboost [86], and decision trees [87]. Machine learning can also be used to discover new materials, for example, using probability estimation algorithms [88] or principal component analysis [2,89]. ...
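As a concrete instance of the "regression" category listed above, the sketch below fits a one-descriptor ordinary-least-squares model to a toy property. The descriptor (an alloying-element fraction) and property values are invented for illustration, not real measurements; the same closed-form fit underlies the more elaborate regressors cited in the excerpt.

```python
# Minimal ordinary-least-squares regression on a single material descriptor.
# All numbers are illustrative placeholders, not experimental data.
x = [0.1, 0.2, 0.3, 0.4, 0.5]   # e.g. alloying-element fraction
y = [2.0, 2.9, 4.1, 5.0, 6.1]   # e.g. a measured property such as hardness

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Closed-form slope and intercept of the least-squares line.
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

def predict(descriptor):
    """Predict the property for an unseen descriptor value."""
    return intercept + slope * descriptor
```

Classification and clustering follow the same workflow (featurize, fit, predict) with different model families, which is why the four categories in Fig. 2 share a common pipeline.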
... Recently, machine learning algorithms and statistical inference [90] have often been used in combination. They are mainly used to predict various properties of material systems effectively and accurately [91]. In the future, they may build material knowledge bases through machine learning and statistical inference [92]. In addition, machine learning methods can be combined with various intelligent algorithms, such as genetic algorithms (GA) [77], which are mainly used to optimize the parameters of machine learning models. Tzuc et al. [93] developed a genetic programming model for optimizing zeolite materials adsorption. Furthermore, GA can explore large and complex search spaces very efficiently [94]. When the amount of data is limited, data mining [95] methods can be used to determine the significant correlations between the ab initio energies of different structures in different materials [2,21,29]. Recently, the method of material data mining has developed rapidly [13] and has produced an open-source toolkit for materials data mining called Matminer [96]. Lu et al. [97] have used data mining to design layered double hydroxides with the required specific surface area. ...
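The excerpt notes that genetic algorithms are mainly used to optimize machine learning model parameters. A minimal GA for one hyperparameter can be sketched as below; the `validation_loss` function is a toy stand-in (with its minimum placed arbitrarily at 2.0) for a real model's cross-validation error, and the population settings are illustrative.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def validation_loss(h):
    """Toy stand-in for a model's validation loss as a function of one
    hyperparameter h; the true optimum is placed at h = 2.0 for illustration."""
    return (h - 2.0) ** 2

def evolve(pop_size=20, generations=30, mutation=0.1):
    """Simple elitist GA: selection of the best half, averaging crossover,
    Gaussian mutation."""
    pop = [random.uniform(0.0, 5.0) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=validation_loss)
        parents = pop[: pop_size // 2]                  # selection: keep best half
        children = [
            (random.choice(parents) + random.choice(parents)) / 2  # crossover
            + random.gauss(0.0, mutation)                          # mutation
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return min(pop, key=validation_loss)
```

Because each candidate evaluation is independent, this kind of search parallelizes well, which is one reason GAs explore large hyperparameter spaces efficiently [94].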
... Predicting and characterizing the crystal structure of materials is a key issue in materials research and development [2,101,102]. However, the accuracy of the prediction depends largely on how the crystals are represented [103]. Faber et al. [79] introduced and evaluated a set of feature vector representations of crystal structures for machine learning models of the formation energies of solids. Faber et al. [104] used a machine learning model to explore all possible elpasolite crystals. Data mining was used to establish the structural design rules of crystal chemistry. Combined with first-principles calculations, statistical inference can be used as a tool to significantly accelerate the prediction of unknown crystal structures [105]. Machine learning models can quickly predict the properties of crystalline materials [8]. Furthermore, a machine learning model can also predict point defect properties of crystal structures, which was the first application of machine learning in this field [106]. ...
... As highlighted by the MGI, theory and simulation can enhance the efficiency of materials discovery and design. On the other hand, a purely theoretical and simulation-based approach to materials discovery and design is often difficult to realize experimentally due to complications in synthesis, the presence of defects, and possible metastable states. To bridge this gap, all aspects of theory and simulation must be tightly integrated with experiments and data [4]. High-throughput density functional theory has become a universal tool for material discovery [[5], [6], [7]], but its limitation is that computational costs are high [8]. The need now is to overcome this limitation by taking advantage of the latest advances in data and information science [9]. ...
... Harnessing Big Data, deep data, and smart data from state-of-the-art imaging will accelerate the design and discovery of advanced materials. This information is spatially distributed and often has a complex multidimensional nature; hence, the use of unsupervised statistical learning, decorrelation, clustering, and visualization techniques, generally referred to as Big Data approaches, is the first step toward harnessing these data. The correct use of Big Data, or a large amount of data that can be measured and simulated, can serve as a bridge between theory and functional imaging. It can further be extended to what is called deep data by fusing scientific knowledge of the physics and chemistry of the system into Big Data analysis. Interactive knowledge discovery of big and deep data will be made possible by machine learning technologies implemented interactively at all stages of the scientific discovery process, from instrument operation to data analysis [4]. Fig. 7 shows the specific process. ...
... Imaging techniques (top panels) allow direct measurements of atomic positions and hence bond lengths and angles, as well as local functionalities including chemical states, dielectric properties, and the superconducting gap; they visualize atomic, molecular, and defect dynamics in real time and offer possibilities for controlling matter at the molecular and atomic level. In parallel, theoretical methods (bottom panels) allow detailed studies of the atomic and electronic structure and dynamics of matter, along with prediction of their properties. However, pathways to bridge theory and experiment are almost entirely lacking. New advances in data analytics and scientific inference are capable of treating large volumes of data/information and hence of linking theory to experiment via microscopic degrees of freedom (middle panels). Examples include efficiently matching imaging information about static structure to theoretical simulations of the same material (1st column) or, similarly, matching the dynamics of oxygen vacancies in materials to corresponding molecular dynamics simulations (3rd column) [4].
To accelerate the design of new materials, recent studies have screened out many types of promising candidate materials [51], including perovskite compounds [83,142], metal-organic frameworks [143,144], catalysts [145,146], light-emitting molecules [147], and thermoelectric materials [[148], [149], [150], [151]]. A large-scale database of quantum mechanical calculations on known materials, combined with machine-learning-driven screening, enables combinatorial screening for new materials in unconstrained composition space and can be applied to a wide range of critical material discoveries [87]. In addition, interface structures and energies can also be predicted through virtual screening [152]. ...
... Machine learning [10] is an interdisciplinary subject lying between computer science and statistics. It is the core of artificial intelligence and data science, and its progress has been driven by the increasing availability of online data and low-cost computing [11]. Although machine learning is a rapidly evolving technology and individual algorithms are bound to become outdated, the importance of using scientific reasoning to find reliable structure, property, and processing models will never be outdated [12]. The types of materials are diverse, and the factors affecting outcomes are complex; the relationships in the data therefore need to be clarified, and machine learning methods are good at discovering and establishing connections among numerous data points. The introduction of machine learning methods is beneficial for understanding and discovering the correlations between experimental parameters and materials performance. ...
... The main processes of machine learning may be divided into data preparation, descriptor selection, algorithm/model selection, model prediction, and model application [13]. By applying this process to material discovery and design, a complete cycle is formed, from experimental process data collection to performance prediction and finally to experimental validation. Machine learning methods provide an efficient toolset for extracting important correlations among material phenomena [14]. Machine learning can classify viral genomes [15] and identify type 2 diabetes [16]. In the field of chemoinformatics, machine learning models can detect outlier samples [17] and analyze chemical data [18]. It can even classify potato bruise levels [19]. It is evident that machine learning methods have attracted the attention of researchers in many fields. The approach has been widely used in biology, medicine, and other research fields. In the field of materials science, some materials researchers have used machine learning methods to predict material properties. ...
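The five-step workflow above can be sketched end to end in a few lines. Everything here is illustrative: the descriptor/property values are invented, the "descriptor selection" simply keeps one assumed-relevant feature, and a 1-nearest-neighbour regressor stands in for the model-selection step.

```python
# A minimal sketch of the workflow: data preparation, descriptor selection,
# model fitting, prediction, and validation. All numbers are illustrative.
samples = [  # (descriptor_a, descriptor_b, measured_property)
    (1.0, 10.0, 5.0), (2.0, 9.0, 7.0), (3.0, 8.0, 9.0),
    (4.0, 7.0, 11.0), (5.0, 6.0, 13.0),
]

# 1. Data preparation: split into training and held-out validation sets.
train, test = samples[:4], samples[4:]

# 2. Descriptor selection: keep only descriptor_a (assumed relevant here).
def features(sample):
    return sample[0]

# 3. Algorithm/model selection: 1-nearest-neighbour regression as the model.
def predict(x):
    nearest = min(train, key=lambda s: abs(features(s) - x))
    return nearest[2]

# 4./5. Model prediction and validation on the held-out sample.
errors = [abs(predict(features(s)) - s[2]) for s in test]
```

In practice each step is far richer (featurization libraries, cross-validation, model zoos), but the cycle from collected data to validated prediction has this same shape.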
... Recently, machine learning algorithms and statistical inference [90] have often been used in combination. They are mainly used to predict various properties of material systems effectively and accurately [91]. In the future, they may build material knowledge bases through machine learning and statistical inference [92]. In addition, machine learning methods can be combined with various intelligent algorithms, such as genetic algorithms (GA) [77], which are mainly used to optimize the parameters of machine learning models. Tzuc et al. [93] developed a genetic programming model for optimizing zeolite materials adsorption. Furthermore, GA can explore large and complex search spaces very efficiently [94]. When the amount of data is limited, data mining [95] methods can be used to determine the significant correlations between the ab initio energies of different structures in different materials [2,21,29]. Recently, the method of material data mining has developed rapidly [13] and has produced an open-source toolkit for materials data mining called Matminer [96]. Lu et al. [97] have used data mining to design layered double hydroxides with the required specific surface area. ...
1
2018
... The main processes of machine learning may be divided into data preparation, descriptor selection, algorithm/model selection, model prediction, and model application [13]. By applying this process to material discovery and design, a complete cycle is fulfilled, from experimental process data collection to performance prediction and finally to experimental validation. Machine learning methods provide an efficient toolset for extracting important correlations among material phenomena [14]. Machine learning can classify viral genomes [15] and identify type 2 diabetes [16]. In the field of chemoinformatics, machine learning models can detect outlier samples [17] and analyze chemical data [18]. It can even perform classification of potato bruise levels [19]. It is evident that machine learning methods have attracted the attention of researchers in many fields. The approach has been widely used in biology, medicine, and other research fields. In the field of materials science, some materials researchers have used machine learning methods to predict material properties. ...
... Recently, material discovery and design using machine learning have received increasing attention and have improved greatly in both time efficiency and prediction accuracy [20,21]. Voyles [22] used machine learning algorithms to improve the quality of material microscope data and extract material information from the enhanced data. Raccuglia et al. [23] used the support vector machine (SVM) approach to predict the success of untested reactions from data collected in failed experiments. Wicker et al. [24] used an SVM with an RBF kernel to predict the crystallinity of molecular materials. Rupp [25] found that combining quantum mechanics (QM) with machine learning could be expected to improve the accuracy of QM. Stanev et al. [26] proposed that a machine learning model could predict the critical temperature of superconductors. A neural network can represent high-dimensional potential energy surfaces in atomistic simulations of materials, and machine-learned interpolation of atomic potential energy surfaces can automatically construct high-precision interatomic potentials [[27], [28], [29]]. The predictive power of machine learning algorithms is also reflected in many material properties, such as sintered density [30], atomization energies [31], band gaps [32,33], and phase transitions [34,35]. ...
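In the spirit of the SVM-with-RBF-kernel crystallinity prediction cited above [24], a minimal classification sketch might look as follows; the descriptors and the binary "crystalline or not" label here are synthetic stand-ins, not real molecular data.

```python
# Hedged sketch: SVM with an RBF kernel classifying synthetic "materials".
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))                          # stand-in descriptors
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1.5).astype(int)    # stand-in label

# Standardization plus a nonlinear RBF decision boundary.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
acc = cross_val_score(clf, X, y, cv=5).mean()
print(f"mean CV accuracy: {acc:.2f}")
```

The RBF kernel is what lets the SVM capture the nonlinear boundary; a linear kernel would fail on this synthetic label by construction.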
... As shown in Fig. 2, machine learning algorithms commonly used in materials science can be divided into four categories: probability estimation [20], regression, clustering, and classification [78]. The main machine learning methods for material performance prediction are regression, clustering, and classification algorithms [20]; examples include kernel ridge regression [79], artificial neural networks (ANN) [[80], [81], [82]], SVM [83], Gaussian process regression [84], feature learning, clustering, matrix factorization, constraint reasoning [85], Adaboost [86], and decision trees [87]. Machine learning can also be used to discover new materials, for example via probability estimation algorithms [88] or principal component analysis [2,89]. ...
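Of the four algorithm families above, clustering is the one that needs no labels at all. As a minimal illustration, k-means can group synthetic "materials" by their descriptor vectors; the data, cluster count, and two-dimensional descriptors are assumptions for illustration only.

```python
# Hedged sketch of the clustering family: k-means on synthetic descriptors.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
# Three synthetic families of materials, each a cloud in descriptor space.
centers = np.array([[0.0, 0.0], [4.0, 4.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(0, 0.5, size=(50, 2)) for c in centers])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster sizes:", np.bincount(km.labels_))
```

With well-separated clouds like these, k-means recovers the three families almost exactly; real materials data are rarely this clean, which is why the review also lists probabilistic and regression methods.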
... Finding new materials with targeted properties has traditionally been guided by intuition and trial and error. At present, with the increasing amount of data available, using machine learning to accelerate the prediction of material properties, and hence the discovery of new materials, has become a new approach. Research on material properties has focused on the relationship between the properties of materials and their microstructure [107,108]. Shi et al. [20] pointed out that machine learning applications to material property prediction can be divided into two categories: macroscopic performance prediction and microscopic property prediction. ...
... Because ANN algorithms, combined with related optimization algorithms, perform outstandingly on regression and classification problems, they are widely used in performance prediction [20]. ...
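A minimal sketch of ANN-based performance prediction follows; the network size, the synthetic "process parameter" inputs, and the target function are illustrative assumptions, not from the cited work.

```python
# Hedged sketch: a small feed-forward ANN mapping process parameters
# to a material property, on synthetic data.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(400, 3))          # stand-in process parameters
y = np.sin(np.pi * X[:, 0]) + 0.5 * X[:, 1]    # stand-in material property

ann = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=3000, random_state=0),
)
ann.fit(X, y)
print("training R^2:", round(ann.score(X, y), 3))
```

In practice the "related optimization algorithms" mentioned above (e.g. genetic algorithms) would tune choices such as `hidden_layer_sizes` rather than leaving them fixed as here.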
... The macroscopic performance of materials depends on their microscopic properties. The application of machine learning to predicting microscopic properties mainly focuses on lattice constants, atomistic simulation, and molecular atomization energy [20]. ...
... Recently, machine learning algorithms and statistical inference [90] have often been used in combination, mainly to predict various properties of material systems effectively and accurately [91]; in the future, they may be used to build material knowledge bases [92]. In addition, machine learning methods can be combined with various intelligent algorithms, such as genetic algorithms (GA) [77], which are mainly used to optimize the parameters of machine learning models. Tzuc et al. [93] developed a genetic programming model for optimizing adsorption by zeolite materials. Furthermore, GA can explore large and complex search spaces very efficiently [94]. When the amount of data is limited, data mining [95] methods can be used to determine the significant correlations between the ab initio energies of different structures in different materials [2,21,29]. Recently, materials data mining has developed rapidly [13] and has produced an open-source toolkit, Matminer [96]. Lu et al. [97] used data mining to design layered double hydroxides with a required specific surface area. ...
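The GA-tunes-the-model-parameters idea above can be sketched with a toy genetic algorithm searching the hyperparameter space of a kernel ridge model; the selection/mutation operators, the kernel ridge target, and the synthetic data are all illustrative choices, not the cited authors' implementations.

```python
# Hedged sketch: a toy GA optimizing kernel ridge hyperparameters.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(120, 2))
y = np.sin(3 * X[:, 0]) * np.cos(2 * X[:, 1])

def fitness(genome):
    # genome = (log10 alpha, log10 gamma); fitness = mean CV R^2.
    alpha, gamma = 10.0 ** genome[0], 10.0 ** genome[1]
    model = KernelRidge(kernel="rbf", alpha=alpha, gamma=gamma)
    return cross_val_score(model, X, y, cv=3).mean()

pop = rng.uniform(-4, 1, size=(16, 2))           # random initial population
for generation in range(10):
    scores = np.array([fitness(g) for g in pop])
    parents = pop[np.argsort(scores)[-8:]]       # selection: keep best half
    children = parents + rng.normal(0, 0.3, parents.shape)  # mutation
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(g) for g in pop])]
print("best log10(alpha), log10(gamma):", np.round(best, 2))
```

Searching in log space is the key design choice here: regularization and kernel-width parameters vary over orders of magnitude, so additive mutation on the exponent explores the space far more evenly than mutation on the raw values.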
... Data can come from calculations [41], literature [42], hypotheses, experiments, and even failed experiments [23]. Besides, new data can also be generated by interpolation of existing data [43]. The amount of materials science data is not particularly large, but the diversity of data types and the complexity of the objects they describe make the data challenging to explore [44]. ...
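The interpolation route to new data mentioned above can be as simple as densifying a measured curve; the property-vs-temperature series below is hypothetical, used only to show the mechanics.

```python
# Hedged sketch: generating "new" data points by linear interpolation
# of a hypothetical measured property-vs-temperature series.
import numpy as np

temps = np.array([300.0, 400.0, 500.0, 600.0])   # measured temperatures (K)
prop = np.array([1.00, 0.92, 0.81, 0.67])        # measured property values

fine_temps = np.linspace(300.0, 600.0, 31)       # 10 K grid
fine_prop = np.interp(fine_temps, temps, prop)   # interpolated "new" data
print(fine_prop[15])                             # interpolated value at 450 K
```

Interpolated points carry no new physics, of course; they only let a model train on a denser grid between trusted measurements.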
... Lilienfeld et al. [131] introduced a machine learning model based on nuclear charges and atomic positions to quickly and accurately predict the atomization energies of various organic molecules and proved its applicability to predicting molecular atomization potential energy curves. Hansen et al. [132] used machine learning to predict molecular atomization energies, which can significantly accelerate the calculation of quantum chemical properties while maintaining high prediction accuracy. Rupp [25] used kernel ridge regression to predict the atomization energies of small organic molecules. Hansen et al. [133] used machine learning (the so-called Bag of Bonds model, BoB) to estimate molecular atomization energies and predict the exact electronic properties of molecules; Fig. 5 shows that BoB, with proper regularization, is a stronger machine learning model whose accuracy improves simply by expanding the molecular database. Moreover, Pilania et al. [33] and Isayev et al. [134] predicted the bandgap energy. ...
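The kernel ridge workflow used for atomization energies [25] can be sketched as below. The real studies use descriptors derived from nuclear charges and atomic positions (e.g. the Coulomb matrix); here the descriptor vectors and "energies" are random stand-ins, so only the model-selection workflow is illustrated.

```python
# Hedged sketch: kernel ridge regression with cross-validated
# hyperparameter search, on stand-in molecular descriptors.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(5)
X = rng.normal(size=(150, 4))      # stand-in molecular descriptors
y = X @ rng.normal(size=4)         # stand-in atomization energies

# Regularization strength and kernel width chosen by grid search,
# as is typical for kernel ridge models on quantum-chemistry data.
search = GridSearchCV(
    KernelRidge(kernel="rbf"),
    {"alpha": [1e-4, 1e-2, 1.0], "gamma": [1e-2, 1e-1, 1.0]},
    cv=5,
)
search.fit(X, y)
print("best params:", search.best_params_)
```

The Fig. 5 observation above — accuracy improving simply by growing the database — is characteristic of kernel methods like this: with more training molecules, the kernel expansion covers chemical space more densely.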
... Machine learning should be viewed as the sum of the organization of the initial datasets, the descriptor-creation and algorithm-learning steps, and the subsequent steps needed for targeted entry of new data; ultimately, the expert recommendation system can be improved continuously and adaptively [43]. Material data are made up of datasets, and dataset uncertainties also have a positive impact on machine learning model predictions [75]. Therefore, Bhat et al. [76] established a set of general rules for describing material data that will aid machine learning and decision-making. Material performance can be predicted through machine learning: the material is first "fingerprinted" (also called generating a descriptor or feature vector), then a mapping is learned between the descriptor and the property of interest, and finally the performance is predicted [31,43]. Whatever material is studied, machine learning requires the existence of past data. In machine learning algorithms, the input is generally a component or process parameter of the material, and the output is generally the property or properties of interest [77]. Fig. 1 shows the specific process of machine learning [43]; in Fig. 1, N and M are respectively the number of training examples and the number of fingerprint components. ...
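The fingerprinting step described above can be made concrete with a small sketch: each material is mapped to a fixed-length descriptor vector before any learning happens. The element vocabulary and the fractional-composition encoding below are illustrative assumptions, not a scheme from the cited works.

```python
# Hedged sketch of fingerprinting: map each material to an M-component vector.
ELEMENTS = ["Al", "Cu", "Mg", "Zn", "Si"]   # assumed descriptor vocabulary

def fingerprint(composition):
    """Map {element: atom count} to an M-component fractional vector."""
    total = sum(composition.values())
    return [composition.get(el, 0) / total for el in ELEMENTS]

# N training examples -> an N x M fingerprint matrix, matching the N and M
# of Fig. 1.
alloys = [{"Al": 95, "Cu": 4, "Mg": 1}, {"Al": 90, "Zn": 6, "Mg": 3, "Cu": 1}]
X = [fingerprint(c) for c in alloys]
print(X[0])  # → [0.95, 0.04, 0.01, 0.0, 0.0]
```

The point of a fixed vocabulary is that every material, whatever its formula, lands in the same M-dimensional space, so a single model can map fingerprints to the property of interest.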
1
2016
... Recently, material discovery and design using machine learning have received increasing attention and have been greatly improved from the standpoints of time efficiency and prediction accuracy [20,21]. Voyles [22] used machine learning algorithms to improve the quality of material microscope data and extract material information from enhanced data. Raccuglia et al. [23] used the support vector machine (SVM) approach to predict the feasibility of untested responses to data collected in failed experiments. Wicker et al. [24] used the SVM algorithm with an RBF kernel to predict the crystallinity of molecular materials. Rupp [25] found that combining quantum mechanics (QM) with machine learning could be expected to improve the accuracy of QM. Stanev et al. [26] proposed that a machine learning model could simulate the critical temperature of superconductors. A neural network can represent high-dimensional potential energy surfaces when the atomistic materials are simulated, and machine-learned interpolation of atomic potential energy surfaces can automatically construct high-precision atomic interaction potentials [[27], [28], [29]]. The predictive power of machine learning algorithms was also reflected in many aspects of material properties, such as sintered density [30], atomization energies [31], band gaps [32,33], and phase transitions [34,35]. ...
2
2016
... Recently, material discovery and design using machine learning have received increasing attention and have been greatly improved from the standpoints of time efficiency and prediction accuracy [20,21]. Voyles [22] used machine learning algorithms to improve the quality of material microscope data and extract material information from enhanced data. Raccuglia et al. [23] used the support vector machine (SVM) approach to predict the feasibility of untested responses to data collected in failed experiments. Wicker et al. [24] used the SVM algorithm with an RBF kernel to predict the crystallinity of molecular materials. Rupp [25] found that combining quantum mechanics (QM) with machine learning could be expected to improve the accuracy of QM. Stanev et al. [26] proposed that a machine learning model could simulate the critical temperature of superconductors. A neural network can represent high-dimensional potential energy surfaces when the atomistic materials are simulated, and machine-learned interpolation of atomic potential energy surfaces can automatically construct high-precision atomic interaction potentials [[27], [28], [29]]. The predictive power of machine learning algorithms was also reflected in many aspects of material properties, such as sintered density [30], atomization energies [31], band gaps [32,33], and phase transitions [34,35]. ...
... Lilienfeld et al. [131] introduced a machine learning model based on nuclear charges and atomic positions to quickly and accurately predict the atomization energy of various organic molecules and proved its applicability for prediction of molecular atomization potential energy curves. Hansen et al. [132] used machine learning to predict molecular atomization energies, which can significantly accelerate the calculation of quantum chemical properties while maintaining high prediction accuracy. Rupp [25] used kernel ridge regression to predict the atomization energies of small organic molecules. Hansen et al. [133] used machine learning (the so-called Bag of Bonds model; BoB) to estimate molecular atomization energies and predict the exact electronic properties of molecules, Fig. 5 shows that BoB is a stronger machine learning model with proper regularization and improved accuracy by simply expanding the molecular database. Moreover, Pilania et al. [33] and Isayev et al. [134] predicted the bandgap energy. ...
... Continuous advancement in science depends on shared and reproducible data [36]. Large-scale data analysis is a driving force for progress in materials. Materials science is on the verge of adopting Materials 4.0/fourth-paradigm [37,38], or data-driven [39], discovery on large-scale data [40]. The quality of the data in materials databases determines the quality of data analysis and data mining and, in turn, how useful the mined results are. Existing databases are far from including all known materials, not to mention all possible materials, so data-driven discovery applies to certain materials but not to all. Even when an interesting material is identified on a computer, it can take years to synthesize it in the laboratory [1]. ...
... Data can come from calculations [41], the literature [42], hypotheses, experiments, and even failed experiments [23]. In addition, new data can be generated by interpolation of existing data [43]. The amount of materials science data is not particularly large, but the diversity of data types and the complexity of the objects make it challenging to explore [44]. ...
... In addition, there are the computational materials repository [61], computational materials property databases (www.synthesisproject.org) [42], MatNavi materials databases [62], the materials data facility [63], interatomic potential (force field) databases [64], the Superconducting Material Database (SuperCon) [65], the structural materials database (The Materials Commons) [66], and many others. ...
... At present, the scale of the data available for machine learning in the materials field is still very small, making it impossible to generate rigorous predictions of material properties or discoveries of new materials through machine learning. The predicted results are often only close to the probable values of the real data and cannot yet be used to guide experiments reliably. Some experts have proposed ways to adapt machine learning to small datasets; for example, adding a rough property estimate to the feature space when building a model from a small materials dataset can improve prediction accuracy [153]. However, even with the gradual development of machine learning and deep learning algorithms, experience with each class of model shows that dataset size remains the main factor limiting prediction accuracy. This problem affects not only classical machine learning algorithms but also deep learning, which has attracted much attention in recent years. Over the last few decades, it has become clear that material data are recorded relatively completely in a large body of scientific papers, and literature data extraction has made progress in several specific fields, such as chemistry [154] and biomedicine [[155], [156], [157]]. Some studies using these literature data have already been carried out. For example, a software package for information-assisted extraction from papers has been developed and is available online (https://www.mgedata.cn/) [158]. Xie et al. [159] designed the compositions of high-performance copper alloys using machine learning based on data from the literature. Edward et al. [42] obtained material synthesis information from academic publications. Kononova et al. [160] obtained inorganic material synthesis recipes by text-mining scientific publications.
However, data from the literature are rarely used by materials scientists, and even when they are, the data scale is small because there is no mature method for extracting data from the materials literature. So far, no relatively complete corpus has been developed in the materials field to support such applications. Therefore, in the materials genome initiative, material prediction through machine learning should focus not only on the machine learning algorithms themselves but also on extracting valuable data from the materials science literature: first classifying and pre-processing the literature data, then using machine learning algorithms to predict the performance data of interest, and finally finding the correlations among material composition, process, structure, and performance. These steps will play a key role in discovering new material properties. ...
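A first step toward the literature extraction advocated above is pattern-based mining of composition-property pairs from text. The sketch below is a deliberately naive illustration on invented sentences; the regular expression and the example data are assumptions, and real pipelines (e.g., those in [154,160]) use trained named-entity recognizers rather than a single regex.

```python
import re

# Hypothetical sentences mimicking materials-literature text
sentences = [
    "The yield strength of the Al-5Cu alloy reached 320 MPa after aging.",
    "Annealed Mg-3Zn showed a tensile strength of 245 MPa.",
    "No mechanical data were reported for this composite.",
]

# Naive pattern: an alloy-like token, then the first "<number> MPa" after it
pattern = re.compile(r"([A-Z][a-z]?(?:-\d*[A-Z][a-z]?\w*)+).*?(\d+(?:\.\d+)?)\s*MPa")

records = []
for s in sentences:
    m = pattern.search(s)
    if m:
        records.append({"composition": m.group(1), "value_MPa": float(m.group(2))})

print(records)
```

Even this toy extractor shows the key difficulty: sentences without the expected pattern silently yield nothing, which is why a curated corpus for materials science would matter.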
... Machine learning should be viewed as the sum of the organization of the initial datasets, the descriptor-creation and algorithm-learning steps, and the subsequent steps needed to target new data. Ultimately, such an expert recommendation system can be improved continuously and adaptively [43]. Material data are made up of datasets, and dataset uncertainties also have a positive impact on machine learning model predictions [75]. Therefore, Bhat et al. [76] established a set of general rules for describing material data that aid machine learning and decision-making. Material performance can be predicted through machine learning: the material is first "fingerprinted" (also called generating a descriptor or feature vector), a mapping is then learned between the descriptor and the property of interest, and finally the performance is predicted [31,43]. Whatever material is studied, a precondition for machine learning is the existence of past data. In machine learning algorithms, the input is generally the composition or process parameters of the material, and the output is generally the property or properties of interest [77]. Fig. 1 shows the specific machine learning workflow [43]; in Fig. 1, N and M are, respectively, the number of training examples and the number of fingerprint components. ...
... The machine learning workflow in materials science. (a) Schematic view of an example data set; (b) statement of the learning problem; (c) creation of a surrogate prediction model via the fingerprinting and learning steps [43].
As shown in Fig. 2, machine learning algorithms commonly used in materials science can be divided into four categories: probability estimation [20], regression, clustering, and classification [78]. The main machine learning methods for material performance prediction are regression, clustering, and classification algorithms [20]. Examples include kernel ridge regression [79], artificial neural networks (ANN) [[80], [81], [82]], SVM [83], Gaussian process regression [84], feature learning, clustering, matrix factorization, and constraint reasoning [85], Adaboost [86], and decision trees [87]. Machine learning can also be used to discover new materials, for example, using probability estimation algorithms [88] or principal component analysis [2,89]. ...
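The fingerprint-to-property workflow of Fig. 1 can be sketched end to end as follows. The fingerprints (N examples × M components) and the target property are synthetic, and the random forest is just one of the regression algorithms listed above; none of the numbers come from the cited studies.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)

N, M = 300, 4                    # N training examples, M fingerprint components
X = rng.uniform(size=(N, M))     # hypothetical fingerprints (composition/process)
# Stand-in property: depends on two fingerprint components, plus noise
y = 2.0 * X[:, 0] - X[:, 1] + 0.05 * rng.normal(size=N)

# Learning step: fit a surrogate model mapping fingerprint -> property
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:250], y[:250])

# Prediction step: evaluate on held-out fingerprints
r2 = model.score(X[250:], y[250:])
print(f"R^2 on held-out fingerprints: {r2:.2f}")
```

Swapping in kernel ridge regression, an ANN, or an SVM changes only the `model` line; the fingerprint/learn/predict structure of Fig. 1 stays the same.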
... An early resource, the Inorganic Crystal Structure Database [45], created in 2002 and covering the literature from 1915 to 2002, is a comprehensive collection of crystal-structure information for non-organic compounds, including inorganics, minerals, ceramics, and metals [46]. In 2015, Wolverton [47] launched the Open Quantum Materials Database (OQMD), also an inorganic crystal structure database, with approximately 50,000 entries drawn from a widely used experimental library [1]. The Cambridge Structural Database (CSD) [48] is a worldwide repository for small-molecule organic and metal-organic crystal structures hosted by the Cambridge Crystallographic Data Centre (CCDC) (https://www.lib.ncsu.edu/databases/cambridge-structural.html [49]). The Materials Project [50,51] is the core program of the Materials Genome Initiative; it uses high-throughput calculations to reveal the properties of all known inorganic materials. These open datasets can be accessed through multiple channels for interactive exploration and data mining; the Materials Project URL is https://www.materialsproject.org [52]. By September 2019, the database contained 120,612 inorganic compounds, 52,366 band structures, 35,336 molecules, 530,243 nanoporous materials, 13,621 elastic tensors, 3003 piezoelectric tensors, 4401 intercalation electrodes, and 16,128 conversion electrodes. ...
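Data mining over such databases usually starts with programmatic filtering of entries. The sketch below screens a handful of hypothetical records by band gap; the field names and values are illustrative only (not the actual Materials Project schema or API), and a real query would go through the database's own interface.

```python
import json

# Hypothetical records in the style of a computational materials database entry
raw = """[
 {"formula": "GaAs", "band_gap_eV": 1.42, "n_elements": 2},
 {"formula": "Si",   "band_gap_eV": 1.12, "n_elements": 1},
 {"formula": "Cu",   "band_gap_eV": 0.0,  "n_elements": 1},
 {"formula": "TiO2", "band_gap_eV": 3.2,  "n_elements": 2}
]"""

entries = json.loads(raw)
# Screen for semiconductors in a target band-gap window
hits = [e["formula"] for e in entries if 1.0 <= e["band_gap_eV"] <= 2.0]
print(hits)
```

The same filter-then-collect pattern scales from four records to the hundred thousand compounds quoted above.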
... In 2003, Ceder et al. [1] first demonstrated how databases of quantum mechanical computations could help predict the most likely crystal structure of metal alloys, a crucial step in the invention of new materials. Ceder introduced the concept of public libraries of material properties and used data mining to fill in missing data. In the second year after the genome project was launched, Curtarolo [53] posted AFLOWlib, a database generated by his software, containing more than one million materials and approximately 100 million calculated properties. AFLOWlib is so large because it also includes thousands of hypothetical materials, many of which would exist for only a second in the real world; Curtarolo nevertheless used these data to study why some alloys can form metallic glass [54]. OQMD is also a high-throughput database that includes a large number of hypothetical materials, and the entire database can be downloaded for free at http://www.oqmd.org/download [47]. ...
... To accelerate the design of new materials, recent studies have screened out many types of promising candidate materials [51], including perovskite compounds [83,142], metal-organic frameworks [143,144], catalysts [145,146], light-emitting molecules [147], and thermoelectric materials [[148], [149], [150], [151]]. With machine learning models trained on large-scale quantum-mechanical databases of known materials, candidate screening becomes a combinatorial search over an unconstrained composition space and can support a wide range of critical material discoveries [87]. In addition, interface structures and energies can also be predicted through virtual screening [152]. ...
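The virtual-screening loop described above (train a surrogate on known materials, then rank a large pool of untested candidates) can be sketched as follows. The descriptors, the "true" property, and the ridge-regression surrogate are synthetic assumptions standing in for the quantum-mechanical data and models of the cited studies.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)

# Surrogate trained on "known" materials (synthetic descriptors and property)
X_known = rng.uniform(size=(200, 5))
w_true = np.array([1.5, -0.5, 0.0, 2.0, 0.3])   # hidden structure-property rule
y_known = X_known @ w_true

model = Ridge(alpha=1e-3).fit(X_known, y_known)

# Score a large pool of untested candidates and keep the most promising
X_pool = rng.uniform(size=(10_000, 5))
scores = model.predict(X_pool)
top10 = np.argsort(scores)[::-1][:10]           # indices of best candidates
print(np.round(scores[top10], 3))
```

Only the top-ranked candidates would then be passed to expensive first-principles calculations or synthesis, which is where the time savings of screening come from.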
... The maturity of high-throughput first-principles calculations, along with the rise of virtual material screening, has created many accessible databases for the materials science community, including Citrination Platform [55], large molecular libraries [56], and the Harvard Clean Energy Project Database (CEPDB) [57]. ...
... MatWeb is the world's largest database of material properties, with over 130,000 entries covering metals, plastics, ceramics, and composites [58]. Total Materia is the world's most comprehensive materials database, based on 74 international standards and offering more than 15,000,000 materials [59]. ASM International is the world's largest and most established materials information society, connecting its members to a global network of peers and providing access to trusted materials information [60]. ...
1
2012
... In addition, there are the computational materials repository [61], computational materials property databases (www.synthesisproject.org) [42], MatNavi materials databases [62], the materials data facility [63], interatomic potential (force field) databases [64], the Superconducting Material Database (SuperCon) [65], the structural materials database (The Materials Commons) [66], and many others. ...
... The Air Force Research Laboratory has developed an integrated collaborative environment that connects laboratory equipment through a laboratory intranet to serve the internal needs of large material laboratories [67]. ...
... Microstructure-informed cloud computing provides a practical platform for interoperability of material datasets and data analytics tools [68]. An online computation platform for materials data mining has also been developed [69]. AiiDA provides an automated interactive infrastructure and database for computational workflows in materials science [70]. MatCloud is a high-throughput computing infrastructure for the comprehensive management of materials simulation, data, and resources [71]. In addition, the VNL software package provides a graphical environment for first-principles simulations [72]. ...
... Standard tools cannot directly reveal the structure of Big Data in materials science; therefore, new methods are needed to discover the laws underlying these data [73]. The new era of Big Data science and analytics offers a rather compelling approach: machine learning to help with the discovery, design, and optimization of novel materials. Machine learning has developed rapidly, and its long-term goal is algorithms that learn continuously and grow more proficient. For example, AlphaGo Zero, trained by reinforcement learning without human knowledge, defeated AlphaGo, demonstrating the importance of algorithms that learn and improve continuously [74]. ...
... Machine learning should be viewed as the sum of the organization of the initial datasets, the descriptor creation and algorithm learning steps, and the necessary subsequent steps for targeted new data entry. Ultimately, the expert recommendation system can be improved continuously and adaptively [43]. Material data are made up of datasets, and accounting for dataset uncertainties can also improve machine learning model predictions [75]. Therefore, Bhat et al. [76] established a set of general rules for describing material data that will aid machine learning and decision-making. Material performance can be predicted through machine learning: the material is first “fingerprinted” (also called generating a descriptor or a feature vector), a mapping is then learned between the descriptor and the property of interest, and finally the performance is predicted [31,43]. Whatever material is studied, machine learning presupposes the existence of past data. In machine learning algorithms, the input is generally a component or process parameter of the material, and the output is generally the property or properties of interest [77]. Fig. 1 shows the specific process of machine learning [43]. In Fig. 1, N and M are, respectively, the number of training examples and the number of fingerprint components. ...
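As a minimal sketch of this fingerprint-to-property pipeline, the example below maps N invented fingerprints of M components to a scalar property with kernel ridge regression (one of the algorithms commonly used in materials informatics). The data, kernel width, and target weights are synthetic placeholders, not values from any cited study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented dataset: N training examples, each an M-component fingerprint
# (e.g. composition fractions), mapped to a scalar property of interest.
N, M = 50, 4
X = rng.random((N, M))
y = X @ np.array([2.0, -1.0, 0.5, 3.0]) + 0.01 * rng.standard_normal(N)

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between two sets of fingerprints."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Kernel ridge regression has the closed form alpha = (K + lambda*I)^-1 y.
lam = 1e-3
alpha = np.linalg.solve(rbf_kernel(X, X) + lam * np.eye(N), y)

def predict(X_new):
    """Predict the property for new fingerprints."""
    return rbf_kernel(X_new, X) @ alpha

rmse = float(np.sqrt(np.mean((predict(X) - y) ** 2)))
print(round(rmse, 4))
```

The closed-form solve makes the descriptor-to-property mapping explicit: once the kernel and regularization strength are chosen, training reduces to one linear system.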
... Other applications exist for macroscopic performance prediction. Seko et al. [124] used four regression algorithms to predict the melting temperatures of single and binary compounds. Jha et al. [75] predicted polymer glass transition temperatures. Mauro et al. [125] predicted elastic moduli and compressive stress of glass. ...
... Recently, machine learning algorithms and statistical inference [90] have often been used in combination, mainly to predict various properties of material systems effectively and accurately [91]. In the future, such approaches may be used to build material knowledge bases [92]. In addition, machine learning methods can be combined with various intelligent algorithms, such as genetic algorithms (GA) [77], which are mainly used to optimize the parameters of machine learning models. Tzuc et al. [93] developed a genetic programming model for optimizing adsorption in zeolite materials. Furthermore, GA can explore large and complex search spaces very efficiently [94]. When the amount of data is limited, data mining [95] methods can be used to determine significant correlations between the ab initio energies of different structures in different materials [2,21,29]. Recently, materials data mining has developed rapidly [13] and has produced an open-source toolkit for materials data mining called Matminer [96]. Lu et al. [97] used data mining to design layered double hydroxides with the required specific surface area. ...
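A minimal sketch of how a GA can optimize a model parameter: the fitness function below is a hypothetical stand-in for the (negative) validation error of a machine learning model, assumed here to be best when the parameter equals 3.0. Population size, mutation scale, and bounds are illustrative choices.

```python
import random

random.seed(42)

# Hypothetical objective: negative "validation error", maximized at x = 3.0.
def fitness(x):
    return -(x - 3.0) ** 2

def evolve(pop_size=30, generations=60, bounds=(0.0, 10.0)):
    """Simple real-coded GA: tournament selection, arithmetic crossover,
    Gaussian mutation, with candidates clipped to the search bounds."""
    pop = [random.uniform(*bounds) for _ in range(pop_size)]
    for _ in range(generations):
        # Tournament selection: best of three random candidates.
        parents = [max(random.sample(pop, 3), key=fitness)
                   for _ in range(pop_size)]
        children = []
        for i in range(0, pop_size, 2):
            a, b = parents[i], parents[i + 1]
            w = random.random()
            for child in (w * a + (1 - w) * b, (1 - w) * a + w * b):
                child += random.gauss(0.0, 0.1)          # mutation
                children.append(min(max(child, bounds[0]), bounds[1]))
        pop = children
    return max(pop, key=fitness)

best = evolve()
print(round(best, 2))
```

In a real hyperparameter search, `fitness` would retrain the model with the candidate parameter and return its cross-validation score; the GA machinery is unchanged.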
... ANNs have been widely used in many fields of research [113], perhaps because they attempt to model the functioning of the human brain [114]. For example, ANNs have become a popular tool for Al-Si alloy performance prediction [[115], [116], [117]]. ANNs have been developed to predict the porosity percentage of Al-Si casting alloys and to correlate chemical composition and cooling rate with porosity [118,119]. An ANN has accurately predicted the corrosion resistance of Al-Si-Mg-based metal matrix composites reinforced with SiC particles, with the average square of the Pearson product-moment correlation coefficient (R2), the maximum mean square error, and the minimum root mean squared deviation calculated as 0.9904, 0.00002476, and 0.00157480, respectively; the experimental results are thus highly consistent with the ANN results [120]. ANNs have also been used to predict the mechanical properties of A356, including yield stress, ultimate tensile strength, maximum force, and elongation percentage; the predictions of the ANN model were found to be in good agreement with experimental data [121,122]. In addition, ANN models have been used to investigate the roles of composition and processing parameters in the mechanical properties of microalloyed pipeline steel and to design steel with improved strength, impact toughness, and ductility. These models were then used as objective functions for multi-objective genetic algorithms to evolve tradeoffs among the conflicting objectives of improved strength, better ductility, and higher impact toughness, and the resulting Pareto-optimal solutions were analyzed to study the role of the various parameters in designing pipeline steel with such improved performance [77]. Fig. 4 shows the specific flow chart. ...
... Fig. 4. Flow chart of the computational methodology adopted for designing pipeline steel [77].
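The composition/process-parameter-to-property mapping underlying such ANN studies can be sketched, on entirely synthetic data, as a one-hidden-layer network trained by gradient descent. The input dimensions, target function, and network size below are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a training set: inputs play the role of
# composition/process parameters, the output a mechanical property.
X = rng.random((200, 3))
y = (np.sin(2 * X[:, 0]) + X[:, 1] ** 2 - 0.5 * X[:, 2])[:, None]

# One-hidden-layer network (16 tanh units), full-batch gradient descent.
W1 = rng.standard_normal((3, 16)) * 0.5
b1 = np.zeros(16)
W2 = rng.standard_normal((16, 1)) * 0.5
b2 = np.zeros(1)

lr = 0.2
for _ in range(5000):
    H = np.tanh(X @ W1 + b1)            # forward pass
    pred = H @ W2 + b2
    err = pred - y
    gW2 = H.T @ err / len(X)            # backward pass (MSE loss)
    gb2 = err.mean(0)
    gH = (err @ W2.T) * (1 - H ** 2)    # tanh derivative
    gW1 = X.T @ gH / len(X)
    gb1 = gH.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

rmse = float(np.sqrt(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2)))
print(round(rmse, 3))
```

A trained network of this shape is exactly what serves as the objective function when a multi-objective GA searches the composition space, as in the pipeline-steel study above.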
Mannodi-Kanakkithodi et al. [123] used a combination of statistical learning and genetic algorithms to predict polymer dielectrics. Within the dataset, the model can not only determine the corresponding properties of any polymer, but can also actively find the specific polymer that meets given requirements. However, because the model drew on only seven chemical building blocks, the guidance it provides is limited to that design space. ...
... As shown in Fig. 2, machine learning algorithms commonly used in materials science can be divided into four categories: probability estimation [20], regression, clustering, and classification [78]. The main machine learning methods for material performance prediction are regression, clustering, and classification algorithms [20]. Examples include kernel ridge regression [79], artificial neural networks (ANN) [[80], [81], [82]], support vector machines (SVM) [83], Gaussian process regression [84], feature learning, clustering, matrix factorization, and constraint reasoning [85], AdaBoost [86], and decision trees [87]. Machine learning can also be used to discover new materials, for example, using probability estimation algorithms [88] or principal component analysis [2,89]. ...
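As a toy instance of the classification category, a nearest-neighbor classifier can label hypothetical compounds from a two-component descriptor. The descriptors and labels below are invented, not taken from any cited dataset.

```python
import numpy as np

# Invented descriptors: two components per compound (e.g. an electronegativity
# difference and a band-gap proxy); class labels metal = 1, insulator = 0.
train_X = np.array([[0.10, 0.00], [0.20, 0.10], [0.15, 0.05],
                    [0.90, 1.20], [1.10, 0.90], [1.00, 1.10]])
train_y = np.array([1, 1, 1, 0, 0, 0])

def knn_predict(x, k=3):
    """Classify by majority vote among the k nearest training descriptors."""
    d = np.linalg.norm(train_X - x, axis=1)
    nearest = train_y[np.argsort(d)[:k]]
    return int(round(nearest.mean()))

print(knn_predict(np.array([0.12, 0.08])))  # lies in the metal cluster
print(knn_predict(np.array([1.05, 1.00])))  # lies in the insulator cluster
```

The same descriptor-plus-vote structure carries over to the more sophisticated classifiers listed above; only the decision rule changes.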
... Predicting and characterizing the crystal structure of materials is a key issue in materials research and development [2,101,102]. However, the accuracy of the prediction depends largely on how the crystals are represented [103]. Faber et al. [79] introduced and evaluated a set of feature-vector representations of crystal structures for machine learning models of the formation energies of solids. Faber et al. [104] also used a machine learning model to predict formation energies across all possible elpasolite crystals. Data mining has been used to establish the structural design rules of crystal chemistry: combined with first-principles calculations, statistical inference can serve as a tool to significantly accelerate the prediction of unknown crystal structures [105]. Machine learning models can quickly predict the properties of crystalline materials [8]. Furthermore, a machine learning model has predicted point defect properties of crystal structures, the first application of machine learning in this field [106]. ...
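A classic representation in this family is the Coulomb matrix, whose sorted eigenvalues give a rotation- and permutation-invariant fingerprint (shown here for a single molecule; the geometry is approximate and purely illustrative).

```python
import numpy as np

def coulomb_matrix(Z, R):
    """Coulomb matrix: off-diagonal Zi*Zj/|Ri-Rj|, diagonal 0.5*Z**2.4
    (the convention of Rupp et al.). Z: atomic numbers, R: coordinates."""
    n = len(Z)
    M = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                M[i, j] = 0.5 * Z[i] ** 2.4
            else:
                M[i, j] = Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])
    return M

def fingerprint(Z, R):
    """Sorted eigenvalues: invariant to rotation and to atom reordering."""
    return np.sort(np.linalg.eigvalsh(coulomb_matrix(Z, R)))[::-1]

# Water as a toy example (approximate coordinates, in angstroms).
Z = np.array([8, 1, 1])
R = np.array([[0.00, 0.00, 0.0], [0.96, 0.00, 0.0], [-0.24, 0.93, 0.0]])
fp = fingerprint(Z, R)

# Reordering the two hydrogens leaves the fingerprint unchanged.
fp_perm = fingerprint(Z[[0, 2, 1]], R[[0, 2, 1]])
print(np.allclose(fp, fp_perm))
```

Such invariances are exactly what distinguishes a good representation in the studies cited above: two descriptions of the same structure must map to the same feature vector.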
... To accelerate the design of new materials, recent studies have screened out many types of promising candidate materials [51], including perovskite compounds [83,142], metal-organic frameworks [143,144], catalysts [145,146], light-emitting molecules [147], and thermoelectric materials [[148], [149], [150], [151]]. Combining a large-scale database of quantum-mechanical calculations on known materials with machine-learning-based screening enables combinatorial searches for new materials in an unconstrained composition space, which can be used for an extensive range of critical material discoveries [87]. In addition, interface structure and energy can also be predicted through virtual screening [152]. ...
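In its simplest form, such virtual screening reduces to ranking enumerated candidates with a trained surrogate model. In the sketch below the surrogate is a hypothetical linear map whose weights would, in a real screen, be learned from a computed database; the grid and weights are invented.

```python
import numpy as np

# Hypothetical surrogate: a linear map from a three-component composition
# descriptor to the target property (placeholder weights, not fitted).
weights = np.array([1.5, -0.8, 2.2])

def surrogate(x):
    return x @ weights

# Enumerate candidate compositions (a, b, 1-a-b) on a coarse grid.
grid = np.linspace(0.0, 1.0, 21)
candidates = np.array([[a, b, 1.0 - a - b]
                       for a in grid for b in grid if a + b <= 1.0])

# Rank every candidate by the predicted property and keep the top five
# for follow-up first-principles calculations or experiments.
order = np.argsort(surrogate(candidates))[::-1]
top5 = candidates[order[:5]]
print(top5[0])
```

The cheap surrogate filters an unconstrained composition space down to a shortlist, which is what makes database-plus-ML screening far faster than exhaustive quantum-mechanical evaluation.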
... High-throughput experimentation studies have produced large, rapidly growing volumes of structured data, making expert manual analysis impractical. Therefore, the materials community has turned to fast machine learning data analysis techniques to convert large amounts of structured data into phase diagrams. Although machine learning promises to provide high-throughput phase diagram determination, its success depends on several factors: prior knowledge, data preprocessing, data representation, similarity or dissimilarity measures, and model choice [85]. Rajan et al. [98] selected principal component analysis, partial least-squares regression, and correlation function expansion to perform data mining on compound semiconductor property phase diagrams. Artrith et al. [99] used a combined ANN-GA algorithm to accelerate the sampling of amorphous and disordered materials and constructed a first-principles phase diagram of an amorphous LiSi alloy; this work showed that the combined machine learning model accelerates first-principles sampling of complex structure spaces, that it is more efficient than an ANN alone, and that the sampling successfully identified low-energy amorphous structures. Spellings et al. [100] used machine learning to discover interesting regions of the parameter space in colloid self-assembly: they created descriptors, located interesting regions of complex phase diagrams without a priori information, and finally used knowledge of the available structures to generate a phase diagram automatically. Fig. 3 shows the phase diagram generated by the Gaussian mixture model. ...
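The Gaussian-mixture step behind such automatic phase-diagram construction can be sketched in one dimension: expectation-maximization separates two synthetic "phases" in the distribution of an order parameter. All data below are invented.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic stand-in for an order parameter measured across a composition
# series: two phases produce two overlapping clusters of values.
data = np.concatenate([rng.normal(0.0, 0.3, 300), rng.normal(2.0, 0.3, 300)])

# Two-component 1-D Gaussian mixture fitted by expectation-maximization.
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])
for _ in range(100):
    # E-step: responsibility of each component for each sample.
    p = pi * np.exp(-(data[:, None] - mu) ** 2 / (2 * var)) \
        / np.sqrt(2 * np.pi * var)
    r = p / p.sum(1, keepdims=True)
    # M-step: re-estimate weights, means, and variances.
    Nk = r.sum(0)
    pi = Nk / len(data)
    mu = (r * data[:, None]).sum(0) / Nk
    var = (r * (data[:, None] - mu) ** 2).sum(0) / Nk

print(np.round(np.sort(mu), 1))
```

Assigning each sample to its most responsible component is what labels the regions of the phase diagram, as in the Gaussian mixture result of Fig. 3.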
... If people can effectively learn from available knowledge and data, the material discovery process can be significantly accelerated and simplified. Using a one-dimensional chain system family, Pilania et al. [91] proposed a general form that can discover decision rules and establish a mapping between easily accessible attributes of a system and its properties. The results show that fingerprints based on either chemical structure (compositional and configurational information) or the electronic charge density distribution can be used for ultra-fast yet accurate property predictions. Harnessing such learning paradigms extends recent efforts to systematically explore huge chemical spaces and can significantly accelerate the discovery of new application-specific materials. Curtarolo et al. [135] introduced new material fingerprint descriptors that produce a material mapping network: nodes represent compounds, and connections represent similarities. The mapping network can identify compounds with different physical and chemical properties. Recently, the t-distributed stochastic neighbor embedding (t-SNE) algorithm was used for the first time to extract feature spaces from ab initio band-structure calculations. This algorithm can map high-dimensional feature spaces to lower dimensions, enabling simultaneous analysis and exploration of previously unknown band structures for thousands of materials. The more information is available in the material database, the more space can be explored [136]. Ramprasad et al. [137] studied a series of motif-based topological fingerprints that can represent the major classes of crystals and molecules numerically. Using a learning algorithm, these fingerprints can be mapped to various properties of crystals and molecules, thereby accelerating property prediction.
Their paper demonstrates the process by simultaneously optimizing two properties: the gap between the highest occupied and lowest unoccupied molecular orbitals (EHL) and the isotropic polarizability (α). The results, shown in Fig. 6 and Table 1, confirm that molecules with the required values of EHL and α were obtained. ...
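The dimensionality-reduction step described above can be sketched with scikit-learn's t-SNE implementation. The 50-dimensional "fingerprints" below are synthetic stand-ins for band-structure-derived feature vectors, not real material data.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Hypothetical material "fingerprints": two clusters of 50-dimensional
# vectors, standing in for band-structure-derived feature vectors.
fingerprints = np.vstack([
    rng.normal(0.0, 0.1, size=(30, 50)),
    rng.normal(1.0, 0.1, size=(30, 50)),
])

# Embed into two dimensions for visual exploration of the materials space.
embedding = TSNE(n_components=2, perplexity=10,
                 random_state=0).fit_transform(fingerprints)
```

Each row of `embedding` is a 2-D coordinate for one material, so similar fingerprints land near each other and can be inspected on a scatter plot.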
... High-throughput experimentation has produced large, rapidly growing volumes of structured data, making expert manual analysis impractical. The materials community has therefore turned to fast machine-learning-based data analysis techniques to convert large amounts of structured data into phase diagrams. Although machine learning promises high-throughput phase diagram determination, its success depends on several factors: prior knowledge, data preprocessing, data representation, similarity or dissimilarity measures, and model choice [85]. Rajan et al. [98] selected principal component analysis, partial least-squares regression, and correlation function expansion to perform data mining on compound semiconductor property phase diagrams. Artrith et al. [99] used a combined ANN and GA algorithm to accelerate sampling of amorphous and disordered materials and constructed a first-principles phase diagram of an amorphous LiSi alloy. Their work showed that this combined machine learning model can accelerate first-principles sampling of complex structure spaces, proved that the combined model is more efficient than an ANN alone, and demonstrated that the sampling successfully identified low-energy amorphous structures. Spellings et al. [100] used machine learning to discover interesting regions of the parameter space in colloid self-assembly: they created descriptors, located interesting regions of complex phase diagrams without a priori information, and finally used knowledge of the available structures to generate a phase diagram automatically. Fig. 3 shows the phase diagram generated by the Gaussian mixture model. ...
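A minimal sketch of the unsupervised clustering behind such a phase diagram, using a Gaussian mixture model on synthetic two-dimensional descriptors (hypothetical stand-ins for structural order parameters measured at each state point):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Hypothetical descriptors for sampled state points: two synthetic "phases"
# with distinct descriptor distributions (e.g., bond-order parameters).
phase_a = rng.normal([0.0, 0.0], 0.1, size=(100, 2))
phase_b = rng.normal([1.0, 1.0], 0.1, size=(100, 2))
X = np.vstack([phase_a, phase_b])

# Fit a two-component GMM and assign each state point to a cluster;
# points sharing a cluster label are interpreted as the same phase.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)
```

As in Fig. 3, the algorithm clusters the distinct structures it finds rather than labeling a known set of phases; mapping `labels` back onto the sampled parameter grid yields the phase diagram.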
... Fig. 3. (a) Shannon entropy (blue line) of the quasicrystal data set as GMM components are successively merged from 15 clusters to one cluster. Merged cluster counts corresponding to (b-d) are indicated by black points. (b-d) Phase diagrams generated by taking the most common predicted cluster type for each parameter point, indicated by the black points in (a). For each selected cluster count, dark gray regions show a poor preference for any single structure among the samples for those parameters. Each type of system, as identified by the GMM, is assigned a different color, but this unsupervised algorithm clusters the distinct structures that it finds rather than labeling a previously identified set of known structures. Phase boundaries generated by manual analysis are included for reference as black lines [100]. ...
... Predicting and characterizing the crystal structure of materials is a key issue in materials research and development [2,101,102]. However, prediction accuracy depends largely on how the crystals are represented [103]. Faber et al. [79] introduced and evaluated a set of feature-vector representations of crystal structures for machine learning models of the formation energies of solids. Faber et al. [104] also screened all possible elpasolite crystals using a machine learning model. Data mining has been used to establish the structural design rules of crystal chemistry, and combined with first-principles calculations, statistical inference can serve as a tool to significantly accelerate the prediction of unknown crystal structures [105]. Machine learning models can quickly predict the properties of crystalline materials [8]. Furthermore, a machine learning model has been used to predict the point defect properties of crystal structures, the first application of machine learning in this field [106]. ...
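To illustrate how a fixed-length feature vector can be mapped to a property such as formation energy, the following sketch fits a closed-form ridge regression to synthetic data. Both the feature vectors (imagined as averaged elemental properties) and the linear "formation energy" target are assumptions for illustration, not a reproduction of the cited models.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical fixed-length feature vectors for 80 crystal structures
# (e.g., averaged elemental properties), with a synthetic linear
# "formation energy" target plus small measurement noise.
X = rng.normal(size=(80, 5))
true_w = np.array([0.5, -1.0, 0.3, 0.0, 0.8])
y = X @ true_w + rng.normal(0, 0.01, size=80)

# Ridge regression in closed form: w = (X^T X + lam I)^{-1} X^T y,
# trained on the first 60 structures and tested on the remaining 20.
lam = 1e-3
w = np.linalg.solve(X[:60].T @ X[:60] + lam * np.eye(5), X[:60].T @ y[:60])
pred = X[60:] @ w
rmse = float(np.sqrt(np.mean((pred - y[60:]) ** 2)))
```

The held-out RMSE approaches the noise level, showing that once a good crystal representation is fixed, even a simple learner can predict the property quickly.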
... An adaptive design strategy, tightly coupled with experiments, can accelerate the discovery process by sequentially identifying the next experiments or calculations needed to navigate the sophisticated search space effectively. The strategy uses inference and global optimization to balance exploitation and exploration of the search space. Oliynyk et al. [138] used real experimental data, selected a suitable machine learning model, and validated its predictions through experiments; many kinds of intermetallic compounds were discovered. The most crucial finding was RhCd, the first new binary AB compound with a CsCl-type structure to be found in over 15 years. Xue et al. [139] demonstrated the approach by finding NiTi-based shape memory alloys with very low thermal hysteresis (ΔT), with Ti50.0Ni46.7Cu0.8Fe2.3Pd0.2 possessing the smallest ΔT (1.84 K). Wolverton et al. [101] used a general machine learning framework to predict the properties of various materials and discovered a metallic glass alloy. Mauro et al. [125] developed new damage-resistant glasses. Ren et al. [140] found metallic glasses through iterations of machine learning. Yuan et al. [141] used a combination of machine learning and optimization methods to accelerate the discovery of new Pb-free BaTiO3-based piezoelectrics with large electrostrains. ...
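The exploitation-exploration alternation of such an adaptive design loop can be sketched as follows. A simple quadratic surrogate stands in for the statistical inference models used in practice, and the 1-D `experiment` function is a hypothetical property landscape whose optimum is unknown to the designer.

```python
import numpy as np

# Hypothetical expensive "experiment": the property to be maximized along a
# 1-D composition axis; its optimum (x = 0.6) is unknown to the designer.
def experiment(x):
    return -(x - 0.6) ** 2

grid = np.linspace(0.0, 1.0, 101)
xs = [0.0, 0.5, 1.0]                 # three initial measurements
ys = [experiment(x) for x in xs]

for step in range(6):
    if step % 2 == 0:
        # Exploit: fit a quadratic surrogate to all measurements so far
        # and run the next experiment at its predicted optimum.
        coeffs = np.polyfit(xs, ys, deg=2)
        x_next = grid[np.argmax(np.polyval(coeffs, grid))]
    else:
        # Explore: run the next experiment at the grid point farthest
        # from every previous measurement.
        dists = np.min(np.abs(grid[:, None] - np.asarray(xs)[None, :]), axis=1)
        x_next = grid[np.argmax(dists)]
    xs.append(float(x_next))
    ys.append(experiment(x_next))

best_x = xs[int(np.argmax(ys))]
```

Each iteration chooses the single most informative next experiment, which is exactly the sequential tradeoff the adaptive design strategies above exploit.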
... Finding new materials with targeted properties has traditionally been guided by intuition and trial and error. Today, with the increasing amount of data available, using machine learning to accelerate the prediction of material properties, and thereby to discover new materials, has become a new approach. Research on material properties has focused on the relationship between the properties of materials and their microstructure [107,108]. Shi et al. [20] pointed out that machine learning applications in material property prediction can be divided into two categories: macroscopic performance prediction and microscopic property prediction. ...
... An ANN is a massively parallel distributed processor that naturally stores experiential knowledge derived from data and makes it available for use [109]. An ANN learns from previously obtained data, called the training set, and the system's accuracy is then checked using test data [110]. ANNs are very powerful at solving nonlinear problems [111]. However, artificial neural networks have the drawback of over-fitting, although Bayesian regularization can alleviate this problem [112]. ...
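A minimal sketch of this train-then-test workflow for a small feed-forward ANN on a nonlinear problem. The sine-wave data are synthetic, and the L2 weight penalty (`alpha`) is used here as a simpler stand-in for the Bayesian regularization discussed in the text.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(3)
# Synthetic nonlinear "property": y = sin(3x) plus measurement noise.
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(0, 0.1, size=200)

# Small feed-forward ANN trained on 150 samples; `alpha` is an L2 weight
# penalty that curbs over-fitting (a stand-in for Bayesian regularization).
net = MLPRegressor(hidden_layer_sizes=(32,), alpha=1e-3, solver="lbfgs",
                   max_iter=2000, random_state=0).fit(X[:150], y[:150])

# Check accuracy on the 50 held-out test samples.
test_rmse = float(np.sqrt(np.mean((net.predict(X[150:]) - y[150:]) ** 2)))
```

The held-out RMSE stays near the noise level rather than near the signal's spread, which is the sign that regularization has kept the network from memorizing the training set.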
... ANNs have been widely used in many fields of research [113], perhaps because they attempt to model the functioning of the human brain [114]. For example, the ANN has become a popular tool for Al-Si alloy performance prediction [[115], [116], [117]]. ANNs have been developed to predict the porosity percentage of Al-Si casting alloys and to correlate chemical composition and cooling rate with porosity [118,119]. An ANN accurately predicted the corrosion resistance of Al-Si-Mg-based metal matrix composites reinforced with SiC particles, with the average squared Pearson product-moment correlation coefficient (R2), the maximum mean square error, and the minimum root mean squared deviation calculated as 0.9904, 0.00002476, and 0.00157480, respectively, showing that the ANN results are highly consistent with the experimental results [120]. An ANN was also used to predict the mechanical properties of A356, including yield stress, ultimate tensile strength, maximum force, and elongation percentage; the predictions of the ANN model were found to be in good agreement with experimental data [121,122]. In addition, ANN models have been used to investigate the role of composition and processing parameters in the mechanical properties of microalloyed pipeline steel and to design steel with improved strength, impact toughness, and ductility. The models were then used as objective functions for multi-objective genetic algorithms to evolve tradeoffs among the conflicting objectives of improved strength, better ductility, and higher impact toughness, and the resulting Pareto-optimal solutions were analyzed to study the role of various parameters in designing pipeline steel with such improved performance [77]. Fig. 4 shows the specific flow chart. ...
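The Pareto-optimal solutions mentioned above can be extracted with a simple nondominated filter. The two objective scores below are random stand-ins for ANN-predicted properties (e.g., strength and impact toughness), not real alloy data.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical candidate alloys scored on two conflicting objectives,
# standing in for the outputs of trained ANN property models.
scores = rng.uniform(0, 1, size=(200, 2))

def pareto_front(points):
    # A point is Pareto-optimal if no other point is >= in both objectives
    # and strictly > in at least one (i.e., no point dominates it).
    keep = []
    for i, p in enumerate(points):
        dominated = np.any(np.all(points >= p, axis=1) &
                           np.any(points > p, axis=1))
        if not dominated:
            keep.append(i)
    return np.array(keep)

front = pareto_front(scores)
```

The indices in `front` are the candidates among which improving one objective must worsen the other; analyzing them is what reveals the parameter tradeoffs described for the pipeline-steel design.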
1
2009
... ANN has been widely used in many fields of research [113]; this may have been due to ANN attempts to model the functioning of human brains [114]. For example, at present, ANN has become a popular tool for Al-Si alloy performance prediction [[115], [116], [117]]. ANN has been developed to predict the porosity percentage of Al-Si casting alloys and has been used to correlate chemical composition and cooling rate with porosity [118,119]. An ANN has accurately predicted the corrosion resistance of Al-Si-Mg-based metal matrix composites reinforced with SiC particles, with the average square of the Pearson product-moment correlation coefficient (R2), the maximum mean square error, and the minimum root mean squared deviation calculated as 0.9904, 0.00002476, and 0.00157480, respectively. It can be seen that the experimental results are highly consistent with the ANN results [120]. ANN was used to predict the mechanical properties of A356, including yield stress, ultimate tensile strength, maximum force, and elongation percentage; the predictions of the ANN model were found to be in good agreement with experimental data [121,122]. In addition, ANN models have been used to investigate the role of composition and processing parameters in the mechanical properties of microalloyed pipeline steel and to design steel with improved performance with regard to strength, impact toughness, and ductility. Then the models were used as objective functions for multi-objective genetic algorithms to evolve tradeoffs among the conflicting objectives of improved strength, better ductility, and higher impact toughness. Moreover, the Pareto optimal solutions were successfully analyzed to study the role of various parameters in designing pipeline steel with such improved performance [77]. Fig. 4 shows the specific flow chart. ...
1
2011
... ANN has been widely used in many fields of research [113]; this may have been due to ANN attempts to model the functioning of human brains [114]. For example, at present, ANN has become a popular tool for Al-Si alloy performance prediction [[115], [116], [117]]. ANN has been developed to predict the porosity percentage of Al-Si casting alloys and has been used to correlate chemical composition and cooling rate with porosity [118,119]. An ANN has accurately predicted the corrosion resistance of Al-Si-Mg-based metal matrix composites reinforced with SiC particles, with the average square of the Pearson product-moment correlation coefficient (R2), the maximum mean square error, and the minimum root mean squared deviation calculated as 0.9904, 0.00002476, and 0.00157480, respectively. It can be seen that the experimental results are highly consistent with the ANN results [120]. ANN was used to predict the mechanical properties of A356, including yield stress, ultimate tensile strength, maximum force, and elongation percentage; the predictions of the ANN model were found to be in good agreement with experimental data [121,122]. In addition, ANN models have been used to investigate the role of composition and processing parameters in the mechanical properties of microalloyed pipeline steel and to design steel with improved performance with regard to strength, impact toughness, and ductility. Then the models were used as objective functions for multi-objective genetic algorithms to evolve tradeoffs among the conflicting objectives of improved strength, better ductility, and higher impact toughness. Moreover, the Pareto optimal solutions were successfully analyzed to study the role of various parameters in designing pipeline steel with such improved performance [77]. Fig. 4 shows the specific flow chart. ...
1
1998
... ANN has been widely used in many fields of research [113]; this may have been due to ANN attempts to model the functioning of human brains [114]. For example, at present, ANN has become a popular tool for Al-Si alloy performance prediction [[115], [116], [117]]. ANN has been developed to predict the porosity percentage of Al-Si casting alloys and has been used to correlate chemical composition and cooling rate with porosity [118,119]. An ANN has accurately predicted the corrosion resistance of Al-Si-Mg-based metal matrix composites reinforced with SiC particles, with the average square of the Pearson product-moment correlation coefficient (R2), the maximum mean square error, and the minimum root mean squared deviation calculated as 0.9904, 0.00002476, and 0.00157480, respectively. It can be seen that the experimental results are highly consistent with the ANN results [120]. ANN was used to predict the mechanical properties of A356, including yield stress, ultimate tensile strength, maximum force, and elongation percentage; the predictions of the ANN model were found to be in good agreement with experimental data [121,122]. In addition, ANN models have been used to investigate the role of composition and processing parameters in the mechanical properties of microalloyed pipeline steel and to design steel with improved performance with regard to strength, impact toughness, and ductility. Then the models were used as objective functions for multi-objective genetic algorithms to evolve tradeoffs among the conflicting objectives of improved strength, better ductility, and higher impact toughness. Moreover, the Pareto optimal solutions were successfully analyzed to study the role of various parameters in designing pipeline steel with such improved performance [77]. Fig. 4 shows the specific flow chart. ...
... Mannodi-Kanakkithodi et al. [123] used a combination of statistical learning and genetic algorithms to predict polymer dielectrics. Within its dataset, the model can not only determine the corresponding properties of any polymer, but can also actively find the specific polymer that suits given requirements. However, because the model drew on only seven chemical building blocks, the guidance it provides is limited to that design space. ...
... Other applications exist for macroscopic performance prediction. Seko et al. [124] used four regression algorithms to predict the melting temperatures of single and binary compounds. Jha et al. [75] predicted polymer glass transition temperatures. Mauro et al. [125] predicted elastic moduli and compressive stress of glass. ...
... Pilania et al. [126] used statistical learning methods to quickly and accurately predict the various properties of binary wurtzite superlattices. Toyoura et al. [127] used a machine-learning method called the Gaussian process to efficiently identify low-energy regions that characterize proton conduction in the host lattice. ...
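Gaussian-process regression of the kind used in ref. [127] can be sketched in a few lines. The RBF kernel, the 1-D "site coordinate", and the sinusoidal energy surface below are illustrative assumptions, not the actual host-lattice model.

```python
import numpy as np

# GP regression sketch (simplified, hypothetical data): fit computed site
# energies, then screen a dense grid for low-energy regions.
def rbf(a, b, ls=0.15):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

x_train = np.linspace(0.0, 1.0, 12)      # sites where energies were "computed"
energy = np.sin(6.0 * x_train)           # stand-in potential-energy values
x_grid = np.linspace(0.0, 1.0, 200)      # dense grid to be screened

K = rbf(x_train, x_train) + 1e-8 * np.eye(12)             # jitter for stability
mean = rbf(x_grid, x_train) @ np.linalg.solve(K, energy)  # posterior mean
low_energy = x_grid[mean < -0.5]          # candidate low-energy region
```

The posterior mean interpolates the training energies, so thresholding it identifies the low-energy region without evaluating every grid point explicitly.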
... To explore the potential thermal runaway problem of lithium-ion electrode particle microstructures, Petrich et al. [128] used a classification model to detect cracks in lithium-ion batteries and applied the method to real electrode data to test its effectiveness. Deringer et al. [129] developed a machine learning-based Gaussian approximation potential model for atomistic simulation of liquid and amorphous elemental carbon. Botu et al. [130] proposed a machine learning method called AGNI (adaptive, generalizable, and neighborhood-informed), which performs fast, quantum-accurate prediction of the forces on an atom from its neighborhood environment. ...
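The crack-detection step can be illustrated as generic binary classification. The logistic-regression model and the synthetic "particle geometry" features below are placeholders for the image-derived features and classifier actually used in ref. [128].

```python
import numpy as np

# Binary classification sketch (hypothetical data): label 1 = cracked particle.
rng = np.random.default_rng(4)
X = rng.normal(size=(300, 4))                  # stand-in geometric features
w_true = np.array([2.0, -1.0, 0.5, 0.0])       # invented ground-truth rule
y = (X @ w_true + 0.1 * rng.normal(size=300) > 0).astype(float)

w = np.zeros(4)
for _ in range(500):                           # gradient descent on the
    p = 1.0 / (1.0 + np.exp(-(X @ w)))         # logistic negative log-likelihood
    w -= 0.1 * X.T @ (p - y) / len(y)

acc = np.mean(((1.0 / (1.0 + np.exp(-(X @ w)))) > 0.5) == (y == 1))
```

On this nearly separable synthetic data the learned classifier recovers the labeling rule with high accuracy; real electrode data would of course be noisier.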
... Lilienfeld et al. [131] introduced a machine learning model based on nuclear charges and atomic positions to quickly and accurately predict the atomization energies of various organic molecules and demonstrated its applicability to predicting molecular atomization potential-energy curves. Hansen et al. [132] used machine learning to predict molecular atomization energies, which can significantly accelerate the calculation of quantum chemical properties while maintaining high prediction accuracy. Rupp [25] used kernel ridge regression to predict the atomization energies of small organic molecules. Hansen et al. [133] used machine learning (the so-called Bag of Bonds model, BoB) to estimate molecular atomization energies and predict the exact electronic properties of molecules; Fig. 5 shows that BoB is a stronger machine learning model with proper regularization, and that its accuracy improves simply by expanding the molecular database. Moreover, Pilania et al. [33] and Isayev et al. [134] predicted the bandgap energy. ...
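The kernel ridge regression used by Rupp [25] reduces to a closed-form fit. In this sketch the five-dimensional descriptors stand in for Coulomb-matrix-derived representations, and the "energies" are synthetic.

```python
import numpy as np

# Kernel ridge regression sketch (hypothetical data): Gaussian kernel over
# molecular descriptors with an L2 penalty, solved in closed form.
rng = np.random.default_rng(2)
D = rng.normal(size=(60, 5))                    # stand-in molecular descriptors
E = D @ np.array([1.0, -0.5, 0.2, 0.0, 0.3])    # stand-in atomization energies

def gauss(A, B, sigma=2.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma**2))

lam = 1e-3                                       # ridge regularization strength
K = gauss(D, D)
coef = np.linalg.solve(K + lam * np.eye(60), E)  # (K + lam*I) coef = E
E_pred = K @ coef
rmse = np.sqrt(np.mean((E_pred - E) ** 2))
```

The regularization term `lam` is exactly what ref. [133] varies when discussing how too little regularization "leads to nonrobust fits".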
... leads to nonrobust fits [133].

4.3. Fingerprint (descriptor) prediction
If people can effectively learn from available knowledge and data, the material discovery process can be significantly accelerated and simplified. Using a one-dimensional chain system family, Pilania et al. [91] proposed a general form that can discover decision rules and establish a mapping between attributes that are easily accessible for a system and its properties. The results show that fingerprints based on either chemical structure (compositional and configurational information) or electronic charge density distribution can be used for ultra-fast but accurate property predictions. Harnessing such learning paradigms extends recent efforts to systematically explore huge chemical spaces and can significantly accelerate the discovery of new application-specific materials. Curtarolo et al. [135] introduced new material fingerprint descriptors that produce a material mapping network: nodes represent compounds, and connections represent similarities. The mapping network can identify compounds with different physical and chemical properties. Recently, the t-distributed stochastic neighbor embedding (t-SNE) algorithm was used for the first time to extract feature spaces from band-structure ab initio calculations. This algorithm maps the designed spaces to lower dimensions, enabling simultaneous analysis and exploration of previously unknown band structures for thousands of materials; the more information is available in the material database, the more space can be explored [136]. Ramprasad et al. [137] studied a series of motif-based topological fingerprints that represent the major classes of crystals and molecules numerically. Using a learning algorithm, these fingerprints can be mapped to various properties of crystals and molecules, thereby accelerating property prediction.
Their paper demonstrates the approach by simultaneously optimizing two properties: the gap between the highest occupied and lowest unoccupied molecular orbitals, EHL, and the isotropic polarizability, α. The results are shown in Fig. 6 and Table 1, which show that molecules with the required values of EHL and α were obtained. ...
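A fingerprint-based materials map of the kind introduced in ref. [135] can be sketched as a similarity graph. The random fingerprint vectors and the 0.8 cosine-similarity threshold below are arbitrary illustrative choices, not the descriptors of the original work.

```python
import numpy as np

# Materials-map sketch (hypothetical fingerprints): nodes are compounds,
# edges connect pairs whose fingerprint cosine similarity exceeds a threshold.
rng = np.random.default_rng(3)
fp = rng.random((6, 10))                            # 6 stand-in compounds
fp /= np.linalg.norm(fp, axis=1, keepdims=True)     # unit-normalize fingerprints
sim = fp @ fp.T                                     # cosine similarity matrix
edges = [(i, j) for i in range(6)
         for j in range(i + 1, 6) if sim[i, j] > 0.8]
```

Clusters of densely connected nodes in such a graph correspond to families of compounds with similar physics, which is what makes the map useful for identifying outliers and analogues.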
Table 1. Predicted and calculated values of α (in Å3/atom) and EHL (in eV) of the molecules designed from three predicted fingerprints, C, D, and E. Data from this table are also shown in the inset of Fig. 6 [137].
... An adaptive design strategy, tightly coupled with experiments, can accelerate the discovery process by sequentially identifying the next experiments or calculations needed to effectively navigate a sophisticated search space. The strategy uses inference and global optimization to trade off exploitation against exploration of the search space. Oliynyk et al. [138] used real experimental data, selected a suitable machine learning model, and verified its predictions through experiments; many kinds of intermetallic compounds were discovered. The most crucial finding was RhCd, the first new binary AB compound with a CsCl-type structure to be found in over 15 years. Xue et al. [139] demonstrated the strategy by finding NiTi-based shape memory alloys with very low thermal hysteresis (ΔT), with Ti50.0Ni46.7Cu0.8Fe2.3Pd0.2 possessing the smallest ΔT (1.84 K). Wolverton et al. [101] used a general machine learning framework to predict the properties of various materials and discovered a metallic glass alloy. Mauro et al. [125] developed new damage-resistant glasses. Ren et al. [140] found metallic glasses through iterations of machine learning. Yuan et al. [141] used a combination of machine learning and optimization methods to accelerate the discovery of new Pb-free BaTiO3-based piezoelectrics with large electrostrains. ...
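The exploitation-exploration tradeoff described above can be sketched with a Gaussian-process surrogate and the expected-improvement criterion. The 1-D objective standing in for a property to minimize (e.g. thermal hysteresis), the kernel length scale, and the candidate grid are all invented for illustration.

```python
import numpy as np
from math import erf, sqrt, pi

def f(x):  # hypothetical property to minimize over the design space
    return (x - 0.7) ** 2 + 0.05 * np.sin(20.0 * x)

def rbf(a, b, ls=0.1):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

cand = np.linspace(0.0, 1.0, 101)   # candidate "alloys"
X = [0.1, 0.5, 0.9]                 # initial experiments
for _ in range(10):                 # adaptive loop: model -> select -> measure
    Xa = np.array(X); ya = f(Xa)
    K = rbf(Xa, Xa) + 1e-8 * np.eye(len(Xa))
    Ks = rbf(cand, Xa)
    mu = Ks @ np.linalg.solve(K, ya)                       # GP posterior mean
    var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
    sd = np.sqrt(np.maximum(var, 1e-12))
    z = (ya.min() - mu) / sd
    Phi = np.array([0.5 * (1 + erf(v / sqrt(2))) for v in z])
    phi = np.exp(-0.5 * z**2) / sqrt(2 * pi)
    ei = (ya.min() - mu) * Phi + sd * phi                  # expected improvement
    X.append(float(cand[int(np.argmax(ei))]))              # next "experiment"

best = float(np.min(f(np.array(X))))
```

The EI criterion is large both where the predicted mean is low (exploitation) and where the predictive uncertainty is high (exploration), which is the tradeoff the adaptive design papers cited above formalize.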
... To accelerate the design of new materials, recent studies have screened out many types of promising candidate materials [51], including perovskite compounds [83,142], metal-organic frameworks [143,144], catalysts [145,146], light-emitting molecules [147], and thermoelectric materials [[148], [149], [150], [151]]. Combining large-scale databases of quantum-mechanical calculations on known materials with machine-learning-based screening yields a combinatorial screening process for new materials in an unconstrained composition space, which can be applied to an extensive range of critical material discoveries [87]. In addition, interface structure and energy can also be predicted through virtual screening [152]. ...
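A minimal sketch of such a virtual-screening workflow is shown below. All data here are synthetic placeholders, not drawn from any cited database: a surrogate model is trained on descriptors of materials whose properties have already been computed, then used to rank a much larger pool of uncomputed candidates so that only the most promising are passed on to expensive first-principles validation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Hypothetical training set: each row is a composition descriptor
# (e.g. mean electronegativity, atomic radius, valence-electron count)
# and the target is a property computed by DFT for a known material.
X_known = rng.random((300, 3))
y_known = 2.0 * X_known[:, 0] - X_known[:, 1] + 0.1 * rng.standard_normal(300)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_known, y_known)

# Screen a large pool of not-yet-computed candidate compositions and
# keep only the top-ranked ones for DFT or experimental validation.
X_candidates = rng.random((10_000, 3))
scores = model.predict(X_candidates)
top = np.argsort(scores)[::-1][:10]   # indices of the 10 most promising
print(len(top))  # 10 candidates forwarded to first-principles checks
```

The cheap surrogate filters thousands of candidates per second, so the unconstrained composition space becomes tractable even when each quantum-mechanical calculation takes hours.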
... At present, the scale of data available for machine learning in the materials field is still very small, making it impossible to generate rigorous predictions of material properties or new material discoveries through machine learning. The predicted results are often only approximations of the real data and cannot truly be used to guide experiments. Some experts have proposed ways to improve machine learning on small-scale datasets; for example, adding rough attribute estimations to the feature space when building a model from a small material dataset can improve prediction accuracy [153]. However, even with the gradual development of machine learning and deep learning algorithms, application experience with each algorithm shows that dataset scale remains the main factor determining model prediction accuracy. This problem affects not only classical machine learning algorithms but also deep learning, which has attracted much attention in recent years. Over the last few decades, it has been found that material data are presented relatively intact in a large number of scientific papers, and literature data extraction has made some progress in many specific fields, such as chemistry [154] and biomedicine [[155], [156], [157]]. Some studies using these literature-derived material data have already been carried out. For example, a software package for paper information-assisted extraction has been developed and is available online (https://www.mgedata.cn/) [158]. Xie et al. [159] designed the composition of high-performance copper alloys using machine learning based on data from the literature. Edward et al. [42] obtained material synthesis information from academic publications. Kononova et al. [160] obtained inorganic material synthesis recipes by text mining scientific publications.
However, data in the literature are rarely used by material scientists, and even if the data are used, the data scale is small because there is no mature method to extract data from the material literature. So far, no relatively complete corpus has been developed in the materials field to support the application of material data. Therefore, in the materials genome initiative, material prediction through machine learning should not only focus on research into the machine learning algorithm itself, but should also extract valuable material data from the materials science literature. This would involve first classifying and pre-processing these literature data, then using machine learning algorithms to predict the performance data of interest, and finally finding the correlations among material composition, process, structure, and performance. These steps will play a key role in discovering new properties of materials. ...
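As a toy illustration of the extraction step in such a pipeline, the snippet below pulls (composition, property value, unit) triples out of example sentences. The sentences and regular expressions are invented for illustration only; production literature-mining systems rely on trained NLP taggers and curated grammars rather than hand-written regexes.

```python
import re

# Invented example sentences standing in for text mined from papers.
SENTENCES = [
    "The Cu-3.2Ni-0.8Si alloy reached a tensile strength of 820 MPa.",
    "After aging, Cu-1.5Ti showed a hardness of 210 HV.",
    "Processing details are given in Section 2.",
]

# Element symbol followed by one or more "-amount + element" segments.
COMPOSITION = re.compile(r"\b[A-Z][a-z]?(?:-\d+(?:\.\d+)?[A-Z][a-z]?)+\b")
# Numeric value followed by a recognized property unit.
PROPERTY = re.compile(r"(\d+(?:\.\d+)?)\s*(MPa|HV|GPa)")

records = []
for s in SENTENCES:
    comp = COMPOSITION.search(s)
    prop = PROPERTY.search(s)
    if comp and prop:  # keep only sentences where both fields are found
        records.append((comp.group(), float(prop.group(1)), prop.group(2)))

print(records)
```

The third sentence is silently dropped because it contains neither a composition nor a property value; the classification and pre-processing steps described above exist precisely to separate such sentences from data-bearing ones before extraction.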