
Materials big data standardization and AI-based materials development plans

2021-09-13 01:27:38





In recent years, several studies have sought to build artificial intelligence (AI)-based big data on materials (hereinafter “materials big data”) in line with the Fourth Industrial Revolution. Until now, materials have been developed in a linear manner: each development effort has relied on trial and error rather than on data continuously accumulated from earlier work. AI-driven materials development is significant because it lets other researchers build on historically accumulated data.


This also means a shift to a virtuous feedback cycle in which predicted conditions and physical properties are looped back into the materials development process itself. As future technologies and industries emerge, competition to develop new materials is intensifying alongside the transition to high-value-added industries. Countries are also adopting strategies that use materials as a strategic weapon. In this context, there is a growing need for innovative materials development through materials big data standardization.


Materials big data utilization

- Domestic and overseas trends


Keeping pace with global trends, materials powerhouses such as the United States, Japan, and Germany are carrying out materials development projects that use AI platforms to drive future industrial progress. In the United States, the Materials Genome Initiative, which was launched to reduce materials development costs, has led to materials big data of about 2.9 million entries, centered on data and computational science. In particular, several key projects are in place to develop lightweight materials and energy storage materials.


In Japan, national strategies including the “New Elements Strategy Project” have been in place since 2017 to digitize process technology and related know-how using experimental and computational data. This big data is being built to pass accumulated technical skills down to future generations. In Europe, there are efforts to share data and build databases through the European Union funding program Horizon 2020. Computational science databases such as NOMAD have accumulated about 4.4 million entries, and considerable effort has gone into improving accessibility for researchers.


Figure: (Left) the Materials Genome Initiative (MGI) and (Right) Novel Materials Discovery (NOMAD)


In South Korea, on the other hand, bulk data and standardization systems for data collection are still insufficient. Although Material Bank has secured 1.6 million pieces of information on physical properties in the chemistry and materials fields, this remains small compared with Europe (4.4 million) and the United States (2.9 million), as mentioned above. In terms of data quality, the conditions needed across the data life cycle are not yet in place, so the level of data accumulation falls short of the global level.


- Data standardization


For materials data to be standardized, common items should be systematized according to the phases of materials development. Specifically, the development phase is divided into four major categories: raw materials, composition, processes, and physical properties. The raw materials category should record data such as manufacturer, purchase date, and element names, while the composition category should specify information such as raw materials, units, and the writer of the record. The process category should describe the equipment, conditions, and sequence of actions.


In the physical properties category, data such as measuring instruments, measurement conditions, specimen shapes, and measured properties should be systematically organized. Importantly, a standard template needs to be created based on these standardization items. Moreover, domain experts, who create and provide the real data, should team up with AI specialists to conduct research and development (R&D). This requires a series of steps: benchmarking overseas cases, comparing and analyzing data, and creating a world-class template. Another urgent challenge is to standardize terminology, classification systems, and measurement units.
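To make the idea of a standard template more concrete, the four categories above can be sketched as a small data structure. The following Python sketch is only an illustration, not an existing standard or any project's actual schema: the class names, field names, and the helper missing_fields are all hypothetical, and the sample values are invented for demonstration and are not real measurements.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List


# All class and field names below are hypothetical; they simply mirror
# the four categories named in the text.
@dataclass
class RawMaterial:
    element_name: str
    manufacturer: str
    purchase_date: str  # assumed to be an ISO 8601 date string


@dataclass
class CompositionEntry:
    raw_material: str
    amount: float
    unit: str


@dataclass
class Composition:
    writer: str
    entries: List[CompositionEntry]


@dataclass
class ProcessStep:
    action: str
    equipment: str
    conditions: Dict[str, str]


@dataclass
class PropertyMeasurement:
    property_name: str
    value: float
    unit: str
    instrument: str
    specimen_shape: str
    conditions: Dict[str, str]


@dataclass
class MaterialRecord:
    raw_materials: List[RawMaterial]
    composition: Composition
    process: List[ProcessStep]
    properties: List[PropertyMeasurement]


def missing_fields(record: MaterialRecord) -> List[str]:
    """Return the paths of fields that are empty strings, None, or empty lists."""
    missing: List[str] = []

    def walk(node, path: str) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{path}.{key}" if path else key)
        elif isinstance(node, list):
            if not node:
                missing.append(path)
            for index, item in enumerate(node):
                walk(item, f"{path}[{index}]")
        elif node is None or node == "":
            missing.append(path)

    walk(asdict(record), "")
    return missing


# Invented sample record; the instrument is left blank on purpose
# so that the completeness check has something to report.
example = MaterialRecord(
    raw_materials=[RawMaterial("Al", "ExampleCo", "2021-01-15")],
    composition=Composition(
        writer="researcher_a",
        entries=[
            CompositionEntry("Al", 95.0, "wt%"),
            CompositionEntry("Mg", 5.0, "wt%"),
        ],
    ),
    process=[ProcessStep("casting", "induction furnace", {"temperature": "750 degC"})],
    properties=[
        PropertyMeasurement(
            property_name="yield strength",
            value=150.0,
            unit="MPa",
            instrument="",
            specimen_shape="dog-bone",
            conditions={"temperature": "25 degC"},
        )
    ],
)

print(missing_fields(example))
```

A completeness check of this kind is one possible way to verify that submitted data fills in the common items before it enters a shared database. Units are kept as explicit strings here so that a later step could normalize them to an agreed standard.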


- Data accumulation


Various measures should be prepared, such as R&D funding and incentives for researchers who provide materials data, so that individual researchers voluntarily build up data. It is very important to cut red tape and systematize the common items so that anyone can provide verified data. In addition, domain experts who provide large amounts of data should receive more incentives and reward points (mileage), as well as institutional support to use the big data preferentially and to team up with AI specialists. Furthermore, data professionals should be given more opportunities to take an active part in science and technology studies or industrial projects. If data is used for profit, a win-win model should be devised that lets data donors share in the returns.




In the past, alchemists ceaselessly experimented with melting and combining diverse alloying elements in an attempt to make gold. Ironically, although their experiments were pseudoscientific, those efforts greatly influenced the later development of experiment-oriented science and technology. Now, humans are about to tread a new path by using the data and AI they have created so far. This will require a concerted effort across science and technology, alongside an institutional framework.


  Hyokyung Sung | Professor, Department of Metallurgical and Materials Engineering, Gyeongsang National University 

