Today, we interviewed Dr. Kwang-Ryeol Lee, the creator of KiRI Note, the electronic lab notebook of the Korea Institute of Science and Technology (KIST). Dr. Lee emphasized that KiRI Note is building a researcher-friendly system by addressing the limitations of existing electronic lab notebooks.
Q: Hello, Dr. Kwang-Ryeol Lee. Thank you for your precious time today. First of all, some people may be unfamiliar with laboratory notebooks. Could you explain what one is?
Dr. Kwang-Ryeol Lee (hereinafter “Dr. Lee”): Hello! Nice to meet you. Well, you can think of a laboratory notebook, or lab notebook, as much the same as an ordinary notebook. Scientists keep one to track their research. It is also indispensable when they carry out their studies and check findings with their colleagues.
Q: As you know, the latest articles related to lab notebooks cover some issues, such as the right to research and patent disputes. Why do you think a lab notebook is particularly important in the protection of research ethics?
Dr. Lee: Lab notebooks are very important because they are a means for researchers to protect their intellectual property rights. Because lab notebooks are records of research conducted at certain points in time, researchers can use them to protect their rights to their research.
Let me give an example. In the late 1980s, KIST developed diamond synthesis technology in collaboration with Company A, a startup at the time. This technology became an issue when Company B sued Company A for patent infringement. At that time, Company B was a multinational conglomerate in the United States, while Company A was a small company, which has since grown into a company with affiliates. Naturally, Company B was expected to win the case.
However, the lab notebooks kept by KIST researchers overturned those expectations. All the lab notebooks were translated into English and sent to the US court. This allowed Company A to prove the originality of its technology.
Q: Recently, all records have been used as data. Does this trend hold true in the field of R&D?
Dr. Lee: In the R&D field, data-based research is becoming increasingly important. Researchers around the world are actively developing research infrastructure to secure, store, and utilize research data. The lab notebook should now shed its traditional image as a mere record and instead be recognized as a tool for building R&D big data that can be used without limit.
In particular, the digitization of lab notebooks means that analog research records can be processed by computers. This opens up the possibility of extracting data from lab notebooks and converting all research activities into a database. For researchers today, the electronic lab notebook is a kind of R&D data platform.
Q: I imagine many challenges will arise as the electronic lab notebook develops into an R&D data platform.
Dr. Lee: Yes, you are right. We have technical and cultural challenges to overcome. On the technical side, data from electronic lab notebooks are not structured, so they cannot be used as is. In addition, because researchers and institutions keep lab notebooks in different ways, the data are not systematized at all. Figuratively speaking, a lab notebook is like iron ore. Just as iron ore must be smelted to become useful iron, unstructured data from lab notebooks must be structured before big data analytics or machine learning can be applied to research.
To do so, I think we need two technologies. One is “transcription” technology that renders the contents of the electronic notebook in computer-readable characters, and the other is natural language processing technology to structure the data. Although these technologies have recently made remarkable progress, there is still a long way to go. Just as it took humanity a long time to perfect the smelting process that turns iron ore into iron, a considerable amount of time will be required to perfect these technologies. The success of an R&D platform for the Fourth Industrial Revolution will depend on how systematically it can collect vast research data (the iron ore) and extract usable data from it (the smelting).
The cultural challenge, on the other hand, is that getting researchers to use electronic lab notebooks is not as easy as flipping one’s palm. They tend to stick to their own research methods because of familiarity. In addition, many researchers find electronic lab notebooks unwelcome. This is because many electronic lab notebooks have been designed without regard to research environments, so they are poorly suited to research activities.
For example, if researchers have to retype data from handwritten lab notebooks into a computer system for digitization, who would bother to do so? Researchers are busy focusing on their own research, so any process other than research is cumbersome. Therefore, researchers should be given well-designed, easy-to-use platforms tailored to their research environments so that they can work effectively. Above all, we should understand the nature of research activities in order to devise a researcher-friendly electronic lab notebook.
Q: What does it mean to build an R&D data platform?
Dr. Lee: The fundamental technological factors that have driven the Fourth Industrial Revolution include the Internet, cloud computing, and mobile devices. Those factors contributed to the realization of the so-called hyperconnected society. This naturally led to the birth of platform businesses such as Google, Facebook, Amazon.com, Kakao, and Naver.
Those platform businesses are characterized by data accumulation. Users provide data in exchange for using the platforms. A wider range of data accumulated from more users can result in value creation. A service that attracts users to a platform is called a “crank element.” For example, Google has provided the Gmail service free of charge. This free service was a great innovation at the time. The same goes for KakaoTalk.
Now, do you understand how platform operators can offer free services even though maintaining them obviously incurs costs, such as building servers? That is because having more platform users means more accumulated data. The use of data creates various forms of added value. A wide range of data created by users can lead to profit models, creating unique enterprise value.
Let us go back to the electronic lab notebook. Nowadays, when producing an electronic lab notebook, we consider two aspects: recording and systematic data collection. I would say that the electronic lab notebook has become a platform for accumulating research data, not just a simple notebook. In other words, researchers can convert their research records into data simply by keeping a lab notebook. The accumulated data will in turn create new value and new knowledge in research.
In addition, the lab notebook, as an R&D platform, should be able to help researchers with their research activities. As I mentioned earlier, a platform aims to make people’s lives more comfortable. If the same goes for the electronic lab notebook, it should also help researchers do their research more conveniently and efficiently. Just as living without Google and KakaoTalk is inconvenient, so is researching without the electronic lab notebook. Researchers are not obliged to use electronic lab notebooks, but if they do, R&D procedures become much more convenient.
Q: As far as I know, a variety of electronic lab notebooks are already in use. What distinguishes KIST’s KiRI Note from other electronic lab notebooks?
Dr. Lee: As you said, there are many electronic lab notebook services in place. However, most of them were designed by IT developers. Because they design the service with a focus on office automation, an understanding of actual research environments is left out. In contrast, KiRI Note has been devised for researchers. It started from a different perspective than other services.
The first principle I set when designing KiRI Note was “Do not put any additional burden on researchers other than research.” This is why we entrusted the production of KiRI Note to Virtual Lab. Virtual Lab consists of researchers engaged in basic sciences such as physics and chemistry. I believed that they could understand and apply, from a researcher’s point of view, what an IT company would miss.
Our KiRI Note looks good, doesn’t it? There are a lot of crank elements that fascinate researchers (laughter). It allows researchers to record, store, and link their research findings free of charge. They can easily check what they studied and recorded and when they did so. It also helps them communicate with other people involved in research projects. People involved can share their opinions in real time through various forms of communication such as comments and memos.
Take, for example, a project called User Feedback that I created in KiRI Note. This project invites everyone registered in KIST MIS, from the Dean to students, about 1,600 people in all. It allows anyone to give their opinions about KiRI Note in the form of notes. In other words, it serves as a platform for communication between the developers and the users. In the same way, any user can create a personal project and invite participants to discuss it through KiRI Note.
In addition, KiRI Note is linked with the KIST Advanced Analysis Center. For example, when researchers ask the center to analyze their specimens, the status of analysis in the center is transmitted in real time to the Note. Upon completion, the analysis results are delivered to their Notes. Another practical benefit is that it shows them expected charges in advance. Of course, the specimens are brought back by hand (laughter), but there is no hassle of checking the status of analysis one by one over the phone. I think this is enough to show its strengths.
As I mentioned, a time stamp is very important in patent disputes. In other words, when a lab notebook entry was recorded and certified is of paramount importance. For KiRI Note, a time stamp is certified via the government’s accredited certification system. Our users can easily handle the certification process within KiRI Note. Once certified, the time stamp is marked on the Note, and no further changes are allowed.
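The rule described here, that a Note accepts no further changes once its time stamp is certified, can be pictured with a minimal Python sketch. This is only an illustration under stated assumptions, not KiRI Note's actual implementation: the `Note` class and its `certify` and `edit` methods are hypothetical names, and the certified time is passed in by hand where the real system would obtain it from the accredited certification service.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Note:
    """Hypothetical lab note that is frozen once its time stamp is certified."""
    text: str
    certified_at: Optional[datetime] = None

    def certify(self, timestamp: datetime) -> None:
        # Assumption: the timestamp comes from an external certification service.
        if self.certified_at is not None:
            raise ValueError("note is already certified")
        self.certified_at = timestamp

    def edit(self, new_text: str) -> None:
        # Edits are allowed only while the note has no certified time stamp.
        if self.certified_at is not None:
            raise PermissionError("certified notes cannot be changed")
        self.text = new_text


note = Note("first draft")
note.edit("final record")  # allowed: not yet certified
note.certify(datetime(2020, 1, 1, tzinfo=timezone.utc))  # made-up example date
```

After `certify` has been called, any further call to `note.edit(...)` raises `PermissionError`, so the certified text stays exactly as it was at the certified time.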
Q: But I am a little worried about whether there is a risk that such a wide range of research data will be misused.
Dr. Lee: All data are exposed to that risk, so security is more important than ever. By default, KiRI Note allows only the owner to see its contents. It also clearly defines the range of people who are allowed to see the contents. Data can be shared only with specific persons or coworkers who have been invited or designated.
KiRI Note also has an additional security function called History. Each page of the lab notebook shows who read it on what date and whether any corrections were made. This function makes it possible to trace unauthorized access or ill-intentioned changes.
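A per-page history of this kind can be sketched in a few lines of Python. The sketch below is a rough illustration only and does not reflect how KiRI Note is actually built: the `NotePage` and `HistoryEvent` classes, the user names, and the sample text are all made up for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class HistoryEvent:
    user: str
    action: str  # "read" or "edit"
    at: datetime


@dataclass
class NotePage:
    """Hypothetical notebook page that records every read and every correction."""
    content: str
    history: List[HistoryEvent] = field(default_factory=list)

    def _log(self, user: str, action: str) -> None:
        self.history.append(HistoryEvent(user, action, datetime.now(timezone.utc)))

    def read(self, user: str) -> str:
        self._log(user, "read")
        return self.content

    def edit(self, user: str, new_content: str) -> None:
        self._log(user, "edit")
        self.content = new_content

    def editors(self) -> List[str]:
        # Users who changed the page, in chronological order.
        return [e.user for e in self.history if e.action == "edit"]


page = NotePage("Sample A annealed")
page.read("kim")
page.edit("park", "Sample A annealed, second pass")
```

Because every access appends an event rather than overwriting anything, the page's `history` list can later be inspected to see who read it, who corrected it, and when.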
Q: Don’t you think adopting a system like KiRI Note would be difficult for companies that do not have an in-house analysis center like KIST’s?
Dr. Lee: That is a good question about the extensibility of KiRI Note. When a service is introduced to a company, including a large one, it needs to be customized to fit the company’s existing programs. This can lead to new business. The service developer should probably act as the bridge directly. That is why I think highly of Virtual Lab and collaborate with it. Very few companies can do this job.
Q: It seems that research data can be naturally collected with KiRI Note (because of the number of users). What is being developed additionally to structure/systematize the collected data?
Dr. Lee: As you mentioned, KiRI Note mainly aims to store records of research activities within KIST. One of its strengths, therefore, is extracting data from them. Existing services just keep lab notebooks, but our service is designed to accumulate research findings as computer-readable data from the moment they are first collected.
The first step for KiRI Note to collect data is to transcribe handwritten lab notebooks into computer-readable documents. Machine learning–based conversion technology is applied at this step. However, the technology that renders varied handwriting and abbreviations in a computer-readable form is not yet perfect. We are working to resolve this problem.
Next, we need to understand the meaning of the transcribed text in order to collect structured data. This requires natural language processing technology. We are likewise studying how to extract data from published papers, but that is much easier than extracting data from lab notebooks. This is because papers are structured to some degree, whereas lab notebooks are written however each researcher sees fit.
Therefore, the progress of natural language processing will determine the quality of multiple data platforms, including KiRI Note. But I am not sure how long it will take. Moreover, keeping up with a big trend in research is not easy in reality. However, because everyone is aware of the importance of data, I am optimistic about this issue.
Q: Apart from KiRI Note, you have KiRI Platform. What is the difference between the two services?
Dr. Lee: KiRI Platform is a platform that creates knowledge through machine learning by encompassing all kinds of R&D data. First, it shows which properties can be predicted. Its main function is to provide the related machine learning models so that other users can use them.
KiRI Platform obtains data in two ways. One is by collecting data through data curation from existing literature such as journals. The other is by importing data accumulated through research activities within KIST. KiRI Note is the data collection platform used for the latter. As you can see, the data acquired in these two ways make up the data sets that constitute KiRI Platform.
Improving and upgrading KiRI Platform also depends on natural language processing. Related platforms will grow further only when people with specialized knowledge, including expertise in materials science, can deal with artificial intelligence (AI). This is because natural language processing requires an understanding of numerous technical terms.
Q: Aside from the two services, I heard that you are also planning a new platform. Could you tell us more about it?
Dr. Lee: It is related to the question, “How should we organize data that has been structured through natural language processing?” This is also a matter of how to create structured data. Smelting is used to turn ore into iron. Our work is like standardizing the mold first, before shaping the iron into a lump, a beam, or a square bar. We would like to define the terms used in materials data and make the “Standard Glossary of Materials Data” open to the public.
Take, for example, the term “temperature.” It goes by different names depending on the type of research or the researcher. We will choose one of them and define it as the standard via the alias command. In more detail, we should consider that different terms are used to refer to temperature in each process, such as CVD and annealing. Unlike the entries in a general Korean language dictionary, standardized terms should carry structural information.
So, we are planning to make the standard glossary in JSON format. Terms will be defined within a structured data system. A number of participants contributed to completing the basic framework, but it does not yet contain all terminology. We will first release a dictionary covering terminology in three areas and then keep adding to it.
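To make the idea of a JSON glossary with aliases concrete, here is a small Python sketch. It is an assumption-laden illustration and not the published "Standard Glossary of Materials Data": the field names (`unit`, `aliases`, `process_terms`), the alias list, and the helper functions are all invented for this example.

```python
import json

# Hypothetical glossary entry; the schema and values are illustrative only.
GLOSSARY_JSON = """
{
  "temperature": {
    "unit": "K",
    "aliases": ["temp", "substrate temperature", "annealing temperature"],
    "process_terms": {
      "CVD": "substrate temperature",
      "annealing": "annealing temperature"
    }
  }
}
"""


def build_alias_index(glossary):
    """Map every alias (and each standard term itself) to its standard term."""
    index = {}
    for standard, entry in glossary.items():
        index[standard.lower()] = standard
        for alias in entry.get("aliases", []):
            index[alias.lower()] = standard
    return index


def to_standard(term, index):
    """Return the standard term for a free-form term, or None if it is unknown."""
    return index.get(term.strip().lower())


glossary = json.loads(GLOSSARY_JSON)
alias_index = build_alias_index(glossary)
print(to_standard("Annealing Temperature", alias_index))
```

In this sketch, a term written differently by different researchers is resolved to one standard key, while `process_terms` keeps the structural information about which wording belongs to which process.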
Q: This is the last question. Could you tell us about your goals for the future?
Dr. Lee: I would like to globalize our domestic system related to standard terminology in materials data. The Research Data Alliance (RDA) addresses R&D data issues in all fields around the world. One of its working groups, which deals with materials data, has worked on a similar task. This research community consists almost exclusively of researchers from Europe and the United States; in Asia, only Japan participates as a member.
However, I do not think their system is more sophisticated and generalized than ours. Since I began working on the standard glossary of materials terminology, I have told Virtual Lab staff to feel proud of themselves because they have been doing what no one else can do. It is not a joke. They should be proud of taking part in the materials science field as a national team. My goal is to introduce these platforms to the world as soon as possible and establish them as international standards.
Interviewer: Clara Kim (email@example.com)