The Born2Global Centre released an article highlighting the Data Dam initiative of the Ministry of Science and ICT (MSIT), which is part of the Korean Digital New Deal. Operating under the MSIT, the Born2Global Centre has played a crucial role in connecting Korean startups with opportunities worldwide.

The so-called Data Dam project is one of the cornerstones of the Korean Digital New Deal announced last year to help the country’s digital economy adapt to the post-COVID-19 era.

A water-storage dam collects and stores water, then distributes it to the surrounding land for activities such as farming. Like a water dam, the Data Dam collects information from the public and private sectors, turns it into useful data, and releases the data across all industries. The goal is to allow people from all walks of life, as well as businesses, to easily access and make the most of useful data that meet their needs.

The Data Dam is a project to collect data generated through public and private networks and to standardize, process, and utilize the data to create smarter AI. In the process, innovative services powered by AI will be developed and a host of jobs will be created across the board.

Recently, the MSIT, together with the National Information Society Agency (NIA), kicked off the project in earnest by releasing 170 types of data for AI learning on an integrated data platform called AI Hub (https://aihub.or.kr/).

Disseminating AI Data Across Industries

The release has led to a significant increase in the amount of data available for AI, a key technology affecting the full spectrum of society and industrial sectors, including education, healthcare, transportation, and finance.

The released data come to 480 million pieces collected from eight key business areas: language, such as spoken dialects; healthcare, such as diagnostic images for cancer and brain disease; self-driving vehicles, such as on-road driving videos; visual data, such as footage of sports motions; environment, such as images of water pollution; agriculture and fisheries, such as diseases affecting crops, livestock, and fish; social safety, such as images of decrepit facilities; and other areas, including images of fashion goods.

In particular, data collected from “Korean services,” such as those involving language, transportation, and medical records, are expected to help the nation accelerate the development of AI-powered services. These innovative services will, in turn, benefit people and improve their quality of life.

Among others, data on spoken Korean dialects (from the Gyeongsang-do, Jeolla-do, Chungcheong-do, and Gangwon-do Provinces as well as Jeju Island) will help make up for the shortcomings of existing voice-based AI services, which could not recognize dialects well. According to the results of the data performance evaluation, the new data on spoken dialects sound more natural, with the recognition rate increased by 12%.

Data on self-driving vehicles include a variety of visual materials that showcase not only on-road driving, but also parking obstacles and moving object detection. This information will help accelerate the development of future autonomous vehicles.

People Build Data Dam Together

The Data Dam project is the outcome of a nationwide joint effort in which the government, professionals from the public and private sectors, and citizens joined forces to build the dam together. They took part in the entire process of data planning, establishment, verification, and management.

In 2020, the MSIT conducted research and analysis of data demand across the public and private sectors. Based on the results, it engaged experts from business, academia, and research in planning and establishing a data platform for AI learning. The MSIT also adopted “crowdsourcing” tactics in the process so that many citizens, as many as 40,000, could take part in collecting and processing data together.

Experts, organizations, and businesses from the eight target areas also participated in verifying and managing data quality and usability. For data quality control, the MSIT set up an advisory committee of more than 80 experts who will regularly evaluate the quality of data available on the AI Hub platform.

In addition, a council of the Telecommunications Technology Association (TTA) and public-private entities was launched with the aim of expanding AI data usability across the board.

The government expects that the Data Dam project will quench the thirst for useful data, which are essential for the development of AI services. The dam will disseminate useful data to diverse businesses, especially small and medium-sized companies and startups with limited budgets and human resources.