Authors: Sangseo Park, Cheolwon Lee, Sungjai Baek



This paper analyzes the National Software Reference Library Reference Data Set (NSRL RDS) and constructs a Korean RDS (KRDS) based on it. NSRL RDS has been widely adopted because it offers the largest collection of hash data sets. However, our effectiveness analysis of NSRL RDS indicates that it contains duplicate and obsolete data that can be eliminated, as well as unused metadata that can be deleted. Moreover, language-specific software and domestic software that has been widely used for years need to be added. With these issues in mind, we develop a strategy and model for importing the effective portion of NSRL RDS and adding Korea-specific data sets. We then construct an initial KRDS using proprietary software designed to manage the entire process of analysis and construction. We believe the lessons learned during this work will be useful to those who need to construct their own RDS (whether based on NSRL RDS or not) and for later upgrades of the NSRL RDS.