International Conference on Advances in Computer Science and Electronics Engineering - CSEE 2014
Author(s): AHMED MOUNAF MAHDI, SABRINA TIUN
Instance-based matching is the process of identifying correspondences between schema elements by comparing the instances of different data sources. It serves as an alternative when schema-based matching fails. Instance-based matching is applied in many areas, such as website creation and management, data warehousing, database design, and data integration, and many recent approaches focus on it. In this paper, we propose an approach that uses a WordNet-based measure for the string domain, producing a similarity coefficient in the range [0, 1]. In a previous approach, regular expressions achieved good accuracy for numerical instances only and were not applied to string instances, because deciding whether two strings match requires knowing their meaning. Using WordNet-based measures for string instances is expected to improve effectiveness in terms of Precision (P), Recall (R), and F-measure (F). We implemented Lin's measure to compute the similarity of two instances. The approach was evaluated on a real dataset, and the results are better than those obtained with a plain equality measure for strings, especially when the schemas are disjoint. The approach achieved an F-measure of 91.8%.
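To illustrate how Lin's measure yields a coefficient in [0, 1], the following is a minimal sketch, not the authors' implementation. Lin's similarity of two concepts is twice the information content (IC) of their lowest common subsumer divided by the sum of their own IC values. The taxonomy `PARENT`, the frequencies `COUNTS`, and all function names are hypothetical stand-ins for WordNet and a real corpus.

```python
import math

# Hypothetical toy taxonomy (child -> parent); a stand-in for WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

# Hypothetical corpus frequencies of each concept on its own.
COUNTS = {"entity": 0, "animal": 10, "vehicle": 10, "dog": 40, "cat": 30, "car": 10}


def ancestors(concept):
    """Return the path from the concept up to the root, concept included."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def information_content(concept):
    """IC(c) = -log p(c), where p(c) covers c and all of its descendants."""
    total = sum(COUNTS.values())
    subsumed = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return math.log(total / subsumed)


def lin_similarity(a, b):
    """Lin's measure: 2 * IC(lcs) / (IC(a) + IC(b)), a value in [0, 1]."""
    if a == b:
        return 1.0
    ancestors_a = ancestors(a)
    # The first ancestor of b that also subsumes a is the lowest common subsumer.
    lcs = next(c for c in ancestors(b) if c in ancestors_a)
    denominator = information_content(a) + information_content(b)
    if denominator == 0:
        return 0.0
    return 2 * information_content(lcs) / denominator


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "car"), ("dog", "dog")]:
        print(pair, round(lin_similarity(*pair), 3))
```

In this sketch, two string instances would be treated as a match when their score exceeds a chosen threshold. The threshold is an assumption of the sketch; the abstract does not state one.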