Automatically Searching for Related Information in Multiple Knowledge Graphs

Category Science

Friday - May 19 2023, 09:09 UTC

tldr #

A team of researchers led by Professor Xindong Wu has developed an unsupervised entity alignment framework, called SE-UEA, which can improve the process of searching for related information in multiple knowledge graphs for artificial intelligence applications. The framework scored higher on precision and recall than 12 of its 14 machine learning competitors and did not require laboriously annotated data. SE-UEA consists of two modules: one that looks for surface similarities and one that looks for similarities in the relationships between entities.

content #

A team of researchers led by Professor Xindong Wu in Hefei, China has developed an unsupervised entity alignment framework to improve the process of searching for related information in multiple knowledge graphs for artificial intelligence applications. The framework brings together the advantages of multiple approaches and avoids relying on human labor to kick-start the alignment process. The team tested the framework on several cross-lingual datasets and compared its results against those of 14 other machine learning algorithms. Their model outperformed most of its competitors on two different metrics, and scored better than all of them when the metrics were combined into an overall score.

SE-UEA is the first unsupervised entity alignment framework that can automatically learn from multiple knowledge graphs.

The group's research was published in the journal Intelligent Computing.

The new framework, called SE-UEA, scored higher on precision and recall than 12 of 14 competing algorithms, some supervised and some unsupervised. It achieved the highest overall score on all three datasets. Experiments testing the framework's robustness and scalability also produced encouraging results.

A major advantage of the new framework is that it does not require complex datasets laboriously annotated by humans. It can automatically handle datasets with missing information and merge datasets that have different internal structures. The quantitative results thus show that combining relatively straightforward automatic methods of processing knowledge graphs to bootstrap a more sophisticated one is not just convenient but also effective.

SE-UEA is a two-layer machine learning framework with a graph-pair encoder-decoder.

Future research could further improve the efficiency and accuracy of the framework by refining either of its two modules.

One of the framework's two modules looks for surface similarities, and the other looks for similarities in the relationships between entities. Both operate on a pair of knowledge graphs. In this case, each pair consisted of knowledge graphs for the same content in two different languages: English paired with Japanese, French, or Chinese. The datasets were built by DBpedia from Wikipedia content.

SE-UEA was evaluated on three datasets, each containing entities in a different pair of languages.

The first module looks for not one but three different kinds of surface similarities: same name, same meaning and same location in the two knowledge graphs. Importantly, the output of this module is used as the input for the second module, which uses a type of neural network called a graph convolutional network to automatically examine the internal structure of the two knowledge graphs to discover pairs of identical entities.
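
To illustrate what a graph convolutional network does with a graph's internal structure, here is a minimal pure-Python sketch of a single graph convolution layer. It assumes the widely used propagation rule ReLU(Â H W), where Â is the adjacency matrix with self-loops, symmetrically normalized by node degree. Whether SE-UEA uses exactly this form is an assumption, and the toy graph, features, and function names are invented for illustration.

```python
import math

def normalized_adjacency(adj):
    """Compute D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def gcn_layer(adj, features, weights):
    """One graph convolution: ReLU(A_norm @ features @ weights)."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in propagated]

# Toy 3-node path graph 0 - 1 - 2 (illustrative, not from the paper)
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity weights keep the arithmetic easy to follow

embeddings = gcn_layer(adj, features, weights)
for node, vector in enumerate(embeddings):
    print(node, [round(v, 3) for v in vector])
```

Each output row mixes a node's own features with those of its neighbors, so entities with similar neighborhoods end up with similar vectors. In an alignment setting, vectors like these would typically be computed for both knowledge graphs and then compared to propose matching entities; the comparison step SE-UEA uses is not described here.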

The two modules of SE-UEA look for surface similarities and intrinsic similarities, respectively.

After the framework analyzed each pair of knowledge graphs and produced pairs of identical entities, the researchers were able to check its work against the correct answers supplied as part of the DBpedia datasets and assign scores according to their chosen evaluation metrics.
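
The scoring step can be sketched as follows. This example computes precision and recall for a set of predicted entity pairs against a set of reference pairs, and combines them with the F1 score (the harmonic mean). The article does not name the combined metric, so F1 is an assumption here, and the entity identifiers and the `evaluate_alignment` helper are hypothetical.

```python
def evaluate_alignment(predicted, gold):
    """Return (precision, recall, f1) for predicted entity pairs.

    predicted and gold are iterables of (entity_in_graph_a, entity_in_graph_b).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Invented reference pairs and predictions, for illustration only
gold_pairs = {
    ("en:Tokyo", "ja:Tokyo"),
    ("en:Kyoto", "ja:Kyoto"),
    ("en:Osaka", "ja:Osaka"),
    ("en:Mount_Fuji", "ja:Mount_Fuji"),
    ("en:Nara", "ja:Nara"),
}
predicted_pairs = {
    ("en:Tokyo", "ja:Tokyo"),
    ("en:Kyoto", "ja:Kyoto"),
    ("en:Osaka", "ja:Osaka"),
    ("en:Nara", "ja:Kyoto"),  # an incorrect match
}

precision, recall, f1 = evaluate_alignment(predicted_pairs, gold_pairs)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision penalizes wrong matches, while recall penalizes missed ones, which is why a single combined score is useful when comparing many algorithms.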

Although knowledge graphs are critical for artificial intelligence applications such as recommendation systems, any single structured representation of knowledge is generally incomplete. It is therefore desirable to combine information from multiple knowledge graphs via a process called entity alignment.

DBpedia is an open dataset consisting of structured knowledge from Wikipedia.

The most straightforward matching method is to compare surface attributes such as the names of the entities. More sophisticated methods achieve better results, but typically require elaborate input data that must be annotated by hand.
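
As a minimal sketch of name-based matching (not the SE-UEA implementation), the snippet below pairs each entity name from one toy graph with its most similar name in another, using the standard library's `difflib.SequenceMatcher`. The entity names, the `match_by_name` helper, and the similarity threshold are all illustrative assumptions.

```python
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_by_name(names_a, names_b, threshold=0.8):
    """Pair each name in names_a with its most similar name in names_b.

    Hypothetical helper: a pair is kept only if its similarity reaches
    the threshold, so entities without a close counterpart stay unmatched.
    """
    pairs = []
    for a in names_a:
        best = max(names_b, key=lambda b: name_similarity(a, b))
        if name_similarity(a, best) >= threshold:
            pairs.append((a, best))
    return pairs

# Toy entity names, invented for illustration
graph_a_names = ["Paris", "Victor Hugo", "Seine"]
graph_b_names = ["Paris", "Victor Hugo (auteur)", "La Seine", "Lyon"]

pairs = match_by_name(graph_a_names, graph_b_names, threshold=0.6)
print(pairs)
```

A purely string-based comparison like this breaks down when the two graphs use different scripts or translated names, which is one reason cross-lingual alignment also needs signals such as meaning and graph structure.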

However, SE-UEA has kept the process simple and automatic.

hashtags #

ai entityalignment knowledgegraphs se-uea dbpedia
