Hi!

Your personalized paper recommendations for 08 to 12 December, 2025.

🎯 Top Personalized Recommendations

Ontology-Based Knowledge Graph Framework for Industrial Standard Documents via Hierarchical and Propositional Structuring

Hanyang University

Rate paper: 👍 👎 ♥ Save

AI Summary

The proposed pipeline hierarchically organizes documents into ontologies, extracts triples from logical statements, and refines the extracted triples through synonym normalization and pruning. [3]
The study also highlights the importance of hierarchical and propositional structuring in improving the quality of knowledge graphs constructed from complex documents. [3]
Synonym dictionary: A collection of synonyms that can be used to normalize and refine extracted triples, improving the accuracy and efficiency of knowledge graph construction. [3]
Pruning module: A component that refines extracted triples by removing redundant or irrelevant information, improving the quality and efficiency of knowledge graphs. [3]
The results show that the proposed methodology outperforms existing methods in terms of accuracy and efficiency for question answering tasks on industrial standard documents. [2]
The study proposes an ontology-based knowledge graph construction methodology for industrial standard documents that contain conditional rules, numerical relations, and complex tables. [1]

Abstract
Ontology-based knowledge graph (KG) construction is a core technology that enables multidimensional understanding and advanced reasoning over domain knowledge. Industrial standards, in particular, contain extensive technical information and complex rules presented in highly structured formats that combine tables, scopes of application, constraints, exceptions, and numerical calculations, making KG construction especially challenging. In this study, we propose a method that organizes such documents into a hierarchical semantic structure, decomposes sentences and tables into atomic propositions derived from conditional and numerical rules, and integrates them into an ontology-knowledge graph through LLM-based triple extraction. Our approach captures both the hierarchical and logical structures of documents, effectively representing domain-specific semantics that conventional methods fail to reflect. To verify its effectiveness, we constructed rule, table, and multi-hop QA datasets, as well as a toxic clause detection dataset, from industrial standards, and implemented an ontology-aware KG-RAG framework for comparative evaluation. Experimental results show that our method achieves significant performance improvements across all QA types compared to existing KG-RAG approaches. This study demonstrates that reliable and scalable knowledge representation is feasible even for industrial documents with intertwined conditions, constraints, and scopes, contributing to future domain-specific RAG development and intelligent document management.

Why we think this paper is great for you:
This paper directly addresses the user's interest in product categorization and knowledge graphs, particularly within the context of industrial standards – a key area for structured product information. The hierarchical and propositional structuring aligns with MECE Mutually Exclusive, Collectively Exhaustive approaches.

Model management to support systems engineering workflows using ontology-based knowledge graphs

KU Leuven

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

AI Summary

Versioning: The process of tracking changes to a model or artifact over time. [3]
The implementation uses an OML/OWL/RDF stack to store and query model data, with support for versioning and consistency rules. [2]
The paper proposes a framework for managing models and workflows in system engineering, using an ontology-based approach. [1]

Abstract
System engineering has been shifting from document-centric to model-based approaches, where assets are becoming more and more digital. Although digitisation conveys several benefits, it also brings several concerns (e.g., storage and access) and opportunities. In the context of Cyber- Physical Systems (CPS), we have experts from various domains executing complex workflows and manipulating models in a plethora of different formalisms, each with their own methods, techniques and tools. Storing knowledge on these workflows can reduce considerable effort during system development not only to allow their repeatability and replicability but also to access and reason on data generated by their execution. In this work, we propose a framework to manage modelling artefacts generated from workflow executions. The basic workflow concepts, related formalisms and artefacts are formally defined in an ontology specified in OML (Ontology Modelling Language). This ontology enables the construction of a knowledge graph that contains system engineering data to which we can apply reasoning. We also developed several tools to support system engineering during the design of workflows, their enactment, and artefact storage, considering versioning, querying and reasoning on the stored data. These tools also hide the complexity of manipulating the knowledge graph directly. Finally, we have applied our proposed framework in a real-world system development scenario of a drivetrain smart sensor system. Results show that our proposal not only helped the system engineer with fundamental difficulties like storage and versioning but also reduced the time needed to access relevant information and new knowledge that can be inferred from the knowledge graph.

Why we think this paper is great for you:
Given the user’s interest in knowledge graphs and product taxonomy, this paper’s focus on model-based systems engineering using knowledge graphs is highly relevant. It offers a structured approach to managing complex product information, aligning with the user’s interest in knowledge management.

Knowledge Graph Enrichment and Reasoning for Nobel Laureates

VNU University of

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

AI Summary

A small-world network structure is uncovered driven by tight-knit scientific communities and bridged by humanitarian figures. [3]
GraphRAG chatbot, powered by a fine-tuned Qwen model, offers a cost-effective and accurate solution for querying this complex data, particularly for deep reasoning tasks. [3]
Small-world network: A network with a short average path length (L) and high clustering coefficient (C), indicating rapid information flow and strong local communities. [3]
GraphRAG chatbot: A cost-effective and accurate solution for querying the complex data, powered by a fine-tuned Qwen model. [3]
Future work will focus on expanding the dataset to include citation networks and further optimizing the Text2Cypher model for even higher accuracy. [3]
Gemini 2.5 Flash Lite achieves higher average accuracy (76.41%) than our approach (72.85%), indicating stronger performance on general question answering. [3]
The GraphRAG chatbot offers competitive performance on 2-hop and 4-hop questions, achieving an accuracy of 83.69% and76.98%, respectively. [2]
The paper presents an end-to-end pipeline for knowledge graph construction and analysis using the Nobel Prize dataset enriched with Wikipedia biographies. [1]

Abstract
This project aims to construct and analyze a comprehensive knowledge graph of Nobel Prize and Laureates by enriching existing datasets with biographical information extracted from Wikipedia. Our approach integrates multiple advanced techniques, consisting of automatic data augmentation using LLMs for Named Entity Recognition (NER) and Relation Extraction (RE) tasks, and social network analysis to uncover hidden patterns within the scientific community. Furthermore, we also develop a GraphRAG-based chatbot system utilizing a fine-tuned model for Text2Cypher translation, enabling natural language querying over the knowledge graph. Experimental results demonstrate that the enriched graph possesses small-world network properties, identifying key influential figures and central organizations. The chatbot system achieves a competitive accuracy on a custom multiple-choice evaluation dataset, proving the effectiveness of combining LLMs with structured knowledge bases for complex reasoning tasks. Data and source code are available at: https://github.com/tlam25/network-of-awards-and-winners.

Why we think this paper is great for you:
The paper’s construction of a knowledge graph from Nobel Prize data directly relates to the user's interest in knowledge graphs and ontology for products. Utilizing LLMs for data augmentation is a sophisticated approach to enriching this knowledge base.

gHAWK: Local and Global Structure Encoding for Scalable Training of Graph Neural Networks on Knowledge Graphs

University of Texas at Ar

Rate paper: 👍 👎 ♥ Save

Abstract
Knowledge Graphs (KGs) are a rich source of structured, heterogeneous data, powering a wide range of applications. A common approach to leverage this data is to train a graph neural network (GNN) on the KG. However, existing message-passing GNNs struggle to scale to large KGs because they rely on the iterative message passing process to learn the graph structure, which is inefficient, especially under mini-batch training, where a node sees only a partial view of its neighborhood. In this paper, we address this problem and present gHAWK, a novel and scalable GNN training framework for large KGs. The key idea is to precompute structural features for each node that capture its local and global structure before GNN training even begins. Specifically, gHAWK introduces a preprocessing step that computes: (a)~Bloom filters to compactly encode local neighborhood structure, and (b)~TransE embeddings to represent each node's global position in the graph. These features are then fused with any domain-specific features (e.g., text embeddings), producing a node feature vector that can be incorporated into any GNN technique. By augmenting message-passing training with structural priors, gHAWK significantly reduces memory usage, accelerates convergence, and improves model accuracy. Extensive experiments on large datasets from the Open Graph Benchmark (OGB) demonstrate that gHAWK achieves state-of-the-art accuracy and lower training time on both node property prediction and link prediction tasks, topping the OGB leaderboard for three graphs.

Why we think this paper is great for you:
This paper explores graph neural networks on knowledge graphs, which is a core component of the user’s interest in leveraging graphs for product understanding and categorization. The focus on scalability is particularly relevant for managing large product datasets.

Visual Categorization Across Minds and Models: Cognitive Analysis of Human Labeling and Neuro-Symbolic Integration

Georgia Institute of the

Rate paper: 👍 👎 ♥ Save

Rate image: 👍 👎

Abstract
Understanding how humans and AI systems interpret ambiguous visual stimuli offers critical insight into the nature of perception, reasoning, and decision-making. This paper examines image labeling performance across human participants and deep neural networks, focusing on low-resolution, perceptually degraded stimuli. Drawing from computational cognitive science, cognitive architectures, and connectionist-symbolic hybrid models, we contrast human strategies such as analogical reasoning, shape-based recognition, and confidence modulation with AI's feature-based processing. Grounded in Marr's tri-level hypothesis, Simon's bounded rationality, and Thagard's frameworks of representation and emotion, we analyze participant responses in relation to Grad-CAM visualizations of model attention. Human behavior is further interpreted through cognitive principles modeled in ACT-R and Soar, revealing layered and heuristic decision strategies under uncertainty. Our findings highlight key parallels and divergences between biological and artificial systems in representation, inference, and confidence calibration. The analysis motivates future neuro-symbolic architectures that unify structured symbolic reasoning with connectionist representations. Such architectures, informed by principles of embodiment, explainability, and cognitive alignment, offer a path toward AI systems that are not only performant but also interpretable and cognitively grounded.

Why we think this paper is great for you:
This paper’s investigation of human and AI image labeling aligns with the user's interest in visual categorization and the integration of cognitive analysis. The focus on perceptual understanding is a key aspect of the user's broader interests.

Cross-platform Product Matching Based on Entity Alignment of Knowledge Graph with RAEA model

City University of Hong

Rate paper: 👍 👎 ♥ Save

AI Summary

RAEA achieves state-of-the-art performance on public datasets DBP15K and DWY100K for EA task, outperforming competitive GNN-based models. [3]
The ablation study shows that attribute triples, relation triples, and entity names are necessary to be considered for EA, and the Relation-aware graph attention networks and MPnet encoder contribute to the improvements in EA task. [3]
Graph Attention Networks (GATs): A type of neural network architecture designed for graph-structured data, which can learn node representations by attending to their neighbors. [3]
Multi-Relational Graphs: Graphs that contain multiple types of relationships between nodes, such as friendship and acquaintance in a social network. [3]
The proposed RAEA model achieves impressive results on the eBay-Amazon dataset, with NDCG = 0.566 when matching Top-10 best matched Amazon products for each eBay product. [3]
The RAEA model is a novel approach for entity alignment that leverages both attribute triples and relation triples to capture the interactions between entities and relations. [2]
Entity Alignment (EA): The process of matching identical or similar entities across different datasets or platforms. [1]

Abstract
Product matching aims to identify identical or similar products sold on different platforms. By building knowledge graphs (KGs), the product matching problem can be converted to the Entity Alignment (EA) task, which aims to discover the equivalent entities from diverse KGs. The existing EA methods inadequately utilize both attribute triples and relation triples simultaneously, especially the interactions between them. This paper introduces a two-stage pipeline consisting of rough filter and fine filter to match products from eBay and Amazon. For fine filtering, a new framework for Entity Alignment, Relation-aware and Attribute-aware Graph Attention Networks for Entity Alignment (RAEA), is employed. RAEA focuses on the interactions between attribute triples and relation triples, where the entity representation aggregates the alignment signals from attributes and relations with Attribute-aware Entity Encoder and Relation-aware Graph Attention Networks. The experimental results indicate that the RAEA model achieves significant improvements over 12 baselines on EA task in the cross-lingual dataset DBP15K (6.59% on average Hits@1) and delivers competitive results in the monolingual dataset DWY100K. The source code for experiments on DBP15K and DWY100K is available at github (https://github.com/Mockingjay-liu/RAEA-model-for-Entity-Alignment).

Why we think this paper is great for you:
Given the user’s interest in product categorization and knowledge graphs, this paper’s focus on entity alignment of knowledge graphs is highly relevant. The task of product matching directly addresses the need for structured product data.

Representation of the structure of graphs by sequences of instructions

University of Mlaga

Rate paper: 👍 👎 ♥ Save

AI Summary

The instruction string representation consistently obtains the best classification performance compared to the binary representation. [3]
The computation time for the instruction string representation is faster than the binary representation for all model sizes. [3]
The results indicate that the networks learn correctly for several tens of epochs before entering an overfitting regime. [3]
Graph neural networks (GNNs): A type of neural network that operates on graph-structured data. [3]
Further research is needed to explore the full potential of this new method and its applications in various fields. [3]
The proposed method has the potential to be used with language models, as it provides a reversible transformation between strings and graphs. [2]
The proposed representation of graphs as sequences of instructions is compact, reversible, and amenable to deep learning models. [1]

Abstract
The representation of graphs is commonly based on the adjacency matrix concept. This formulation is the foundation of most algebraic and computational approaches to graph processing. The advent of deep learning language models offers a wide range of powerful computational models that are specialized in the processing of text. However, current procedures to represent graphs are not amenable to processing by these models. In this work, a new method to represent graphs is proposed. It represents the adjacency matrix of a graph by a string of simple instructions. The instructions build the adjacency matrix step by step. The transformation is reversible, i.e. given a graph the string can be produced and vice versa. The proposed representation is compact and it maintains the local structural patterns of the graph. Therefore, it is envisaged that it could be useful to boost the processing of graphs by deep learning models. A tentative computational experiment is reported, with favorable results.

Why we think this paper is great for you:
This paper’s exploration of graph representation using deep learning language models aligns with the user’s interest in knowledge graphs and the application of advanced computational models to structured data.

Product Categorization

Fock Space Tensor Product Categorifications and Multiplicities in Complex Rank Parabolic Category O

MIT

Rate paper: 👍 👎 ♥ Save

Abstract
We undertake the study of complex rank analogues of parabolic category O defined using Deligne categories. We regard these categories as a family over an affine space, introduce a stratification on this parameter space, and formulate conjectures on the structural constancy of fibers on each stratum. Using the theory of $\mathfrak{sl}_{\mathbb{Z}}$-categorification, we prove these conjectures for admissible strata. Namely, we axiomatize the notion of multi-Fock tensor product categorifications (MFTPCs), which are interval finite highest weight categories equipped with a compatible action of commuting copies of $\mathfrak{sl}_{\mathbb{Z}}$, categorifying an external tensor product of tensor products of highest and lowest weight Fock space representations. We prove a uniqueness theorem for admissible MFTPCs and show that complex rank parabolic categories O have the structure of MFTPCs. In turn, for suitable choices of parameters, we produce an equivalence of complex rank category O with a stable limit of classical parabolic categories O, resolving our conjecture in the admissible case. These equivalences yield multiplicities of simple objects in Verma modules in terms of stable parabolic Kazhdan--Lusztig polynomials, answering a question posed by Etingof. In particular, for the case of two Levi blocks of non-integral size, we completely describe the structure of the corresponding category O in terms of stable representation theory. As an application, we obtain multiplicities for parabolic analogs of hyperalgebra Verma modules introduced by Haboush in the large rank and large characteristic limit.

AI Summary

It introduces a conjecture that the structure of the category (Ot s)ext does not depend on the choice of (t,s) ∈ W♮ I, and provides evidence for this conjecture by showing that it holds when the stratum WI is admissible. [2]
The paper discusses the problem of computing multiplicities in modular representation theory using model-theoretic methods. [1]

Continual Generalized Category Discovery

Cluster-Dags as Powerful Background Knowledge For Causal Discovery

Technical University of M

Rate paper: 👍 👎 ♥ Save

Abstract
Finding cause-effect relationships is of key importance in science. Causal discovery aims to recover a graph from data that succinctly describes these cause-effect relationships. However, current methods face several challenges, especially when dealing with high-dimensional data and complex dependencies. Incorporating prior knowledge about the system can aid causal discovery. In this work, we leverage Cluster-DAGs as a prior knowledge framework to warm-start causal discovery. We show that Cluster-DAGs offer greater flexibility than existing approaches based on tiered background knowledge and introduce two modified constraint-based algorithms, Cluster-PC and Cluster-FCI, for causal discovery in the fully and partially observed setting, respectively. Empirical evaluation on simulated data demonstrates that Cluster-PC and Cluster-FCI outperform their respective baselines without prior knowledge.

AI Summary

The paper discusses the use of Cluster DAGs (Directed Acyclic Graphs) in causal inference, which allows for the identification of causal effects between variables within clusters. [3]
Cluster DAGs are a type of graphical model that represents relationships between variables within clusters, and can be used to identify causal effects even when there is latent confounding. [3]
The authors introduce a new algorithm for learning Cluster DAGs from data, which they call 'Cluster-DAG-Learn'. [3]
They also discuss the use of Cluster DAGs in various applications, including policy evaluation, treatment effect estimation, and causal discovery over clusters of variables. [3]
The paper concludes by highlighting the potential benefits of using Cluster DAGs in causal inference, including improved accuracy and robustness to latent confounding. [3]
Cluster DAG: A type of graphical model that represents relationships between variables within clusters. [3]
Latent Confounding: The presence of unobserved variables that affect the relationship between two or more observed variables. [3]
Causal Effect: The effect of a cause on an outcome, often measured as the change in the outcome variable due to a change in the cause variable. [3]
Cluster DAGs provide a powerful tool for causal inference, allowing researchers to identify causal effects between variables within clusters even when there is latent confounding. [3]
The use of Cluster DAGs can improve the accuracy and robustness of causal inference results in various applications. [3]

VLM-NCD:Novel Class Discovery with Vision-Based Large Language Models

Northwestern Polytechnic

Rate paper: 👍 👎 ♥ Save

Abstract
Novel Class Discovery aims to utilise prior knowledge of known classes to classify and discover unknown classes from unlabelled data. Existing NCD methods for images primarily rely on visual features, which suffer from limitations such as insufficient feature discriminability and the long-tail distribution of data. We propose LLM-NCD, a multimodal framework that breaks this bottleneck by fusing visual-textual semantics and prototype guided clustering. Our key innovation lies in modelling cluster centres and semantic prototypes of known classes by jointly optimising known class image and text features, and a dualphase discovery mechanism that dynamically separates known or novel samples via semantic affinity thresholds and adaptive clustering. Experiments on the CIFAR-100 dataset show that compared to the current methods, this method achieves up to 25.3% improvement in accuracy for unknown classes. Notably, our method shows unique resilience to long tail distributions, a first in NCD literature.

Graphs for Products

Colouring Graphs Without a Subdivided H-Graph: A Full Complexity Classification

Durham University

Rate paper: 👍 👎 ♥ Save

Abstract
We consider Colouring on graphs that are $H$-subgraph-free for some fixed graph $H$, i.e., graphs that do not contain $H$ as a subgraph. It is known that even $3$-Colouring is NP-complete for $H$-subgraph-free graphs whenever $H$ has a cycle; or a vertex of degree at least $5$; or a component with two vertices of degree $4$, while Colouring is polynomial-time solvable for $H$-subgraph-free graphs if $H$ is a forest of maximum degree at most $3$, in which each component has at most one vertex of degree $3$. For connected graphs $H$, this means that it remains to consider when $H$ is tree of maximum degree $4$ with exactly one vertex of degree $4$, or a tree of maximum degree $3$ with at least two vertices of degree $3$. We let $H$ be a so-called subdivided "H"-graph, which is either a subdivided $\mathbb{H}_0$: a tree of maximum degree $4$ with exactly one vertex of degree $4$ and no vertices of degree $3$, or a subdivided $\mathbb{H}_1$: a tree of maximum degree $3$ with exactly two vertices of degree $3$. In the literature, only a limited number of polynomial-time and NP-completeness results for these cases are known. We develop new polynomial-time techniques that allow us to determine the complexity of Colouring on $H$-subgraph-free graphs for all the remaining subdivided "H"-graphs, so we fully classify both cases. As a consequence, the complexity of Colouring on $H$-subgraph-free graphs has now been settled for all connected graphs $H$ except when $H$ is a tree of maximum degree $4$ with exactly one vertex of degree $4$ and at least one vertex of degree $3$; or a tree of maximum degree $3$ with at least three vertices of degree $3$. We also employ our new techniques to obtain the same new polynomial-time results for another classic graph problem, namely Stable Cut.

Interests not found

We did not find any papers that match the below interests. Try other terms also consider if the content exists in arxiv.org.

Taxonomy of Products
MECE Mutually Exclusive, Collectively Exhaustive.

You can edit or add more interests any time.

Help us improve your experience!

This project is on its early stages your feedback can be pivotal on the future of the project. Let us know what you think about this week's papers and suggestions!

Give Feedback