Papers from 13 to 17 October, 2025

Here are your personalized paper recommendations, sorted by relevance.
Graphs for Products
London School of Economics
Abstract
An account of 2-factors in graphs and their history is presented. We give a direct graph-theoretic proof of the 2-Factor Theorem and a new variant of it, and also a new complete characterisation of the maximal graphs without 2-factors. This is based on the important works of Tibor Gallai on 1-factors and of Hans-Boris Belck on k-factors, both published in 1950 and independently containing the theory of alternating chains. We also present an easy proof that a $(2k+1)$-regular graph with at most $2k$ leaves has a 2-factor, and we describe all connected $(2k+1)$-regular graphs with exactly $2k+1$ leaves without a 2-factor. This generalises Julius Petersen's famous theorem, that any 3-regular graph with at most two leaves has a 1-factor, and it generalises the extremal graphs Sylvester discovered for that theorem.
AI Insights
  • Direct graph‑theoretic proof of the 2‑Factor Theorem, no heavy algebra.
  • New variant tightens 2‑factor existence conditions in irregular graphs.
  • Complete characterization of maximal graphs without 2‑factors, closing a gap.
  • Extends Petersen’s 3‑regular result to all (2k+1)-regular graphs: ≤2k leaves guarantee a 2‑factor.
  • Classifies exceptional (2k+1)-regular graphs with 2k+1 leaves lacking a 2‑factor, generalizing Sylvester’s examples.
  • Highlights Belck’s 1950 regular‑factor work predating Petersen, Tutte, Gallai.
  • Cites Belck 1950, Tutte 1947, Gallai 1950, and “Factors and Factorizations of Graphs” (2011).
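The Petersen result the paper generalises (every 3-regular graph with at most two leaves has a 1-factor, i.e. a perfect matching) can be checked concretely on the Petersen graph itself. The sketch below finds a perfect matching by simple backtracking; this is only an illustration on a small instance, not the paper's graph-theoretic proof technique.

```python
def perfect_matching(n, edges):
    """Backtracking search for a perfect matching in a graph on vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    def solve(unmatched, matching):
        if not unmatched:
            return matching
        u = min(unmatched)  # always match the smallest unmatched vertex
        for v in adj[u]:
            if v in unmatched:
                res = solve(unmatched - {u, v}, matching + [(u, v)])
                if res is not None:
                    return res
        return None  # no way to match u -> backtrack

    return solve(frozenset(range(n)), [])

# Petersen graph: outer 5-cycle, inner pentagram, five spokes
petersen = ([(i, (i + 1) % 5) for i in range(5)]
            + [(i, i + 5) for i in range(5)]
            + [(5 + i, 5 + (i + 2) % 5) for i in range(5)])
m = perfect_matching(10, petersen)  # a 1-factor exists, e.g. the five spokes
```

The Petersen graph is 3-regular and bridgeless, so the theorem guarantees the search succeeds.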
Abstract
Given a family $\mathcal{F}$ of graphs, a graph is \emph{$\mathcal{F}$-subgraph-free} if it has no subgraph isomorphic to a member of $\mathcal{F}$. We present a fixed-parameter linear-time algorithm that decides whether a planar graph can be made $\mathcal{F}$-subgraph-free by deleting at most $k$ vertices or $k$ edges, where the parameters are $k$, $\lvert \mathcal{F} \rvert$, and the maximum number of vertices in a member of $\mathcal{F}$. The running time of our algorithm is double-exponential in the parameters, which is faster than the algorithm obtained by applying the first-order model checking result for graphs of bounded twin-width. To obtain this result, we develop a unified framework for designing algorithms for this problem on graphs with a ``product structure.'' Using this framework, we also design algorithms for other graph classes that generalize planar graphs. Specifically, the problem admits a fixed-parameter linear time algorithm on disk graphs of bounded local radius, and a fixed-parameter almost-linear time algorithm on graphs of bounded genus. Finally, we show that our result gives a tight fixed-parameter algorithm in the following sense: Even when $\mathcal{F}$ consists of a single graph $F$ and the input is restricted to planar graphs, it is unlikely to drop any parameters $k$ and $\lvert V(F) \rvert$ while preserving fixed-parameter tractability, unless the Exponential-Time Hypothesis fails.
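The decision problem itself (can $k$ vertex deletions make $G$ $\mathcal{F}$-subgraph-free?) is easy to state even though the paper's contribution is a fast parameterized algorithm. The brute-force sketch below, for a single forbidden graph $F$ on tiny inputs, is only meant to pin down the problem definition; it runs in exponential time and uses none of the product-structure machinery from the paper.

```python
from itertools import combinations, permutations

def has_subgraph(g_nodes, g_edges, f_nodes, f_edges):
    """Brute force: does some injection of F's vertices into G preserve every F-edge?"""
    g_set = {frozenset(e) for e in g_edges}
    for image in permutations(g_nodes, len(f_nodes)):
        phi = dict(zip(f_nodes, image))
        if all(frozenset((phi[u], phi[v])) in g_set for u, v in f_edges):
            return True
    return False

def deletable_to_f_free(g_nodes, g_edges, f_nodes, f_edges, k):
    """Can deleting at most k vertices of G destroy every copy of F?"""
    for size in range(k + 1):
        for s in combinations(g_nodes, size):
            rem = [v for v in g_nodes if v not in s]
            kept = [e for e in g_edges if e[0] in rem and e[1] in rem]
            if not has_subgraph(rem, kept, f_nodes, f_edges):
                return True
    return False

# Example: K4 needs two vertex deletions to become triangle-free
k4_nodes = [0, 1, 2, 3]
k4_edges = list(combinations(k4_nodes, 2))
triangle_nodes, triangle_edges = [0, 1, 2], [(0, 1), (1, 2), (0, 2)]
```

Deleting any single vertex of $K_4$ leaves a $K_3$, so $k=1$ fails while $k=2$ succeeds.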
Continual Generalized Category Discovery
Abstract
Traditional named entity recognition (NER) aims to identify text mentions into pre-defined entity types. Continual Named Entity Recognition (CNER) is introduced since entity categories are continuously increasing in various real-world scenarios. However, existing continual learning (CL) methods for NER face challenges of catastrophic forgetting and semantic shift of non-entity type. In this paper, we propose GenCNER, a simple but effective Generative framework for CNER to mitigate the above drawbacks. Specifically, we skillfully convert the CNER task into sustained entity triplet sequence generation problem and utilize a powerful pre-trained seq2seq model to solve it. Additionally, we design a type-specific confidence-based pseudo labeling strategy along with knowledge distillation (KD) to preserve learned knowledge and alleviate the impact of label noise at the triplet level. Experimental results on two benchmark datasets show that our framework outperforms previous state-of-the-art methods in multiple CNER settings, and achieves the smallest gap compared with non-CL results.
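The core reformulation in GenCNER is converting NER into entity triplet sequence generation for a seq2seq model. The sketch below shows one plausible linearization of labeled spans into a target string; the triplet format (mention, span, type) and its delimiters are assumptions for illustration, and the paper's exact serialization may differ.

```python
def linearize(tokens, spans):
    """Serialize NER spans as an entity triplet sequence for a seq2seq target.

    tokens: list of words; spans: list of (start, end, type), end exclusive.
    Assumed triplet format: (mention text; token span; entity type).
    """
    parts = []
    for start, end, etype in sorted(spans):
        mention = " ".join(tokens[start:end])
        parts.append(f"({mention}; {start}-{end}; {etype})")
    return " ".join(parts)

tokens = "Barack Obama visited Berlin".split()
target = linearize(tokens, [(0, 2, "PER"), (3, 4, "LOC")])
# target == "(Barack Obama; 0-2; PER) (Berlin; 3-4; LOC)"
```

When a new entity type arrives in a later task, only new triplets are appended to the target vocabulary, which is what makes the generative formulation attractive for continual learning.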
Humboldt-Universität zu
Abstract
Over the past decade, the proliferation of public and enterprise data lakes has fueled intensive research into data discovery, aiming to identify the most relevant data from vast and complex corpora to support diverse user tasks. Significant progress has been made through the development of innovative index structures, similarity measures, and querying infrastructures. Despite these advances, a critical aspect remains overlooked: relevance is time-varying. Existing discovery methods largely ignore this temporal dimension, especially when explicit date/time metadata is missing. To fill this gap, we outline a vision for a data discovery system that incorporates the temporal dimension of data. Specifically, we define the problem of temporally-valid data discovery and argue that addressing it requires techniques for version discovery, temporal lineage inference, change log synthesis, and time-aware data discovery. We then present a system architecture to deliver these techniques, before we summarize research challenges and opportunities. As such, we lay the foundation for a new class of data discovery systems, transforming how we interact with evolving data lakes.
AI Insights
  • Blend’s hybrid cache mixes in‑memory storage with on‑demand time travel, cutting latency for large‑scale discovery.
  • A classifier separates unrelated datasets from temporally linked versions, enabling precise lineage construction.
  • Heuristics extract change logs from version histories, giving users a navigable timeline of data evolution.
  • Content‑based and version‑specific queries are decoupled, letting the system infer query intent from minimal input.
  • The architecture infers temporal context even without explicit timestamps, using schema and metadata cues.
  • Recommended: “Delta Lake: High‑performance ACID table storage over cloud object stores” for robust versioning foundations.
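The paper's notion of temporally-valid discovery can be illustrated with a toy catalog lookup: given dataset versions annotated with (possibly inferred) validity intervals, return only the versions valid at the query time. This is a minimal sketch of the problem statement, not the envisioned system; the catalog schema and field names are invented for illustration.

```python
from datetime import date

# Toy catalog: each dataset version carries an inferred validity interval
catalog = [
    {"name": "sales", "version": 1,
     "valid_from": date(2023, 1, 1), "valid_to": date(2023, 6, 30)},
    {"name": "sales", "version": 2,
     "valid_from": date(2023, 7, 1), "valid_to": date(2024, 12, 31)},
]

def temporally_valid(catalog, name, at):
    """Return the versions of `name` whose validity interval covers time `at`."""
    return [d for d in catalog
            if d["name"] == name and d["valid_from"] <= at <= d["valid_to"]]

hits = temporally_valid(catalog, "sales", date(2023, 8, 15))  # only version 2
```

The hard parts the paper targets (inferring the intervals when no timestamps exist, and synthesizing change logs across versions) all happen before a lookup like this becomes possible.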
Knowledge Graphs
Abstract
Inductive Knowledge Graph Reasoning (KGR) aims to discover facts in open-domain KGs containing unknown entities and relations, which poses a challenge for KGR models in comprehending uncertain KG components. Existing studies have proposed Knowledge Graph Foundation Models (KGFMs) that learn structural invariances across KGs to handle this uncertainty. Recently, Large Language Models (LLMs) have demonstrated strong capabilities for open-domain knowledge reasoning. As a result, the latest research has focused on LLM-based KGFMs that integrate LLM knowledge with KG context for inductive KGR. However, the intrinsic knowledge of LLMs may be overshadowed by sparse KG context, leading to LLM knowledge distortion, which can cause irreversible damage to model reasoning. Moreover, existing LLM-based KGR methods still struggle to fully constrain generative hallucinations in LLMs, severely limiting the credibility of reasoning results. To address these limitations, we propose a Knowledge Reasoning Language Model (KRLM) that achieves unified coordination between LLM knowledge and KG context throughout the KGR process. Specifically, we design a Knowledge Reasoning Language (KRL) instruction format and a KRL tokenizer to align LLM knowledge with KG representations. Then, we propose a KRL attention layer that coordinates intrinsic LLM knowledge with additional KG context through a dynamic knowledge memory mechanism. Finally, a structure-aware next-entity predictor is proposed, which strictly constrains the reasoning results within a trustworthy knowledge domain. Extensive experimental results on 25 real-world inductive KGR datasets demonstrate the significant superiority of the proposed KRLM\footnote{Our source codes are available at https://anonymous.4open.science/r/KRLM-EA36} in both zero-shot reasoning and fine-tuning scenarios.
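The last ingredient, constraining predictions to a trustworthy knowledge domain, is commonly implemented by masking the logits of out-of-domain entities before normalizing. The sketch below shows that generic masking idea in isolation; the entity names and scores are invented, and KRLM's actual structure-aware predictor is more involved than this.

```python
import math

def constrained_softmax(logits, vocab, allowed):
    """Zero out probability mass for entities outside the trusted domain."""
    masked = [l if v in allowed else float("-inf")
              for l, v in zip(logits, vocab)]
    m = max(masked)
    exps = [math.exp(l - m) for l in masked]  # exp(-inf) == 0.0
    z = sum(exps)
    return {v: e / z for v, e in zip(vocab, exps)}

vocab = ["Paris", "Pikachu", "Lyon"]  # hypothetical candidate entities
probs = constrained_softmax([2.0, 5.0, 1.0], vocab, allowed={"Paris", "Lyon"})
# "Pikachu" gets probability 0 even though its raw logit is the largest
```

Hard masking of this kind rules hallucinated entities out by construction, rather than merely discouraging them through training.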
The Hong Kong University
Abstract
Deductive and abductive reasoning are two critical paradigms for analyzing knowledge graphs, enabling applications from financial query answering to scientific discovery. Deductive reasoning on knowledge graphs usually involves retrieving entities that satisfy a complex logical query, while abductive reasoning generates plausible logical hypotheses from observations. Despite their clear synergistic potential, where deduction can validate hypotheses and abduction can uncover deeper logical patterns, existing methods address them in isolation. To bridge this gap, we propose DARK, a unified framework for Deductive and Abductive Reasoning in Knowledge graphs. As a masked diffusion model capable of capturing the bidirectional relationship between queries and conclusions, DARK has two key innovations. First, to better leverage deduction for hypothesis refinement during abductive reasoning, we introduce a self-reflective denoising process that iteratively generates and validates candidate hypotheses against the observed conclusion. Second, to discover richer logical associations, we propose a logic-exploration reinforcement learning approach that simultaneously masks queries and conclusions, enabling the model to explore novel reasoning compositions. Extensive experiments on multiple benchmark knowledge graphs show that DARK achieves state-of-the-art performance on both deductive and abductive reasoning tasks, demonstrating the significant benefits of our unified approach.
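The synergy the paper exploits (deduction validating abductive hypotheses) can be shown on a toy knowledge graph: enumerate candidate hypotheses, execute each one deductively, and keep the one whose answer set best matches the observation. This sketch uses exhaustive search and a Jaccard score where DARK uses a masked diffusion model with a self-reflective denoising process; the facts and scoring are invented for illustration.

```python
# Toy KG of (head, relation, tail) facts
kg = {("alice", "works_at", "acme"), ("bob", "works_at", "acme"),
      ("alice", "lives_in", "paris"), ("carol", "lives_in", "paris")}

def deduce(relation, tail):
    """Deduction: the entities X satisfying the query (X, relation, tail)."""
    return {h for h, r, t in kg if r == relation and t == tail}

def abduce(observed, candidates):
    """Abduction with a validation step: score each candidate hypothesis by
    how well its deduced answer set matches the observed entities."""
    best, best_score = None, -1.0
    for rel, tail in candidates:
        answers = deduce(rel, tail)  # validate the hypothesis deductively
        score = len(answers & observed) / len(answers | observed)
        if score > best_score:
            best, best_score = (rel, tail), score
    return best

hypothesis = abduce({"alice", "bob"},
                    [("works_at", "acme"), ("lives_in", "paris")])
# ("works_at", "acme") explains the observation exactly
```

The generate-then-validate loop here is the one-shot analogue of the paper's iterative denoising: each candidate is checked against the observed conclusion before being accepted.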
Knowledge Management
Scuola Internazionale di
Abstract
The rapid increase in multimodal data availability has sparked significant interest in cross-modal knowledge distillation (KD) techniques, where richer "teacher" modalities transfer information to weaker "student" modalities during model training to improve performance. However, despite successes across various applications, cross-modal KD does not always result in improved outcomes, primarily due to a limited theoretical understanding that could inform practice. To address this gap, we introduce the Cross-modal Complementarity Hypothesis (CCH): we propose that cross-modal KD is effective when the mutual information between teacher and student representations exceeds the mutual information between the student representation and the labels. We theoretically validate the CCH in a joint Gaussian model and further confirm it empirically across diverse multimodal datasets, including image, text, video, audio, and cancer-related omics data. Our study establishes a novel theoretical framework for understanding cross-modal KD and offers practical guidelines based on the CCH criterion to select optimal teacher modalities for improving the performance of weaker modalities.
AI Insights
  • LMI is a non‑parametric estimator that operates on low‑dimensional embeddings, sidestepping high‑dimensional MI estimation.
  • It is embedded in a theoretically motivated architecture that ensures representations are MI‑friendly.
  • On synthetic, CMU‑MOSEI, and BRCA/KIPAN/LIHC cancer data, it beats direct fusion and baselines.
  • CCH states a teacher is effective when I(T;S) > I(S;Y).
  • The guideline is to pick the teacher with the largest I(T;S)–I(S;Y) gap.
  • Compared to MINE, LMI matches accuracy with less compute, and the paper defines Knowledge Distillation and Multimodal Fusion.
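The CCH criterion has a closed form in the joint Gaussian model the paper uses for its theoretical validation: two jointly Gaussian variables with correlation $\rho$ have mutual information $I = -\tfrac{1}{2}\ln(1-\rho^2)$. The sketch below applies the criterion $I(T;S) > I(S;Y)$ with assumed correlation values; real use would require estimating these quantities from data (e.g. with LMI), not picking them by hand.

```python
import math

def gaussian_mi(rho):
    """Mutual information (in nats) of two jointly Gaussian variables
    with correlation rho: I = -0.5 * ln(1 - rho^2)."""
    return -0.5 * math.log(1.0 - rho ** 2)

def cch_selects_teacher(rho_ts, rho_sy):
    """CCH criterion: cross-modal KD is expected to help when
    I(teacher; student) > I(student; labels)."""
    return gaussian_mi(rho_ts) > gaussian_mi(rho_sy)

# Hypothetical correlations: teacher-student 0.8, student-labels 0.5
helps = cch_selects_teacher(rho_ts=0.8, rho_sy=0.5)  # criterion satisfied
```

Since $-\tfrac{1}{2}\ln(1-\rho^2)$ is increasing in $|\rho|$, in this model the criterion reduces to comparing the magnitudes of the two correlations.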
University of Toronto
Abstract
The landscape of interactive systems is shifting toward dynamic, generative experiences that empower users to explore and construct knowledge in real time. Yet, timelines -- a fundamental tool for representing historical and conceptual development -- remain largely static, limiting user agency and curiosity. We introduce the concept of a generative timeline: an AI-powered timeline that adapts to users' evolving questions by expanding or contracting in response to input. We instantiate this concept through KnowledgeTrail, a system that enables users to co-construct timelines of historical events and knowledge formation processes. Two user studies showed that KnowledgeTrail fosters curiosity-driven exploration, serendipitous discovery, and the ability to trace complex relationships between ideas and events, while citation features supported verification yet revealed fragile trust shaped by perceptions of source credibility. We contribute a vision for generative timelines as a new class of exploratory interface, along with design insights for balancing serendipity and credibility.

Interests not found

We did not find any papers matching the interests below. Try other terms, and consider whether such content exists on arxiv.org.
  • Product Categorization
  • Taxonomy of Products
  • Ontology for Products
You can edit or add more interests any time.

Unsubscribe from these updates