🎯 Top Personalized Recommendations
AI Summary - Ontology: A data structure representing a domain or knowledge area, including concepts, relationships, and rules. [3]
- Logit biasing: A technique used to adjust the model's output probabilities by adding or subtracting values from the logits (unnormalized probabilities) of specific tokens or classes. [3]
- GPT-4: A large language model developed by OpenAI that can generate human-like text based on input prompts. [2]
- The paper develops an intent prediction system using GPT-4 and a custom ontology. [1]
Abstract
We introduce a neuro-symbolic framework for multi-intent understanding in mobile AI agents by integrating a structured intent ontology with compact language models. Our method leverages retrieval-augmented prompting, logit biasing, and optional classification heads to inject symbolic intent structure into both input and output representations. We formalize a new evaluation metric, Semantic Intent Similarity (SIS), based on hierarchical ontology depth, capturing semantic proximity even when predicted intents differ lexically. Experiments on a subset of ambiguous/demanding dialogues of MultiWOZ 2.3 (with oracle labels from GPT-o3) demonstrate that a 3B Llama model with ontology augmentation approaches GPT-4 accuracy (85% vs. 90%) at a tiny fraction of the energy and memory footprint. Qualitative comparisons show that ontology-augmented models produce more grounded, disambiguated multi-intent interpretations. Our results validate symbolic alignment as an effective strategy for enabling accurate and efficient on-device NLU.
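Logit biasing, mentioned in both the summary and the abstract, is a general decoding technique: add a bonus to the logits of tokens tied to ontology intent labels so the model favors them without forbidding anything else. A minimal sketch of the idea (the token ids, bias value, and vocabulary size below are illustrative assumptions, not values from the paper):

```python
import numpy as np

# Hypothetical ids of tokens that begin ontology intent labels; in practice
# these would come from tokenizing the ontology's intent vocabulary.
ONTOLOGY_INTENT_TOKEN_IDS = {1012, 2041, 3377}
BIAS = 4.0  # additive bonus in logit space; a tuning choice

def bias_logits(logits: np.ndarray, allowed_ids: set, bias: float) -> np.ndarray:
    """Add a constant bonus to ontology-aligned tokens, raising their
    post-softmax probability without zeroing out other tokens."""
    biased = logits.copy()
    for tok_id in allowed_ids:
        biased[tok_id] += bias
    return biased

def softmax(x: np.ndarray) -> np.ndarray:
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
logits = rng.normal(size=5000)  # one decoding step over a toy vocabulary
probs = softmax(bias_logits(logits, ONTOLOGY_INTENT_TOKEN_IDS, BIAS))
print(probs[1012] / softmax(logits)[1012])  # ontology token's probability is boosted
```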
Why we think this paper is great for you:
This paper directly aligns with your interest in structured knowledge representation, offering an ontology-enhanced approach that is crucial for defining and managing complex product categories. Its neuro-symbolic method provides a robust way to handle multi-faceted product information.
AI Summary - SMoG addresses the limitations of prior vector-based retrieval approaches, specifically their 'black-box' nature and high storage requirements. [3]
- Knowledge Graphs: Structured representations of knowledge as nodes (entities) and edges (relationships), typically stored in a graph database. [3]
- Schema Matching: The process of identifying and mapping similar concepts or entities between different data sources. [2]
- SMoG (Schema Matching on Graph) is a novel framework that re-establishes the viability of query-based knowledge graph exploration for schema matching. [1]
Abstract
Schema matching is a critical task in data integration, particularly in the medical domain where disparate Electronic Health Record (EHR) systems must be aligned to standard models like OMOP CDM. While Large Language Models (LLMs) have shown promise in schema matching, they suffer from hallucination and lack of up-to-date domain knowledge. Knowledge Graphs (KGs) offer a solution by providing structured, verifiable knowledge. However, existing KG-augmented LLM approaches often rely on inefficient complex multi-hop queries or storage-intensive vector-based retrieval methods. This paper introduces SMoG (Schema Matching on Graph), a novel framework that leverages iterative execution of simple 1-hop SPARQL queries, inspired by successful strategies in Knowledge Graph Question Answering (KGQA). SMoG enhances explainability and reliability by generating human-verifiable query paths while significantly reducing storage requirements by directly querying SPARQL endpoints. Experimental results on real-world medical datasets demonstrate that SMoG achieves performance comparable to state-of-the-art baselines, validating its effectiveness and efficiency in KG-augmented schema matching.
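The pattern SMoG borrows from KGQA, iterating simple 1-hop SPARQL queries against a live endpoint rather than running multi-hop queries or maintaining a vector store, can be sketched as below. The endpoint, seed entity, and take-the-first-edge policy are placeholders; the paper's actual query templates and LLM-guided edge selection are not described in the abstract.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://query.wikidata.org/sparql"  # placeholder endpoint

def one_hop(entity_uri: str, limit: int = 20) -> list[tuple[str, str]]:
    """One simple, human-verifiable 1-hop query: all (predicate, object)
    pairs directly attached to the entity."""
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(f"""
        SELECT ?p ?o WHERE {{ <{entity_uri}> ?p ?o . }} LIMIT {limit}
    """)
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return [(r["p"]["value"], r["o"]["value"]) for r in rows]

# Iterative exploration: expand hop by hop, recording the path so a human
# can verify how a match was reached (the explainability claim above).
frontier = "http://www.wikidata.org/entity/Q12136"  # hypothetical seed concept
path = []
for _ in range(3):  # bounded number of iterations
    neighbors = one_hop(frontier)
    if not neighbors:
        break  # e.g., the previous hop ended at a literal
    pred, obj = neighbors[0]  # an LLM would pick the most promising edge
    path.append((frontier, pred, obj))
    frontier = obj
print(path)  # a human-verifiable query path
```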
Why we think this paper is great for you:
This work on schema matching, particularly 'on Graph,' is highly relevant for integrating disparate product data sources and building coherent product taxonomies. It directly addresses the challenge of aligning different product descriptions into a unified knowledge structure.
AI Summary - The paper proposes an automatic and scalable pipeline for generating semantic similarity benchmarks. [2]
Abstract
Evaluating the open-form textual responses generated by Large Language Models (LLMs) typically requires measuring the semantic similarity of the response to a (human generated) reference. However, there is evidence that current semantic similarity methods may capture syntactic or lexical forms over semantic content. While benchmarks exist for semantic equivalence, they often suffer from high generation costs due to reliance on subjective human judgment, limited availability for domain-specific applications, and unclear definitions of equivalence. This paper introduces a novel method for generating benchmarks to evaluate semantic similarity methods for LLM outputs, specifically addressing these limitations. Our approach leverages knowledge graphs (KGs) to generate pairs of natural-language statements that are semantically similar or dissimilar, with dissimilar pairs categorized into one of four sub-types. We generate benchmark datasets in four different domains (general knowledge, biomedicine, finance, biology), and conduct a comparative study of semantic similarity methods including traditional natural language processing scores and LLM-as-a-judge predictions. We observe that the sub-type of semantic variation, as well as the domain of the benchmark impact the performance of semantic similarity methods, with no method being consistently superior. Our results present important implications for the use of LLM-as-a-judge in detecting the semantic content of text. Code is available at https://github.com/QiyaoWei/semantic-kg and the dataset is available at https://huggingface.co/datasets/QiyaoWei/Semantic-KG.
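The generation step, verbalizing KG triples into statement pairs that are either semantically similar or dissimilar, can be illustrated with a toy sketch. The templates, triples, and the single dissimilarity sub-type shown (object swap) are my stand-ins; the paper defines four sub-types and releases the real pipeline at the linked repository.

```python
import random

# Toy KG triples: (subject, relation, object)
TRIPLES = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "treats", "fever"),
    ("ibuprofen", "treats", "inflammation"),
]

# Two paraphrase templates per relation (illustrative, not from the paper).
TEMPLATES = {
    "treats": ["{s} is used to treat {o}.", "{o} can be treated with {s}."],
}

def similar_pair(triple):
    """Verbalize the same triple two ways -> a semantically similar pair."""
    s, r, o = triple
    t1, t2 = TEMPLATES[r]
    return t1.format(s=s, o=o), t2.format(s=s, o=o)

def dissimilar_pair(triple, triples):
    """Swap in a wrong object -> one kind of dissimilar pair
    (the paper defines four sub-types; this shows just one)."""
    s, r, o = triple
    wrong = random.choice([t[2] for t in triples if t[2] != o])
    t1 = TEMPLATES[r][0]
    return t1.format(s=s, o=o), t1.format(s=s, o=wrong)

random.seed(0)
print(similar_pair(TRIPLES[0]))
print(dissimilar_pair(TRIPLES[0], TRIPLES))
```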
Why we think this paper is great for you:
Your interest in knowledge graphs makes this paper a strong match, as it leverages them to measure semantic similarity. This is essential for accurately categorizing products and understanding their relationships within a comprehensive knowledge system.
Abstract
We study streaming data with categorical features where the vocabulary of categorical feature values is changing and can even grow unboundedly over time. Feature hashing is commonly used as a pre-processing step to map these categorical values into a feature space of fixed size before learning their embeddings. While these methods have been developed and evaluated for offline or batch settings, in this paper we consider online settings. We show that deterministic embeddings are sensitive to the arrival order of categories and suffer from forgetting in online learning, leading to performance deterioration. To mitigate this issue, we propose a probabilistic hash embedding (PHE) model that treats hash embeddings as stochastic and applies Bayesian online learning to learn incrementally from data. Based on the structure of PHE, we derive a scalable inference algorithm to learn model parameters and infer/update the posteriors of hash embeddings and other latent variables. Our algorithm (i) can handle an evolving vocabulary of categorical items, (ii) is adaptive to new items without forgetting old items, (iii) is implementable with a bounded set of parameters that does not grow with the number of distinct observed values on the stream, and (iv) is invariant to the item arrival order. Experiments in classification, sequence modeling, and recommendation systems in online learning setups demonstrate the superior performance of PHE while maintaining high memory efficiency (consuming as little as 2~4% of the memory of a one-hot embedding table). Supplementary materials are at https://github.com/aodongli/probabilistic-hash-embeddings
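The fixed-size table behind hash embeddings is what gives property (iii): arbitrary new categories hash into existing rows, so the parameter set never grows. A toy sketch of that lookup, with Gaussian rows standing in for PHE's stochastic treatment (the table size, hash count, and sampling scheme are illustrative assumptions; the paper's Bayesian update rule is not reproduced here):

```python
import numpy as np

TABLE_SIZE, DIM, NUM_HASHES = 1024, 16, 2  # fixed, vocabulary-independent

# Gaussian posterior per table row: mean and (diagonal) variance. PHE's
# online learning would update these per observation; omitted here.
mean = np.zeros((TABLE_SIZE, DIM))
var = np.ones((TABLE_SIZE, DIM))

def hash_indices(category: str) -> list[int]:
    """Map an arbitrary (possibly never-seen) category string to
    NUM_HASHES rows of the fixed table."""
    return [hash((category, k)) % TABLE_SIZE for k in range(NUM_HASHES)]

def embed(category: str, sample: bool = False) -> np.ndarray:
    """Embedding = sum of the category's table rows; sampling reflects
    the stochastic treatment of embeddings in PHE."""
    idx = hash_indices(category)
    if sample:
        rng = np.random.default_rng()
        return sum(rng.normal(mean[i], np.sqrt(var[i])) for i in idx)
    return mean[idx].sum(axis=0)

# New categories need no table growth: the stream can evolve unboundedly.
for item in ["sku-123", "sku-999", "brand-new-category"]:
    print(item, embed(item, sample=True)[:4])
```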
Why we think this paper is great for you:
This paper's focus on online learning of categorical features is highly pertinent to your needs for dynamic product categorization and discovery. It offers methods for efficiently handling evolving product attributes in a continuous learning environment.
AI Summary - Edit collapse: A loss of performance on earlier edits that occurs when accumulated updates overwrite prior knowledge. [3]
- Knowledge editing: The process of modifying or updating the knowledge stored in a large language model. [3]
- Weight-space methods for knowledge editing in large language models (LLMs) can lead to catastrophic forgetting, where earlier edits are forgotten as new knowledge accumulates. [2]
Abstract
Large language models (LLMs) often produce incorrect or outdated content. Updating their knowledge efficiently and accurately without costly retraining is a major challenge. This problem is especially hard for complex, unstructured knowledge in a lifelong setting, where many edits must coexist without interference. We introduce RILKE (Representation Intervention for Lifelong KnowledgE Control), a robust and scalable method that treats knowledge control as interventions within the model's representation space. Leveraging representation-space expressiveness, we identify two properties enabling RILKE to deliver fine-grained control over complex, unstructured knowledge while maintaining general utility with frozen base weights. During training, RILKE learns paraphrase-robust and edit-localized modules that limit each update to a low-dimensional subspace to minimize cross-edit interference. In inference, a query-adaptive router selects the appropriate module to guide the model's generation. In evaluation on knowledge editing benchmarks with LLaMA and Qwen models, RILKE is scalable to large-scale datasets, demonstrating high edit success, strong paraphrase generalization, and preserving general utility with modest memory overhead. These results show RILKE is an effective and scalable solution for lifelong knowledge control in LLMs.
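The core mechanism, low-dimensional additive edits in representation space behind a query-adaptive router while base weights stay frozen, can be sketched roughly as below. The dimensions, cosine-similarity router, and threshold are my assumptions; RILKE's actual training objective and router are only summarized in the abstract.

```python
import torch
import torch.nn.functional as F

HIDDEN, RANK = 64, 4

class EditModule(torch.nn.Module):
    """One edit: a low-rank additive intervention h -> h + B(A h),
    confined to a RANK-dimensional subspace to limit cross-edit interference."""
    def __init__(self):
        super().__init__()
        self.A = torch.nn.Linear(HIDDEN, RANK, bias=False)
        self.B = torch.nn.Linear(RANK, HIDDEN, bias=False)

    def forward(self, h):
        return h + self.B(self.A(h))

class Router:
    """Query-adaptive routing: pick the edit whose stored key is most
    similar to the query representation; fall back to no intervention."""
    def __init__(self, threshold=0.7):
        self.keys, self.modules, self.threshold = [], [], threshold

    def add_edit(self, key: torch.Tensor, module: EditModule):
        self.keys.append(F.normalize(key, dim=0))
        self.modules.append(module)

    def __call__(self, h: torch.Tensor) -> torch.Tensor:
        if not self.keys:
            return h
        sims = torch.stack([k @ F.normalize(h, dim=0) for k in self.keys])
        best = int(sims.argmax())
        return self.modules[best](h) if sims[best] > self.threshold else h

router = Router()
router.add_edit(torch.randn(HIDDEN), EditModule())
h = torch.randn(HIDDEN)  # a hidden state from the frozen base model
print(router(h).shape)   # intervened representation, same shape
```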
Why we think this paper is great for you:
This research on lifelong knowledge control is valuable for managing evolving product information and taxonomies over time. It addresses the critical challenge of updating and maintaining knowledge without costly retraining, which is key for dynamic product environments.
Abstract
Neural fields are increasingly used as a light-weight, continuous, and differentiable signal representation in (bio)medical imaging. However, unlike discrete signal representations such as voxel grids, neural fields cannot be easily extended. As neural fields are, in essence, neural networks, prior signals represented in a neural field will degrade when the model is presented with new data due to catastrophic forgetting. This work examines the extent to which different neural field approaches suffer from catastrophic forgetting and proposes a strategy to mitigate this issue. We consider the scenario in which data becomes available incrementally, with only the most recent data available for neural field fitting. In a series of experiments on cardiac cine MRI data, we demonstrate how knowledge distillation mitigates catastrophic forgetting when the spatiotemporal domain is enlarged or the dimensionality of the represented signal is increased. We find that the amount of catastrophic forgetting depends, to a large extent, on the neural fields model used, and that distillation could enable continual learning in neural fields.
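The distillation recipe is the familiar one: freeze a copy of the field fitted so far and, while fitting new data, penalize the new field for drifting from the frozen copy on coordinates sampled from the old domain. A schematic sketch with an invented coordinate MLP (the architecture, loss weighting, and data are placeholders, not the paper's setup):

```python
import torch

class Field(torch.nn.Module):
    """Toy coordinate MLP standing in for a neural field (x, y, t) -> signal."""
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(3, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))

    def forward(self, coords):
        return self.net(coords)

old_field = Field()                      # fitted on earlier data, now frozen
for p in old_field.parameters():
    p.requires_grad_(False)

new_field = Field()
new_field.load_state_dict(old_field.state_dict())  # warm start
opt = torch.optim.Adam(new_field.parameters(), lr=1e-3)

new_coords = torch.rand(256, 3)          # only the most recent data is available
new_values = torch.rand(256, 1)
old_coords = torch.rand(256, 3)          # coordinates sampled from the old domain

for step in range(100):
    opt.zero_grad()
    fit_loss = torch.nn.functional.mse_loss(new_field(new_coords), new_values)
    # Distillation: match the frozen old field where old data used to live,
    # counteracting catastrophic forgetting.
    distill_loss = torch.nn.functional.mse_loss(
        new_field(old_coords), old_field(old_coords))
    (fit_loss + distill_loss).backward()
    opt.step()
```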
Why we think this paper is great for you:
The principles of continual learning explored here are directly applicable to your interest in continually discovering and refining product categories. While the domain differs, the methodology for adapting to new information is highly relevant for evolving product knowledge.
Abstract
Cross-Disciplinary Cold-start Knowledge Tracing (CDCKT) faces a critical challenge: insufficient student interaction data in the target discipline prevents effective knowledge state modeling and performance prediction. Existing cross-disciplinary methods rely on overlapping entities between disciplines for knowledge transfer through simple mapping functions, but suffer from two key limitations: (1) overlapping entities are scarce in real-world scenarios, and (2) simple mappings inadequately capture cross-disciplinary knowledge complexity. To overcome these challenges, we propose a Mixture-of-Experts and Adversarial Generative Network-based Cross-Disciplinary Cold-start Knowledge Tracing framework. Our approach consists of three key components: First, we pre-train a source discipline model and cluster student knowledge states into K categories. Second, these cluster attributes guide a mixture-of-experts network through a gating mechanism, serving as a cross-domain mapping bridge. Third, an adversarial discriminator enforces feature separation by pulling same-attribute student features closer while pushing different-attribute features apart, effectively mitigating small-sample limitations. We validate our method's effectiveness across 20 extreme cross-disciplinary cold-start scenarios.
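Of the three components, the second, cluster attributes steering a mixture-of-experts mapping through a gate, is the easiest to make concrete. A rough sketch (dimensions, expert count, and the one-hot cluster encoding are my assumptions):

```python
import torch

STATE_DIM, K_CLUSTERS, N_EXPERTS = 32, 5, 4

class MoEMapper(torch.nn.Module):
    """Cross-domain mapping bridge: experts transform a source-discipline
    knowledge state; a gate conditioned on the student's cluster attribute
    mixes their outputs."""
    def __init__(self):
        super().__init__()
        self.experts = torch.nn.ModuleList(
            [torch.nn.Linear(STATE_DIM, STATE_DIM) for _ in range(N_EXPERTS)])
        self.gate = torch.nn.Linear(K_CLUSTERS, N_EXPERTS)

    def forward(self, state, cluster_onehot):
        weights = torch.softmax(self.gate(cluster_onehot), dim=-1)   # (B, E)
        outs = torch.stack([e(state) for e in self.experts], dim=1)  # (B, E, D)
        return (weights.unsqueeze(-1) * outs).sum(dim=1)             # (B, D)

mapper = MoEMapper()
state = torch.randn(8, STATE_DIM)  # pre-trained source-discipline states
cluster = torch.nn.functional.one_hot(
    torch.randint(0, K_CLUSTERS, (8,)), K_CLUSTERS).float()
print(mapper(state, cluster).shape)  # mapped target-discipline states
```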
Why we think this paper is great for you:
This paper's exploration of adaptive knowledge transfer and cold-start scenarios offers insights into how to establish new product categories or knowledge domains with limited initial data. It provides strategies for leveraging existing knowledge to expand your product understanding.
Continual Generalized Category Discovery
Abstract
Large Language Models (LLMs) have made remarkable progress in their ability to interact with external interfaces. Selecting reasonable external interfaces has thus become a crucial step in constructing LLM agents. In contrast to invoking API tools, directly calling AI models across different modalities from the community (e.g., HuggingFace) poses challenges due to the vast scale (> 10k), metadata gaps, and unstructured descriptions. Current methods for model selection often involve incorporating entire model descriptions into prompts, resulting in prompt bloat, wasted tokens, and limited scalability. To address these issues, we propose HuggingR$^4$, a novel framework that combines Reasoning, Retrieval, Refinement, and Reflection to efficiently select models. Specifically, we first perform multiple rounds of reasoning and retrieval to get a coarse list of candidate models. Then, we conduct fine-grained refinement by analyzing candidate model descriptions, followed by reflection to assess results and determine whether retrieval scope expansion is necessary. This method reduces token consumption considerably by decoupling user query processing from complex model description handling. Through a pre-established vector database, complex model descriptions are stored externally and retrieved on-demand, allowing the LLM to concentrate on interpreting user intent while accessing only relevant candidate models without prompt bloat. In the absence of standardized benchmarks, we construct a multimodal human-annotated dataset comprising 14,399 user requests across 37 tasks and conduct a thorough evaluation. HuggingR$^4$ attains a workability rate of 92.03% and a reasonability rate of 82.46%, surpassing the existing method by 26.51% and 33.25%, respectively, on GPT-4o-mini.
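A toy end-to-end pass through the Reasoning, Retrieval, Refinement, and Reflection stages might look like the following. Everything here, the in-memory model pool, extracted criteria, and threshold N, is a hypothetical stand-in: the real system reasons with an LLM and retrieves from a pre-built vector database of model descriptions.

```python
# Illustrative model pool; the real candidates come from >10k HuggingFace models.
MODELS = [
    {"id": "org/whisper-small", "task": "asr", "lang": "en", "size_gb": 1.0},
    {"id": "org/whisper-large", "task": "asr", "lang": "multi", "size_gb": 6.0},
    {"id": "org/vit-base", "task": "image-classification", "lang": "-", "size_gb": 0.4},
]
N = 2  # refinement triggers once the pool shrinks to N or fewer candidates

def reason(query: str) -> dict:
    """Stand-in for LLM reasoning: extract criteria from the user query."""
    return {"task": "asr", "lang": "en", "max_size_gb": 2.0}

def retrieve(criteria: dict) -> list[dict]:
    """Stand-in for on-demand vector retrieval: filter the pool by task."""
    return [m for m in MODELS if m["task"] == criteria["task"]]

def refine(cands: list[dict], criteria: dict) -> dict | None:
    """Fine-grained check of size and language constraints."""
    ok = [m for m in cands
          if m["size_gb"] <= criteria["max_size_gb"]
          and m["lang"] in (criteria["lang"], "multi")]
    return ok[0] if ok else None

def reflect(choice: dict | None) -> bool:
    """Verify a model was found; on failure, retrieval scope would widen."""
    return choice is not None

criteria = reason("transcribe short English audio on a laptop")
candidates = retrieve(criteria)
if len(candidates) <= N:
    choice = refine(candidates, criteria)
    assert reflect(choice)
    print(choice["id"])
```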
AI Summary - The system iteratively analyzes the user request, generates retrieval queries, applies filtering tools, and narrows down the candidate space based on specific criteria extracted from the query. [3]
- Refinement phase: Triggered when the candidate pool is reduced to N or fewer models, prompting the system to use refinement tools for final selection. [3]
- Reflection phase: Verifies that the selected model satisfies all specified criteria including language compatibility, dataset requirements, model size constraints, and special functionalities. [3]
- The evaluation relies on a newly constructed benchmark rather than an existing standardized one. [3]
- The framework may be limited by its reliance on specific criteria extracted from the user query. [2]
- The paper discusses a multi-step reasoning framework for selecting models on HuggingFace. [1]
Graphs for Products
Abstract
Let $G$ be a graph with vertex set $V(G)$ and order $n$. A coalition in a graph $G$ consists of two disjoint sets of vertices $V_1$ and $V_2$, neither of which is a dominating set but whose union $V_1 \cup V_2$ is a dominating set. A coalition partition, abbreviated $c$-partition, in a graph $G$ is a vertex partition $\pi=\left\{V_1, V_2, \dots, V_k\right\}$ such that every set $V_i$ of $\pi$ is either a singleton dominating set, or is not a dominating set but forms a coalition with another set $V_j$ in $\pi$. The sets $V_i$ and $V_j$ are coalition partners in $G$. The coalition number $C(G)$ equals the maximum order $k$ of a $c$-partition of $G$. For any graph $G$ with a $c$-partition $\pi=\left\{V_1,V_2,\dots,V_k\right\}$, the coalition graph $CG(G,\pi)$ of $G$ is a graph with vertex set $\{V_1,V_2,\dots, V_k\}$, corresponding one-to-one with the sets of $\pi$, where two vertices $V_i$ and $V_j$ are adjacent in $CG(G,\pi)$ if and only if the sets $V_i$ and $V_j$ are coalition partners in $\pi$. In [4], the authors proved that for every graph $G$ there exist a graph $H$ and a $c$-partition $\pi$ such that $CG(H,\pi)\cong G$, and raised the question: does there exist a graph $H^*$ of smaller order $n^*$ and size $m^*$ with a $c$-partition $\pi^*$ such that $CG(H^*,\pi^*)\cong G$? In this paper, we construct a graph $H^*$ of small order and size and a $c$-partition $\pi^*$ such that $CG(H^*,\pi^*)\cong G$. Recently, Haynes et al. [5] defined the coalition count $c(G)$ of a graph $G$ as the maximum number of different coalitions in any $c$-partition of $G$. We characterize all graphs $G$ with $c(G)=1$. Further, imposing suitable conditions on the coalition number, we study properties of the coalition count of a graph.
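A small worked example (mine, not from the paper) helps unpack the definition. In the path $P_4$ with vertices $a, b, c, d$:

```latex
% A coalition in the path P_4: a--b--c--d.
% N[b] = {a,b,c} misses d, and N[c] = {b,c,d} misses a, so neither
% {b} nor {c} is dominating on its own; their union {b,c} dominates
% every vertex. Hence V_1 = {b} and V_2 = {c} form a coalition.
\[
  V_1=\{b\},\ V_2=\{c\}:\qquad
  N[V_1]=\{a,b,c\}\neq V(P_4),\quad
  N[V_2]=\{b,c,d\}\neq V(P_4),\quad
  N[V_1\cup V_2]=V(P_4).
\]
```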
AI Summary - The paper explores various properties and results related to coalitions in graphs, including bounds on the coalition number and the relationship between the coalition number and the coalition count. [3]
- Coalition graph: A graph whose vertices represent sets of vertices in a given graph, and edges between vertices indicate that the corresponding sets form coalitions. [3]
- Coalition theory: In graph theory, a way of partitioning the vertices of a graph into sets such that each set forms a coalition with at least one other set. [3]
- This can help model how different groups or individuals interact and form alliances in a network. [3]
Abstract
A secure coalition in a graph $G$ consists of two disjoint vertex sets $V_1$ and $V_2$, neither of which is a secure dominating set, but whose union $V_1 \cup V_2$ forms a secure dominating set. A secure coalition partition ($sec$-partition) of $G$ is a vertex partition $\pi= \{V_1, V_2, \dots, V_k\}$ where each set $V_i$ is either a secure dominating set consisting of a single vertex of degree $n-1$, or a set that is not a secure dominating set but forms a secure coalition with some other set $V_j \in \pi$. The maximum cardinality of a secure coalition partition of $G$ is called the secure coalition number of $G$, denoted $SEC(G)$. For every $sec$-partition $\pi$ of a graph $G$, we associate a graph called the secure coalition graph of $G$ with respect to $\pi$, denoted $SCG(G,\pi)$, where the vertices of $SCG(G,\pi)$ correspond to the sets $V_1, V_2, \dots, V_k$ of $\pi$, and two vertices are adjacent in $SCG(G,\pi)$ if and only if their corresponding sets in $\pi$ form a secure coalition in $G$. In this study, we prove that every graph admits a $sec$-partition. Further, we characterize the graphs $G$ with $SEC(G) \in \{1,2,n\}$ and all trees $T$ with $SEC(T) = n-1$. Finally, we show that every graph $G$ without isolated vertices is a secure coalition graph.
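As a quick sanity check on these definitions (my example, not one from the paper), the complete graph $K_n$ attains the maximum value:

```latex
% In K_n every vertex has degree n-1, and each singleton {u} is a secure
% dominating set: {u} dominates all of K_n, and for any v != u the swap
% ({u} \setminus {u}) \cup {v} = {v} is again dominating. So partitioning
% V(K_n) into n singletons is a sec-partition in which every part is a
% singleton secure dominating set of a degree-(n-1) vertex, giving
\[
  SEC(K_n) = n .
\]
```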