Papers from 22 to 26 September 2025

Here are your personalized paper recommendations, sorted by relevance.
Email Marketing
šŸ‘ šŸ‘Ž ♄ Save
Paper visualization
Rate this image: šŸ˜ šŸ‘ šŸ‘Ž
Abstract
DeFi applications are vulnerable to MEV, where specialized actors profit by reordering or inserting transactions. To mitigate latency races and internalize MEV revenue, Arbitrum introduced Timeboost, an auction-based transaction sequencing mechanism that grants short-term priority access to an express lane. In this paper we present the first large-scale empirical study of Timeboost, analyzing over 11.5 million express lane transactions and 151 thousand auctions between April and July 2025. Our results reveal five main findings. First, express lane control is highly centralized, with two entities winning more than 90% of auctions. Second, while express lane access provides earlier inclusion, profitable MEV opportunities cluster at the end of blocks, limiting the value of priority access. Third, approximately 22% of time-boosted transactions are reverted, indicating that Timeboost does not effectively mitigate spam. Fourth, secondary markets for reselling express lane rights have collapsed due to poor execution reliability and unsustainable economics. Finally, auction competition declined over time, leading to steadily reduced revenue for the Arbitrum DAO. Taken together, these findings show that Timeboost fails to deliver on its stated goals of fairness, decentralization, and spam reduction. Instead, it reinforces centralization and narrows adoption, highlighting the limitations of auction-based ordering as a mechanism for fair transaction sequencing in rollups.
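To make the headline centralization finding concrete, here is a minimal sketch of how one might measure express lane auction concentration. The record layout and bidder addresses are illustrative assumptions, not the paper's dataset.

```python
from collections import Counter

# Hypothetical auction records: (auction_round, winning_bidder_address).
# The field names and values are assumptions for illustration only.
auctions = [
    (1, "0xAAA"), (2, "0xAAA"), (3, "0xBBB"),
    (4, "0xAAA"), (5, "0xBBB"), (6, "0xCCC"),
]

wins = Counter(winner for _, winner in auctions)
total = sum(wins.values())

# Share of auctions won by the two most frequent winners: the paper
# reports this figure exceeding 90% between April and July 2025.
top2_share = sum(count for _, count in wins.most_common(2)) / total
print(f"Top-2 win share: {top2_share:.1%}")
```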
Personalization
šŸ‘ šŸ‘Ž ♄ Save
Paper visualization
Rate this image: šŸ˜ šŸ‘ šŸ‘Ž
Abstract
Visual personalization is essential in user-facing AI systems such as smart homes and healthcare, where aligning model behavior with user-centric concepts is critical. However, recent large Vision-Language Models (VLMs), despite their broad applicability, remain underexplored in their ability to adapt to individual users. In this paper, we introduce MMPB, the first extensive benchmark for evaluating VLMs on personalization. MMPB comprises 10k image-query pairs and includes 111 personalizable concepts across four categories: humans, animals, objects, and characters, with the human category enriched with preference-grounded queries. We structure personalization into three main task types, each highlighting a different key property of VLMs. Using 23 widely used VLMs, including both open- and closed-source models, we evaluate personalization performance via a three-stage protocol: concept injection, multi-turn dialogue, and personalized querying. Our findings indicate that most VLMs (including some closed-source models) struggle with personalization, particularly in maintaining consistency over dialogue, handling user preferences, and adapting to visual cues. Our analysis reveals that the challenges in VLM personalization (such as refusal behaviors and long-context forgetting) highlight substantial room for improvement. By identifying these limitations and offering a scalable benchmark, MMPB offers valuable insights and a solid foundation for future research toward truly personalized multi-modal AI. Project Page: aidaslab.github.io/MMPB
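The three-stage protocol maps naturally onto a simple evaluation loop. The sketch below is a reconstruction from the abstract alone, assuming a generic `vlm_chat` callable and hypothetical `concept` and `query` objects; it is not the MMPB harness itself.

```python
# A minimal sketch of MMPB's three-stage evaluation protocol as described
# in the abstract. `vlm_chat`, the message format, and the scoring hook
# are all assumptions, not the benchmark's actual interface.
def evaluate_concept(vlm_chat, concept, dialogue_turns, queries):
    history = []

    # Stage 1: concept injection -- introduce the personalized concept
    # (e.g., a user's pet) with reference images and a description.
    history.append({"role": "user",
                    "content": f"This is {concept.name}: {concept.description}",
                    "images": concept.reference_images})
    history.append({"role": "assistant", "content": vlm_chat(history)})

    # Stage 2: multi-turn dialogue -- intervening turns that test whether
    # the model retains the concept over a long context.
    for turn in dialogue_turns:
        history.append({"role": "user", "content": turn})
        history.append({"role": "assistant", "content": vlm_chat(history)})

    # Stage 3: personalized querying -- score answers to queries that
    # require recalling the injected concept.
    correct = 0
    for query in queries:
        history.append({"role": "user", "content": query.text,
                        "images": query.images})
        answer = vlm_chat(history)
        history.append({"role": "assistant", "content": answer})
        correct += query.is_correct(answer)
    return correct / len(queries)
```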
šŸ‘ šŸ‘Ž ♄ Save
Abstract
Large language model (LLM) personalization aims to tailor model behavior to individual users based on their historical interactions. However, its effectiveness is often hindered by two key challenges: the cold-start problem, where users with limited history provide insufficient context for accurate personalization, and the biasing problem, where users with abundant but skewed history cause the model to overfit to narrow preferences. We identify both issues as symptoms of a common underlying limitation, i.e., the inability to model collective knowledge across users. To address this, we propose a local-global memory framework (LoGo) that combines a personalized local memory with a collective global memory that captures shared interests across the population. To reconcile discrepancies between these two memory sources, we introduce a mediator module designed to resolve conflicts between local and global signals. Extensive experiments on multiple benchmarks demonstrate that LoGo consistently improves personalization quality by both warming up cold-start users and mitigating biased predictions. These results highlight the importance of incorporating collective knowledge to enhance LLM personalization.
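As a rough illustration of the local-global idea, the sketch below retrieves from both memories and lets a mediator step reconcile them before generation. The retrieval interfaces, prompt, and mediation strategy are assumptions, not the paper's implementation.

```python
# A minimal sketch of the LoGo framework from the abstract: a per-user
# local memory, a shared global memory, and a mediator that reconciles
# them. All interfaces below are hypothetical.
def personalize(query, user_id, local_memory, global_memory, llm):
    # Local signal: the user's own interaction history (sparse for
    # cold-start users, potentially skewed for heavy users).
    local_ctx = local_memory.retrieve(user_id, query)

    # Global signal: collective knowledge aggregated across users, which
    # warms up cold-start users and counteracts skew.
    global_ctx = global_memory.retrieve(query)

    # Mediator: resolve conflicts between local and global evidence,
    # sketched here as a single prompted LLM call.
    prompt = (
        f"User history: {local_ctx}\n"
        f"Population-level context: {global_ctx}\n"
        "Resolve any conflicts, preferring user-specific evidence when "
        f"it is well supported, then answer: {query}"
    )
    return llm(prompt)
```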
Personalization Platform
šŸ‘ šŸ‘Ž ♄ Save
Yonsei University
Abstract
Search-augmented large language models (LLMs) have advanced information-seeking tasks by integrating retrieval into generation, reducing users' cognitive burden compared to traditional search systems. Yet they remain insufficient for fully addressing diverse user needs, which requires recognizing how the same query can reflect different intents across users and delivering information in preferred forms. While recent systems such as ChatGPT and Gemini attempt personalization by leveraging user histories, systematic evaluation of such personalization is under-explored. To address this gap, we propose BESPOKE, a realistic benchmark for evaluating personalization in search-augmented LLMs. BESPOKE is designed to be both realistic, by collecting authentic chat and search histories directly from humans, and diagnostic, by pairing responses with fine-grained preference scores and feedback. The benchmark is constructed through long-term, deeply engaged human annotation, where annotators contributed their own histories, authored queries with detailed information needs, and evaluated responses with scores and diagnostic feedback. Leveraging BESPOKE, we conduct systematic analyses that reveal key requirements for effective personalization in information-seeking tasks, providing a foundation for fine-grained evaluation of personalized search-augmented LLMs. Our code and data are available at https://augustinlib.github.io/BESPOKE/.
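Based only on the abstract, a BESPOKE-style example might be represented roughly as follows; the field names and types are hypothetical, not the benchmark's released schema.

```python
from dataclasses import dataclass, field

# A hypothetical record layout for one BESPOKE-style example, inferred
# from the abstract: each annotator-authored query carries the user's
# own history and is paired with fine-grained scores and feedback.
@dataclass
class BespokeExample:
    user_history: list[str]          # authentic chat and search history
    query: str                       # annotator-authored query
    information_need: str            # detailed statement of intent
    responses: dict[str, str]        # system name -> generated response
    preference_scores: dict[str, float] = field(default_factory=dict)
    feedback: dict[str, str] = field(default_factory=dict)  # diagnostic notes
```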
Data Driven CRM
šŸ‘ šŸ‘Ž ♄ Save
Paper visualization
Rate this image: šŸ˜ šŸ‘ šŸ‘Ž
Abstract
Large language model (LLM) and agent techniques for data analysis (a.k.a. LLM/Agent-as-Data-Analyst) have demonstrated substantial impact in both academia and industry. In comparison with traditional rule-based or small-model-based approaches, (agentic) LLMs enable complex data understanding, natural language interfaces, semantic analysis functions, and autonomous pipeline orchestration. The technical evolution further distills five key design goals for intelligent data analysis agents, namely semantic-aware design, modality-hybrid integration, autonomous pipelines, tool-augmented workflows, and support for open-world tasks. From a modality perspective, we review LLM-based techniques for (i) structured data (e.g., table question answering for relational data and NL2GQL for graph data), (ii) semi-structured data (e.g., markup language understanding and semi-structured table modeling), (iii) unstructured data (e.g., chart understanding, document understanding, and programming language vulnerability detection), and (iv) heterogeneous data (e.g., data retrieval and modality alignment for data lakes). Finally, we outline the remaining challenges and propose several insights and practical directions for advancing LLM/Agent-powered data analysis.
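As a toy example of one technique in this taxonomy (table question answering over relational data), a table can be serialized into a prompt for a generic LLM. The `llm` callable below is a placeholder; nothing here comes from the surveyed systems.

```python
# Illustrative table QA via prompt serialization. The `llm` callable is
# a stand-in for any text-completion model.
def table_qa(llm, columns, rows, question):
    header = " | ".join(columns)
    body = "\n".join(" | ".join(str(v) for v in row) for row in rows)
    prompt = (
        "Answer the question using only the table below.\n"
        f"{header}\n{body}\n"
        f"Question: {question}\nAnswer:"
    )
    return llm(prompt)

# Example usage with a trivial table:
# table_qa(llm, ["city", "population"],
#          [("Oslo", 709_000), ("Bergen", 286_000)],
#          "Which city has the larger population?")
```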
šŸ‘ šŸ‘Ž ♄ Save
Zhejiang University, ZTE
Abstract
In commercial systems, a pervasive requirement for automatic data preparation (ADP) is to transfer relational data from disparate sources to targets with standardized schema specifications. Previous methods rely on labor-intensive supervision signals or target table data access permissions, limiting their usage in real-world scenarios. To tackle these challenges, we propose an effective end-to-end ADP framework, MontePrep, which enables training-free pipeline synthesis with zero target-instance requirements. MontePrep is formulated as an open-source large language model (LLM) powered tree-structured search problem. It consists of three pivotal components, i.e., a data preparation action sandbox (DPAS), a fundamental pipeline generator (FPG), and an execution-aware pipeline optimizer (EPO). We first introduce DPAS, a lightweight action sandbox, to navigate the search-based pipeline generation. The design of DPAS circumvents exploration of infeasible pipelines. Then, we present FPG to build executable DP pipelines incrementally, exploring the predefined action sandbox via LLM-powered Monte Carlo Tree Search. Furthermore, we propose EPO, which executes candidate pipelines from source to target and uses the results to evaluate the reliability of the pipelines generated by FPG. In this way, unreasonable pipelines are eliminated, improving both the efficiency and effectiveness of the search. Extensive experimental results demonstrate the superiority of MontePrep, with significant improvements over five state-of-the-art competitors.
AI Insights
  • DPAS sandbox prunes infeasible pipelines before LLM exploration, shrinking search space.
  • FPG builds executable DP pipelines incrementally using LLM‑guided Monte Carlo Tree Search.
  • EPO evaluates source‑to‑target runs, filtering unreliable pipelines and accelerating convergence.
  • Training‑free synthesis with zero target‑instance data makes it ideal for privacy‑restricted settings.
  • Monte Carlo Tree Search – stochastic tree‑search balancing exploration and exploitation via random sampling.
  • For deeper context, read "Chain‑of‑Thought Prompting Elicits Reasoning in Large Language Models" (NeurIPS 2022).
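For intuition about the search procedure these components plug into, here is a generic Monte Carlo Tree Search skeleton in the spirit of FPG. Node representation, the sandbox's `legal_actions` interface, and the `execute_and_score` reward are placeholders; none of this is MontePrep's actual code.

```python
import math
import random

# Generic MCTS over partial pipelines: children are extensions by
# sandbox-legal actions (DPAS), and rollout rewards come from executing
# candidate pipelines source-to-target (EPO). Schematic only.
class Node:
    def __init__(self, pipeline, parent=None):
        self.pipeline, self.parent = pipeline, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(child, parent_visits, c=1.4):
    # Unvisited children are explored first; otherwise balance
    # exploitation (mean value) and exploration (visit counts).
    if child.visits == 0:
        return float("inf")
    return child.value / child.visits + c * math.sqrt(
        math.log(parent_visits) / child.visits)

def mcts(root, legal_actions, execute_and_score, iterations=100):
    for _ in range(iterations):
        node = root
        # Selection: descend by UCB until reaching an unexpanded node.
        while node.children:
            node = max(node.children, key=lambda ch: ucb(ch, node.visits))
        # Expansion: only sandbox-legal actions, mirroring how DPAS
        # prunes infeasible branches up front.
        for action in legal_actions(node.pipeline):
            node.children.append(Node(node.pipeline + [action], parent=node))
        if node.children:
            node = random.choice(node.children)
        # Simulation: execute the candidate pipeline and score the result
        # (EPO's execution-aware feedback).
        reward = execute_and_score(node.pipeline)
        # Backpropagation.
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).pipeline
```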
CRM Optimization
šŸ‘ šŸ‘Ž ♄ Save
MBZUAI Department of
Abstract
We establish a local $\mathcal{O}(k^{-2})$ rate for the gradient update $x^{k+1}=x^k-\nabla f(x^k)/\sqrt{H\|\nabla f(x^k)\|}$ under a $2H$-Hessian--Lipschitz assumption. Regime detection relies on Hessian--vector products, avoiding Hessian formation or factorization. Incorporating this certificate into cubic-regularized Newton (CRN) and an accelerated variant enables per-iterate switching between the cubic and gradient steps while preserving CRN's global guarantees. The technique achieves the lowest wall-clock time among compared baselines in our experiments. In the first-order setting, the technique yields a monotone, adaptive, parameter-free method that inherits the local $\mathcal{O}(k^{-2})$ rate. Despite backtracking, the method shows superior wall-clock performance. Additionally, we cover smoothness relaxations beyond classical gradient--Lipschitzness, enabling tighter bounds, including global $\mathcal{O}(k^{-2})$ rates. Finally, we generalize the technique to the stochastic setting.
AI Insights
  • The authors bound E[f(x_{k+1})] by case‑splitting on gradient–Hessian relations and applying Young’s inequality.
  • Conditional expectation handles stochastic noise, producing a clean bound that drives the adaptive step‑size.
  • Regime detection uses Hessian‑vector products, avoiding costly Hessian formation while preserving global guarantees.
  • A monotone, parameter‑free first‑order variant still achieves the O(k⁻²) rate, a rare feature in stochastic schemes.
  • The paper references the classic "Convex Optimization" text and recent SGD reviews for deeper theoretical grounding.
  • Dependence on Young’s inequality means the method may falter if gradients or Hessians are ill‑defined, pointing to future robustness work.
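The gradient update from the abstract is simple enough to transcribe directly. The sketch below implements it under stated assumptions: the stopping rule, iteration cap, and example objective are illustrative, not from the paper.

```python
import numpy as np

# Direct transcription of the abstract's update
#     x_{k+1} = x_k - grad f(x_k) / sqrt(H * ||grad f(x_k)||),
# where 2H is the Hessian-Lipschitz constant.
def gradient_method(grad_f, x0, H, max_iter=1000, tol=1e-10):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        gnorm = np.linalg.norm(g)
        if gnorm < tol:
            break
        # The scaling 1/sqrt(H * ||g||) adapts automatically to the
        # gradient magnitude; no step-size parameter is tuned.
        x = x - g / np.sqrt(H * gnorm)
    return x

# Illustrative example on f(x) = ||x||^4, whose gradient is
# 4 * ||x||^2 * x (Hessian-Lipschitz on bounded sets):
x_star = gradient_method(lambda x: 4 * np.linalg.norm(x)**2 * x,
                         x0=[1.0, -2.0], H=50.0)
```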
šŸ‘ šŸ‘Ž ♄ Save
Alma Mater Studiorum, Boc
Abstract
Policy gradient methods often balance exploitation and exploration via entropy maximization. However, maximizing entropy pushes the policy towards a uniform random distribution, which represents an unstructured and sometimes inefficient exploration strategy. In this work, we propose replacing the entropy bonus with a more robust complexity bonus. In particular, we adopt a measure of complexity, defined as the product of Shannon entropy and disequilibrium, where the latter quantifies the distance from the uniform distribution. This regularizer encourages policies that balance stochasticity (high entropy) with structure (high disequilibrium), guiding agents toward regimes where useful, non-trivial behaviors can emerge. Such behaviors arise because the regularizer suppresses both extremes, e.g., maximal disorder and complete order, creating pressure for agents to discover structured yet adaptable strategies. Starting from Proximal Policy Optimization (PPO), we introduce Complexity-Driven Policy Optimization (CDPO), a new learning algorithm that replaces entropy with complexity. We show empirically across a range of discrete action space tasks that CDPO is more robust to the choice of the complexity coefficient than PPO is with the entropy coefficient, especially in environments requiring greater exploration.
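For intuition, the complexity bonus can be sketched as entropy times disequilibrium. The code below uses the squared Euclidean distance from the uniform distribution for the disequilibrium term, following the classic López-Ruiz–Mancini–Calbet (LMC) complexity measure; whether CDPO uses exactly this form is an assumption here.

```python
import numpy as np

# Sketch of the complexity regularizer described in the abstract:
# C(p) = H(p) * D(p), where H is Shannon entropy and D ("disequilibrium")
# measures distance from the uniform distribution.
def complexity_bonus(probs, eps=1e-12):
    p = np.asarray(probs, dtype=float)
    n = p.size
    entropy = -np.sum(p * np.log(p + eps))        # high for stochastic policies
    disequilibrium = np.sum((p - 1.0 / n) ** 2)   # high for structured policies
    return entropy * disequilibrium

# Both extremes score near zero, so the bonus pushes away from them:
print(complexity_bonus([0.25, 0.25, 0.25, 0.25]))  # uniform -> D = 0
print(complexity_bonus([1.0, 0.0, 0.0, 0.0]))      # deterministic -> H = 0
print(complexity_bonus([0.7, 0.1, 0.1, 0.1]))      # structured yet stochastic
```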

Interests not found

We did not find any papers matching the interests below. Try other search terms, and consider whether the content exists on arxiv.org.
  • MLOps
You can edit or add more interests any time.
