Hi!

Your personalized paper recommendations for 10 to 14 November 2025.
🎯 Top Personalized Recommendations
Why we think this paper is great for you:
This paper directly explores the impact and value of personalized recommendations, which is crucial for understanding and optimizing your personalization platforms and data-driven CRM strategies. It provides insights into how tailored experiences drive user choice.
Abstract
Personalized recommendation systems shape much of user choice online, yet their targeted nature makes separating out the value of recommendation and the underlying goods challenging. We build a discrete choice model that embeds recommendation-induced utility, low-rank heterogeneity, and flexible state dependence and apply the model to viewership data at Netflix. We exploit idiosyncratic variation introduced by the recommendation algorithm to identify and separately value these components as well as to recover model-free diversion ratios that we can use to validate our structural model. We use the model to evaluate counterfactuals that quantify the incremental engagement generated by personalized recommendations. First, we show that replacing the current recommender system with a matrix factorization or popularity-based algorithm would lead to 4% and 12% reduction in engagement, respectively, and decreased consumption diversity. Second, most of the consumption increase from recommendations comes from effective targeting, not mechanical exposure, with the largest gains for mid-popularity goods (as opposed to broadly appealing or very niche goods).
AI Summary
  • The current Netflix RecSys significantly outperforms matrix factorization (4% engagement reduction, 37.5% HHI increase) and popularity-based (12% engagement reduction, 42.5% HHI increase) algorithms, demonstrating substantial value from modern algorithmic advances. [3]
  • Effective targeting accounts for 41.9% of the consumption increase from recommendations, making it 7 times more impactful than mechanical exposure (6.8%) and primarily benefiting mid-popularity goods. [3]
  • The developed discrete choice model, incorporating recommendation-induced utility and flexible state dependence, accurately reproduces model-free diversion ratios with an R2 of 0.73 and a correlation of 0.86, validating its ability to capture substitution patterns. [3]
  • The current RecSys successfully balances high user engagement with maintaining consumption diversity, preventing the concentration of viewership seen with traditional matrix factorization and popularity-based approaches. [3]
  • The framework allows for credible quantification of incremental engagement for specific goods, including new content, by simulating counterfactual catalog changes and incorporating pre-consumption good embeddings for novel items. [3]
  • Recommendation "bonus": An additive utility boost for goods that are recommended, serving as a reduced-form characterization of the positive informational role recommendations play in guiding user choices. [3]
  • Transformer-style architecture for state dependence: An adaptation of machine learning's transformer concept to model user preferences that flexibly adapt over time based on their full consumption history, avoiding explicit user-specific embeddings. [3]
  • Incremental engagement: Defined as the difference in platform engagement when a particular good is available versus when it is removed, quantifiable through counterfactual simulations of choice probabilities. [3]
  • Low-rank discrete choice model: A model where the user preferences for goods are represented by a low-rank decomposition of good and user embeddings, allowing endogenous learning of good-level 'characteristics'. [2]
  • The model's architecture, which dynamically represents user preferences as a function of past watch history via a sequence model, enables scalable estimation for millions of users without requiring individual user embeddings. [1]
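To make the modeling idea concrete, here is a minimal, hypothetical sketch of the kind of specification the abstract and summary describe: utilities built from a low-rank decomposition of good and user embeddings plus an additive recommendation "bonus", fed into a multinomial logit over the catalog. All names, dimensions, and parameter values below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

# Illustrative sketch only (not the paper's specification): low-rank taste
# utility plus an additive recommendation bonus, with multinomial logit
# choice probabilities over the catalog.

rng = np.random.default_rng(0)
n_goods, rank = 50, 8

good_emb = rng.normal(size=(n_goods, rank))    # learned good embeddings
user_emb = rng.normal(size=rank)               # one user's (state-dependent) embedding
rec_bonus = 0.7                                # hypothetical additive utility boost
recommended = rng.random(n_goods) < 0.1        # goods surfaced by the recommender

def choice_probs(rec_flags):
    """Multinomial logit choice probabilities given recommendation flags."""
    utility = good_emb @ user_emb + rec_bonus * rec_flags
    expu = np.exp(utility - utility.max())
    return expu / expu.sum()

# Counterfactual comparison: choices under the current recommendation set
# versus no recommendations at all.
p_rec = choice_probs(recommended)
p_none = choice_probs(np.zeros(n_goods))
print("probability mass shifted toward recommended goods:",
      round(p_rec[recommended].sum() - p_none[recommended].sum(), 3))
```

Swapping the `recommended` flags for those produced by a popularity-based or matrix-factorization policy is roughly the flavor of counterfactual the paper evaluates at scale.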
Why we think this paper is great for you:
This paper delves into adapting to individual user preferences for personalization, a core concept for enhancing your personalization platform and tailoring content for email marketing. It highlights the importance of understanding diverse tastes.
Abstract
People have different creative writing preferences, and large language models (LLMs) for these tasks can benefit from adapting to each user's preferences. However, these models are often trained over a dataset that considers varying personal tastes as a monolith. To facilitate developing personalized creative writing LLMs, we introduce LiteraryTaste, a dataset of reading preferences from 60 people, where each person: 1) self-reported their reading habits and tastes (stated preference), and 2) annotated their preferences over 100 pairs of short creative writing texts (revealed preference). With our dataset, we found that: 1) people diverge on creative writing preferences, 2) finetuning a transformer encoder could achieve 75.8% and 67.7% accuracy when modeling personal and collective revealed preferences, and 3) stated preferences had limited utility in modeling revealed preferences. With an LLM-driven interpretability pipeline, we analyzed how people's preferences vary. We hope our work serves as a cornerstone for personalizing creative writing technologies.
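As a rough illustration of the revealed-preference modeling the abstract mentions (finetuning a transformer encoder on pairwise annotations), here is a hedged sketch: encode each text, score it with a small head, and train so the annotator-preferred text scores higher. The backbone, pooling, and loss below are assumptions for illustration, not the authors' exact recipe.

```python
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModel

# Hypothetical pairwise preference model over two creative-writing texts.
enc_name = "bert-base-uncased"  # assumed backbone, not the paper's choice
tokenizer = AutoTokenizer.from_pretrained(enc_name)
encoder = AutoModel.from_pretrained(enc_name)
head = nn.Linear(encoder.config.hidden_size, 1)  # maps a pooled text to a score

def score(text: str) -> torch.Tensor:
    """Mean-pool encoder outputs and map to a scalar preference score."""
    batch = tokenizer(text, return_tensors="pt", truncation=True)
    hidden = encoder(**batch).last_hidden_state          # (1, seq, dim)
    return head(hidden.mean(dim=1)).squeeze(-1)          # (1,)

def pairwise_loss(preferred: str, rejected: str) -> torch.Tensor:
    """Penalize the model when the rejected text outscores the preferred one."""
    return nn.functional.softplus(score(rejected) - score(preferred))

loss = pairwise_loss("A quiet, slow-burning opening scene.",
                     "An action-packed opening heavy on exposition.")
loss.backward()  # in training, an optimizer would then step encoder + head
```

Fitting one such model per annotator (personal) versus one over everyone's pairs (collective) is the distinction behind the 75.8% versus 67.7% accuracy figures quoted above.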
Capital One
Why we think this paper is great for you:
This paper discusses optimizing LLMs for real-time reranking, which is highly applicable to enhancing content relevance within your personalization platform and optimizing MLOps workflows for dynamic content delivery. It offers insights into improving retrieval quality.
Abstract
Efficiently reranking documents retrieved from information retrieval (IR) pipelines to enhance overall quality of Retrieval-Augmented Generation (RAG) system remains an important yet challenging problem. Recent studies have highlighted the importance of Large Language Models (LLMs) in reranking tasks. In particular, Pairwise Reranking Prompting (PRP) has emerged as a promising plug-and-play approach due to its usability and effectiveness. However, the inherent complexity of the algorithm, coupled with the high computational demands and latency incurred due to LLMs, raises concerns about its feasibility in real-time applications. To address these challenges, this paper presents a focused study on pairwise reranking, demonstrating that carefully applied optimization methods can significantly mitigate these issues. By implementing these methods, we achieve a remarkable latency reduction of up to 166 times, from 61.36 seconds to 0.37 seconds per query, with an insignificant drop in performance measured by Recall@k. Our study highlights the importance of design choices that were previously overlooked, such as using smaller models, limiting the reranked set, using lower precision, reducing positional bias with one-directional order inference, and restricting output tokens. These optimizations make LLM-based reranking substantially more efficient and feasible for latency-sensitive, real-world deployments.
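The optimizations named in the abstract are easy to picture in code. Below is a hedged sketch of pairwise reranking over a limited top-k window with one-directional order inference; `llm_prefers` is a placeholder for whatever small, low-precision model you would call with generation restricted to a single output token. It is illustrative only, not the paper's implementation.

```python
from typing import Callable, List

def rerank_topk(query: str,
                docs: List[str],
                llm_prefers: Callable[[str, str, str], bool],
                k: int = 10) -> List[str]:
    """Rerank only the top-k retrieved docs with one-directional pairwise calls."""
    window = docs[:k]                      # limit the reranked set
    wins = [0] * len(window)               # simple "wins" aggregation over pairs
    for i in range(len(window)):
        for j in range(i + 1, len(window)):
            # One-directional inference: compare (i, j) once, never also (j, i).
            if llm_prefers(query, window[i], window[j]):
                wins[i] += 1
            else:
                wins[j] += 1
    order = sorted(range(len(window)), key=lambda i: wins[i], reverse=True)
    return [window[i] for i in order] + docs[k:]

# Stub comparator for demonstration; a real deployment would prompt a small
# LLM and parse a one-token "A"/"B" answer.
reranked = rerank_topk("query", ["doc a", "doc bb", "doc ccc"],
                       llm_prefers=lambda q, a, b: len(a) <= len(b))
print(reranked)
```

Restricting the window to k documents keeps the pairwise call count at k(k-1)/2, which, combined with a smaller model and capped output tokens, is where the latency savings come from.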
Why we think this paper is great for you:
This paper explores automating optimization problems using LLMs, which is directly relevant to enhancing your CRM optimization efforts and streamlining complex decision-making processes in MLOps. It provides a pathway to more efficient operational strategies.
Abstract
Optimization modeling and solving are fundamental to the application of Operations Research (OR) in real-world decision making, yet the process of translating natural language problem descriptions into formal models and solver code remains highly expertise-intensive. While recent advances in large language models (LLMs) have opened new opportunities for automation, the generalization ability and data efficiency of existing LLM-based methods are still limited, as most require vast amounts of annotated or synthetic data, resulting in high costs and scalability barriers. In this work, we present OR-R1, a data-efficient training framework for automated optimization modeling and solving. OR-R1 first employs supervised fine-tuning (SFT) to help the model acquire the essential reasoning patterns for problem formulation and code generation from limited labeled data. In addition, it improves capability and consistency through Test-Time Group Relative Policy Optimization (TGRPO). This two-stage design enables OR-R1 to leverage both scarce labeled and abundant unlabeled data for effective learning. Experiments show that OR-R1 achieves state-of-the-art performance with an average solving accuracy of $67.7\%$, using only $1/10$ the synthetic data required by prior methods such as ORLM, exceeding ORLM's solving accuracy by up to $4.2\%$. Remarkably, OR-R1 outperforms ORLM by over $2.4\%$ with just $100$ synthetic samples. Furthermore, TGRPO contributes an additional $3.1\%-6.4\%$ improvement in accuracy, significantly narrowing the gap between single-attempt (Pass@1) and multi-attempt (Pass@8) performance from $13\%$ to $7\%$. Extensive evaluations across diverse real-world benchmarks demonstrate that OR-R1 provides a robust, scalable, and cost-effective solution for automated OR optimization problem modeling and solving, lowering the expertise and data barriers for industrial OR applications.
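For context, the artifact such a pipeline is trained to produce is solver code for a natural-language problem statement. The toy example below (a made-up two-variable production problem solved with scipy's linprog) shows the target format only; it is not drawn from OR-R1's prompts or benchmarks.

```python
from scipy.optimize import linprog

# Hypothetical problem statement the model would receive:
# "A workshop makes chairs ($30 profit, 2 labor hours) and tables ($50 profit,
#  4 labor hours), with at most 40 labor hours and at most 15 chairs per week.
#  Maximize profit."

# linprog minimizes, so negate the profit coefficients to maximize.
c = [-30, -50]                 # objective: maximize 30*chairs + 50*tables
A_ub = [[2, 4],                # labor-hour constraint
        [1, 0]]                # chair-capacity constraint
b_ub = [40, 15]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print("chairs, tables =", res.x, " profit =", -res.fun)
```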
Presidency University
Why we think this paper is great for you:
This paper presents an AI-powered platform for automated data analysis and visualization, which is foundational for deriving insights in your data-driven CRM and informing personalization strategies. It helps streamline the understanding of complex datasets.
Abstract
This paper presents an AI-powered data visualization platform that automates the entire data analysis process, from uploading a dataset to generating an interactive visualization. Advanced machine learning algorithms are employed to clean and preprocess the data, analyse its features, and automatically select appropriate visualizations. The system automates AI-based analysis and visualization for data-driven environments and removes the burden of time-consuming manual data analysis. A Python Flask backend for dataset access, paired with a React frontend, provides a robust platform that interacts with Firebase Cloud Storage for data processing, data analysis, and real-time sources. Key contributions include automatic, intelligent data cleaning, with imputation for missing values and outlier detection driven by analysis of the dataset itself; AI-based feature selection using four different algorithms; and intelligent title generation and visualization choices determined by the attributes of the dataset. These contributions were evaluated on two separate datasets. In this evaluation, the initial analysis ran in real time on datasets as large as 100,000 rows, while the cloud-based, on-demand platform scaled to serve multiple users and process their requests simultaneously. In conclusion, the cloud-based data visualization application significantly reduced the manual input needed for data analysis while maintaining high-quality, impactful visual outputs and user experiences.
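A hedged sketch of the automated cleaning step described above: median/mode imputation plus IQR-based outlier flagging inferred from the dataset itself. Column names and thresholds are placeholders, not the platform's actual logic.

```python
import pandas as pd

# Illustrative automated cleaning: impute missing values and flag outliers.
def auto_clean(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    for col in list(df.columns):
        if pd.api.types.is_numeric_dtype(df[col]):
            df[col] = df[col].fillna(df[col].median())   # impute numeric columns
            q1, q3 = df[col].quantile([0.25, 0.75])
            iqr = q3 - q1
            df[f"{col}_outlier"] = (df[col] < q1 - 1.5 * iqr) | (df[col] > q3 + 1.5 * iqr)
        else:
            mode = df[col].mode()                        # impute categoricals
            df[col] = df[col].fillna(mode.iloc[0] if not mode.empty else "unknown")
    return df

cleaned = auto_clean(pd.DataFrame({"age": [25, None, 31, 120],
                                   "city": ["NY", None, "LA", "SF"]}))
print(cleaned)
```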
Xi'an Jiaotong University
Why we think this paper is great for you:
While primarily focused on theoretical aspects of reinforcement learning for control tasks, this paper might offer abstract insights into robust system design and testing methodologies applicable to complex MLOps environments.
Abstract
Reinforcement learning (RL) has been recognized as a powerful tool for robot control tasks. RL typically employs reward functions to define task objectives and guide agent learning. However, since the reward function serves the dual purpose of defining the optimal goal and guiding learning, it is challenging to design the reward function manually, which often results in a suboptimal task representation. To tackle the reward design challenge in RL, inspired by the satisficing theory, we propose a Test-driven Reinforcement Learning (TdRL) framework. In the TdRL framework, multiple test functions are used to represent the task objective rather than a single reward function. Test functions can be categorized as pass-fail tests and indicative tests, each dedicated to defining the optimal objective and guiding the learning process, respectively, thereby making defining tasks easier. Building upon such a task definition, we first prove that if a trajectory return function assigns higher returns to trajectories closer to the optimal trajectory set, maximum entropy policy optimization based on this return function will yield a policy that is closer to the optimal policy set. Then, we introduce a lexicographic heuristic approach to compare the relative distance relationship between trajectories and the optimal trajectory set for learning the trajectory return function. Furthermore, we develop an algorithm implementation of TdRL. Experimental results on the DeepMind Control Suite benchmark demonstrate that TdRL matches or outperforms handcrafted reward methods in policy training, with greater design simplicity and inherent support for multi-objective optimization. We argue that TdRL offers a novel perspective for representing task objectives, which could be helpful in addressing the reward design challenges in RL applications.
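To illustrate the idea of replacing a single reward function with tests, here is a small hypothetical sketch of a pass-fail test (defines success) and an indicative test (guides learning), both evaluated on a whole trajectory. The state layout and thresholds are assumptions for illustration, not the paper's benchmark tasks.

```python
import numpy as np

def pass_fail_test(trajectory: np.ndarray) -> bool:
    """Pass iff the pole angle (column 0, radians) stays within +/- 0.2 rad."""
    return bool(np.all(np.abs(trajectory[:, 0]) < 0.2))

def indicative_test(trajectory: np.ndarray) -> float:
    """Higher is better: negative mean deviation from upright, to guide learning."""
    return -float(np.mean(np.abs(trajectory[:, 0])))

# A learned trajectory return function would assign higher returns to
# trajectories closer to the set that passes all pass-fail tests, with
# indicative tests ordering the trajectories that still fail.
traj = np.random.default_rng(0).normal(0.0, 0.1, size=(200, 1))
print(pass_fail_test(traj), indicative_test(traj))
```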

Interests not found

We did not find any papers that match the interests below. Try other terms, and consider whether content on these topics exists on arxiv.org.
  • MLOps
  • Email Marketing
  • Personalization Platform
You can edit or add more interests any time.