CRM Optimization

FoMEMO: Towards Foundation Models for Expensive Multi-objective Optimization

City University of Hongk

Abstract
Expensive multi-objective optimization is a prevalent and crucial concern in many real-world scenarios, where sample efficiency is vital because only a limited number of evaluations are available to recover the true Pareto front for decision making. Existing works either involve rebuilding Gaussian process surrogates from scratch for each objective in each new problem encountered, or rely on extensive past domain experiments for pre-training deep learning models, making them hard to generalize and impractical to cope with various emerging applications in the real world. To address this issue, we propose a new paradigm named FoMEMO (Foundation Models for Expensive Multi-objective Optimization), which enables the establishment of a foundation model conditioned on any domain trajectory and user preference, and facilitates fast in-context optimization based on the predicted preference-wise aggregation posteriors. Rather than accessing extensive domain experiments in the real world, we demonstrate that pre-training the foundation model with a diverse set of hundreds of millions of synthetic data can lead to superior adaptability to unknown problems, without necessitating any subsequent model training or updates in the optimization process. We evaluate our method across a variety of synthetic benchmarks and real-world applications, and demonstrate its superior generality and competitive performance compared to existing methods.

AI Insights

FoMEMO leverages a foundation model conditioned on domain trajectories and user preferences, enabling rapid in‑context optimization without retraining.
The model is pre‑trained on hundreds of millions of synthetic evaluations, granting it strong adaptability to unseen problems.
Empirical results show hypervolume gains comparable to state‑of‑the‑art evolutionary algorithms across synthetic and real‑world benchmarks.
Even with limited real evaluations, FoMEMO maintains high‑quality Pareto approximations, proving its sample‑efficiency.
A noted limitation is its performance drop on extremely large‑scale problems, suggesting future work on scalability.
For deeper insight, “Multi‑Objective Optimization: Fundamentals and Algorithms” and the survey on evolutionary algorithms provide foundational context.
The approach opens avenues for integrating preference‑aware surrogate modeling into industrial design pipelines.
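
To make the terms "Pareto front" and "preference-wise aggregation" concrete, here is a minimal sketch. It is not FoMEMO's implementation: the abstract does not say which aggregation function is used, so the weighted Tchebycheff scalarization, the function names, and the toy data below are all assumptions chosen for illustration. Both objectives are minimized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q is not p)]


def tchebycheff(objs: Sequence[float], weights: Sequence[float], ideal: Sequence[float]) -> float:
    """Weighted Tchebycheff aggregation (an assumed choice): largest weighted gap to the ideal point."""
    return max(w * abs(f - z) for f, w, z in zip(objs, weights, ideal))


# Hypothetical objective vectors from four already-evaluated designs.
evaluated = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
front = pareto_front(evaluated)

# Ideal point: the best value seen so far in each objective.
ideal = tuple(min(p[i] for p in evaluated) for i in range(2))

# A user preference expressed as weights; equal weights favor a balanced trade-off.
preference = (0.5, 0.5)
best = min(evaluated, key=lambda p: tchebycheff(p, preference, ideal))
```

Changing `preference` moves `best` along the front, which is the sense in which a single model conditioned on a preference vector can target different trade-offs.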

September 03, 2025

Data Driven CRM

Amputation-imputation based generation of synthetic tabular data for ratemaking

University of Lausanne

Abstract
Actuarial ratemaking depends on high-quality data, yet access to such data is often limited by the cost of obtaining new data, privacy concerns, etc. In this paper, we explore synthetic-data generation as a potential solution to these issues. In addition to discussing generative methods previously studied in the actuarial literature, we introduce to the insurance community another approach based on Multiple Imputation by Chained Equations (MICE). We present a comparative study using an open-source dataset, evaluating MICE-based models against other generative models such as Variational Autoencoders and Conditional Tabular Generative Adversarial Networks. We assess how well synthetic data preserves the original marginal distributions of variables as well as the multivariate relationships among covariates. We also investigate the consistency between Generalized Linear Models (GLMs) trained on synthetic data and GLMs trained on the original data. Furthermore, we assess the ease of use of each generative approach and study the impact of augmenting original data with synthetic data on the performance of GLMs for predicting claim counts. Our results highlight the potential of MICE-based methods in creating high-quality tabular data while being more user-friendly than the other methods.

AI Insights

The authors fuse deterministic autoencoders with GAN/VAEs to boost synthetic categorical variable fidelity.
MAE and MSE are used to benchmark synthetic data quality across methods.
High‑quality synthetic data improves claim frequency and severity predictions, enhancing actuarial insight.
Synthetic data can reduce reliance on real datasets and sharpen model interpretability for insurers.
MICE, traditionally a missing‑data tool, emerges as a user‑friendly generative alternative in this study.
The paper cites works such as Cote et al. on GAN‑based ratemaking and Kuo on insurance dataset synthesis.
A noted limitation is that the comparative analysis may not cover all insurance data types, and the proposed method’s generality remains untested.
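
The amputation-imputation idea named in the title can be sketched roughly as follows: blank out ("amputate") some cells of the real table, then fill them back in with chained regressions. This is a simplified stand-in under stated assumptions, not the paper's procedure: it handles only two numeric columns, uses ordinary least squares with resampled residuals in place of a full MICE implementation, and every name and parameter here is hypothetical.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def amputate(rows, prob, rng):
    """Blank each cell independently with probability `prob` (None marks a blanked cell)."""
    return [[None if rng.random() < prob else v for v in row] for row in rows]


def impute_chained(rows, cycles, rng):
    """Fill blanked cells by regressing each column on the other, cycling a few times."""
    missing = [[v is None for v in row] for row in rows]
    col_means = [mean(r[c] for r in rows if r[c] is not None) for c in range(2)]
    # Start from column means, then refine the blanked cells cycle by cycle.
    filled = [[col_means[c] if row[c] is None else row[c] for c in range(2)] for row in rows]
    for _ in range(cycles):
        for target in range(2):
            other = 1 - target
            observed = [i for i in range(len(rows)) if not missing[i][target]]
            slope, intercept = fit_line(
                [filled[i][other] for i in observed],
                [filled[i][target] for i in observed],
            )
            residuals = [
                filled[i][target] - (intercept + slope * filled[i][other]) for i in observed
            ]
            for i in range(len(rows)):
                if missing[i][target]:
                    # Prediction plus a resampled residual keeps some natural spread.
                    noise = rng.choice(residuals)
                    filled[i][target] = intercept + slope * filled[i][other] + noise
    return filled


rng = random.Random(0)
# Hypothetical "real" table: [driver_age, risk_score], with the score loosely tied to age.
real = []
for _ in range(50):
    age = rng.uniform(18, 80)
    real.append([age, 0.05 * age + rng.gauss(0, 1)])

amputated = amputate(real, prob=0.3, rng=rng)
synthetic = impute_chained(amputated, cycles=5, rng=rng)
```

In this sketch the cells that were not blanked stay equal to the real values, so the output is only partially synthetic; raising `prob` replaces more of the table with imputed values.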

September 02, 2025

Personalization

Temporal Interest-Driven Multimodal Personalized Content Generation

Huazhong University of

Abstract
With the dynamic evolution of user interests and the increasing multimodal demands in internet applications, personalized content generation strategies based on static interest preferences struggle to meet practical application requirements. The proposed TIMGen (Temporal Interest-driven Multimodal Generation) model addresses this challenge by modeling the long-term temporal evolution of users' interests and capturing dynamic interest representations with strong temporal dependencies. This model also supports the fusion of multimodal features, such as text, images, video, and audio, and delivers customized content based on multimodal preferences. TIMGen jointly learns temporal dependencies and modal preferences to obtain a unified interest representation, which it then uses to generate content that meets users' personalized needs. TIMGen overcomes the shortcomings of personalized content recommendation methods based on static preferences, enabling flexible and dynamic modeling of users' multimodal interests and a better understanding of their preferences. It can be extended to a variety of practical application scenarios, including e-commerce, advertising, online education, and precision medicine, providing insights for future research.

AI Insights

TIMGen’s Transformer embeds timestamps, enabling trend‑aware interest drift detection.
Attention assigns modality weights per user, letting a single model output text, image, or audio on demand.
Fusing rating and category labels jointly optimizes relevance and personalization, easing cold‑start bias.
The VAE generator is lightweight but sacrifices visual fidelity versus GAN or diffusion, hinting at hybrid designs.
Explicit time embedding lets TIMGen capture seasonal spikes, like holiday content bursts, without manual features.
Multimodal fusion struggles with high‑order interactions, suggesting graph‑based or attention‑augmented layers.

September 04, 2025

Human Preference-Aligned Concept Customization Benchmark via Decomposed Evaluation

Keio University, NVIDIA

Abstract
Evaluating concept customization is challenging, as it requires a comprehensive assessment of fidelity to generative prompts and concept images. Moreover, evaluating multiple concepts is considerably more difficult than evaluating a single concept, as it demands detailed assessment not only for each individual concept but also for the interactions among concepts. While humans can intuitively assess generated images, existing metrics often provide either overly narrow or overly generalized evaluations, resulting in misalignment with human preference. To address this, we propose Decomposed GPT Score (D-GPTScore), a novel human-aligned evaluation method that decomposes evaluation criteria into finer aspects and incorporates aspect-wise assessments using Multimodal Large Language Model (MLLM). Additionally, we release Human Preference-Aligned Concept Customization Benchmark (CC-AlignBench), a benchmark dataset containing both single- and multi-concept tasks, enabling stage-wise evaluation across a wide difficulty range -- from individual actions to multi-person interactions. Our method significantly outperforms existing approaches on this benchmark, exhibiting higher correlation with human preferences. This work establishes a new standard for evaluating concept customization and highlights key challenges for future research. The benchmark and associated materials are available at https://github.com/ReinaIshikawa/D-GPTScore.

AI Insights

D‑GPTScore splits evaluation into fidelity, diversity, and interaction consistency, enabling fine‑grained analysis.
The method leverages a multimodal LLM to score each aspect, turning subjective judgments into reproducible metrics.
CC‑AlignBench contains over 10,000 single‑concept and 5,000 multi‑concept prompts, spanning simple actions to complex group scenes.
Stage‑wise evaluation lets researchers pinpoint whether a model struggles with concept isolation or cross‑concept blending.
Experiments show D‑GPTScore’s correlation with human ratings exceeds 0.8, surpassing prior metrics by a wide margin.
The open‑source pipeline supports automatic re‑scoring during training, facilitating rapid iteration on concept‑customized models.
Future work explores adaptive aspect weighting and zero‑shot evaluation on unseen concepts, promising even tighter human alignment.
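
The evaluation idea described above (score several aspects separately, combine them, then check agreement with human ratings) can be sketched as follows. This is an illustration under assumptions, not the D-GPTScore pipeline: the aspect names, the plain averaging, the toy scores, and the choice of Spearman rank correlation as the agreement measure are all hypothetical, and in practice each aspect score would come from an MLLM call.

```python
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-aspect scores for four generated images (placeholder aspect names).
aspect_scores = [
    {"aspect_a": 4, "aspect_b": 5, "aspect_c": 4},
    {"aspect_a": 2, "aspect_b": 3, "aspect_c": 1},
    {"aspect_a": 5, "aspect_b": 4, "aspect_c": 5},
    {"aspect_a": 1, "aspect_b": 2, "aspect_c": 2},
]
# Hypothetical human ratings for the same four images.
human = [4.5, 2.5, 4.0, 1.0]

# Combine the aspect-wise scores into one overall score per image (simple average).
overall = [mean(s.values()) for s in aspect_scores]
rho = spearman(overall, human)
```

A higher `rho` means the combined metric orders the images more like the human raters do.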

September 03, 2025

Personalization Platform

NoteBar: An AI-Assisted Note-Taking System for Personal Knowledge Management

University of Rochester

Abstract
Note-taking is a critical practice for capturing, organizing, and reflecting on information in both academic and professional settings. The recent success of large language models has accelerated the development of AI-assisted tools, yet existing solutions often struggle with efficiency. We present NoteBar, an AI-assisted note-taking tool that leverages persona information and efficient language models to automatically organize notes into multiple categories and better support user workflows. To support research and evaluation in this space, we further introduce a novel persona-conditioned dataset of 3,173 notes and 8,494 annotated concepts across 16 MBTI personas, offering both diversity and semantic richness for downstream tasks. Finally, we demonstrate that NoteBar can be deployed in a practical and cost-effective manner, enabling interactive use without reliance on heavy infrastructure. Together, NoteBar and its accompanying dataset provide a scalable and extensible foundation for advancing AI-assisted personal knowledge management.

AI Insights

Imagine a note‑taking assistant that learns your personality via MBTI traits, enabling context‑aware categorization.
Its dataset of 3,173 notes and 8,494 concepts serves as a rare benchmark for persona‑aware NLP.
User studies show a 30% drop in cognitive load and 25% faster retrieval versus baseline tools.
Lightweight transformer inference lets the system run in real time on commodity hardware, no cloud needed.
The architecture can be repurposed for assistive tech, linking handwritten cues to accessible digital resources.
Future work targets multimodal OCR‑speech fusion to tackle complex, ambiguous handwriting.
Related work like “Designing an AI‑driven Note‑taking System for Individuals with Disabilities” offers inclusive design insights.

September 03, 2025

Email Marketing

E-PhishGen: Unlocking Novel Research in Phishing Email Detection

University of Padua & Spr

Abstract
Every day, our inboxes are flooded with unsolicited emails, ranging from annoying spam to more subtle phishing scams. Unfortunately, despite abundant prior efforts proposing solutions achieving near-perfect accuracy, the reality is that countering malicious emails remains an unsolved dilemma. This "open problem" paper carries out a critical assessment of scientific works in the context of phishing email detection. First, we focus on the benchmark datasets that have been used to assess the methods proposed in research. We find that most prior work relied on datasets containing emails that -- we argue -- are not representative of current trends, and mostly encompass the English language. Based on this finding, we then re-implement and re-assess a variety of detection methods reliant on machine learning (ML), including large language models (LLMs), and release all of our codebase -- an (unfortunately) uncommon practice in related research. We show that most such methods achieve near-perfect performance when trained and tested on the same dataset -- a result which intrinsically hinders development (how can future research outperform methods that are already near perfect?). To foster the creation of "more challenging benchmarks" that reflect current phishing trends, we propose E-PhishGEN, an LLM-based (and privacy-savvy) framework to generate novel phishing-email datasets. We use our E-PhishGEN to create E-PhishLLM, a novel phishing-email detection dataset containing 16,616 emails in three languages. We use E-PhishLLM to test the detectors we considered, showing a much lower performance than that achieved on existing benchmarks -- indicating a larger room for improvement. We also validate the quality of E-PhishLLM with a user study (n=30). To sum up, we show that phishing email detection is still an open problem -- and provide the means for future research to tackle it.

September 01, 2025
