Papers from 22–26 September 2025

Here are your personalized paper recommendations, sorted by relevance.
Tech for Social Good
Abstract
Our study presents a model of factors influencing the financial viability of Hungarian social enterprises, and tests the model on a sample of 220 Hungarian firms involved in social entrepreneurship. In the model we suggest that the most important factors for financial viability are the Regulatory environment (the transparency of regulations); the Entrepreneurial attributes of the entrepreneur (business orientation, business skills and experience, business planning tendencies); the Financial support provided by the environment (the ratio of grants, donations and subsidies within the total revenues of the firm); and the Strategy followed by the firm (the presence of such generic strategies as cost leadership or differentiation). We find that only two of the model's four factors are significantly associated with Financial viability: Entrepreneurial attributes and Financial support. The results suggest that the best ways of strengthening the viability of social enterprises are entrepreneurship training (to enhance the business skills and experience of the entrepreneurs, and to propagate business planning) and the provision of grants and subsidies to these firms. As no significant association was found between Financial viability and Strategy, we conclude that the role of market competition is probably relatively weak among Hungarian social enterprises.
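To make the hypothesis test concrete, here is a minimal sketch of the kind of factor regression the abstract describes, assuming each factor is measured as a composite score; the variable names and synthetic data are illustrative assumptions, not the authors' survey instrument.

```python
# Hedged sketch: regress a financial-viability score on the four
# hypothesized factors. Data are simulated to mirror the reported
# pattern (only two factors carry signal); nothing here is the
# authors' dataset or instrument.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 220  # sample size reported in the abstract
df = pd.DataFrame({
    "regulatory_env": rng.normal(size=n),
    "entrepreneurial_attr": rng.normal(size=n),
    "financial_support": rng.normal(size=n),
    "strategy": rng.normal(size=n),
})
df["viability"] = (0.5 * df["entrepreneurial_attr"]
                   + 0.4 * df["financial_support"]
                   + rng.normal(size=n))

X = sm.add_constant(df[["regulatory_env", "entrepreneurial_attr",
                        "financial_support", "strategy"]])
print(sm.OLS(df["viability"], X).fit().summary())
```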
Econometrics for Social Good
arXiv:2509.18047v1 [stat]
Abstract
In this paper, we present a general specification for Functional Effects Models, which use Machine Learning (ML) methodologies to learn individual-specific preference parameters from socio-demographic characteristics, therefore accounting for inter-individual heterogeneity in panel choice data. We identify three specific advantages of the Functional Effects Model over traditional fixed and random/mixed effects models: (i) by mapping individual-specific effects as a function of socio-demographic variables, we can account for these effects when forecasting choices of previously unobserved individuals; (ii) the (approximate) maximum-likelihood estimation of functional effects avoids the incidental-parameters problem of the fixed effects model, even when the number of observed choices per individual is small; and (iii) we do not rely on the strong distributional assumptions of the random effects model, which may not match reality. We learn functional intercepts and functional slopes with powerful non-linear machine learning regressors for tabular data, namely gradient boosting decision trees and deep neural networks. We validate the proposed methodology on a synthetic experiment and three real-world panel case studies, demonstrating that the Functional Effects Model (i) can identify the true values of individual-specific effects when the data-generation process is known, and (ii) outperforms both state-of-the-art ML choice-modelling techniques that omit individual heterogeneity (in terms of predictive performance) and traditional static panel choice models (in terms of learning inter-individual heterogeneity). The results indicate that the FI-RUMBoost model, which combines the individual-specific constants of the Functional Effects Model with the complex, non-linear utilities of RUMBoost, performs best, by a small margin, on large-scale revealed-preference panel data.
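A minimal sketch of the functional-effects idea follows. The paper estimates the functional effects jointly by (approximate) maximum likelihood; this toy uses a two-stage shortcut purely to illustrate how mapping socio-demographics to individual effects lets the model score previously unseen individuals. All names, dimensions, and data are illustrative assumptions.

```python
# Toy illustration: individual-specific intercepts are a learned function
# alpha(z) of socio-demographics z, not free per-person parameters.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)
n_people = 200
z = rng.normal(size=(n_people, 3))            # socio-demographics (e.g. age, income)
alpha_true = z @ np.array([1.0, -0.5, 0.3])   # true individual effects alpha(z)

# Stage-1 shortcut: fit alpha(.) to noisy per-person estimates with a
# gradient-boosted regressor (the paper also uses deep neural networks).
alpha_noisy = alpha_true + rng.normal(scale=0.5, size=n_people)
f_intercept = GradientBoostingRegressor().fit(z, alpha_noisy)

# A previously unseen individual gets an effect without adding any new
# parameter, unlike a fixed-effects model.
beta, cost = -1.0, 2.0                        # assumed taste weight and attribute
z_new = rng.normal(size=(1, 3))
utility = f_intercept.predict(z_new)[0] + beta * cost
print("predicted utility for unseen individual:", utility)
```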
AI Insights
  • Optuna’s Bayesian search slashes tuning time for functional‑effects regressors.
  • PyTorch 2’s byte‑code transformation speeds up the deep‑net component by ~30 %.
  • Biogeme’s mixed‑logit engine benchmarks the new functional‑effects estimates.
  • SHARE’s socio‑demographic covariates are mapped to individual utilities in the model.
  • Rumboost’s gradient‑boosted random‑utility framework is fused into FI‑RUMBoost, topping panel accuracy.
  • Mixed‑effects regression trees embed random effects in tree learners for clustered data.
  • Neural‑embedded discrete‑choice models give interpretable taste representations alongside functional effects.
Abstract
Modern socio-economic systems are undergoing deep integration with artificial intelligence technologies. This paper constructs a heterogeneous agent-based modeling framework that incorporates both human workers and autonomous AI agents, to study the impact of AI collaboration under resource constraints on aggregate social output. We build five progressively extended models: Model 1 serves as the baseline of pure human collaboration; Model 2 introduces AI as collaborators; Model 3 incorporates network effects among agents; Model 4 treats agents as independent producers; and Model 5 integrates both network effects and independent agent production. Through theoretical derivation and simulation analysis, we find that the introduction of AI agents can significantly increase aggregate social output. When considering network effects among agents, this increase exhibits nonlinear growth far exceeding the simple sum of individual contributions. Under the same resource inputs, treating agents as independent producers provides higher long-term growth potential; introducing network effects further demonstrates strong characteristics of increasing returns to scale.
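A toy rendition of the comparison the abstract sets up: the functional forms, resource costs, and the quadratic network term below are my assumptions, meant only to show why pairwise spillovers can produce the superlinear, increasing-returns behaviour reported.

```python
# Hedged sketch: aggregate output with and without AI agents and network
# effects, under a shared resource budget. Parameters are illustrative,
# not the authors' Models 1-5.
import numpy as np

rng = np.random.default_rng(2)

def aggregate_output(n_humans, n_ai, network=False, budget=100.0):
    cost_h, cost_ai = 1.0, 0.5                # assumed per-agent resource costs
    if n_humans * cost_h + n_ai * cost_ai > budget:
        raise ValueError("resource constraint violated")
    # Assumed productivity draws; AI agents get a higher mean.
    prod = rng.gamma(2.0, 1.0, n_humans).sum() + 1.5 * rng.gamma(2.0, 1.0, n_ai).sum()
    if network:
        n = n_humans + n_ai
        prod += 0.01 * n * (n - 1) / 2        # pairwise spillovers -> superlinear term
    return prod

print("humans only:          ", aggregate_output(80, 0))
print("humans + AI:          ", aggregate_output(60, 40))
print("humans + AI + network:", aggregate_output(60, 40, network=True))
```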
AI for Social Good
Abstract
AI safety research has emphasized interpretability, control, and robustness, yet without an ethical substrate these approaches may remain fragile under competitive and open-ended pressures. This paper explores ethics not as an external add-on, but as a possible structural lens for alignment, introducing a \emph{moral problem space} $M$: a high-dimensional domain in which moral distinctions could, in principle, be represented in AI systems. Human moral reasoning is treated as a compressed and survival-biased projection $\tilde{M}$, clarifying why judgment is inconsistent while suggesting tentative methods -- such as sparse autoencoders, causal mediation, and cross-cultural corpora -- that might help probe for disentangled moral features. Within this framing, metaethical positions are interpreted as research directions: realism as the search for stable invariants, relativism as context-dependent distortions, constructivism as institutional shaping of persistence, and virtue ethics as dispositional safeguards under distributional shift. Evolutionary dynamics and institutional design are considered as forces that may determine whether ethical-symbiotic lineages remain competitively viable against more autarkic trajectories. Rather than offering solutions, the paper sketches a research agenda in which embedding ethics directly into representational substrates could serve to make philosophical claims more empirically approachable, positioning moral theory as a potential source of hypotheses for alignment work.
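Of the probing methods the abstract mentions, sparse autoencoders are the most concrete; a minimal sketch follows, assuming we are handed a batch of model activations. Dimensions, training data, and the L1 coefficient are illustrative, and nothing here encodes the paper's moral problem space $M$ itself.

```python
# Minimal sparse autoencoder sketch for probing disentangled features in
# activations. Random data stands in for real model activations.
import torch
import torch.nn as nn

d_model, d_hidden = 64, 256           # assumed activation and dictionary sizes
acts = torch.randn(1024, d_model)     # stand-in for collected activations

enc = nn.Linear(d_model, d_hidden)
dec = nn.Linear(d_hidden, d_model)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)

for _ in range(200):
    h = torch.relu(enc(acts))         # sparse codes: candidate features
    recon = dec(h)
    # Reconstruction loss plus L1 sparsity penalty on the codes.
    loss = ((recon - acts) ** 2).mean() + 1e-3 * h.abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()

print("final loss:", loss.item())
```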
Inequality
Hunan University
Abstract
The LYM inequality is a fundamental result concerning the sizes of subsets in a Sperner family. Subsequent studies on the LYM inequality have been generalized to families of $r$-decompositions, where all components are required to avoid chains of the same length. In this paper, we relax this constraint by allowing components of a family of $r$-decompositions to avoid chains of distinct lengths, and derive generalized LYM inequalities across all the relevant settings, including set-theoretic, $q$-analog, continuous analog, and arithmetic analog frameworks. Notably, the bound in our LYM inequalities does not depend on the maximal length of all forbidden chains. Moreover, we extend our approach beyond $r$-decompositions to $r$-multichains, and establish analogous LYM inequalities.
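For reference, the classical LYM inequality that these results generalize, stated for an antichain (a Sperner family, in which no member contains another):

```latex
% Classical LYM (Lubell--Yamamoto--Meshalkin) inequality for an antichain
% \mathcal{F} \subseteq 2^{[n]}:
\[
  \sum_{A \in \mathcal{F}} \binom{n}{|A|}^{-1} \;\le\; 1,
\]
% from which Sperner's theorem follows immediately:
% |\mathcal{F}| \le \binom{n}{\lfloor n/2 \rfloor}.
```

The paper's generalizations replace the antichain condition with families of $r$-decompositions whose components avoid chains of possibly distinct lengths.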
AI Insights
  • The authors introduce a counting scheme that surpasses classical Sperner bounds by refining chain decompositions.
  • Combining Sperner’s theorem with Dilworth’s lemma yields a tight upper bound on r‑decompositions.
  • Counterexamples show the inequality is optimal even for chains of disparate lengths.
  • The work links combinatorial limits to geometric probability, suggesting new research avenues.
  • Future directions include extending the method to infinite families and algorithmic matching applications.
  • For background, see “Matching Theory, an Introduction,” which contextualizes the paper’s techniques.
Female Empowerment
Abstract
As language model (LM) agents become more capable and gain broader access to real-world tools, there is a growing need for scalable evaluation frameworks of agentic capability. However, conventional benchmark-centric evaluations are costly to design and require human designers to come up with valid tasks that translate into insights about general model capabilities. In this work, we propose information-theoretic evaluation based on empowerment, the mutual information between an agent's actions and future states, as an open-ended method for evaluating LM agents. We introduce EELMA (Estimating Empowerment of Language Model Agents), an algorithm for approximating effective empowerment from multi-turn text interactions. We validate EELMA on both language games and scaled-up realistic web-browsing scenarios. We find that empowerment strongly correlates with average task performance; we characterize the impact of environmental complexity and of agentic factors such as chain-of-thought, model scale, and memory length on estimated empowerment; and we observe that high-empowerment states and actions are often pivotal moments for general capabilities. Together, these results establish empowerment as an appealing general-purpose metric for evaluating and monitoring LM agents in complex, open-ended settings.
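The quantity EELMA approximates is easy to state: empowerment is the mutual information I(A; S') between an agent's actions and the resulting states. Below is a hedged plug-in estimate on a toy discrete environment; EELMA itself works from multi-turn text interactions, so the environment and estimator here are illustrative stand-ins.

```python
# Toy empowerment estimate: plug-in mutual information between actions
# and next states from paired samples.
import numpy as np
from collections import Counter

rng = np.random.default_rng(3)
n = 5000
actions = rng.integers(0, 4, size=n)
# High-empowerment toy dynamics: the next state usually follows the action.
states = np.where(rng.random(n) < 0.9, actions, rng.integers(0, 4, size=n))

def mutual_information(a, s):
    """Plug-in estimate of I(A; S') in nats from paired samples."""
    pa, ps, pas = Counter(a), Counter(s), Counter(zip(a, s))
    mi = 0.0
    for (ai, si), c in pas.items():
        # p(a,s) * log( p(a,s) / (p(a) p(s)) ), with counts c, pa, ps over n draws
        mi += (c / n) * np.log(c * n / (pa[ai] * ps[si]))
    return mi

print("estimated empowerment (nats):", mutual_information(actions, states))
```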
Causal ML for Social Good
Abstract
As machine learning models continue to grow in size and complexity, efficient serving faces increasingly broad trade-offs spanning accuracy, latency, resource usage, and other objectives. Multi-model serving further complicates these trade-offs; for example, in cascaded models, each early-exit decision balances latency reduction against potential accuracy loss. Despite the pervasiveness and importance of such trade-offs, current strategies remain largely heuristic and case-specific, limiting both their theoretical guarantees and general applicability. We present a general framework, T-Tamer, which formalizes this setting as a multi-stage decision process, where the objective is to determine both when to exit and which model to consult. Our main result shows that recall (i.e., the ability to revisit earlier models) is both necessary and sufficient for achieving provable performance guarantees. In particular, we prove that strategies without recall cannot obtain any constant-factor approximation to the optimal trade-off, whereas recall-based strategies provably attain the optimal trade-off in polynomial time. We validate our analysis through experiments on synthetic datasets and early-exit workloads for vision and NLP benchmarks. The results show that recall-based strategies consistently yield efficient accuracy-latency trade-offs. We hope this work provides a principled foundation for bridging heuristic practice with theoretical guarantees in the design of early-exit and cascaded models.
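The recall property at the center of the result is easy to illustrate. In this toy (models, costs, and confidence scores are all made up), a cascade with recall can fall back to the best answer seen so far, while a no-recall strategy must commit to the current stage's output; it is a sketch of the concept, not the authors' T-Tamer algorithm.

```python
# Toy early-exit cascade with and without recall (revisiting earlier models).
from dataclasses import dataclass

@dataclass
class Stage:
    name: str
    latency: float      # assumed cost of consulting this model
    confidence: float   # assumed score of its answer on the current input

def run_cascade(stages, threshold, with_recall=True):
    """Consult models in order; stop once some eligible answer clears the
    threshold. With recall, any answer seen so far is eligible; without it,
    only the current stage's answer is."""
    seen, spent = [], 0.0
    for st in stages:
        spent += st.latency
        seen.append(st)
        best = max(seen if with_recall else [st], key=lambda s: s.confidence)
        if best.confidence >= threshold:
            return best.name, best.confidence, spent
    best = max(seen if with_recall else [stages[-1]], key=lambda s: s.confidence)
    return best.name, best.confidence, spent

stages = [Stage("small", 1.0, 0.80), Stage("medium", 3.0, 0.55),
          Stage("large", 9.0, 0.70)]
print(run_cascade(stages, threshold=0.9, with_recall=True))   # falls back to 'small'
print(run_cascade(stages, threshold=0.9, with_recall=False))  # stuck with 'large'
```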

Interests not found

We did not find any papers matching the interests below. Try other terms, and consider whether such content exists on arxiv.org.
  • Racism
  • Measurable ways to end poverty
  • Healthy Society
  • Animal Welfare
  • Poverty
You can edit or add more interests any time.
