University of Southern California
Abstract
High-assurance reasoning, particularly in critical domains such as law and
medicine, requires conclusions that are accurate, verifiable, and explicitly
grounded in evidence. This reasoning relies on premises codified from rules,
statutes, and contracts, and it is inherently defeasible, or non-monotonic: because
such texts contain numerous exceptions, the introduction of a single fact can
invalidate a general rule, which poses a significant challenge. While large language
models (LLMs) excel at processing natural language, their capabilities in
standard inference tasks do not translate to the rigorous reasoning required
over high-assurance text guidelines. Core reasoning challenges within such
texts often manifest as specific logical structures involving negation,
implication, and, most critically, defeasible rules and exceptions. In this
paper, we propose a novel neurosymbolically grounded architecture called
LOGicalThought (LogT), which uses an advanced logical language and reasoner in
conjunction with an LLM to construct a dual context: a symbolic graph context and
a logic-based context. Together, these two representations transform the problem
from inference over long-form guidelines into a compact, grounded evaluation.
Evaluated on four multi-domain benchmarks against four baselines, LogT improves
overall performance by 11.84% across all LLMs. Compared to the strongest baseline,
performance improves significantly on all three modes of reasoning: by up to
+10.2% on negation, +13.2% on implication, and +5.5% on defeasible reasoning.