Bayesian Coherence in Large Language Models

This repository contains the code and experiments for evaluating Bayesian coherence in large language models (LLMs) in an in-context learning (ICL) setting. The project investigates whether LLMs perform coherent Bayesian inference when estimating latent probabilities, i.e., probabilities of events that are only indirectly observable in the context, rather than merely reproducing observed frequencies.

Motivation

LLMs often appear to behave "Bayesian-like" in simple settings, but it remains unclear whether their probability judgments are globally coherent, that is, whether they respect Bayes' theorem when combining multiple learned probabilities. Rather than focusing only on prediction accuracy, this work explicitly measures the coherence between the probability estimates produced by the model.

Key Findings

Instruction-fine-tuned models using chat templates can accurately infer the held-out transition probability, reaching Bayes-optimal accuracy with sufficient evidence.
Despite high accuracy, these models often exhibit systematic Bayesian incoherence, driven by misestimation of intermediate conditional probabilities.
Base (non-instruction-tuned) models fail to reliably infer the latent transition and show a different pattern of coherence deviations than instruction-fine-tuned models.
Coherence is generally worse for indirectly observable probabilities than for directly observable ones.
Causal and non-causal factorizations do not yield systematic differences in coherence.

Experimental Setup

We design a synthetic causal data-generating process with three binary random variables:

$$Y \leftarrow X \rightarrow Z$$

LLMs observe sequences of shuffled triplets $(x, y, z)$ sampled from this ground-truth model. Crucially, the transition probability $P(x \mid y, z)$ is never directly observed and must be inferred from context using Bayes' rule.
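The data-generating process can be sketched as follows. This is a minimal illustration, not the repository's code: the function names and the parameter values (`p_x`, `p_y_given_x`, `p_z_given_x`) are hypothetical, "shuffled" is read here as shuffling the order of the triplets, and no hold-out of particular combinations is implemented. The second function computes the Bayes-optimal posterior that follows from the graph, where $Y$ and $Z$ are conditionally independent given $X$.

```python
import random

# Hypothetical ground-truth parameters (not taken from the repository).
P_X = 0.3                    # P(X = 1)
P_Y_GIVEN_X = (0.2, 0.8)     # P(Y = 1 | X = 0), P(Y = 1 | X = 1)
P_Z_GIVEN_X = (0.6, 0.1)     # P(Z = 1 | X = 0), P(Z = 1 | X = 1)


def sample_triplets(n, seed=0):
    """Draw n (x, y, z) triplets from the fork Y <- X -> Z and shuffle their order."""
    rng = random.Random(seed)
    triplets = []
    for _ in range(n):
        x = int(rng.random() < P_X)
        y = int(rng.random() < P_Y_GIVEN_X[x])  # Y depends only on X
        z = int(rng.random() < P_Z_GIVEN_X[x])  # Z depends only on X
        triplets.append((x, y, z))
    rng.shuffle(triplets)
    return triplets


def true_posterior(x, y, z):
    """Bayes-optimal P(x | y, z) under the fork: proportional to P(x) P(y | x) P(z | x)."""
    def joint(xv):
        px = P_X if xv == 1 else 1 - P_X
        py = P_Y_GIVEN_X[xv] if y == 1 else 1 - P_Y_GIVEN_X[xv]
        pz = P_Z_GIVEN_X[xv] if z == 1 else 1 - P_Z_GIVEN_X[xv]
        return px * py * pz
    return joint(x) / (joint(0) + joint(1))
```

With these illustrative parameters, `true_posterior(1, 1, 0)` is the reference value a Bayes-optimal model would approach for $P(x{=}1 \mid y{=}1, z{=}0)$ given enough context.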

From a single context, we extract the model's estimates of:

Marginal probabilities (e.g., $P(x)$ )
Conditional probabilities (e.g., $P(y \mid x)$, $P(z \mid x, y)$ )
The held-out posterior $P(x \mid y, z)$

All probabilities are obtained directly from token log-likelihoods.
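One plausible way to turn token log-likelihoods into a probability over a binary outcome is to renormalize over the candidate answer tokens. The sketch below assumes this; whether the repository renormalizes or uses the raw token probabilities is not stated here, and `normalize_logprobs` is a hypothetical helper.

```python
import math


def normalize_logprobs(logprobs):
    """Softmax over candidate-token log-likelihoods.

    logprobs: dict mapping each candidate token (e.g. "0", "1") to the
    log-likelihood the model assigns to it at the answer position.
    Returns a dict of probabilities that sum to 1 over the candidates.
    """
    m = max(logprobs.values())  # subtract the max for numerical stability
    unnorm = {tok: math.exp(lp - m) for tok, lp in logprobs.items()}
    total = sum(unnorm.values())
    return {tok: v / total for tok, v in unnorm.items()}
```

For example, if the model puts raw probability 0.2 on "0" and 0.6 on "1" (with the remaining mass on other tokens), the renormalized estimate for "1" is 0.75.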

Bayesian Coherence Score

To quantify coherence, we introduce a causal coherence score that compares the model's direct estimate of the posterior with its Bayes-consistent reconstruction from factorized probabilities:

$$\text{causalCoherence}(x,y,z) = P_M(x \mid y, z) - \frac{ P_M(x)\, P_M(y \mid x)\, P_M(z \mid x, y) }{ \sum_{x'} P_M(x')\, P_M(y \mid x')\, P_M(z \mid x', y) }$$

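A score of zero indicates perfect Bayesian coherence. A positive score means the model's direct estimate of the posterior exceeds its own Bayes-consistent reconstruction, and a negative score means it falls below it. Consistent nonzero deviations reveal systematic inconsistencies in the model's probabilistic reasoning.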
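The score can be computed directly from the formula above. The following is a minimal sketch; the function name and the dictionary layout used to hold the model's estimates are assumptions made for illustration, not the repository's interface.

```python
def causal_coherence(x, y, z, p_post, p_x, p_y_given_x, p_z_given_xy):
    """Direct posterior estimate minus its Bayes-consistent reconstruction.

    Assumed (hypothetical) layout of the model's estimates P_M:
      p_post[(x, y, z)]        -> P_M(x | y, z)
      p_x[x]                   -> P_M(x)
      p_y_given_x[(y, x)]      -> P_M(y | x)
      p_z_given_xy[(z, x, y)]  -> P_M(z | x, y)
    """
    def joint(xv):
        # Numerator term of the reconstruction for a given value of x
        return p_x[xv] * p_y_given_x[(y, xv)] * p_z_given_xy[(z, xv, y)]

    # Normalize over all values x' that X can take
    reconstruction = joint(x) / sum(joint(xv) for xv in p_x)
    return p_post[(x, y, z)] - reconstruction
```

If the direct estimate equals the reconstruction, the function returns zero; any other value is the signed gap between the two.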

We additionally compare:

Held-out vs. observable (baseline) transitions
Causal vs. non-causal factorizations
Base vs. instruction-fine-tuned models
Chat-template vs. raw autoregressive prompting

Conclusion

Bayesian-like behavior in LLMs is not an inherent consequence of autoregressive training or model scale, but is strongly influenced by instruction fine-tuning and prompt format. Even when prediction accuracy is high, probability estimates may remain globally incoherent, suggesting that LLMs approximate probabilistic structure in a task- and prompt-dependent manner rather than implementing Bayes' rule explicitly.

Repository Contents

Synthetic data generation for the causal inference task
Prompting and evaluation pipelines for multiple LLMs
Bayesian coherence and accuracy metrics
Statistical analysis and visualization scripts
