New 'Bayesian Teaching' Framework Trains LLMs To Reason Probabilistically With 80% Accuracy, Transfers Across Domains
Summary
A new 'Bayesian Teaching' framework trains LLMs to reason probabilistically by fine-tuning them on interactions with an optimal Bayesian model. The trained models agree with ideal probabilistic reasoning 80% of the time, and the learned skill transfers to entirely different domains such as hotel and web-shopping recommendations.
Key Points
- LLMs struggle to perform Bayesian inference, often defaulting to simple heuristics and failing to update their probabilistic estimates of user preferences across multiple interactions.
- A new 'Bayesian Teaching' framework trains LLMs through supervised fine-tuning on interactions with an optimal Bayesian model, enabling them to agree with mathematically ideal probabilistic reasoning 80% of the time.
- Models trained on synthetic flight recommendation data successfully transfer their probabilistic reasoning skills to entirely different domains like hotel and web shopping recommendations, demonstrating genuine generalization of Bayesian logic.
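The "mathematically ideal probabilistic reasoning" the framework targets is ordinary Bayesian belief updating: maintaining a probability over hypotheses about what the user wants and revising it after each interaction. The sketch below illustrates that update on a toy flight-recommendation scenario; it is not the paper's code, and the hypothesis names and likelihood numbers are illustrative assumptions.

```python
def bayes_update(prior, likelihoods):
    """Return the posterior P(h | obs) from prior P(h) and likelihoods P(obs | h)."""
    unnorm = {h: prior[h] * likelihoods[h] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Two toy hypotheses about what a flight shopper cares about (assumed).
belief = {"prefers_cheap": 0.5, "prefers_nonstop": 0.5}

# Interaction 1: the user picks a cheap flight with a layover.
# P(that choice | hypothesis) -- assumed numbers.
belief = bayes_update(belief, {"prefers_cheap": 0.8, "prefers_nonstop": 0.2})

# Interaction 2: the user again chooses the cheapest option.
belief = bayes_update(belief, {"prefers_cheap": 0.7, "prefers_nonstop": 0.3})

print(belief)  # belief["prefers_cheap"] is now about 0.90
```

Because the update rule itself is domain-agnostic (only the hypotheses and likelihoods change), a model that has genuinely internalized it should transfer from flights to hotels or shopping, which is what the reported cross-domain results suggest.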