Published: May 19, 2023 · License: CC BY
Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Shunyu Yao (Princeton University), Dian Yu (Google DeepMind), Jeffrey Zhao (Google DeepMind), Izhak Shafran (Google DeepMind), Thomas L. Griffiths (Princeton University), Yuan Cao (Google DeepMind), Karthik Narasimhan (Princeton University)
Abstract
Language models are increasingly being deployed for general problem solving
across a wide range of tasks, but are still confined to token-level, left-to-right
decision-making processes during inference. This means they can fall short in
tasks that require exploration, strategic lookahead, or where initial decisions play
a pivotal role. To surmount these challenges, we introduce a new framework for
language model inference, “Tree of Thoughts” (ToT), which generalizes over the
popular “Chain of Thought” approach to prompting language models, and enables
exploration over coherent units of text (“thoughts”) that serve as intermediate steps
toward problem solving. ToT allows LMs to perform deliberate decision making
by considering multiple different reasoning paths and self-evaluating choices to
decide the next course of action, as well as looking ahead or backtracking when
necessary to make global choices. Our experiments show that ToT significantly
enhances language models’ problem-solving abilities on three novel tasks requiring
non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords.
For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only
solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all
prompts: https://github.com/ysymyth/tree-of-thought-llm.
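The search procedure the abstract describes — proposing multiple candidate "thoughts" at each step, self-evaluating them, and keeping only the most promising partial solutions — can be illustrated with a minimal breadth-first-search sketch. The LM calls are stubbed out with toy functions here: in the paper's actual setting, `propose_thoughts` and `score_thought` would each be a prompted language-model call, and the function names, parameters, and toy numeric task below are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of a Tree-of-Thoughts style breadth-first search.
# LM calls are replaced by toy stubs; a "thought" here is one integer
# step appended to a partial solution, and the task is to reach a
# target sum. All names and the toy task are illustrative.

def propose_thoughts(state, k=3):
    """Stub generator: extend a partial solution with k candidate steps."""
    return [state + [step] for step in range(k)]

def score_thought(state, goal):
    """Stub evaluator: higher is better (closer to the target sum)."""
    return -abs(sum(state) - goal)

def tot_bfs(goal, depth=4, breadth=2, k=3):
    """At each level, expand every frontier state and keep the
    `breadth` highest-scoring partial solutions (beam-style BFS)."""
    frontier = [[]]  # start from the empty partial solution
    for _ in range(depth):
        candidates = [c for s in frontier for c in propose_thoughts(s, k)]
        candidates.sort(key=lambda s: score_thought(s, goal), reverse=True)
        frontier = candidates[:breadth]  # prune to the best `breadth` states
    return frontier[0]

best = tot_bfs(goal=5)  # the returned path's steps sum to the goal
```

The key contrast with chain-of-thought decoding is visible in the pruning step: instead of committing to a single left-to-right continuation, the search keeps several alternatives alive at each level and lets the evaluator decide which to pursue.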
1 Introduction
Originally designed to generate text, scaled-up versions of language models (LMs) such as GPT [22, 23, 1, 20] and PaLM [5] have been shown to be increasingly capable of performing an ever wider
range of tasks requiring mathematical, symbolic, commonsense, and knowledge reasoning. It is
perhaps surprising that underlying all this progress is still the original autoregressive mechanism for
generating text, which makes token-level decisions one by one and in a left-to-right fashion. Is such
a simple mechanism sufficient for an LM to be built toward a general problem solver? If not, what
problems would challenge the current paradigm, and what should the alternative mechanisms be?
The literature on human cognition provides some clues to answer these questions. Research on “dual
process” models suggests that people have two modes in which they engage with decisions – a fast,
automatic, unconscious mode (“System 1”) and a slow, deliberate, conscious mode (“System 2”)
[27, 28, 13, 12]. These two modes have previously been connected to a variety of mathematical
models used in machine learning. For example, research on reinforcement learning in humans and
other animals has explored the circumstances under which they engage in associative “model free”
learning or more deliberative “model based” planning [6]. The simple associative token-level choices
of LMs are also reminiscent of “System 1”, and thus might benefit from augmentation by a more
deliberate “System 2” planning process that (1) maintains and explores diverse alternatives for current
Preprint. Under review.
arXiv:2305.10601v1 [cs.CL] 17 May 2023