Tree of Thought: A New Approach for Language Models' Decision Making

Category Science

tldr #

Tree of Thoughts (ToT) is an approach from researchers at Princeton University and Google DeepMind that improves language models' decision making and their ability to solve non-trivial tasks by considering multiple reasoning paths at once. It achieved a success rate of 74% on the Game of 24, compared to only 4% for GPT-4 with chain-of-thought prompting, and it opens up more versatile and challenging tasks for real-world applications.

content #

Like chain of thought, Tree of Thoughts (ToT) reasons in multiple steps, but it also compares several alternative multi-step analyses against each other. After each step it can branch into more options, and it can return to the first or any earlier step to look for new ones. After searching across these alternatives, it selects the best one. In this way ToT allows language models (LMs) to perform deliberate decision making: they consider multiple reasoning paths, self-evaluate choices to decide the next course of action, and look ahead or backtrack when necessary to make global choices.
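The branching and backtracking described above can be sketched as a depth-first search over partial solutions. This is a minimal illustration, not the authors' implementation: the names `tot_dfs`, `propose`, `evaluate`, and `is_solution` are hypothetical, and the toy functions below (choosing numbers that sum to a target) stand in for what would be LM calls in a real system.

```python
from typing import Callable, List, Optional, Tuple

# A state is the sequence of thoughts chosen so far (here: the numbers picked).
State = Tuple[int, ...]


def tot_dfs(
    state: State,
    propose: Callable[[State], List[State]],
    evaluate: Callable[[State], float],
    is_solution: Callable[[State], bool],
    max_depth: int,
    prune_below: float,
) -> Optional[State]:
    """Depth-first search over thoughts with pruning and backtracking."""
    if is_solution(state):
        return state
    if len(state) >= max_depth:
        return None  # dead end: the caller backtracks
    # Try the most promising candidate thoughts first.
    for child in sorted(propose(state), key=evaluate, reverse=True):
        if evaluate(child) < prune_below:
            continue  # judged unpromising, so this branch is skipped
        found = tot_dfs(child, propose, evaluate, is_solution, max_depth, prune_below)
        if found is not None:
            return found
    return None  # every branch failed: return to an earlier step


# Toy stand-ins for the LM calls (assumptions for illustration only).
NUMBERS = (9, 5, 3, 2)
TARGET = 10


def propose(state: State) -> List[State]:
    # Only offer numbers that come after the last one picked, to avoid repeats.
    start = NUMBERS.index(state[-1]) + 1 if state else 0
    return [state + (n,) for n in NUMBERS[start:]]


def evaluate(state: State) -> float:
    # Overshooting the target is hopeless; otherwise closer is better.
    total = sum(state)
    return -1.0 if total > TARGET else total / TARGET


def is_solution(state: State) -> bool:
    return sum(state) == TARGET


result = tot_dfs((), propose, evaluate, is_solution, max_depth=4, prune_below=0.0)
print(result)  # (5, 3, 2)
```

The search first commits to 9 because it scores highest, finds that every continuation overshoots, and backtracks to try 5 instead. A greedy single-path method would have been stuck with the first choice.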

ToT requires more resources (for example, GPT-4 API cost) than standard sampling methods, but it is more flexible and lets users tune the tradeoff between performance and cost.

For instance, in Game of 24, GPT-4 with chain-of-thought prompting solved only 4% of tasks, while ToT achieved a success rate of 74%. ToT builds on three earlier prompting approaches: basic input-output prompting, chain of thought, and self-consistency with chain of thought. It extends existing planning formulations by considering multiple potentially feasible plans simultaneously at each problem-solving step and proceeding with the most promising ones. Combining thought sampling with value feedback integrates planning and decision-making mechanisms, enabling effective search inside a solution tree. Traditional decision-making procedures usually require training dedicated reward and policy models, as in reinforcement learning, whereas ToT uses the LM itself to provide the value estimates for decision making.
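To make "considering multiple plans at each step and proceeding with the most promising ones" concrete, here is a small breadth-first sketch for Game of 24, where four numbers must be combined with basic arithmetic to reach 24. It is an illustration under stated assumptions, not the paper's code: `propose`, `evaluate`, and `solve_24` are hypothetical names, and `evaluate` uses an exhaustive check as a stand-in for the LM's value estimate, which in practice would be a noisy judgment rather than an exact answer.

```python
from fractions import Fraction
from itertools import combinations
from typing import List, Optional, Tuple

# A state holds the numbers still available, each paired with the
# expression string that produced it.
State = Tuple[Tuple[Fraction, str], ...]


def propose(state: State) -> List[State]:
    """Generate every next thought: combine two numbers with one operation."""
    children: List[State] = []
    for i, j in combinations(range(len(state)), 2):
        (a, ea), (b, eb) = state[i], state[j]
        rest = tuple(s for k, s in enumerate(state) if k not in (i, j))
        options = [
            (a + b, f"({ea} + {eb})"),
            (a * b, f"({ea} * {eb})"),
            (a - b, f"({ea} - {eb})"),
            (b - a, f"({eb} - {ea})"),
        ]
        if b != 0:
            options.append((a / b, f"({ea} / {eb})"))
        if a != 0:
            options.append((b / a, f"({eb} / {ea})"))
        children.extend(rest + (opt,) for opt in options)
    return children


def can_reach_24(state: State) -> bool:
    if len(state) == 1:
        return state[0][0] == 24
    return any(can_reach_24(child) for child in propose(state))


def evaluate(state: State) -> float:
    # Stand-in for the LM value estimate (assumption): exact lookahead.
    return 1.0 if can_reach_24(state) else 0.0


def solve_24(numbers: List[int], breadth: int = 5) -> Optional[str]:
    """Breadth-first search that keeps the `breadth` best states per step."""
    frontier: List[State] = [tuple((Fraction(n), str(n)) for n in numbers)]
    while frontier and len(frontier[0]) > 1:
        candidates = [child for state in frontier for child in propose(state)]
        candidates.sort(key=evaluate, reverse=True)
        frontier = candidates[:breadth]
    for state in frontier:
        if state[0][0] == 24:
            return state[0][1]
    return None


print(solve_24([4, 9, 10, 13]))
```

Swapping `evaluate` for a real LM call is where the approach differs from classical search: no separate reward or policy model is trained, and the same model that proposes thoughts also scores them.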

In short, ToT couples thought sampling with value feedback so that search within the solution tree becomes more effective.

The Tree-of-Thought formulation is more versatile and handles challenging tasks on which GPT-4 achieves only very low accuracy with standard prompts. Deliberate search such as ToT might not be necessary for many existing tasks that GPT-4 already excels at, and as an initial step the original work explores only three relatively simple tasks that challenge GPT-4 and call for better search and planning abilities to be incorporated into LMs. However, as LMs are deployed in more real-world decision-making applications (e.g. coding, data analysis, robotics), more complex tasks could emerge and present new opportunities to study these research questions. Search methods like ToT also require more resources (e.g. GPT-4 API cost) than sampling methods in order to improve task performance, but the modular flexibility of ToT allows users to customize such performance-cost tradeoffs.
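One way to reason about the performance-cost tradeoff is to count LM calls for a breadth-first search. The sketch below is a rough estimate under simplifying assumptions that are not from the source: one proposal call per frontier state, a fixed number of candidate thoughts per call, and a fixed number of value samples per candidate. The function name and all parameter values are hypothetical.

```python
def estimate_lm_calls(
    depth: int,
    breadth: int,
    proposals_per_state: int,
    value_samples: int,
) -> int:
    """Rough count of LM calls for a breadth-first ToT search."""
    calls = 0
    frontier = 1  # the search starts from the single input state
    for _ in range(depth):
        calls += frontier  # one proposal call per state kept in the frontier
        candidates = frontier * proposals_per_state
        calls += candidates * value_samples  # each candidate is scored
        frontier = min(breadth, candidates)  # keep only the best states
    return calls


# Illustrative settings only: a wider search costs more calls.
wide = estimate_lm_calls(depth=3, breadth=5, proposals_per_state=8, value_samples=3)
narrow = estimate_lm_calls(depth=3, breadth=1, proposals_per_state=8, value_samples=3)
print(wide, narrow)  # 275 75
```

Under these assumptions, narrowing the breadth from 5 to 1 cuts the call count by more than two thirds, which is the kind of knob the modular design leaves to the user.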

In the Game of 24 experiment, GPT-4 with chain-of-thought prompting solved only 4% of the tasks, while Tree of Thoughts achieved a success rate of 74%.
