According to a recent LangChain blog post, agent planning remains a significant challenge for developers working with large language models (LLMs). This article looks at what planning and reasoning mean for agents, the fixes developers use today, and what to expect in the future.
What exactly do we mean by planning and reasoning?
Planning and reasoning refer to the LLM’s ability to decide on a course of action based on the information available to it. This covers both the immediate next step and the longer-horizon sequence that follows: the LLM weighs all available data, commits to the first action to take now, and then determines subsequent actions as results come in.
Most developers use function calling to let the LLM choose its actions. Function calling, first introduced by OpenAI in June 2023, lets developers provide JSON schemas for the available functions so that the LLM’s output conforms to one of those schemas. Function calling works well for selecting the immediate action, but long-term planning remains a significant challenge, since the LLM must reason over a longer time horizon while still managing short-term tasks.
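For illustration, here is a minimal sketch of function calling with the OpenAI Python SDK (v1.x); the tool name, schema, and prompt are invented for this example and are not from the blog post.

```python
# Minimal function-calling sketch with the OpenAI Python SDK (v1.x).
# The get_weather tool is a hypothetical example, not from the article.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model chose a tool, its arguments arrive as JSON matching the schema.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)
```

Each call like this selects only the next immediate action; stringing many of them together over a long horizon is where planning gets hard.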
Current fixes to improve agent planning
One of the simplest fixes is to make sure the LLM has all the information it needs to reason and plan appropriately. Prompts often simply do not contain enough information for the model to make a sensible decision. Adding a retrieval or search step, or clarifying the prompt instructions, can greatly improve the results, as in the sketch below.
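As a rough illustration, the following sketch prepends a retrieval step before the model call; `retrieve` and `llm` are hypothetical stand-ins for a search index and a model client, not any particular library’s API.

```python
# Sketch: gather missing context with a search step before asking the
# model to plan. Both helpers below are stubs for real components.
def retrieve(query: str) -> str:
    """Fetch relevant documents from a search index (stubbed here)."""
    return "retrieved context goes here"

def llm(prompt: str) -> str:
    """Call your model of choice (stubbed here)."""
    return "model output goes here"

def answer(question: str) -> str:
    context = retrieve(question)  # search step: fill the information gap
    prompt = (
        "Use only the context below to decide what to do next.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return llm(prompt)

print(answer("What changed in the latest release?"))
```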
Another recommendation is to change the cognitive architecture of the application. Cognitive architectures fall into two camps: general-purpose and domain-specific. General-purpose architectures such as plan-and-solve and Reflexion offer broadly applicable strategies for better reasoning, but they can be too generic for production use, which is why domain-specific cognitive architectures are often preferred.
General Purpose vs. Domain Specific Cognitive Architectures
General-purpose cognitive architectures aim to improve reasoning across the board and can be applied to any task. For example, the plan-and-solve architecture first produces a plan and then executes it step by step, while the Reflexion architecture adds an explicit reflection phase after the task is completed, in which the agent evaluates whether its output was correct.
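A minimal sketch of the plan-and-solve pattern, with a Reflexion-style self-critique tacked on at the end; the `llm` helper is a hypothetical stub, not any particular library’s API.

```python
# Plan-and-solve sketch: elicit a plan first, then execute each step,
# then reflect on the output (Reflexion-style). `llm` is a stub.
def llm(prompt: str) -> str:
    """Stubbed model call; swap in a real LLM client."""
    return "1. Gather data\n2. Summarize findings"

def plan_and_solve(task: str) -> list[str]:
    # Planning phase: elicit an explicit, numbered plan up front.
    plan = llm(f"Devise a step-by-step plan for: {task}")
    steps = [line.strip() for line in plan.splitlines() if line.strip()]

    # Execution phase: carry out each step, feeding earlier results forward.
    results = []
    for step in steps:
        results.append(llm(f"Task: {task}\nResults so far: {results}\nNow do: {step}"))

    # Reflection phase: have the model critique its own output.
    results.append(llm(f"Review these results for mistakes: {results}"))
    return results

print(plan_and_solve("write a market summary"))
```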
Domain-specific cognitive architectures, on the other hand, are tailored to a specific task and often include domain-specific classification, routing, and validation steps. The AlphaCodium paper demonstrates this with what it calls flow engineering: a predefined sequence of steps such as generating tests, proposing a solution, and iterating against further tests. The flow is highly specific to the problem at hand and would not transfer to unrelated tasks.
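The loop below sketches that kind of flow in spirit; the function names and stub bodies are illustrative stand-ins, not the paper’s actual implementation.

```python
# Flow-engineering sketch in the spirit of AlphaCodium: derive tests,
# draft a solution, iterate until the tests pass. All stubs are illustrative.
def generate_tests(problem: str) -> list[str]:
    return ["assert add_one(1) == 2"]  # stub: would come from an LLM call

def propose_solution(problem: str, feedback: str) -> str:
    return "def add_one(x): return x + 1"  # stub: would come from an LLM call

def run_tests(solution: str, tests: list[str]) -> tuple[bool, str]:
    return True, ""  # stub: would execute the tests in a sandbox

def solve(problem: str, max_rounds: int = 5) -> str:
    tests = generate_tests(problem)                     # step 1: tests from the spec
    feedback, solution = "", ""
    for _ in range(max_rounds):
        solution = propose_solution(problem, feedback)  # step 2: draft a solution
        passed, feedback = run_tests(solution, tests)   # step 3: validate it
        if passed:
            break                                       # hard-coded transition: done
    return solution

print(solve("implement add_one"))
```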
Why are domain-specific cognitive architectures so useful?
Domain-specific cognitive architectures help by providing explicit guidance, either through focused prompt instructions or through transitions hard-coded in the application. This effectively takes some of the planning burden off the LLM and hands it to the engineer. In the AlphaCodium flow above, for instance, the steps are predefined to guide the LLM through the process.
Almost all advanced agents in production are highly domain-specific and built on custom cognitive architectures. LangChain aims to make these architectures easier to build with LangGraph, which is designed for high controllability, something the company considers essential for reliable custom cognitive architectures.
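As a rough illustration, the following LangGraph sketch hard-codes the transitions of a small classify–route–respond architecture; the node logic is stubbed, and exact API details may differ across LangGraph versions.

```python
# LangGraph sketch of a custom cognitive architecture with hard-coded
# transitions. Node bodies are stubs; APIs may vary by LangGraph version.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class State(TypedDict):
    question: str
    category: str
    answer: str

def classify(state: State) -> dict:
    # Domain-specific classification step (stubbed; normally an LLM call).
    return {"category": "search" if "latest" in state["question"] else "respond"}

def search(state: State) -> dict:
    return {"answer": "result from a search tool"}  # stubbed tool call

def respond(state: State) -> dict:
    return {"answer": state.get("answer") or "direct answer"}  # stubbed LLM call

graph = StateGraph(State)
graph.add_node("classify", classify)
graph.add_node("search", search)
graph.add_node("respond", respond)
graph.set_entry_point("classify")
# The engineer, not the LLM, declares the possible transitions:
graph.add_conditional_edges("classify", lambda s: s["category"],
                            {"search": "search", "respond": "respond"})
graph.add_edge("search", "respond")
graph.add_edge("respond", END)

app = graph.compile()
print(app.invoke({"question": "latest LLM news?", "category": "", "answer": ""}))
```

Because the edges are declared up front, the LLM only has to handle the classification decision rather than planning the entire flow itself.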
The Future of Planning and Reasoning
The LLM space has been evolving rapidly, and this trend is expected to continue. General-purpose reasoning will increasingly be absorbed into the model layer, yielding models that are more capable and can handle larger contexts. Even so, there will always be a need to give agents specific guidance, whether through prompts or through custom cognitive architectures.
LangChain is optimistic about the future of LangGraph, believing that as LLMs improve, the need for tailored architectures will persist, especially for task-specific agents. The company is committed to improving the controllability and robustness of these architectures.