This issue presents ideas and suggestions for consolidating and advancing tool support in Spring AI, before reaching the first GA version.
Context
When talking about tool support, we can identify two main steps: tool registration and tool invocation.
Tool Registration
The goal of this step is to provide all the necessary information for the chat model to decide if and when to call tools, and for the client application to know how to invoke them.
Tool Metadata
When registering a tool, the following metadata are required because they are the ones used by a chat model:
Name -> The name of the tool (unique among all the tools made available to a chat model).
Description -> Detailed information on what the tool does, allowing the chat model to determine when to call it.
Input Type Schema -> The Schema (e.g. JSON Schema/OpenAPI) the tool input should be compliant with.
Optionally, the following information can be provided as well:
Execution -> Whether to run the tool sync/async, blocking/non-blocking.
Response Format -> How to parse the tool call result into a string to be sent back to the chat model.
Return Direct -> Whether the result of the invocation of this tool should be returned directly to the user, instead of passing it back to the chat model.
Tool Source
Methods
In the context of Java, a tool can naturally be modelled as a method.
Objects
In special cases, it may be desirable to model tools as objects and register one of their methods for the invocation. One example would be using functional objects (Function/BiFunction/Supplier/Consumer) as tool sources.
Tool Invocation
The goal of this step is to execute the tool identified by a chat model, using the input provided by the model itself.
Observability is an important requirement for this step.
Building The Input
When a chat model determines that a certain tool needs to be called, it provides the tool identifier (the Name defined in the Tool Registration step) and the parameters to call it (compliant with the Input Type Schema defined in the Tool Registration step).
The client application is responsible for building the Java arguments to call the tool based on the input string provided by the model.
Executing The Tool
The client application is responsible for executing the tool, i.e. running the method defined in the Tool Registration step.
The execution should happen based on the Execution mode defined in the Tool Registration step: sync/async, blocking/non-blocking, virtual threads/reactive.
Parsing The Tool Call Result
The client application is responsible for parsing the tool call result based on the response format logic provided in the Tool Registration step, if any, or based on a default strategy.
Concluding The Tool Call
The client application is responsible for sending the tool call result to the chat model, or for returning it directly to the user if the Return Direct option was enabled in the Tool Registration step.
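The four invocation steps above can be sketched in plain Java. This is only an illustration under stated assumptions, not Spring AI code: `ToolCall`, `RegisteredTool`, `Outcome` and `handle` are hypothetical names, and the model input is treated as an opaque string rather than schema-validated JSON.

```java
import java.util.Map;
import java.util.function.Function;

public class ToolInvocationSketch {

    // Hypothetical: what the chat model sends back when it wants a tool to run.
    record ToolCall(String name, String input) {}

    // Hypothetical: a registered tool with its input parser, its logic,
    // its response format logic and the Return Direct flag.
    record RegisteredTool(Function<String, Object> inputParser,
                          Function<Object, Object> logic,
                          Function<Object, String> resultFormatter,
                          boolean returnDirect) {}

    // Hypothetical: the formatted result plus where it should go next.
    record Outcome(String result, boolean sendToUser) {}

    static Outcome handle(ToolCall call, Map<String, RegisteredTool> registry) {
        RegisteredTool tool = registry.get(call.name());
        if (tool == null) {
            throw new IllegalArgumentException("Unknown tool: " + call.name());
        }
        Object argument = tool.inputParser().apply(call.input());  // 1. build the input
        Object rawResult = tool.logic().apply(argument);           // 2. execute the tool
        String result = tool.resultFormatter().apply(rawResult);   // 3. parse the result
        return new Outcome(result, tool.returnDirect());           // 4. conclude the call
    }

    public static void main(String[] args) {
        RegisteredTool doubler = new RegisteredTool(
                Integer::parseInt, n -> (Integer) n * 2, String::valueOf, true);
        Outcome outcome = handle(new ToolCall("double", "21"), Map.of("double", doubler));
        System.out.println(outcome);
    }
}
```

When `sendToUser` is false, the caller would append the result to the conversation and call the chat model again. The execution mode (sync/async) is deliberately left out of this sketch.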
Design
I'd like to suggest a few changes to the current solution to support the concepts expressed in the previous sections.
I have already drafted some of these; you can check out the following material.
1. Naming
Currently, tool-related APIs are named inconsistently. In some cases we use the term function, in other cases we use the term tool. Furthermore, function could mean either a tool or a Java Function. I suggest adopting the term tool in all tool-related APIs. That would avoid confusion (is it a generic tool or is it a Java function?) and would align with the naming adopted by most GenAI frameworks.
2. Tool Definition
I would evolve the current FunctionCallback API into a ToolCallback API, accompanied by a ToolMetadata API. The new ToolCallback API would combine two responsibilities: registering the tool metadata and the invocation handling logic. For now, I'll skip the sync/async aspect, which requires a dedicated discussion.
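As a rough illustration of how the two responsibilities could sit together, here is a minimal sketch. It is an assumption and not the actual Spring AI API: the method names `getToolMetadata` and `call`, and the exact fields of `ToolMetadata`, are hypothetical and derived only from the metadata listed under Tool Metadata.

```java
public class ToolCallbackSketch {

    // Hypothetical metadata holder: name, description and input schema are required,
    // returnDirect is one of the optional settings.
    record ToolMetadata(String name, String description, String inputTypeSchema, boolean returnDirect) {
        ToolMetadata {
            if (name == null || name.isBlank()) {
                throw new IllegalArgumentException("name is required");
            }
            if (description == null || description.isBlank()) {
                throw new IllegalArgumentException("description is required");
            }
            if (inputTypeSchema == null || inputTypeSchema.isBlank()) {
                throw new IllegalArgumentException("inputTypeSchema is required");
            }
        }
    }

    // Hypothetical contract: metadata for registration plus the invocation handling logic.
    interface ToolCallback {
        ToolMetadata getToolMetadata();

        // Receives the raw input string produced by the model, returns the result as a string.
        String call(String toolInput);
    }

    // A trivial tool that returns its input in upper case.
    static ToolCallback upperCaseTool() {
        ToolMetadata metadata = new ToolMetadata(
                "upperCase", "Converts the given text to upper case", "{\"type\":\"string\"}", false);
        return new ToolCallback() {
            @Override
            public ToolMetadata getToolMetadata() {
                return metadata;
            }

            @Override
            public String call(String toolInput) {
                return toolInput.toUpperCase();
            }
        };
    }

    public static void main(String[] args) {
        ToolCallback tool = upperCaseTool();
        System.out.println(tool.getToolMetadata().name() + " -> " + tool.call("spring"));
    }
}
```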
3. Schema Generation
I have already covered this topic in #2038, where I also suggest a possible solution.
4. Tool Registration
Let's consider two targets for tool registrations: methods and functional objects (Function/BiFunction/Supplier/Consumer).
In general, I would introduce a new ToolCallbackProvider API to abstract away the specifics of how tools are sourced and support handling more than one tool. That eliminates the need for a FunctionCallback.Builder abstraction, which is hard to maintain considering the very different steps to register tools from different sources.
For example, there could be a MethodToolCallbackProvider and an McpToolCallbackProvider.
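The provider idea can be sketched as follows, assuming a deliberately reduced `ToolCallback` contract. All types here are hypothetical stand-ins: `StaticToolCallbackProvider` and `collect` are illustration-only names, not proposed API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ToolCallbackProviderSketch {

    // Hypothetical, reduced callback contract for this sketch.
    interface ToolCallback {
        String getName();
        String call(String toolInput);
    }

    // Hypothetical provider contract: one source, any number of tools.
    interface ToolCallbackProvider {
        List<ToolCallback> getToolCallbacks();
    }

    // Simplest possible provider: a fixed list of callbacks.
    record StaticToolCallbackProvider(List<ToolCallback> callbacks) implements ToolCallbackProvider {
        @Override
        public List<ToolCallback> getToolCallbacks() {
            return List.copyOf(callbacks);
        }
    }

    // Helper to build a callback from a name and a function.
    static ToolCallback tool(String name, Function<String, String> logic) {
        return new ToolCallback() {
            @Override
            public String getName() {
                return name;
            }

            @Override
            public String call(String toolInput) {
                return logic.apply(toolInput);
            }
        };
    }

    // Merges the tools of several providers, enforcing that tool names stay unique.
    static List<ToolCallback> collect(List<ToolCallbackProvider> providers) {
        Map<String, ToolCallback> byName = new LinkedHashMap<>();
        for (ToolCallbackProvider provider : providers) {
            for (ToolCallback callback : provider.getToolCallbacks()) {
                if (byName.putIfAbsent(callback.getName(), callback) != null) {
                    throw new IllegalStateException("Duplicate tool name: " + callback.getName());
                }
            }
        }
        return new ArrayList<>(byName.values());
    }

    public static void main(String[] args) {
        ToolCallbackProvider first = new StaticToolCallbackProvider(List.of(tool("echo", s -> s)));
        ToolCallbackProvider second = new StaticToolCallbackProvider(List.of(tool("shout", String::toUpperCase)));
        collect(List.of(first, second)).forEach(t -> System.out.println(t.getName()));
    }
}
```

The caller only sees a list of callbacks, so a method-based source and an MCP-based source could be mixed freely without a shared builder.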
Methods
Methods should be the primary way of registering tools, and it should be possible to do it in a type-safe manner. I suggest introducing a @Tool annotation to mark methods in a class that can be made available to chat models.
    class MyTools {

        @Tool("Get the list of books written by the given author available in the library")
        List<Book> booksByAuthor(String author) {
            return bookService.getBooksByAuthor(new Author(author));
        }

    }
It should be possible to generate ToolCallback instances from @Tool-annotated methods automatically, no matter if they are public, private, protected or package-scoped.
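One way this could work is plain reflection, sketched below under simplifying assumptions: the `@Tool` annotation and `ToolCallback` interface are local, hypothetical stand-ins, and only methods taking a single `String` parameter are supported (a real implementation would convert the model's JSON input into typed arguments). Calling `setAccessible(true)` is what allows private and package-scoped methods to be invoked.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class MethodToolSketch {

    // Hypothetical stand-in for the proposed annotation; the value is the tool description.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Tool {
        String value();
    }

    // Hypothetical, reduced callback contract for this sketch.
    interface ToolCallback {
        String getName();
        String getDescription();
        String call(String toolInput);
    }

    static class MyTools {
        @Tool("Greet the given person")
        private String greet(String name) {
            return "Hello, " + name + "!";
        }

        // Not annotated, so it must not be exposed as a tool.
        String notATool(String input) {
            return input;
        }
    }

    // Creates one callback per @Tool-annotated method, regardless of its visibility.
    static List<ToolCallback> fromObject(Object target) {
        List<ToolCallback> callbacks = new ArrayList<>();
        for (Method method : target.getClass().getDeclaredMethods()) {
            Tool tool = method.getAnnotation(Tool.class);
            if (tool == null) {
                continue;
            }
            if (method.getParameterCount() != 1 || method.getParameterTypes()[0] != String.class) {
                throw new IllegalArgumentException("Sketch supports only (String) methods: " + method);
            }
            method.setAccessible(true); // allow private, protected and package-scoped methods
            callbacks.add(new ToolCallback() {
                @Override
                public String getName() {
                    return method.getName();
                }

                @Override
                public String getDescription() {
                    return tool.value();
                }

                @Override
                public String call(String toolInput) {
                    try {
                        return String.valueOf(method.invoke(target, toolInput));
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException("Tool invocation failed: " + method.getName(), e);
                    }
                }
            });
        }
        return callbacks;
    }

    public static void main(String[] args) {
        for (ToolCallback callback : fromObject(new MyTools())) {
            System.out.println(callback.getName() + ": " + callback.call("Ada"));
        }
    }
}
```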
An object with @Tool-annotated methods can be processed directly and for each of them a ToolCallback instance is created internally. For example, given a myTools instance of the MyTools class, I can pass it when calling a model via ChatClient or via ChatModel.
    @GetMapping("/chat")
    String chat(String authorName) {
        return chatClient.prompt()
                .user("What books written by %s are available in the library?".formatted(authorName))
                .tools(myTools)
                .call()
                .content();
    }
A class registered as a bean with Spring can be dynamically resolved and any method annotated with @Tool would be used to generate ToolCallback instances internally.
    @GetMapping("/chat")
    String chat(String authorName) {
        return chatClient.prompt()
                .user("What books written by %s are available in the library?".formatted(authorName))
                .tools(MyTools.class)
                .call()
                .content();
    }
The MyTools.class could also be processed at build-time, and the related ToolCallback instances made available as a "tool catalog" throughout the application lifecycle. In that case, it would also be possible to refer to a tool by its name, and under the hood it would be processed similarly to the previous step.
    @GetMapping("/chat")
    String chat(String authorName) {
        return chatClient.prompt()
                .user("What books written by %s are available in the library?".formatted(authorName))
                .tools("booksByAuthor")
                .call()
                .content();
    }
Functional Objects
It should be possible to generate ToolCallback instances from functional objects (Function/BiFunction/Supplier/Consumer), no matter if they are defined in a configuration class with the @Bean annotation or as @Component-annotated classes.
    @Configuration(proxyBeanMethods = false)
    class Functions {

        @Bean
        @Description("Get the list of books written by the given author available in the library")
        public Function<Author, List<Book>> booksByAuthor(BookService bookService) {
            return author -> bookService.getBooksByAuthor(author);
        }

    }
At build time, these functional objects would be registered with the Spring context. At runtime, they could be retrieved by name when matching a requested tool name. The tool handling logic would first check if an existing ToolCallback is already available to a chat model with the given name (no matter the source, being method or functional object). If not, it would try to retrieve a bean from the Spring context named like the tool.
    @GetMapping("/chat")
    String chat(String authorName) {
        return chatClient.prompt()
                .user("What books written by %s are available in the library?".formatted(authorName))
                .tools("booksByAuthor")
                .call()
                .content();
    }
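The lookup order described above (already registered callbacks first, then a bean named like the tool) can be sketched like this. It is a simplified assumption: a plain `Map` stands in for the Spring context, and `resolve` and the reduced `ToolCallback` interface are hypothetical names.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

public class ToolResolutionSketch {

    // Hypothetical, reduced callback contract for this sketch.
    interface ToolCallback {
        String call(String toolInput);
    }

    // Resolves a tool name: registered callbacks win, functional beans are the fallback.
    static Optional<ToolCallback> resolve(String toolName,
                                          Map<String, ToolCallback> registeredCallbacks,
                                          Map<String, Function<String, String>> beansByName) {
        ToolCallback existing = registeredCallbacks.get(toolName);
        if (existing != null) {
            return Optional.of(existing);
        }
        // beansByName stands in for a by-name bean lookup in the Spring context.
        Function<String, String> bean = beansByName.get(toolName);
        if (bean != null) {
            ToolCallback adapted = bean::apply; // wrap the functional object as a callback
            return Optional.of(adapted);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, ToolCallback> callbacks = Map.of("echo", input -> "callback:" + input);
        Map<String, Function<String, String>> beans = Map.of(
                "echo", input -> "bean:" + input,
                "shout", String::toUpperCase);

        System.out.println(resolve("echo", callbacks, beans).map(t -> t.call("hi")).orElse("none"));
        System.out.println(resolve("shout", callbacks, beans).map(t -> t.call("hi")).orElse("none"));
        System.out.println(resolve("unknown", callbacks, beans).map(t -> t.call("hi")).orElse("none"));
    }
}
```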
5. Tool Invocation
To be continued...