[DRAFT] Advancing Tool Support in Spring AI #2049

Open
ThomasVitale opened this issue Jan 6, 2025 · 0 comments

This issue presents ideas and suggestions for consolidating and advancing tool support in Spring AI, before reaching the first GA version.

Context

When talking about tool support, we can identify two main steps: tool registration and tool invocation.

Tool Registration

The goal of this step is to provide all the necessary information for the chat model to decide if and when to call tools, and for the client application to know how to invoke them.

Tool Metadata

When registering a tool, the following metadata are required because the chat model relies on them:

  • Name -> The name of the tool (unique among all the tools made available to a chat model).
  • Description -> Detailed information on what the tool does, allowing the chat model to determine when to call it.
  • Input Type Schema -> The schema (e.g. JSON Schema/OpenAPI) that the tool input must comply with.

Optionally, the following information can be provided as well:

  • Execution -> Whether to run the tool sync/async, blocking/non-blocking.
  • Response Format -> How to parse the tool call result into a string to be sent back to the chat model.
  • Return Direct -> Whether the result of the invocation of this tool should be returned directly to the user, instead of passing it back to the chat model.
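
To make the required metadata concrete, here is a minimal sketch of what could be registered for a `booksByAuthor` tool. The `ToolInfo` record and the hand-written JSON Schema are assumptions made for illustration only; they are not existing Spring AI types.

```java
// Hypothetical holder for the metadata listed above; not a Spring AI type.
record ToolInfo(String name, String description, String inputTypeSchema, boolean returnDirect) {

    ToolInfo {
        // Name, description and schema are required: the chat model relies on them.
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("name is required");
        }
        if (description == null || description.isBlank()) {
            throw new IllegalArgumentException("description is required");
        }
        if (inputTypeSchema == null || inputTypeSchema.isBlank()) {
            throw new IllegalArgumentException("inputTypeSchema is required");
        }
    }

    // Example metadata for a tool that takes a single "author" string.
    static ToolInfo booksByAuthor() {
        String schema = """
                {
                  "type": "object",
                  "properties": {
                    "author": { "type": "string" }
                  },
                  "required": ["author"]
                }
                """;
        return new ToolInfo(
                "booksByAuthor",
                "Get the list of books written by the given author available in the library",
                schema,
                false);
    }
}
```

The optional Execution and Response Format entries are left out here to keep the sketch focused on what the chat model actually sees.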

Tool Source

Methods

In the context of Java, a tool can naturally be modelled as a method.

Objects

In special cases, it might be desirable to model tools as objects and register one of their methods for invocation. One example is using functional objects (Function/BiFunction/Supplier/Consumer) as tool sources.

Tool Invocation

The goal of this step is to execute the tool identified by a chat model, using the input provided by the model itself.
Observability is an important requirement for this step.

Building The Input

When a chat model determines that a certain tool needs to be called, it provides the tool identifier (the Name defined in the Tool Registration step) and the parameters to call it (compliant with the Input Type Schema defined in the Tool Registration step).
The client application is responsible for building the Java arguments to call the tool based on the input string provided by the model.

Executing The Tool

The client application is responsible for executing the tool, i.e. running the method defined in the Tool Registration step.
The execution should happen based on the Execution mode defined in the Tool Registration step: sync/async, blocking/non-blocking, virtual threads/reactive.
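
As a rough illustration of the execution modes, the same tool logic can run synchronously on the calling thread or asynchronously via `CompletableFuture`. This is only a sketch: `ExecutionMode` and `ToolExecutor` are hypothetical names, and the reactive and virtual-thread variants are not covered.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical execution modes; the proposal does not define this enum.
enum ExecutionMode { SYNC, ASYNC }

// Hypothetical helper that runs a tool according to the registered mode.
final class ToolExecutor {

    private ToolExecutor() {
    }

    // Always returns a future so callers can handle both modes uniformly.
    static CompletableFuture<String> execute(Function<String, String> tool, String input, ExecutionMode mode) {
        if (mode == ExecutionMode.ASYNC) {
            // Runs on the common ForkJoinPool; the caller is not blocked.
            return CompletableFuture.supplyAsync(() -> tool.apply(input));
        }
        try {
            // Runs on the calling thread and completes before returning.
            return CompletableFuture.completedFuture(tool.apply(input));
        } catch (RuntimeException ex) {
            return CompletableFuture.failedFuture(ex);
        }
    }
}
```

Returning a future in both cases is one possible design choice; it keeps the calling code identical regardless of the mode chosen at registration time.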

Parsing The Tool Call Result

The client application is responsible for parsing the tool call result based on the response format logic provided in the Tool Registration step, if any, or based on a default strategy.

Concluding The Tool Call

The client application is responsible for sending the tool call result to the chat model, or for returning it directly to the user if the Return Direct option was enabled in the Tool Registration step.
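
The last two steps can be sketched as follows: the tool result is converted to a string, using a custom converter if one was registered and a default strategy otherwise, and is then routed either to the user or back to the chat model. All names here (`Destination`, `Outcome`, `ToolResultHandler`) are hypothetical, and `String.valueOf` merely stands in for a real default such as JSON serialization.

```java
import java.util.function.Function;

// Where the tool call result goes next (hypothetical type).
enum Destination { USER, CHAT_MODEL }

// The converted result together with its destination (hypothetical type).
record Outcome(String content, Destination destination) { }

final class ToolResultHandler {

    private ToolResultHandler() {
    }

    // Converts the raw result and decides where to send it.
    static Outcome conclude(Object result, Function<Object, String> converter, boolean returnDirect) {
        Function<Object, String> effective = converter;
        if (effective == null) {
            // Default strategy when no response format was registered.
            effective = String::valueOf;
        }
        String content = effective.apply(result);
        Destination destination = returnDirect ? Destination.USER : Destination.CHAT_MODEL;
        return new Outcome(content, destination);
    }
}
```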

Design

I'd like to suggest a few changes to the current solution to support the concepts expressed in the previous sections.
I have drafted some of these already; you can check out the following material.

1. Naming

Currently, the naming of tool-related APIs is inconsistent: some use the term function, others the term tool. Furthermore, function is ambiguous, since it could mean a tool or a Java Function. I suggest adopting the term tool in all tool-related APIs. That would avoid confusion (is it a generic tool or a Java function?) and would align with the naming adopted by most GenAI frameworks.

2. Tool Definition

I would evolve the current FunctionCallback API into a ToolCallback API, defined as follows. This new API would combine two responsibilities: registering the tool metadata and handling the invocation logic. For now, I'll skip the sync/async aspect, which requires a dedicated discussion.

public interface ToolCallback {

    ToolMetadata getToolMetadata();

    String call(String input);

    String call(String input, ToolContext toolContext);
}

And the ToolMetadata API:

public interface ToolMetadata {

    String name();

    String description();

    String inputTypeSchema();

    Boolean returnDirect();
}
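
To show how the two interfaces fit together, here is a minimal sketch that adapts a plain `Function<String, String>` to the proposed `ToolCallback`. The interfaces are repeated so the example compiles on its own. `ToolContext` is a simplified stand-in, and `SimpleToolMetadata` and `FunctionToolCallback` are hypothetical names introduced for this example.

```java
import java.util.Map;
import java.util.function.Function;

// Simplified stand-in for the context passed to a tool call.
record ToolContext(Map<String, Object> context) { }

interface ToolMetadata {
    String name();
    String description();
    String inputTypeSchema();
    Boolean returnDirect();
}

interface ToolCallback {
    ToolMetadata getToolMetadata();
    String call(String input);
    String call(String input, ToolContext toolContext);
}

// Hypothetical record-based metadata; accessor names match the interface.
record SimpleToolMetadata(String name, String description, String inputTypeSchema, Boolean returnDirect)
        implements ToolMetadata { }

// Hypothetical adapter from a Function to a ToolCallback.
final class FunctionToolCallback implements ToolCallback {

    private final ToolMetadata metadata;
    private final Function<String, String> function;

    FunctionToolCallback(ToolMetadata metadata, Function<String, String> function) {
        this.metadata = metadata;
        this.function = function;
    }

    @Override
    public ToolMetadata getToolMetadata() {
        return metadata;
    }

    @Override
    public String call(String input) {
        return function.apply(input);
    }

    @Override
    public String call(String input, ToolContext toolContext) {
        // This sketch ignores the context and delegates to the simple variant.
        return call(input);
    }
}
```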

3. Schema Generation

I have already covered this topic in #2038, where I also suggest a possible solution.

4. Tool Registration

Let's consider two targets for tool registrations: methods and functional objects (Function/BiFunction/Supplier/Consumer).

In general, I would introduce a new ToolCallbackProvider API to abstract away the specifics of how tools are sourced and support handling more than one tool. That eliminates the need for a FunctionCallback.Builder abstraction, which is hard to maintain considering the very different steps to register tools from different sources.

public interface ToolCallbackProvider {

    ToolCallback[] getToolCallbacks();

}

For example, there could be a MethodToolCallbackProvider and an McpToolCallbackProvider.
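
A provider that simply wraps a fixed set of callbacks might look like the sketch below. To keep it short, `ToolCallback` is reduced to a name and a `call` method, and `StaticToolCallbackProvider` (including its `find` helper) is a hypothetical name, not part of the proposal.

```java
import java.util.Arrays;
import java.util.Optional;

// Reduced version of the proposed ToolCallback, for illustration only.
interface ToolCallback {
    String name();
    String call(String input);
}

interface ToolCallbackProvider {
    ToolCallback[] getToolCallbacks();
}

// Hypothetical provider backed by a fixed array of callbacks.
final class StaticToolCallbackProvider implements ToolCallbackProvider {

    private final ToolCallback[] callbacks;

    StaticToolCallbackProvider(ToolCallback... callbacks) {
        // Defensive copy so later changes to the caller's array have no effect.
        this.callbacks = callbacks.clone();
    }

    @Override
    public ToolCallback[] getToolCallbacks() {
        return callbacks.clone();
    }

    // Convenience lookup by tool name across everything this provider offers.
    Optional<ToolCallback> find(String name) {
        return Arrays.stream(callbacks)
                .filter(callback -> callback.name().equals(name))
                .findFirst();
    }
}
```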

Methods

Methods should be the primary way of registering tools, and it should be possible to do it in a type-safe manner. I suggest introducing a @Tool annotation to mark methods in a class that can be made available to chat models.

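class MyTools {

    private final BookService bookService;

    MyTools(BookService bookService) {
        this.bookService = bookService;
    }

    @Tool("Get the list of books written by the given author available in the library")
    List<Book> booksByAuthor(String author) {
        return bookService.getBooksByAuthor(new Author(author));
    }

}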

It should be possible to generate ToolCallback instances from @Tool-annotated methods automatically, no matter if they are public, private, protected or package-scoped.

  • An object with @Tool-annotated methods can be processed directly and for each of them a ToolCallback instance is created internally. For example, given a myTools instance of the MyTools class, I can pass it when calling a model via ChatClient or via ChatModel.
@GetMapping("/chat")
String chat(String authorName) {
    return chatClient.prompt()
            .user("What books written by %s are available in the library?".formatted(authorName))
            .tools(myTools)
            .call()
            .content();
}
  • A class registered as a bean with Spring can be dynamically resolved and any method annotated with @Tool would be used to generate ToolCallback instances internally.
@GetMapping("/chat")
String chat(String authorName) {
    return chatClient.prompt()
            .user("What books written by %s are available in the library?".formatted(authorName))
            .tools(MyTools.class)
            .call()
            .content();
}
  • The MyTools.class could also be processed at build-time, and the related ToolCallback instances made available as a "tool catalog" throughout the application lifecycle. In that case, it would also be possible to refer to a tool by its name, and under the hood it would be processed similarly to the previous step.
@GetMapping("/chat")
String chat(String authorName) {
    return chatClient.prompt()
            .user("What books written by %s are available in the library?".formatted(authorName))
            .tools("booksByAuthor")
            .call()
            .content();
}
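
As a minimal sketch of how @Tool-annotated methods could be discovered and wrapped, the example below uses plain reflection. It rests on several assumptions: the `@Tool` annotation is a local stand-in for the proposed one, `MethodToolScanner` and `LibraryTools` are hypothetical names, only methods with a single `String` parameter are handled, and the raw model input is passed through without any schema-based parsing.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Local stand-in for the proposed annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Tool {
    String value();
}

// Example tool class; the method is private on purpose.
class LibraryTools {
    @Tool("Get the list of books written by the given author available in the library")
    private String booksByAuthor(String author) {
        return "Books by " + author;
    }
}

final class MethodToolScanner {

    private MethodToolScanner() {
    }

    // Maps each tool name (the method name) to a function that invokes the method.
    static Map<String, Function<String, String>> scan(Object toolObject) {
        Map<String, Function<String, String>> tools = new LinkedHashMap<>();
        for (Method method : toolObject.getClass().getDeclaredMethods()) {
            if (!method.isAnnotationPresent(Tool.class)) {
                continue;
            }
            // Simplification: skip anything that is not a single-String-argument method.
            if (method.getParameterCount() != 1 || method.getParameterTypes()[0] != String.class) {
                continue;
            }
            // Allows private, protected and package-scoped methods to be invoked.
            method.setAccessible(true);
            tools.put(method.getName(), input -> invoke(method, toolObject, input));
        }
        return tools;
    }

    private static String invoke(Method method, Object target, String input) {
        try {
            return String.valueOf(method.invoke(target, input));
        } catch (IllegalAccessException | InvocationTargetException ex) {
            throw new IllegalStateException("Tool invocation failed: " + method.getName(), ex);
        }
    }
}
```

Calling `MethodToolScanner.scan(new LibraryTools())` yields a single entry keyed by `booksByAuthor`. A real implementation would also derive the description and input schema from the method signature.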

Functional Objects

It should be possible to generate ToolCallback instances from functional objects (Function/BiFunction/Supplier/Consumer), no matter if they are defined in a configuration class with the @Bean annotation or as @Component-annotated classes.

@Configuration(proxyBeanMethods = false)
class Functions {

    @Bean
    @Description("Get the list of books written by the given author available in the library")
    public Function<Author, List<Book>> booksByAuthor(BookService bookService) {
        return author -> bookService.getBooksByAuthor(author);
    }

}
  • At build time, these functional objects would be registered with the Spring context. At runtime, they could be retrieved by name when matching a requested tool name. The tool handling logic would first check whether a ToolCallback with the given name is already available to the chat model (whatever its source, method or functional object). If not, it would try to retrieve a bean from the Spring context with the same name as the tool.
@GetMapping("/chat")
String chat(String authorName) {
    return chatClient.prompt()
            .user("What books written by %s are available in the library?".formatted(authorName))
            .tools("booksByAuthor")
            .call()
            .content();
}
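
The lookup order described above can be sketched with two maps, one for callbacks already made available to the chat model and one standing in for the Spring context. `ToolResolver` is a hypothetical name, and tools are simplified to `Function<String, String>` so the example needs no Spring dependency.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical resolver illustrating the lookup order described above.
final class ToolResolver {

    // Callbacks already made available to the chat model, keyed by tool name.
    private final Map<String, Function<String, String>> registeredCallbacks;

    // Stand-in for the Spring context: functional beans keyed by bean name.
    private final Map<String, Function<String, String>> functionBeans;

    ToolResolver(Map<String, Function<String, String>> registeredCallbacks,
                 Map<String, Function<String, String>> functionBeans) {
        this.registeredCallbacks = Map.copyOf(registeredCallbacks);
        this.functionBeans = Map.copyOf(functionBeans);
    }

    Optional<Function<String, String>> resolve(String toolName) {
        // 1. Prefer a callback that is already registered, whatever its source.
        Function<String, String> callback = registeredCallbacks.get(toolName);
        if (callback != null) {
            return Optional.of(callback);
        }
        // 2. Otherwise fall back to a bean named like the tool.
        return Optional.ofNullable(functionBeans.get(toolName));
    }
}
```

With this ordering, a method-based tool and a functional bean that share a name do not conflict: the already registered callback wins.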

5. Tool Invocation

To be continued...
