[Core] Problems with the JSON Schema generation for tools #2038

Open
ThomasVitale opened this issue Jan 4, 2025 · 0 comments
ThomasVitale commented Jan 4, 2025

Expected Behavior

When tools are used, the JSON that the framework generates, whether to send to the model or to call a tool, is always correct and compliant.

Current Behavior

JSON Schema generation for tools currently has a few issues.

  1. Depending on whether a tool is defined as a Function or as a Method, the framework uses two different libraries to generate the JSON Schema. For Functions/Suppliers/Consumers, it uses the jsonschema-generator library, which is also the one used to support Structured Outputs. For Methods, it uses the jackson-module-jsonSchema library. Because the two libraries sometimes produce different results, the same tool can succeed or fail depending on whether it is defined as a function or as a method.
  2. The jackson-module-jsonSchema library is a risky dependency: it is no longer maintained and won't be developed further. It doesn't support the latest drafts of the JSON Schema specification, and it won't be included in the upcoming Jackson 3.x, as described in the project's README file, which also recommends that users migrate to another library.
  3. Regardless of whether tools are defined as Functions or Methods, when Lists are used as input arguments, the generated schema is incomplete: it is missing the type of the items in the list. The consequence varies depending on the model integration. OpenAI fails right away on the first call, returning an error message like the following: Invalid schema for function '<myFunction>': In context=('properties', 'myList'), array schema missing items. Ollama doesn't complain, but Jackson then fails when trying to deserialise the JSON data into a Java class, because the schema is not correct (com.fasterxml.jackson.databind.exc.MismatchedInputException: Cannot deserialize value of type 'java.util.ArrayList<java.lang.Object>' from String value (token 'JsonToken.VALUE_STRING')). Examples are available here and here (check the /chat/function/list and /chat/method/list endpoints). The jsonschema-generator library has a solution for this problem, but jackson-module-jsonSchema doesn't (see "Incomplete type definition for array/Collection properties", FasterXML/jackson-module-jsonSchema#45).

I experimented with other solutions and settled on one based on the jsonschema-generator library, with an implementation that also generates the JSON Schema correctly for collections/arrays. See MethodToolCallback.java. A working example is available here.
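To make the List problem from point 3 concrete, here is a minimal, standard-library-only sketch of the difference between an incomplete and a complete array schema. It is not the Spring AI implementation and it uses neither of the two libraries mentioned above. The class `ListSchemaSketch`, the tool method `joinNames`, and the helpers `schemaFor` and `schemaForFirstParameter` are all hypothetical names made up for illustration. The idea is only that the element type of a `List<String>` argument is available through reflection, so a generator can emit it as `items`.

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;
import java.util.Map;

public class ListSchemaSketch {

    // Hypothetical tool method with a List argument (illustration only).
    public static String joinNames(List<String> myList) {
        return String.join(", ", myList);
    }

    // Small, non-exhaustive mapping from Java types to JSON Schema type names.
    private static final Map<Class<?>, String> SIMPLE_TYPES = Map.of(
            String.class, "string",
            Integer.class, "integer",
            Long.class, "integer",
            Double.class, "number",
            Boolean.class, "boolean");

    // Builds a JSON Schema fragment for a (possibly generic) Java type.
    public static String schemaFor(Type type) {
        if (type instanceof Class && SIMPLE_TYPES.containsKey(type)) {
            return "{\"type\":\"" + SIMPLE_TYPES.get(type) + "\"}";
        }
        if (type instanceof ParameterizedType) {
            ParameterizedType parameterized = (ParameterizedType) type;
            Type raw = parameterized.getRawType();
            if (raw instanceof Class && List.class.isAssignableFrom((Class<?>) raw)) {
                // The element type survives in the generic signature; emitting
                // it as "items" is what makes the array schema complete.
                Type itemType = parameterized.getActualTypeArguments()[0];
                return "{\"type\":\"array\",\"items\":" + schemaFor(itemType) + "}";
            }
        }
        if (type instanceof Class && List.class.isAssignableFrom((Class<?>) type)) {
            // Raw List: the element type is unknown, so "items" cannot be
            // emitted. This is the incomplete shape described in point 3.
            return "{\"type\":\"array\"}";
        }
        // Fallback for everything this sketch does not model.
        return "{\"type\":\"object\"}";
    }

    // Looks up a public method of this class taking a single List argument
    // and returns the schema of that argument.
    public static String schemaForFirstParameter(String methodName) {
        try {
            Method method = ListSchemaSketch.class.getMethod(methodName, List.class);
            return schemaFor(method.getGenericParameterTypes()[0]);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("Unknown method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Complete schema, derived from the generic parameter type.
        System.out.println(schemaForFirstParameter("joinNames"));
        // Incomplete schema, as produced when only the raw type is considered.
        System.out.println(schemaFor(List.class));
    }
}
```

The first line printed is `{"type":"array","items":{"type":"string"}}`, the second is `{"type":"array"}`. The second shape is the kind of schema that OpenAI rejects with the "array schema missing items" error quoted above.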
