Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python: Feat/Add DEVELOPER role for openai o1 #10033

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

ymuichiro
Copy link
Contributor

Motivation and Context

Currently, the OpenAI O1 model introduces a new role "developer". However, passing the conventional "system" role results in an error. To address this issue, the following changes are proposed.

Description

  1. Add a Method to the ChatHistory Class

    • Implement a method to handle role conversion or substitution logic for compatibility with the O1 model.
  2. Expand the AuthorRole Enum

    • Add "developer" as a new value to the AuthorRole enum.
  3. Improve Developer Experience (UX)

    • If the "system" role is mistakenly passed, the logic should internally convert it to "developer" for better UX.
    • However, since all models other than O1 still use "system", simply adding support for the "developer" role seems to be the most optimal solution at this point.

Background

  • The O1 model no longer supports the "system" role, causing errors when it is passed.
  • The introduction of the "developer" role must be integrated in a way that preserves compatibility with other models.

Usage

import asyncio

from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai.prompt_execution_settings.azure_chat_prompt_execution_settings import (
    AzureChatPromptExecutionSettings,
)
from semantic_kernel.connectors.ai.open_ai.services.azure_chat_completion import AzureChatCompletion
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.kernel import Kernel

async def main():
    kernel = Kernel()
    service_id = "azure-openai"
    kernel.add_service(
        service=AzureChatCompletion(
            service_id=service_id,
            api_key=*********,
            endpoint=*********,
            deployment_name="o1",
            api_version="2024-12-01-preview",
        )
    )

    settings = kernel.get_prompt_execution_settings_from_service_id(service_id=service_id)

    if isinstance(settings, AzureChatPromptExecutionSettings):
        settings.function_choice_behavior = FunctionChoiceBehavior.Auto(auto_invoke=True)

    service = kernel.get_service(service_id=service_id)

    if not isinstance(service, AzureChatCompletion):
        raise Exception("Invalid Value")

    history = ChatHistory()
    history.add_developer_message("You are a helpful assistant.")
    history.add_user_message("hello")

    result = await service.get_chat_message_contents(chat_history=history, settings=settings, kernel=kernel)

    if not result:
        raise Exception("result is None")

    print(result[0].content)

if __name__ == "__main__":
    asyncio.run(main())

Contribution Checklist

@ymuichiro ymuichiro requested a review from a team as a code owner December 24, 2024 01:31
@markwallace-microsoft markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Dec 24, 2024
@moonbox3
Copy link
Contributor

Thanks for having a look at this, @ymuichiro. I think it’d be good to make sure other o1 api features are in as well. I saw they have a reasoning_effort param, too. Are there other features we’d want to get in to make sure the o1 experience works well?

@ymuichiro
Copy link
Contributor Author

@moonbox3

Thank you for checking this so quickly. Regarding o1, here are the changes I’m aware of:

※ It might be worth considering providing OpenAIChatReasoningBase as a separate service from OpenAIChatCompletionBase in the future. However, since it seems that the functionality is being gradually integrated, I believe the appropriate place to incorporate these changes for now would be in OpenAIChatPromptExecutionSettings.

o1: 2024-12-17
API: 2024-09-01-preview

  1. In o1, the system role has been deprecated, and a new developer role has been introduced.
  2. In o1, max_tokens has been deprecated and replaced with max_completion_tokens.
    max_completion_tokens: int
    • The value represents the sum of reasoning tokens and completion tokens, with a recommendation to specify 25,000 or more.
  3. A new property, reasoning_effort, has been added in o1.
    reasoning_effort: enum, low, medium, high
    • Higher values indicate that the model will spend more time processing the request.
  4. A new usage metric, reasoning_tokens, has been introduced in o1.
    • This represents the number of tokens used during reasoning.
{
  "completion_tokens": 1843,
  "prompt_tokens": 20,
  "total_tokens": 1863,
  "completion_tokens_details": {
    "audio_tokens": null,
    "reasoning_tokens": 448
  },
  "prompt_tokens_details": {
    "audio_tokens": null,
    "cached_tokens": 0
  }
}

Currently Unsupported

• Parallel tool invocation
temperature, top_p, presence_penalty, frequency_penalty, logprobs, top_logprobs, logit_bias

It would be nice if these were applied to settings, usage, and author_role.

@eavanvalkenburg
Copy link
Member

I agree @moonbox3 bit more work is needed to make sure we fully support o1, these extra settings can just be added with None defaults, but we also need to consider the rendered prompt to ChatHistory parser, that uses a system message as default, that can be left as is, but might throw when used with o1 but might also be usefull to have a look at how we can support this as well!

@ymuichiro
Copy link
Contributor Author

@moonbox3 @eavanvalkenburg

Thank you for your comment. After reviewing the current Semantic Kernel code, I found several instances where AuthorRole.SYSTEM is used as a default value. Since these involve shared classes that are used not only with OpenAI models but also with other models, I believe it is not appropriate to make changes that could impact them. Additionally, to handle these cases comprehensively, it is necessary to determine whether the model in question is a traditional ChatCompletion model or a Reasoning model.

With that in mind, I am planning to proceed with the following tasks. Does this approach make sense to you?

  1. Support for the new role AuthorRole.DEVELOPER

    Add a new inference class to automatically replace AuthorRole.SYSTEM with AuthorRole.DEVELOPER where specified.

    • Additions to model types:

      • Add REASONING to OpenAIModelTypes in semantic_kernel/connectors/ai/open_ai/services/open_ai_model_types.py.
    • Additions to inference classes:

      • Add the following three classes extending OpenAIChatCompletionBase in semantic_kernel/connectors/ai/open_ai/services/open_ai_chat_completion_base.py:
        • OpenAIChatReasoning
        • AzureChatReasoning
      • Add the following class extending AzureAIInferenceBase in semantic_kernel/connectors/ai/azure_ai_inference/services/azure_ai_inference_base.py:
        • AzureAIInferenceChatReasoning

※ Most methods can be reused, so it might be sufficient to simply override some methods of the existing OpenAIChatCompletion (...etc) class.

  1. Addition of new configuration parameters:
    1. max_completion_tokens
    2. reasoning_effort

Regarding item 2, even if certain properties cannot be used, I believe it is the responsibility of the user to ensure only valid properties are passed. Therefore, I plan to set the default values for these properties to None, so they will only be used if explicitly specified.

Furthermore, regarding the new response property reasoning_tokens, since it depends on ChatCompletionChunk from openai.types.chat, I am not including it in the current scope of work. However, as the reasoning result chunks are preserved as inner_content, this value is expected to be included as is in the current implementation.

@moonbox3
Copy link
Contributor

moonbox3 commented Jan 6, 2025

Hi @ymuichiro thanks for the follow up details. Just getting back to the office after holidays - apologies for the slow reply.

As I have a lot of items to go through, let me dig deeper into this today, and I can provide some more comments. As per @eavanvalkenburg's comment above, we do need to think about how the chat history will be handled now that o1 doesn't support the system message -- today one can specify the system message via the constructor, so we'd need to now handle this in a different way.

Another quick thought is: could we have OpenAIReasoningPromptExecutionSettings, which would contain the new properties? That way we're not trying to add too much to the chat prompt execution settings -- nothing is needing to change with the chat models. I also kind of like OpenAIReasoning instead of OpenAIChatReasoning but that might be a personal preference.

Let me circle back soon. And thank you again for all your work on this thus far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Pull requests for the Python Semantic Kernel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants