[Feature]: GPT-OSS harmony format support

#23217

9 days ago

No assignee

help wantedgood first issuefeature request

heheda12345

opened 9 days ago

Author

From the view of API server, GPT-OSS introduces the following features:

Builtin tool call: tool calls that happens inside chain of thought. It is different from most existing models where tool call only exists in the output to users.
Harmony: a new text format to represent the chain of thought, tool calls, etc. vLLM needs to implement the parsing between OpenAI API <-> harmony <-> model input/output tokens

vLLM has basic support of the above features on response API now. But as shown in https://docs.vllm.ai/projects/recipes/en/latest/OpenAI/GPT-OSS.html#harmony-format-support , the response API with streaming, and chat completion is still in an early stage. And help wanted on completing these features!

No response

No response

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

9 days ago

No assignee

help wantedgood first issuefeature request

#23217

heheda12345

opened 9 days ago

Author

From the view of API server, GPT-OSS introduces the following features:

Builtin tool call: tool calls that happens inside chain of thought. It is different from most existing models where tool call only exists in the output to users.
Harmony: a new text format to represent the chain of thought, tool calls, etc. vLLM needs to implement the parsing between OpenAI API <-> harmony <-> model input/output tokens

No response

No response

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.