[Feature]: GPT-OSS harmony format support
Issue Details
🚀 The feature, motivation and pitch
From the view of API server, GPT-OSS introduces the following features:
- Builtin tool call: tool calls that happens inside chain of thought. It is different from most existing models where tool call only exists in the output to users.
- Harmony: a new text format to represent the chain of thought, tool calls, etc. vLLM needs to implement the parsing between
OpenAI API <-> harmony <-> model input/output tokens
vLLM has basic support of the above features on response API now. But as shown in https://docs.vllm.ai/projects/recipes/en/latest/OpenAI/GPT-OSS.html#harmony-format-support , the response API with streaming, and chat completion is still in an early stage. And help wanted on completing these features!
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
Issue Details
[Feature]: GPT-OSS harmony format support
🚀 The feature, motivation and pitch
From the view of API server, GPT-OSS introduces the following features:
- Builtin tool call: tool calls that happens inside chain of thought. It is different from most existing models where tool call only exists in the output to users.
- Harmony: a new text format to represent the chain of thought, tool calls, etc. vLLM needs to implement the parsing between
OpenAI API <-> harmony <-> model input/output tokens
vLLM has basic support of the above features on response API now. But as shown in https://docs.vllm.ai/projects/recipes/en/latest/OpenAI/GPT-OSS.html#harmony-format-support , the response API with streaming, and chat completion is still in an early stage. And help wanted on completing these features!
Alternatives
No response
Additional context
No response
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.