This story was originally published on HackerNoon at:
https://hackernoon.com/local-llms-need-more-than-openai-compatible-endpoints.
Respawn is a stateful OpenAI Responses API gateway for local LLMs, adding stored responses, tools, streaming, files and observability to Ollama.
Check more stories related to machine-learning at:
https://hackernoon.com/c/machine-learning.
You can also check exclusive content about
#ai,
#llm,
#open-source,
#ollama,
#self-hosted-ai,
#api,
#openai,
#local-ai, and more.
This story was written by:
@robertomanfreda. Learn more about this writer by checking
@robertomanfreda's about page,
and for more stories, please visit
hackernoon.com.
Local LLM servers are great at generating tokens, but modern clients expect more than inference: state, lifecycle endpoints, streaming shape, tool protocol, files, and metrics. Respawn is an open-source gateway that sits in front of Ollama/self-hosted backends and adds OpenAI Responses API semantics locally.