There's zero percent chance that I would proxy all my LLM calls with my API key through some third party service. However, if it was self-hostable, so that I can ensure it is only able to reach the LLM providers, I could see deploying this behind an LLM provider router. If it actually achieves the kind of token use reduction that is advertised, that would be worth paying for - especially in the enterprise. I'm skeptical of using it for product integrations, where prompts are tuned for effectiveness and efficiency, but for ad-hoc usage it probably doesn't matter too much if the phrasing affects the results a bit.
Hi! You only need our API for the compression part — API keys and LLM usage are entirely managed by your own application. We don't have access to your SaaS, and we don't even know its name. We simply receive the text through our API, compress it, and return the response to your app. Your LLM — whether local, OpenAI, Claude, or any other — then processes it using your own API keys. Your data stays safe with you. And we NEVER ask for your LLM API keys. Let me know if you have any question :)
I'm sure I'm not the only one hesitant to provide a 3rd party virtually MITM access to both my LLM usage + API keys. If this were capable of running locally, or even just an API for compressing non-sensitive parts of a prompt, I think it would be much easier to adopt.
Hi!
You only need our API for the compression part — API keys and LLM usage are entirely managed by your own application. We don't have access to your SaaS, and we don't even know its name. We simply receive the text through our API, compress it, and return the response to your app. Your LLM — whether local, OpenAI, Claude, or any other — then processes it using your own API keys. Your data stays safe with you. And we NEVER ask for your LLM API keys. Let me know if you have any question :)
No. You only need our API key for the compression step. Your LLM keys and usage stay entirely in your own app — we never see them. We receive text, compress it, and return it. Your LLM (local, OpenAI, Claude, or any other) then processes it with your own keys. We don't even know your app's name.
AgentReady is an OpenAI-compatible proxy. You swap your base_url, and every prompt gets compressed before hitting the LLM — 40-60% fewer tokens, same responses, same streaming.
It uses a deterministic rule-based engine (not another LLM call): removes filler words, simplifies verbose constructions, strips redundant connectors. ~5ms overhead.
Works with any OpenAI-compatible SDK: Python, Node, LangChain, LlamaIndex, CrewAI, Vercel AI SDK.
There's zero percent chance that I would proxy all my LLM calls with my API key through some third party service. However, if it was self-hostable, so that I can ensure it is only able to reach the LLM providers, I could see deploying this behind an LLM provider router. If it actually achieves the kind of token use reduction that is advertised, that would be worth paying for - especially in the enterprise. I'm skeptical of using it for product integrations, where prompts are tuned for effectiveness and efficiency, but for ad-hoc usage it probably doesn't matter too much if the phrasing affects the results a bit.
Hi! You only need our API for the compression part — API keys and LLM usage are entirely managed by your own application. We don't have access to your SaaS, and we don't even know its name. We simply receive the text through our API, compress it, and return the response to your app. Your LLM — whether local, OpenAI, Claude, or any other — then processes it using your own API keys. Your data stays safe with you. And we NEVER ask for your LLM API keys. Let me know if you have any question :)
I'm sure I'm not the only one hesitant to provide a 3rd party virtually MITM access to both my LLM usage + API keys. If this were capable of running locally, or even just an API for compressing non-sensitive parts of a prompt, I think it would be much easier to adopt.
Hi! You only need our API for the compression part — API keys and LLM usage are entirely managed by your own application. We don't have access to your SaaS, and we don't even know its name. We simply receive the text through our API, compress it, and return the response to your app. Your LLM — whether local, OpenAI, Claude, or any other — then processes it using your own API keys. Your data stays safe with you. And we NEVER ask for your LLM API keys. Let me know if you have any question :)
Wouldn't the example code:
provide you our OpenAI key (via the X-Upstream-API-Key header)?Do you need my OpenAI / Claude API keys?
No. You only need our API key for the compression step. Your LLM keys and usage stay entirely in your own app — we never see them. We receive text, compress it, and return it. Your LLM (local, OpenAI, Claude, or any other) then processes it with your own keys. We don't even know your app's name.
AgentReady is an OpenAI-compatible proxy. You swap your base_url, and every prompt gets compressed before hitting the LLM — 40-60% fewer tokens, same responses, same streaming.
It uses a deterministic rule-based engine (not another LLM call): removes filler words, simplifies verbose constructions, strips redundant connectors. ~5ms overhead.
Works with any OpenAI-compatible SDK: Python, Node, LangChain, LlamaIndex, CrewAI, Vercel AI SDK.
Free during beta, no credit card: https://agentready.cloud/hn
Python: pip install agentready-sdk && agentready init
Happy to answer any technical questions.
Why nobody created that before, clever approach i will give it a try for my next saas thank you
That’s nice awesome idea!