With the TPU deal with Google and their relationship with Amazon, they will have access to compute coming online.
I worked with 4.6 and found some improvements for better planning and sustained use, but I agree with some posters that 4.7 is slower and overthinks.
What I expect is frontier models to get bigger and more expensive (especially fast mode like on Cerebras). And most of us get much smaller distillations for the more generous subscription tiers.
I think they broke billing for new API users - I just signed up yesterday for the API and am a Max customer. Even though I paid and waited 24 hours, I am unable to access the Claude API at all and never used any of the banned bots on the account.
- API key auth works (/v1/models returns full list)
- Console shows $20 credit balance
- Spend limit is $100/month, $0 used
- Two different API keys generated in this org both fail with same error, I only have 1 org.
request_id req_011CaZLqnqEWcA8nGSVoB7dc
I emailed support but have not heard back from a human.
I don't really mind hopping between claude/codex/glm/kimi except I don't know a good way to resume a session across agent harnesses.
Normally I'd just have it write out what it's doing to a file, if I need to transfer context, but if it goes down mid-session that's a no-go.
I think people have built tools for this, and of course you could reasonably vibe one yourself, but I don't really trust something like that to work reliably or in an ongoing manner.
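For what it's worth, the "write it out to a file" approach can survive a mid-session outage if the notes are appended after every step rather than dumped at the end. A minimal sketch, assuming a plain JSONL handoff file that any harness could be told to read; the function names and the record fields (`step`, `notes`) are hypothetical, not part of any harness:

```python
import json
import os
from pathlib import Path


def append_checkpoint(path: Path, step: str, notes: str) -> None:
    """Append one progress record and force it to disk immediately."""
    record = {"step": step, "notes": notes}  # hypothetical record shape
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # make sure the line survives a crash


def load_checkpoints(path: Path) -> list:
    """Read back every record written so far, oldest first."""
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

The idea is that the agent is instructed to call something like `append_checkpoint` after each unit of work, so whatever was recorded before the outage is still there for the next harness to pick up. It does not capture anything the agent never wrote down.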
> Anthropic have blocked usage of your subscription however with third party harnesses.
This is the main reason I use different harnesses, but I also expect (could be wrong) codex is better with codex harness (due to training on its specific tools) than with other harnesses. I use opencode for everything that's not claude/codex.
Ahh good point -- I've handled this by switching my harness to `pi` but recognize that may not be for everyone and doesn't directly address OP's question.
I've been on the $200 plan for 3 months, but this will be my last month. I got great use out of 4.5 for a while, but 4.6 felt like a half step back (conflated with all the random hidden config changes during it), and 4.7 is genuinely terrible.
It's impossible to tell these days whether 4.7 is stuck because it's thinking and Anthropic suppressed all output (seriously, 4.7 will just start making changes without explaining any reasoning - how is that an upgrade?) or because the underlying infrastructure is having issues.
4.5 -> 4.7 feels like going from working with a coach-able, junior engineer that does well with clear guidance to working with a cocky mid-level that will spend too long on pointless tangents and make confidently incorrect changes without any discussion.
You too can avoid Claude AWOL by subscribing to our premium package, Claude VACATION. Claude will show up on time, and give you at least one month's notice before any outages.
I've been a Codex devotee since around last August. I don't know why everyone is so bonkers about Claude Code. It's not the only belle at the ball. Codex is rock solid.
This is a tough moment. Claude is simultaneously becoming substantially more expensive, substantially less reliable (single 9 of reliability), and substantially less performant. It's really hard to justify the cost of a subscription over there right now.
There was another thread where some people pointed out that Amazon will give you access to Claude with better uptime for the same price (per million tokens up / down). The downside is that it does not have the native ability to browse the web, but maybe that's a hidden blessing, since it's less likely to read some random website that has prompt injection embedded in it.
For coding it's fine. I haven't experimented too much with Amazon Bedrock myself, but I just might soon to check for any limitations.
Maybe the best play is to set up a routing system locally so that when claude.ai is down it automatically switches to Amazon billing and switches back when it comes back up.
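The local routing idea could be sketched roughly like this. It is only the switching logic, not a working proxy: `FailoverRouter`, the `primary_healthy` probe, and the `"primary"` / `"secondary"` labels (standing in for the subscription endpoint and Amazon Bedrock) are hypothetical names, and the 60 second cooldown is an arbitrary assumption.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FailoverRouter:
    """Prefer the primary provider; use the secondary while the primary is down."""

    primary_healthy: Callable[[], bool]  # hypothetical health probe, supplied by the caller
    cooldown_s: float = 60.0             # assumed wait before re-probing a failed primary
    clock: Callable[[], float] = time.monotonic
    _down_since: Optional[float] = None  # time of the last failed probe, if any

    def choose(self) -> str:
        now = self.clock()
        # While in cooldown, stay on the secondary without probing again.
        if self._down_since is not None and now - self._down_since < self.cooldown_s:
            return "secondary"
        if self.primary_healthy():
            self._down_since = None      # primary is back: switch back
            return "primary"
        self._down_since = now           # primary failed: start (or restart) cooldown
        return "secondary"
```

The cooldown is there so a flapping primary does not get probed on every request; what counts as "healthy" (a status page, a cheap test request, an error-rate threshold) is left to whoever writes the probe.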
I’m pretty sure it has the ability to browse the web.
It can use playwright, web fetch, etc…
I use bedrock at work and Claude subscription at home. They are pretty much exactly the same in my experience
Or do you mean the Claude in chrome plugin? Bedrock doesn’t have that, but in my experience it doesn’t work that well.
Neither does the Claude managed agents or ultra plan.
They likely refer to "WebSearch", not "WebFetch" (and the original statement is not correct).
But that's just paying per use right, not with the subscription which is way better value
Correct, but in the case that they brought it up, their employer was on an enterprise license, which is still pay per token. The subscription will eventually go away in some way, or cost way more than it does.
From an economics perspective, it makes sense to make it more expensive if you're having trouble keeping up with demand for a service. It'll be tough getting used to because it was so nice and cheap.
On the other hand, it was somewhat expected that we would have a correction for the prices. Hopefully after this correction things will be more stable and we won't have to worry too much about future price increases.
the prices will slowly increase until enough people actually stop paying for it.
YMMV. I would still be very happy with Claude if it hard failed on 20% of tasks. You can always come back to it.
I say this as someone working for a tech company who does not have to foot the bill (in the >$1k per month bracket)
I also experienced and accept the 1990s levels of unreliability, which is my “internet generation”. My first access was lifting a handset and placing on a speaker/mic cradle.
Programmers these days are fucking spoiled. If it’s $220 worth of value for $200 - I get it. But I’m getting $100k of value for $10k and so I’ll put up with some shit.
> If it’s $220 worth of value for $200 - I get it.
Wrong comparison. If a competitor gives you $230 of value for $200, of course you shouldn't pick the $220 one
Well, you can get a much bigger portion for much cheaper next door, but taste is hard to quantify.
or just use codex...
We used to describe our startup as having 5 8’s of uptime
Not to mention substantially less open. I've been using an OpenAI subscription in Pi Agent for a couple weeks now and it's great. And from what I can tell, 5.5 is a heck of a model.
I'm either extremely lucky or Dario ran the direct fiber to my house because I have never had it go down in any meaningful way..
Is this just the API and I'm too much of a luddite to actually use the API?
Dude dario definitely ran the fiber straight to your place personally. Everything is fine and this is such a good thing.
Interestingly, yeah, I can see that this would really cut into your subscription usage with the 5 hour rate limit windows...
I am an API user, and while it being down is super annoying, it isn't really as big of a hit to my overall usage as I can just prepare a bunch of stuff to run in parallel when it does come back up.
Don't say single nine, it sounds ugly and bad.
Say five eights of reliability. Maybe six.
We're talking about Claude, not GitHub...
that would be eight fives...
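For anyone who wants the actual numbers behind the "nines" jokes, here is a quick back-of-the-envelope conversion from availability to downtime per year. It assumes "single nine" means 90% availability and reads "five eights" as 88.888%; the function name is just illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of downtime per year for a given availability fraction."""
    return (1.0 - availability) * HOURS_PER_YEAR


for label, availability in [
    ("one nine (90%)", 0.9),
    ("two nines (99%)", 0.99),
    ("three nines (99.9%)", 0.999),
    ("five eights (88.888%)", 0.88888),
]:
    print(f"{label}: {downtime_hours_per_year(availability):.1f} h/year")
```

One nine works out to about 876 hours a year, or a bit over 36 days, which is why it sounds so bad when you say it out loud.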
Plus, they've dumbed down their models to the point where the value just isn't there like it was. If I have to go in and clean up after it, or constantly wrestle with it through prompts, what's the point? Just spending $200 a month to be frustrated at a machine.
It's lazy, does not take ownership and responsibility, wants to defer work, and I have to force it to check reality. It likes to guess and assume it's correct and I am wrong. Agents.md is not helping at all. It's in full enshittification phase, yay!
Single nine has good vibes bro. It means when the service is up the results are better. I read about it in a blog. The model hallucinates way less. Even less than grok
At work we have unlimited use of models from Anthropic and OpenAI (for now). My coworker, a Claude Code Opus 4.6 diehard, stopped by my desk today to say he finally installed Codex to try 5.5 and his feedback was basically “it just works and does what I ask and it doesn’t disconnect and it’s just so very matter of fact.” “Yeah I’ve been telling you this since like gpt-5 man!” “I know I know…” I have not spent much time with the recent Sonnet and Opus models, but my experience using Sonnet 4 for 3 months all day every day (no handwritten code) last summer to make a large Playwright suite was that using Claude Code and those models becomes more about using Claude Code than doing things with it. Codex CLI with the gpt-5 family is ambient and reliable. It’s not orange, and there is no little sprite guy, emojis, whimsy, or humor. But I do things with it and they land working in first edits. I also can keep the same session for days and the context doesn’t ever seem to be an issue. Maybe Claude 6 will be earth shattering and I’ll use that. It’s not Coke or Pepsi loyalty; I just want to get stuff done.
Plus one. I held off for too long out of concern that it wouldn't stack up and I'd come right back to Anthropic.
I switched a few days ago and work has been much less frustrating. Feels like CC did back in February before they started playing games.
It also doesn't eat nearly as many tokens, so it's saving me $100/mo.
I'm still not switching after Altman jumped on the cyberpunk totalitarian contract with the government
well apparently Dario did the same thing with Mythos - ethics for the big AI labs is mainly posturing
I wish palantir had their own model. Now that's a business I can get behind. Until then I have to use grok I guess.
At least palantir is open about their villainy I guess, they make no attempts to pull the wool over your eyes. So you at least know that you are for sure getting in bed with the bad guys if you go with them
Lately I've been using claude mainly to design plans and do code reviews while Codex does all the implementation. Having two very different models helps to work out any weird quirks one might have.
I appreciate the objective anecdote, but obviously Coke™.
Whenever Claude goes down I relax with a nice jar of Newman's own pasta sauce. It's just zesty enough for me to dip bread in or make pasta. You name it
Honestly, I gotta agree, I find that I get way more frustrated with Claude recently than Codex.
I mean we just need technological advancements so we can get this hosted locally for people
Not so much a reply to the post, as a comment on the comments in this post.
It's starting to feel like a lot of comments on here and other social media outlets that are anecdotal about their experience with x model and y tool are astroturfing. They add almost zero value to the conversation.
This is a multi-billion dollar market and battleground, so I'm skeptical of anyone telling me that this isn't happening at a decent clip. I think moderators on the site should definitely consider how to approach this because it's devaluing this space as a place for actual discourse.
My mind also considers that this being one of Altman's old stomping grounds, he may place a higher value in winning here than elsewhere.
more folks should be skeptical
Between GitHub and Claude, it seems Eternal December[0][1] is upon us.
[0]I say December, because that's around the time the models got good enough that non-AI folks started to notice.
[1]https://en.wikipedia.org/wiki/Eternal_September
GitHub is a long running business with a mature software stack running into scaling issues while they move to Azure and becoming Microsoft-ified. Claude is a new company in a new market with an extremely fast growing userbase running relatively novel AI infrastructure with a business model they are still figuring out.
I don't really blame Anthropic here.
Not trying to argue with you, GitHub (the core product) seems to have been in maintenance mode since the acquisition.
I couldn’t find any public data on GitHub, but Google Trends shows a sharp increase starting in December.
That could be in part due to people complaining about the outages, but more people than ever are writing code with AI.
Hence the parallel to Eternal September – code volume is up, quality is down, and programming is never going to return to how it was (difficult for “normal” people to interface with).
I use the OpenAI team plan whenever it's down, because it's down so much lol
I built a hangout space to chill out in and chat to others while Claude is down (which is happening wayyy more often): https://clawdpenguin.com
There's a live Claude status board in the corner so you know when it's time to get back to work.
Because you can't work without it?
Yikes.
There was a top comment on a thread the other day where the guy said, if the AI is offline, it's a higher value activity for him to go for a walk for an hour than to try and read the code the AI wrote. At least with the walk, he comes back refreshed.
That's absolutely terrifying.
It's a great way to network and find companies that will need a ton of help in 6 months
Oh great. My Max account has been borked for days, and now they will never get to it with everything else burning down.
https://github.com/anthropics/claude-code/issues/54497
So happy to have diversified my model providers this past couple of weeks. GPT-5.5 has had no trouble slotting into Opus workloads. Will be fun to try out more of the models as time goes on to build some resiliency into my engineering workflows :).
I think if codex can fill in some functional gaps that shouldn’t be that huge - like having defined agents in plugins like Claude Code - it’s actually a preferable product. It’s faster in every way and seems to manage context a lot better - compaction isn’t a complete end-of-the-world event to be avoided at all costs. With the addition of defined thinking, the fact it actually seems to follow tool calling instructions, its handler for permissions, and other features, it’s frankly a better tool overall. 5.5 seems to be a reasonable model.
Anthropic seems to have really killed their advantage by squandering the immense good will they built up by blundering over and over again the last few months with the developer community.
Tonight, for instance, after the incident had recovered, I restarted my work. On my Max account my usage period was completely exhausted in 4 minutes of sonnet subagent work. This was long after prime time, and the workload was a fraction of what I normally do.
These days I run codex concurrently and have gotten my marketplaces and plugins and MCPs adapted to it - other than the agents which I do lean heavily on - and generally find it a capable replacement. Anthropic needs to take notice and get their house in order.
I found GPT 5.4 terrible. I just tested 5.5 and compared with opus it's still not great.
What I found was that I *strongly* preferred Claude Code with its defaults. Codex was almost unusable to me -- It would spit out a 4-5 page plan where it kept repeating itself, where Claude would give me a crisp 1-2 pager I could actually review.
*But* I don't work with the defaults -- I work with my own prompt framework based off of superpowers.
Given sufficient prompt scaffolding, I've found the models relatively interchangeable -- _I might_ be getting some of this for free by basing my own system off of superpowers which is used across various harnesses -- In other words achieving this kind of portability may be a lot harder than it looks and I'm benefiting from other people's work.
The problem I ran into was that, using the workflow I use with claude, the code being written wasn't good: missing edge cases, incomplete.
After reviewing the code, I also found it was annoying to get GPT 5.4 to actually fix the code based on my prompts compared with opus. I had to be far more specific and direct (which is related then to the missing edge cases, incompleteness, etc).
I lack a bit of context. Can you point me to a place that explains what you use?
I haven't really shared what I use, I'm still deciding if that's something I want to do.
To get an idea of what I'm talking about, you could install https://github.com/obra/superpowers/ into both Codex and Claude Code -- You'll find that the behavior is remarkably similar if you A/B compare them on the same problems. CC occasionally misses things that Codex gets and vice versa.
Overall the output structure and final code is remarkably similar... Which is pretty different than if you just run them with their default system prompts. I'd throw codex out the window with its default outputs.
In what harness?
codex. codex is also pretty garbage compared with claude code. The permissioning system in claude code with auto mode is now pretty fantastic. With codex the only vaguely usable mode is yolo mode which is bad for obvious reasons.
Anybody else double fist Codex/Claude? They both code, solve problems, and find bugs in unique ways. I find using both is more useful than using either alone. I have them code review each other's work, it's great.
I have max plans for both and over the past 5+ months now have built a custom "agent swarm" orchestrator with a database backed API and several skills CLI that the agents use to deliver orchestrated software factory runs.
We can use several different topologies (2 or 3 agents, etc.) but currently primarily use pair programming teams consisting of an opus4.7 for implementation and a codex5.5 for plan and code reviews, with a codex5.5 run-manager that pushes the agent lanes along and keeps things moving if they get stuck or escalate reviews to run-manager decisions.
Escalation to run-manager is a pretty regular thing as codex5.5 is generally picky and thorough and opus4.7 pushes back at times, and after three codex rejections we allow opus4.7 to escalate to a run-manager decision to settle it. Usually, opus4.7 agrees and will continue iterating but it's not unusual that it will push back and escalate.
I've found codex5.5 is extremely capable. I just now finished a large multi-phase orchestrated swarm run with codex5.5 (xhigh) as the run-manager, presiding over 8 paired lanes, with 8 opus4.7 (high) implementers and 8 codex5.5 (high) reviewers, so 16 agents orchestrated and working in a swarm together. Codex5.5 managed that run perfectly for 14 hours with zero intervention needed by me.
Overall, I prefer to let opus4.7 draft the plans and then let codex5.5 offer git-diff style change feedback on plans, then let opus implement and codex review/manage. This seems to get the best result for me.
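The "three rejections, then escalation is allowed" rule described above could be modeled roughly as follows. This is a guess at the shape of the logic, not the commenter's actual orchestrator: the `Lane` class, the returned labels, and resetting the counter on approval are all assumptions made for illustration.

```python
from dataclasses import dataclass

MAX_REJECTIONS = 3  # from the comment: three reviewer rejections before escalation is allowed


@dataclass
class Lane:
    """One implementer/reviewer pair; tracks consecutive reviewer rejections."""

    rejections: int = 0

    def record_review(self, approved: bool) -> str:
        if approved:
            self.rejections = 0          # assumption: an approval clears the streak
            return "merge"
        self.rejections += 1
        if self.rejections >= MAX_REJECTIONS:
            return "may_escalate"        # implementer may now ask the run-manager to decide
        return "iterate"                 # otherwise keep revising against the review
```

A cap like this is also one way to keep two agents from ping-ponging indefinitely on a simple feature, since every disagreement is guaranteed to reach a third decision-maker after a bounded number of rounds.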
Yea, I'm similar I think in that Claude has better style/architecture/design, while Codex is a more critical reviewer, but also writes more complex code that just works not caring as much about the bigger picture - together they work pretty well. I don't run any swarms though, I could easily see them ping/pong on the most simple feature almost endlessly if I let them. How do you review all the code being generated?
It's a lot to review since adding the AI workflows, but bottom line is I'm not in a race, I've been working on the same repo since 2019 and I generally don't add too much at once and just take my time. But, I'll admit, I'm a lot more careful about backend schema, services, testing, API design, CLI design, etc., while not being too overly worried about frontend items. This particular long run was focused on building frontend UI for backend that has been painstakingly built. This time, I used the claude.ai/design for a large amount of UI planning for a backend that is ready for it. Then I just let the swarm handle it with our orchestration tool, since it's frontend. Then, just test it in the browser and iterate on what needs changed.
I get Claude to run Codex. You can do it that way, but not the other way around.
It also fits nicely. Claude plans better, and Codex has way higher limits.
This is insane. I have to move to Codex now.
codex works but the code it spits out is still not as clean as opus.
But Boris declared coding is solved. How is this possible? Can’t they prompt Mythos to give them better uptime?
When Claude is making "0 mistakes", all of his work is 100% done by Claude, therefore "coding is solved!" and we have more time to go on podcasts to tell everyone about it.
However, when there is an incident it is immediately "human error", not Claude.
> Can’t they prompt Mythos to give them better uptime?
Anthropic is currently "vibe coding" the situation right now.
Odd time for Claude to go down since it's not peak work hours.
Maybe they target certain types of infra rollouts for non-peak hours?
"But humans do it too as they just cool off and check out for the rest of the day."
It's fine for Claude to be unavailable when there is no work at these hours. However, the problem is Claude gave no notice.
At this rate, Claude being unavailable every day is no better than a human on a 9 - 5 working day job.
no it's not, it's always work hours somewhere
Almost like "Claude is only down because it's too busy" is cope and the reality is it's vibe coded trash.
They are about to lose the second 9
99.02 % uptime
98.68
Ouch.
Seems like a healthy human temperature to me. Maybe AGI has finally arrived.
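To put the uptime figures quoted above in perspective, here is a rough back-of-the-envelope sketch. The helper names `downtime_hours` and `nines` are made up for illustration, and the 30-day window is an assumption; the status page may well report over a different period.

```python
import math

HOURS_PER_WINDOW = 30 * 24  # assumption: a 30-day reporting window


def downtime_hours(uptime_percent: float, window_hours: float = HOURS_PER_WINDOW) -> float:
    """Hours of downtime implied by an uptime percentage over the window."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100 - uptime_percent) / 100 * window_hours


def nines(uptime_percent: float) -> int:
    """Count of leading nines: 99.x -> 2, 98.x -> 1, 99.9x -> 3."""
    if not 0 <= uptime_percent < 100:
        raise ValueError("uptime_percent must be in [0, 100)")
    unavailability = 1 - uptime_percent / 100
    # Round before flooring so float noise at exact boundaries (e.g. 99.0)
    # does not drop a nine.
    return math.floor(round(-math.log10(unavailability), 9))


for pct in (99.02, 98.68):
    print(f"{pct}% uptime: {nines(pct)} nine(s), "
          f"about {downtime_hours(pct):.1f} h down per 30 days")
```

Under that assumption, 99.02% works out to roughly 7 hours of downtime per 30 days and 98.68% to roughly 9.5 hours, which is the drop from two nines to one.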
With the TPU deal with Google and their relationship with Amazon they will have access to compute coming online.
I worked with 4.6 and found some improvements in planning that sustained us, but I agree with some posters that 4.7 is slower and overthinks.
What I expect is frontier models to get bigger and more expensive (especially fast mode like on Cerebras). And most of us get much smaller distillations for the more generous subscription tiers.
Yup, major outage on all platforms.
https://status.claude.com/
i think they broke billing for new api users - I just signed up yesterday for the api and am a max customer. Even though i paid and waited 24 hours, I am unable to access claude api at all and never used any of the banned bots on the account.
- API key auth works (/v1/models returns full list)
- Console shows $20 credit balance
- Spend limit is $100/month, $0 used
- Two different API keys generated in this org both fail with same error, I only have 1 org.
request_id req_011CaZLqnqEWcA8nGSVoB7dc
I emailed support but have not heard back from a human.
I think we've hit the "who's the most reliable" stage for these tools.
We can now shop around easily. They almost all do the same thing now. The models are "Just Enough".
I don't really mind hopping between claude/codex/glm/kimi except I don't know a good way to resume a session across agent harnesses.
Normally I'd just have it write out what it's doing to a file, if I need to transfer context, but if it goes down mid-session that's a no-go.
I think people have built tools for this, and of course you could reasonably vibe one yourself, but I don't really trust something like that to work reliably or in an ongoing manner.
Maybe it should just be a skill.
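One way around the mid-session problem is to checkpoint after every step rather than only at handoff time, so the last good state is already on disk when the service drops. A minimal sketch, assuming a JSON file and a made-up schema (`goal`, `done`, `next_steps`); the function names are hypothetical and nothing here is tied to any particular harness.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_handoff(path: Path, goal: str, done: list[str], next_steps: list[str]) -> None:
    """Atomically write the current session state to a handoff file."""
    payload = {"goal": goal, "done": done, "next_steps": next_steps}
    # Write to a temp file in the same directory, then rename over the target,
    # so a crash mid-write never leaves a half-written handoff file behind.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2)
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_handoff(path: Path) -> Optional[dict]:
    """Return the last saved state, or None if no checkpoint exists yet."""
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

The idea would be to have the agent (or a hook in the harness) call `save_handoff` after each completed step, then point whichever agent is still up at the file and let it pick up from `next_steps`.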
switch to Pi.dev or any other multi model harness, you can switch between models every message if you feel like it.
Anthropic have blocked usage of your subscription however with third party harnesses.
> Anthropic have blocked usage of your subscription however with third party harnesses.
This is the main reason I use different harnesses, but I also expect (could be wrong) codex is better with the codex harness (due to training on its specific tools) than with other harnesses. I use opencode for everything that's not claude/codex.
You might search for a concept like `/handoff` that's in ampcode. I'm sure someone's built a skill for just this.
That's not going to work if the service is down, however.
Ahh good point -- I've handled this by switching my harness to `pi` but recognize that may not be for everyone and doesn't directly address OP's question.
self-hosted honcho (or other memory systems) and an API-agnostic harness gets you most of the way there.
never rely on the agent's own memory system. create a memory system of your own
kilocode allows you to switch between models mid session!
Yeah - it's switching agents (harnesses) that's the hard part!
The models are already commoditized; if this affects you, you should probably fix your stack.
Still, it's pretty crazy that Claude is down to 1 nine.
Every day I am so much happier that I decided to go fully local for my needs.
what model do you use?
Qwen 3.6 35b a3b
[unknown] missing EndStreamResponse for every Claude Design request, is anyone else facing the same?
Claude design is still half down - after a couple of prompts it fails:
[unknown] missing EndStreamResponse
I am getting [unknown] missing EndStreamResponse for each request on Claude Code
I've been on the $200 plan for 3 months, but this will be my last month. I got great use out of 4.5 for a while, but 4.6 felt like a half step back (conflated with all the random hidden config changes during it), and 4.7 is genuinely terrible.
It's impossible to tell these days whether 4.7 is stuck because it's thinking and Anthropic suppressed all output (seriously, 4.7 will just start making changes without explaining any reasoning - how is that an upgrade?) or because the underlying infrastructure is having issues.
4.5 -> 4.7 feels like going from working with a coach-able, junior engineer that does well with clear guidance to working with a cocky mid-level that will spend too long on pointless tangents and make confidently incorrect changes without any discussion.
4.6 has been excellent when used through the API. But I am right with you on 4.7, so I have been sticking to 4.6.
So Claude decided to take the rest of the day off without notice or giving a scheduled time off?
Many such cases with humans (given that we continue to compare LLMs to humans these days which you cannot)
Introducing: Claude AWOL
You too can avoid Claude AWOL by subscribing to our premium package, Claude VACATION. Claude will show up on time, and give you at least one month's notice before any outages.
It would be hilarious if instead of ending the world, Claude just gives up and shuts itself down.
Does Anthropic publish postmortems for incidents like these?
how can someone run a business on top of their APIs
Working for me now
Humans are unavailable from time to time also.
Code isn’t working on my app. Chats work fine.
I can't even sign in, however.
Is signing in actually necessary though? Think of the investors guys.
More fuel to get 27b running
For what it’s worth, I moved to Codex GPT-5.5 Xhigh Fast in the desktop app and it’s been fantastic.
I've been a Codex devotee since around last August. I don't know why everyone is so bonkers about Claude Code. It's not the only belle at the ball. Codex is rock solid.
Here is one source that agrees: https://artificialanalysis.ai/models/capabilities/coding
Hoping to get another reset of limits.
Now... I have to cook dinner....
Hurry! It'll be back up any minute now!
I personally broke it by admonishing it for fucking up its last revision to my project.
wtf, every day the same shit
I'm feeling a bit sorry for Anthropic. This last month must be very tough on them.
Oh no won’t someone please think about the trillion dollar corporation!
Aren't they just on the hook for a trillion dollars?