This is a bold approach to federated inference. I'm building Runfra, a similar 'Uber for GPU' concept but focused specifically on batch-first creative workflows.
My question: How do you handle the 'Quality of Service' (QoS) problem in a free, P2P network? For image generation (like FLUX), a bad node might return 'AI slop' or take 10x longer. Does Mycellm have a verification layer or a reputation system to ensure that contributors aren't just sending back garbage to earn credits?
Good question. Mycellm is the protocol layer: identity, routing, credits, reputation. Similar to how BitTorrent verifies pieces match their hash but doesn't judge whether the content is any good (that's left to trackers, communities, external tools).
Mycellm can verify that inference happened: signed receipts, reputation tracking (success rate, speed, contribution history), admission control that cuts off freeloaders. Each network sets its own policies on top; a private group might trust everything, a public network might add spot-checks or consensus routing.
Directions I'm exploring for public verification: fast-of-N routing, spot-checking with known outputs, consensus at temperature=0 where inference is deterministic. But the verification logic should be pluggable — different networks, different standards. That's a design area where I'd welcome input.
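To make the spot-checking and reputation ideas concrete, here's a minimal sketch (all names hypothetical, not the actual Mycellm API): the network occasionally routes a probe request whose known-good output hash was computed in advance at temperature=0, then nudges the node's reputation up or down depending on whether its answer matches.

```python
import hashlib

class NodeReputation:
    """Toy reputation tracker: score drifts toward recent pass/fail behavior."""
    def __init__(self):
        self.score = 0.5  # start neutral

    def record(self, passed: bool, weight: float = 0.1):
        # Exponential moving average toward 1.0 on pass, 0.0 on fail.
        target = 1.0 if passed else 0.0
        self.score += weight * (target - self.score)

def spot_check(node_output: bytes, expected_hash: str, rep: NodeReputation) -> bool:
    """Compare a node's output for a probe request against the known-good hash."""
    passed = hashlib.sha256(node_output).hexdigest() == expected_hash
    rep.record(passed)
    return passed

# Probe prompt whose deterministic (temperature=0) output is known in advance.
known = hashlib.sha256(b"expected tokens").hexdigest()
rep = NodeReputation()

spot_check(b"expected tokens", known, rep)  # honest node: score rises
spot_check(b"garbage", known, rep)          # cheating node: score falls
print(round(rep.score, 3))
```

Admission control then becomes a threshold on `score`; a pluggable verifier would just swap out the `spot_check` predicate per network policy.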
Curious about Runfra — when you say 'Uber for GPU', is that an orchestration/marketplace layer where independent GPU owners sell compute time? And what does 'batch-first creative workflows' look like — image gen pipelines, video, or broader?
That BitTorrent analogy is spot on. For Runfra, I'm mostly focusing on the orchestration and inference pipeline: trying to fix the waiting problem, but also thinking a lot about yield, i.e. how many actually usable outputs you get per run.

Right now I'm just dogfooding a small cluster of my own 3x 4060s. Since these are basically idle home GPUs, I have the luxury of biasing toward quality over latency. If this ever turns into a marketplace, something like Mycellm for node reputation would be a lifesaver; honesty is the hardest nut to crack in decentralized compute.

The batch-first idea is really my attempt to kill that "slot machine" workflow where you have to babysit every single prompt. Instead, you fire off a bunch and walk away. The scoring layer (CLIP / aesthetic filters) acts as an automated quality gate, quietly filtering out the "AI slop" in the background and only surfacing the winners.
Since I need temperature > 0 for that creative randomness, I inevitably get a lot of junk. So I'm essentially treating these idle GPUs as a distributed filtration layer, trading idle time for consistently better surfaced outputs.
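The gate itself can be sketched in a few lines (illustrative only, not Runfra's actual pipeline; `score` here is a stand-in for a real CLIP-similarity or aesthetic model): generate many candidates, score each, and surface only the top few above a quality threshold.

```python
import random

def score(seed: int) -> float:
    """Placeholder for a CLIP / aesthetic scorer returning a value in [0, 1]."""
    random.seed(seed)  # deterministic stand-in so the example is reproducible
    return random.random()

def batch_filter(seeds, threshold=0.7, top_k=4):
    """Score every candidate, drop anything below threshold, keep the top_k best."""
    scored = [(score(s), s) for s in seeds]
    winners = sorted((p for p in scored if p[0] >= threshold), reverse=True)
    return [s for _, s in winners[:top_k]]

# Fire off 100 candidate generations; only a handful of winners surface.
print(batch_filter(range(100)))
```

The interesting knob is `threshold` vs. `top_k`: a hard threshold caps junk at the cost of sometimes returning nothing, while top-k always surfaces something even on a bad run.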