https://archive.ph/j9nGv
When a new software fuzzer with thorough orchestration appears, there’s a flood of bugs discovered and a lot of excitement. The excitement is always well deserved, but it doesn’t change the fact that the fuzzer has realistically only solved the easiest part of the process.
There’s a competition, Binary Golf Grand Prix (BGGP), whose BGGP3 edition involves finding a crashing input, demonstrating control of the program counter (PC), hijacking control of output, authoring a patch that is accepted, and producing a write-up, all under a points-based scoring system.
https://binary.golf/3/
Go ahead. Read the scope of the challenge. That’s the job experts are capable of _for fun_.
It’s not an LLM benchmark suite; it’s the baseline gamified end-to-end task for those who actually know what needs to be done in cyber. You’re lucky if an LLM can get you a non-duplicate first step that isn’t lifted directly from the examples or other write-ups.
Of course, an expert can now drive it end-to-end successfully a bit more easily. Just like with a new fuzzer.
If my grandma can ask Mythos to find a SQLi vulnerability and it succeeds, that’s wildly impressive. It doesn’t change the fact that she has no idea what to do next. That’s chaos, not weaponization. And chaos just means more job security for cyber, not less. Spend enough time in cyber and you’ll know branded chaos is a regular thing and not much to be worried about.
Remember when the NSA released Ghidra, the barrier to professional reverse engineering tools was no longer a $30k IDA license, and everyone was gonna be a reverse engineer finding bugs? The hype at the time was insane, and there was chaos, and there were more bugs found. And that was that. Now we have Ghidra, which is impressive, and I use it.
I’m personally quite excited for what Mythos is claimed to be. It’s great news for me as a defender.
>branded chaos
I'm gonna use this term from now on, thanks.
Sure. They have a secret AI called “Mythos” that is claimed to have the mythological power of remotely controlling every server in the world. Someone got high sniffing their own farts. But it’ll surely help with the IPO anyway if newspapers can’t be bothered to fact-check.