Not sure if this is just decoration for your demo, but this fake console output shown on the landing page and on every transcribed video feels weird.
    04:03:26> [STATUS] Deepgram Nova-3 Engine docked. Ready.
    04:03:26> [LOAD] Mounting 'Tech Stack Dictionary v4.0'...
    04:03:26> [INFO] Loaded modules: CUDA, Ada Lovelace, M4 Pro.
    04:03:26> [STREAM] Ingesting audio stream (48kHz)...
    04:03:26> [WARN] Ambiguity detected at 04:12: 'Pie Torch'
The video I submitted (https://youtu.be/0q6Ujn_zNH8) definitely doesn’t mention “Pie Torch”.
And I also haven’t seen any actual output except for “◧ Processing...”, 10 minutes after submitting the video. A progress bar of some sort, or maybe the timestamp currently being analyzed, would be nice.
Also, would this work in situations where the video creator is scrolling through a long code file in their IDE? Would the engine show it as one file or as multiple separate sections?
Apart from that, cool project!
Hi beans42, thanks for the stress test! You completely caught me red-handed on the UI, but I actually have the raw output for your video now.
The fake console: You are 100% right. The terminal is currently a hardcoded React animation to set the "vibe" while the actual worker chugs in the background. Seeing a "Pie Torch" warning pop up during a hardcore Ben Eater video about ca65/ld65 and 6502 assembly is objectively hilarious. I am ripping that fake logger out today and piping in real status events.
The 10-minute hang: A 21-minute dense technical video is exactly what breaks my current queue. The background worker actually did finish extracting the ld65 memory map configs and the BIOS segments, but the frontend WebSocket connection silently dropped, leaving you stuck on "Processing...". A real progress indicator is my P1 task right now.
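For concreteness, the shape of the status events I want the worker to emit is roughly this. This is just a sketch, not the deployed code, and `progress_event` is a hypothetical helper:

```python
# Sketch: instead of the frontend sitting on a static "Processing..." state,
# the worker emits one of these per analyzed chunk. `progress_event` is a
# hypothetical helper, not the actual pipeline code.
def progress_event(current_ts: float, duration: float) -> dict:
    """Report which video timestamp is being analyzed, both as seconds
    into the video and as a percentage of the total duration."""
    pct = min(100.0, 100.0 * current_ts / duration)
    return {
        "type": "progress",
        "timestamp": round(current_ts, 1),  # seconds into the video
        "percent": round(pct, 1),
    }
```

The frontend could then render both a progress bar and a "currently analyzing 10:30" label from the same event.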
To prove the engine actually works: I pulled the completed Markdown extraction for your Ben Eater video from the database. It successfully pulled the exact linker config and the BIOS segment code. I've hosted the raw output here so you can see what it should have returned after 10 minutes: https://gist.github.com/lmw-dev/d9f276cc0d90c05b7bab5ec0758d...
Scrolling files: You hit on the hardest problem. Right now, it extracts them as separate code blocks based on timestamps. Stitching scrolling frames into a single, deduplicated file without the AI hallucinating is my next major research hurdle.
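The direction I'm exploring for the stitching is overlap detection between consecutive frames: if OCR gives a stable list of lines per frame, the longest suffix of one frame matching a prefix of the next tells you how far the file scrolled. A minimal sketch (assuming clean per-frame line lists; `merge_frames` is a hypothetical helper, not what's shipped):

```python
# Sketch: stitch two consecutive scrolled frames into one deduplicated
# file by finding the longest overlap between them. Assumes OCR returns
# identical text for the shared lines, which real frames won't always do.
def merge_frames(a: list[str], b: list[str]) -> list[str]:
    """Append frame `b` to frame `a`, dropping the lines of `b` that
    already appear as the tail of `a` (the scrolled-past overlap)."""
    max_k = min(len(a), len(b))
    # Try the largest possible overlap first, shrinking until a match.
    for k in range(max_k, 0, -1):
        if a[-k:] == b[:k]:
            return a + b[k:]
    return a + b  # no overlap detected; keep both frames in full

# e.g. merge_frames(["lda #$01", "sta $00"], ["sta $00", "rts"])
# stitches the two frames into ["lda #$01", "sta $00", "rts"]
```

The hard part is that real OCR output is noisy, so exact line equality would have to become fuzzy matching, which is where the hallucination risk creeps in.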
Seriously, thank you for the roast. This is exactly the kind of feedback I needed to stop focusing on the "vibe" and fix the actual plumbing.
You know this reply is transparently LLM generated, right?
It's what happens when you paste the HN feedback into your vibe session as if you wrote it, then copy the response back here.