Show HN: Every 4s, Familiar OCRs my screen into Markdown (open source, local)

(github.com)

1 point | by talsraviv 6 hours ago

6 comments

$talsraviv 6 hours ago

Hey! Tal here. Maxim and I created Familiar to capture our screen (and clipboard) every few seconds and save it as markdown. That way our local agent can use that as context (through a cron, skill, or slash command).
Our early testers are showing us how to use it:
- Update Claude's skills/memory based on their workday (in a scheduled task/heartbeat)
- Typing "help me with what I'm working on right now" without having to prompt/describe
- Enriching meeting transcripts with what was actually on screen (and vice versa)
- Forking it into their coaching app so coaches can see what learners did between sessions
- Someone new to tech used Familiar during a trial week at a YC startup, so that AI could coach him every few hours (and got the job)
We've consistently seen AI use Familiar context like a "router" layer. Recently, my agent "saw" that I spent a long time on a document, so it fetched the full doc directly. We've also seen the agent traverse the markdown, then decide to fetch the original image (so cool).
Familiar uses Apple's native OCR, deletes screenshot images after 48 hours, and redacts passwords/credit card numbers/SSNs/API tokens/etc. We'd love contributions on what else to block: https://github.com/familiar-software/familiar/tree/main/src/... or in general ways to improve privacy.
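For readers wondering what pattern-based redaction of OCR text can look like, here is a minimal sketch. It is not Familiar's actual rule set: the three regexes, the `redact` and `luhn_ok` names, the token prefixes, and the `[REDACTED_*]` placeholders are all assumptions made for illustration. The Luhn checksum is a standard way to avoid flagging every long digit run as a card number.

```python
import re

# Hypothetical patterns for illustration only; NOT Familiar's real rules.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, each optionally followed by one space or hyphen.
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")
# A few well-known token prefixes followed by a long opaque suffix.
TOKEN_RE = re.compile(r"\b(?:sk|ghp|xoxb)[-_][A-Za-z0-9_-]{16,}\b")


def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def _redact_card(match: re.Match) -> str:
    """Replace a candidate card number only if it passes the Luhn check."""
    digits = re.sub(r"\D", "", match.group())
    if 13 <= len(digits) <= 19 and luhn_ok(digits):
        return "[REDACTED_CARD]"
    return match.group()


def redact(text: str) -> str:
    """Apply each redaction pattern in turn to OCR output."""
    text = SSN_RE.sub("[REDACTED_SSN]", text)
    text = CARD_RE.sub(_redact_card, text)
    text = TOKEN_RE.sub("[REDACTED_TOKEN]", text)
    return text
```

For example, `redact("card 4111 1111 1111 1111 ok")` replaces the well-known test card number, while a 16-digit order ID that fails the checksum is left alone. Regexes like these miss anything the OCR garbles, so they complement rather than replace app-level blocking.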
We stand on the shoulders of giants: screenpipe, rewind, dayflow, etc. Since then: 1) Local agents got good at handling massive amounts of messy text files 2) Local agents have their own memory and skills systems
Familiar is our "bitter lesson" version: just hand over context and get out of the way. The right way to do that piece is open source/free/offline.
$gisanokharu 5 hours ago

screen capture every 4 seconds is a lot of data accumulation for agent context. how are you handling sensitive information appearing on screen? are you filtering anything before it hits the markdown, or is that on the user to manage?
- $maxvovshin 5 hours ago
  
  Hey, Maxim here, co-creator of Familiar together with Tal. I’ve been running it for a few months now, and it occupies a total of 400 MB so far, which is really low. There are many files, but they are tiny. We’ll probably introduce additional cleanup mechanisms after a year or so.
  Regarding sensitive information, you can add apps to a blacklist, which prevents screenshots from being taken while those apps are on screen. Additionally, we have redaction mechanisms for all screenshots that remove secrets, tokens and credit card numbers. You can check the code for all the patterns we detect, and if anything is missing, feel free to open an issue.
  - $maxvovshin 4 hours ago
    
    regarding the handling of the large context, you'll be surprised how well agents can filter those files with simple bash commands. also, as models get better and have larger context windows, they get much better at handling and finding all of the relevant context.
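To make the filtering point concrete, here is a rough Python equivalent of the kind of `find` plus `grep` pass an agent might run over the captures. It is a sketch under assumptions: the thread does not describe Familiar's directory layout, so the `*.md` glob, the `recent_matches` name, and the use of file modification time as the capture time are all hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional


def recent_matches(
    root: Path,
    keyword: str,
    max_age_hours: float = 24.0,
    now: Optional[float] = None,
) -> List[Path]:
    """Return markdown files under root that were modified within
    max_age_hours and contain keyword (case-insensitive), newest first.

    Assumes capture time is reflected in each file's modification time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    needle = keyword.lower()
    hits = []
    for path in root.rglob("*.md"):
        mtime = path.stat().st_mtime
        if mtime < cutoff:
            continue  # too old, skip without reading the file
        text = path.read_text(encoding="utf-8", errors="ignore")
        if needle in text.lower():
            hits.append((mtime, path))
    # Sort by modification time, most recent capture first.
    return [p for _, p in sorted(hits, key=lambda h: h[0], reverse=True)]
```

Checking the timestamp before reading means old files are never opened, which keeps the scan cheap even when there are many tiny files.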
$rechadkkk 6 hours ago

interesting idea, is this only for Apple, could I try on Windows?
- $maxvovshin 5 hours ago
  
  Hey, Maxim here, co-creator of Familiar together with Tal.
  Right now it’s only for Apple, since we use Apple-specific APIs.