Álvaro Ramírez

17 July 2024 OCR those buffers

I've written about macosrec before. It's a tiny macOS command line utility I built to take screenshots or videos of my macOS windows. Sure, there are a gazillion utilities out there, but I wanted my own, so I could bend it and integrate it with Emacs as needed.

If you've seen me post a screenshot or gif after April 2023, it was likely taken with macosrec.

As of macosrec v0.7.3, OCR is part of the mix. I've also added a couple of dwim-shell-commands (dwim-shell-commands-macos-ocr-text-from-desktop-region and dwim-shell-commands-macos-ocr-text-from-image), so I can do things like:

OCR region

Use the mouse to select a region to OCR.

ocr-region.gif

*This gif area recording was captured via macOS's built-in screencapture.

OCR dired files

Selecting any file (or files) in dired OCRs the whole lot.

ocr-files.gif

*This gif window recording was captured via macosrec.

Invoking dwim-shell-commands-macos-ocr-text-from-image from the current image buffer also does the job.

What about non-macOS users?

The same approach can be used with any other OCR command line tool. dwim-shell-command includes dwim-shell-commands-tesseract-ocr-text-from-image, which uses tesseract.
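As a rough sketch of what wiring up another OCR tool could look like, here's a small wrapper built on dwim-shell-command-on-marked-files. The name my/ocr-text-from-image is made up for illustration, it assumes tesseract is on your PATH, and the bundled dwim-shell-commands-tesseract-ocr-text-from-image may well be implemented differently.

```elisp
;; A sketch only: my/ocr-text-from-image is a hypothetical name, and the
;; bundled tesseract command may be implemented differently.
(require 'dwim-shell-command)

(defun my/ocr-text-from-image ()
  "OCR the marked dired image(s), or the current buffer's file, via tesseract."
  (interactive)
  (dwim-shell-command-on-marked-files
   ;; Name used for the process buffer.
   "OCR text from image"
   ;; <<f>> expands to each file path. The special output base `stdout'
   ;; makes tesseract print the text instead of writing a .txt file.
   "tesseract '<<f>>' stdout"
   ;; Verify tesseract is installed before running.
   :utils "tesseract"))
```

Swapping in a different OCR tool should mostly be a matter of changing the command string and the :utils value.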

While I've had more reliable results via macosrec (using macOS's Vision API), I'm sure there are other great alternatives on Linux. If you know of one, I'd love to hear about it.

Available on GitHub

Both macosrec and dwim-shell-command are on GitHub and installable via brew install xenodium/macosrec/macosrec and MELPA respectively.
