
OCR (Optical Character Recognition) for images and scanned PDFs

We’ve added a new feature to Overview: Optical Character Recognition (OCR). That means you can upload images and scanned PDFs, and Overview will automatically read the text from them.

You can enable OCR during import:

[Screenshot: OCR options]

If you set the OCR option, Overview will automatically run OCR on every page that has fewer than 100 characters of searchable text. Pages that need OCR make your uploads much slower, but scanned documents have to be OCR’d before you can search them anyway, and having it done during import is hard to beat for convenience.
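The per-page rule above can be illustrated with a minimal Python sketch. This is not Overview's actual code: the function name `pages_needing_ocr` and the choice to ignore leading and trailing whitespace when counting characters are assumptions made for illustration. Only the 100-character threshold comes from the description above.

```python
from typing import List

# From the text: pages with fewer than 100 characters of searchable text get OCR.
OCR_THRESHOLD = 100


def pages_needing_ocr(page_texts: List[str], threshold: int = OCR_THRESHOLD) -> List[int]:
    """Return zero-based indices of pages whose existing text is shorter than threshold.

    Hypothetical helper: Overview's real implementation may count characters differently.
    """
    return [
        index
        for index, text in enumerate(page_texts)
        # Assumption: leading/trailing whitespace does not count as searchable text.
        if len(text.strip()) < threshold
    ]


# Example: an empty scanned page, a page just under the threshold,
# a page exactly at the threshold, and a whitespace-only page.
pages = ["", "x" * 99, "x" * 100, "  \n"]
print(pages_needing_ocr(pages))  # prints [0, 1, 3]
```

Under this rule, a PDF that mixes scanned pages with pages that already contain text would only pay the OCR cost for the scanned pages.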

Overview uses the open-source Tesseract engine for OCR. Sometimes Tesseract produces lower-quality output than other OCR engines, such as the one included in Adobe Acrobat Pro. If you’ve already OCR’d your documents using another program, Overview will just read the previously created text.