OCR features added to Google Docs

According to the Google Docs blog, OCR (Optical Character Recognition) has been added to the feature set.

Google Docs now officially supports importing scanned documents. What we launched as an experimental feature for the Documents List Data API last year is now available on the upload page: check the “Convert text from PDF or image files to Google Docs documents”, upload your scanned images (JPEG, GIF, PNG) or PDFs, and Google Docs will extract text and formatting from the scans for you to edit away.


In the past, I’ve used ABBYY PDF Transformer with pretty decent results. Now that I’m on a Mac, though, I have to run it via Parallels.

One trick I’ve been using to alleviate the pain of scanning documents is to use the office copier’s scanning feature instead of our old (hint: slow) desktop scanner. I can bump up the DPI and rapidly scan a number of documents, which are then emailed to me as PDFs.

I’ll be interested in trying out this new Google Docs feature, but some reports are less than enthusiastic. From Web Worker Daily:

In my tests, the results aren’t perfect and will nearly always require some editing, but they’re not terrible, either. Obviously the accuracy of the character recognition depends on the quality and legibility of the files submitted; a high-resolution PDF is likely to yield better results than, say, a low-res scan of a photocopy with lots of images on it. While some reports say that the accuracy of the OCR is only about 90 percent, I would say that as long as you provide clear, legible, high-resolution input files, you should expect much better results than that.
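If you want to put a number on your own results rather than eyeball them, here is a minimal sketch of one way to estimate character-level accuracy. It assumes you have a hand-typed reference copy of a page plus the text the OCR tool produced. The function name `char_accuracy` and the sample strings are made up for illustration, and this is just one rough measure, not necessarily how the reports quoted above arrived at their figures.

```python
from difflib import SequenceMatcher


def char_accuracy(reference: str, ocr_output: str) -> float:
    """Return the fraction of reference characters matched in the OCR output.

    This is a rough estimate based on difflib's matching blocks,
    not a formal OCR benchmark metric.
    """
    if not reference:
        # Nothing to recognize: perfect only if the OCR output is also empty.
        return 1.0 if not ocr_output else 0.0
    matcher = SequenceMatcher(None, reference, ocr_output, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)


# Hypothetical sample: a typed reference line and an OCR result with a few misreads.
reference = "The quick brown fox jumps over the lazy dog."
ocr_output = "The qu1ck brown f0x jumps ovcr the lazy dog."
print(f"Estimated accuracy: {char_accuracy(reference, ocr_output):.1%}")
```

Running this on a few pages scanned at different resolutions should give a feel for how much scan quality matters for your own documents.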

How about you?

Have you tried the Google Docs OCR tool yet? If so, what kind of results did you get?

Have you found an alternative that you are satisfied with?



Author: Craig Berry

Craig Berry is a Catholic web developer and musician.
Connect with him online.