OCR - Optical Character Recognition

20. September 2021 by Carrie Page, updated on 05. November 2024

What is OCR?

OCR is short for optical character recognition. This process recognizes the visual representation of text, for example in an image, and turns it into actual text that can then be edited, copied, changed, etc. It works very well with typed and printed text, but only rarely with handwritten text.

How does optical character recognition work?

OCR can work in two ways: one character at a time or one word at a time. The former is more commonly used, since the latter only works for languages that separate words with spaces.
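To make the difference between the two modes concrete, here is a minimal Python sketch of how a line of text could be cut into characters and then grouped into words. It assumes the line is already a black-and-white bitmap (1 = ink, 0 = background) and that characters are separated by at least one blank column. The function names and the `word_gap` threshold are made up for illustration and do not come from any particular OCR engine.

```python
from typing import List, Tuple

Bitmap = List[List[int]]   # rows of pixels: 1 = ink, 0 = background
Run = Tuple[int, int]      # (start column, end column), end exclusive


def column_runs(bitmap: Bitmap) -> List[Run]:
    """Find column ranges that contain ink; each range is one character candidate."""
    width = len(bitmap[0]) if bitmap else 0
    has_ink = [any(row[x] for row in bitmap) for x in range(width)]
    runs: List[Run] = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                  # a character begins
        elif not ink and start is not None:
            runs.append((start, x))    # a blank column ends the character
            start = None
    if start is not None:
        runs.append((start, width))    # character touches the right edge
    return runs


def group_into_words(runs: List[Run], word_gap: int = 3) -> List[List[Run]]:
    """Group characters into words; a gap of word_gap or more blank columns starts a new word."""
    words: List[List[Run]] = []
    for run in runs:
        if words and run[0] - words[-1][-1][1] < word_gap:
            words[-1].append(run)
        else:
            words.append([run])
    return words


# One-pixel-high toy line: two characters close together, then a wide gap, then one more.
line = [[1, 1, 0, 1, 1, 0, 0, 0, 0, 1]]
characters = column_runs(line)
words = group_into_words(characters)
print(characters)
print(words)
```

Word grouping depends entirely on the width of the gaps, which is why the word-at-a-time approach breaks down for scripts that do not put spaces between words.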

In the beginning, OCR processors were trained to recognize single characters in a specific font. By now, OCR can recognize most serif and sans-serif fonts. Even crooked scans and images that are not 100% straight are interpreted fairly well. This is thanks to the pre-processing many OCR programs do. It includes deskewing (straightening a tilted page), despeckling (removing small specks of noise), turning the scan or image into grayscale, and more.
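Two of these pre-processing steps can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: the image is a nested list of (R, G, B) tuples, grayscale uses the common luminance weights 0.299/0.587/0.114, and despeckling is done with a 3x3 median filter, which is one of several possible techniques. Real OCR programs may work differently, and deskewing is left out here because it needs more machinery.

```python
from statistics import median
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[int]]


def to_grayscale(rgb: RGBImage) -> GrayImage:
    """Convert each (R, G, B) pixel to a single brightness value from 0 to 255."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb
    ]


def despeckle(gray: GrayImage) -> GrayImage:
    """Replace every interior pixel with the median of its 3x3 neighbourhood.

    Isolated specks disappear because they are outvoted by their neighbours.
    Border pixels are copied unchanged to keep the sketch short.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    out = [row[:] for row in gray]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [
                gray[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = int(median(window))
    return out


# A white 3x3 patch with one dark speck in the middle.
white, black = (255, 255, 255), (0, 0, 0)
scan = [
    [white, white, white],
    [white, black, white],
    [white, white, white],
]
cleaned = despeckle(to_grayscale(scan))
print(cleaned[1][1])  # the speck now has the brightness of its surroundings
```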

Optical character recognition use cases

Why would you even need or want to use OCR? Here are a few common use cases:

Create notes based on lecture and presentation slides you took a photo of
Grab text from documents that were scanned as images
Digitize your paperwork and make it searchable for invoice numbers or the like
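Once a document has been run through OCR, searching it is ordinary text processing. The short Python sketch below looks for invoice numbers in recognized text. The `INV-` prefix followed by at least four digits is a purely hypothetical numbering scheme, so the pattern would have to be adapted to the format your own invoices use.

```python
import re
from typing import List

# Hypothetical invoice format: "INV-" followed by four or more digits.
INVOICE_PATTERN = re.compile(r"\bINV-\d{4,}\b")


def find_invoice_numbers(text: str) -> List[str]:
    """Return all invoice numbers found in OCR output, in order of appearance."""
    return INVOICE_PATTERN.findall(text)


ocr_text = "Invoice INV-20211 is due; it replaces INV-0042."
print(find_invoice_numbers(ocr_text))
```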

How to use OCR

Go to the PDF to Word converter of PDF2Go.
Upload your file via drag & drop, or select it from your hard drive, Dropbox or Google Drive.
For text recognition, choose "Convert with OCR". Configure the OCR settings to match your needs.
In the optional settings, choose Microsoft Word (.docx) or Word 2003 or older (.doc) from the dropdown menu.
Click on "START".