OCR Processor Troubleshooting
29 Mar 2023
Exception | Tesseract has not been initialized exception. |
---|---|
Reason | The exception may occur if the Tesseract binaries and tessdata files are not available at the provided path. |
Solution 1 | Provide the correct paths to the Tesseract binaries and the tessdata folder, and make sure both contain all their files and inner folders. The tessdata folder name is case-sensitive and must not be changed. |
Solution 2 | Ensure that your tessdata files are version 3.02, because the OCR processor is built with Tesseract version 3.02. |
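
Before passing the tessdata path to the OCR processor, the folder can be checked with a small script. The following is a minimal sketch in Python, kept separate from the .NET application; the function name `check_tessdata` is hypothetical, and the `.traineddata` extension check assumes standard Tesseract language files. It compares the folder name exactly, because a case-insensitive file system would otherwise accept a name such as `Tessdata`.

```python
import os

def check_tessdata(parent_dir):
    """Return a list of problems found with the tessdata folder under parent_dir."""
    problems = []
    # Compare names exactly: a case-insensitive file system would otherwise
    # treat "Tessdata" or "TESSDATA" as a match.
    entries = os.listdir(parent_dir)
    if "tessdata" not in entries:
        lookalikes = [e for e in entries if e.lower() == "tessdata"]
        if lookalikes:
            problems.append(f"folder is named {lookalikes[0]!r}, expected 'tessdata'")
        else:
            problems.append("no 'tessdata' folder found")
        return problems
    tessdata = os.path.join(parent_dir, "tessdata")
    if not os.path.isdir(tessdata):
        problems.append("'tessdata' exists but is not a folder")
        return problems
    # Assumption: language data is stored as <code>.traineddata files.
    trained = [f for f in os.listdir(tessdata) if f.endswith(".traineddata")]
    if not trained:
        problems.append("no .traineddata files in 'tessdata'")
    return problems
```

An empty list means the folder name and contents look plausible. The script does not verify that the data files are version 3.02.
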
Exception | Exception has been thrown by the target of an invocation. |
---|---|
Reason | The exception may occur if the Tesseract binaries are not in the required structure. |
Solution | To resolve this exception, ensure that the Tesseract binaries are in the following structure. The Tesseract binaries path is TesseractBinaries/Windows, and the assemblies should be arranged as follows: 1. TesseractBinaries/Windows/x64/libletpt1753.dll, libSyncfusionTesseract.dll 2. TesseractBinaries/Windows/x86/libletpt1753.dll, libSyncfusionTesseract.dll |
Reason 1 | The exception may occur if the Tesseract binaries or Tesseract data assemblies are missing or do not match those required by the OCR processor. |
Reason 2 | The exception may occur if the VC++ 2015 redistributable files are missing on the machine where the OCR processing takes place. |
Solution | Install the VC++ 2015 redistributable files on that machine to resolve the exception. Select both files and install them. Download links: Visual C++ 2015 Redistributable file, Visual C++ 2015 Redistributable file |
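
The folder structure described above can be verified before the OCR processor runs. The following is a minimal sketch in Python; the function name `missing_binaries` is hypothetical, and the file names are copied verbatim from the structure listed in the table, so adjust them if your package ships different names.

```python
import os

# File and folder names copied from the structure listed above.
EXPECTED_FILES = ("libletpt1753.dll", "libSyncfusionTesseract.dll")
ARCH_FOLDERS = ("x64", "x86")

def missing_binaries(root):
    """Return the expected file paths that are missing under root.

    root is the binaries folder, for example "TesseractBinaries/Windows".
    """
    missing = []
    for arch in ARCH_FOLDERS:
        for name in EXPECTED_FILES:
            path = os.path.join(root, arch, name)
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

Calling `missing_binaries("TesseractBinaries/Windows")` returns an empty list when all four files are in place. The check covers only the layout; it cannot detect a missing VC++ 2015 redistributable.
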
Exception | Can't be opened because the developer's identity cannot be confirmed. |
---|---|
Reason | This error may occur during the initial loading of the OCR processor in Mac environments. |
Solution | To resolve this issue, refer to this link for more details. |
Exception | The OCR processor doesn't process languages other than English. |
---|---|
Reason | This issue may occur if the input image contains text in other languages and the language data (tessdata) for those languages is unavailable. |
Solution | Essential PDF supports all the languages that the Tesseract engine supports in the OCR processor. The dictionary packs for the languages can be downloaded from the following online location: https://code.google.com/p/tesseract-ocr/downloads/list It is also mandatory to set the corresponding language code in the OCRProcessor.Settings.Language property. For example, to perform optical character recognition in German, set the property as follows: processor.Settings.Language = "deu"; |
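
Before setting a language code, it can help to confirm that the matching data file is present in the tessdata folder. The following is a minimal sketch in Python; the function name `missing_language_data` is hypothetical, and it assumes each language pack provides a file named `<code>.traineddata` (for example, `deu.traineddata` for German).

```python
import os

def missing_language_data(tessdata_dir, language_codes):
    """Return the language codes that have no <code>.traineddata file in tessdata_dir."""
    return [
        code
        for code in language_codes
        if not os.path.isfile(os.path.join(tessdata_dir, f"{code}.traineddata"))
    ]
```

For example, `missing_language_data("tessdata", ["eng", "deu"])` returns `["deu"]` when only the English data file has been copied into the folder.
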
Issue | Text is not recognized properly when performing OCR on a PDF document with low-quality images. |
---|---|
Reason | Low-quality images in the input PDF document may be the cause of this issue. |
Solution | Using the best tessdata (https://github.com/tesseract-ocr/tessdata_best) can improve the OCR results. Note: For better performance, use the fast tessdata (https://github.com/tesseract-ocr/tessdata_fast). |