Problem description
How to convert C++ tesseract-ocr code to Python?
I want to convert the C++ "Result iterator example" from the tesseract-ocr docs to Python.
Pix *image = pixRead("/usr/src/tesseract/testing/phototest.tif");
tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
api->Init(NULL, "eng");
api->SetImage(image);
api->Recognize(0);
tesseract::ResultIterator* ri = api->GetIterator();
tesseract::PageIteratorLevel level = tesseract::RIL_WORD;
if (ri != 0) {
  do {
    const char* word = ri->GetUTF8Text(level);
    float conf = ri->Confidence(level);
    int x1, y1, x2, y2;
    ri->BoundingBox(level, &x1, &y1, &x2, &y2);
    printf("word: '%s'; \tconf: %.2f; BoundingBox: %d,%d,%d,%d;\n",
           word, conf, x1, y1, x2, y2);
    delete[] word;
  } while (ri->Next(level));
}
What I have managed so far is the following:
import ctypes
liblept = ctypes.cdll.LoadLibrary('liblept-5.dll')
pix = liblept.pixRead('11.png'.encode())
print(pix)
tesseractLib = ctypes.cdll.LoadLibrary(r'C:\Program Files\tesseract-OCR\libtesseract-4.dll')
tesseractHandle = tesseractLib.TessBaseAPICreate()
tesseractLib.TessBaseAPIInit3(tesseractHandle, '.', 'eng')
tesseractLib.TessBaseAPISetImage2(tesseractHandle, pix)
#tesseractLib.TessBaseAPIRecognize(tesseractHandle, tesseractLib.TessMonitorCreate())
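I cannot convert the C++ call api->Recognize(0) to Python. What I tried is the last (commented-out) line of the code, but it is wrong. I have no experience with C++, so I cannot get any further. Can anyone help with the conversion? API: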
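Projects like tess4j and tesserocr could spare me the conversion, but the problem is that they do not provide up-to-date Python wheels for Windows, which is the main reason I am doing the conversion myself.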
Reference solution
Method 1:
I think the problem is that api->Recognize() expects a pointer as its first argument. They mistakenly put a 0 in their example, but it should be nullptr. 0 and nullptr both have the same value, but on 64-bit systems they don't have the same size (usually; I assume on some weird non-x86 systems this may not be true either).
Their example still works with a C++ compiler because the compiler is aware that the function expects a pointer (64 bits) and fixes it silently.
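The size difference can be checked from Python itself. The following is a minimal sketch using only the standard ctypes module; the values printed depend on the platform, and a typical 64-bit build would likely report 4 bytes for the integer and 8 for the pointer.

```python
import ctypes

# Size in bytes of a C int versus a C pointer on this interpreter's platform.
int_size = ctypes.sizeof(ctypes.c_int)
ptr_size = ctypes.sizeof(ctypes.c_void_p)
print(f"int: {int_size} bytes, pointer: {ptr_size} bytes")

# ctypes represents a NULL pointer as a c_void_p whose value is None.
null_ptr = ctypes.c_void_p(None)
print("NULL pointer value:", null_ptr.value)
```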
In your example, it seems you haven't specified the exact prototype of TessBaseAPIRecognize() to ctypes. So ctypes can't know that a pointer (64 bits) is expected by this function. Instead it assumes that the function expects an integer (32 bits), and so it crashes.
My suggestions:
- Use ctypes.c_void_p(None) instead of 0
- If you intend to use that in production, specify to ctypes all the function prototypes
- Be careful with the examples you look at: Those examples use Tesseract base API (C++ API) whereas if you want to use libtesseract with Python + ctypes, you have to use Tesseract C API. Those 2 APIs are very similar but may not be identical.
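The first two suggestions could be sketched as follows. This is an illustration rather than a tested port: the helper names `declare_prototypes` and `recognize_words` are hypothetical, and the signatures are written from my reading of Tesseract's C API header (capi.h) and Leptonica, so they should be checked against the installed versions. Note that in the C API the bounding box is read through a page iterator obtained from the result iterator, and that text returned by the library is released with TessDeleteText.

```python
import ctypes

RIL_WORD = 3  # RIL_WORD in the TessPageIteratorLevel enum (assumed from capi.h)


def declare_prototypes(tess, lept):
    """Declare return and argument types so ctypes passes 64-bit pointers intact."""
    vp, cp, ci = ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int
    int_p = ctypes.POINTER(ctypes.c_int)
    tess_protos = {
        "TessBaseAPICreate": (vp, []),
        "TessBaseAPIInit3": (ci, [vp, cp, cp]),
        "TessBaseAPISetImage2": (None, [vp, vp]),
        "TessBaseAPIRecognize": (ci, [vp, vp]),
        "TessBaseAPIGetIterator": (vp, [vp]),
        "TessResultIteratorGetPageIterator": (vp, [vp]),
        # c_void_p (not c_char_p) keeps the raw pointer so it can be freed later.
        "TessResultIteratorGetUTF8Text": (vp, [vp, ci]),
        "TessResultIteratorConfidence": (ctypes.c_float, [vp, ci]),
        "TessResultIteratorNext": (ci, [vp, ci]),
        "TessPageIteratorBoundingBox": (ci, [vp, ci, int_p, int_p, int_p, int_p]),
        "TessResultIteratorDelete": (None, [vp]),
        "TessDeleteText": (None, [vp]),
        "TessBaseAPIEnd": (None, [vp]),
        "TessBaseAPIDelete": (None, [vp]),
    }
    for name, (restype, argtypes) in tess_protos.items():
        fn = getattr(tess, name)
        fn.restype, fn.argtypes = restype, argtypes
    lept.pixRead.restype, lept.pixRead.argtypes = vp, [cp]
    lept.pixDestroy.restype = None
    lept.pixDestroy.argtypes = [ctypes.POINTER(vp)]


def recognize_words(tess, lept, image_path, lang=b"eng"):
    """Return a list of (text, confidence, (x1, y1, x2, y2)), one per word."""
    words = []
    pix = lept.pixRead(image_path)  # image_path must be bytes
    if not pix:
        raise OSError("pixRead could not load the image")
    api = tess.TessBaseAPICreate()
    try:
        # None as datapath lets Tesseract fall back to its default tessdata path.
        if tess.TessBaseAPIInit3(api, None, lang) != 0:
            raise RuntimeError("TessBaseAPIInit3 failed")
        tess.TessBaseAPISetImage2(api, pix)
        # None is converted to a NULL pointer: the equivalent of Recognize(0).
        if tess.TessBaseAPIRecognize(api, None) != 0:
            raise RuntimeError("TessBaseAPIRecognize failed")
        ri = tess.TessBaseAPIGetIterator(api)
        if ri:
            try:
                pi = tess.TessResultIteratorGetPageIterator(ri)
                while True:
                    ptr = tess.TessResultIteratorGetUTF8Text(ri, RIL_WORD)
                    text = ctypes.string_at(ptr).decode("utf-8") if ptr else ""
                    if ptr:
                        tess.TessDeleteText(ptr)  # replaces the C++ delete[]
                    conf = tess.TessResultIteratorConfidence(ri, RIL_WORD)
                    box = [ctypes.c_int() for _ in range(4)]
                    tess.TessPageIteratorBoundingBox(
                        pi, RIL_WORD, *[ctypes.byref(b) for b in box])
                    words.append((text, conf, tuple(b.value for b in box)))
                    if not tess.TessResultIteratorNext(ri, RIL_WORD):
                        break
            finally:
                tess.TessResultIteratorDelete(ri)
    finally:
        tess.TessBaseAPIEnd(api)
        tess.TessBaseAPIDelete(api)
        lept.pixDestroy(ctypes.byref(ctypes.c_void_p(pix)))
    return words
```

With the libraries from the question loaded through ctypes.cdll.LoadLibrary (liblept-5.dll and libtesseract-4.dll), one would call `declare_prototypes(tesseractLib, liblept)` once and then `recognize_words(tesseractLib, liblept, b'11.png')`. All strings are passed as bytes, since the C functions expect `const char*`.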
If you need further help, you can have a look at how things are done in PyOCR. If you decide to use PyOCR in your project, just beware that the license of PyOCR is GPLv3+, which implies some restrictions.
(by iMath, Jerome Flesch)