Pytesseract or Keras OCR to extract text from image


Problem description

I am trying to extract text from images. Currently I get an empty string as output. Below is my pytesseract code, although I am also open to Keras OCR:

from PIL import Image
import pytesseract

path = 'captcha.svg.png'
img = Image.open(path)
captchaText = pytesseract.image_to_string(img, lang='eng', config='--psm 6')

I am not sure how to work with SVG images, so I converted them to PNG. Below are some sample images:

SVG image converted to PNG

keras-ocr returns nothing here because the image is grayscale (I found that it works otherwise). See below:

from PIL import Image

a = Image.open('/content/gD7vA.png')  # keras-ocr returns nothing for this file
a.mode, a.split()  # mode LA: one grayscale channel + alpha (transparency) layer

b = Image.open('/content/CYegU.png')  # keras-ocr returns a result for this file
b.mode, b.split()  # mode RGBA: RGB + alpha (transparency) layer

In the above, a is the file from your question; as shown, it has two channels: grayscale and a transparency (alpha) layer. b is the file after I converted it to RGB or RGBA. The transparency layer was already in your original file and I didn't remove it, though it seems unnecessary to keep unless you need it. In short, to make your input work with keras-ocr, first convert your files to RGB (or RGBA) and save them to disk, then pass them to the OCR pipeline.

# Using PIL to convert one mode to another 
# and save on disk
c = Image.open('/content/gD7vA.png').convert('RGBA')
c.save('....png')  # output path elided in the original
c.mode, c.split()

('RGBA',
 (<PIL.Image.Image image mode=L size=150x50 at 0x7F03E8E7A410>,
  <PIL.Image.Image image mode=L size=150x50 at 0x7F03E8E7A590>,
  <PIL.Image.Image image mode=L size=150x50 at 0x7F03E8E7A810>,
  <PIL.Image.Image image mode=L size=150x50 at 0x7F03E8E7A110>))
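
When there are several files to convert, the step above can be wrapped in a small helper. This is only a minimal sketch, assuming Pillow is installed and the output is a PNG; the name ensure_rgb is hypothetical and is not part of Pillow or keras-ocr.

```python
from PIL import Image


def ensure_rgb(src_path, dst_path):
    """Save an RGB or RGBA copy of src_path at dst_path and return its mode.

    Hypothetical helper for illustration, not an API of Pillow or keras-ocr.
    """
    with Image.open(src_path) as img:
        # Keep the alpha channel when the source has one (e.g. LA or RGBA),
        # otherwise convert to plain RGB.
        has_alpha = img.mode in ("LA", "PA", "RGBA") or "transparency" in img.info
        target_mode = "RGBA" if has_alpha else "RGB"
        converted = img.convert(target_mode)
    converted.save(dst_path)
    return converted.mode
```

For a grayscale-plus-alpha (LA) input like the one in the question, this should write an RGBA file, which can then be read with keras_ocr.tools.read as in the full code.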

Full code

import matplotlib.pyplot as plt
import keras_ocr

# keras-ocr will automatically download pretrained
# weights for the detector and recognizer.
pipeline = keras_ocr.pipeline.Pipeline()

# Read the four converted example images
images = [
    keras_ocr.tools.read(path) for path in [
        '/content/CYegU.png',  # mode: RGBA; plain RGB should work too
        '/content/bw6Eq.png',  # mode: RGBA
        '/content/jH2QS.png',  # mode: RGBA
        '/content/xbADG.png',  # mode: RGBA
    ]
]

# Each list of predictions in prediction_groups is a list of
# (word, box) tuples.
prediction_groups = pipeline.recognize(images)

Looking for /root/.keras-ocr/craft_mlt_25k.h5
Looking for /root/.keras-ocr/crnn_kurapan.h5

prediction_groups

[[('zum', array([[ 10.658852,  15.11916 ],
          [148.90204 ,  13.144257],
          [149.39563 ,  47.694347],
          [ 11.152428,  49.66925 ]], dtype=float32))],
 [('sresa', array([[  5.,  15.],
          [143.,  15.],
          [143.,  48.],
          [  5.,  48.]], dtype=float32))],
 [('sycw', array([[ 10.,  15.],
          [149.,  15.],
          [149.,  49.],
          [ 10.,  49.]], dtype=float32))],
 [('vdivize', array([[ 10.407883,  13.685192],
          [140.62648 ,  16.940662],
          [139.82323 ,  49.070583],
          [  9.604624,  45.815113]], dtype=float32))]]
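
Since each prediction is a (word, box) tuple, the text alone can be pulled out of prediction_groups with a few lines. The following is a rough sketch, not part of the original answer: words_left_to_right is a hypothetical helper, and sample_group only mimics the shape of the first result above, using plain lists in place of NumPy arrays.

```python
def words_left_to_right(predictions):
    """Return the recognized words ordered by the left edge of their boxes.

    Hypothetical helper; each prediction is a (word, box) tuple where box
    holds four (x, y) corner points, as in the keras-ocr output above.
    """
    def left_edge(prediction):
        _word, box = prediction
        return min(point[0] for point in box)

    return [word for word, _box in sorted(predictions, key=left_edge)]


# Same shape as the first entry of prediction_groups shown above.
sample_group = [
    ("zum", [[10.658852, 15.11916], [148.90204, 13.144257],
             [149.39563, 47.694347], [11.152428, 49.66925]]),
]

# One string per image; words within an image are joined with spaces.
texts = [" ".join(words_left_to_right(group)) for group in [sample_group]]
print(texts)  # ['zum']
```

With the real results, the list comprehension would iterate over prediction_groups instead of [sample_group].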

Display

# Plot the predictions
fig, axs = plt.subplots(nrows=len(images), figsize=(20, 20))
for ax, image, predictions in zip(axs, images, prediction_groups):
    keras_ocr.tools.drawAnnotations(image=image, predictions=predictions, ax=ax)

(by Abhash Upadhyaya, M.Innat)

References

  1. Pytesseract or Keras OCR to extract text from image (CC BY-SA 2.5/3.0/4.0)

#tesseract #ocr #python-tesseract #Keras #deep-learning





