OCR: Not getting desired result


Problem description


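I have this image: [image]. I am trying to OCR the letters in it. For the letters "9" and "R" I am not getting the desired result. First I cropped these letters ([image] and [image]) and ran the following command.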

tesseract 9.png stdout -psm 8
.

It just returns "."

OCR works fine for all the other letters, but not for these two (although I don't think their image quality is that bad). Any suggestions/help appreciated.


Reference solutions

Method 1:

I have no experience with Tesseract myself, but replicating the character and adding some background works on https://www.newocr.com/, which uses Tesseract internally according to a Google search result.

So I used this as input:

[image]

This gives the correct result on that web app, 99999999, whereas the single character does not work. Maybe you can verify this with your Tesseract setup, and maybe it helps you adjust your isolated, extracted characters so that Tesseract handles them. For example, try stitching several duplicates of your extracted contour next to each other to improve Tesseract's output. Since you know how many copies you stitched together, you can treat the output as probably correct if Tesseract recognizes the same character exactly that many times.
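The "same character that many times" check could be sketched as below. This is only an illustration: the function name `consensusCharacter` is hypothetical, and it assumes you already have Tesseract's output as a string (for example, captured from stdout). Whitespace such as the trailing newline is ignored.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns the repeated character if `ocrOutput`, ignoring whitespace,
// consists of exactly `nRepetitions` copies of one character.
// Returns '\0' otherwise (wrong count or mixed characters).
char consensusCharacter(const std::string& ocrOutput, int nRepetitions)
{
    assert(nRepetitions > 0);

    // Drop spaces and newlines that OCR engines commonly add.
    std::string cleaned;
    for (unsigned char c : ocrOutput)
    {
        if (!std::isspace(c))
            cleaned.push_back(static_cast<char>(c));
    }

    if (cleaned.size() != static_cast<std::size_t>(nRepetitions))
        return '\0';

    for (char c : cleaned)
    {
        if (c != cleaned[0])
            return '\0';
    }
    return cleaned[0];
}
```

With 8 stitched copies, an output of "99999999" would be accepted as '9', while "9999" or "RRPR" would be rejected, which signals that the preprocessing still needs adjusting.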

The same works for:

[image]

The border looks important: without enough border it will recognize a P. In general, as far as I know, you should try to replace background and foreground with pure black and pure white. I'm not sure what kind of preprocessing the web app uses.
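Forcing pixels to pure black and pure white might look like the sketch below. It is written against a hypothetical minimal `GrayImage` type so it stays self-contained; with OpenCV you would more likely call `cv::threshold`, possibly with `cv::THRESH_OTSU` to pick the threshold automatically. The fixed `threshold` value and the `invert` flag are assumptions for illustration, and which polarity works better is worth testing on your own images.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: 8-bit grayscale, row-major.
struct GrayImage
{
    int width;
    int height;
    std::vector<std::uint8_t> pixels; // size == width * height
};

// Maps every pixel to pure black (0) or pure white (255).
// Pixels brighter than `threshold` become white, unless `invert` is set,
// in which case they become black and all other pixels become white.
GrayImage binarize(const GrayImage& input, std::uint8_t threshold, bool invert = false)
{
    assert(input.pixels.size() == static_cast<std::size_t>(input.width) * input.height);

    GrayImage output = input;
    for (std::uint8_t& p : output.pixels)
    {
        const bool bright = p > threshold;
        p = (bright != invert) ? 255 : 0;
    }
    return output;
}
```

Calling `binarize(image, 127)` keeps the original polarity, and `binarize(image, 127, true)` swaps foreground and background.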

The following code repeats an image using C++ and OpenCV. The first version did not add a border around the result; adding one works very similarly but needs a few extra steps, and you have to assign a color to the border.

EDIT: I've updated the code to use a border of 4 pixels in each direction (you can adjust the variable) with a black background color.

The code is very simple and should look very similar with OpenCV in Java, Python, etc.

#include <cstdlib>
#include <iostream>
#include <string>

#include <opencv2/opencv.hpp>

int main(int argc, char * argv[])
{
    if(argc != 3)
    {
        std::cout << "usage: .exe filename #Repetitions" << std::endl;
        return 1;
    }

    std::string filename = argv[1];
    int nRepetitions = std::atoi(argv[2]);
    if(nRepetitions < 1)
    {
        std::cout << "#Repetitions must be a positive integer" << std::endl;
        return 1;
    }

    cv::Mat inputImage = cv::imread(filename);
    if(inputImage.empty())
    {
        std::cout << "image file " << filename << " could not be loaded" << std::endl;
        return 1;
    }

    // you instead should try to extract the background color from the image (e.g. from the image border)
    cv::Scalar backgroundColor(0,0,0);

    // size of the border in each direction (around the whole strip, not between repetitions)
    int border = 4;

    // canvas filled with the background color, large enough for all repetitions plus the border
    cv::Mat repeatedImage = cv::Mat(inputImage.rows + 2*border, nRepetitions*inputImage.cols + 2*border, inputImage.type() , backgroundColor);

    // region of the canvas that receives the next copy of the input image
    cv::Rect roi = cv::Rect(border,border,inputImage.cols, inputImage.rows);

    for(int i=0; i<nRepetitions; ++i)
    {
        // copy original image to subimage of repeated image
        inputImage.copyTo(repeatedImage(roi));

        // update roi position
        roi.x += roi.width;
    }

    // now here you could send your repeated image to tesseract library and test whether nRepetitions times a letter was found.

    cv::imwrite("repeatedImage.png", repeatedImage);
    cv::imshow("repeated image" , repeatedImage);
    cv::waitKey(0);
    return 0;
}

This gives the following result:

[image]

Method 2:

I had a tiny bit more success than you. I did a "Connected Components Analysis" to extract the individual letters, then put a border around each extracted letter and appended them all into a single horizontal line, which gave me this:

[image]
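The answer does not say which tool performed the connected components analysis. As a rough sketch of the idea, the function below finds the bounding box of each 8-connected group of foreground pixels with a breadth-first flood fill. The `Box` type and the name `findComponentBoxes` are hypothetical, and the input is assumed to be already binarized (0 for background, non-zero for foreground). With OpenCV, `cv::connectedComponentsWithStats` provides comparable information.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

struct Box { int x, y, width, height; };

// Finds bounding boxes of 8-connected groups of foreground (non-zero) pixels
// in a binarized, row-major image. Boxes are returned sorted left to right.
std::vector<Box> findComponentBoxes(const std::vector<std::uint8_t>& pixels, int width, int height)
{
    assert(pixels.size() == static_cast<std::size_t>(width) * height);

    std::vector<bool> visited(pixels.size(), false);
    std::vector<Box> boxes;

    for (int startY = 0; startY < height; ++startY)
    {
        for (int startX = 0; startX < width; ++startX)
        {
            const std::size_t startIndex = static_cast<std::size_t>(startY) * width + startX;
            if (pixels[startIndex] == 0 || visited[startIndex])
                continue;

            // Breadth-first flood fill over one component, tracking its extent.
            int minX = startX, maxX = startX, minY = startY, maxY = startY;
            std::queue<std::pair<int, int> > frontier;
            frontier.push(std::make_pair(startX, startY));
            visited[startIndex] = true;

            while (!frontier.empty())
            {
                const int x = frontier.front().first;
                const int y = frontier.front().second;
                frontier.pop();

                minX = std::min(minX, x); maxX = std::max(maxX, x);
                minY = std::min(minY, y); maxY = std::max(maxY, y);

                // Visit the 8 neighbours (the centre pixel is already visited).
                for (int dy = -1; dy <= 1; ++dy)
                {
                    for (int dx = -1; dx <= 1; ++dx)
                    {
                        const int nx = x + dx;
                        const int ny = y + dy;
                        if (nx < 0 || ny < 0 || nx >= width || ny >= height)
                            continue;
                        const std::size_t index = static_cast<std::size_t>(ny) * width + nx;
                        if (pixels[index] != 0 && !visited[index])
                        {
                            visited[index] = true;
                            frontier.push(std::make_pair(nx, ny));
                        }
                    }
                }
            }
            boxes.push_back(Box{minX, minY, maxX - minX + 1, maxY - minY + 1});
        }
    }

    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.x < b.x; });
    return boxes;
}
```

Each returned box can then be cropped out as one letter. Letters made of several disconnected parts, or letters touching each other, would need extra handling that this sketch does not attempt.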

And if I then run tesseract I get:

VQQTRF
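The step of putting a border around each extracted letter and appending the letters into one line could be sketched as follows. It uses a hypothetical `GrayImage` type and function name `padAndAppend`, and it assumes top alignment and a uniform border width, neither of which the answer specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: 8-bit grayscale, row-major.
struct GrayImage
{
    int width;
    int height;
    std::vector<std::uint8_t> pixels; // size == width * height
};

// Surrounds every glyph with `border` background pixels and places the
// padded glyphs side by side, top-aligned, on a single line.
GrayImage padAndAppend(const std::vector<GrayImage>& glyphs, int border, std::uint8_t background)
{
    assert(border >= 0);

    // The line is as wide as all padded glyphs together and as tall as the tallest one.
    int totalWidth = 0;
    int maxHeight = 0;
    for (const GrayImage& g : glyphs)
    {
        assert(g.pixels.size() == static_cast<std::size_t>(g.width) * g.height);
        totalWidth += g.width + 2 * border;
        maxHeight = std::max(maxHeight, g.height + 2 * border);
    }

    GrayImage line{totalWidth, maxHeight,
                   std::vector<std::uint8_t>(static_cast<std::size_t>(totalWidth) * maxHeight, background)};

    int offsetX = 0;
    for (const GrayImage& g : glyphs)
    {
        // Copy the glyph into its slot, shifted by the border on the left and top.
        for (int y = 0; y < g.height; ++y)
        {
            for (int x = 0; x < g.width; ++x)
            {
                const std::size_t src = static_cast<std::size_t>(y) * g.width + x;
                const std::size_t dst = static_cast<std::size_t>(y + border) * totalWidth + (offsetX + border + x);
                line.pixels[dst] = g.pixels[src];
            }
        }
        offsetX += g.width + 2 * border;
    }
    return line;
}
```

Because every glyph carries its own border, neighbouring letters end up separated by twice the border width, unlike the repetition code in Method 1, where the border surrounds only the whole strip.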

(by Bhushan, Micka, Mark Setchell)

References

  1. OCR : Not getting desired result (CC BY-SA 2.5/3.0/4.0)

#image-processing #tesseract #ocr #tess4j #OpenCV





