Re: [Bug-ocrad] Library version of OCRAD
From: Antonio Diaz Diaz
Subject: Re: [Bug-ocrad] Library version of OCRAD
Date: Tue, 15 Dec 2009 21:16:32 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.11) Gecko/20050905
Igor Filippov wrote:
> character.h.patch - adds result() function for a single-character
> recognition
This same function, under the name "byte_result", has been included in
ocrad at your request since version 0.18-pre3 (2008-08-21). I have found
that we exchanged messages in the past, but it seems neither of us
remembered it. :-)
I see you use ocrad at a low level (Blob), and I see why you do so;
ocrad does not recognize single characters without surrounding white
space. Ocrad was designed for full text pages.
As a first step, I have made ocrad recognize such characters.
Now I am wondering about the bitmap/pixmap format to use for
communication with ocrad. Image format conversion wastes a lot of time,
and osra already seems to do a lot of conversions.
Internally, ocrad first uses a 256-level greymap (page_image.h)
  std::vector< std::vector< unsigned char > > data;
from which it extracts a number of 2-level bitmaps (bitmap.h)
  std::vector< std::vector< unsigned char > > data;
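As an illustration only, extracting a 2-level bitmap from such a greymap
could be sketched as below. The function name "greymap_to_bitmap" and the
fixed threshold are assumptions made for this example; this is not the
thresholding code ocrad actually uses. The 0 = white, 1 = black
convention is the one used for "struct Ocrad_Bitmap" in this message.

```cpp
#include <cstddef>
#include <vector>

// Same container type as the "data" members quoted above.
typedef std::vector< std::vector< unsigned char > > Rows;

// Hypothetical helper, not part of ocrad.
// Pixels darker than 'threshold' become 1 (black); the rest become 0 (white).
Rows greymap_to_bitmap( const Rows & grey, const unsigned char threshold )
  {
  Rows bitmap( grey.size() );
  for( std::size_t row = 0; row < grey.size(); ++row )
    {
    bitmap[row].reserve( grey[row].size() );
    for( std::size_t col = 0; col < grey[row].size(); ++col )
      bitmap[row].push_back( grey[row][col] < threshold ? 1 : 0 );
    }
  return bitmap;
  }
```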
I see osra already uses "struct pixmap" from gocr:
  struct pixmap {
    unsigned char *p;   /* pointer of image buffer (pixmap) */
    int x;              /* xsize */
    int y;              /* ysize */
    int bpp;            /* bytes per pixel: 1=gray 3=rgb */
  };
  typedef struct pixmap pix;
But it may be better to have one struct per image type, like this:
  struct Ocrad_Bitmap
    {
    int height;
    int width;
    const unsigned char * data;   // 0 = white, 1 = black
    };
  struct Ocrad_Greymap
    {
    int height;
    int width;
    const unsigned char * data;   // 256 level greymap
    };
  struct Ocrad_Colormap
    {
    int height;
    int width;
    const unsigned char * data;   // 3 bytes per pixel RGB
    };
The "struct Ocrad_Greymap" can then be easily built from the values in
the "struct pixmap".
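A minimal sketch of that conversion follows. The helper name
"greymap_from_pixmap" is hypothetical, and the sketch assumes the pixmap
buffer is stored row by row without padding. For bpp == 1 the buffer is
shared without copying; for bpp == 3 a plain average of R, G and B is
used, which is an assumption of this example and not something ocrad or
osra prescribes.

```cpp
#include <cstddef>
#include <vector>

// "struct pixmap" from gocr, as quoted in this message.
struct pixmap {
  unsigned char *p;   /* pointer of image buffer (pixmap) */
  int x;              /* xsize */
  int y;              /* ysize */
  int bpp;            /* bytes per pixel: 1=gray 3=rgb */
};

// Struct proposed in this message.
struct Ocrad_Greymap
  {
  int height;
  int width;
  const unsigned char * data;   // 256 level greymap
  };

// Hypothetical helper, not part of ocrad or osra.
// Returns false if the pixmap is empty or has an unsupported bpp.
// 'storage' is only filled for RGB input and must outlive 'out'.
bool greymap_from_pixmap( const pixmap & pm,
                          std::vector< unsigned char > & storage,
                          Ocrad_Greymap & out )
  {
  if( !pm.p || pm.x <= 0 || pm.y <= 0 ) return false;
  out.height = pm.y;
  out.width = pm.x;
  if( pm.bpp == 1 ) { out.data = pm.p; return true; }   // share, no copy
  if( pm.bpp != 3 ) return false;
  const std::size_t pixels = static_cast< std::size_t >( pm.x ) * pm.y;
  storage.resize( pixels );
  for( std::size_t i = 0; i < pixels; ++i )
    {
    const unsigned char * rgb = pm.p + 3 * i;
    storage[i] = static_cast< unsigned char >( ( rgb[0] + rgb[1] + rgb[2] ) / 3 );
    }
  out.data = storage.data();
  return true;
  }
```

In the grey case no image data is copied at all, which avoids one of the
format conversions mentioned above.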
Best regards,
Antonio.