[Bug-ocrad] Technical documentation summary readme.txt, page skew, Hough

From:	Chris K. Skinner
Subject:	[Bug-ocrad] Technical documentation summary readme.txt, page skew, Hough transform.
Date:	Thu, 23 Feb 2006 12:35:05 -0500

This will be long, so please be patient and see if you can read all of this...

I'm interested in various aspects of computer science, including image processing, neural networks, expert systems, and semantic analysis.  I've got books on such topics.

As you are probably aware, there are software patents in some countries.

If you had some kind of outline of the algorithms that were applied per version of the software, that would greatly help someone new coming in fresh off the street to gain a quicker understanding of stuff in general, and would probably demonstrate to the world at large that you have invented something new that could not be patented / stolen / claimed by some greedy corporate dudes.

I have just downloaded and tried to compile your source, but it failed. Now, to understand your work, I have to look into each source file to try to work out what is being done. To see what was done in previous versions that either did not work as intended or was tried and abandoned, I would have to repeat this analysis on your older version sources and compare them to the current version.

Do you have any design notes, bibliographic citations, web links to information that you've made use of, or release notes covering which algorithms are being used / abandoned?


I've been using the following OCR software since about 1993:
WinFax, Calera WordScan Plus, Caere OmniPage.

From my experience with these, the number of OCR errors goes way up if the page skew (orientation, angle, rotation, rotate) is not exactly aligned to zero degrees. When the angle is off, the bounding boxes around each page element, each column, each line of text, and each character are wrongly positioned, which creates huge numbers of recognition errors. Consider that with a high-resolution scan, recognition should probably improve, because the image is rich with redundant clues as to what is present on the page.

But the long horizontal lines of text then become very long "sets of stripes of pixels." With such long stripes, the error from page skew is likely to be much higher than one or two pixels. If the recognition algorithms do not account for this, and instead determine bounding box regions for recognition too early and presume a page skew angle of zero, the results are very bad.
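
To put a rough number on that drift: over a text line of a given width, a skew angle displaces the far end of the baseline by about width * tan(angle). The small sketch below is only that arithmetic; the function name and the 6.5 inch / 300 dpi figures are my own illustrative assumptions, not anything taken from ocrad or the programs named above.

```python
import math

def baseline_drift_pixels(line_width_px, skew_deg):
    """Vertical drift of a text baseline across its width for a given skew.

    Hypothetical helper for illustration: simple trigonometry, assuming the
    whole page is rotated rigidly by skew_deg degrees.
    """
    return line_width_px * math.tan(math.radians(skew_deg))

# Assumed example: a 6.5 inch wide text line scanned at 300 dpi is 1950 px wide.
width_px = int(6.5 * 300)
drift = baseline_drift_pixels(width_px, 1.0)
print(f"{drift:.1f} px of drift across {width_px} px at 1 degree of skew")
```

At one degree of skew this comes to roughly 34 pixels, which is most of the height of a 10-point text line (about 42 pixels at 300 dpi), so an axis-aligned box around one line starts to cut into its neighbours.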

In the J. R. Parker book w/CD-ROM "Algorithms For Image Processing And Computer Vision" that I have read, the author provides a couple of algorithm suggestions for combating the page skew angle issue. A Hough transform, when applied to the points at the bottoms of the bounding boxes of glyphs, yields a page skew angle in degrees (with his source code, that is). By applying an image rotation that eliminates the skew, better recognition should result. (Unfortunately, he does not present the source code for determining the bounding boxes of glyphs, so it is not easy to demonstrate that this algorithm will work, especially on larger regions of text.)
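
A rough sketch of that Hough-style skew estimate, written from the description above and not from Parker's code or from ocrad: each bottom-of-box point votes, for every candidate angle, into a bin for the baseline it would lie on, and the angle whose votes are most concentrated wins. The function names, the +/-5 degree search range, the 0.1 degree step, and the sum-of-squares score are all my own assumptions; the input is assumed to be (x, y) points already extracted from glyph bounding boxes.

```python
import math
from collections import Counter

def estimate_skew_degrees(points, max_angle=5.0, angle_step=0.1, rho_step=1.0):
    """Estimate page skew from baseline points with a Hough-style vote.

    A line tilted by angle a satisfies y*cos(a) - x*sin(a) = constant, so at
    the right angle all points of one text line fall into the same rho bin.
    The angle is atan(dy/dx) in whatever coordinate system the points use.
    """
    best_angle, best_score = 0.0, -1
    steps = int(round(2 * max_angle / angle_step))
    for i in range(steps + 1):
        angle = -max_angle + i * angle_step
        a = math.radians(angle)
        cos_a, sin_a = math.cos(a), math.sin(a)
        votes = Counter(round((y * cos_a - x * sin_a) / rho_step)
                        for x, y in points)
        # Sum of squared bin counts rewards a few heavily filled bins.
        score = sum(v * v for v in votes.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle

def synthetic_baseline_points(skew_deg, offsets=(100.0, 150.0, 200.0)):
    """Made-up test data: three parallel text baselines tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return [(float(x), c + x * slope)
            for c in offsets for x in range(0, 1001, 20)]

print(estimate_skew_degrees(synthetic_baseline_points(2.0)))
```

On the synthetic points the estimate should come out close to 2.0 degrees. Real glyph bottoms are noisier (descenders, punctuation, speckle), so a practical version would likely need coarser bins or some filtering of the points first.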

Another approach is to use angle-independent Complex-Number-Coefficient Neural Networks as feature recognizers. The Japanese promoter of these neural networks says that they are insensitive to affine transforms, and can thereby recognize a pattern that has been so transformed.
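
I can't show how those networks work, but a small sketch hints at why complex numbers are a natural fit for rotated patterns. If each point of a glyph is stored as a complex number z, then rotating, scaling, and shifting the glyph is just z' = a*z + b, and a description built from ratios of centred points does not change under that map. This is only an illustration under my own assumptions (hypothetical function names, a made-up five-point "glyph"), and it covers rotation, scale, and translation only; a general affine transform also allows shear, which this does not handle.

```python
import cmath
import math

def similarity_transform(points, angle_deg, scale=1.0, shift=0j):
    """Rotate, scale, and translate complex points: z' = a*z + b."""
    a = scale * cmath.exp(1j * math.radians(angle_deg))
    return [a * z + shift for z in points]

def normalized_shape(points):
    """Describe a point set relative to its centroid and its first point.

    Subtracting the centroid cancels the shift b; dividing by the first
    centred point cancels the complex factor a (rotation and scale).
    Assumes the first point does not coincide with the centroid.
    """
    centroid = sum(points) / len(points)
    centered = [z - centroid for z in points]
    ref = centered[0]
    return [z / ref for z in centered]

# Made-up outline points standing in for a glyph.
glyph = [0 + 0j, 4 + 0j, 4 + 6j, 0 + 6j, 1 + 3j]
skewed = similarity_transform(glyph, angle_deg=7.5, scale=1.3, shift=20 - 5j)

same = all(abs(p - q) < 1e-9
           for p, q in zip(normalized_shape(glyph), normalized_shape(skewed)))
print("descriptor unchanged by rotation/scale/shift:", same)
```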

http://mathworld.wolfram.com/AffineTransformation.html
http://www.google.ca/search?num=20&hl=en&newwindow=1&safe=off&q=Affine+Complex-Number+Coefficient+Neural+Networks

This too is just a theory. I don't have a copy of any books on Complex-Number-Coefficient Neural Networks, or any source code from a competent mathematician who has converted the advanced mathematics into working C++ code examples. Often these theoreticians are not interested in the practical applications of their work and are more interested in the expressions of their ideas as continuous functions expressed as N-dimensional differential equations (or something much less understandable to me anyway).

Thanks for any help that you could provide in helping me understand your project, so that I might possibly provide you with suggestions for improvements.


Kindest regards, C.
