I am thinking of creating a web system in which many different (random) people upload scanned documents of things they wrote by hand.
Is there an open-source way, usable from PHP, to convert this handwritten text to machine text?
I found this question, but would like to know whether it is capable of recognizing the handwriting of many random, different people.
Does anyone have experience in this field to share?
Answer
See the related question on SO: handwriting recognition with simple training
Image-based handwriting recognition is also known as Off-line handwriting recognition.
If the handwritten characters are always capital-letter, post-office style, it can be handled by Intelligent Character Recognition (ICR), which is image-based.
The difference between off-line (image-based) and on-line (real-time) recognition is that the latter requires you to record the position and timestamp (and hence the velocity) of each stroke as it is being written.
An image-based recognition engine can handle on-line data by converting the timestamped strokes into an image. Conversely, an on-line recognition engine cannot handle image-based input. Because on-line data carries this extra stroke information, on-line recognition is technically easier, and open-source projects are available.
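To make the conversion of timestamped strokes into an image concrete, here is a minimal sketch in Python. The stroke format (a list of `(x, y, t)` samples), the function names `draw_line` and `strokes_to_image`, and the fixed canvas size are all assumptions made for illustration; a real engine would also scale, smooth and thicken the strokes.

```python
from typing import List, Tuple

# Hypothetical on-line sample: pixel position (x, y) and timestamp t in seconds.
Point = Tuple[int, int, float]
Stroke = List[Point]


def draw_line(grid: List[List[int]], x0: int, y0: int, x1: int, y1: int) -> None:
    """Mark every pixel on the segment (x0, y0)-(x1, y1) using Bresenham's algorithm."""
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        grid[y0][x0] = 1
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


def strokes_to_image(strokes: List[Stroke], width: int, height: int) -> List[List[int]]:
    """Rasterize strokes into a binary image (1 = ink, 0 = background).

    Assumes every sample already lies inside the width x height canvas.
    The timestamps are used only to order the samples; they are then discarded,
    which is why the conversion cannot be reversed.
    """
    grid = [[0] * width for _ in range(height)]
    for stroke in strokes:
        pts = sorted(stroke, key=lambda p: p[2])  # join samples in writing order
        if len(pts) == 1:
            grid[pts[0][1]][pts[0][0]] = 1  # a single tap becomes a dot
        for (x0, y0, _), (x1, y1, _) in zip(pts, pts[1:]):
            draw_line(grid, x0, y0, x1, y1)
    return grid


if __name__ == "__main__":
    # One stroke shaped like the letter "L", drawn on a 5x5 canvas.
    letter_l = [[(1, 0, 0.00), (1, 3, 0.10), (3, 3, 0.20)]]
    for row in strokes_to_image(letter_l, 5, 5):
        print("".join("#" if px else "." for px in row))
```

The resulting grid can be handed to any image-based engine. Going the other way would require recovering stroke order and timing, which the image no longer contains.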
Several Wikipedia articles contain lists of OCR/ICR software providers:
- http://en.wikipedia.org/wiki/List_of_optical_character_recognition_software
- http://en.wikipedia.org/wiki/Intelligent_character_recognition
An example of an open-source on-line handwriting recognition engine: