On-line handwriting recognition systems receive information about how a symbol is written, such as the sequence of pen positions over time. In contrast, OCR only gets the pixel map.
In my bachelor's thesis, I created a system for working with handwriting recognition systems.
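The difference between on-line input and a pixel map can be illustrated with a minimal sketch. The data layout used here (a recording as a list of strokes, each stroke a list of points with `x`, `y` and `time`) and the helper names `make_stroke` and `rasterize` are assumptions made for illustration; they are not taken from the toolkits described on this page.

```python
def make_stroke(coords, start_time):
    """Build one stroke (pen-down to pen-up) from (x, y) pairs.

    Hypothetical helper: timestamps are spaced 10 ms apart.
    """
    return [
        {"x": x, "y": y, "time": start_time + 10 * i}
        for i, (x, y) in enumerate(coords)
    ]


def rasterize(strokes, width, height):
    """Reduce a recording to a pixel map, discarding stroke order and timing."""
    grid = [[0] * width for _ in range(height)]
    for stroke in strokes:
        for point in stroke:
            if 0 <= point["x"] < width and 0 <= point["y"] < height:
                grid[point["y"]][point["x"]] = 1
    return grid


horizontal = [(0, 1), (1, 1), (2, 1)]
vertical = [(1, 0), (1, 1), (1, 2)]

# Two ways of writing "+": horizontal bar first, or vertical bar first.
plus_a = [make_stroke(horizontal, 0), make_stroke(vertical, 300)]
plus_b = [make_stroke(vertical, 0), make_stroke(horizontal, 300)]

# The on-line recordings differ, but their pixel maps are identical.
print(plus_a != plus_b)
print(rasterize(plus_a, 3, 3) == rasterize(plus_b, 3, 3))
```

Both recordings produce the same image, so an OCR system could not tell them apart, while an on-line recognizer still sees which stroke was written first and how fast.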
write-math.com
The website write-math.com was used to collect the data. Its source code is at github.com/MartinThoma/write-math.
hwrt toolkit
The hwrt toolkit was created to work with on-line handwritten symbols. The toolkit is documented at pythonhosted.org/hwrt.
The raw data can be downloaded with this toolkit.
The toolkit can also be used to classify data locally on your computer, without an internet connection.
nntoolkit
The nntoolkit was created to provide free software for creating, training, testing and evaluating neural networks.
HWR experiments
All experiment configuration files are stored in the project github.com/MartinThoma/hwr-experiments.
Data
The data can be downloaded from write-math.com/data. I will try to keep a relatively recent version online; contact me if you want the latest version. Note that the data currently (2015-04-12) amounts to about 3.7 GB, so sharing it is not that easy.
Presentations
- 27.08.2014
- 06.11.2014: Final presentation for bachelor's thesis
Bachelor's thesis
- 07.11.2014: My bachelor's thesis. I received the best grade (1.0) for it ☺. Please note that the arXiv submission came later; in that version, a couple of typos were fixed and the term "data multiplication" was replaced by "data augmentation".
- 29.06.2015: An updated, condensed version of my bachelor's thesis.
Remarks
- What I called "data multiplication" is called "data augmentation" by others (e.g. in "ImageNet Classification with Deep Convolutional Neural Networks", "Deep Image: Scaling up Image Recognition" and "Classifying plankton with deep neural networks").
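As a rough illustration of the term, data augmentation creates additional training examples by applying label-preserving transformations to existing ones. The sketch below rotates a recording by a few small angles. It is an assumption-laden example and not necessarily the method used in the thesis: the representation (strokes as lists of `(x, y)` tuples), the function names `rotate_recording` and `augment`, and the default angles are all hypothetical.

```python
import math


def rotate_recording(strokes, degrees):
    """Return a copy of the recording rotated around the mean of its points."""
    points = [p for stroke in strokes for p in stroke]
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return [
        [
            (
                cx + (x - cx) * cos_t - (y - cy) * sin_t,
                cy + (x - cx) * sin_t + (y - cy) * cos_t,
            )
            for x, y in stroke
        ]
        for stroke in strokes
    ]


def augment(strokes, angles=(-10, -5, 5, 10)):
    """Return the original recording plus one rotated variant per angle."""
    return [strokes] + [rotate_recording(strokes, a) for a in angles]


# A hypothetical recording with a single horizontal stroke.
recording = [[(0.0, 0.0), (2.0, 0.0)]]
variants = augment(recording)
print(len(variants))
```

Each variant keeps the label of the original recording, so one labeled symbol yields five training examples here. Small angles are chosen on the assumption that a slightly tilted symbol still has the same meaning; large rotations could turn one symbol into another.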