Extract text from a Cyprus Personal ID using OCR
Date
2024-05-28
Author
Ioannou, Angelos
Publisher
Πανεπιστήμιο Κύπρου, Σχολή Θετικών και Εφαρμοσμένων Επιστημών / University of Cyprus, Faculty of Pure and Applied Sciences
Place of publication
Cyprus
Abstract
This project explores the existing Cyprus ID format, including its structure, the fields it contains, and the design of its distinctive features. The objective is to gain a thorough understanding of how these fields are structured and then to build an accurate and fast OCR tool in Python. OCR (optical character recognition) technology digitizes, reads, and interprets characters from physical documents.
Successfully extracting and processing information from Cyprus IDs could significantly enhance the quality of services in sectors such as airports and the police, among other industries.
Developing an image-based tool requires an algorithm that can handle imperfectly shaped characters, which calls for approximate matching, similar to an autocorrect function. Fields such as names and surnames must match the expected values to within a specified percentage of accuracy to pass the unit tests. This is achieved with confidence scores that gauge how closely the algorithm's output reflects reality. Throughout the project, challenges, issues, and potential areas for future enhancement are identified, both to better understand the present state of Cyprus IDs and to expand their current applications and use cases.
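The approximate matching described above can be illustrated with a minimal Python sketch. The abstract does not say which matching algorithm or threshold the project uses, so everything here is an assumption: the helper name `best_match`, the candidate name list, and the 0.8 threshold are hypothetical, and the standard-library `difflib.SequenceMatcher` stands in for whatever similarity measure the thesis actually applies.

```python
from difflib import SequenceMatcher


def best_match(ocr_text, candidates, threshold=0.8):
    """Return (candidate, score) for the most similar candidate.

    If no candidate reaches the threshold, return (None, best_score).
    The score is a string-similarity ratio between 0.0 and 1.0, used
    here as a stand-in for a confidence score (an assumption, not the
    OCR engine's own confidence value).
    """
    normalized = ocr_text.strip().upper()
    best, best_score = None, 0.0
    for candidate in candidates:
        score = SequenceMatcher(None, normalized, candidate.upper()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


# Hypothetical list of known first names; OCR misread "S" as "5".
known_names = ["ANDREAS", "ANDRI", "MARIA"]
name, score = best_match("ANDREA5", known_names)
print(name, round(score, 3))
```

A field would be accepted only when the score reaches the chosen threshold. Lowering the threshold tolerates noisier scans at the cost of more false matches.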