Automated Data Digitization System for Vehicle Registration Certificates Using Google Cloud Vision API

Karanrat Thammarak, Yaowarat Sirisathitkul, Prateep Kongkla, Sarun Intakosum

Abstract


This study develops an automated data digitization system for Thai vehicle registration certificates. It is the first such system delivered as a web service Application Programming Interface (API), a delivery model that helps enterprises increase their business value. The system is currently available at “www.carjaidee.com”. It involves four steps: 1) in the image acquisition step, an embedded frame aligns the document so that it can be recognised correctly; 2) in the pre-processing step, sharpening and brightness filtering techniques enhance image quality; 3) in the recognition step, the Google Cloud Vision API is called to perform text recognition; 4) in the post-processing step, a domain-specific dictionary is applied to improve the accuracy rate. The experiment used 92 images, and accuracy was measured by counting the correct words and terms in the output. The findings suggest that the proposed method, which had an average accuracy of 93.28%, was significantly more accurate than the original method using only the Google Cloud Vision API. However, the system is limited because the dictionaries cannot automatically recognise new words. In the future, we will explore solutions to this problem using natural language processing techniques.
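The post-processing and evaluation steps described above can be illustrated with a minimal sketch. The abstract does not state how the dictionary is matched against the OCR output, so approximate string matching with Python's standard `difflib` is an assumption made here, not the authors' method. The dictionary entries, the `cutoff` value, and the function names `correct_tokens` and `word_accuracy` are hypothetical and chosen only for illustration.

```python
from difflib import get_close_matches

# Hypothetical domain dictionary; the paper's real dictionary is not given.
DOMAIN_TERMS = ["TOYOTA", "HONDA", "ISUZU", "MITSUBISHI"]


def correct_tokens(tokens, dictionary=DOMAIN_TERMS, cutoff=0.75):
    """Replace each OCR token with its closest dictionary term, if any.

    Tokens with no sufficiently similar dictionary entry are left unchanged,
    which mirrors the stated limitation: a word missing from the dictionary
    cannot be corrected.
    """
    corrected = []
    for token in tokens:
        match = get_close_matches(token, dictionary, n=1, cutoff=cutoff)
        corrected.append(match[0] if match else token)
    return corrected


def word_accuracy(predicted, expected):
    """Fraction of positions where the predicted word equals the expected word."""
    if not expected:
        return 0.0
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Illustrative data only (not taken from the study's 92 images).
raw_ocr = ["T0YOTA", "H0NDA", "ISUZU", "SEDAM"]
ground_truth = ["TOYOTA", "HONDA", "ISUZU", "SEDAN"]

fixed = correct_tokens(raw_ocr)
print(word_accuracy(raw_ocr, ground_truth))  # accuracy before correction
print(word_accuracy(fixed, ground_truth))    # accuracy after correction
```

In this toy input, "SEDAM" stays uncorrected because "SEDAN" is not in the hypothetical dictionary, which is the kind of case the stated limitation refers to.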

 

DOI: 10.28991/CEJ-2022-08-07-09



Keywords


Thai Vehicle Registration Certificates; Optical Character Recognition; Google Cloud Vision; Service; API; Transportation.








Copyright (c) 2022 Karanrat Thammarak, Yaowarat Sirisathitkul, Prateep Kongkla, Sarun Intakosum

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.