Removal of Imperfections in Digital Scans using Generative Adversarial Networks

Algoabra, Mohamad (2023) Removal of Imperfections in Digital Scans using Generative Adversarial Networks. Other thesis, Universität Rostock.

Abstract

[...] move towards a more digital workflow. However, scanned document images often suffer from damage caused by various real-life scenarios, which makes them difficult to read and use. In this thesis, we explore the use of generative adversarial networks (GANs) to enhance scanned images and remove imperfections such as coffee stains and other distortions. The problem is formulated as an image-to-image translation task between two domains, and we compare the performance of two GAN types: Pix2Pix, a supervised image-to-image translation model that uses paired data, and CycleGAN, an unsupervised image-to-image translation model that uses unpaired data. To this end, we developed a data pipeline that generates suitable training data for both models. We also developed a prototype that lets users test these models easily. The effectiveness of the proposed methods was evaluated in detail using several metrics, including Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), and Fréchet Inception Distance (FID). We also assessed the impact of these approaches on Optical Character Recognition (OCR) efficiency. The results showed that Pix2Pix can significantly improve the quality of scanned images and remove defects such as coffee stains, whereas CycleGAN achieved only average results. Overall, this study provides a perspective on improving the digitization process by using GANs to address image imperfections. The data pipeline and prototype developed in this work can be used to improve the quality of scanned images and facilitate the transition to a more digital workflow. Future research could explore further performance gains from other types of GANs or alternative approaches.

Item Type: Thesis (Other)
Subjects: Author type > Student theses > Bachelor's thesis
Research topics > Big Data Analytics
Research topics > Digital libraries
Framework projects > HyDRA
Research topics > Information Retrieval
Author type > Student theses
Projects > WossiDiA
Depositing User: Dbis Admin
Date Deposited: 18 Apr 2023 14:18
Last Modified: 18 Apr 2023 14:18
URI: https://eprints.dbis.informatik.uni-rostock.de/id/eprint/1097
