Update - April 2014
The images have been cropped to remove registration artefacts that may affect bleed-through removal.
Full colour images are now also available with the database.
This database is a resource for researchers in digital document restoration, and specifically for those working on the problem of bleed-through degradation. It consists of 25 registered recto-verso grayscale image pairs, cropped from larger manuscript images, with varying degrees of bleed-through.
Each crop contains a sentence or phrase of text, so that improvements in legibility can be tested. All images in the database are saved in TIFF format. The verso side of each image pair has been flipped horizontally and registered to the recto side, so that bleed-through text on one side is aligned with the corresponding foreground text on the other.
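The horizontal flip mentioned above can be illustrated with a minimal sketch. It assumes an image is held as a row-major nested list of pixel values; the function name `flip_horizontally` is hypothetical and is not part of the database. The images in the database are already flipped and registered, so this only shows what the operation does (with NumPy, `numpy.fliplr` performs the same mirror).

```python
def flip_horizontally(image):
    """Return a copy of a row-major image with every row reversed (left-right mirror)."""
    return [list(reversed(row)) for row in image]


# Toy 2x3 "verso" image: each row is mirrored, the row order is unchanged.
verso = [[1, 2, 3],
         [4, 5, 6]]
aligned = flip_horizontally(verso)
print(aligned)  # [[3, 2, 1], [6, 5, 4]]
```

Applying the flip twice returns the original orientation, which may be useful if the unflipped verso is needed for display.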
One of the main problems in digital document restoration research is the lack of ground truth: there is no clean original document against which to compare restoration results. Ground truth must therefore be obtained either by creating synthetic degraded document images with known ground truth, or by creating synthetic ground truth images for real degraded images. This database takes the latter approach: manually created foreground text masks are included for each image.
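As a rough illustration of how a foreground mask might be used, the sketch below scores a binarised restoration result against a ground-truth mask using precision, recall and F-measure. This is one common choice of metric and is not prescribed by the database. The function name `foreground_scores` is hypothetical, and the convention that 1 marks foreground text is an assumption; check the mask files for the actual encoding.

```python
def foreground_scores(predicted, ground_truth):
    """Return (precision, recall, f_measure) for two same-sized binary masks.

    Masks are nested lists of 0/1 values. Treating 1 as foreground text
    is an assumed convention, not one stated by the database.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                tp += 1          # foreground correctly detected
            elif p and not g:
                fp += 1          # background or bleed-through kept as text
            elif g and not p:
                fn += 1          # foreground text lost
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure


# Toy masks: two true positives, one false positive, one false negative.
ground_truth = [[1, 1, 0],
                [0, 1, 0]]
predicted = [[1, 0, 0],
             [0, 1, 1]]
precision, recall, f_measure = foreground_scores(predicted, ground_truth)
```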
File Naming Convention for the Degraded Images
File names in the database have the format "lib"."MS"."fol".tiff
File Naming Convention for the Ground-Truth Masks
As for the degraded images, but with appended "gt".tiff to differentiate.
File Naming Convention for Colour Images
As for the degraded images, but with appended "rgb".tiff to differentiate.
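The naming conventions above could be handled with a small helper like the following sketch. It assumes that the "gt" and "rgb" tags are inserted as an extra dot-separated field just before the .tiff extension; the text does not state the exact placement, so verify it against the actual files. The function name `companion_names` is hypothetical, and the input is the literal placeholder pattern rather than a real file name.

```python
def companion_names(degraded_name):
    """Derive mask and colour file names from a degraded image file name.

    Assumption: the tag ("gt" or "rgb") is an additional dot-separated
    field placed immediately before the file extension.
    """
    stem, extension = degraded_name.rsplit(".", 1)
    return {
        "degraded": degraded_name,
        "ground_truth": f"{stem}.gt.{extension}",
        "colour": f"{stem}.rgb.{extension}",
    }


# Placeholder name following the "lib"."MS"."fol".tiff pattern.
names = companion_names("lib.MS.fol.tiff")
print(names["ground_truth"])  # lib.MS.fol.gt.tiff
print(names["colour"])        # lib.MS.fol.rgb.tiff
```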
Accessing the Database
All queries regarding the database should be forwarded to rowleybr_at_tcd_dot_ie.
Download the database (169 MB .zip; registered ISOS users only).
We request that you cite the following in all publications describing work that uses this database:
|Han||G. A. Hanasusanto, Z. Wu, and M. S. Brown. Ink-Bleed Reduction using Functional Minimization. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pages 825-832, 2010.|
|Hua||Y. Huang, M. S. Brown, and D. Xu. User Assisted Ink-Bleed Reduction. IEEE Transactions on Image Processing, 19(10): 2646-2658, 2010.|
|Mog||R. F. Moghaddam and M. Cheriet. A Variational Approach to Degraded Document Enhancement. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(8): 1347-1361, 2010.|
|Ro1||R. Rowley-Brooke and A. Kokaram. Bleed-Through Removal in Degraded Documents. In C. Viard-Gaudin and R. Zanibbi, editors, Proceedings of SPIE 8297, Document Recognition and Retrieval XIX, 82970T, 2012.|
|Ro2||R. Rowley-Brooke, F. Pitié, and A. Kokaram. A Non-Parametric Framework for Document Bleed-Through Removal. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pages 2954-2960, 2013.|