datamodules.DivaHisDB.datasets package

Submodules

datamodules.DivaHisDB.datasets.cropped_dataset module

Load a dataset of historic documents by specifying the folder where it is located.

class CroppedHisDBDataset(path: Path, data_folder_name: str, gt_folder_name: str, selection: Optional[Union[int, List[str]]] = None, is_test=False, image_transform=None, target_transform=None, twin_transform=None, **kwargs)

Bases: CroppedDatasetRGB

Dataset class for the DivaHisDB dataset (https://ieeexplore.ieee.org/abstract/document/7814109) in a cropped setup. Each instance represents one split of the whole dataset.

The structure of the folder should be as follows:

path
├── data_folder_name
│   ├── original_image_name_1
│   │   ├── image_crop_1.png
│   │   ├── ...
│   │   └── image_crop_N.png
│   └── original_image_name_N
│       ├── image_crop_1.png
│       ├── ...
│       └── image_crop_N.png
└── gt_folder_name
    ├── original_image_name_1
    │   ├── image_crop_1.png
    │   ├── ...
    │   └── image_crop_N.png
    └── original_image_name_N
        ├── image_crop_1.png
        ├── ...
        └── image_crop_N.png
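
The layout above implies that every crop under data_folder_name has a ground-truth crop at the same relative path under gt_folder_name. The following is a minimal sketch of that pairing using only pathlib; it is not the class's actual loading code. The helper names pair_crops and build_demo_layout, and the folder names "data", "gt", "page_a" and "page_b", are hypothetical and chosen only for illustration.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def pair_crops(path: Path, data_folder_name: str, gt_folder_name: str) -> List[Tuple[Path, Path]]:
    """Pair every image crop with the ground-truth crop at the same relative path."""
    data_root = path / data_folder_name
    gt_root = path / gt_folder_name
    pairs = []
    # One sub-folder per original image, each holding that image's crops.
    for image_dir in sorted(p for p in data_root.iterdir() if p.is_dir()):
        for crop in sorted(image_dir.glob("*.png")):
            gt_crop = gt_root / image_dir.name / crop.name
            if not gt_crop.is_file():
                raise FileNotFoundError(f"missing ground truth for {crop}")
            pairs.append((crop, gt_crop))
    return pairs


def build_demo_layout(root: Path) -> None:
    """Create a tiny placeholder tree (empty files, not real PNG images)."""
    for folder in ("data", "gt"):
        for image_name in ("page_a", "page_b"):
            crop_dir = root / folder / image_name
            crop_dir.mkdir(parents=True)
            for i in (1, 2):
                (crop_dir / f"image_crop_{i}.png").touch()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_demo_layout(root)
    pairs = pair_crops(root, "data", "gt")
    # Keep only relative paths so the result outlives the temporary directory.
    demo_pairs = [
        (d.relative_to(root).as_posix(), g.relative_to(root).as_posix())
        for d, g in pairs
    ]
```

Running a check like this before training can surface a missing or misnamed ground-truth crop early.
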
Parameters:
  • path (Path) – Path to the dataset

  • data_folder_name (str) – name of the folder that contains the original images

  • gt_folder_name (str) – name of the folder that contains the ground truth images

  • selection (Union[int, List[str], None]) – optional filter for the dataset; an integer, a list of strings, or None (the default)

  • is_test (bool) – if True, __getitem__() will return the index of the image

  • image_transform (callable) – transformation that is applied to the image

  • target_transform (callable) – transformation that is applied to the target

  • twin_transform (callable) – transformation that is applied to both image and target
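
Because twin_transform is applied to both image and target, any random decision it makes has to be shared between the two so that the ground truth stays aligned with the image. The sketch below illustrates that idea with a random horizontal flip. The calling convention (image, target) -> (image, target) is an assumption, the class name TwinRandomHorizontalFlip is hypothetical, and nested lists stand in for whatever image type the dataset actually passes.

```python
import random
from typing import List, Tuple

Grid = List[List[int]]  # stand-in for an image or target array


class TwinRandomHorizontalFlip:
    """Flip image and target together with probability p (illustrative only)."""

    def __init__(self, p: float = 0.5, seed: int = 0) -> None:
        self.p = p
        self.rng = random.Random(seed)

    def __call__(self, image: Grid, target: Grid) -> Tuple[Grid, Grid]:
        # Draw once and apply the same decision to both inputs.
        if self.rng.random() < self.p:
            image = [row[::-1] for row in image]
            target = [row[::-1] for row in target]
        return image, target


image = [[1, 2, 3], [4, 5, 6]]
target = [[0, 0, 1], [1, 0, 0]]

# p=1.0 always flips, p=0.0 never flips; both keep image and target in sync.
always_flip = TwinRandomHorizontalFlip(p=1.0)
flipped_image, flipped_target = always_flip(image, target)

never_flip = TwinRandomHorizontalFlip(p=0.0)
same_image, same_target = never_flip(image, target)
```

Purely photometric changes that should not touch the labels, such as colour jitter, fit image_transform instead.
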

Module contents