datamodules.RotNet.datasets package

Submodules

datamodules.RotNet.datasets.cropped_dataset module

Load a dataset of historic documents by specifying the folder where it is located.

class CroppedRotNet(path: Path, data_folder_name: str, gt_folder_name: Optional[str] = None, selection: Optional[Union[int, List[str]]] = None, is_test: bool = False, image_transform: Optional[callable] = None)

Bases: CroppedHisDBDataset

Dataset implementation for the RotNet paper by Gidaris et al. It is used with the DivaHisDB dataset in a cropped setup.

The structure of the folder should be as follows:

data_dir
├── train_folder_name
│   ├── data_folder_name
│   │   ├── original_image_name_1
│   │   │   ├── image_crop_1.png
│   │   │   ├── ...
│   │   │   └── image_crop_N.png
│   │   └── original_image_name_N
│   │       ├── image_crop_1.png
│   │       ├── ...
│   │       └── image_crop_N.png
│   └── gt_folder_name
│       ├── original_image_name_1
│       │   ├── image_crop_1.png
│       │   ├── ...
│       │   └── image_crop_N.png
│       └── original_image_name_N
│           ├── image_crop_1.png
│           ├── ...
│           └── image_crop_N.png
├── validation_folder_name
│   ├── data_folder_name
│   │   ├── original_image_name_1
│   │   │   ├── image_crop_1.png
│   │   │   ├── ...
│   │   │   └── image_crop_N.png
│   │   └── original_image_name_N
│   │       ├── image_crop_1.png
│   │       ├── ...
│   │       └── image_crop_N.png
│   └── gt_folder_name
│       ├── original_image_name_1
│       │   ├── image_crop_1.png
│       │   ├── ...
│       │   └── image_crop_N.png
│       └── original_image_name_N
│           ├── image_crop_1.png
│           ├── ...
│           └── image_crop_N.png
└── test_folder_name
    ├── data_folder_name
    │   ├── original_image_name_1
    │   │   ├── image_crop_1.png
    │   │   ├── ...
    │   │   └── image_crop_N.png
    │   └── original_image_name_N
    │       ├── image_crop_1.png
    │       ├── ...
    │       └── image_crop_N.png
    └── gt_folder_name
        ├── original_image_name_1
        │   ├── image_crop_1.png
        │   ├── ...
        │   └── image_crop_N.png
        └── original_image_name_N
            ├── image_crop_1.png
            ├── ...
            └── image_crop_N.png
Parameters:
  • path (Path) – Path to the root directory of the dataset (the folder containing the train/val/test folders)

  • data_folder_name (str) – Name of the folder containing the data

  • gt_folder_name (str) – Name of the folder containing the ground truth

  • selection (Union[int, List[str]]) – If you only want to use a subset of the dataset, you can specify the names of the files (without the file extension) in a list. If you want to use all files, set this parameter to None.

  • is_test (bool) – If True, the dataset returns additional information that is needed for the test set.

  • image_transform
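The folder layout above can be reproduced with placeholder files for a quick local check. The following is only a sketch: the helper `make_toy_dataset`, the split names `train`/`val`/`test`, the folder names `data`/`gt`, and the image names `page_1`/`page_2` are hypothetical choices, not names required by the package. The files it creates are empty, so they show the directory structure only and cannot be decoded as images.

```python
import tempfile
from pathlib import Path


def make_toy_dataset(root: Path,
                     splits=("train", "val", "test"),   # hypothetical split folder names
                     data_folder_name="data",           # hypothetical data folder name
                     gt_folder_name="gt",               # hypothetical ground-truth folder name
                     image_names=("page_1", "page_2"),  # hypothetical original image names
                     crops_per_image=2) -> Path:
    """Create empty placeholder files arranged in the documented layout."""
    for split in splits:
        for folder in (data_folder_name, gt_folder_name):
            for image_name in image_names:
                # One sub-folder per original image, holding its crops.
                crop_dir = root / split / folder / image_name
                crop_dir.mkdir(parents=True, exist_ok=True)
                for i in range(1, crops_per_image + 1):
                    (crop_dir / f"image_crop_{i}.png").touch()
    return root


with tempfile.TemporaryDirectory() as tmp:
    root = make_toy_dataset(Path(tmp))
    created = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.png"))
    n_files = len(created)
    first = created[0]
    print(n_files, first)
```

With these defaults, each split holds two original images with two crops each, once under the data folder and once under the ground-truth folder.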

static get_gt_data_paths(directory: Path, data_folder_name: str, gt_folder_name: Optional[str] = None, selection: Optional[Union[int, List[str]]] = None) -> List[Path]

Creates the list of paths to the original images.

Structure of the folder:

directory
├── data_folder_name
│   ├── original_image_name_1
│   │   ├── image_crop_1.png
│   │   ├── ...
│   │   └── image_crop_N.png
│   └── original_image_name_N
│       ├── image_crop_1.png
│       ├── ...
│       └── image_crop_N.png
└── gt_folder_name
    ├── original_image_name_1
    │   ├── image_crop_1.png
    │   ├── ...
    │   └── image_crop_N.png
    └── original_image_name_N
        ├── image_crop_1.png
        ├── ...
        └── image_crop_N.png
Parameters:
  • directory (Path) – Path to the root directory of the split

  • data_folder_name (str) – Name of the folder containing the data

  • gt_folder_name (str) – Name of the folder containing the ground truth

  • selection (Union[int, List[str]]) – If you only want to use a subset of the dataset, you can specify the names of the files (without the file extension) in a list. If you want to use all files, set this parameter to None.

Returns:

List of paths to the original images

Return type:

List[Path]
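As a rough illustration of what collecting such a path list involves, the sketch below walks one split directory and gathers the crop files under the data folder. It is not the package's implementation: `list_crop_paths` is a hypothetical helper, it assumes that `selection` entries match the names of the per-image folders, and it leaves out the integer form of `selection` because its meaning is not described here.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def list_crop_paths(directory: Path, data_folder_name: str,
                    selection: Optional[List[str]] = None) -> List[Path]:
    """Hypothetical helper: return the sorted crop paths of one split."""
    data_root = directory / data_folder_name
    paths: List[Path] = []
    # Each sub-folder of the data folder corresponds to one original image.
    for image_dir in sorted(p for p in data_root.iterdir() if p.is_dir()):
        # Assumption: selection entries are compared against folder names.
        if selection is not None and image_dir.name not in selection:
            continue
        paths.extend(sorted(image_dir.glob("*.png")))
    return paths


with tempfile.TemporaryDirectory() as tmp:
    split = Path(tmp)
    # Build a tiny split with two original images and two empty crops each.
    for name in ("page_1", "page_2"):
        crop_dir = split / "data" / name
        crop_dir.mkdir(parents=True)
        for i in (1, 2):
            (crop_dir / f"image_crop_{i}.png").touch()

    all_names = [p.relative_to(split).as_posix()
                 for p in list_crop_paths(split, "data")]
    subset_names = [p.relative_to(split).as_posix()
                    for p in list_crop_paths(split, "data", selection=["page_2"])]
    print(all_names)
    print(subset_names)
```

Sorting both the folders and the files keeps the order of the returned list stable across runs, which helps when indices into the list are reused.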

Module contents