Tutorial: Creating a `DatasetFromCSV` Dataset

The XAI.FNFGradCAM.do_gradcam() and XAI.FSGradCAM.do_gradcam() functions perform Grad-CAM on a PyTorch dataset rather than on individual images. For more details on how the dataset is used, refer to Step 2, Method 1 of Tutorial: Doing Grad-CAM on the Food-Non-Food (FNF) model and Step 2, Method 1 of Tutorial: Doing Grad-CAM on the Food Scoring (FS) model.

This tutorial covers how to create this custom PyTorch dataset using the XAI.utils.datasets.DatasetFromCSV() class.

Note:

The same dataset can be used for both the FNF and FS models.

The Required `.csv` File

As the name of XAI.utils.datasets.DatasetFromCSV() suggests, this custom PyTorch dataset is created from a .csv file. For applications of Grad-CAM on FoodDX models, this .csv is expected to already exist as part of the various FoodDX pipelines and should be the data.csv file. Specifically, it should look something like the following:

id	source_id	s3_path_fns	s3_path_fbd	path_fns	path_fbd	annot_json	food_score	food_type	label_fs	label_fnf
164c729b-86d2-11eb-b774-06d7ab6752a4	local_HK	s3://path/to/fns_img/164c729b-86d2-11eb-b774-06d7ab6752a4.png	s3://path/to/fbd_img/164c729b-86d2-11eb-b774-06d7ab6752a4.png	path/to/fns_img/164c729b-86d2-11eb-b774-06d7ab6752a4.png	path/to/fbd_img/164c729b-86d2-11eb-b774-06d7ab6752a4.png		1	ChineseShortbread	1	1
…	…	…	…	…	…	…	…

When performing Grad-CAM on FNF and FS models, the id, path_fns, food_score and label_fnf columns must be present. All other columns can be omitted.
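
Before creating the dataset, it can help to confirm that the file actually has the four compulsory columns. The following is a minimal sketch using only Python's standard csv module. The helper name missing_required_columns and the delimiter parameter are hypothetical and not part of XAI; the comma default is an assumption about how data.csv is delimited.

```python
import csv

# Columns the tutorial lists as compulsory for Grad-CAM on FNF and FS models.
REQUIRED_COLUMNS = ("id", "path_fns", "food_score", "label_fnf")


def missing_required_columns(path_to_csv, delimiter=","):
    """Return the required columns that are absent from the file's header row.

    Hypothetical helper for illustration; it is not part of the XAI package.
    """
    with open(path_to_csv, newline="") as handle:
        # Only the first row (the header) is read; an empty file yields [].
        header = next(csv.reader(handle, delimiter=delimiter), [])
    return [name for name in REQUIRED_COLUMNS if name not in header]
```

Calling missing_required_columns("/path/to/desired/csv/data.csv") returns an empty list when all four columns are present, and the names of the absent columns otherwise. If the file is tab-separated, pass delimiter="\t".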

Creating the Dataset

To create the dataset, run the following:

import XAI

ds = XAI.datasets.DatasetFromCSV(path_to_csv="/path/to/desired/csv/data.csv")

XAI.datasets.DatasetFromCSV() also takes a transform argument, which must be a callable. When it is specified (i.e. not None), this transformation is applied to each image on the fly as the image is loaded for Grad-CAM. By default, it is set to transform=XAI.utils.datasets.data_transforms().

This default transformation function (data_transforms()) converts each image from a NumPy array into a PyTorch tensor and then normalises its pixel values to the range [-1, 1]. Refer to the documentation of data_transforms() for more information.
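
To make the [-1, 1] range concrete, here is a small worked sketch of the arithmetic for a single 8-bit pixel value. It assumes the common recipe of scaling to [0, 1] and then normalising with a mean and standard deviation of 0.5. The source does not confirm that data_transforms() uses exactly these constants, and normalise_pixel is a hypothetical name used only for illustration.

```python
def normalise_pixel(value: int) -> float:
    """Map an 8-bit pixel value (0 to 255) into the range [-1, 1].

    Assumed recipe, not taken from the XAI source: scale to [0, 1],
    then subtract a mean of 0.5 and divide by a standard deviation of 0.5.
    """
    scaled = value / 255.0        # now in [0, 1]
    return (scaled - 0.5) / 0.5   # now in [-1, 1]


# Black maps to -1.0, white maps to 1.0, mid-grey lands close to 0.
for value in (0, 128, 255):
    print(value, "->", normalise_pixel(value))
```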

Exploring the Dataset

Since the dataset ds was created using the XAI.utils.datasets.DatasetFromCSV() class, it has the following type:

>>> type(ds)
XAI.utils.datasets.DatasetFromCSV

To get the number of images in the dataset, use the len() function:

>>> len(ds)
10

For each image, the dataset returns the following five items:

The image itself
The image ID
The path to the image
The ground truth FNF label (0 = non-food, 1 = food)
The ground truth FS label (food score = {1, 2, 3, 4, 5})

You can iterate through the dataset to access these five items. The loop below prints them for the first entry only:

>>> for img_rgb, img_id, img_path, img_label_fnf, img_food_score in ds:
...     print("img.shape     :", img_rgb.shape)
...     print("img_id        :", img_id)
...     print("img_path      :", img_path)
...     print("img_label_fnf :", img_label_fnf)
...     print("img_food_score:", img_food_score)
...     break
...
img.shape     : torch.Size([3, 299, 299])
img_id        : 164c729b-86d2-11eb-b774-06d7ab6752a4
img_path      : path/to/fns_img/164c729b-86d2-11eb-b774-06d7ab6752a4.png
img_label_fnf : 1
img_food_score: 1

Notice that:

The default transformation function XAI.utils.datasets.data_transforms() has been applied to the image, so the image is a PyTorch tensor rather than a NumPy array.
The dataset ds can be used for both FNF and FS models because both ground truth labels (img_label_fnf, img_food_score) are returned by the dataset.
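
For readers curious how a CSV-backed dataset can produce the five items described above, the following is a minimal, dependency-free sketch of a map-style dataset. It is not the XAI implementation: the class name CsvImageDataset, the load_image parameter and the delimiter parameter are all hypothetical, and image loading is delegated to a caller-supplied function so that the example stays self-contained.

```python
import csv
from typing import Any, Callable, Optional


class CsvImageDataset:
    """Hypothetical stand-in that mirrors the behaviour described in this tutorial."""

    def __init__(
        self,
        path_to_csv: str,
        load_image: Callable[[str], Any],
        transform: Optional[Callable[[Any], Any]] = None,
        delimiter: str = ",",
    ) -> None:
        # Read every row once; each row becomes a dict keyed by column name.
        with open(path_to_csv, newline="") as handle:
            self.rows = list(csv.DictReader(handle, delimiter=delimiter))
        self.load_image = load_image
        self.transform = transform

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, index: int):
        row = self.rows[index]  # raises IndexError past the end, which stops iteration
        image = self.load_image(row["path_fns"])
        if self.transform is not None:
            image = self.transform(image)  # applied on the fly, one image at a time
        # Same order as the five items listed in this tutorial.
        return (
            image,
            row["id"],
            row["path_fns"],
            int(row["label_fnf"]),
            int(row["food_score"]),
        )
```

Because the class defines __len__ and __getitem__, both len(ds) and a plain for loop over ds work, which matches how ds is explored in this section. Only the four compulsory columns are read, which is consistent with the other columns being optional.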
