For this purpose, the Dirichlet distribution can be helpful because it generates quantities that sum to 1. A Dirichlet-distributed random variable can be seen as a multivariate generalization of the Beta distribution.
>>> np.random.dirichlet((1, 1), 1)        # for 2 images. Equivalent to λ and (1 - λ)
array([[0.92870347, 0.07129653]])
>>> np.random.dirichlet((1, 1, 1), 1)     # for 3 images.
array([[0.38712673, 0.46132787, 0.1515454 ]])
>>> np.random.dirichlet((1, 1, 1, 1), 1)  # for 4 images.
array([[0.59482542, 0.0185333 , 0.33322484, 0.05341645]])
So, for multiple λ values, you also need to calculate them accordingly.
# Let's say for 4 images (I am not sure this is the proper way).
image_list = [...]   # 4 images
label_list = [...]   # 4 labels
new_img = np.zeros((w, h))
beta_list = np.random.dirichlet((1, 1, 1, 1), 1)[0]

for idx, beta in enumerate(beta_list):
    x0, y0, w, h = get_cropping_params(beta, full_img)  # something like this
    new_img[x0, y0, w, h] = image_list[idx][x0, y0, w, h]
    label_list[idx] = label_list[idx] * beta
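As a rough sketch of how those Dirichlet weights could be applied to the labels themselves (this is an assumption, not a fixed recipe: the labels are taken to be one-hot encoded, and the helper below is hypothetical):

import numpy as np

def mix_labels_dirichlet(one_hot_labels, alpha=1.0):
    """one_hot_labels: array of shape (k, num_classes), one row per image in the mosaic."""
    k = one_hot_labels.shape[0]
    # One weight per image; the Dirichlet weights sum to 1 by construction.
    weights = np.random.dirichlet([alpha] * k)
    # The mixed label is the weighted average of the one-hot vectors.
    mixed_label = (weights[:, None] * one_hot_labels).sum(axis=0)
    return weights, mixed_label

# Example: 4 images, 10 classes
labels = np.eye(10)[[3, 1, 7, 7]]
weights, mixed = mix_labels_dirichlet(labels)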
So essentially, we sample two values:
w = np.random.uniform(0, 1)
h = np.random.uniform(0, 1)
These two values are sufficient to parameterize the mosaic problem. Each image in the mosaic occupies the area spanned by the following coordinates (a short sketch follows the list):
Consider that the mosaic image has dimensions W x H, and that the midpoints of each dimension are represented by w and h, respectively.
- top left: (0, 0) to (w, h)
- top right: (w, 0) to (W, h)
- bottom left: (0, h) to (w, H)
- bottom right: (w, h) to (W, H)
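Putting those pieces together, a minimal sketch (my assumptions: w and h are first scaled to pixel coordinates, and each image's label weight is taken proportional to the area of its quadrant) might look like this:

import numpy as np

def mosaic_boxes_and_weights(W, H):
    # Sample the normalized midpoint and scale it to pixel coordinates.
    w = int(np.random.uniform(0, 1) * W)
    h = int(np.random.uniform(0, 1) * H)
    boxes = {
        'top_left':     (0, 0, w, h),
        'top_right':    (w, 0, W, h),
        'bottom_left':  (0, h, w, H),
        'bottom_right': (w, h, W, H),
    }
    # Area fraction of each quadrant; these sum to 1 and can weight the labels.
    total = W * H
    weights = {name: ((x2 - x1) * (y2 - y1)) / total
               for name, (x1, y1, x2, y2) in boxes.items()}
    return boxes, weights

The four weights sum to 1, so they can be used to mix the one-hot labels in the same spirit as λ and (1 - λ) in the two-image case.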
To create a class label in CutMix or MixUp type augmentation, we can sample beta, for example with np.random.beta or scipy.stats.beta, and do as follows for two labels:
label = label_one * beta + (1 - beta) * label_two
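For instance, a minimal sketch with np.random.beta and one-hot labels (the alpha value and the class count here are only illustrative):

import numpy as np

alpha = 0.2                                # illustrative MixUp-style concentration parameter
beta = np.random.beta(alpha, alpha)        # λ in [0, 1]

label_one = np.eye(10)[3]                  # one-hot label for class 3 (10 classes assumed)
label_two = np.eye(10)[7]                  # one-hot label for class 7

label = label_one * beta + (1 - beta) * label_two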
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import random

(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.cifar10.load_data()

train_images = train_images[:10, :, :]
train_labels = train_labels[:10]
train_images.shape, train_labels.shape
((10, 32, 32, 3), (10, 1))
def mosaicmix(image, label, DIM, minfrac=0.25, maxfrac=0.75):
    '''image, label: batches of samples'''
    xc, yc = np.random.randint(DIM * minfrac, DIM * maxfrac, (2,))
    indices = np.random.permutation(int(image.shape[0]))
    final_imgs, final_lbs = [], []

    # Iterate over the full indices
    for j in range(len(indices)):
        # Take 4 samples to create a mosaic sample randomly
        rand4indices = [j] + random.sample(list(indices), 3)
        mosaic_image = np.zeros((DIM, DIM, 3), dtype=np.float32)

        # Make mosaic with 4 samples
        for i, idx in enumerate(rand4indices):
            if i == 0:    # top left
                x1a, y1a, x2a, y2a = 0, 0, xc, yc
                x1b, y1b, x2b, y2b = DIM - xc, DIM - yc, DIM, DIM  # from bottom right
            elif i == 1:  # top right
                x1a, y1a, x2a, y2a = xc, 0, DIM, yc
                x1b, y1b, x2b, y2b = 0, DIM - yc, DIM - xc, DIM    # from bottom left
            elif i == 2:  # bottom left
                x1a, y1a, x2a, y2a = 0, yc, xc, DIM
                x1b, y1b, x2b, y2b = DIM - xc, 0, DIM, DIM - yc    # from top right
            elif i == 3:  # bottom right
                x1a, y1a, x2a, y2a = xc, yc, DIM, DIM
                x1b, y1b, x2b, y2b = 0, 0, DIM - xc, DIM - yc      # from top left

            # Copy-paste the chosen crop into the mosaic canvas
            mosaic_image[y1a:y2a, x1a:x2a] = image[idx][y1b:y2b, x1b:x2b]

        # Append the mosaic sample
        final_imgs.append(mosaic_image)

    return final_imgs, label
data, label = mosaicmix(train_images, train_labels, 32)
plt.imshow(data[5] / 255)
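Note that mosaicmix above returns the original labels untouched. To build a mixed class label for each mosaic, one option (again my assumption, following the area-weighting idea from earlier) is to weight the one-hot labels of the four source images by the area of the quadrant each one fills. This sketch assumes the function is adjusted to also return xc, yc and the four sampled indices for each mosaic:

import numpy as np

def mosaic_label(labels4, xc, yc, DIM, num_classes=10):
    """labels4: the 4 integer class labels used for one mosaic, in quadrant order."""
    one_hot = np.eye(num_classes)[np.array(labels4).ravel()]
    # Area fraction of each quadrant: top left, top right, bottom left, bottom right.
    areas = np.array([
        xc * yc,
        (DIM - xc) * yc,
        xc * (DIM - yc),
        (DIM - xc) * (DIM - yc),
    ], dtype=np.float32) / (DIM * DIM)
    return (areas[:, None] * one_hot).sum(axis=0)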
Setup
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras import layers
This tutorial uses the tf_flowers dataset. For convenience, download the dataset using TensorFlow Datasets. If you would like to learn about other ways of importing data, check out the load images tutorial.
(train_ds, val_ds, test_ds), metadata = tfds.load(
    'tf_flowers',
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'],
    with_info=True,
    as_supervised=True,
)
The flowers dataset has five classes.
num_classes = metadata.features['label'].num_classes
print(num_classes)
Let's retrieve an image from the dataset and use it to demonstrate data augmentation.
get_label_name = metadata.features['label'].int2str
image, label = next(iter(train_ds))
_ = plt.imshow(image)
_ = plt.title(get_label_name(label))
2022-02-23 02:24:47.464682: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.