This approach enables the model to learn spatial as well as temporal information about the appearance and movement of the objects in a scene. Each stream performs image (frame) classification on its own, and in the end, the predicted scores are merged using the fusion layer.

This approach is the opposite of late fusion: the temporal dimension and the channel (RGB) dimension of the video are fused at the start, before the input is passed to the model. This allows the first layer to operate over frames and learn to identify local pixel motions between adjacent frames.

Now we have two NumPy arrays, one containing all images and the second containing all class labels in one-hot encoded format. Let us split our data to create a training and a testing set. We must shuffle the data before the split, which we have already done (see the sketch below).

Consider the actions of Standing Up from a Chair and Sitting Down on a Chair. In both actions, the frames are almost the same. The main differentiator is the order of the frame sequence, so you need temporal information to correctly predict these actions.
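As a minimal sketch of that split (illustrative only: the names features, one_hot_encoded_labels, and seed_constant are assumptions about the surrounding code, not taken from the original):

from sklearn.model_selection import train_test_split

# features: NumPy array holding the stacked image data
# one_hot_encoded_labels: NumPy array of one-hot encoded class labels
# shuffle=True randomizes sample order before the split
features_train, features_test, labels_train, labels_test = train_test_split(
    features, one_hot_encoded_labels, test_size=0.25,
    shuffle=True, random_state=seed_constant)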
!pip install pafy youtube-dl moviepy
import os
import cv2
import math
import pafy
import random
import numpy as np
import datetime as dt
import tensorflow as tf
from moviepy.editor import *
from collections import deque
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.model_selection import train_test_split

from tensorflow.keras.layers import *
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.utils import plot_model
seed_constant = 23
np.random.seed(seed_constant)
random.seed(seed_constant)
tf.random.set_seed(seed_constant)
!wget -nc --no-check-certificate https://www.crcv.ucf.edu/data/UCF50.rar
!unrar x UCF50.rar -inul -y
--2021-02-01 05:58:40--  https://www.crcv.ucf.edu/data/UCF50.rar
Resolving www.crcv.ucf.edu (www.crcv.ucf.edu)... 132.170.214.127
Connecting to www.crcv.ucf.edu (www.crcv.ucf.edu)|132.170.214.127|:443... connected.
WARNING: cannot verify www.crcv.ucf.edu's certificate, issued by 'CN=InCommon RSA Server CA,OU=InCommon,O=Internet2,L=Ann Arbor,ST=MI,C=US':
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 3233554570 (3.0G) [application/rar]
Saving to: 'UCF50.rar'

UCF50.rar           100%[===================>]   3.01G  33.5MB/s    in 50s

2021-02-01 05:59:30 (61.8 MB/s) - 'UCF50.rar' saved [3233554570/3233554570]
# Create a Matplotlib figure
plt.figure(figsize=(30, 30))

# Get names of all classes in UCF50
all_classes_names = os.listdir('UCF50')

# Generate a random sample of images each time the cell runs
random_range = random.sample(range(len(all_classes_names)), 20)

# Iterate through all the random samples
for counter, random_index in enumerate(random_range, 1):

    # Get the class name using the random index
    selected_class_name = all_classes_names[random_index]

    # Get a list of all the video files present in the class directory
    video_files_names_list = os.listdir(f'UCF50/{selected_class_name}')

    # Randomly select a video file
    selected_video_file_name = random.choice(video_files_names_list)

    # Read the video file using the VideoCapture object
    video_reader = cv2.VideoCapture(f'UCF50/{selected_class_name}/{selected_video_file_name}')

    # Read the first frame of the video file
    _, bgr_frame = video_reader.read()

    # Close the VideoCapture object and release all resources
    video_reader.release()

    # Convert the frame from BGR to RGB
    rgb_frame = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB)

    # Add the class name text on top of the video frame
    cv2.putText(rgb_frame, selected_class_name, (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2)

    # Assign the frame to a specific position in the subplot grid
    plt.subplot(5, 4, counter)
    plt.imshow(rgb_frame)
    plt.axis('off')
In this tutorial, you will learn how to perform video classification using Keras, Python, and Deep Learning. Along the way, we'll cover the difference between video classification and standard image classification. To learn how to perform video classification with Keras and deep learning, just keep reading!
Extract the .zip and navigate into the project folder from your terminal:
$ unzip keras-video-classification.zip
$ cd keras-video-classification
The data we’ll be using today is in the following path:
$ cd keras-video-classification
$ ls Sports-Type-Classifier/data | grep -Ev "urls|models|csv|pkl"
football
tennis
weight_lifting
Now that we have our project folder and Anubhav Maity's repo sitting inside, let's review our project structure:
$ tree --dirsfirst --filelimit 50
.
├── Sports-Type-Classifier
│   ├── data
│   │   ├── football [799 entries]
│   │   ├── tennis [718 entries]
│   │   └── weight_lifting [577 entries]
├── example_clips
│   ├── lifting.mp4
│   ├── soccer.mp4
│   └── tennis.mp4
├── model
│   ├── activity.model
│   └── lb.pickle
├── output
├── plot.png
├── predict_video.py
└── train.py

8 directories, 8 files
Let’s go ahead and parse our command line arguments now:
# import the necessary packages (used by this and the following snippets)
from imutils import paths
import argparse
import cv2
import os

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True, help="path to input dataset")
ap.add_argument("-m", "--model", required=True, help="path to output serialized model")
ap.add_argument("-l", "--label-bin", required=True, help="path to output label binarizer")
ap.add_argument("-e", "--epochs", type=int, default=25, help="# of epochs to train our network for")
ap.add_argument("-p", "--plot", type=str, default="plot.png", help="path to output loss/accuracy plot")
args = vars(ap.parse_args())
With our command line arguments parsed and in hand, let's proceed to initialize our LABELS set and load our data:
# initialize the set of labels from the sports activity dataset we are
# going to train our network on
LABELS = set(["weight_lifting", "tennis", "football"])

# grab the list of images in our dataset directory, then initialize
# the list of data (i.e., images) and class labels
print("[INFO] loading images...")
imagePaths = list(paths.list_images(args["dataset"]))
data = []
labels = []

# loop over the image paths
for imagePath in imagePaths:
    # extract the class label from the filename
    label = imagePath.split(os.path.sep)[-2]

    # if the label of the current image is not part of the labels
    # we are interested in, then ignore the image
    if label not in LABELS:
        continue

    # load the image, convert it to RGB channel ordering, and resize
    # it to be a fixed 224x224 pixels, ignoring aspect ratio
    image = cv2.imread(imagePath)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, (224, 224))

    # update the data and labels lists, respectively
    data.append(image)
    labels.append(label)
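A natural next step is to convert these Python lists to NumPy arrays and one-hot encode the labels. The sketch below assumes scikit-learn's LabelBinarizer, a common choice in Keras pipelines; it is an illustration, not necessarily the article's exact code:

import numpy as np
from sklearn.preprocessing import LabelBinarizer

# convert the data and labels lists to NumPy arrays (assumed step)
data = np.array(data)
labels = np.array(labels)

# one-hot encode the labels so they can be fed to a softmax classifier
lb = LabelBinarizer()
labels = lb.fit_transform(labels)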
!pip install -q git+https://github.com/tensorflow/docs
WARNING: Built wheel for tensorflow-docs is invalid: Metadata 1.2 mandates PEP 440 version, but '0.0.0543363dfdc669b09def1e06abdd34b76337fba4e-' is not
DEPRECATION: tensorflow-docs was installed using the legacy 'setup.py install' method, because a wheel could not be built for it. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.
!wget -q https://git.io/JGc31 -O ucf101_top5.tar.gz
!tar xf ucf101_top5.tar.gz
from tensorflow_docs.vis import embed
from tensorflow.keras import layers
from tensorflow import keras

import matplotlib.pyplot as plt
import tensorflow as tf
import pandas as pd
import numpy as np
import imageio
import cv2
import os
2021-09-14 13:26:26.593418: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-09-14 13:26:26.593444: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
MAX_SEQ_LENGTH = 20
NUM_FEATURES = 1024
IMG_SIZE = 128
EPOCHS = 5
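For context, NUM_FEATURES = 1024 matches the pooled output size of DenseNet121, so a per-frame feature extractor consistent with these constants could look like the sketch below. The choice of DenseNet121 is an assumption for illustration, not confirmed by the text:

from tensorflow import keras

def build_feature_extractor():
    # DenseNet121 with global average pooling yields a 1024-dimensional
    # feature vector per frame (hence NUM_FEATURES = 1024 above); this
    # backbone is an assumed choice, not the tutorial's confirmed one
    backbone = keras.applications.DenseNet121(
        weights="imagenet",
        include_top=False,
        pooling="avg",
        input_shape=(IMG_SIZE, IMG_SIZE, 3),
    )
    inputs = keras.Input((IMG_SIZE, IMG_SIZE, 3))
    # scale pixel values the way DenseNet expects
    preprocessed = keras.applications.densenet.preprocess_input(inputs)
    outputs = backbone(preprocessed)
    return keras.Model(inputs, outputs, name="feature_extractor")

feature_extractor = build_feature_extractor()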
Training our Video Classification Model

It's finally time to train our video classification model! I'm sure this is the most anticipated section of the tutorial. I have divided this step into sub-steps for ease of understanding.

Evaluating our Video Classification Model

In this article, we covered one of the most interesting applications of computer vision: video classification. We first understood how to deal with videos, then we extracted frames, trained a video classification model, and finally obtained an accuracy of 44.8% on the test videos.
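As an illustrative sketch of that evaluation step (model, X_test, and y_test are hypothetical names, not the article's actual variables):

# evaluate the trained Keras model on the held-out test set (sketch)
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"Test accuracy: {accuracy * 100:.1f}%")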
You can download the dataset from the official UCF101 site. The dataset comes as a .rar archive, so we first have to extract the videos from it. Create a new folder, say 'Videos' (you can pick any other name as well), and then use the following command to extract all the downloaded videos:
unrar e UCF101.rar Videos/
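Once the videos are extracted, frames can be pulled from each clip with OpenCV. The helper below is a minimal sketch (the function name, sampling rate, and example filename are assumptions for illustration):

import cv2

def extract_frames(video_path, every_n=5):
    # read the video and keep every n-th frame, converted from BGR to RGB
    frames = []
    capture = cv2.VideoCapture(video_path)
    count = 0
    while True:
        grabbed, frame = capture.read()
        if not grabbed:
            break
        if count % every_n == 0:
            frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        count += 1
    capture.release()
    return frames

# example usage on a hypothetical extracted video
frames = extract_frames('Videos/v_ApplyEyeMakeup_g01_c01.avi')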