Are you ready for a Video Classification Challenge?

Preparation Guide for Video Classification

Apr 25 · 6 min read

Image source: Stall Catchers

To help you learn new skills and win some prize money online while working from home, we at MathWorks are launching a data science competition.

Teaser: The datathon will go live in May. Sign up for a DrivenData account to receive the launch announcement. Request complimentary MATLAB licenses here: Advance Alzheimer’s Research with Stall Catchers

The dataset will consist of image stacks. Each image stack (a 3D image) is taken from a live mouse brain and shows blood vessels and blood flow. Each stack will have an outline drawn around a target vessel segment and will be converted to an .mp4 video file. The problem is to classify the target vessel segment as either flowing or stalled. The challenge will be online, globally accessible and free to participate in. You can use any approach to solve the problem.

In this story, I will talk about the concepts and methods I learned while setting up this problem. I will also point you to documents you can refer to as you start preparing for the challenge.

Working with Data

Video Data

Working with videos is an extension of working with images; in addition to the static content of an image, we must consider the dynamic nature of a video. A video can be defined as a stack of images, also referred to as frames, arranged in a specific order. Each frame is meaningful on its own, but the order is also very important. Hence both the spatial and the temporal content of the frames need to be captured.

So the first step is to extract frames from the video. Make sure the extracted frames keep their original order, so that a model can do both sequence modeling and temporal reasoning on them.
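As a minimal sketch of this first step, written in Python rather than MATLAB, the snippet below treats a video as a plain list of frames and picks evenly spaced frames while preserving their order. The helper name `sample_frames` and the toy data are assumptions made for illustration; they are not part of the challenge tooling.

```python
def sample_frames(frames, num_samples):
    """Return num_samples evenly spaced frames, keeping their original order."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    total = len(frames)
    if total <= num_samples:
        return list(frames)
    if num_samples == 1:
        return [frames[0]]
    # Spread the chosen indices evenly from the first frame to the last one.
    step = (total - 1) / (num_samples - 1)
    indices = [round(i * step) for i in range(num_samples)]
    return [frames[i] for i in indices]


# A toy "video": 10 frames, each a 2x2 grayscale image filled with its frame number.
video = [[[t, t], [t, t]] for t in range(10)]
sampled = sample_frames(video, 4)
print([frame[0][0] for frame in sampled])  # frame numbers of the sampled frames
```

Because the indices only ever increase, the sampled frames remain a valid sequence, just a shorter one.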

Process Data

Another challenge in working with videos is the large size of the dataset. In MATLAB, you can use a datastore to create a repository for collections of data that are too large to fit in memory. A datastore allows you to read and process data stored in multiple files on a disk, a remote location, or a database as a single entity.
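To make the idea concrete, here is a rough Python sketch of the pattern behind a datastore: keep only the file paths around and load one file at a time when it is requested. The class `FileDatastore` below is a hypothetical, simplified stand-in and not MATLAB's implementation; its method names only loosely mirror MATLAB's `hasdata`, `read` and `reset`.

```python
import os
import tempfile


class FileDatastore:
    """Hypothetical stand-in for the datastore idea: it stores file paths
    and loads a single file only when read() is called."""

    def __init__(self, paths, read_fn):
        self.paths = list(paths)
        self.read_fn = read_fn
        self._next = 0

    def has_data(self):
        return self._next < len(self.paths)

    def read(self):
        path = self.paths[self._next]
        self._next += 1
        return self.read_fn(path)

    def reset(self):
        self._next = 0


def read_bytes(path):
    with open(path, "rb") as handle:
        return handle.read()


# Demo: three small files stand in for large video files.
with tempfile.TemporaryDirectory() as folder:
    paths = []
    for i in range(3):
        path = os.path.join(folder, f"clip_{i}.bin")
        with open(path, "wb") as handle:
            handle.write(bytes([i]) * (i + 1))
        paths.append(path)

    store = FileDatastore(paths, read_bytes)
    sizes = []
    while store.has_data():
        sizes.append(len(store.read()))  # only one file's content is held here

print(sizes)
```

The loop processes the whole collection as a single entity, yet memory use is bounded by the largest individual file.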

Documents to refer:

Video Classification Methods

Once the data is ready, you can use any of the five methods below to proceed with classification. I will cover the most commonly used video classification methods, from a basic non-deep-learning approach to more advanced ones. I would encourage you to use the deep learning approaches, given the size of the data and the need to extract features from each frame in a timely manner.

Classical Computer Vision Methods

Method 1: Optical Flow, Object Tracking & Cascade Classifier

Optical flow, activity recognition, motion estimation and tracking are the key techniques you can use to determine the classes present in a video and how they move between adjacent frames.
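To give a feel for motion estimation between adjacent frames, here is a small Python sketch. It uses exhaustive block matching over the whole frame, which is a much simpler technique than a real optical flow algorithm and is only meant to illustrate the idea. The function names and the toy frames are assumptions.

```python
def estimate_shift(prev_frame, next_frame, max_shift=2):
    """Find the (dy, dx) shift that best maps prev_frame onto next_frame by
    exhaustive search, scoring each shift by mean absolute difference."""
    height, width = len(prev_frame), len(prev_frame[0])
    best_shift, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0, 0
            for y in range(height):
                for x in range(width):
                    ny, nx = y + dy, x + dx
                    # Compare only pixels that stay inside the frame.
                    if 0 <= ny < height and 0 <= nx < width:
                        total += abs(prev_frame[y][x] - next_frame[ny][nx])
                        count += 1
            cost = total / count
            if cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift


def make_frame(top, left, size=6):
    """Toy frame: a bright 2x2 blob on a dark background."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + 2):
        for x in range(left, left + 2):
            frame[y][x] = 255
    return frame


# The blob moves one pixel down and two pixels to the right.
shift = estimate_shift(make_frame(1, 1), make_frame(2, 3))
print(shift)  # (1, 2)
```

A stalled vessel would, loosely speaking, show little or no such motion inside the outlined segment, which is why motion cues are a reasonable starting point.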

Resources to refer:

Another approach is to use local features of an image, such as blobs, corners and edge pixels. The cascade classifier supports local feature types such as Haar, local binary patterns (LBP) and histograms of oriented gradients (HOG).
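As an illustration of one of these local features, the Python sketch below computes a basic 8-neighbor local binary pattern. Conventions such as the neighbor order and whether the comparison is `>=` or `>` vary between implementations, so treat the choices here as assumptions rather than as what any particular cascade classifier uses.

```python
def lbp_code(image, y, x):
    """8-bit local binary pattern for the pixel at (y, x).
    A neighbor contributes a 1 bit when its value is >= the center value."""
    center = image[y][x]
    # Neighbors visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if image[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Histogram of LBP codes over all interior pixels (a 256-bin feature vector)."""
    histogram = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            histogram[lbp_code(image, y, x)] += 1
    return histogram


# Bright pixels above and to the right of the center, dark pixels elsewhere.
patch = [
    [9, 9, 9],
    [1, 5, 9],
    [1, 1, 1],
]
print(lbp_code(patch, 1, 1))  # 15: the first four neighbors set bits 0 to 3
```

The histogram of these codes over a region is what typically serves as the feature vector for a classifier.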

Resources to refer:

Deep Learning Methods

Method 2: Convolutional Neural Network (CNN) + Long short-term memory network (LSTM)

In this method, you convert the videos to a sequence of feature vectors using a pre-trained convolutional neural network to extract features from each frame. Then train a Long short-term memory (LSTM) network on the sequences to predict the video labels. As a final step, combine layers from both networks to assemble a final network that classifies videos directly.

To learn the steps of this complete workflow, see this document: Classify Videos Using Deep Learning

Image source: MATLAB documentation
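The shape of the CNN + LSTM pipeline can be sketched in plain Python as follows. Everything here is a toy under stated assumptions: `frame_features` is a hand-written stand-in for a pre-trained CNN, the LSTM weights are random and untrained, and the final score is therefore meaningless. The point is only to show how per-frame feature vectors flow through an LSTM one time step at a time.

```python
import math
import random


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def frame_features(frame):
    """Stand-in for a pre-trained CNN: mean and max brightness of the frame."""
    pixels = [p for row in frame for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


def lstm_step(x, h, c, weights):
    """One LSTM time step. weights maps each gate name to (W, b); every row
    of W multiplies the concatenation of the input x and hidden state h."""
    z = x + h  # list concatenation, not addition

    def gate(name, activation):
        W, b = weights[name]
        return [activation(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(W, b)]

    i = gate("input", sigmoid)
    f = gate("forget", sigmoid)
    o = gate("output", sigmoid)
    g = gate("candidate", math.tanh)
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def classify_video(frames, weights, readout, hidden_size):
    """Feed frame features through the LSTM, then score the last hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for frame in frames:
        h, c = lstm_step(frame_features(frame), h, c, weights)
    return sigmoid(sum(w * v for w, v in zip(readout, h)))


random.seed(0)
hidden_size, input_size = 3, 2
weights = {
    name: ([[random.uniform(-0.5, 0.5) for _ in range(input_size + hidden_size)]
            for _ in range(hidden_size)],
           [0.0] * hidden_size)
    for name in ("input", "forget", "output", "candidate")
}
readout = [random.uniform(-0.5, 0.5) for _ in range(hidden_size)]
video = [[[0.1 * t, 0.2], [0.3, 0.1 * t]] for t in range(5)]
score = classify_video(video, weights, readout, hidden_size)
print(0.0 < score < 1.0)  # the untrained score is still a valid probability
```

In a real workflow the feature extractor would be a pre-trained network and the LSTM weights would be learned from labeled videos.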

Method 3: Large-scale video classification with CNN

If video classification is like image classification, why not just use a convolutional neural network?

To answer this, recall the temporal component of a video that I talked about earlier. To capture both the temporal and the spatial aspects you can use a CNN, but you need to structure the network differently.

This paper from Stanford, Large-scale Video Classification with Convolutional Neural Networks, discusses the challenges of applying a basic CNN to videos. It then describes the different CNN models you can use to fuse features from multiple frames.

Image source: Research paper
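The paper's fusion variants include early fusion, where the first layer sees several frames at pixel level, and late fusion, where separate single-frame towers are combined only at the feature level. The toy Python sketch below contrasts the two. The "towers" are hand-written stand-ins (global mean brightness and a hand-picked temporal filter), not the paper's architectures, so this is only an intuition aid.

```python
def mean(values):
    return sum(values) / len(values)


def single_frame_feature(frame):
    """Stand-in for a single-frame CNN tower: global mean brightness."""
    return mean([p for row in frame for p in row])


def late_fusion(frame_a, frame_b):
    """Each frame goes through its own tower; only the features are combined."""
    return [single_frame_feature(frame_a), single_frame_feature(frame_b)]


def early_fusion(frame_a, frame_b, temporal_weights=(-1.0, 1.0)):
    """The first layer sees both frames at pixel level. The hand-picked
    temporal filter (an assumption) responds to per-pixel change."""
    wa, wb = temporal_weights
    responses = [abs(wa * pa + wb * pb)
                 for row_a, row_b in zip(frame_a, frame_b)
                 for pa, pb in zip(row_a, row_b)]
    return mean(responses)


# A bright pixel moves one step to the right; overall brightness is unchanged.
frame_a = [[0, 1, 0, 0]]
frame_b = [[0, 0, 1, 0]]
late = late_fusion(frame_a, frame_b)
early = early_fusion(frame_a, frame_b)
print(late)   # [0.25, 0.25]: these toy per-frame features look identical
print(early)  # 0.5: the pixel-level temporal filter picks up the motion
```

With these deliberately crude features, the motion is invisible after late fusion but visible with early fusion. Real towers keep far more information, so the contrast is less stark in practice.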

Method 4: Two-stream CNN

Another approach, explained by the researchers in the paper Two-Stream Convolutional Networks for Action Recognition in Videos, uses two conv-nets: one for the spatial aspect and one for the temporal aspect.

Image Source: Research paper
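One simple way to combine the two streams is to average their class probabilities. The Python sketch below shows that step only; the two networks themselves are not modeled, and the logits fed in are made-up numbers chosen for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(spatial_logits, temporal_logits, labels):
    """Average the per-class probabilities of the two streams and pick the best."""
    spatial = softmax(spatial_logits)
    temporal = softmax(temporal_logits)
    fused = [(s + t) / 2 for s, t in zip(spatial, temporal)]
    best = max(range(len(labels)), key=lambda k: fused[k])
    return labels[best], fused


labels = ["flowing", "stalled"]
# Hypothetical outputs: the spatial stream is unsure, the temporal stream is not.
label, fused = fuse_streams([0.2, 0.0], [-1.0, 2.0], labels)
print(label)  # stalled
```

Here the confident temporal stream outweighs the nearly undecided spatial stream, which is the kind of complementarity the two-stream design is after.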

Documents to refer to when developing a CNN architecture in MATLAB:

Method 5: Using a 3D convolution network

3D ConvNets are a natural choice for video classification because they inherently apply convolutions and max pooling in 3D space. In the paper Learning Spatiotemporal Features with 3D Convolutional Networks, researchers propose C3D (Convolutional 3D), which yields compact features and is efficient to compute.

Image source: Research paper
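To show what "convolution in 3D space" means, here is a naive Python sketch of a single 3D filter sliding over a frames x height x width volume. It uses no padding and a stride of 1, and the kernel values are hand-picked for the example; a real network learns many such kernels and runs on optimized libraries.

```python
def conv3d_valid(volume, kernel):
    """Naive 3D cross-correlation over a T x H x W volume (no padding, stride 1)."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    output = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # The kernel spans time (dt) as well as space (dy, dx).
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        output.append(plane)
    return output


# 4 frames of 3x3 pixels; every pixel in frame t has the value t.
volume = [[[float(t)] * 3 for _ in range(3)] for t in range(4)]
# A 2x2x2 kernel that subtracts the earlier frame from the later one.
kernel = [[[-0.25, -0.25], [-0.25, -0.25]],
          [[0.25, 0.25], [0.25, 0.25]]]
result = conv3d_valid(volume, kernel)
print(len(result), len(result[0]), len(result[0][0]))  # 3 2 2
print(result[0][0][0])  # 1.0, the brightness increase per frame
```

Because the kernel extends along the time axis, a single filter can respond to change between frames, which a 2D filter applied to one frame cannot do.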

Documents to refer:

Next Steps

If you do not have a MATLAB license, start your preparation by requesting complimentary MATLAB licenses here: Advance Alzheimer’s Research with Stall Catchers.

Stay tuned for further updates in my next blog post, which will appear in May on the competition launch day. It will present the benchmark code for the problem along with all other details.

Feel free to share your feedback or any questions you have in the comments below.

