PyTorch: Step-by-Step Implementation of a 3D Convolutional Neural Network

Category: IT Technology · Published: 5 years ago


Learn how to code a PyTorch implementation of a 3D CNN

In this article, we will briefly explain what a 3D CNN is and how it differs from a generic 2D CNN. Then we will show you, step by step, how to implement your own 3D Convolutional Neural Network using PyTorch.

A large part of this article overlaps with my other article about 3D CNN implementation in Keras. So if you tend to code with TensorFlow/Keras instead, that article might be more appropriate.

This article is organized around these 4 parts:

  • 1] What is a 3D Convolutional Neural Network?
  • 2] What does 3D data look like? (e.g. MNIST)
  • 3] How to implement it now?!
  • 4] But why 3D? What for?

1] What is a 3D Convolutional Neural Network?

Whatever we say, a 3D CNN remains a CNN and is very similar to a 2D CNN. It differs on the following points (non-exhaustive list):

3d Convolution Layers

A 2D convolution layer performs an entry-by-entry multiplication between a patch of the input and each filter, then sums the products. Here both the filters and the inputs are 2D matrices. (fig.1)

fig.1 (rights: own)

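In a 3D convolution layer, the same operations are used, but we apply them to multiple pairs of 2D matrices: the filter also has a depth, and each of its 2D slices is multiplied with the matching slice of the input. (fig.2)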

fig.2 (rights: own)

Padding and stride options work the same way.
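
As a minimal sketch of the difference, assuming PyTorch's channels-first layout, the two layer types can be compared on random inputs. The channel counts and input sizes below are arbitrary illustrative values, not taken from the article.

```python
import torch
import torch.nn as nn

# 2D convolution: input is (batch, channels, height, width).
conv2d = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, stride=1, padding=0)
x2d = torch.randn(1, 3, 16, 16)
y2d = conv2d(x2d)

# 3D convolution: the input gains a depth axis, (batch, channels, depth, height, width),
# and the 3x3x3 filter slides along depth as well.
conv3d = nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3, stride=1, padding=0)
x3d = torch.randn(1, 3, 16, 16, 16)
y3d = conv3d(x3d)

# Padding and stride are passed exactly as in the 2D case.
conv3d_padded = nn.Conv3d(3, 8, kernel_size=3, stride=1, padding=1)
y3d_padded = conv3d_padded(x3d)

print(y2d.shape)         # torch.Size([1, 8, 14, 14])
print(y3d.shape)         # torch.Size([1, 8, 14, 14, 14])
print(y3d_padded.shape)  # torch.Size([1, 8, 16, 16, 16])
```

With a 3x3x3 filter and no padding, every spatial axis shrinks by 2; with padding 1 the size is preserved.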

3d MaxPool Layers

A 2D MaxPool layer (2×2 filter) takes the maximum element of each small 2×2 square that we delimit in the input. (fig.3)

fig.3 (rights: own)

In a 3D MaxPool (2x2x2), we instead look for the maximum element in a cube of side 2. This cube is the region delimited by the 2x2x2 window in the input. (fig.4)

fig.4 (rights: own)
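
To make the 2x2x2 pooling concrete, here is a small sketch on a 4x4x4 volume filled with the values 0 to 63. The toy volume is an assumption chosen only so the result is easy to check by hand.

```python
import torch
import torch.nn as nn

# A single-channel 4x4x4 volume holding the values 0..63,
# laid out as (batch, channels, depth, height, width).
volume = torch.arange(64, dtype=torch.float32).reshape(1, 1, 4, 4, 4)

# 2x2x2 window with stride 2: each output value is the max of one 2x2x2 cube.
pool = nn.MaxPool3d(kernel_size=2, stride=2)
pooled = pool(volume)

print(pooled.shape)                  # torch.Size([1, 1, 2, 2, 2])
print(pooled[0, 0, 0, 0, 0].item())  # 21.0, the max of the first cube
print(pooled[0, 0, 1, 1, 1].item())  # 63.0, the max of the last cube
```

The element at depth d, row h, column w holds 16d + 4h + w, so the first cube (indices 0 and 1 on every axis) peaks at 16 + 4 + 1 = 21.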

Note that, compared to 2D CNN layers, the number of operations grows with the extra dimension: it is multiplied by the size of the filter along that dimension (whether the layer is a MaxPool or a convolution) and also by the number of positions the filter takes along that dimension of the input.
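
A rough worked example of that growth, counting multiply-accumulate operations for a single-channel convolution with a 3-wide filter on an input of side 16. The helper names `conv_output_size` and `conv_macs` are hypothetical, and the count ignores biases and channels.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Number of filter positions along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_macs(input_size, kernel, dims):
    """Multiply-accumulates for one filter over a square (dims=2) or cubic (dims=3) input."""
    positions = conv_output_size(input_size, kernel) ** dims
    per_position = kernel ** dims
    return positions * per_position


macs_2d = conv_macs(16, 3, dims=2)  # 14 * 14 positions, 9 products each
macs_3d = conv_macs(16, 3, dims=3)  # 14 * 14 * 14 positions, 27 products each

print(macs_2d)             # 1764
print(macs_3d)             # 74088
print(macs_3d // macs_2d)  # 42 = 3 (filter depth) * 14 (positions along depth)
```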

2] What does 3D data look like?

So what does a data point for a 3D CNN look like?

One way to picture it is by using the following image (fig.5):

fig.5 (rights: own)

Other existing datasets that you can use for your CNN are:

3] How to implement it now?!

You can try the code for yourself on the Kaggle dataset that we are using.

Multiple libraries are used throughout the notebook. Here is the list.

To begin with, since the dataset is a bit specific, we use the following two helper functions to process the data before giving it to the network.

In addition, the dataset is stored as an h5 file, so to extract the actual data points we need to read from the h5 file and use the to_categorical function to transform the labels into vectors. In this step, we also prepare for cross-validation.
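
For readers without Keras at hand, the label-to-vector step can be sketched in plain NumPy. The `one_hot` helper below is a hypothetical stand-in for to_categorical, not the article's code.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer labels of shape (N,) into one-hot rows of shape (N, num_classes)."""
    labels = np.asarray(labels, dtype=np.int64)
    encoded = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    # Set a single 1.0 per row, at the column given by the label.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


targets = np.array([0, 2, 1])
encoded = one_hot(targets, num_classes=3)
print(encoded)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```

Note that PyTorch's nn.CrossEntropyLoss also accepts integer class indices directly, so this encoding is only needed if the rest of the pipeline expects one-hot vectors.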

The variables X_train/X_test should have shapes (10000, 16, 16, 16, 3) and (2000, 16, 16, 16, 3) respectively, and targets_train/targets_test shapes (10000,) and (2000,). We then convert all of them to PyTorch tensors, which we do the following way.
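
One possible way to do the conversion is sketched below, using small random stand-in arrays instead of the real dataset. The batch size, the 10 classes, and the choice to move the channel axis with permute are assumptions. nn.Conv3d expects channels first, while the arrays above store them last.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in arrays with the layout described above, but only 20 samples.
rng = np.random.default_rng(0)
X_train = rng.random((20, 16, 16, 16, 3), dtype=np.float32)
targets_train = rng.integers(0, 10, size=(20,))

# NumPy -> PyTorch. Move the channel axis from last to second position:
# (N, 16, 16, 16, 3) -> (N, 3, 16, 16, 16).
train_x = torch.from_numpy(X_train).permute(0, 4, 1, 2, 3).contiguous()
train_y = torch.from_numpy(targets_train).long()

# Wrap the tensors so they can be iterated over in shuffled mini-batches.
train_loader = DataLoader(TensorDataset(train_x, train_y), batch_size=5, shuffle=True)

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([5, 3, 16, 16, 16])
print(labels.shape)  # torch.Size([5])
```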

For the model here is the architecture that we will be using:

2 sets of ConvMake:

  • a 3d Convolution Layer with filter size (3x3x3) and stride (1x1x1) for both sets
  • a Leaky Relu Activation function
  • a 3d MaxPool Layer with filters size (2x2x2) and stride (2x2x2)

2 FC Layers with respectively 512 and 128 nodes.

1 Dropout Layer after the first FC layer.

The model is then translated into code the following way:
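
The following is a minimal sketch of such a model, not necessarily the author's exact code. The class name `CNN3D`, the channel counts (32 and 64), the dropout rate, and the 10 output classes are assumptions. The 512 and 128 figures are read here as the input and output sizes of the first FC layer.

```python
import torch
import torch.nn as nn


class CNN3D(nn.Module):
    """Hypothetical model following the architecture listed above."""

    def __init__(self, num_classes=10):
        super().__init__()
        # Two conv sets; the channel counts 32 and 64 are assumptions.
        self.conv1 = self._conv_set(3, 32)
        self.conv2 = self._conv_set(32, 64)
        # 64 channels * 2 * 2 * 2 voxels = 512 input nodes.
        self.fc1 = nn.Linear(64 * 2 * 2 * 2, 128)
        self.act = nn.LeakyReLU()
        self.drop = nn.Dropout(p=0.15)  # assumed dropout rate
        self.fc2 = nn.Linear(128, num_classes)

    @staticmethod
    def _conv_set(in_channels, out_channels):
        # Conv3d (3x3x3, stride 1) -> LeakyReLU -> MaxPool3d (2x2x2, stride 2).
        return nn.Sequential(
            nn.Conv3d(in_channels, out_channels, kernel_size=3, stride=1, padding=0),
            nn.LeakyReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2),
        )

    def forward(self, x):
        x = self.conv2(self.conv1(x))
        x = x.flatten(start_dim=1)            # (N, 64, 2, 2, 2) -> (N, 512)
        x = self.drop(self.act(self.fc1(x)))  # dropout after the first FC layer
        return self.fc2(x)


model = CNN3D()
logits = model(torch.randn(4, 3, 16, 16, 16))
print(logits.shape)  # torch.Size([4, 10])
```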

In terms of parameters, pay attention to the number of input nodes of your first Fully Connected Layer. Since each data point has shape (16,16,16,3), the two convolution and pooling sets reduce each 16x16x16 volume to a filtered output of size (2x2x2).
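
The arithmetic behind that (2x2x2) size can be traced layer by layer. This sketch assumes no padding in the convolutions, and the 64 output channels used for the final count are an assumption as well.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output size along one axis after a convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output size along one axis after max pooling."""
    return (size - kernel) // stride + 1


size = 16
sizes = [size]
for _ in range(2):          # two sets of convolution + pooling
    size = conv_out(size)   # 3x3x3 filter, stride 1
    sizes.append(size)
    size = pool_out(size)   # 2x2x2 window, stride 2
    sizes.append(size)

print(sizes)  # [16, 14, 7, 5, 2]

out_channels = 64  # assumed number of filters in the second convolution
fc_inputs = out_channels * size ** 3
print(fc_inputs)  # 512
```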

If you are not familiar with CNNs in PyTorch (i.e. the parameters or the training of a model), then consider reading this introduction to CNN on PyTorch!

Here is the code for training. Nothing really special; you can obviously optimise it (WAY MORE!), for instance by changing the optimizer to Adam, tweaking the learning rate (with some momentum), and much more…
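
A bare-bones training loop along those lines might look as follows. To stay self-contained it trains a tiny stand-in network on one random batch. The model, the learning rate, the momentum value, and the number of steps are all assumptions, not the article's settings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Tiny stand-in model and data so the loop runs on its own.
model = nn.Sequential(
    nn.Conv3d(3, 4, kernel_size=3),  # 16 -> 14 along each axis
    nn.LeakyReLU(),
    nn.MaxPool3d(2),                 # 14 -> 7
    nn.Flatten(),
    nn.Linear(4 * 7 * 7 * 7, 10),
)
images = torch.randn(8, 3, 16, 16, 16)
labels = torch.randint(0, 10, (8,))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# Swapping in Adam is a one-line change:
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

losses = []
model.train()
for step in range(5):
    optimizer.zero_grad()                    # reset gradients from the previous step
    loss = criterion(model(images), labels)  # forward pass and loss
    loss.backward()                          # backpropagation
    optimizer.step()                         # parameter update
    losses.append(loss.item())

print(losses)
```

In a real run the inner body would iterate over mini-batches from a DataLoader, with a separate evaluation pass on the test set to track accuracy.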

For your information, after a short training run, we got the following accuracies and losses. (fig.6 and fig.7)

fig.6 (rights: own)
fig.7 (rights: own)

4] But why 3D? What for?

There happen to be many applications for a 3D CNN, for instance:

  • MRI data processing and the inference drawn from it
  • Self-driving
  • Distance estimation

Alright, that’s pretty much all. I hope you will try this technology out! The source code is over here!

