Detecting animals in the backyard — practical application of deep learning.


Source: Deep Learning on Medium

  1. Install OpenCV
  2. Multiprocessing VideoReader
  3. Tensorflow model Megadetector
  4. Batches
  5. Possible optimizations: Graph Optimize, TensorRT

As I didn't have the data, resources, and time to train my own animal detection neural network, I searched the net for what was available today. And I found that the task, even with state-of-the-art neural networks and data gathered from all over the world, is not as simple as it seemed.

Of course, there are products and research projects doing animal detection. Still, there is one main difference from what I was looking for: they detect creatures in shots from photo cameras or smartphone cameras, and such shots differ in color, shape, and quality from what you get with motion detection cameras.

But, whatever. There are still projects doing the same thing as my goal. My searches led me to the CameraTraps project from Microsoft. As I understood it, they are building an Image Recognition API using data collected from different wildlife cameras all over the world. As part of that work, they open-sourced a pre-trained model, called MegaDetector, for detecting whether an animal or a human is present in an image.

The main limitation of that model comes from its name: it is only a detector, not a classifier.

Statement from Microsoft about detectors and classifiers

Even considering limitations like that, such an approach fit me perfectly.

The model is trained to detect three different classes:

  1. Animal
  2. Person
  3. Vehicle

Raccoon identified with the Animal class

In most blog posts about video object detection, you'll find that real-time video is described. My case was a bit different: as input, I had a huge pile of video files produced by the camera, and as output, I also wanted video files.

For reading and writing video files in Python, the OpenCV library is considered the de facto standard today. It is also my favorite image manipulation package.

The logic of running inference on video file is quite straightforward:

  1. Read: Get a frame from the video
  2. Detect: Run inference on the image
  3. Write: Save the frame to the new file, with detections drawn if there are any.
  4. Repeat: Run steps 1-3 until the end of the video

It can be implemented with this code sample:

Tensorflow object detection with OpenCV VideoReader
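The original gist is not reproduced here, so below is a minimal sketch of such a read-detect-write loop. It assumes a TF1-style frozen graph in the TensorFlow Object Detection API format; the file paths, tensor names, and confidence threshold are assumptions for illustration, not taken from the original script.

import cv2
import numpy as np
import tensorflow as tf

MODEL_PATH = "megadetector.pb"      # hypothetical path to the frozen graph
INPUT_VIDEO = "input.mp4"           # hypothetical input file
OUTPUT_VIDEO = "output.avi"         # hypothetical output file

# Load the frozen detection graph (TF1 style)
detection_graph = tf.Graph()
with detection_graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile(MODEL_PATH, "rb") as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name="")

cap = cv2.VideoCapture(INPUT_VIDEO)
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
writer = cv2.VideoWriter(OUTPUT_VIDEO, cv2.VideoWriter_fourcc(*"XVID"), fps, (width, height))

with tf.Session(graph=detection_graph) as sess:
    while True:
        ok, frame = cap.read()                      # 1. Read
        if not ok:
            break
        boxes, scores = sess.run(                   # 2. Detect
            ["detection_boxes:0", "detection_scores:0"],
            feed_dict={"image_tensor:0": np.expand_dims(frame[..., ::-1], 0)},
        )
        for box, score in zip(boxes[0], scores[0]):
            if score < 0.5:                         # example confidence threshold
                continue
            y1, x1, y2, x2 = box                    # normalized [ymin, xmin, ymax, xmax]
            cv2.rectangle(frame, (int(x1 * width), int(y1 * height)),
                          (int(x2 * width), int(y2 * height)), (0, 255, 0), 2)
        writer.write(frame)                         # 3. Write

cap.release()
writer.release()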

Even though such a straightforward approach has several bottlenecks, such as reading and writing in the same thread, it works. So, if you are looking for code to try your model on a video, check that script.

It took me around 10 minutes to process a FullHD one-minute 10 FPS video file.

Detection took 9 minutes and 18.18 seconds. Average detection time per frame: 0.93 seconds

But you can find many tutorials like that, telling you how to run vanilla OpenCV/TensorFlow inference. The challenging part is making that code run continuously and with decent performance.

I/O blocks

With the code provided, reading frames, detecting, and writing back all happen in the same loop, which means sooner or later one of the operations will become a bottleneck, for example, reading video files from not-very-stable network storage.

To get rid of that bottleneck, I used instructions from a fantastic computer vision blogger, Adrian Rosebrock, and his library imutils. He suggests splitting frame reading and frame processing into multiple threads, and such an approach gave me a prepopulated queue of frames ready to be processed.

Modified FileVideoStream from Adrian Rosebrock
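The modified class is not shown here, but a stripped-down sketch of the same pattern (a background thread filling a bounded queue while the main thread consumes frames) could look like this; the class name and queue size are just examples:

import queue
import threading

import cv2

class ThreadedVideoReader:
    # Reads frames on a background thread into a bounded queue so slow
    # disk or network reads do not block the detection loop
    def __init__(self, path, queue_size=128):
        self.cap = cv2.VideoCapture(path)
        self.frames = queue.Queue(maxsize=queue_size)
        self.thread = threading.Thread(target=self._reader, daemon=True)
        self.thread.start()

    def _reader(self):
        while True:
            ok, frame = self.cap.read()
            if not ok:
                break
            self.frames.put(frame)      # blocks while the queue is full
        self.frames.put(None)           # sentinel: end of the stream
        self.cap.release()

    def read(self):
        return self.frames.get()        # None means the video is over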

It won't impact inference time much, but it helps with slow drives, which are often used for video storage.

Optimization: Graph analysis

Another thing I had heard about was optimizing models for deployment. I followed a guide discovered here: https://towardsdatascience.com/optimize-nvidia-gpu-performance-for-efficient-model-inference-f3e9874e9fdc and managed to achieve some improvement by assigning layers without GPU support to be processed on the CPU.

[INFO] :: Detection took 8 minutes and 39.91 seconds. Average detection time per frame: 0.86 seconds
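The guide boils down to rewriting the device placement of individual nodes in the frozen graph before importing it. A rough sketch of that idea follows; the op types listed and the file path are only examples, and which nodes actually benefit depends on your graph.

import tensorflow as tf

# Example op types that are often cheaper on the CPU; the actual list
# depends on your graph and on the guide you follow
CPU_OP_TYPES = {"NonMaxSuppressionV3", "Where", "TopKV2"}

graph_def = tf.GraphDef()
with tf.gfile.GFile("megadetector.pb", "rb") as f:      # hypothetical path
    graph_def.ParseFromString(f.read())

# Pin the selected nodes to the CPU before importing the graph
for node in graph_def.node:
    if node.op in CPU_OP_TYPES:
        node.device = "/device:CPU:0"

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name="")

# allow_soft_placement lets TensorFlow fall back if a device choice is invalid
config = tf.ConfigProto(allow_soft_placement=True)
sess = tf.Session(graph=graph, config=config)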

Batch inference

Based on my previous experience, one of the bottlenecks in deep learning training was data transfer from disk to GPU; to minimize that time, so-called batches are used, where the GPU gets several images at once.

I wondered if it was possible to do the same batch processing on inference. And luckily, it was, according to a StackOverflow answer.

I just needed to find the largest acceptable batch size and pass an array of frames for inference. For that, I extended the FileVideoStream class with batch functionality.
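The extended class is not shown in the post, but the idea can be sketched as follows, building on the threaded reader and the frozen-graph session from the sketches above; the batch size, tensor names, and read_batch helper are assumptions for illustration.

import numpy as np

BATCH_SIZE = 8   # example value: the largest batch that fits in GPU memory

def read_batch(stream, batch_size=BATCH_SIZE):
    # Collect up to batch_size frames from the threaded reader above
    frames = []
    while len(frames) < batch_size:
        frame = stream.read()
        if frame is None:                 # end of the video
            break
        frames.append(frame)
    return frames

# Inside the detection loop: stack frames into one [N, H, W, 3] batch
frames = read_batch(stream)               # stream: ThreadedVideoReader from above
if frames:
    batch = np.stack([f[..., ::-1] for f in frames])      # BGR -> RGB
    boxes, scores = sess.run(              # sess: session from the earlier sketch
        ["detection_boxes:0", "detection_scores:0"],
        feed_dict={"image_tensor:0": batch},
    )
    # boxes[i] and scores[i] correspond to frames[i]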

[INFO] :: Detection took 8 minutes and 1.12 second. Average detection time per frame: 0.8 seconds

Optimization: Compiling from source

Another important part, when we are talking about running heavy, time-consuming computations, is squeezing the most from the hardware.

One of the most straightforward approaches is using packages optimized for your machine type. Here is the message every TensorFlow user has seen:

tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

It means that TensorFlow is underutilizing the hardware because it ignores built-in CPU optimizations. The reason is that the generic package was installed, which works on any type of x86 machine.

One way of increasing its performance is to install an optimized package from third parties such as https://github.com/lakshayg/tensorflow-build, https://github.com/mind/wheels or https://github.com/yaroslavvb/tensorflow-community-wheels/issues

Another way is to follow the instructions from Google and build the package from source: https://www.tensorflow.org/install/source#tensorflow_1x. But consider that it could be a bit difficult if you haven't done it before, and it is quite a time- and RAM-consuming process (last time it took 3.5 hours on my six-core CPU).

The same goes for OpenCV, but that is an even more complex topic, so I'm not covering it here. There are handy guides by Adrian Rosebrock; if you are interested in that topic, please follow them.

Share

A small Python application waits for the incoming videos with detections. As a video arrives, it posts it to my Telegram channel. I reused my previous project, which was resending incoming videos to my Telegram channel.

The app is configured like this and continuously monitors a folder for new files using the watchdog library:

{
    "xiaomi_video_watch_dir": PATH_TO_WATCH,
    "xiaomi_video_temp_dir": PATH_TO_STORE_TEMP_FILES,
    "xiaomi_video_gif_dir": PATH_WITH_OUTPUT_GIFS,
    "tg_key": TELEGRAM_KEY
}
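The monitoring code itself is not shown in the post; a minimal sketch of how the watchdog library can be wired to such a config might look like this, where process_video and send_to_telegram are hypothetical placeholders for the detection step and the Telegram notification:

import json
import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

with open("config.json") as f:          # the config shown above, with real values
    config = json.load(f)

class NewVideoHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.is_directory or not event.src_path.endswith(".mp4"):
            return
        # process_video and send_to_telegram are hypothetical placeholders
        process_video(event.src_path, config["xiaomi_video_gif_dir"])
        send_to_telegram(config["tg_key"], event.src_path)

observer = Observer()
observer.schedule(NewVideoHandler(), config["xiaomi_video_watch_dir"], recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()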
Initial Telegram group version

What didn't work out?

This project brought me lots of new learnings, and even though I managed to reach my final goal, I went through some failed trials. And I think that is one of the most important parts of any project.

Image Enhancing

During my research, I came across several reports from iWildCam Kaggle competition participants. They quite often mentioned applying the CLAHE algorithm to input images for histogram equalization. I tried that algorithm and several others, but with no success: applying image modifications dropped the number of successful detections. To be honest, though, night camera images did look sharper and crisper.

import cv2 as cv

# cv.xphoto needs opencv-contrib-python; CLAHE parameters are example values
wb = cv.xphoto.createSimpleWB()
clahe = cv.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))

def enhance_image(frame):
    # White-balance the frame, then equalize the L channel with CLAHE
    img_wb = wb.balanceWhite(frame)
    img_lab = cv.cvtColor(img_wb, cv.COLOR_BGR2Lab)
    l, a, b = cv.split(img_lab)
    img_l = clahe.apply(l)
    img_clahe = cv.merge((img_l, a, b))
    return cv.cvtColor(img_clahe, cv.COLOR_Lab2BGR)
