CUDA on WSL2 for Deep Learning: First Impressions and Benchmarks

Michael Phi

Jul 2 · 6 min read

Not going to lie, Microsoft has been doing some good things in the software development community. I love coding in Visual Studio Code, and ONNX has been great if you want to optimize your deep learning models for production. WSL2 giving you access to an entire Linux kernel is exactly what I’ve been wanting, but the lack of CUDA support made it a non-starter for me as an A.I. Engineer.

As an A.I. Engineer and a Content Creator, I need Windows for the tools to create my content, but I need Linux to easily run and train my A.I. software projects. I have a dual boot setup on my machine, but it’s not ideal. I hate that while I’m training a model I can’t access my creation tools, so I have to work synchronously instead of asynchronously.

I know most deep learning libraries support Windows, but getting things working, especially open source A.I. software, was always a headache. I know I can use something like QEMU to run Windows software on Linux, but that requires dedicating an entire GPU to the VM, so my Linux instance loses access to it.

Here come Microsoft and Nvidia with CUDA WSL2 support! The promise of all Linux tools running natively on Windows would be a dream for my workflow. I immediately jumped on it when they released a preview version. In this post, I will write about my first impressions as well as some benchmarks!

Setting up CUDA on WSL2

Setting up CUDA on WSL2 was super easy for me. Nvidia has really good docs explaining the steps you need to take. I was pleasantly surprised that I did not run into a single error! I can’t remember the last time that happened when setting up software that’s still in beta.

Setting up my Machine Learning Tools

As an A.I. Engineer who spends a lot of time doing deep learning, there are a few tools I need to make my development experience much better.

Docker with CUDA support (nvidia-docker)
PyTorch as my deep learning framework of choice
Horovod for distributed training

I had not tried WSL2 before, since I dual boot into Linux, so I was pleasantly surprised that I could download and install my tools just as I would on a normal Ubuntu machine. I had no issues installing any of these packages. It was the Ubuntu experience you would expect.
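A quick sanity check that PyTorch can actually see the GPU from inside WSL2 might look like the sketch below. It is not part of the original setup: the helper name describe_cuda is made up for illustration, and it relies only on torch.cuda.is_available, torch.cuda.get_device_name, and torch.version.cuda.

```python
def describe_cuda() -> str:
    """Return a one-line summary of what PyTorch can see (hypothetical helper)."""
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        return "PyTorch is not installed in this environment"

    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible"

    # Device index 0 is assumed to be the GPU of interest.
    name = torch.cuda.get_device_name(0)
    return f"CUDA {torch.version.cuda} device visible: {name}"


print(describe_cuda())
```

If this reports that no CUDA device is visible, the likely places to look are the driver and WSL2 setup steps rather than the training code.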

Training Models

OK now, this is the real test. First I will talk about my experience, and then I’ll present some benchmarks to compare CUDA on WSL2 and bare-metal Linux.

I think a common workflow when training deep learning models, whether you have your own hardware or train in the cloud, is to keep your data on a separate disk from the operating system. WSL2 automatically detects and mounts any disk that Windows 10 recognizes, so that was cool, but I ran into issues with file permissions on my mounted data drive.

NOTE: The issue only occurs on mounted drives. Everything works fine if you stay within the WSL2 file system.
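One way to see which files on a mounted drive are affected is to scan the data directory before training starts. The sketch below is an illustration only, not something from the original workflow: the function name find_unreadable is hypothetical, and it simply reports every file the current user cannot read, along with its mode string.

```python
import os
import stat
from typing import List, Tuple


def find_unreadable(data_dir: str) -> List[Tuple[str, str]]:
    """Return (path, mode string) for each file the current user cannot read."""
    problems = []
    for root, _dirs, files in os.walk(data_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            if not os.access(path, os.R_OK):
                # stat.filemode renders the mode the way ls -l does, e.g. '-rwx------'
                mode = stat.filemode(os.stat(path).st_mode)
                problems.append((path, mode))
    return problems
```

Calling find_unreadable("/mnt/d/datasets") (a made-up path for a mounted data drive) and printing the result gives a list of the files that would trip up a data loader.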

My training script would error out on random data files because their permissions were restricted. So I read the WSL docs, which state that…

When accessing Windows files from WSL the file permissions are either calculated from Windows permissions, or are read from metadata that has been added to the file by WSL. This metadata is not enabled by default.

OK, so WSL2 calculates the file permissions, and sometimes it screws up, I guess. I just need to enable this metadata thingie to get it working, right? Well, sorta… I enabled the metadata and then ran chmod -R 777 on all of my data files as a quick and dirty way to open up the permissions so I could continue training. Well, it worked for a bit... then it broke again with the same permissions error! When I looked at the permissions, my changes had somehow been reverted to restricted access. The kicker is that if I checked the file permissions multiple times, they would revert to my chmod permissions. So the permissions were changing randomly, and the only way I could tell was by checking them with ls -l.

I had discovered a weird WSL2 quantum phenomenon that I’ll coin... Schrödinger’s file permissions. The issue stopped when I used chmod 700 to give full read, write, and execute permissions to only the WSL2 user and not to everybody and their mom. This somehow fixed the Schrödinger’s file permissions issue, so I just went on with life.
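For reference, my understanding of the WSL docs is that the metadata option is enabled through the options key of the [automount] section in /etc/wsl.conf. The sketch below checks whether a wsl.conf text has it turned on. The example contents and the function name metadata_enabled are assumptions for illustration, not the configuration used in this post.

```python
import configparser

# Example contents, assumed for illustration; the real file lives at /etc/wsl.conf.
EXAMPLE_WSL_CONF = """
[automount]
enabled = true
options = "metadata"
"""


def metadata_enabled(conf_text: str) -> bool:
    """Return True if the [automount] options include the 'metadata' flag."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("automount", "options", fallback="")
    # wsl.conf wraps the option list in quotes; strip them before splitting.
    options = [opt.strip() for opt in raw.strip().strip('"').split(",")]
    return "metadata" in options


print(metadata_enabled(EXAMPLE_WSL_CONF))
```

To check a real machine, pass the text of /etc/wsl.conf to metadata_enabled instead of the example string.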

I started training with no issues after that! Everything looked good, the model loss was going down and nothing looked out of the ordinary. I decided to do some benchmarking to compare deep learning training performance of Ubuntu vs WSL2 Ubuntu vs Windows 10.

Benchmarks: Ubuntu vs. WSL2 vs. Windows 10

To benchmark, I used the MNIST script from the PyTorch examples repo. I modified the script to make the network much bigger to get a more accurate reading for larger models.

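import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Three 3x3 convolutions (stride 1, no padding), widened to 128 channels
        self.conv1 = nn.Conv2d(1, 128, 3, 1)
        self.conv2 = nn.Conv2d(128, 128, 3, 1)
        self.conv3 = nn.Conv2d(128, 128, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        # 28x28 input -> 22x22 after the convs -> 11x11 after pooling: 128 * 11 * 11 = 15488
        self.fc1 = nn.Linear(15488, 15488 // 2)
        self.fc2 = nn.Linear(15488 // 2, 10)

    def forward(self, x):
        # Convolutional feature extractor
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = self.conv3(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        # Flatten everything except the batch dimension, then classify
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        # Log-probabilities over the 10 digit classes
        output = F.log_softmax(x, dim=1)
        return output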
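The 15488 input size of fc1 follows from the layer shapes. As a rough check, assuming 28x28 MNIST inputs and the default zero padding of nn.Conv2d, the arithmetic can be sketched like this (conv_out is a helper name made up for this example):

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output width/height of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28                                  # MNIST images are 28x28
for _ in range(3):                         # conv1, conv2, conv3: kernel 3, stride 1
    size = conv_out(size, kernel=3)
size = conv_out(size, kernel=2, stride=2)  # max_pool2d(x, 2)

channels = 128
flat_features = channels * size * size
print(size, flat_features)                 # 11 15488
```

Each 3x3 convolution trims two pixels per dimension (28, 26, 24, 22), the pooling halves that to 11, and 128 channels of 11x11 flatten to 15488 features.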

My machine specs are…

Intel i9 10920X: 12 cores, 24 threads
Nvidia RTX 2080 Ti: 11 GB VRAM

I ran 3 tests on…

Ubuntu 20.04
WSL2 Ubuntu 20.04
Windows 10

I used a batch size of 512, trained for 14 epochs, and ran in FP32 precision. Below are the results…

So the results aren’t too bad, honestly! WSL2 with CUDA support takes 18% longer than native Ubuntu to train an MNIST model on my Nvidia RTX 2080 Ti. CUDA support on WSL2 is still in early preview, and I’m hopeful that the engineers and researchers over at Microsoft and Nvidia will eventually get it close to native Ubuntu performance.
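A "takes X% longer" figure like the one above comes from comparing wall-clock training times. The sketch below shows one way to compute it. The helper names are hypothetical, and the two timings are placeholders chosen for illustration, not the measurements from this post.

```python
import time
from typing import Callable


def time_call(fn: Callable[[], object]) -> float:
    """Wall-clock seconds taken by one call to fn."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def percent_longer(baseline_s: float, candidate_s: float) -> float:
    """How much longer the candidate took than the baseline, in percent."""
    return 100.0 * (candidate_s - baseline_s) / baseline_s


# Placeholder timings in seconds, NOT the measurements from this post.
native_s, wsl2_s = 100.0, 118.0
print(f"{percent_longer(native_s, wsl2_s):.1f}% longer")  # 18.0% longer
```

When timing GPU work this way, it helps to call torch.cuda.synchronize() before reading the clock, since CUDA kernels launch asynchronously and the Python call can return before the GPU has finished.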

For some people, taking 18% longer to train models may be a non-starter, but for me, I can take the small performance hit if it means I can train deep learning models asynchronously while using my Windows-compatible software tools to create content. I’m going to stick with Windows 10 and WSL2 as my daily driver for a while and see how it goes!
