Metastatic Cancer Detection in Histopathological Image Scans

Using Machine Learning to predict if Cancer is within Microscopic Images.

Feb 29 ·7min read

The Terrible Tale of Cancer

Cancer remains one of the world's deadliest diseases, killing roughly 10 million people a year. One reason it is so deadly is that, left unattended for even a short time, it can spread through the body before anyone notices. Early diagnosis is the key to battling cancer, and machine learning is revolutionizing early diagnosis.

Machine learning can pick up patterns that are hard for humans to see, including very abstract patterns in images. With it, a model can be trained to distinguish cancerous cells from non-cancerous cells in image scans. Convolutional Neural Networks are especially well suited to this problem, since they are designed to find relationships in spatially structured data like image scans.

Metastatic Cancer

Cancer usually starts in one primary site in the body and then spreads. Cancer that has spread from that primary site to other parts of the body is metastatic cancer. It's essentially secondary cancer that is a direct result of the primary cancer. So why look for secondary cancer and not the root cause? Because secondary cancer can be easier to spot when scanning the body: when someone has cancer, metastatic cancer is often present in small quantities elsewhere in their body. Finding metastatic cancer therefore acts as an indicator that someone has a serious case of cancer somewhere else in their body.

Specifically, metastatic cancer can be found in different organs in the body. We can examine histopathological image scans, which are microscopic images of tissue and cells from the body. These scans show the cells of the sampled region in great detail and can be used to examine different diseases. However, it's hard to tell from these scans whether cancer is present unless you are a trained expert. Machine learning can help here by predicting whether metastatic cancer is present in a histopathological image scan.

[Figure: Histopathological image scan]

Machine Learning + Convolutional Neural Networks

Machine learning is all about getting machines to learn how to perform a specific task. Instead of being given explicit instructions, a machine learns to create its own instructions. In practice, these instructions are numerical values. A machine learning model is just a differentiable function that takes a fixed number of inputs, uses variables (weights) to transform them, and produces an output. It's like y = mx + b: x is the input of the model, m and b are the variables (weights), and y is the output. The learning algorithm has to optimize the weights m and b so that, for any given input, the model produces an output as close as possible to the desired one. How do we optimize the weights? There are essentially two steps: computing a loss and backpropagation.
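As a minimal sketch of the y = mx + b idea, the model can be written as a plain function of its input and its weights. The function name and the weight values here are hypothetical, picked only for illustration.

```python
def linear_model(x, m, b):
    """Return the model output y for input x, given the weights m and b."""
    return m * x + b


# Hypothetical weights; a trained model would have learned these from data.
m, b = 2.0, 1.0
print(linear_model(3.0, m, b))  # 2 * 3 + 1 = 7.0
```

Training never changes the function itself, only the values of m and b.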

The Loss Function

The loss function measures how badly the model performed. The higher the loss, the worse the model performs. We can compute the loss using labelled data. We give the model an input x and let it produce its own output based on its weights. Since the data is labelled, we know what the output should be, so we can compare the model's output to the real output and measure the difference. For example, the model's output might be 3 and the real output might be 5. A loss for this example could be 2, which is how far apart the two outputs are. Ideally the model would always produce a loss of 0, which means we need to minimize the loss function.
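The 3-versus-5 example can be sketched in a few lines. The absolute error below reproduces the loss of 2 from the text; the mean squared error is a common alternative shown only for comparison, and both function names are made up for this sketch.

```python
def absolute_error(predicted, target):
    """Distance between the model's output and the real output."""
    return abs(predicted - target)


def mean_squared_error(predictions, targets):
    """Average squared difference over a batch of labelled examples."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared) / len(squared)


print(absolute_error(3, 5))                # 2
print(mean_squared_error([3, 5], [5, 5]))  # (4 + 0) / 2 = 2.0
```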

Backpropagation

Backpropagation is the process of going back through the model's weights and changing them to minimize the loss function. It does this by calculating the gradient of the loss function with respect to each weight. The gradient is the instantaneous rate of change of the loss, so it tells us which direction makes the loss grow fastest; by nudging each weight a small step in the opposite direction, we move toward weights that produce a lower loss. This is called gradient descent and is key to a lot of machine learning algorithms. We repeat this process so that the loss keeps getting smaller and the model keeps getting better. Usually a model has thousands of weights or more, which lets it fit complicated relationships between input and output.
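To make the update rule concrete, here is a small sketch of gradient descent fitting the y = mx + b model from earlier with a mean squared error loss. The data, learning rate and step count are assumptions chosen for illustration; for a model this small the gradients can be written out by hand, which is what backpropagation automates in a deep network.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = m*x + b by gradient descent on the mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to each weight.
        grad_m = sum(2 * (m * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (m * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to lower the loss.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


# Toy labelled data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, b = fit_line(xs, ys)
print(round(m, 3), round(b, 3))  # 2.0 1.0
```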

[Figure: Gradient descent visualized]

Convolutional Neural Networks

Convolutional Neural Networks employ all the principles of machine learning mentioned above, yet add a method for analyzing spatial data such as images and audio. They use filters: small grids of weights that scan across the input data. At each position, the filter's weights are multiplied by the input values underneath them and the products are summed into a single value. Since images are also just grids of values, convolutional filters can easily slide over them. After a convolution, when the filter has finished scanning the image, the results of the scan (a feature map) are pooled together. Pooling represents the feature map in a smaller, more concise form that keeps the important features. Essentially, convolutional neural networks pick out the distinctive features of an image and put them all together.
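The scanning step can be sketched in plain Python. This is only an illustration, not the model's actual code: the tiny image and the hand-written filter are made up, and in a real network the filter weights are learned rather than chosen.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; at each position sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# A 4x4 "image" that is dark on the left and bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A hand-written filter that responds to a dark-to-bright vertical edge.
kernel = [
    [-1, 1],
    [-1, 1],
]
print(convolve2d(image, kernel))  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map is largest exactly where the edge sits, which is the sense in which a filter "detects" a feature.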

For a more comprehensible explanation, check out this article I wrote explaining convolutional neural networks in more detail.

Metastatic Cancer Detection Model

The Data

The data used to create the model was the labelled Histopathologic Cancer Detection dataset on Kaggle. It comes with over 150,000 histopathologic image scans, each labelled with either a 1 or a 0. 0 indicates that no metastatic cancer is present in the image and 1 indicates that metastatic cancer is present. Each image is 96 pixels by 96 pixels with 3 colour channels (red, green, blue), so we can represent each image as a 96x96x3 grid of values.

[Figure: Examples of the data]

The Model

The model was designed around the principles of machine learning described above and tailored to the task at hand.

Convolutional layers were used for feature extraction, to learn in general terms what a cancerous image looks like and what a non-cancerous image looks like. They are 2D convolutional layers: the filters slide along the two spatial dimensions of the image, while the input itself is a 3D array of values (height, width and colour channels).

The max pool layers take the feature maps created by the convolutional layers and condense them. They do this by keeping only the maximum value in every 3x3 square of the feature map. This way, when the feature maps are passed to the later parts of the neural network, it has an easier time figuring out whether an image contains cancer or not.
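A small sketch of 3x3 max pooling follows. It assumes the windows do not overlap (the window moves 3 cells at a time), which the article does not state, and the 6x6 feature map is invented for the example.

```python
def max_pool2d(feature_map, size=3):
    """Keep the largest value in each non-overlapping size x size window.

    Assumption: rows or columns that do not fill a whole window are dropped.
    """
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 2, 0, 4, 0, 1],
    [0, 5, 1, 0, 2, 3],
    [3, 1, 2, 1, 0, 0],
    [0, 0, 7, 2, 2, 1],
    [1, 4, 0, 0, 9, 3],
    [2, 1, 1, 5, 0, 6],
]
print(max_pool2d(feature_map))  # [[5, 4], [7, 9]]
```

The 6x6 map shrinks to 2x2 while the strongest response in each region survives.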

The dropout layers scattered throughout the network make sure that the model learns to perform the task and doesn't just memorize the data. They do this by randomly turning off a fraction of the units during each training step, so the network cannot rely on any single one. This can cost a little accuracy on the training data, but it helps the model generalize to a wider array of images.
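Here is a rough sketch of what a dropout layer does to a list of activations. It uses the common "inverted dropout" formulation, in which the surviving values are scaled up so their expected sum stays the same; the article does not give the dropout rate, so the 0.5 below is an assumption.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Randomly zero a fraction `rate` of the activations during training."""
    if not training or rate == 0.0:
        # At prediction time every unit stays switched on.
        return list(activations)
    keep = 1.0 - rate
    # Survivors are scaled by 1/keep so the expected total is unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # seeded only to make the sketch repeatable
activations = [1.0, 2.0, 3.0, 4.0]
print(dropout(activations, rate=0.5, rng=rng))
print(dropout(activations, rate=0.5, rng=rng, training=False))
```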

Lastly, the dense layers make sense of the feature maps and actually decide whether there is cancer or not. The last layer contains a single output node, whose value is read as 0 or 1: 0 meaning no cancer and 1 meaning cancer. The loss function uses this output to calculate the loss, and backpropagation optimizes the weights in these layers along with the rest of the network.

The model contains over 51 million parameters, which gives it plenty of capacity to learn the patterns in the data.
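A single-node output layer can be sketched as a weighted sum followed by a squashing function. The sigmoid activation and the 0.5 threshold are assumptions, since the article does not say which activation the final layer used, and the feature and weight values are made up.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_output(features, weights, bias):
    """One dense node: weighted sum of the features, then a sigmoid."""
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return sigmoid(z)


def predict_label(probability, threshold=0.5):
    """Read the node's output as 1 (cancer) or 0 (no cancer)."""
    return 1 if probability >= threshold else 0


# Hypothetical extracted features and learned weights.
probability = dense_output([1.0, 2.0], [0.5, 0.75], bias=-1.0)
print(predict_label(probability))  # weighted sum is 1.0, so the label is 1
```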

Loss Function + Optimization

The loss function used for this model was the categorical cross-entropy loss. This loss takes the log of the model's predicted probability and multiplies it by the real label, so only the probability assigned to the correct class counts. It uses the log function because the penalty grows very steeply when the model is confidently wrong and shrinks toward 0 as the prediction approaches the real answer.
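The effect of the log can be seen in a short sketch. It uses the two-class (binary) form of cross-entropy, which is an assumption made to match the single 0/1 label, and it clips the prediction by a small hypothetical `eps` so the log never receives 0.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one example with a 0/1 label and a predicted probability."""
    p = min(max(y_pred, eps), 1.0 - eps)  # keep log() away from 0
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


# The true label is 1 (cancer); compare three predictions.
for predicted in (0.9, 0.6, 0.1):
    print(predicted, round(binary_cross_entropy(1, predicted), 3))
# 0.9 0.105
# 0.6 0.511
# 0.1 2.303
```

A confident wrong answer (0.1) costs about twenty times more than a confident right one (0.9).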

The optimizer used for this model was the Adam optimizer. Adam is a variant of gradient descent that keeps running averages of recent gradients and of their squares, and uses them to adapt the step size for each weight, which helps it avoid overstepping a minimum of the loss function. It's a pretty standard choice of optimizer.
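For a rough idea of what Adam does, here is a sketch of its update rule applied to a single weight. The hyperparameter defaults follow the commonly used values, while the toy loss (w - 3)^2, the learning rate and the step count are assumptions for this example rather than the settings used for the cancer model.

```python
import math


def adam_minimize(grad_fn, w, learning_rate=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a one-weight loss with the Adam update rule."""
    m = 0.0  # running average of the gradient
    v = 0.0  # running average of the squared gradient
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Correct the bias caused by starting both averages at 0.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w -= learning_rate * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy loss (w - 3)^2, whose gradient is 2 * (w - 3); the minimum is at w = 3.
w = adam_minimize(lambda w: 2 * (w - 3.0), w=0.0)
print(round(w, 2))
```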

Training

The model trained for 10 epochs, meaning it made 10 passes over the whole dataset, performing backpropagation on batches of 100 images at a time. 20% of the images were held out for validation, to check that the model was actually learning. After a couple of hours of training, the model was finished.
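The arithmetic behind that schedule can be sketched as follows. The dataset size of 150,000 is only an illustrative round number (the article says "over 150,000"), so the resulting counts are approximate.

```python
def training_schedule(num_images, validation_percent=20, batch_size=100, epochs=10):
    """Work out how many weight updates a training run performs."""
    num_validation = num_images * validation_percent // 100
    num_training = num_images - num_validation
    # Round up so a final, smaller batch still counts as one update.
    batches_per_epoch = (num_training + batch_size - 1) // batch_size
    return {
        "training_images": num_training,
        "validation_images": num_validation,
        "batches_per_epoch": batches_per_epoch,
        "total_updates": batches_per_epoch * epochs,
    }


print(training_schedule(150_000))
# {'training_images': 120000, 'validation_images': 30000,
#  'batches_per_epoch': 1200, 'total_updates': 12000}
```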

The Results are in…

The model reached an 80% accuracy rate at detecting cancer, which means about 80% of its predictions are right. Here are a few examples:

[Figure: Examples of the model's predictions]
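Accuracy is simply the fraction of predictions that match the labels. The ten labels and predictions below are invented to show how a figure like 80% is computed; they are not the model's actual outputs.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that equal the true labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Made-up example: 1 = metastatic cancer present, 0 = not present.
labels      = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
print(accuracy(predictions, labels))  # 8 of 10 correct = 0.8
```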

Overall, the model works very well and is a viable option when detecting cancer.

References:

https://mc.ai/gradient-descent-and-its-types/ (Gradient Descent Image)

