Detecting COVID-19 symptoms through Artificial Intelligence
Note from the editors: Towards Data Science is a Medium publication primarily based on the study of data science and machine learning. We are not health professionals or epidemiologists, and the opinions of this article should not be interpreted as professional advice. To learn more about the coronavirus pandemic, you can click here.
The healthcare industry is in the midst of a huge paradigm shift in which technology is the driving force. Although AI has made tremendous progress, it is not yet sophisticated enough to make the final diagnosis; it is simply meant to assist the medical professionals who do. According to the World Health Organization (WHO), the most common diagnosis for severe COVID-19 is severe pneumonia. This was the key driver behind this project. To help with the research on COVID-19, I applied the concepts learnt in my first Artificial Intelligence course at Columbia University to a data set, classifying lung X-ray scans as either normal or pneumonia-infected.
It is not possible for a layman to distinguish between the lungs of a healthy person and those of a pneumonia-infected person, as depicted above.
Let’s get into our doctor shoes and dive deeper into the data set!
The data file contained the following folders and files:
· Train: This folder had 5309 images, each named with an image id, for example IM-0131-0001.jpeg
· Test: This folder had 624 images stored in a similar manner
· Chest_xray_Corona_Metadata.csv: This had the mapping from the image id to the label, i.e. from IM-0131-0001.jpeg to Normal. Some images had no mapping and were discarded
· Chest_xray_Corona_dataset_Summary.csv: This CSV file had a breakdown of the number of images that belong to each category
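The mapping-and-discard step can be sketched with the standard library alone. This is only an illustration: the column names `image_name` and `label`, the in-memory sample rows, and the `example-…` file names are placeholders, not the real header or contents of Chest_xray_Corona_Metadata.csv.

```python
import csv
import io


def load_label_map(csv_file, name_col="image_name", label_col="label"):
    """Return {image file name: label} from a metadata CSV.

    The column names are placeholders; adjust them to the real header.
    """
    reader = csv.DictReader(csv_file)
    return {row[name_col]: row[label_col] for row in reader if row[label_col]}


def split_mapped(image_names, label_map):
    """Separate images that have a label from those that must be discarded."""
    kept = [(name, label_map[name]) for name in image_names if name in label_map]
    discarded = [name for name in image_names if name not in label_map]
    return kept, discarded


# Tiny in-memory stand-in for the metadata file (made-up rows).
sample_csv = io.StringIO(
    "image_name,label\n"
    "IM-0131-0001.jpeg,Normal\n"
    "example-0002.jpeg,Pneumonia\n"
)

label_map = load_label_map(sample_csv)
kept, discarded = split_mapped(
    ["IM-0131-0001.jpeg", "example-0002.jpeg", "example-0003.jpeg"], label_map
)
print(kept)       # images that can be used for training
print(discarded)  # images without a mapping
```

With a real file, `sample_csv` would be replaced by an open file handle, and `image_names` by a listing of the Train folder.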
The images were processed using Python and stored locally. They were made uniform by changing the dimensions (300 x 300) and format (PNG).
The label was used to rename the images in Python so that classification is easier. The dimensions were chosen to balance the trade-off between loss of information (too small) and higher computational time while training the model (too large).
The processed images were pushed to Google Drive, as the model had to be trained on Colab (GPU support and 25 GB RAM). Normal images were labelled 1 and pneumonia images 0. The training images were converted into arrays of pixel values from 0 to 255 and stored along with their labels in a list to be passed to the neural network. 20% of the training data was held out for validation (hyperparameter tuning).
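A rough sketch of the label encoding and the 80/20 hold-out, using NumPy. The random stand-in images, the shuffling before the split, and the fixed seed are assumptions made for illustration; the article does not say how the validation samples were chosen.

```python
import numpy as np

LABELS = {"normal": 1, "pneumonia": 0}  # encoding stated in the article


def train_val_split(images, labels, val_fraction=0.2, seed=0):
    """Shuffle once, then hold out `val_fraction` of the samples for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = int(round(len(images) * val_fraction))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return images[train_idx], labels[train_idx], images[val_idx], labels[val_idx]


# Synthetic stand-in: 10 random grayscale "images", 300 x 300 pixels, values 0..255.
rng = np.random.default_rng(42)
images = rng.integers(0, 256, size=(10, 300, 300, 1), dtype=np.uint8)
labels = np.array([LABELS["normal"], LABELS["pneumonia"]] * 5)

x_train, y_train, x_val, y_val = train_val_split(images, labels)
print(x_train.shape, x_val.shape)
```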
The network architecture followed is a popular, simple and effective one called VGGNet.
The architecture used here was a smaller version of the figure shown, but the interesting thing is that the number of filters increases as we go deeper into the network. The following filters were used:
32 → 64 → 128 → 512 → 1
The hidden layers were all activated by the ReLU (Rectified Linear Unit) activation function. The 2-D convolution layers used a kernel of size [3,3], which specifies the height and width of the 2-D convolution window. The filter count is the size of the output dimension (the number of output filters in the convolution). The input shape was [300, 300, 1]: the first two dimensions are the size of the image and the last is the number of channels (1, as the images were in grayscale).
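To make the roles of kernel size, input channels and filter count concrete, here is a small sketch that counts the trainable parameters of a 2-D convolution layer. It assumes that 32, 64 and 128 are the filter counts of three consecutive 3 x 3 convolution layers with a bias term each; the article does not spell out the exact layer stack, so treat the numbers as illustrative.

```python
def conv2d_param_count(in_channels, filters, kernel=(3, 3), use_bias=True):
    """Trainable parameters of one 2-D convolution layer."""
    kernel_h, kernel_w = kernel
    weights = kernel_h * kernel_w * in_channels * filters
    biases = filters if use_bias else 0
    return weights + biases


in_channels = 1  # grayscale input of shape [300, 300, 1]
counts = []
for filters in (32, 64, 128):
    counts.append(conv2d_param_count(in_channels, filters))
    in_channels = filters  # the next layer sees one channel per filter

print(counts)  # [320, 18496, 73856]
```

Each filter spans every input channel, which is why the parameter count grows quickly as the filter count increases from layer to layer.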
Before compiling the network, data augmentation was done to increase the number of input images. Keras has a data generator for images, which is a standard tool for this. The generator helps in re-scaling the pixel values as well as generating new forms of the images. For example, zooming in and out or rotating by certain angles creates similar versions of the same image and increases the size of the data. The idea of horizontal flipping for X-ray images has been validated here. Using the generator, the pixel values were also scaled to be between 0 and 1.
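Two of these transformations, re-scaling and horizontal flipping, are easy to illustrate directly in NumPy. This is not the Keras generator itself, only a sketch of what such transformations do to a single image; the toy image is made up.

```python
import numpy as np


def rescale(image):
    """Map uint8 pixel values 0..255 to floats in 0..1."""
    return image.astype(np.float32) / 255.0


def horizontal_flip(image):
    """Mirror an image of shape (height, width, channels) left to right."""
    return image[:, ::-1, :]


# A 2 x 3 single-channel toy image so the effect is easy to read.
toy = np.array([[0, 128, 255],
                [10, 20, 30]], dtype=np.uint8).reshape(2, 3, 1)

augmented = rescale(horizontal_flip(toy))
print(augmented[..., 0])
```

In Keras, the `ImageDataGenerator` class exposes comparable options through parameters such as `rescale`, `horizontal_flip`, `zoom_range` and `rotation_range`; which of these the original code set is not stated here.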
The loss function was set to binary cross-entropy, as the output takes a binary value; the optimizer chosen was Adam with a learning rate of 0.0001, and accuracy was the chosen metric. The other specifications of the network are present in the code on GitHub.
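For readers who have not met binary cross-entropy before, a plain-Python sketch of the formula may help. The clipping constant `eps` and the example probabilities are assumptions made for illustration; a framework's own implementation may differ in such details.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Labels follow the article's encoding: 1 = normal, 0 = pneumonia.
confident = binary_cross_entropy([1, 0], [0.9, 0.1])
unsure = binary_cross_entropy([1, 0], [0.5, 0.5])
print(round(confident, 4), round(unsure, 4))  # 0.1054 0.6931
```

Confident correct predictions give a loss close to zero, while a prediction of 0.5 for every image gives ln 2, roughly 0.693.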
The number of epochs was set to 60. As one can see, the accuracy levels out after around 50 epochs, and we can argue that the weights have been optimized reasonably well. The number of epochs can be tuned further with methods like early stopping.
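Early stopping can be sketched as a simple patience rule. The version below monitors validation loss, which is an assumption (validation accuracy could be monitored instead), and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs. If that never
    happens, the last epoch is returned.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


# Made-up validation losses: improvement stalls after the fourth epoch.
losses = [0.60, 0.45, 0.38, 0.35, 0.36, 0.35, 0.37, 0.36, 0.36]
print(early_stopping_epoch(losses, patience=3))  # 7
```

Keras offers the same idea as the `EarlyStopping` callback, with `monitor`, `patience` and `min_delta` arguments.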
With these weights, the accuracy on the 624 test images came out to be 92.1%. Below is a snapshot of the results where the title of each image shows the actual as well as the predicted label (more results in the code).
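As a reminder of how such an accuracy figure is obtained from a network with a single output, here is a minimal sketch. The 0.5 decision threshold and the example values are assumptions, not numbers from the project.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    predictions = [1 if p >= threshold else 0 for p in y_prob]
    correct = sum(int(pred == y) for pred, y in zip(predictions, y_true))
    return correct / len(y_true)


# Made-up labels and model outputs (1 = normal, 0 = pneumonia).
y_true = [1, 0, 0, 1, 0]
y_prob = [0.91, 0.12, 0.63, 0.77, 0.05]
print(accuracy(y_true, y_prob))  # 0.8
```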
As for future scope, multi-class classification can be done, provided one has more labelled images for COVID-19 cases and other viruses. Neural networks are data-hungry models, and with very large datasets, deeper networks with much better results can be built. Also, it might be a good idea to monitor the out-of-sample performance using cross-validation (5-fold), as this would be more representative of the learning. The number of epochs could be fine-tuned to improve the accuracy on the test set as well.
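The 5-fold idea can be sketched as an index split in plain Python. The sample count of 5309 is the size of the Train folder quoted earlier (the usable number would be lower after discarding unmapped images), and the contiguous, unshuffled folds are a simplification; in practice the data would be shuffled first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, val_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


folds = list(k_fold_indices(5309, k=5))
print([len(val) for _, val in folds])  # [1062, 1062, 1062, 1062, 1061]
```

Each image lands in the validation set exactly once, so averaging the five validation scores gives an out-of-sample estimate that uses all of the training data.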
On a closing note, I would re-emphasize that AI algorithms should be taken as supplemental support for making important decisions, especially in domains such as healthcare (at least till we reach the terminator era)!