Prettifying Partial Dependence Plots in Python

Category: IT Technology · Published: 3 years ago

Give your non-technical stakeholders a glimpse into black-box models with more engaging partial dependence plots.

People don’t trust what they don’t understand. Artificial intelligence and machine learning algorithms are some of the most powerful technologies we have at our disposal, but they are also among the most misunderstood. Hence, one of the most critical responsibilities of a data scientist is to communicate complex information in easy-to-understand ways.

Black Box Models

Perhaps one of the biggest misconceptions about neural networks is the notion that we can’t see directly into the models that produce results. We can see our inputs and outputs, and we can measure the results, but we don’t truly understand the relationship between them. From a practical standpoint, this is problematic because, much like relationships between people, the relationship between inputs and outputs can change with time. What a vision AI perceives as a truck today may not reflect what trucks will look like tomorrow.

Tesla Cybertruck. Source: Mike Mareen / Shutterstock.com

Most changes, however, are not as jarring as Tesla’s Cybertruck. How do we know algorithms are keeping up with gradual changes to common assumptions if we can’t see inside them? We open the box. And one of the best tools we have at our disposal is the partial dependence plot (PDP).

Partial Dependence Plots

The creators of Scikit-Learn describe partial dependence plots this way:

Partial dependence plots (PDP) show the dependence between the target response and a set of ‘target’ features, marginalizing over the values of all other features (the ‘complement’ features).

In other words, a PDP lets us see how a change in a predictor variable affects the predicted value of the target variable. Below is a sample of PDPs that show the effect that different traits of a home have on the predicted price.

Source: Scikit-Learn

From these plots, we can see that as median income and the age of the home increase, predicted price tends to increase. However, as average occupancy in an area increases, predicted price decreases. The tick marks along the bottom of each plot mark the deciles of the feature values, which gives a sense of how the observations are distributed.
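
To make the "marginalizing" in the definition above more concrete, here is a minimal sketch of how a partial dependence curve can be computed by hand. It is not Scikit-Learn's implementation. The function name manual_partial_dependence and the toy model toy_predict are made up for illustration. The idea is simply to force one feature to a fixed value for every row, average the predictions, and repeat for each value on a grid.

```python
import numpy as np


def manual_partial_dependence(predict, X, feature_idx, grid):
    """Average prediction when one column is forced to each grid value."""
    X = np.asarray(X, dtype=float)
    averaged = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = value  # overwrite the target feature in every row
        averaged.append(predict(X_mod).mean())  # average over the other features
    return np.array(averaged)


# Toy stand-in for a fitted model (an assumption): prediction = 2 * x0 + x1
def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 1]


X_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
pd_values = manual_partial_dependence(toy_predict, X_demo, 0, [0.0, 1.0])
```

With this toy model, the curve for the first feature rises by 2 for every unit increase, which is exactly the slope built into toy_predict. A real model would be passed in as model.predict or a similar callable.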

These plots are incredibly easy to understand and easy to create. With a fitted model, dataset (X features only), and a list of input features, you can generate the above plots with a single line of code after importing the relevant libraries:

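import matplotlib.pyplot as plt
from sklearn.inspection import partial_dependence, plot_partial_dependence

# Note: plot_partial_dependence was removed in scikit-learn 1.2.
# Newer versions use PartialDependenceDisplay.from_estimator(model, X, features).
plot_partial_dependence(model, X, features)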

These plots are great for almost any type of regression model. However, I have found that non-technical stakeholders sometimes have difficulty interpreting the results when applying PDPs to classification tasks. What’s more, they are not particularly engaging to look at for a presentation. Let’s dress it up and add some functionality to this.

Prettified PDPs

For illustrative purposes, we’ll use the Titanic dataset. We’ll build a simple XGBoost classifier that attempts to identify survivors based on several input features. We are most interested in figuring out how our model uses age as a predictor of survivorship (no pun intended).

import pandas as pd
from xgboost import XGBClassifier

df = pd.read_csv('titanic.csv')
X = df[['Age', 'SibSp', 'Parch', 'Fare']]
y = df['Survived']
model = XGBClassifier()
model.fit(X, y)

# Pass an Axes object; the older fig= argument is no longer supported
fig, ax = plt.subplots(figsize=(10, 9))
plot_partial_dependence(model, X, ['Age'], ax=ax)
plt.show()

As we can see, our model has identified that older persons are less likely to survive, all other factors being equal. We can also see that most passengers were between 20 and 40 years old.

Wouldn’t it be great if we could get a clearer picture of the age distribution by plotting a histogram on the same chart? What about displaying partial dependence values as percentages? Wouldn’t it be nice if we could also visualize the decision boundary? We can do all of this by grabbing the partial dependence values with the partial_dependence function and plotting the results ourselves. Fortunately, I have already created a function that will do this for you.

from sklearn.inspection import partial_dependence

The above function will produce a PDP for a single input variable and allows for the input of a target name for axis labels and chart title. Furthermore, it provides options to display y-ticks as percentages, change the decision boundary, and return partial dependence values for further analysis. By sticking with the standard settings and passing a name for the target, we get the following:
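
The original listing of the function is not reproduced in this copy of the article, so what follows is only a rough sketch of what such a helper might look like, based on the description above. It is not the author's code. The parameter names as_percent, boundary, and return_values are assumptions, as are the synthetic data and the LogisticRegression model, which stand in for the Titanic data and XGBoost so that the example runs on its own. It assumes a binary classifier, a pandas DataFrame, and scikit-learn 0.24 or later.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
from sklearn.inspection import partial_dependence
from sklearn.linear_model import LogisticRegression


def plot_pdp(model, X, feature, target="Target", as_percent=True,
             boundary=0.5, return_values=False):
    """Hypothetical stand-in for the article's plot_pdp (binary classifiers only)."""
    # method="brute" averages predict_proba output, so values are probabilities
    result = partial_dependence(model, X, [feature], kind="average", method="brute")
    # The grid key was renamed from "values" to "grid_values" in newer scikit-learn
    grid = result["grid_values"] if "grid_values" in result else result["values"]
    grid, avg = np.asarray(grid[0]), np.asarray(result["average"][0])

    fig, ax = plt.subplots(figsize=(10, 6))
    # Histogram of the feature on a secondary y-axis
    ax_hist = ax.twinx()
    ax_hist.hist(X[feature].dropna(), bins=20, alpha=0.3, color="gray")
    ax_hist.set_ylabel("Number of observations")

    # Partial dependence curve plus a dashed line at the decision boundary
    ax.plot(grid, avg, linewidth=2)
    ax.axhline(boundary, linestyle="--", color="red", label="Decision boundary")
    if as_percent:
        ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0))
    ax.set_xlabel(feature)
    ax.set_ylabel(f"Predicted probability of {target}")
    ax.set_title(f"Partial dependence of {target} on {feature}")
    ax.legend()
    return (grid, avg) if return_values else ax


# Synthetic stand-in for the Titanic data (not the real dataset)
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({"Age": rng.uniform(1, 80, 300),
                       "Fare": rng.uniform(5, 100, 300)})
y_demo = (X_demo["Age"] + rng.normal(0, 10, 300) < 35).astype(int)
demo_model = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)

grid, avg = plot_pdp(demo_model, X_demo, "Age", target="Survival",
                     return_values=True)
```

Putting the histogram on a twin axis keeps the observation counts from distorting the probability scale, which is one reasonable way to combine the two views on a single chart.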

plot_pdp(model, X, 'Age', target='Survival')

With this, we get a much richer view of the age distribution. We can clearly see where age crosses the decision boundary. The axes are labeled in a way that makes the chart easier for non-technical stakeholders to read and understand. From here, you can play around with the options to see how they change the display of the chart or modify the code to your liking.
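
If the partial dependence values are returned for further analysis, the crossing point can also be reported as a number rather than read off the chart. The helper below is a small sketch of that idea using linear interpolation between grid points. The function name boundary_crossings is hypothetical, and the ages and probabilities are made-up values, not output from the Titanic model.

```python
import numpy as np


def boundary_crossings(grid, avg, boundary=0.5):
    """Return the x positions where the curve crosses the boundary."""
    grid = np.asarray(grid, dtype=float)
    diff = np.asarray(avg, dtype=float) - boundary
    crossings = []
    for i in range(len(grid) - 1):
        if diff[i] == 0:
            crossings.append(grid[i])
        elif diff[i] * diff[i + 1] < 0:
            # Fraction of the way from grid[i] to grid[i + 1] where diff hits zero
            t = diff[i] / (diff[i] - diff[i + 1])
            crossings.append(grid[i] + t * (grid[i + 1] - grid[i]))
    if len(diff) and diff[-1] == 0:
        crossings.append(grid[-1])
    return crossings


# Made-up values for illustration only
ages = [10, 20, 30, 40]
probs = [0.7, 0.6, 0.4, 0.3]
result = boundary_crossings(ages, probs)
```

For these made-up values the curve drops through 50% halfway between ages 20 and 30, so the helper reports a single crossing at about 25.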

If nothing else, I would encourage you to come up with new ways to share your work to create more engagement with non-technical audiences.
