Statistical Decision Theory

栏目: IT技术 · 发布时间: 6年前

In this post, we will discuss some theory that provides the framework for developing machine learning models.

Let’s get started!

If we consider a real valued random input vector, X , and a real valued random output vector, Y , the goal is to find a function f ( X ) for predicting the value of Y. This requires a loss function, L ( Y , f ( X )). This function allows us to penalize errors in predictions. One example of a commonly used loss function is the square error losss:

The loss function is the squared difference between true outcome values and our predictions. If f ( X ) = Y , which means our predictions equal true outcome values, our loss function is equal to zero. So we’d like to find a way to choose a function f ( X ) that gives us values as close to Y as possible.

Given our loss function, we have a critereon for selecting f ( X ). We can calculate the expected squared prediction error by integrating the loss function over x and y :

Where P( X , Y ) is the joint probability distribution in input and output. We can then condition on X and calculate the expected squared prediction error as follows:

We can then minimize this expect squared prediction error point wise, by finding the values, c , which minimize the error given X :

The solution to this is:

Which is the conditional expectation of Y , given X = x. Put another way, the regression function gives the conditional mean of Y, given our knowledge of X. Interestingly, the k -nearest neighbors method is a direct attempt at implementing this method from training data. With nearest neighbors, for each x , we can ask for the average of the y ’s where the input, x , equals a specific value. Our estimator for Y can then be written as:

Where we are taking the average over sample data and using the result to estimate the expected value. We are also conditioning on a region with k neighbors closest to the target point. As the sample size gets larger, the points in the neighborhood are likely to be close to x . Additionally, as the number of neighbors, k , gets larger the mean becomes more stable.

If you’re interested in learning more, Elements of Statistical Learning , by Trevor Hastie, is a great resource. Thank you for reading!

以上就是本文的全部内容，希望本文的内容对大家的学习或者工作能带来一定的帮助，也希望大家多多支持码农网

查看所有标签

本站部分资源来源于网络，本站转载出于传递更多信息之目的，版权归原作者或者来源机构所有，如转载稿涉及版权问题，请联系我们。

码农书籍

JSP应用开发技术

柳永坡 / 人民邮电出版社 / 2005-9 / 52.00元

本书全面系统地介绍了JSP应用开发技术，包括JSP预备知识和环境配置、JSP编程基础、JSP应用开发进阶、在JSP中使用数据库、Servlet技术、标签库和表达式语言、Web编程模式和应用框架等几个方面的内容。本书不但由浅入深地介绍了JSP程序设计的原理、方法和技术，还提供了大量的JSP应用开发实例，给出了相应的实用技巧、操作步骤及优化思路。本书着重于JSP技术的应用性和可操作性，......一起来看看《JSP应用开发技术》这本书的介绍吧!

码农工具

JS 压缩/解压工具

在线压缩/解压 JS 代码

XML 在线格式化

在线 XML 格式化压缩工具