In this post, we will discuss some theory that provides the framework for developing machine learning models.
Let’s get started!
If we consider a real valued random input vector, X , and a real valued random output vector, Y , the goal is to find a function f ( X ) for predicting the value of Y. This requires a loss function, L ( Y , f ( X )). This function allows us to penalize errors in predictions. One example of a commonly used loss function is the square error losss:
The loss function is the squared difference between true outcome values and our predictions. If f ( X ) = Y , which means our predictions equal true outcome values, our loss function is equal to zero. So we’d like to find a way to choose a function f ( X ) that gives us values as close to Y as possible.
Given our loss function, we have a critereon for selecting f ( X ). We can calculate the expected squared prediction error by integrating the loss function over x and y :
Where P( X , Y ) is the joint probability distribution in input and output. We can then condition on X and calculate the expected squared prediction error as follows:
We can then minimize this expect squared prediction error point wise, by finding the values, c , which minimize the error given X :
The solution to this is:
Which is the conditional expectation of Y , given X = x. Put another way, the regression function gives the conditional mean of Y, given our knowledge of X. Interestingly, the k -nearest neighbors method is a direct attempt at implementing this method from training data. With nearest neighbors, for each x , we can ask for the average of the y ’s where the input, x , equals a specific value. Our estimator for Y can then be written as:
Where we are taking the average over sample data and using the result to estimate the expected value. We are also conditioning on a region with k neighbors closest to the target point. As the sample size gets larger, the points in the neighborhood are likely to be close to x . Additionally, as the number of neighbors, k , gets larger the mean becomes more stable.
If you’re interested in learning more, Elements of Statistical Learning , by Trevor Hastie, is a great resource. Thank you for reading!
以上就是本文的全部内容,希望本文的内容对大家的学习或者工作能带来一定的帮助,也希望大家多多支持 码农网
本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们。
JSP应用开发技术
柳永坡 / 人民邮电出版社 / 2005-9 / 52.00元
本书全面系统地介绍了JSP应用开发技术,包括JSP预备知识和环境配置、JSP编程基础、JSP应用开发进阶、在JSP中使用数据库、Servlet技术、标签库和表达式语言、Web编程模式和应用框架等几个方面的内容。本书不但由浅入深地介绍了JSP程序设计的原理、方法和技术,还提供了大量的JSP应用开发实例,给出了相应的实用技巧、操作步骤及优化思路。 本书着重于JSP技术的应用性和可操作性,......一起来看看 《JSP应用开发技术》 这本书的介绍吧!
JS 压缩/解压工具
在线压缩/解压 JS 代码
XML 在线格式化
在线 XML 格式化压缩工具