(翻译)An introduction to machine learning with scikit-learn

原文

An introduction to machine learning with scikit-learn

Machine learning: the problem setting

一般情况下,一个学习问题可以被看作是根据 n 组样本数据,然后试图预测未知数据特性的问题。如果每组样品数据是多于一个单独数字,那么就可以看成一个多维记录(aka multivariate data),也就意味着拥有多个属性或特征。

我们可以将学习问题分为一个几个大类:

Training set and testing set

Machine learning is about learning some properties of a data set and applying them to new data. This is why a common practice in machine learning to evaluate an algorithm is to split the data at hand into two sets, one that we call the training set on which we learn data properties and one that we call the testing set on which we test these properties.

lastest-update (2017-06-07)

-->