2.1.13.1.KNN Theory
Last updated
Last updated
找和新數據最近的K個鄰居, 這些鄰居是什麼分類, 那麼新數據就是什麼樣的分類 (Choosing a K will affect what class a new point is assigned to)
Training algorithm
Store all the data
Prediction algorithm
Calculate the distance from x to all points in your data
Sort the points in your data by increasing distance from x
Predict the majority label of the "k" closet points
Pros
Very simple
Training is trivial
Works with any number of classes
Easy to add more data
Few parameters
K
Distance metric
Cons
High prediction cost (worse for large data sets)
Not good with high dimensional data
Categorical features do not work well