
2.1.13.1.KNN Theory

1.K Nearest Neighbor

Find the K neighbors closest to a new data point; the new point is assigned whichever class most of those neighbors belong to. (The choice of K affects which class a new point is assigned to.)

Training algorithm
- Store all the data
Prediction algorithm
1. Calculate the distance from x to all points in your data
2. Sort the points in your data by increasing distance from x
3. Predict the majority label of the K closest points
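
The three prediction steps can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function name `knn_predict`, the toy data, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, x, k):
    """Predict the label of x by majority vote among its k nearest points."""
    # Step 1: distance from x to every stored point (Euclidean is assumed here).
    distances = [math.dist(p, x) for p in train_points]
    # Step 2: order the point indices by increasing distance from x.
    order = sorted(range(len(train_points)), key=lambda i: distances[i])
    # Step 3: majority label among the k closest points.
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# "Training" is just keeping the data around (hypothetical toy data).
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 5.5)]
labels = ["A", "A", "A", "B", "B", "B"]

prediction = knn_predict(points, labels, (2.0, 2.0), k=3)
print(prediction)
```

The query point `(2.0, 2.0)` sits next to the three "A" points, so all three of its nearest neighbors vote "A".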

2.Pros and cons

Pros
1. Very simple
2. Training is trivial
3. Works with any number of classes
4. Easy to add more data
5. Few parameters
  - K
  - Distance metric
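
Both parameters can change the predicted class. The sketch below uses made-up points and hypothetical helper names (`euclidean`, `manhattan`, `knn_predict`) to show one case where changing K flips the prediction and one case where changing the distance metric does.

```python
from collections import Counter


def euclidean(a, b):
    # Straight-line distance.
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def knn_predict(points, labels, x, k, metric=euclidean):
    # Rank the stored points by distance to x, then take a majority vote.
    ranked = sorted(zip(points, labels), key=lambda pl: metric(pl[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Effect of K: the single nearest point is red, but two of the three nearest are blue.
points = [(1.0, 1.0), (3.0, 3.0), (3.5, 2.5), (3.0, 2.0)]
labels = ["red", "blue", "blue", "blue"]
x = (1.8, 1.8)
pred_k1 = knn_predict(points, labels, x, k=1)
pred_k3 = knn_predict(points, labels, x, k=3)

# Effect of the metric: (3, 3) is nearer to the origin in Euclidean distance,
# while (0, 5) is nearer in Manhattan distance.
points2 = [(3.0, 3.0), (0.0, 5.0)]
labels2 = ["red", "blue"]
origin = (0.0, 0.0)
pred_euclid = knn_predict(points2, labels2, origin, k=1, metric=euclidean)
pred_manhat = knn_predict(points2, labels2, origin, k=1, metric=manhattan)

print(pred_k1, pred_k3, pred_euclid, pred_manhat)
```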
Cons
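1. High prediction cost: every prediction computes the distance to all stored points, so the cost grows with the size of the data set
2. Not good with high dimensional data, where distances between points become less informative
3. Categorical features do not work well, since common distance metrics assume numeric features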
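
One way to see the problem with high dimensional data is to compare how much farther the farthest point is than the nearest one. The sketch below is an illustrative experiment on uniformly random points, not a proof; the function name `relative_contrast` and the sample sizes are assumptions chosen for the example.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (farthest - nearest) / nearest distance from a random query point."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    query = [rng.random() for _ in range(n_dims)]
    dists = [
        math.dist(query, [rng.random() for _ in range(n_dims)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


low_dim = relative_contrast(200, 2)     # 2 features
high_dim = relative_contrast(200, 500)  # 500 features

print(f"contrast in   2 dimensions: {low_dim:.2f}")
print(f"contrast in 500 dimensions: {high_dim:.2f}")
```

In the low dimensional case the nearest neighbor is much closer than the farthest point. In the high dimensional case all points end up at nearly the same distance, so "nearest" carries far less information.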


Last updated 5 years ago
