DS from S: Chapter 12

k-Nearest Neighbors

Aug 05, 2024

Thought before starting

I think I’ve done something like this before, but maybe I’m thinking of k-means. It feels like it should be straightforward: pick a distance metric, then find the elements that are closest under it. That should scale as… what, N squared? That’s how long it takes to compute the distance between every pair of points. Maybe the chapter will show a faster way to do this?
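To make the N-squared guess concrete, here is a minimal sketch of the brute-force approach. It is my own illustration rather than the book's code, and the names `euclidean` and `all_pairwise_distances` are hypothetical. It assumes Euclidean distance as the metric.

```python
import math
from itertools import combinations
from typing import Dict, Sequence, Tuple


def euclidean(p: Sequence[float], q: Sequence[float]) -> float:
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def all_pairwise_distances(
    points: Sequence[Sequence[float]],
) -> Dict[Tuple[int, int], float]:
    """Distance for every unordered pair of indices (i, j) with i < j."""
    return {
        (i, j): euclidean(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }


# Made-up points, chosen only so the distances come out round.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
distances = all_pairwise_distances(points)

# N points give N * (N - 1) / 2 unordered pairs, which grows like N squared.
pair_count = len(distances)
```

Classifying a single new point only needs its distance to each of the N labeled points. The N-squared cost appears when every point is compared against every other, for example when each point is checked against all the rest.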

Thoughts while reading

What I learned

How to use k-nearest neighbors on (at least one) dataset.
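As a reminder of what "using k-nearest neighbors" amounts to, here is a minimal sketch of a majority-vote classifier. It is not the chapter's implementation. The function name `knn_classify`, the toy points, and the labels are all hypothetical, and ties are left to whatever `Counter.most_common` returns first.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

LabeledPoint = Tuple[Sequence[float], str]


def euclidean(p: Sequence[float], q: Sequence[float]) -> float:
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def knn_classify(k: int, labeled_points: List[LabeledPoint],
                 new_point: Sequence[float]) -> str:
    """Label new_point by majority vote among its k nearest labeled points."""
    # Sort all labeled points from nearest to farthest.
    by_distance = sorted(labeled_points,
                         key=lambda lp: euclidean(lp[0], new_point))
    # Keep only the labels of the k closest points.
    k_nearest_labels = [label for _, label in by_distance[:k]]
    # Return the most common label among them.
    return Counter(k_nearest_labels).most_common(1)[0][0]


# Made-up training data: two loose clusters with placeholder labels.
training = [
    ((1.0, 1.0), "red"),
    ((1.2, 0.8), "red"),
    ((0.9, 1.1), "red"),
    ((5.0, 5.0), "blue"),
    ((5.2, 4.9), "blue"),
]

near_red = knn_classify(3, training, (1.1, 1.0))
near_blue = knn_classify(3, training, (5.1, 5.0))
```

An odd `k` with two classes avoids ties in the vote, which is why the sketch uses `k = 3`.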

What I liked

Iris dataset
Practice with the tqdm library
I don’t think I’ve used k-nearest-neighbors before, so it was great to see an example in action
The chapter didn’t show anything more efficient for the nearest-neighbors distance calculation than what I expected, which was gratifying.

What I disliked

Other thoughts

The author wrote the data file as `iris.dat` but read it back as `iris.data`.
The author didn’t explicitly write the commands to create the plots shown (this is the first time I didn’t see them).
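On the file name mismatch: reading a file under a different name than the one it was written with would normally fail with `FileNotFoundError`, unless a file with the other name already happens to exist. One way to avoid that, sketched below, is to keep the name in a single constant that both the writer and the reader use. The helpers `write_rows` and `read_rows` are hypothetical and the rows are placeholders, not real Iris measurements.

```python
import csv
import tempfile
from pathlib import Path
from typing import List

# Single source of truth for the file name, shared by writer and reader.
DATA_FILENAME = "iris.data"


def write_rows(directory: Path, rows: List[List[str]]) -> Path:
    """Write rows as CSV to DATA_FILENAME inside the given directory."""
    path = directory / DATA_FILENAME
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path


def read_rows(directory: Path) -> List[List[str]]:
    """Read the CSV back from the same name, skipping blank lines."""
    with (directory / DATA_FILENAME).open(newline="") as f:
        return [row for row in csv.reader(f) if row]


# Placeholder rows in a "four numbers plus a class label" layout.
rows = [
    ["1.0", "2.0", "3.0", "4.0", "class-a"],
    ["5.0", "6.0", "7.0", "8.0", "class-b"],
]

# Use a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    write_rows(Path(tmp), rows)
    loaded = read_rows(Path(tmp))
```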

Vocabulary terms

N/A

William’s Substack
