DS from S: Chapter 12
k-Nearest Neighbors
Thought before starting
I think I’ve done something like this before, but maybe I’m thinking of k-means. It feels like it should be straightforward: pick a distance metric, then find the elements that are nearest under it. That should scale as… what, N squared? That’s roughly how many distances there are between all pairs of N points. Maybe the chapter will show a faster way to do this?
Thoughts while reading
What I learned
How to use k-nearest neighbors on (at least one) dataset.
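For my own reference, here is a minimal sketch of the idea: sort the labeled points by distance to the new point, take the k closest, and let them vote. This is not the book's code. The names `knn_classify` and `euclidean`, the tie-breaking rule (drop the farthest neighbor and vote again), and the toy points are all my own assumptions for illustration.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical alias: a point is (features, label).
LabeledPoint = Tuple[Sequence[float], str]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(k: int, labeled_points: List[LabeledPoint],
                 new_point: Sequence[float]) -> str:
    """Label new_point by majority vote among its k nearest neighbors."""
    if k < 1 or not labeled_points:
        raise ValueError("need k >= 1 and at least one labeled point")

    # Brute force: compute the distance to every labeled point, then sort.
    by_distance = sorted(labeled_points,
                         key=lambda lp: euclidean(lp[0], new_point))
    labels = [label for _, label in by_distance[:k]]

    # Assumed tie-break: if the top two labels tie, drop the farthest
    # neighbor and vote again. One remaining label always wins.
    while True:
        ranked = Counter(labels).most_common()
        if len(ranked) == 1 or ranked[0][1] > ranked[1][1]:
            return ranked[0][0]
        labels = labels[:-1]


# Made-up toy data: two well-separated clusters.
toy_points: List[LabeledPoint] = [
    ((0.0, 0.0), "a"), ((0.1, 0.3), "a"), ((0.4, 0.2), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.3), "b"),
]

print(knn_classify(3, toy_points, (0.2, 0.1)))
print(knn_classify(3, toy_points, (5.1, 5.0)))
```

The same function should work on the Iris measurements once each row is turned into a (features, label) pair.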
What I liked
Iris dataset
Practice with tqdm library
I don’t think I’ve used k-nearest-neighbors before, so it was great to see an example in action
The chapter didn’t show anything more efficient for the nearest-neighbors distance calculation than the brute-force approach I expected, which was gratifying.
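To make the brute-force cost concrete, here is a small sketch that computes the distance for every unordered pair of points. With N points there are N(N-1)/2 such pairs, which is where the N-squared scaling comes from. The function name `pairwise_distances` and the sample points are my own, not from the book.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pairwise_distances(
    points: List[Sequence[float]],
) -> Dict[Tuple[int, int], float]:
    """Return {(i, j): distance} for every unordered pair with i < j."""
    return {
        (i, j): math.dist(points[i], points[j])  # math.dist needs Python 3.8+
        for i, j in combinations(range(len(points)), 2)
    }


# Made-up sample points for illustration.
sample = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
distances = pairwise_distances(sample)

n = len(sample)
print(len(distances), n * (n - 1) // 2)  # both counts should match
```

Doubling the number of points roughly quadruples the number of distances, which is fine for a dataset the size of Iris.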
What I disliked
N/A
Other thoughts
The author wrote the data file as `iris.dat` but read it back in as `iris.data`, so the two names don’t match
The author didn’t explicitly write out the commands to create the plots shown (this is the first chapter where I didn’t see them).
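One way to sidestep the `iris.dat` / `iris.data` mismatch is to keep the filename in a single constant and use it for both the write and the read. This is only a sketch under my own assumptions: the helper names `write_data` and `read_data` are hypothetical, it writes to a temporary directory instead of downloading anything, and the two rows are stand-ins in the comma-separated layout of the Iris file.

```python
import csv
import os
import tempfile
from typing import List

# Assumption: one name, defined once, used for both writing and reading.
DATA_FILENAME = "iris.data"

# Two stand-in rows in the comma-separated layout of the Iris file.
SAMPLE_TEXT = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
)


def write_data(directory: str, text: str) -> str:
    """Write text to DATA_FILENAME inside directory and return the path."""
    path = os.path.join(directory, DATA_FILENAME)
    with open(path, "w") as f:
        f.write(text)
    return path


def read_data(directory: str) -> List[List[str]]:
    """Read DATA_FILENAME back as CSV rows, skipping any blank rows."""
    path = os.path.join(directory, DATA_FILENAME)
    with open(path, newline="") as f:
        return [row for row in csv.reader(f) if row]


with tempfile.TemporaryDirectory() as tmp:
    write_data(tmp, SAMPLE_TEXT)
    rows = read_data(tmp)

print(len(rows), rows[0][-1])
```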
Vocabulary terms
N/A

