
Quiz: K-Nearest Neighbors Algorithm

Test your understanding of the K-Nearest Neighbors algorithm with these questions.


1. What does the "k" in K-Nearest Neighbors represent?

  A. The number of features in the dataset
  B. The number of nearest training examples used for prediction
  C. The distance threshold for neighbors
  D. The number of classes in classification

The correct answer is B. The "k" in K-Nearest Neighbors represents the number of nearest training examples (neighbors) used to make a prediction. For example, in 5-NN, the algorithm finds the 5 closest training examples and uses their labels to predict the class (via majority vote) or value (via average).

Concept Tested: K-Nearest Neighbors
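
For a concrete picture of how a prediction is assembled from the k nearest neighbors, here is a minimal Python sketch of k-NN classification by majority vote; the training points and labels are made-up toy values, not data from the quiz.

```python
from collections import Counter
import math

# Toy training set: (feature_vector, label) pairs -- illustrative values only.
train = [([1.0, 2.0], "A"), ([1.5, 1.8], "A"), ([5.0, 8.0], "B"),
         ([6.0, 9.0], "B"), ([1.2, 0.5], "A")]

def knn_predict(query, train, k=3):
    """Classify `query` by majority vote among the k nearest training points."""
    # Sort training examples by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda ex: math.dist(query, ex[0]))[:k]
    # Vote over the labels of the k closest examples.
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]

print(knn_predict([1.1, 1.0], train, k=3))  # -> "A"
```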


2. What is the primary reason KNN is called a "lazy learning" algorithm?

  A. It takes a long time to make predictions
  B. It stores all training data and defers computation until prediction time
  C. It requires minimal memory
  D. It only works with small datasets

The correct answer is B. KNN is called "lazy learning" because it doesn't build an explicit model during training—it simply stores all training examples. All computation is deferred until prediction time, when it must calculate distances to all training points. This contrasts with "eager" learners that build models during training.

Concept Tested: Lazy Learning
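
To make the fit/predict asymmetry visible, the sketch below (toy data, illustrative class name) shows a lazy learner whose fit step only memorizes the data, with all distance computation deferred to prediction time.

```python
import numpy as np

class LazyKNN:
    """Minimal lazy learner: `fit` only stores the data; all work happens at prediction."""
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # "Training" is just memorizing the dataset -- no model is built here.
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y)
        return self

    def predict_one(self, x):
        # Deferred computation: distances to every stored example at query time.
        d = np.linalg.norm(self.X - np.asarray(x, dtype=float), axis=1)
        nearest = np.argsort(d)[:self.k]
        values, counts = np.unique(self.y[nearest], return_counts=True)
        return values[np.argmax(counts)]

clf = LazyKNN(k=3).fit([[0, 0], [0, 1], [5, 5], [6, 5]], ["a", "a", "b", "b"])
print(clf.predict_one([0.2, 0.4]))  # -> "a"
```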


3. Given a query point in 2D space at (3, 4) and a training point at (6, 8), what is the Euclidean distance between them?

  A. 3.0
  B. 4.0
  C. 5.0
  D. 7.0

The correct answer is C. The Euclidean distance is calculated as sqrt((6-3)² + (8-4)²) = sqrt(9 + 16) = sqrt(25) = 5.0. Euclidean distance is the straight-line distance between two points and is the most common distance metric used in KNN.

Concept Tested: Euclidean Distance
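
The arithmetic from this question can be checked directly in Python:

```python
import math

# Worked check of question 3: query (3, 4) and training point (6, 8).
query = (3, 4)
train_point = (6, 8)

dist = math.sqrt((train_point[0] - query[0]) ** 2 + (train_point[1] - query[1]) ** 2)
print(dist)                           # 5.0
print(math.dist(query, train_point))  # same result via the standard-library helper
```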


4. How does Manhattan distance differ from Euclidean distance?

  A. Manhattan distance is always larger than Euclidean distance
  B. Manhattan distance sums absolute differences while Euclidean distance uses squared differences
  C. Manhattan distance only works in 2D space
  D. Manhattan distance requires normalized features

The correct answer is B. Manhattan distance (L1) sums the absolute differences across all dimensions: |x₁-y₁| + |x₂-y₂| + ..., while Euclidean distance (L2) uses squared differences under a square root: sqrt((x₁-y₁)² + (x₂-y₂)² + ...). Manhattan distance represents the distance if you could only travel along grid lines, like navigating city blocks.

Concept Tested: Manhattan Distance
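
A short sketch comparing the two metrics on the same pair of points (the points from question 3 are reused for illustration):

```python
def manhattan(p, q):
    # L1: sum of absolute coordinate differences (grid / city-block distance).
    return sum(abs(a - b) for a, b in zip(p, q))

def euclidean(p, q):
    # L2: square root of the sum of squared coordinate differences.
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

p, q = (3, 4), (6, 8)
print(manhattan(p, q))  # 7   -> |6-3| + |8-4|
print(euclidean(p, q))  # 5.0
```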


5. What happens when k=1 in K-Nearest Neighbors classification?

  A. The algorithm predicts the most common class in the entire dataset
  B. The prediction is based solely on the single nearest training example
  C. The algorithm cannot make predictions
  D. All training examples contribute equally to the prediction

The correct answer is B. When k=1, the algorithm finds the single nearest training example and assigns its label to the query point. This makes the model highly sensitive to noise and outliers, as each prediction is based on just one neighbor, often leading to overfitting with complex, irregular decision boundaries.

Concept Tested: K Selection
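
The sketch below (toy, hand-picked points) illustrates how a single noisy neighbor can flip a k=1 prediction, while a larger k smooths it out:

```python
import math
from collections import Counter

def knn_label(query, train, k):
    nearest = sorted(train, key=lambda ex: math.dist(query, ex[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# A cluster of "A" points plus one mislabeled outlier sitting next to the query.
train = [([0.0, 0.0], "A"), ([0.2, 0.1], "A"), ([0.1, 0.3], "A"), ([0.9, 1.0], "B")]
query = [0.8, 0.9]

print(knn_label(query, train, k=1))  # "B" -- the single (noisy) nearest point decides
print(knn_label(query, train, k=3))  # "A" -- the vote smooths out the outlier
```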


6. Why does KNN performance typically degrade in high-dimensional spaces?

  A. Computers cannot process many dimensions
  B. The curse of dimensionality makes distances less meaningful as dimensionality increases
  C. KNN can only use up to 10 dimensions
  D. High dimensions require more neighbors

The correct answer is B. In high-dimensional spaces, the curse of dimensionality causes all points to become approximately equidistant from each other. Data becomes increasingly sparse, distances lose their discriminative power, and the nearest and farthest neighbors become nearly the same distance away, making KNN's distance-based predictions unreliable.

Concept Tested: Curse of Dimensionality
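
A quick simulation (random uniform data, illustrative dimensions) shows distances concentrating as dimensionality grows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Ratio of nearest to farthest distance from a query to random points:
# as dimensionality grows, this ratio drifts toward 1 (distances concentrate).
for dim in (2, 10, 100, 1000):
    points = rng.random((1000, dim))
    query = rng.random(dim)
    d = np.linalg.norm(points - query, axis=1)
    print(f"dim={dim:5d}  nearest/farthest distance ratio = {d.min() / d.max():.3f}")
```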


7. For a KNN regression problem with k=5 and neighbor values [10, 12, 11, 13, 14], what would be the predicted value?

  A. 10
  B. 12
  C. 13
  D. 14

The correct answer is B. For KNN regression, the predicted value is the average of the k nearest neighbors' values: (10 + 12 + 11 + 13 + 14) / 5 = 60 / 5 = 12. Unlike classification which uses majority voting, regression predicts the mean (or sometimes median) of neighbor values.

Concept Tested: KNN for Regression
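
The arithmetic can be verified in a couple of lines:

```python
import statistics

# Worked check of question 7: KNN regression averages the neighbours' target values.
neighbor_values = [10, 12, 11, 13, 14]
print(sum(neighbor_values) / len(neighbor_values))  # 12.0 -- mean of the k=5 neighbours
print(statistics.median(neighbor_values))           # 12   -- a more outlier-robust alternative
```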


8. What is the main computational bottleneck of KNN during prediction time?

  A. Storing the training data
  B. Computing distances to all training examples
  C. Sorting the class labels
  D. Calculating the majority vote

The correct answer is B. During prediction, KNN must compute the distance from the query point to every training example, which has O(n) complexity where n is the number of training examples. This becomes expensive for large datasets. Data structures like k-d trees or ball trees can reduce this to O(log n) in lower dimensions.

Concept Tested: K-Nearest Neighbors (computational complexity)
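
Assuming scikit-learn is available, the snippet below contrasts the brute-force search with a k-d tree backed search; the dataset is synthetic and only for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((5000, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

# Brute force computes distances to all n training points per query (O(n)).
brute = KNeighborsClassifier(n_neighbors=5, algorithm="brute").fit(X, y)
# A k-d tree prunes the search and can approach O(log n) in low dimensions.
tree = KNeighborsClassifier(n_neighbors=5, algorithm="kd_tree").fit(X, y)

query = rng.random((1, 3))
print(brute.predict(query), tree.predict(query))  # same prediction, different search cost
```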


9. In a binary classification problem, why might choosing an even value for k be problematic?

  A. Even values are computationally more expensive
  B. It can lead to ties in majority voting
  C. Even values always cause overfitting
  D. The algorithm only works with odd k values

The correct answer is B. With even k values in binary classification, it's possible to get a tie (e.g., k=4 with 2 votes for each class). While this can be resolved with strategies like choosing the label of the nearest neighbor or random selection, odd k values naturally avoid ties and are generally preferred for binary classification.

Concept Tested: K Selection
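
Here is a small sketch of a tie and one possible tie-breaking rule (falling back to the single nearest neighbor); the labels are hypothetical:

```python
from collections import Counter

# With k=4 the vote can tie in binary classification.
neighbor_labels = ["spam", "ham", "spam", "ham"]   # assumed sorted nearest-first
votes = Counter(neighbor_labels)
print(votes)  # Counter({'spam': 2, 'ham': 2}) -- a tie

# One common tie-break: fall back to the label of the single nearest neighbour.
top = votes.most_common()
if len(top) > 1 and top[0][1] == top[1][1]:
    prediction = neighbor_labels[0]
else:
    prediction = top[0][0]
print(prediction)  # "spam"
```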


10. What is a Voronoi diagram in the context of KNN?

  A. A visualization showing decision boundaries when k=1
  B. A graph showing the relationship between k and accuracy
  C. A plot of training data points in feature space
  D. A diagram showing the distance between all pairs of points

The correct answer is A. A Voronoi diagram partitions the feature space into regions where each region contains all points closest to a particular training example. For 1-NN classification, the Voronoi diagram exactly represents the decision boundaries, as any point in a region is classified with the label of the training point in that region.

Concept Tested: Voronoi Diagram
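
As a rough sketch (toy points, made-up labels), the 1-NN rule below assigns any query to the Voronoi cell of its closest training point, which is exactly how the Voronoi partition defines the k=1 decision regions:

```python
import numpy as np

# Three training points; every location in the plane belongs to the Voronoi cell of
# whichever training point is closest, which coincides with the 1-NN decision rule.
train = np.array([[1.0, 1.0], [4.0, 1.0], [2.5, 4.0]])
labels = np.array(["red", "blue", "green"])

def one_nn_region(x, y):
    d = np.linalg.norm(train - np.array([x, y]), axis=1)
    return labels[np.argmin(d)]

print(one_nn_region(1.2, 0.8))  # "red"   -- inside the first point's Voronoi cell
print(one_nn_region(3.9, 1.1))  # "blue"
print(one_nn_region(2.6, 3.5))  # "green"
```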