Taxonomy Distribution Report
Overview
- Total Concepts: 200
- Number of Taxonomies: 14
- Average Concepts per Taxonomy: 14.3
Distribution Summary
| Category | TaxonomyID | Count | Percentage | Status |
|---|---|---|---|---|
| Neural Networks | NN | 37 | 18.5% | ✅ |
| Foundation Concepts | FOUND | 31 | 15.5% | ✅ |
| Convolutional Networks | CNN | 20 | 10.0% | ✅ |
| Evaluation Metrics | EVAL | 19 | 9.5% | ✅ |
| Support Vector Machines | SVM | 16 | 8.0% | ✅ |
| Optimization | OPT | 16 | 8.0% | ✅ |
| Decision Trees | TREE | 12 | 6.0% | ✅ |
| Clustering | CLUST | 12 | 6.0% | ✅ |
| K-Nearest Neighbors | KNN | 11 | 5.5% | ✅ |
| Logistic Regression | LOGREG | 9 | 4.5% | ✅ |
| Data Preprocessing | PREP | 7 | 3.5% | ✅ |
| Regularization | REG | 5 | 2.5% | ℹ️ Under |
| Transfer Learning | TL | 4 | 2.0% | ℹ️ Under |
| Miscellaneous | MISC | 1 | 0.5% | ℹ️ Under |
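The overview figures and the Percentage column can be re-derived from the Count column alone. The following is a minimal sketch of that arithmetic, not the report script's actual code; the `counts` dictionary is copied from the table above, and the `summarize` helper is a hypothetical name introduced here for illustration.

```python
# Counts per TaxonomyID, copied from the Distribution Summary table.
counts = {
    "NN": 37, "FOUND": 31, "CNN": 20, "EVAL": 19, "SVM": 16, "OPT": 16,
    "TREE": 12, "CLUST": 12, "KNN": 11, "LOGREG": 9, "PREP": 7,
    "REG": 5, "TL": 4, "MISC": 1,
}


def summarize(counts):
    """Return (total, average per taxonomy, {id: percentage of total})."""
    total = sum(counts.values())
    average = total / len(counts)
    percentages = {tid: round(100 * n / total, 1) for tid, n in counts.items()}
    return total, average, percentages


total, average, percentages = summarize(counts)
print(f"Total Concepts: {total}")
print(f"Number of Taxonomies: {len(counts)}")
print(f"Average Concepts per Taxonomy: {average:.1f}")
for tid, pct in percentages.items():
    print(f"{tid}: {counts[tid]} ({pct}%)")
```

Running it against the table's counts is a quick way to confirm that the counts add up to the stated total and that each percentage matches its row.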
Visual Distribution
NN     █████████  37 ( 18.5%)
FOUND  ███████    31 ( 15.5%)
CNN    █████      20 ( 10.0%)
EVAL   ████       19 (  9.5%)
SVM    ████       16 (  8.0%)
OPT    ████       16 (  8.0%)
TREE   ███        12 (  6.0%)
CLUST  ███        12 (  6.0%)
KNN    ██         11 (  5.5%)
LOGREG ██          9 (  4.5%)
PREP   █           7 (  3.5%)
REG    █           5 (  2.5%)
TL     █           4 (  2.0%)
MISC               1 (  0.5%)
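The bar lengths above are consistent with one block per four concepts, rounded down (37 gives 9 blocks, 7 gives 1, 1 gives none). That scale is inferred from the figures shown, not confirmed by the script, so the sketch below treats it as an assumption; `render_bars` and `per_block` are hypothetical names.

```python
# Counts per TaxonomyID, copied from the Distribution Summary table.
counts = {
    "NN": 37, "FOUND": 31, "CNN": 20, "EVAL": 19, "SVM": 16, "OPT": 16,
    "TREE": 12, "CLUST": 12, "KNN": 11, "LOGREG": 9, "PREP": 7,
    "REG": 5, "TL": 4, "MISC": 1,
}


def render_bars(counts, per_block=4):
    """Render one text bar per taxonomy, largest first.

    Assumption: each block stands for `per_block` concepts, rounded down.
    """
    total = sum(counts.values())
    id_width = max(len(tid) for tid in counts)
    bar_width = max(counts.values()) // per_block
    lines = []
    # sorted() is stable, so equal counts keep their original order.
    for tid, n in sorted(counts.items(), key=lambda item: -item[1]):
        bar = "█" * (n // per_block)
        pct = 100 * n / total
        lines.append(f"{tid:<{id_width}} {bar:<{bar_width}} {n:>3} ({pct:5.1f}%)")
    return lines


for line in render_bars(counts):
    print(line)
```

Padding the ID and bar columns to a fixed width keeps the counts and percentages aligned when the output is shown in a monospaced font.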
Balance Analysis
✅ No Over-Represented Categories
All categories are below the 30% over-representation threshold. The largest, Neural Networks (NN), accounts for 18.5% of concepts.
ℹ️ Under-Represented Categories (<3%)
- Regularization (REG): 5 concepts (2.5%)
  - Note: Small categories are acceptable for specialized topics
- Transfer Learning (TL): 4 concepts (2.0%)
  - Note: Small categories are acceptable for specialized topics
- Miscellaneous (MISC): 1 concept (0.5%)
  - Note: Small categories are acceptable for specialized topics
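The balance check reduces to comparing each category's share against the two thresholds quoted in this section (30% and 3%). Here is a minimal sketch under stated assumptions: the "under" test is strict, as the `<3%` heading suggests, while treating "over" as strictly greater than 30% is a guess, since the report does not say how a share of exactly 30% is handled. The constant and function names are hypothetical.

```python
# Counts per TaxonomyID, copied from the Distribution Summary table.
counts = {
    "NN": 37, "FOUND": 31, "CNN": 20, "EVAL": 19, "SVM": 16, "OPT": 16,
    "TREE": 12, "CLUST": 12, "KNN": 11, "LOGREG": 9, "PREP": 7,
    "REG": 5, "TL": 4, "MISC": 1,
}

OVER_THRESHOLD = 30.0   # percent; boundary handling is an assumption
UNDER_THRESHOLD = 3.0   # percent; strict "<" as in the section heading


def classify(counts):
    """Map each TaxonomyID to 'over', 'under', or 'ok' by its share of the total."""
    total = sum(counts.values())
    status = {}
    for tid, n in counts.items():
        pct = 100 * n / total
        if pct > OVER_THRESHOLD:
            status[tid] = "over"
        elif pct < UNDER_THRESHOLD:
            status[tid] = "under"
        else:
            status[tid] = "ok"
    return status


status = classify(counts)
over = [tid for tid, s in status.items() if s == "over"]
under = [tid for tid, s in status.items() if s == "under"]
print("Over-represented:", over)
print("Under-represented:", under)
```

With the counts from this report, the over-represented list is empty and the under-represented list contains REG, TL, and MISC, matching the Status column.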
Category Details
Neural Networks (NN)
Count: 37 concepts (18.5%)
Concepts:
- Neural Network
- Artificial Neuron
- Perceptron
- Activation Function
- ReLU
- Tanh
- Leaky ReLU
- Weights
- Bias
- Forward Propagation
- Backpropagation
- Gradient Descent
- Stochastic Gradient Descent
- Mini-Batch Gradient Descent
- Learning Rate
- ...and 22 more
Foundation Concepts (FOUND)
Count: 31 concepts (15.5%)
Concepts:
- Machine Learning
- Supervised Learning
- Unsupervised Learning
- Classification
- Regression
- Training Data
- Test Data
- Validation Data
- Feature
- Label
- Instance
- Feature Vector
- Model
- Algorithm
- Hyperparameter
- ...and 16 more
Convolutional Networks (CNN)
Count: 20 concepts (10.0%)
Concepts:
- Convolutional Neural Network
- Convolution Operation
- Filter
- Stride
- Padding
- Valid Padding
- Same Padding
- Receptive Field
- Max Pooling
- Average Pooling
- Spatial Hierarchies
- Translation Invariance
- Local Connectivity
- Weight Sharing
- CNN Architecture
- ...and 5 more
Evaluation Metrics (EVAL)
Count: 19 concepts (9.5%)
Concepts:
- Training Error
- Test Error
- Generalization
- Stratified Sampling
- Holdout Method
- Confusion Matrix
- True Positive
- False Positive
- True Negative
- False Negative
- Accuracy
- Precision
- Recall
- F1 Score
- ROC Curve
- ...and 4 more
Support Vector Machines (SVM)
Count: 16 concepts (8.0%)
Concepts:
- Support Vector Machine
- Hyperplane
- Margin
- Support Vectors
- Margin Maximization
- Hard Margin SVM
- Soft Margin SVM
- Slack Variables
- Kernel Trick
- Linear Kernel
- Polynomial Kernel
- Radial Basis Function
- Gaussian Kernel
- Dual Formulation
- Primal Formulation
- ...and 1 more
Optimization (OPT)
Count: 16 concepts (8.0%)
Concepts:
- Computational Complexity
- Time Complexity
- Space Complexity
- Scalability
- Online Learning
- Optimizer
- Adam Optimizer
- RMSprop
- Momentum
- Nesterov Momentum
- Gradient Clipping
- Dropout
- Early Stopping
- Grid Search
- Random Search
- ...and 1 more
Decision Trees (TREE)
Count: 12 concepts (6.0%)
Concepts:
- Decision Tree
- Tree Node
- Leaf Node
- Splitting Criterion
- Entropy
- Information Gain
- Gini Impurity
- Pruning
- Overfitting
- Underfitting
- Tree Depth
- Cross-Entropy Loss
Clustering (CLUST)
Count: 12 concepts (6.0%)
Concepts:
- K-Means Clustering
- Centroid
- Cluster Assignment
- Cluster Update
- K-Means Initialization
- Random Initialization
- K-Means++ Initialization
- Elbow Method
- Silhouette Score
- Within-Cluster Variance
- Convergence Criteria
- Inertia
K-Nearest Neighbors (KNN)
Count: 11 concepts (5.5%)
Concepts:
- K-Nearest Neighbors
- Distance Metric
- Euclidean Distance
- Manhattan Distance
- K Selection
- Decision Boundary
- Voronoi Diagram
- Curse of Dimensionality
- KNN for Classification
- KNN for Regression
- Lazy Learning
Logistic Regression (LOGREG)
Count: 9 concepts (4.5%)
Concepts:
- Sigmoid Function
- Log-Loss
- Binary Classification
- Multiclass Classification
- Maximum Likelihood
- One-vs-All
- One-vs-One
- Softmax Function
- Sigmoid Activation
Data Preprocessing (PREP)
Count: 7 concepts (3.5%)
Concepts:
- Normalization
- Standardization
- Min-Max Scaling
- Z-Score Normalization
- One-Hot Encoding
- Dimensionality Reduction
- Data Augmentation
Regularization (REG)
Count: 5 concepts (2.5%)
Concepts:
- Regularization
- L1 Regularization
- L2 Regularization
- Ridge Regression
- Lasso Regression
Transfer Learning (TL)
Count: 4 concepts (2.0%)
Concepts:
- Transfer Learning
- Fine-Tuning
- Domain Adaptation
- ImageNet
Miscellaneous (MISC)
Count: 1 concept (0.5%)
Concepts:
- Logistic Regression
Recommendations
- ✅ Good balance: Categories are reasonably distributed (spread: 18.0%)
- ✅ MISC category minimal: Good categorization specificity
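The report does not define "spread". One reading that reproduces the 18.0% figure is the gap between the largest and smallest category shares (18.5% minus 0.5%). The sketch below computes it under that assumption; `spread` is a hypothetical helper name, not a function taken from the report script.

```python
# Counts per TaxonomyID, copied from the Distribution Summary table.
counts = {
    "NN": 37, "FOUND": 31, "CNN": 20, "EVAL": 19, "SVM": 16, "OPT": 16,
    "TREE": 12, "CLUST": 12, "KNN": 11, "LOGREG": 9, "PREP": 7,
    "REG": 5, "TL": 4, "MISC": 1,
}


def spread(counts):
    """Assumed definition: largest share minus smallest share, in percent."""
    total = sum(counts.values())
    shares = [100 * n / total for n in counts.values()]
    return max(shares) - min(shares)


print(f"Spread: {spread(counts):.1f}%")
```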
Educational Use Recommendations
- Use taxonomy categories for color-coding in graph visualizations
- Design curriculum modules based on taxonomy groupings
- Create filtered views for focused learning paths
- Use categories for assessment organization
- Enable navigation by topic area in interactive tools
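As an illustration of the color-coding and filtered-view suggestions above, the sketch below maps TaxonomyIDs to colors and filters concepts by category. The palette values, the `(label, taxonomy_id)` tuple layout, and the function names are all hypothetical; the report does not specify how concepts are stored or which colors to use.

```python
# A few (concept label, TaxonomyID) pairs taken from the Category Details lists.
concepts = [
    ("Neural Network", "NN"),
    ("Perceptron", "NN"),
    ("Ridge Regression", "REG"),
    ("Fine-Tuning", "TL"),
    ("Centroid", "CLUST"),
]

# Hypothetical palette: any distinct colors would serve equally well.
PALETTE = {
    "NN": "#1f77b4",
    "REG": "#ff7f0e",
    "TL": "#2ca02c",
    "CLUST": "#d62728",
}
DEFAULT_COLOR = "#7f7f7f"  # fallback for categories without an assigned color


def color_for(taxonomy_id):
    """Return the display color for a taxonomy, or the default if unmapped."""
    return PALETTE.get(taxonomy_id, DEFAULT_COLOR)


def filter_by_taxonomy(concepts, taxonomy_id):
    """Return the labels of all concepts that belong to one taxonomy."""
    return [label for label, tid in concepts if tid == taxonomy_id]


for label, tid in concepts:
    print(f"{label}: {tid} -> {color_for(tid)}")
print("NN only:", filter_by_taxonomy(concepts, "NN"))
```

Keeping the palette keyed by TaxonomyID means the same mapping can drive both node colors in a graph view and the legend for a filtered learning path.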
Report generated by learning-graph-reports/taxonomy_distribution.py