Several binary classifiers are trained on a 2D dataset and their decision boundaries are visualised.
git clone https://github.com/PierreExeter/classifier_visualizations.git
TODO: add conda environment file.
This visualisation compares the performance of various classifiers on several toy datasets. Adapted from the Scikit-Learn classifier comparison example.
python 1_classifier_comparison.py
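The idea behind the comparison can be sketched as follows: fit a handful of classifiers on the same 2D dataset and report their test accuracy (the dataset, classifier choices and hyperparameters here are illustrative, not necessarily those used in the script).

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

# Toy 2D dataset with some class overlap
X, y = make_moons(n_samples=200, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

classifiers = {
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "SVC": SVC(kernel="rbf", C=1.0),
    "Random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each classifier and report its accuracy on the held-out test set
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(f"{name}: test accuracy {clf.score(X_test, y_test):.2f}")
```

The actual script additionally plots the decision boundary of each classifier over the 2D feature space.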
K-Nearest Neighbors is a popular classifier. After being fitted on the training data, it can predict a probability (between 0 and 1) for each point in the 2D feature space. The decision boundary is fixed at 0.5.
python 2_simple_plot_KNN.py
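The probability and the 0.5 threshold can be demonstrated with a short sketch (the query points below are arbitrary, chosen only for illustration): KNN estimates the probability of a class from the fraction of the k nearest neighbours belonging to it, and `predict()` thresholds that probability at 0.5.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=200, noise=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# Arbitrary query points in the 2D feature space
grid = np.array([[0.0, 0.0], [1.0, -0.5], [2.0, 0.5]])

proba = knn.predict_proba(grid)[:, 1]  # estimated P(class 1) per point
pred = knn.predict(grid)               # hard label, thresholded at 0.5

for p, c in zip(proba, pred):
    print(f"P(class 1) = {p:.2f} -> predicted class {c}")
```

With an odd number of neighbours the probability can never be exactly 0.5, so the hard prediction always agrees with thresholding the probability.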
Changing the number of neighbors can have a significant effect on the test accuracy and may cause overfitting.
python 3_KNN_hyperparameter_tuning.py
The model is overfitting for low numbers of neighbors, since the train accuracy is noticeably higher than the test accuracy (i.e. the model does not generalise to unseen data). The best number of neighbors is 14 for this particular dataset.
python 3_SVC_hyperparameter_tuning.py
python 3_RFC_hyperparameter_tuning.py
The random forest is clearly overfitting this dataset.
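The overfitting can be seen, and partially remedied, with a short sketch (the `max_depth=3` cap is one illustrative regularisation choice, not necessarily what the script uses): an unconstrained forest nearly memorises the training set, while limiting tree depth shrinks the train/test gap.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Unconstrained trees grow until the leaves are pure
deep = RandomForestClassifier(n_estimators=100, random_state=0)
deep.fit(X_train, y_train)

# Capping the depth regularises each tree
shallow = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=0)
shallow.fit(X_train, y_train)

for name, clf in [("unconstrained", deep), ("max_depth=3", shallow)]:
    gap = clf.score(X_train, y_train) - clf.score(X_test, y_test)
    print(f"{name}: train/test accuracy gap {gap:.2f}")
```

Other levers with a similar effect include `min_samples_leaf` and `max_features`.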
TODO:
- add effect of dataset size
- add ROC curve, F1 score, confusion matrix
- add pipelines, cross validation and hyperparameter tuning