
Separating hyperplane (hyperplan séparateur)







This is the Part 3 of my series of tutorials about the math behind Support Vector Machine. If you did not read the previous articles, you might want to start the series at the beginning with this article: an overview of Support Vector Machine.

The main focus of this article is to show you the reasoning that allows us to select the optimal hyperplane. Here is a quick summary of what we will see:

- How can we find the optimal hyperplane?
- How do we calculate the distance between two hyperplanes?

(A compact preview of the formulas answering these two questions appears at the end of this section.)

At the end of Part 2 we computed the distance $\|\mathbf{p}\|$ between a point $A$ and a hyperplane. We then computed the margin, which was equal to $2\|\mathbf{p}\|$. However, even if that hyperplane did quite a good job at separating the data, it was not the optimal hyperplane.

Figure 1: The margin we calculated in Part 2 is shown as $M_1$.

As we saw in Part 1, the optimal hyperplane is the one which maximizes the margin of the training data. In Figure 1, we can see that the margin $M_1$, delimited by the two blue lines, is not the biggest margin separating the data perfectly. The biggest margin is the margin $M_2$ shown in Figure 2 below.

Figure 2: The optimal hyperplane is slightly on the left of the one we used in Part 2.

You can also see the optimal hyperplane in Figure 2. It is slightly on the left of our initial hyperplane. How did I find it? I simply traced a line crossing the margin $M_2$ in its middle.

Right now you should have the feeling that hyperplanes and margins are closely related. If I have a hyperplane, I can compute its margin with respect to some data point. If I have a margin delimited by two hyperplanes (the dark blue lines in Figure 2), I can find a third hyperplane passing right in the middle of the margin. So finding the biggest margin is the same thing as finding the optimal hyperplane, and we can do it with a simple recipe:

- select two hyperplanes which separate the data with no points between them, then
- maximize their distance.

The region bounded by the two hyperplanes will be the biggest possible margin.

If it is so simple, why does everybody have so much trouble understanding SVM? It is because, as always, the simplicity requires some abstraction and mathematical terminology to be well understood. So we will now go through this recipe step by step.

Step 1: You have a dataset $\mathcal{D}$ and you want to classify it

Most of the time your data will be composed of $n$ vectors $\mathbf{x}_i$. Each $\mathbf{x}_i$ will also be associated with a value $y_i$ indicating if the element belongs to the class ($+1$) or not ($-1$). Note that $y_i$ can only have two possible values, $-1$ or $+1$. Moreover, most of the time, for instance when you do text classification, your vector $\mathbf{x}_i$ ends up having a lot of dimensions. We can say that $\mathbf{x}_i$ is a $p$-dimensional vector if it has $p$ dimensions. So your dataset $\mathcal{D}$ is the set of $n$ couples $(\mathbf{x}_i, y_i)$. The more formal definition of an initial dataset in set theory is:

$$\mathcal{D} = \bigl\{ (\mathbf{x}_i, y_i) \mid \mathbf{x}_i \in \mathbb{R}^p,\ y_i \in \{-1, 1\} \bigr\}_{i=1}^{n}$$
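To make this definition concrete, here is a minimal NumPy sketch of such a dataset (the numbers are invented purely for illustration): each row of `X` is one vector $\mathbf{x}_i$ with $p = 2$ dimensions, and `y` holds the matching labels $y_i$.

```python
import numpy as np

# A toy dataset D with n = 4 couples (x_i, y_i) and p = 2 dimensions per vector.
# All values are made up for illustration.
X = np.array([[1.0, 3.0],
              [2.0, 4.0],
              [6.0, 1.0],
              [7.0, 2.0]])
y = np.array([1, 1, -1, -1])   # each y_i is +1 (in the class) or -1 (not)

# The set of couples (x_i, y_i) from the formal definition above:
D = list(zip(X, y))
```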
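Finally, here is the promised preview of the two questions from the summary. The series derives these results step by step later; as a hedged sketch, using notation not yet introduced in this excerpt (a hyperplane written as $\mathbf{w} \cdot \mathbf{x} + b = 0$), the standard formulas are:

$$\operatorname{dist}(A, \mathcal{H}) = \frac{|\mathbf{w} \cdot \mathbf{a} + b|}{\|\mathbf{w}\|}, \qquad \text{margin} = \frac{2}{\|\mathbf{w}\|}$$

where $\mathbf{a}$ is the position vector of the point $A$, and the margin is the distance between the two bounding hyperplanes $\mathbf{w} \cdot \mathbf{x} + b = 1$ and $\mathbf{w} \cdot \mathbf{x} + b = -1$.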


The post also includes a complete 3D example: a One-Class SVM is fitted to clustered training data, and its learned frontier, the surface where the decision function equals zero, is plotted together with regular and abnormal novel observations. Below is the full script; values that the original fragments did not specify (grid resolution, cluster spread, test-set sizes) are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from mpl_toolkits.mplot3d.art3d import Poly3DCollection
from skimage import measure
from sklearn import svm

# Define the size of the space which is interesting for the example
X_MIN, X_MAX = -5, 5
Y_MIN, Y_MAX = -5, 5
Z_MIN, Z_MAX = -5, 5
SPACE_SAMPLING_POINTS = 40  # grid resolution (illustrative choice)
TRAIN_POINTS = 100

# Generate a regular grid to sample the 3D space for various operations later
xx, yy, zz = np.meshgrid(np.linspace(X_MIN, X_MAX, SPACE_SAMPLING_POINTS),
                         np.linspace(Y_MIN, Y_MAX, SPACE_SAMPLING_POINTS),
                         np.linspace(Z_MIN, Z_MAX, SPACE_SAMPLING_POINTS),
                         indexing="ij")

# Generate training data by using a random cluster and copying it to various
# places in the space
X = 0.3 * np.random.randn(TRAIN_POINTS // 2, 3)
X_train = np.r_[X + 2, X - 2]

# Generate some regular novel observations using the same method and
# distribution properties
X = 0.3 * np.random.randn(20, 3)
X_test = np.r_[X + 2, X - 2]

# Generate some abnormal novel observations using a different distribution
X_outliers = np.random.uniform(low=-4, high=4, size=(20, 3))

# Create a OneClassSVM instance and fit it to the data
clf = svm.OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
clf.fit(X_train)

# Predict the class of the various input created before
y_pred_train = clf.predict(X_train)
y_pred_test = clf.predict(X_test)
y_pred_outliers = clf.predict(X_outliers)

# And compute classification error frequencies
n_error_train = (y_pred_train == -1).sum()
n_error_test = (y_pred_test == -1).sum()
n_error_outliers = (y_pred_outliers == 1).sum()
print(f"errors -- train: {n_error_train}, regular: {n_error_test}, "
      f"abnormal: {n_error_outliers}")

# Calculate the distance from the separating hyperplane of the SVM for the
# whole space using the grid defined in the beginning
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel(), zz.ravel()])
Z = Z.reshape(xx.shape)

# Create a figure with axes for 3D plotting
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")

# Plot the different input points using 3D scatter plotting
b1 = ax.scatter(X_train[:, 0], X_train[:, 1], X_train[:, 2],
                c="white", edgecolors="k")
b2 = ax.scatter(X_test[:, 0], X_test[:, 1], X_test[:, 2], c="green")
c = ax.scatter(X_outliers[:, 0], X_outliers[:, 1], X_outliers[:, 2], c="red")

# Plot the separating hyperplane by recreating the isosurface for the distance
# = 0 level in the distance grid computed through the decision function of the
# SVM. This is done using the marching cubes algorithm implementation from
# scikit-image.
verts, faces, _, _ = measure.marching_cubes(Z, 0)

# Scale and transform to actual size of the interesting volume
verts = verts * [X_MAX - X_MIN, Y_MAX - Y_MIN, Z_MAX - Z_MIN] \
        / (SPACE_SAMPLING_POINTS - 1)
verts = verts + [X_MIN, Y_MIN, Z_MIN]
mesh = Poly3DCollection(verts[faces],
                        facecolor="orange", edgecolor="gray", alpha=0.3)
ax.add_collection3d(mesh)

ax.legend([mpatches.Patch(color="orange", alpha=0.3), b1, b2, c],
          ["learned frontier", "training observations",
           "new regular observations", "new abnormal observations"])
plt.show()
```
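Continuing from the script above, the fitted model can also score individual points: `predict` returns +1 or -1, and `decision_function` returns a signed score that is zero exactly on the learned frontier. The coordinates below are arbitrary.

```python
# Score one (arbitrary) point against the learned frontier.
point = np.array([[2.0, 2.0, 2.0]])
print(clf.predict(point))            # +1 = regular, -1 = abnormal
print(clf.decision_function(point))  # signed score; 0 lies on the frontier
```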







