Classification example


# Problem statement

In this example, we consider a dataset where each input vector \( X = ( x , y) \) is associated with a class, A (+1) or B (-1). The following figure illustrates the classification problem:


# Network architecture

The single layer architecture is the following:
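The network is a single neuron: it computes a weighted sum of the two inputs plus a bias, then applies an activation function \( f \):

$$ o = f( w_1 x + w_2 y + w_3 ) $$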

As we need to distinguish class A from class B, we have to use an activation function that can separate the two classes. In this example, the hyperbolic tangent has been selected:
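For reference, the hyperbolic tangent is defined as:

$$ \tanh(s) = \frac{e^{s} - e^{-s}}{e^{s} + e^{-s}} $$

and maps any real input into the open interval \( (-1, +1) \).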

The choice of the hyperbolic tangent is motivated by the fact that this function outputs a value between -1 and +1. The output can be interpreted in two ways: in terms of binary classes (A or B), or in terms of probabilities.

# Binary interpretation

To determine whether a sample belongs to class A or B, one can apply the following rule: positive outputs belong to class A, while negative outputs belong to class B. Mathematically, we add the following function after the output of the network:
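This decision rule amounts to taking the sign of the network output \( o \) (here the edge case \( o = 0 \) is arbitrarily assigned to class A):

$$ \mathrm{class}(o) = \begin{cases} A\ (+1) & \text{if } o \ge 0 \\ B\ (-1) & \text{if } o < 0 \end{cases} $$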

# Probabilistic interpretation

The second option for interpreting the output of the network is to consider it as the probability of belonging to class A or B. When the output is equal to +1, the probability of the sample being classified in class A or B is respectively one and zero. The following equations generalize this concept and convert the network output into probabilities:

Probability of being in class A : $$ p_A = \frac{o+1}{2} $$ Probability of being in class B : $$ p_B = \frac{1-o}{2} $$ Note that the sum of the probabilities is always equal to one ( \( p_A + p_B = 1 \) ).
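As a quick illustration, both interpretations can be computed from a single output value. This is a minimal Python sketch (the classifier itself is written in MATLAB below; the function name `interpret` is ours, introduced only for this example):

```python
import math

def interpret(o):
    """Interpret a tanh network output o, with o in (-1, +1)."""
    label = +1 if o >= 0 else -1   # binary interpretation: +1 = class A, -1 = class B
    p_a = (o + 1) / 2              # probability of belonging to class A
    p_b = (1 - o) / 2              # probability of belonging to class B
    return label, p_a, p_b

# Example: a strongly positive activation is confidently class A
label, p_a, p_b = interpret(math.tanh(2.0))
print(label, round(p_a + p_b, 12))  # the two probabilities always sum to 1
```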

# Results

The following figure shows how the space is split to separate the classes:

The following figure gives an overview of the training results.


# Code source

The source code of this classification example is given below (MATLAB/Octave):
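Before reading the listing, it may help to see where the weight update in the training loop comes from. Minimizing the squared error between the expected output \( t \) (+1 or -1) and the network output \( o = \tanh(S) \), with \( S = w_1 x + w_2 y + w_3 \) and learning rate \( \eta \), gives the delta rule:

$$ E = \frac{1}{2}\,(t - o)^2 \qquad \Delta W = -\eta \frac{\partial E}{\partial W} = \eta\,(t - o)\,\bigl(1 - \tanh^2(S)\bigr)\,\begin{bmatrix} x \\ y \\ 1 \end{bmatrix} $$

This is exactly the update `W = W + Eta*(Y_ - Y)*[dataset(i,1:2),1]'*(1-tanh(S)*tanh(S))` used in the code.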

%% Single layer classifier
close all;
clear all;

%% Parameters
% Dataset size (any even number works; 100 is used here)
N = 100;
% Learning rate
Eta = 0.01;

%% Dataset
% Generate dataset for each class [ X , Y , class (+1 or -1) ]

% Class A (+1)

% ~ 98% good classification
classA = mvnrnd ([2, 2] , [5 1.5; 1.5 1] ,N/2) ;
%% Uncomment the following line to create a 100% good classification
%classA = mvnrnd ([2, 4] , [5 1.5; 1.5 1] ,N/2) ;
classA= [classA , ones(N/2,1) ];

% Class B (-1)

classB = mvnrnd ([2,-2] , [3,0;0,0.5] ,N/2);
classB= [classB , -ones(N/2,1) ];

% Merge classes for creating the dataset
dataset=[ classA ; classB ];
% Shuffle dataset
dataset = dataset(randperm(N),:);

%% Initialize weights (two inputs + bias)
W = rand(3,1) - 0.5;

%% Training loop
for i = 1:size(dataset,1)
    % Forward (input vector augmented with 1 for the bias)
    S = W'*[dataset(i,1:2),1]';
    Y=tanh (S);
    % Expected output
    Y_ = dataset(i,3);
    % Update weights (delta rule, with tanh'(S) = 1 - tanh(S)^2)
    W=W+Eta*(Y_ - Y)*[dataset(i,1:2),1]'*(1-tanh(S)*tanh(S));
end

%% Display

% Get boundaries (for display)
Xmin = min(dataset(:,1)) - 1;
Xmax = max(dataset(:,1)) + 1;
Ymin = min(dataset(:,2)) - 1;
Ymax = max(dataset(:,2)) + 1;

%% Output
[X,Y] = meshgrid(Xmin:0.1:Xmax,Ymin:0.1:Ymax);
Z = tanh ( X*W(1) + Y*W(2) + W(3) );
surf (X,Y,Z);
hold on;

%% Display dataset
plot3 (classA(:,1),classA(:,2),4+classA(:,3),'.r'); hold on;
plot3 (classB(:,1),classB(:,2),4+classB(:,3),'.b');
grid on;
axis square equal;

%% Test on training set
good = 0;
for i = 1:size(dataset,1)
    % Compute network output
    Y=tanh (W'*[dataset(i,1:2),1]');
    % Compare to the expected output
    if (sign(Y)==dataset(i,3))
        % Good classification (green circle)
        plot3 (dataset(i,1),dataset(i,2),8+sign(Y),'og');
        good = good + 1;
    else
        % Wrong classification (black cross)
        plot3 (dataset(i,1),dataset(i,2),8+sign(Y),'xk');
    end
end

% Axis labels and colormap
shading interp;
xlabel ('X');
ylabel ('Y');
% Uncomment for a top view
%view(2);

% Compute misclassification ratio
badly_classified = 1-good/N

Output:

badly_classified =