neural-networks.io

Classification example

 

# Problem statement

In this example, we consider a dataset where each input vector \( X = (x, y) \) is associated with a class, A (+1) or B (-1). The following figure illustrates the classification problem:

 

# Network architecture

The single-layer architecture is the following: the network computes a weighted sum of the two inputs plus a bias, \( s = w_1 x + w_2 y + b \), and passes it through an activation function to produce the output \( o \).

As we need to distinguish class A from class B, we have to use an activation function that can separate the classes. In this example, the hyperbolic tangent has been selected:

$$ o = \tanh(s) = \frac{e^{s} - e^{-s}}{e^{s} + e^{-s}} $$

The choice of the hyperbolic tangent is motivated by the fact that this function outputs a value between -1 and +1. The output can be interpreted in two ways: in terms of binary classes (A or B), or in terms of probabilities.

 
# Binary interpretation

To determine whether a sample belongs to class A or B, one can specify the following rule: positive outputs belong to class A, negative outputs to class B. Mathematically, we add the following function after the output of the network:

$$ \operatorname{class}(o) = \begin{cases} A & \text{if } o \ge 0 \\ B & \text{if } o < 0 \end{cases} $$

 
# Probabilistic interpretation

The second option is to interpret the output of the network as the probability of belonging to class A or B. When the output is equal to +1, the probability for the sample to be classified in class A or B is respectively one and zero. The following equations generalize this concept and convert the network output into probabilities:

Probability of being in class A: $$ p_A = \frac{o+1}{2} $$ Probability of being in class B: $$ p_B = \frac{1-o}{2} $$ Note that the sum of the probabilities is always equal to one ( \( p_A + p_B = 1 \) ).
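As a quick illustration, the conversion can be written directly in MATLAB (a minimal sketch; the output value o = 0.5 is an arbitrary example):

% Convert a network output o in [-1,+1] into class probabilities
o  = 0.5;            % example network output (arbitrary value)
pA = (o + 1) / 2;    % probability of class A -> 0.75
pB = (1 - o) / 2;    % probability of class B -> 0.25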
 

# Results

The following figure shows how the space is split to separate the classes:
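Since \( \tanh(s) = 0 \) exactly when \( s = 0 \), the boundary between the two regions is the line \( w_1 x + w_2 y + b = 0 \). The following sketch plots this line explicitly (an optional addition; it assumes the trained weight vector W and the bounds Xmin, Xmax from the code below):

% Decision boundary: tanh(s) = 0  <=>  w1*x + w2*y + b = 0
% (assumes W = [w1; w2; b] and Xmin/Xmax from the code below)
x = linspace(Xmin, Xmax, 100);
y = -(W(1)*x + W(3)) / W(2);   % solve w1*x + w2*y + b = 0 for y
plot(x, y, 'k-', 'LineWidth', 2);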

The following figure gives an overview of the training results.

 

# Source code

The MATLAB source code for this classification is given below.
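The training loop performs stochastic gradient descent on the squared error, which yields the weight-update line used in the code (a sketch of the derivation, written to match the code, where \( \hat{y} \) is the expected output Y_):

$$ E = \frac{1}{2}\left(\hat{y} - o\right)^2, \qquad o = \tanh(s), \qquad s = w_1 x + w_2 y + b $$

$$ \frac{\partial E}{\partial w_i} = -\left(\hat{y} - o\right)\left(1 - \tanh^2(s)\right) x_i \quad \Rightarrow \quad w_i \leftarrow w_i + \eta\left(\hat{y} - o\right)\left(1 - \tanh^2(s)\right) x_i $$

where \( (x_1, x_2, x_3) = (x, y, 1) \), so that \( w_3 \) acts as the bias.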

%% Single layer classifier
close all;
clear all;
clc;

%% Parameters
% Dataset size
N=1000;
% Learning rate
Eta=0.003;

%% Dataset
% Generate dataset for each class [ X , Y , class (+1 or -1) ]

% Class A (+1)

% ~98% correct classification
classA = mvnrnd ([2, 2] , [5 1.5; 1.5 1] ,N/2) ;
% Uncomment the following line for a dataset that can be classified 100% correctly
%classA = mvnrnd ([2, 4] , [5 1.5; 1.5 1] ,N/2) ;
classA= [classA , ones(N/2,1) ];

% class B (-1)

classB = mvnrnd ([2,-2] , [3,0;0,0.5] ,N/2);
classB= [classB , -ones(N/2,1) ];

% Merge classes for creating the dataset
dataset=[ classA ; classB ];
% Shuffle dataset
dataset=dataset(randperm(size(dataset,1)),:);

%% Initialize weight
W=[0;0;0];

%% Training loop
for i = 1:size(dataset,1)
    % Forward
    S=W'*[dataset(i,1:2),1]';
    Y=tanh (S);
    
    % Expected output
    Y_=dataset(i,3);
    
    % Update weights: stochastic gradient descent on the squared error,
    % using the derivative tanh'(S) = 1 - tanh(S)^2 = 1 - Y^2
    W = W + Eta*(Y_ - Y)*(1 - Y^2)*[dataset(i,1:2),1]';
end


%% Display

% Get boundaries (for display)
Xmin=min(dataset(:,1));
Xmax=max(dataset(:,1));
Ymin=min(dataset(:,2));
Ymax=max(dataset(:,2));

%% Output
[X,Y] = meshgrid(Xmin:0.1:Xmax,Ymin:0.1:Ymax);
Z = tanh ( X*W(1) + Y*W(2) + W(3) );
surf(X,Y,Z,'facecolor','texture')
hold on;

%% Display dataset
plot3 (classA(:,1),classA(:,2),4+classA(:,3),'.r'); hold on;
plot3 (classB(:,1),classB(:,2),4+classB(:,3),'.b');
grid on;
axis square equal;

%% Test on training set
good=0;
for i = 1:size(dataset,1)
    % Compute network output
    Y=tanh (W'*[dataset(i,1:2),1]');
    
    % Compare to the expected output
    if (sign(Y)==dataset(i,3))
        % Good classification (green circle)
        good=good+1;
        plot3 (dataset(i,1),dataset(i,2),8+sign(Y),'og');
    else
        % Wrong classification (black cross)
        plot3 (dataset(i,1),dataset(i,2),8+sign(Y),'xk');            
    end
end

% Axis labels and colormap
colormap(jet);
colorbar
shading interp;
xlabel ('X');
ylabel ('Y');
% Uncomment for a top view
%view(0,90);

% Compute misclassification ratio
badly_classified = 1-good/N

Output:

badly_classified =

    0.0060