In computer vision, segmentation refers to the process of partitioning a digital image into multiple segments. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.
I have a dataset of hands against a background. As you can see, the background is not uniformly lit.
On top of that, the hand's position, scale and rotation are not fixed.
I don't have ground truth for the hands, and I don't want to waste my twenties marking it myself.
It is fairly obvious that simple thresholding doesn't cut it, and since I have no skin-color information to begin with, what can I do?
Well, the only thing you can do is improvise.
My first thought was to use K-Means to split the image into two parts: a background class containing the backdrop and a foreground class containing the hand pixels.
I tested that a bit and found that it worked pretty well, so I decided to upgrade it to Expectation-Maximization with Gaussian mixtures.
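To make the first attempt concrete, here is a plain NumPy K-Means sketch on a toy stand-in for a hand image. Everything here (the feature layout of normalized (x, y) plus intensity, the synthetic image, the iteration count) is an illustrative assumption, not the code I actually ran:

```python
import numpy as np

def kmeans(features, k=2, iters=20, seed=0):
    """Plain K-Means: alternate nearest-centroid assignment and mean update."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)].astype(float)
    labels = np.zeros(len(features), dtype=int)
    for _ in range(iters):
        # squared distance of every pixel-feature row to every centroid
        d = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            pts = features[labels == c]
            if len(pts):
                centroids[c] = pts.mean(axis=0)
    return labels, centroids

# toy stand-in for a hand image: a bright patch on an unevenly lit background
img = np.linspace(0.1, 0.4, 32 * 32).reshape(32, 32)
img[10:20, 10:20] = 0.9
ys, xs = np.mgrid[0:32, 0:32] / 31.0
# features per pixel: normalized (x, y) position plus intensity
feats = np.column_stack([xs.ravel(), ys.ravel(), img.ravel()])
labels, centroids = kmeans(feats, k=2)
```

Reshaping `labels` back to 32×32 gives a candidate foreground/background partition to inspect.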
A subproblem, given the two classes, is deciding which one is the skin and which is the background. After the mask is computed, it is advisable to do some post-processing to remove rogue pixels.
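The class-discrimination heuristic used in the code below (the component with the largest total covariance is taken to be the hand) can be sketched in NumPy for readers outside MATLAB. The covariance matrices here are made-up stand-ins for what an EM fit would produce:

```python
import numpy as np

# hypothetical covariance matrices of two fitted mixture components
# over 3 features (x, y, intensity); real values come from the EM fit
sigmas = np.array([
    [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.02]],  # tight cluster
    [[0.10, 0.0, 0.0], [0.0, 0.12, 0.0], [0.0, 0.0, 0.30]],  # spread-out cluster
])
# sum every entry of each covariance matrix and take the component
# with the largest total as the skin class
totals = sigmas.sum(axis=(1, 2))
skin_class = int(totals.argmax())
```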
Below I include the code to do all of the above:
%------------------------------------------------------------------------
% gather features
% inputs assumed from earlier in the script: imgDouble (double image
% scaled to [0,1]), noClasses (2 here) and scale (the resize factor)
[rows, cols, colors] = size(imgDouble);
% normalized (x, y) pixel coordinates as the first two features;
% the grid must be rows-by-cols so it lines up with the reshaped
% color channels below
[X,Y] = meshgrid(linspace(0, 1, cols), linspace(0, 1, rows));
X = reshape(X, rows*cols, 1);
Y = reshape(Y, rows*cols, 1);
imgFeatures = [X Y];
for i = 1 : colors
    % median-filter each channel to suppress noise before clustering
    imgDouble(:,:,i) = medfilt2(imgDouble(:,:,i), [5 5], 'symmetric');
    imgFeatures = [imgFeatures reshape(imgDouble(:,:,i), rows*cols, 1)];
end
%------------------------------------------------------------------------
% fit a Gaussian mixture with EM (gmdistribution.fit; called fitgmdist
% in newer MATLAB releases), restarting 10 times to avoid bad local optima
objGM = gmdistribution.fit(imgFeatures, noClasses, 'Replicates', 10);
% find which of the classes is the hand
Sigma = zeros(noClasses,1);
for i = 1 : noClasses
Sigma(i) = sum(sum(objGM.Sigma(:,:,i)));
end
[~, skinClass] = max(Sigma);
% use gaussian mixtures to classify pixel features
[IDX,nlogl,P] = cluster(objGM,imgFeatures);
% build a binary mask from the cluster labels and clean up rogue pixels
IDXimg = reshape(IDX, rows, cols);
IDXimg = IDXimg == skinClass;
IDXimg = bwmorph(IDXimg, 'erode', 1);
IDXimg = bwareaopen(IDXimg, round(100 / scale)); % area must be an integer
% zero out everything outside the mask
imgMasked = imgDouble;
imgMasked(repmat(~IDXimg, [1 1 colors])) = 0;
%------------------------------------------------------------------------
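For readers without MATLAB's Image Processing Toolbox, the mask cleanup above (one erosion via bwmorph, then bwareaopen to drop small components) can be approximated in plain NumPy. This is a rough sketch, not an exact reimplementation of either toolbox function:

```python
import numpy as np

def erode(mask):
    # keep a pixel only if its full 3x3 neighbourhood is inside the
    # mask (roughly bwmorph's 'erode' with the default structuring element)
    p = np.pad(mask, 1, constant_values=False)
    out = np.ones_like(mask)
    H, W = mask.shape
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out &= p[1 + dy:1 + dy + H, 1 + dx:1 + dx + W]
    return out

def remove_small(mask, min_size):
    # drop 4-connected components smaller than min_size (like bwareaopen)
    visited = np.zeros_like(mask)
    out = mask.copy()
    H, W = mask.shape
    for i in range(H):
        for j in range(W):
            if mask[i, j] and not visited[i, j]:
                stack, comp = [(i, j)], []
                visited[i, j] = True
                while stack:
                    y, x = stack.pop()
                    comp.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W and mask[ny, nx] and not visited[ny, nx]:
                            visited[ny, nx] = True
                            stack.append((ny, nx))
                if len(comp) < min_size:
                    for y, x in comp:
                        out[y, x] = False
    return out

# demo: a solid "hand" blob plus one rogue pixel
mask = np.zeros((10, 10), dtype=bool)
mask[2:7, 2:7] = True   # 5x5 blob
mask[0, 9] = True       # rogue pixel
cleaned = remove_small(erode(mask), min_size=4)
```

Erosion shrinks the 5×5 blob to its 3×3 interior and wipes out the isolated pixel; the component filter then keeps only the surviving blob.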
I also used a bit of active-contour cleanup that I don't include here. Below are some of the resulting images, which are very satisfying and can be used to bootstrap any system for even better results: