Sunday, December 11, 2011

List all files

Suppose you want to list all files that match a given regular expression in a directory and its subdirectories. Maybe you want to gather your training samples, images, text files or whatever.

Here is a small code snippet that does all the dirty work and gives you back a cell array with the full paths of the matching files.

function [files] = ListFiles(directory, regex)
% LISTFILES - returns all files located in directory and its subdirectories
%             whose full path matches the regular expression regex.
%
% For example, ListFiles('.', '\.avi$') returns all avi files in the
% current directory and all of its subdirectories.
%%
    % return an empty list if directory is empty or not a character string
    if isempty(directory) || ~ischar(directory)
        files = {};
        return;
    end
    
    % get all files and subdirectories in directory
    allDirectoryEntries = dir(directory);
    
    % number of entries (files and directories) in the current directory
    n = length(allDirectoryEntries);
    
    % init empty files list and directories list
    files = cell(n,1);
    directories = cell(n,1);
    k = 0;
    d = 0;
    
    for i = 1 : n     
        file = allDirectoryEntries(i);
        name = fullfile(directory, file.name);
        % queue directories for recursion (skipping '.', '..' and hidden
        % ones); keep files whose full path matches regex
        if file.isdir && file.name(1) ~= '.'
            d = d + 1;
            directories{d} = name;
        elseif ~file.isdir && ~isempty(regexp(name, regex, 'once'))
            k = k + 1;
            files{k} = name;
        end
    end
    
    % keep only the filled cells of files (as a row, for concatenation)
    files = files(1:k)';
    
    % keep only the filled cells of directories
    directories = directories(1:d);
    
    % run ListFiles recursively on each subdirectory and append its matches
    for i = 1 : d
        files = [files ListFiles(directories{i}, regex)];
    end
end
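
One thing to keep in mind: the function hands the full path of each file to regexp, and an unescaped dot in a regular expression matches any character. A pattern such as '.*.avi' therefore matches every path that contains "avi" after at least one other character, including directory names. The sketch below compares it with an escaped, anchored pattern; the three paths are made up purely for illustration.

%------------------------------------------------------------------------
    % Hypothetical paths, used only to illustrate the matching
    paths = {'./clips/a.avi', './clips/navi.txt', './avi_notes/readme.txt'};

    loose  = '.*.avi';   % unescaped dot: any character followed by "avi"
    strict = '\.avi$';   % literal dot, anchored to the end of the path

    for i = 1 : numel(paths)
        isLoose  = ~isempty(regexp(paths{i}, loose,  'once'));
        isStrict = ~isempty(regexp(paths{i}, strict, 'once'));
        fprintf('%-24s loose=%d strict=%d\n', paths{i}, isLoose, isStrict);
    end
%------------------------------------------------------------------------

The loose pattern matches all three paths, while the strict one matches only './clips/a.avi'.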


And a simple example that gathers all image files in inputDirectory and its subdirectories:

%------------------------------------------------------------------------
    % Find all image files in input directory
    files = ListFiles(inputDirectory,'\.(jpg|jpeg|tif|tiff|png|gif|bmp|PNG)$');
%------------------------------------------------------------------------
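
Once you have the list, you will usually loop over it. Here is a minimal sketch of that, assuming ListFiles is on the MATLAB path; using pwd as the input directory is just a placeholder for your own folder.

%------------------------------------------------------------------------
    % Placeholder input directory (assumption): the current folder
    inputDirectory = pwd;
    files = ListFiles(inputDirectory, '\.(jpg|jpeg|png)$');

    % Split each full path into folder, base name and extension
    for i = 1 : numel(files)
        [folder, base, ext] = fileparts(files{i});
        fprintf('%s%s (in %s)\n', base, ext, folder);
    end
%------------------------------------------------------------------------

If nothing matches, files is an empty cell array and the loop body simply never runs.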