Matlab Svmlight Interface


The most widely used SVM (support vector machine) implementations are currently svmlight and libsvm. While libsvm comes with interfaces for many different programming languages, svmlight (svm-light perf) has the advantage that you can specify the loss function. I have very disproportionate classes in my training data, so optimizing for the area under the receiver operating characteristic (ROC) curve, the AUC, brings a great improvement.

I couldn't find a good Matlab interface, so I wrote one. Note that it is functional, but quite simple:
function Y=svmlight(training,test,params)
% (very) simple wrapper for svmlight
% Writes matrices in sparse format to data file that can be used by svmlight.
% Columns are variables, rows are observations.
% It is assumed that the first column of the matrix is the target. Targets are elements of {-1,1}.
%
% These steps are made:
% 1. output matlab matrix to text file
% 2. format text file for svm (awk)
% 3. create classification model (svm_perf_learn)
% 4. apply classification model (svm_perf_classify)
%
% All files are written in the /tmp/ directory
%
% Example:
% Y=svmlight(data(traininds,:),data(testinds,:),'-c 1 -w 3 -l 10 ');
% (if you set parameters for svmlight don't forget to include the learning options!)
%
% (c) Benjamin Auffarth, 2008
% licensed under CC-by-sa (creative commons attribution share-alike)
if nargin<3
   params='-c 1 ';
end
trainfile=sparse_write(training);
[s,w]=unix(['svmlight/svm_perf_learn ' params trainfile '.svm2 ' trainfile '.model']);
if s
   disp('error in executing svm-light!'); disp(w); % show the command output
   error('svm_perf_learn not found or returned error');
end
testfile=sparse_write(test);
[s,w]=unix(['svmlight/svm_perf_classify -v 0 ' testfile '.svm2 ' trainfile '.model ' testfile '.dat']);
if s
   disp('error in executing svm-light!'); disp(w); % show the command output
   error('svm_perf_classify not found or returned error');
end
Y=dlmread([testfile '.dat']);
end

function fname=sparse_write(M)
[a,fname]=unix('date +/tmp/_svm_%F_-%H:%M_%S%N');
fname=fname(1:end-1); % get rid of newline character
dlmwrite([fname '.svm1'],M,'delimiter',' ');
unix(['awk -F" " ''{printf $1" "; for (i=2;i<=NF;i++) {printf i-1":"$i " "}; print ""}'' ' fname '.svm1 > ' fname '.svm2']);
end


Some Explanations

You need awk installed. If you are working in a Windows environment, you can get awk through Cygwin or Wubi, or install a native Windows build of awk.
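To see what the awk step does, here is a minimal sketch that runs the same awk program as sparse_write on two made-up rows. The values and the helper name to_svmlight are illustrative assumptions, not part of svmlight. The first field is kept as the target and every later field becomes an index:value pair.

```shell
# Same awk program as in sparse_write, wrapped in a function for illustration.
# to_svmlight is a hypothetical helper name, not part of svmlight.
to_svmlight() {
  awk -F" " '{printf $1" "; for (i=2;i<=NF;i++) {printf i-1":"$i " "}; print ""}'
}

# Two made-up observations: target first, then three feature values.
converted=$(printf '1 0.5 0 2\n-1 0.1 0.9 0\n' | to_svmlight)
printf '%s\n' "$converted"
```

Note that zero entries are written out as well (for example 2:0), so the file follows the index:value layout but is not actually sparse.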

Temporary files are generated and stored in the /tmp/ directory. If you are on Windows, you might want to change that to ".".

You need svm-light perf installed. The function searches for it in the svmlight subdirectory. You might want to adapt that to point to your local installation of svmlight.

Enjoy. Please leave comments below.

13 Responses to "Matlab Svmlight Interface"

Please don't copy-paste this code if you are in a windows environment. Download it from snipplr.

This is exactly what I need. Thank you very much!

@Dylan: You are welcome. Glad it helped.

Thanks so much. This information could help me fix something on my computer. This kind of blog can help some people.

Just for my own reference, you can do this without matlab using standard *nix commands.

To convert from csv to svmlight format:
awk -F"," '{printf $1" "; for(i=2;i<=NF;i++) {printf i-1":"$i " "}; print ""}' hepatitis.data > hepatitis.svmlight

Separate into training and test if needed:
head -n 155 hepatitis.svmlight > hepatitis.svmlight.training
tail -n 155 hepatitis.svmlight > hepatitis.svmlight.test

Run training and classification:
./svm_learn -c 1 -# 1 -w 3 -l 10 hepatitis.svmlight.training .model
./svm_classify hepatitis.svmlight.test .model .outputfile
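
One caveat on the split above: head -n 155 and tail -n 155 return overlapping rows whenever the file has fewer than 310 lines. A minimal sketch of a non-overlapping split follows, assuming a made-up 10-line file and an arbitrary cut after line 7. The file name and the cut point are placeholders, not values from the hepatitis data.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
data="$workdir/example.svmlight"   # hypothetical file name

# Ten made-up lines standing in for a converted data set.
for i in 1 2 3 4 5 6 7 8 9 10; do
  printf '1 1:%s \n' "$i"
done > "$data"

n_train=7                          # arbitrary cut point for this sketch
head -n "$n_train" "$data" > "$data.training"
# tail -n +K starts at line K, so this takes everything after the training rows.
tail -n +"$((n_train + 1))" "$data" > "$data.test"

n_training_lines=$(wc -l < "$data.training" | tr -d ' ')
n_test_lines=$(wc -l < "$data.test" | tr -d ' ')
echo "training: $n_training_lines lines, test: $n_test_lines lines"
```

With this form the two files together contain every line exactly once, whatever the length of the input.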

Hi, would you please describe how I can use this code for a KNN classifier? Or do you have code for the ROC of KNN on the Glass data set? I need it for part of my project. Please help me, thank you.

Thank you. Will try this.

This is for SVM, not for KNN. For KNN you can use Matlab's knn functions. For ROC/AUC statistics see my post on ROC.

Can you give an example of use?
For instance, which format does the training parameter take?

Thanks!!

training is a matrix. See the comments in the function:
% Columns are variables, rows are observations.
% It is assumed that the first column of the matrix is the target. Targets are elements of {-1,1}.

What is the function sparse_write? Matlab doesn't recognize it.

Just a question about installing svmlight: which directory should it go in (maybe Matlab's work folder)? And could you please explain exactly what the role of awk is?

I tried the native mex interface for svmlight, but the performance is awful (many orders of magnitude slower than the libsvm interface) and I suspect it is due to the mex interface itself. Not having the time to redevelop it myself, I came searching for another - this is something along the lines of what I would have tried to do myself, but now I don't need to understand the svmlight doc format. I still feel dirty doing it this way though.

Many thanks.

You are free to include information from this article on your own site if you provide a backlink. You can use the following markup:
<a href="http://www.myoutsourcedbrain.com/2008/11/matlab-svmlight-interface.html">Matlab Svmlight Interface</a>