Aug 30, 2008

Ubuntu for PhD Students

Please see the updated version of this article about installation of software for research in Ubuntu 12.04.

This post is about ubuntu software installation for scientific computing and research in general. I wrote this post some time ago (with hardy heron in mind), but I update it every now and then with new software and links. It is updated for intrepid ibex (ubuntu 8.10) and most should still be valid for 9.04 (Jaunty Jackalope).

I use ubuntu at my work mainly for searching scientific articles, statistical computing, and document typesetting in latex. After a default ubuntu installation, I need a lot more...

I need more document viewers, more web browsers (in case mozilla shows errors), more editors, the ssh server, vim, emacs, the gnu toolchain, gnu screen, subversion (svn), git, many texlive latex packages, jabref and bibutils for converting to and from other reference formats, GNU R, GNU octave, python, scipy, perl, java RE and SDK, maxima, gnu scientific library, blas, graphviz, gnuplot, pdfedit, file converter from dos to unix (tofrodos), inkscape, plugins to play media contents, etc.

Two weeks ago my new hard disk at work failed and I had to reinstall my system. Customizing the system turned out to be very fast. I install all the packages I need, install acrobat reader (which needs ia32-libs), then change the frequency of system updates to make things more comfortable. The whole procedure takes a few minutes of attention.

>> sudo su

>> apt-get update && apt-get upgrade && apt-get install openssh-server build-essential gcc gcc-doc g++-multilib g++-4.2-multilib gcc-4.2-doc libstdc++6-4.2-dbg apt-file gcj gsl-bin gsl-doc-pdf gsl-ref-html libgsl0-dev libgsl0-dbg libgsl0ldbl glibc-doc libblas-dev maxima maxima-share subversion subversion-tools git screen $(aptitude search R | grep -v '^i' | awk '{print $2}' | grep '^r-') octave $(aptitude search texlive | grep -v '^i' | awk '{print $2}') untex luatex latex-xft-fonts perl fontforge context-nonfree context-doc-nonfree dvipng imagemagick graphviz gnuplot-x11 gnuplot-doc gnuplot libatlas3gf-base sun-java6-bin sun-java6-jdk sun-java6-jre sun-javadb-client sun-javadb-common sun-javadb-core axel kdevelop kate kile kile-i18n-ca vim-gtk vim vim-addon-manager vim-common vim-doc vim-latexsuite latex2html latex-beamer xpdf writer2latex jabref bibutils hevea hevea-doc wordnet cups-pdf djvulibre-bin djvulibre-plugin pdfedit inkscape scribus pdf2djvu pdf2svg python2.5 ipython python3-dev python3-all vim-python python2.5-dev python-scipy unrar tofrodos galeon epiphany-browser epiphany-extensions scribes gnochm lyx claws-mail claws-mail-i18n claws-mail-doc claws-mail-tools libqt4-core libqt4-gui flashplugin-nonfree gstreamer0.10-ffmpeg gstreamer0.10-plugins-ugly gstreamer0.10-plugins-ugly-multiverse gstreamer0.10-plugins-bad gstreamer0.10-plugins-bad-multiverse rar ubuntu-restricted-extras regionset soundconverter gxine libxine1-ffmpeg ia32-libs libstdc++5 libmms0
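
The two $( ... ) parts of that command are the only non-obvious bit: each one expands to the names of all matching packages that are not installed yet. Here is a minimal sketch of the same filter run on made-up aptitude output, so you can see what it selects. The function name missing_packages and the sample lines are my own illustration, not something aptitude provides.

```shell
# Keep only packages that are not installed (status column does not start
# with "i") and whose name begins with the given prefix.
# Reads "flags name - description" lines, as printed by `aptitude search`.
missing_packages() {
  grep -v '^i' | awk '{print $2}' | grep "^$1"
}

# Hypothetical sample of `aptitude search` output.
sample='i   r-base          - GNU R statistical computation and graphics
p   r-cran-mass     - GNU R package for modern applied statistics
p   rar             - Archiver for .rar files
p   r-cran-lattice  - GNU R package for Trellis graphics'

# Prints the two r-cran-* names; r-base is installed and rar has no "r-" prefix.
printf '%s\n' "$sample" | missing_packages 'r-'
```

Running the filter on its own before the real install is a cheap way to check which packages the big command is going to pull in.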

I installed the 64bit version of ubuntu. If you use the 32bit (i386) version you don't need ia32-libs, which allows you to run 32bit applications on the 64bit platform. libqt4-core and libqt4-gui are for skype, libstdc++5 is for RealPlayer, and libmms0 is for the windows media player codec.

For playing movies and other media content, for example DVDs, you need the libraries gstreamer0.10-ffmpeg gstreamer0.10-plugins-ugly gstreamer0.10-plugins-ugly-multiverse gstreamer0.10-plugins-bad gstreamer0.10-plugins-bad-multiverse ubuntu-restricted-extras regionset gxine libxine1-ffmpeg (included above). See also this howto. In order to read encrypted DVDs you also need to execute a script that installs another library, which is not included in ubuntu for legal reasons:

sudo /usr/share/doc/libdvdread3/

Note that for installing some of the GNU R libraries via install.packages you need the development files for R (in ubuntu this is the r-base-dev package). I just install every R package that is not yet installed.

For acrobat reader download the .deb package and execute sudo dpkg -i --force-architecture AdobeReader_enu-*.deb. (You need the --force-architecture switch on 64bit platforms.)

For skype download the package, and if you run a 64bit system use the --force-architecture switch with dpkg.

Opera is one of the fastest web browsers. You can download it here.

What I find annoying in firefox is (among other things) that you have to open an extra window to access downloads. The Download Statusbar integrates them into the main window. Spellchecker addons can be quite useful when you are writing mails.

To get the shortcut alt-shift-up for switching windows in gnome (similar to MacOS expose), I set preferences -> appearance -> visual effects to "normal" (you need to have compiz enabled for that).

In order not to get vexed by daily updates I change the update frequency in synaptic (configuration -> repositories -> updates: once a week).

That's it. For still more you might want to see the UbuntuScience community page.

In the next posts I will introduce smart bookmarks for faster web searches, explain how to synchronize web browser bookmarks on different work stations, personalize the vim editor, set up a revision control repository, and automatically synchronize data.

Enjoy. Please leave a comment below for questions and suggestions.

N800 - ebook reader, internet, and music

Some time ago I saw a guy in the university who was reading a book on an ebook reader. I later looked it up: his reader cost about 500 euros, it only permitted reading documents, and anyways it was too bulky to carry around.

I wanted to extend my computing experience to the handheld, so I searched for better, cheaper, and smaller devices, and I found one: the N800. It has a screen on which you can read pdfs, it has wireless, and it plays music and video. You can listen to lastfm, talk on skype, take photos, and chat. And it runs linux.

On the photo you see the N800 on top of the apple speaker system, playing music.

In Europe it is not that cheap, but if you order in the US...

At dell it cost me:
N800: $249.99
2 GB Secure Digital Memory Card: $24.99
IVA: $16.50
Discounts: $55

Shipping to Europe costs an import fee. Having friends in the US can save the fee. Add 75 eurocents for a power converter at the Chinese store around the corner ;).

This totals about 170 euros. Sweet.

Once you get it, you want to set it up and see what it can do. A warning: be careful with the settings for email. They are not easy to change once you got them wrong initially!

In the administration tool there are few software packages available by default, so you have to search the internet for software; basically, you just click on install and the N800 does the rest on its own. However, the fastest way of adding software is using the administration tools. This guide explains a nice hack for adding many more software repositories in one go, so you have more software to choose from in the menu.

I installed many applications including xterm, openssh, evince, vim, vegalume, canola, mplayer.

The operating system didn't seem to recognize the flash memory on its own, so I had to set it up, doing some magic. Let's see how to do it:
You need to become root in order to do this. At the terminal:
> sudo gainroot
> passwd -d root
We set an empty root password, so we don't have to remember this gainroot thing ;) (passwords are not really safe anyway with the automatic word completion), and the next time we need to become root we can just type su.

Now edit the /etc/fstab file:
> vi /etc/fstab
We add a line at the end (going to the last line and pressing o).
/dev/mmcblk0p1 /media/mmc2 vfat rw,noauto,noexec,nodev,nosuid,utf8,uid=29999 0 0

Press escape, then save the file and leave with :wq. You can now mount the flash memory by its mount point (mount -a would skip the entry, because it is marked noauto):
> mount /media/mmc2

On reboot the flash memory should be available.
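
In case the fstab line above looks cryptic: it has six whitespace-separated fields (device, mount point, filesystem type, options, dump flag, fsck order). The sketch below just pulls the line apart. fstab_field and fstab_has_option are helper names I made up for this illustration. As far as I know, the standard Linux mount skips entries marked noauto when called as mount -a, so those have to be mounted by name.

```shell
# Print field number $2 (1 to 6) of the fstab line given in $1.
fstab_field() {
  printf '%s\n' "$1" | awk -v n="$2" '{ print $n }'
}

# Succeed if the comma-separated options (field 4) contain option $2.
fstab_has_option() {
  case ",$(fstab_field "$1" 4)," in
    *",$2,"*) return 0 ;;
    *) return 1 ;;
  esac
}

# The line added to /etc/fstab in this post.
line='/dev/mmcblk0p1 /media/mmc2 vfat rw,noauto,noexec,nodev,nosuid,utf8,uid=29999 0 0'

echo "mount point: $(fstab_field "$line" 2)"
if fstab_has_option "$line" noauto; then
  echo "noauto is set"
fi
```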

Fast Internet Searches with Smart Bookmarks

Smart bookmarks are similar to other browser bookmarks. The idea of smart bookmarks is that you enter a keyword and some search text in the location bar and on pressing enter you get the result from the search engines that you defined for the keyword. That sounds easy and fast and it really is. This post shows how to define a smart bookmark and gives several examples of smart bookmarks that can help to speed up internet searches and dictionary lookups.

An example: You are on the phone with a friend from Mongolia who tells you about his country. You don't want to seem stupid and need to know the name of the president of Mongolia, and fast. You hit ctrl-L to get to the location bar and type w Mongolia. The wikipedia article on Mongolia opens and on the left-hand panel you see the president's name among the key data on Mongolia. This works because you earlier defined a smart bookmark with the shortcut w that corresponds to wikipedia.

How to Create a Smart Bookmark?

Smart bookmarks work in Firefox, Epiphany, Galeon, Opera, Google Chrome (via search engines), and others. Defining them can be done in the following steps:
1. go to the search engine or site you want to define as a smart bookmark (say wikipedia article on Mongolia)
2. bookmark the site (in firefox that's control-D)
3. edit the bookmark and add the shortcut: go to organize bookmarks (in firefox control-shift-O), find the new bookmark, replace the search term in its location with %s, and enter a keyword, for example w.

You can use the address of any search engine you want to use. Just search for something and then bookmark the page. In the bookmark, replace your search term with %s. The keyword should be something you can easily remember.
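
What the browser does with such a bookmark can be sketched in a few lines of shell: look up the template for the keyword and put the search text where the %s is. This is only an illustration, not how any browser implements it. The function names are made up, the two URL templates are my own assumptions (they are not taken from this post), and the encoding is simplified to turning spaces into +.

```shell
# Hypothetical keyword table: prints the URL template for a keyword.
bookmark_template() {
  case $1 in
    w) printf '%s\n' 'https://en.wikipedia.org/wiki/Special:Search?search=%s' ;;
    g) printf '%s\n' 'https://www.google.com/search?q=%s' ;;
    *) return 1 ;;
  esac
}

# Replace every %s in the template ($1) with the query (remaining arguments).
# Simplification: only spaces are encoded (as +); other characters pass through.
expand_bookmark() {
  template=$1
  shift
  awk -v t="$template" -v q="$*" 'BEGIN {
    gsub(/ /, "+", q)
    out = ""
    while ((i = index(t, "%s")) > 0) {
      out = out substr(t, 1, i - 1) q
      t = substr(t, i + 2)
    }
    print out t
  }'
}

# What typing "keyword search text" in the location bar amounts to.
smart_bookmark() {
  keyword=$1
  shift
  template=$(bookmark_template "$keyword") || return 1
  expand_bookmark "$template" "$@"
}

smart_bookmark w Mongolia
```

Calling smart_bookmark g president of Mongolia works the same way, with the spaces in the query turned into + signs.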

Ideas for Smart Bookmarks

Here you can see some engines I use:

  1. google:
  2. google scholar:
  3. Leo English-German dictionary:
  4. Leo Spanish-German dictionary:
  5. wikipedia:

I chose w for wikipedia, g for google, s for scholar, es for Spanish-German, and en for English-German.

Even though you probably have a search box in your web browser, the google smart bookmark is useful because it's fast and because of the modification at the end, which provides near real-time results: search results are restricted to sites indexed less than 24 hours ago. I refer you to the article where I found this hint for more information on how to get more recent google search results.

Translation Engines as Smart Bookmarks

For an overview of translation/dictionary services see the comparison in this wikipedia article. Of the compared services, Google translate and worldlingo support the most language pairs.

Some languages are harder to find. For Catalan, I had trouble finding a free dictionary that could be used with smart bookmarks. However google's translation service (google translate) also does Catalan. You can use these urls:

Replace %from and %to with the language codes corresponding to your language pair. To use the yahoo translation service (babelfish) use this url:

You find more explanation and more examples at the mozilla site.

By the way, there are also a variety of firefox plugins for translating text, for example the Babelfish and gTranslate plugins.

In Google Chrome you would right click on the address bar, edit search engines, define a search engine, and choose a keyword that you want to use.

Do you already know who's the president of Mongolia?

Enjoy. Please leave a comment below for questions and suggestions.

How to Build a Beowulf? - Hardware Part

A beowulf cluster is a cluster of computers connected over a network, running linux, and dedicated to computation. This article is the first part of a series on how to build a beowulf. It describes the hardware assembly and costs.

At the university we need big computational power to do simulations, but we don't want to spend huge money on one big and expensive computer. We rather want many cheap computers and have them compute in parallel. The computers are connected in a network and run Linux. This architecture is called a beowulf.

In this post I will describe how we built our beowulf cluster. I'll explain which hardware we chose and some experiences assembling it. In later posts I will describe cluster installation, cloning of configurations, and running parallel processes in matlab and GNU R using MPI and PVM.

I spent a lot of time comparing different hardware. I was fortunate to find a store specialized in servers, where they helped me with good advice. We bought 8 computers with these parameters:
  • Intel Core 2 Quad Q6600 2.4 GHz FSB1066 8MB
  • Chipset Intel X38 / Intel ICH9R
  • 4 GB RAM DDR3 1066 (2 x 2 GB)
  • 2 x PCI Express x16, 1 x PCI Express x1, 2 x PCI-X, and 1 PCI
  • Marvell 88E8056 Dual Gigabit LAN controller
  • Realtek ALC882M 7.1 channels (sound)
  • 6 USB 2.0 ports and 1 IEEE1394
  • VGA 512MB GeForce 8400 GS PCI-e
  • 160 GB SATA II hard disk, 3 Gb/s
  • DVD R/W, Multicard reader/writer
  • 19" rack computer case, 4U, with front lock
  • 550W Server Guru

They cost 923 euros each.

To operate 8 computers efficiently we need a monitor... or better a KVM (keyboard video mouse) with 8 ports, which cost 850 euros.

We also need to connect all the computers among themselves: A switch, 16 ports, 10/100/1000, 162 euros.

We want to put all the hardware somewhere where it is safe and has good conditions: a rack. This is our rack. It came like Ikea from the shop, even including easter egg screws that you have to search for ;). It holds 42 units of 19'' and cost us 500 euros.

Then came a surprise for us: we also needed cables, screws(!), and multi-outlet power strips. Rails allow you to slide your computers in and out like drawers. Additional cost: about 700 euros!

In Spain, they mostly don't give you the real costs. There is IVA (impuesto al valor agregado=value added tax), 18%. About 1500 euros.

Total cost of the beowulf: about 11,090 euros.
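
For the record, here is how the total adds up, as a small sketch using only the prices quoted above. The tax is taken as the rounded 1500 euros from the text rather than recomputed from a rate, so the result is approximate. The variable names are mine.

```shell
# Prices in euros as listed in this post (all before tax).
nodes=$(( 8 * 923 ))   # eight computers at 923 euros each
kvm=850
switch=162
rack=500
extras=700             # cables, screws, power strips, rails

subtotal=$(( nodes + kvm + switch + rack + extras ))
tax=1500               # rounded figure from the text
total=$(( subtotal + tax ))

echo "before tax: $subtotal euros, with tax: about $total euros"
```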

We bought computers, we bought a rack, we bought everything we need, but we are in Spain and we are temporarily in an old building. What had to happen? The electricity lines didn't have the capacity to power the computers! While a pile of cardboard boxes started gathering dust, we waited impatiently (especially me) for more than a month. Finally, the technicians fixed the power lines so we have the capacity we need.

So, we mounted everything and the magical moment was starting up all computers and seeing they were running without power failures or explosions.

As you can see on the photo, there is a lot of light in the room. On top of that, the air conditioning in the room doesn't work and they have to fix it. Fortunately (unfortunately), August with Barcelona's heat is about to end and it is already getting cooler. Maybe in January we will move to another building and hopefully get a better room, where we can run the computers during next summer.

In other posts, I'll explain basics of cluster configuration and parallelization. Until then please enjoy with me a photo of the computer hardware mounted into the rack. You can see the 8 computers flanking from two sides the KVM which is in the middle, behind it the switch. The cluster stands now next to my desk, because in our office both air conditioning and ethernet work.

Book recommendations:
Robert G. Brown's "Engineering a Beowulf-style compute cluster" [online at duke university] (2004) gives you the essentials of the hardware side.

High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI (Nutshell Handbooks) by Joseph Sloan

Configuration of the Vim Editor

Vim is a fast and feature-rich text editor. It offers syntax highlighting for many programming and scripting languages, editing by regular expressions, and a lot more. In this post I first give a short introduction to vim, then I show how to enable some of its powerful features by default when vim starts.

Introduction to Vim

Vim is free as in beer and as in speech. It was created as an extended version of the vi editor, with many additional features designed to help with editing program source code; the name Vim stands for Vi IMproved. Vim is cross-platform, which means that it works on many different operating systems, including Windows, Linux, and Mac.

Handling vim is not completely intuitive, however vim can make you very fast when you edit text, as this screencast demonstrates.

Vim gives you great power, which comes at the price of a steep learning curve. There is a tutorial (run vimtutor in a terminal) and online documentation, and you can find many cheatsheets on the internet. For a more accessible interface to vim, there is Cream, a descendant of vim.

The following video offers you a first introduction to the vim editor.

Configuration (vimrc)

In this section I describe my basic configuration for vim. We want to enable some of the powerful features that Vim offers. The initial settings depend on your system. You can check them by opening the file $VIM/vimrc (on Linux this is usually /usr/share/vim/vimrc, type :echo $VIM within vim to find out where you have them).

I want searching with smart case matching, the mouse for marking text and moving the cursor, and filetype plugins. I want US American spell checking for txt, tex, and HTML files. I also want an extended status line and some WYSIWYG with latex, which means instant compiling and viewing by pressing ctrl-k.

nocompatible: most important option. nocompatible prevents vim from emulating the original vi's limitations.
autoindent: tells vim to use the previous line's indent level to set the indent level of new lines.
smartindent: lets vim make an educated guess based on the content of the previous line (works for programming code).
tabstop: defines how many spaces correspond to a tabulator.
showmatch: tells vim to show you matching opening or closing brackets when you step on a bracket with your cursor.
incsearch: vim will search for text as you enter it.
syntax on: syntax highlighting on
filetype plugin indent on: load filetype plugins/indent settings
ignorecase: case insensitive
smartcase: if there are capital letters, become case sensitive
wildmenu: vim's command line completion
clipboard=unnamed: Vim yanks into the system clipboard and you can paste into other applications (in linux by middle click)

Try :help options in Vim to see all available options.

The spellfile option is for adding words to your user spell dictionary (move cursor over word and press zg).

Here goes my ~/.vimrc (exported to html by gvim):

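" should come first: it changes other options as a side effect
set nocompatible
" case insensitive search, unless the pattern contains capital letters
set ignorecase smartcase
" search while typing
set incsearch
" insert spaces instead of tabs, two per indent level
set expandtab
set shiftwidth=2
set smarttab autoindent
" mark literal tabs
syntax match Tab /\t/
hi Tab gui=underline guifg=blue ctermbg=blue
set showmatch
" ru is short for ruler: show the cursor position
set ru
set mouse=a
filetype indent on
set clipboard=unnamed
filetype plugin on
set wildmenu
" spell checking for text, latex, and HTML files
autocmd BufNewFile,BufRead *.txt,*.tex,*.html,README set spell spelllang=en_us
syntax on
" note: while paste is set, vim switches off autoindent and smarttab
set paste
set statusline=%F%m%r%h%w\ [FORMAT=%{&ff}]\ [TYPE=%Y]\ [ASCII=\%03.3b]\ [HEX=\%02.2B]\ [POS=%04l,%04v][%p%%]\ [LEN=%L]
" always show the status line
set laststatus=2
" disable ex mode
:map Q <Nop>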

I especially like the status line. (Note that ^K has to be entered within vim by pressing ctrl-k three times, and ^M by pressing ctrl-k once and ctrl-m twice.) I found the status line in the book "Hacking Vim" by Kim Schulz, which explains more about how to personalize vim. When you are programming or scripting and port code from Windows to Unix, it is important to convert Windows line endings (carriage return plus linefeed, CR+LF) to Unix ones (LF only), otherwise you'll get errors. The last part of the status line therefore issues a warning in red if the file format is not unix. I found this hack in an article about conversion between unix and windows with vim.

The last line disables the ex mode.
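
If you want to check or convert line endings outside of vim, it can be sketched in a few lines of shell. This is only an illustration with made-up function names (has_cr, crlf_to_lf). It removes every carriage return, not just those directly before a linefeed, so the tofrodos package mentioned in the first post is the more careful tool.

```shell
# Succeed if standard input contains at least one carriage return.
has_cr() {
  grep -q "$(printf '\r')"
}

# Copy standard input to standard output with all carriage returns removed.
crlf_to_lf() {
  tr -d '\r'
}

# Example: two lines with Windows line endings come out with Unix ones.
printf 'one\r\ntwo\r\n' | crlf_to_lf
```

For a real file, something like crlf_to_lf < script_dos.sh > script_unix.sh would do, where both file names are placeholders.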

Further Reading:
In this excellent blog post, the author gives some very useful hints, including code folding and multiple clipboards. Also, you might want to read an introduction to personalization of vim.

More great vim commands you can find at CommandLineFu.

Image Credit:
vim screenshot at wikipedia.

Introduction to Version Control with Subversion

Subversion is version control software: it stores the changes you make to files, so you can revert edits, restore deleted files, and much more. In this post I will list the most important subversion commands, from creating a repository on a remote server to committing and updating locally on the machine you are working on. In another post I explain how to synchronize your repository automatically, and in yet another I compare some commercial providers of remote backup servers.

About two weeks ago my hard disk failed. I got a new one the following day from our vendor and had my system configured very fast (see other post). Unfortunately I cannot say the same of my data. The server I had been using for backups had been crashing all the time and I was lazy in catching up again with my backups. In consequence I lost two posters I had created. I was very angry with myself. As a reminder to myself this post is about subversion.

Subversion can help you keep track of the changes you make to documents. You can work on a project with other people, manage versions of your files, undo changes, and much more. I use it so that I can make changes quickly and roll back at any time, and as a backup system. Many programmers use version control systems such as subversion, however it can be used with any text documents.

Subversion can be used completely from the command line, however there are several graphical interfaces that let you control your versions with a few clicks (examples). In any case, in order to use subversion it is important to know how it works. In this post I will explain step by step how to use subversion.

It is important to understand the concepts of client and server. They may refer to different computers, however the difference is rather conceptual. All revisions to your files are stored on the server. The server can be remote, this means you use a different computer. You work on a local computer, a client, make changes to your files, and then synchronize with the server.

Instead of a written introduction to subversion, you may watch this video to learn what subversion is.

Basic Subversion Commands

We will first need to create a repository on the server and then tell subversion that a directory on a local machine we are working on corresponds to that repository. After that we can start committing local changes and update our local files.

On your server you create a subversion project:
> svnadmin create path

Locally, you can start working, starting by checking out the project from the server:
> svn co svn+ssh://url/path local_directory

You need to give the full (absolute) path to the directory (otherwise you'll get "no repository found").

If you have your repository on the same computer, use a file:// URL instead:
> svn co file:///path local_directory
The local directory can be relative, but the path in the URL has to be absolute. For example:
> svn co file:///home/yourname/svnrep/ newprojectpath
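
Since the repository URL needs an absolute path, a tiny helper can build the file:// URL from whatever directory you give it. This is a sketch under simple assumptions: the directory exists and its path contains no characters that would need URL escaping (spaces, for example). repo_url is a name I made up.

```shell
# Print a file:// URL for a repository directory given as an absolute or
# relative path. Fails if the directory does not exist.
repo_url() {
  dir=$(cd "$1" 2>/dev/null && pwd) || return 1
  printf 'file://%s\n' "$dir"
}

# Example: the root directory gives file:///
repo_url /

# Intended use with the checkout command (not run here):
#   svn co "$(repo_url ../svnrep)" newprojectpath
```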

Now we can start working by creating files and editing them. We add the files to the version control:
> svn add filename

Finishing work (or always after having made some major changes), we commit the changes to the repository:
> svn commit -m "changelog"

Next time you start working, if you made changes from another computer, do an update:
> svn update

or (if you are not in the directory)

> svn update local_directory

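We can always list all subversion commands by typing svn help and get help on a command by typing svn help command.

svn status always gives you summary information about whether and how your local files have changed (for example M for modified and ? for files that are not under version control). When you update, svn prints a letter in front of each file it touched: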

A Added
D Deleted
U Updated
C Conflict
G Merged
E Existed

It is often useful to be reminded whether you need to add some more files to the version control:
> svn status | grep '^?'
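
To get just the file names out of that listing, the status column can be stripped off. Here is a minimal sketch run on made-up svn status output. The function name unversioned and the sample lines are only for illustration.

```shell
# Print the paths that svn reports as not under version control.
# Reads `svn status` output on standard input.
unversioned() {
  grep '^?' | sed 's/^?[[:space:]]*//'
}

# Hypothetical sample of `svn status` output.
sample='?       notes.txt
M       paper.tex
A       figures/plot.pdf
?       data/run1.csv'

# Prints notes.txt and data/run1.csv, one per line.
printf '%s\n' "$sample" | unversioned
```

In a real working copy you would run svn status | unversioned and then svn add the files you actually want to keep.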

I find symbolic links very useful. Subversion recognizes and can manage symbolic links, however if you want to replace a versioned file with a symbolic link, you need to set the svn:special property of the file (otherwise you get the error svn: Entry 'yourfile' has unexpectedly changed special status):
> svn propset svn:special on filename

Conversely, if you change a symbolic link into a regular file, delete the svn:special property:
> svn propdel svn:special filename

Those are the basics and you can get by knowing these commands, but of course there's much more functionality, which you can learn about from the official manuals.

Enjoy. Please leave a comment below for questions and suggestions.