Jan 30, 2009

Installing Microsoft Office 2007 in Ubuntu 8.10


This post chronicles my attempts to install Microsoft Office on my linux box in the office in order to be able to read and possibly edit files that people send me in Microsoft formats. Given the third-party format support now available for Microsoft Office (particularly Sun's ODF plugin) and the difficulties I encountered, I now think installing it should not be a priority for me.

My professor finally has the odf import/export extensions to Microsoft Office installed, so we can use open document format to exchange files.

Ok, WTF! So, why not? (carnal weakness)

It has been a long time since I used any Microsoft application on a computer at work. I use OpenOffice if I have to (to collaboratively edit texts with my supervisors) or LaTeX, which usually gives visually better results. Now what do you tell your boss when he says "You can't edit the template of my presentation? Oh, the problem is that you don't have Powerpoint!"

I didn't want to explain for the xth time that the problem was Powerpoint's proprietary format. Stupid me, instead I promised to install some version of Microsoft Office (we have licenses for 2007). Sometimes you do run into problems with OpenOffice, and I thought an installation of Microsoft Office might help me open and change files that are incompatible with OpenOffice.

My professor had asked me to install Linux under Windows for him (something like wubi, ubuntu under windows), so I guess that would be a quid pro quo.

Although there have been stories on slashdot detailing the ease of installation, these seemed rather hyped. At the time of writing, a bug in wine, the Windows compatibility layer for linux, makes it unpleasant to use Microsoft Office on linux.

I run Ubuntu 8.10, intrepid. A good explanation can be found on this page on programmerfish.

Ok, so the university has a license for Microsoft Office and I can install wine to run certain windows applications under linux. In the wine database I found that for my linux distribution, ubuntu 8.10, intrepid ibex, the office installation should run without a problem and that powerpoint and word should run, even though I might need the newer version of wine, 1.1.13.

It was really easy to set up Microsoft Office. I only had to change the configuration of wine a little. The problem is that it crashes every few seconds.


To always have the latest version of wine, add the winehq repository for intrepid:
> wget -q http://wine.budgetdedicated.com/apt/387EE263.gpg -O- | sudo apt-key add -
> sudo wget http://wine.budgetdedicated.com/apt/sources.list.d/intrepid.list -O /etc/apt/sources.list.d/winehq.list

After updating (apt-get update) you should always have access to the latest versions of wine.

To always have the latest version of OpenOffice (and hopefully fewer compatibility problems with Microsoft Office), follow the instructions here and here.

Make sure you install the OpenOffice help. In ubuntu the package name is openoffice.org-help-en-us.

Jan 22, 2009

Accessing Virtual Private Networks with Juniper

In an earlier post I lamented how difficult it was to connect to the computer at my workplace by ssh and the complete lack of support from the responsible IT department. In the same post I showed how to connect to the virtual private network at work via an ssh reverse tunnel. One meeting with a guy from the IT department later, there seem to have been some changes and it now seems to be actually possible to connect using the juniper network connect software (thanks to our group's own new IT guy for his help!). With juniper, your machine at home gets assigned an internal IP address and you can ssh into your machines at work.

I am using ubuntu intrepid amd64, and many people seem to have had problems installing juniper network connect on it. I tried for a long time, always getting the error "Failed to load the ncui library. Quitting." It's not too hard to find the ncui library. On a debian system you can use apt-file:
> apt-file search ncui

I found some great instructions to set it up correctly. This and above all this page gave some help and in the end I managed (though, like so often, it's a mystery to me what exactly made it work).


I recently finished the novel The Fountainhead by Ayn Rand. Looking at a quotation given at the beginning of another book, "Information theory, evolution, and the theory of life", I found that it nicely summed up parts of the Fountainhead story:

It must be considered that there is nothing more difficult to carry out nor
more doubtful of success, nor more dangerous to handle, than to initiate a
new order of things. For the reformer has enemies in all those who profit
by the old order, and only lukewarm defenders in all those who would
profit by the new order, this lukewarmness arising partly for fear of their
adversaries, who have the laws in their favor; and partly from the incredulity
of men, who do not truly believe in anything new until they have had actual
experience of it.

Niccolò Machiavelli (1469–1527), The Prince, Chapter 6.

Jan 14, 2009

The Word of God Illustrated

... with lego bricks. Foreigners, divorce, virginity, rape, bestiality, in the bible there's a solution to all of life's problems. Verbum Dei manet in aeternum.

An example found on youtube put together with music and sound effects.

Jan 9, 2009

Popularity of Python, R, and Matlab

Matlab has been used for years. In recent years, with the rise of linux, open source community projects such as python and GNU R have found increasing use, and a recent article in the New York Times wrote about the rise of R. I took the time to create graphs comparing trends in the popularity of these three numerical computing platforms. This could help in predicting which will be the future platform for scientific computing.

People use different software packages for data analysis, including excel, SAS, SPSS, or perl. I also have a post about C++ software for number crunching. I could have taken all these alternatives into account. However, I think the desiderata for a language should be these three:
  • It should allow rapid prototyping. This includes:
    • It should be a scripting language.
    • It should have a lot of available libraries.
  • It should be cross-platform (at least linux and windows) to be portable.
  • It should be fast.
Python (with SciPy and NumPy), R, and matlab meet these desiderata. All of them are cross-platform and scripting languages. All of them are reasonably fast and allow the easy integration of C, C++, and Fortran code to achieve more speed-up. There are a lot of libraries available especially for R and matlab (I am not knowledgeable about python libraries).

So, in this post I want to look at the popularity of python, R, and matlab. I first look at the numbers of citations in scientific publications, then I link to google trends to see visits to project websites.

To look at the scientific zeitgeist, I compiled numbers of citations from scholarly articles of the three platforms counting them in citeseerX and google scholar.

Matlab is the oldest software package. Matlab v. 1.0 was released in 1984. Guido van Rossum published version 0.9.0 of python in 1991. The R mailing list started in 1997, at version 0.16.

Google scholar has a huge index of scholarly articles and searches within the text of the articles (which not all search engines do). CiteseerX has far fewer articles, and its article count differs hugely from year to year, with far fewer articles in the last three years. Therefore the curves were normalized by the total article count for each year.
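The normalization step can be sketched in a few lines of Python. This is only an illustration of the idea, not the R code behind the graphs: the function name normalize_by_year and all the counts below are made up for the example.

```python
def normalize_by_year(hits, totals):
    """Divide each year's hit count by the total number of indexed articles that year."""
    return {
        year: hits[year] / totals[year]
        for year in sorted(hits)
        if totals.get(year, 0) > 0  # skip years without a usable total
    }

# Hypothetical counts for illustration only; not the data behind the graphs.
example_hits = {2005: 120, 2006: 90, 2007: 30}
example_totals = {2005: 30000, 2006: 15000, 2007: 3000}

shares = normalize_by_year(example_hits, example_totals)
for year, share in shares.items():
    print(year, f"{share:.4f}")
```

With these made-up numbers the raw hit count falls from year to year while the normalized share rises, which is why raw counts from an index that shrinks in recent years would be misleading.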

I am aware that many hits for the search term "python" are not relevant. However, I think the graphs here can nonetheless give a general idea of python's relative importance. Results should be taken with a grain of salt.

Google scholar returns different numbers of hits and different hits over several trials, a fact which I ignored for simplicity. There were also inconsistencies in the "recent articles" feature. You can ask for articles published no earlier than a given year, and you would expect that the further back you go, the more articles you find. However, this is not always the case, so the google scholar graph is restricted to 2001 onwards.

Data as of January 8th, 2009.

Here are the graphs:


Google scholar

Matlab comes out as the big winner in absolute numbers, but from both graphs you can see that R and python are on the rise. CiteseerX does not have as many citations as google scholar (between 1,500 and 30,000 for each year between 1990 and 2008). Because of big fluctuations in the matlab curve in the google scholar count, I do not dare to draw any conclusion about matlab's trend.

Alexa and google trends, among other services, allow you to compare websites. Here is the link to google trends' comparison between R, matlab, and python. Surprisingly, python and matlab are not that different in page visits.

Graphs created in R. Thanks to Statsmethods.net for the explanation on the plot and points functions in R.

Jan 2, 2009


Often when I hear people at public events talking admiringly about geniuses I would like to close my ears. I think in the media there is the image of the child genius, of "the gifted," who were given something by some divinity (or genes) that normal people don't have. I have the impression that many people are too willing to repeat this vision because it lets them deny that they didn't take responsibility for their lives and didn't work as hard as the so-called genius.

I think the word genius is devoid of meaning, or rather, for me it tells more about the speaker than about the object. I prefer the rags to riches archetype, "if you work hard enough, you can make it" (whatever "it" refers to; I never read any novels by Horatio Alger), though I recognize there are many factors to becoming good at something. Francis Galton's exposition in Hereditary Genius forms the other extreme of this conceptual dimension. According to this second view, we are either born with a talent (more generally: IQ) or not. If we are not, then that's the end of the story; some people are, and they become famous, such as Mozart (for many the incarnation of the child genius).

Malcolm Gladwell's book offered the discussion of this idea that I had needed. In Outliers he argues that so-called geniuses, such as Mozart or others (he gives more examples: the Beatles, Bill Joy, Bill Gates; admittedly I didn't like that last example, because of Microsoft's negative corporate image and the perceived worth of their products), could only be successful because of a mix of cultural and family background, hard work, and timely opportunities presenting themselves.

By hard work Malcolm Gladwell refers to the 10,000 hours rule, which says that you need about 10,000 hours of practice to become very proficient in a field. 10,000 hours is about 3 hours daily for 10 years (forget about holidays, you can slack off about every second sunday). To reach 10,000 hours in little more than 3 years (say, for a PhD) you would need 12 hours daily (with free weekends). Go to work!
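The arithmetic can be checked with a short Python sketch. The helper years_needed is made up for this example, and it assumes that "every second sunday" means about 26 days off a year and that "free weekends" means 5 working days in each of 52 weeks.

```python
TARGET_HOURS = 10_000

def years_needed(hours_per_day, practice_days_per_year):
    """Years of practice needed to reach TARGET_HOURS at the given pace."""
    return TARGET_HOURS / (hours_per_day * practice_days_per_year)

# 3 hours a day, slacking off every second Sunday (assumed: about 26 days off a year)
slow = years_needed(3, 365 - 26)

# 12 hours a day with free weekends (assumed: 5 days a week, 52 weeks a year)
fast = years_needed(12, 5 * 52)

print(f"3 h/day: {slow:.1f} years, 12 h/day: {fast:.1f} years")
```

Under these assumptions the slow route takes a little under 10 years and the fast route a little more than 3.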