My English vocabulary is relatively small since English is not my native language. I often read material on the Internet, and sometimes I have to look up some terms. If I didn’t look up these new words, I would never learn them.

Secondly, it often happens that I know the meaning of some concepts but would like to know more about them. Looking everything up takes a while, which disrupts your mental model while reading.

This is why I started working on a little project to embed the knowledge of the Web in the reading process. I’m developing this thing in a pragmatic way using RubyOnRails. The main objectives are:

  • Providing the content with a simple piece of code that can make any piece of HTML smarter.
  • Using Natural Language Processing (NLP) to pick out the important words.
  • Using Princeton’s WordNet to explain basic concepts.

I’ve just finished these objectives, which required writing a little HTML parser and an interface to the Python NLP toolkit (which is far superior to Ruby’s).
[Screenshots: NLP, WordNet]
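
To give a feel for the "pick out the important words" objective, here is a minimal sketch in plain Ruby. It is not the project's code: the real thing goes through the Python NLP toolkit, whereas this sketch simply strips tags with a naive regex and ranks words by frequency. The names `STOPWORDS`, `visible_text` and `important_words` are made up for illustration, and the stopword list is deliberately tiny.

```ruby
# Hypothetical, deliberately tiny stopword list; a real one would be much longer.
STOPWORDS = %w[a an and are as at be but by for if in is it of on or the to with].freeze

# Strip tags with a naive regex and return the visible text.
# (Assumption: good enough for a sketch; real HTML needs a proper parser.)
def visible_text(html)
  html.gsub(/<[^>]*>/, " ").gsub(/\s+/, " ").strip
end

# Rank the non-stopword words by how often they occur, most frequent first.
# Ties are broken alphabetically so the result is deterministic.
def important_words(html, limit = 3)
  words = visible_text(html).downcase.scan(/[a-z]+/)
  counts = Hash.new(0)
  words.each { |w| counts[w] += 1 unless STOPWORDS.include?(w) }
  counts.sort_by { |word, count| [-count, word] }.first(limit).map(&:first)
end

html = "<p>The parser reads the <b>parser</b> input and the grammar.</p>"
p important_words(html)
```

Frequency counting is only a stand-in here; part-of-speech tagging, as an NLP toolkit provides, would do a better job of separating nouns from noise.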

At the moment I’m refactoring the code so it can be distributed as a RubyOnRails plugin. For the near future the following features are on my TODO list:

  • Interfacing with the Wikipedia encyclopaedia.
  • Detecting concepts like ‘Software Engineer’ rather than detecting ‘Software’ and ‘Engineer’.
  • Looking for alternative forms of user/reading interaction.
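
The multi-word concept item on this list could be approached in several ways. One simple option, sketched below under the assumption that a list of known phrases is available, is a greedy longest-match scan over the words. The list `KNOWN_CONCEPTS` and the method `detect_concepts` are hypothetical; a real list might come from WordNet entries or Wikipedia titles.

```ruby
# Hypothetical phrase list, for illustration only.
KNOWN_CONCEPTS = ["software engineer", "natural language", "unit testing"].freeze
MAX_WORDS = KNOWN_CONCEPTS.map { |c| c.split.size }.max

# Walk the words left to right, always preferring the longest known phrase
# that starts at the current position; fall back to the single word.
def detect_concepts(text)
  words = text.downcase.scan(/[a-z]+/)
  found = []
  i = 0
  while i < words.size
    length = MAX_WORDS.downto(2).find do |n|
      i + n <= words.size && KNOWN_CONCEPTS.include?(words[i, n].join(" "))
    end
    if length
      found << words[i, length].join(" ")
      i += length
    else
      found << words[i]
      i += 1
    end
  end
  found
end

p detect_concepts("A software engineer writes software")
# prints ["a", "software engineer", "writes", "software"]
```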

Current code: laboratoire_nuage-101206.tar.gz (or browse)

Requires: Python-NLTK, Ruby-Linguistics, Ruby-WordNet and RubyOnRails

Diversity VS uniformity

on September 04, 2006
There are many different programming languages out there. Each of them has profound cultural implications for its programmers. Programmers who learn a new language every once in a while find themselves travelling through different geek-culture warzones. A year ago I settled down in Ruby land. For me this language is unique, unlike any other language I have ever seen. One could view Ruby as the mechanical version of a natural language, because it allows people to extend it the way a natural language can be extended.
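
A small sketch of what "extending the language" can look like in practice: Ruby lets you reopen built-in classes and pass blocks to methods, so new vocabulary reads as if it were built in. The method names `dozen` and `twice` are invented purely for illustration.

```ruby
# Reopen the built-in Integer class to add a new "word" to the language.
class Integer
  # Hypothetical helper, named here only for illustration.
  def dozen
    self * 12
  end
end

# A method that takes a block behaves like a home-made control structure.
def twice
  yield
  yield
end

puts 3.dozen          # prints 36
twice { puts "hello" } # prints "hello" two times
```

Rails relies on the same mechanism for expressions such as `2.days.ago`.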

Python and Ruby programmers mostly oppose each other, although they can agree on one thing: PHP is the devil. Python and Ruby branched from the same ancestral tree but have very different cultural impacts. Like Java, Python embraces the principle of uniformity, whereas Ruby embraces the principle of diversity. One could argue that the latter might be more suitable for most of the current (European) political models :)

Now, I’m not saying that Python is for communists. Actually, I’d rather not dirty my hands with politics. But I do think these principles have significant cultural and economic consequences: the working environment and productivity, respectively.

I’m firmly convinced that Ruby has a much greater potential for both. Programmers can program in their own style; they can even look at it as art. Just as a columnist loves writing articles, programmers love writing their kung-fu code. I wonder whether this same columnist can have the same artistic joy in China.

David Heinemeier Hansson, the creator of RubyOnRails, says beauty leads to happiness, and happiness leads to productivity. I do agree that the programmer’s freedom in creating this happiness contributes to productivity, but I don’t think it’s enough. To achieve the full potential of Ruby’s productivity, we need to adjust our development methodologies to this awesome language. Rails makes a good start by facilitating, for example, unit testing.

In short, I think Ruby is THE language for the new software world and a gateway to new ways of thinking.