How to Build a Twitter Agent
Note: while I was working on this project, this ReadWriteWeb article was released, illustrating the future potential of the Jabber/XMPP protocol.
In this article we will build an actually useful Twitter Service that will allow us to track the Blogosphere. In the process we will get hands-on programming experience with Ruby, DRb, Twitter and Jabber. This will sharpen our developer skill-set and get us ready for the upcoming (Folk)Semantic Web. We will also evaluate the problems seen and the opportunities ahead.
Background
Whether you want to call it Web 2.0, Web 3.0, the Semantic Web or the Web of Data, change is happening. Over the past few years we’ve seen the tremendous power of Folksonomies, and now this Social Web is colliding with the emergence of the Semantic Web, resulting in the first semantic services. For us developers and creative entrepreneurs it’s important to get ready for this new wave of business opportunities. I find the whole notion of Intelligent Agents very interesting. For our little project however, we will create a Stupid Agent :]
Technologies like Jabber/XMPP and DRb will enable us to move from a reactive web to a proactive web. Right now this proactive realtime push of data is important for the more liquid content creation services. Micro/nano blogging platforms such as Twitter and Tumblr are good examples of this. This is one of the reasons that they already have a Jabber service set up.
I had used Jabber before to communicate with my geek friends. For this project, I had to set up a Jabber client and a Jabber account. Call me stupid, but I actually spent 30 minutes figuring out how the hell to create an account and which server to use (turns out you can do that in the client). Now of course XMPP/Jabber is just a standard for enabling IM communication, but apart from Google’s GTalk there hasn’t been much widespread use by ordinary users. In my view, these uses of XMPP for machine-to-machine (to human) communication are much more interesting.

Case: The Observatory Bot
I’ve been programming for quite some time now. When learning a new language I really hate doing little examples that produce zero user/business value. That’s why I think the best way to learn new technologies is to solve real world problems right away. Of course tutorials are valuable to just get a general idea of what’s going on, but don’t waste too much time on them – implement straight away.
Our Twitter Service will also need to create some value for the user and must be production ready. However, don’t get your hopes up too much since these are experimental technologies with dependence on external services like Jabber and Twitter.
Remember my last little project? Wigitize.com is actually generating a lot of data. It’s tracking about 5000 6000 feeds every hour! Let’s do something with that data :]

The Observatory:
- is a Twitter-only service
- will allow you to ‘track’ the Blogosphere
- will send you a direct message when something happens in the Blogosphere
Basically this is like Twitter’s IM functionality to track the Twittersphere. So in a sense The Observatory will be a proof of concept portal between the Twittersphere and the Blogosphere.
The Architecture

Right now Wigitize.com uses BackgrounDRb to perform background tasks and also to update all RSS feeds periodically. Every time the feed aggregation process finds a new feed entry it will create a FeedEntry instance. The creation of these objects serves as the event source for the Observatory Bot. These events have to be pushed to the Observatory Bot in some way.
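To make the idea of "object creation as an event" concrete, here is a minimal sketch in plain Ruby. The FeedEntry class below is a hypothetical stand-in for the Rails model and on_create is an illustrative name, not something taken from Wigitize.com. In the Rails app the same hook would more naturally be an ActiveRecord after_create callback.

```ruby
# Hypothetical stand-in for the Rails model; only the event hook is shown.
class FeedEntry
  attr_reader :title, :url

  @listeners = []

  class << self
    # Register a block that is called for every newly created entry.
    def on_create(&block)
      @listeners << block
    end

    # Build an entry and push it to every registered listener.
    def create(title, url)
      entry = new(title, url)
      @listeners.each { |listener| listener.call(entry) }
      entry
    end
  end

  def initialize(title, url)
    @title = title
    @url = url
  end
end

FeedEntry.on_create { |entry| puts "new entry: #{entry.title} (#{entry.url})" }
FeedEntry.create('Twitter opens XMPP', 'http://example.com/1')
```

The point of the design is that the aggregation code only creates entries and never needs to know who is listening.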
What can Twitter’s IM service do for us?
To play around with Twitter’s agent you need to set up a Jabber account and a Twitter account. For debugging I’ve found the MacOS tool JabberFox very helpful.
Basically, these commands are available:

However, I think there are a lot more hidden commands which can be used. When I emailed the Twitter developers, they told me there is a command called “d”. This can be used to send direct messages (d username message). Very useful!
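Since the “d” command is just a line of text sent to Twitter’s IM contact, building it is simple string work. A minimal sketch follows; the helper name direct_message_command and its input clean-up rules are my own assumptions, not anything Twitter specified.

```ruby
# Hypothetical helper: builds the text of a "d" command for Twitter's IM agent.
def direct_message_command(username, message)
  name = username.to_s.strip.sub(/\A@/, '') # accept "@bob" as well as "bob"
  raise ArgumentError, 'username must be a single word' if name.empty? || name =~ /\s/

  text = message.to_s.gsub(/\s+/, ' ').strip # keep the command on one line
  "d #{name} #{text}"
end

puts direct_message_command('dominiek', "hello from\nthe Observatory")
# prints: d dominiek hello from the Observatory
```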
Coding the Bot
Our implementation choice for today will be Ruby. If you’ve programmed intensively in other languages before, you’ve probably come to the conclusion that Ruby is quite different from most others. Ruby’s flexible object model allows for great extension of the language itself (e.g. 3.minutes.ago). It is therefore no surprise that interesting Semantic Web projects like ActiveRDF are choosing Ruby as their language.
To communicate with Twitter we can use this cool Ruby Twitter API. Unfortunately all of these APIs are HTTP/REST driven and really limit what we can do in terms of realtime response. Also, if you want to build a serious production-ready service, constantly polling Twitter will kill both parties.

So we need to interface with their Jabber Service. Jabber is a friendly name for Instant Messaging (IM) using the open XMPP protocol. Luckily, there is a Ruby library called XMPP4R which does most of the XMPP work for us. This blog post provides some simple examples and this German wiki entry provides sample code showing how to use callbacks (very important for a bot).
I’ve wrapped all of this in a simple JabberBot class jabber_bot.rb that can be used like this:
```ruby
class MyJabberBot < JabberBot
  def on_message(from, body)
    say(from, "You said: #{body}")
  end
end
my_jabber_bot = MyJabberBot.new('observatory@jabber.org', 'password')
my_jabber_bot.connect_and_authenticate
my_jabber_bot.run
```
As you can see in the diagram, I’ve built a TwitterBot on top of this JabberBot. Unfortunately it’s not possible to do all communication with Twitter through Jabber yet. For example: there are no events for when users start following other users, and no ways to retrieve information. This is why twitter_bot.rb is essentially a hybrid using both the Twitter API and Twitter’s Jabber service. Feel free to use all source code provided here, I know it will be useful to some of you out there. This is how to use this TwitterBot:
```ruby
twitter_bot = TwitterBot.new('observatory', 'password', 'observatory@jabber.org', 'password')
twitter_bot.track_phrases = ['observatory.topoints.com']
twitter_bot.on_directed_tweet do |username, message|
  puts("directed tweet: #{username} says #{message}")
end
twitter_bot.on_tweet do |username, message|
  puts("something from #{username}: #{message}")
end
twitter_bot.on_track do |username, message, phrase|
  puts("track: #{username} says #{message} (keyword: #{phrase})")
end
twitter_bot.runn(:follow_all_followers => true)
```
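If you are wondering how block-based handlers like on_tweet and on_track can work, here is a minimal sketch of the pattern. It is not the real twitter_bot.rb: the class name CallbackSketch, the dispatch method and the rule that a directed tweet starts with "@name " are all assumptions made for illustration.

```ruby
# Hypothetical sketch of block-based event handlers; not the real twitter_bot.rb.
class CallbackSketch
  attr_accessor :track_phrases

  def initialize(own_name)
    @own_name = own_name
    @track_phrases = []
    @handlers = {}
  end

  # Each on_* method simply remembers the block it was given.
  def on_directed_tweet(&block)
    @handlers[:directed_tweet] = block
  end

  def on_tweet(&block)
    @handlers[:tweet] = block
  end

  def on_track(&block)
    @handlers[:track] = block
  end

  # Decide which handler an incoming message belongs to and call it.
  def dispatch(username, message)
    phrase = @track_phrases.find { |candidate| message.downcase.include?(candidate.downcase) }
    if message.start_with?("@#{@own_name} ")
      fire(:directed_tweet, username, message)
    elsif phrase
      fire(:track, username, message, phrase)
    else
      fire(:tweet, username, message)
    end
  end

  private

  # Call the stored block for an event, if one was registered.
  def fire(event, *args)
    handler = @handlers[event]
    handler.call(*args) if handler
  end
end

bot = CallbackSketch.new('observatory')
bot.track_phrases = ['observatory.topoints.com']
bot.on_directed_tweet { |user, msg| puts "directed tweet: #{user} says #{msg}" }
bot.on_track { |user, msg, phrase| puts "track: #{user} says #{msg} (keyword: #{phrase})" }
bot.dispatch('alice', '@observatory track ruby')
bot.dispatch('bob', 'just found observatory.topoints.com')
```

Storing one block per event keeps the business logic out of the transport code, which is what makes it possible to layer a service-specific bot on top.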
Now that we have the basic building blocks to build our service, let’s build our core business logic (observatory_twitter_bot.rb):

This means that we will send a greeting when people start following us:
```ruby
on_follow do |username|
  logger.info("#{username} is following us, will follow #{username} too and send welcome message")
  follow(username)
  direct_message(username, "the Observatory is now ready to serve you, use '@observatory track [keyword]' to get blogosphere updates.")
end
on_unfollow do |username|
  logger.info("#{username} stopped following us")
end
```
Note: in order to get the on_follow event, we have to poll the Twitter HTTP API. Since Twitter limits the rate to 70 requests per hour, I poll every two minutes to be on the safe side.
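The arithmetic and the polling idea can be sketched in a few lines. This is an illustration under assumptions: follower lists are taken to be plain arrays of usernames, and follower_changes is a made-up name rather than a method from twitter_bot.rb.

```ruby
RATE_LIMIT_PER_HOUR = 70     # the limit quoted above
POLL_INTERVAL_SECONDS = 120  # one poll every two minutes

requests_per_hour = 3600 / POLL_INTERVAL_SECONDS
puts "#{requests_per_hour} of #{RATE_LIMIT_PER_HOUR} requests per hour used by polling"

# Compare two follower snapshots and report who joined and who left.
def follower_changes(previous, current)
  { followed: current - previous, unfollowed: previous - current }
end

changes = follower_changes(%w[alice bob], %w[bob carol])
changes[:followed].each { |name| puts "#{name} started following us" }
changes[:unfollowed].each { |name| puts "#{name} stopped following us" }
```

Polling every two minutes uses 30 of the 70 allowed requests, which leaves headroom for the other API calls the bot has to make.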
And that we will start tracking the Blogosphere for them when they say the magic word:
```ruby
on_directed_tweet do |username, message|
  logger.info("directed tweet: #{username} says #{message}")
  if (phrase = track_phrase(message))
    logger.info("tracking '#{phrase}' for user #{username}")
    begin
      direct_message(username, "Will send a direct message anytime something happens in the Blogosphere regarding '#{phrase}'")
      Tracker.for(username, phrase)
    rescue => e
      logger.error("tracking failure: #{e.to_s}")
    end
  end
end
```
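The track_phrase helper used above is not listed in this article. A minimal sketch of what such a helper could look like is given below; the regular expression and the quote handling are assumptions, not the code running in the Observatory. Stripping whitespace matters, so that a keyword does not end up stored with a stray trailing space.

```ruby
# Hypothetical version of track_phrase: returns the keyword, or nil when the
# message is not a track command.
def track_phrase(message)
  match = message.to_s.match(/\btrack\s+(.+?)\s*\z/i)
  return nil unless match

  # Drop surrounding straight quotes, e.g. 'Twitter' becomes Twitter.
  phrase = match[1].gsub(/\A['"]+|['"]+\z/, '').strip
  phrase.empty? ? nil : phrase
end

p track_phrase('@observatory track macbook ')  # => "macbook"
p track_phrase("@observatory track 'Twitter'") # => "Twitter"
p track_phrase('@observatory hello')           # => nil
```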
Now when a new FeedEntry is created, we need to make sure that these Twitter users get notified when their tracked phrase matches the FeedEntry. Since this might take up some time, I’ve created a background worker task for it:
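The matching step at the heart of such a worker could look roughly like this. It is a minimal sketch and not the actual worker: TrackedPhrase and matching_trackers are hypothetical names, and a simple case-insensitive substring match is assumed.

```ruby
# Hypothetical record of who tracks what; the real Tracker model is not shown.
TrackedPhrase = Struct.new(:username, :phrase)

# Return the trackers whose phrase occurs in the entry text (case-insensitive).
def matching_trackers(entry_text, trackers)
  text = entry_text.downcase
  trackers.select { |tracker| text.include?(tracker.phrase.downcase) }
end

trackers = [
  TrackedPhrase.new('alice', 'Twitter'),
  TrackedPhrase.new('bob', 'macbook')
]

matching_trackers('Why Twitter needs XMPP', trackers).each do |tracker|
  puts "d #{tracker.username} New blog post about '#{tracker.phrase}'"
end
```

With thousands of feeds per hour a linear scan like this is the part that takes time, which is why it belongs in a background task rather than in the aggregation loop.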
As you might see, Distributed Ruby (DRb) makes it extremely easy to control our bot remotely. In the ObservatoryBot we say:
```ruby
DRb.start_service("druby://:8997", self)
```
And all bot functionality can be accessed by calling: observatory_bot = DRbObject.new(nil, 'druby://:8997')
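To try this round trip without the bot, here is a small self-contained sketch using only Ruby’s DRb library. BotFront and with_local_drb are hypothetical names, and the sketch binds to 127.0.0.1 on port 0 so the system picks a free port, instead of the fixed port 8997 used above.

```ruby
require 'drb/drb'

# Hypothetical front object standing in for the bot.
class BotFront
  def initialize
    @phrases = []
  end

  def track(phrase)
    @phrases << phrase
    @phrases.size
  end

  def tracked_phrases
    @phrases.dup
  end
end

# Start a DRb server on a free local port, yield its URI, then shut it down.
def with_local_drb(front)
  server = DRb.start_service('druby://127.0.0.1:0', front)
  yield server.uri
ensure
  server.stop_service if server
end

with_local_drb(BotFront.new) do |uri|
  remote = DRbObject.new_with_uri(uri) # equivalent to DRbObject.new(nil, uri)
  remote.track('Twitter')
  puts remote.tracked_phrases.inspect
end
```

In a real deployment the client would live in a different process (for example the background worker) and would use the fixed URI of the running bot.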
Now that we have our autonomous agent it would be nice if we could easily start and stop it in a production environment. I found the Ruby Gem called Daemons extremely useful to wrap these things up.
First, set up a file that runs the never-ending process (e.g. script/observatory_twitter_bot.rb):
```ruby
require 'logger'
require File.dirname(__FILE__) + '/../config/boot'
require File.dirname(__FILE__) + '/../config/environment'
require File.dirname(__FILE__) + '/../lib/observatory_twitter_bot'

logger = Logger.new(File.join(RAILS_ROOT, 'log/observatory_twitter_bot.log'))
observatory_twitter_bot = ObservatoryTwitterBot.new(logger)
observatory_twitter_bot.runn
```
Next, wrap this up in a daemon script (e.g. script/observatory_twitter_bot):
```ruby
#!/usr/bin/env ruby
require File.dirname(__FILE__) + '/../config/boot'

require 'rubygems'
require 'daemons'

Daemons.run('script/observatory_twitter_bot.rb')
```
The Demo
Right now, if all communication lines with Twitter are working fine, the service is up and running. I’ve made a little bot homepage at observatory.topoints.com.

@observatory track ‘Twitter’:

Problems and Opportunities
In the development of the Observatory I had one big obstacle: Twitter is often down and it can cripple your service and development time. I understand that Twitter is a small team under enormous pressure but there have been a lot of complaints about this.
Nevertheless, Twitter and its developers really kick ass. I emailed Alex Payne and he was excited about what I’m doing (and also about the Twitter things happening at the iKnow! project in Japan). He responded fairly quickly and immediately whitelisted my Twitter account to up the rate limits.
While working on this project I realized that Twitter in its current state isn’t really suitable for system-to-human notifications. Twitter could expand their system to be a true notification framework, but I’m not sure if they will. If they don’t, there is a tremendous business opportunity here. Imagine an open API that mashes up with technologies like XMPP and Growl. A service like that could become THE notification-bus of the web! (Growl is already pretty big in Mac land.) A full blog post about this startup idea is coming up!
Geek Food for Thought
What about RubyOnRails, Jabber, DRb, Daemons and BackgrounDRb? I think we are seeing a new framework here! In this interesting article, Danny Ayers talks about a toolset for agents. For Java there is already “an Agent Framework” that I haven’t checked out yet. I can imagine that these frameworks make development like this easier and facilitate better system autonomy. Of course it’s desirable to give such a framework a pragmatic paintjob by using Rails paradigms.

Above I’ve illustrated Rails’ missing brother. I think it’s also good to take into account interesting technologies like Juggernaut and Comet. These are basically JavaScript push techniques that make synchronous interaction possible on the asynchronous web.
When you combine all these micro asynchronous communication lines, you get one big synchronous connection line between machine agents and user agents.

11 Responses to “How to Build a Twitter Agent”
Where'd you get the picture of the observatory?
Great tutorial Dominiek, I know a lot of people are waiting for examples like this. Keep it up!
@Mike flickr! forgot where, keyword observatory. It's really sci-fi isnt it?!
Great tutorial. Are you aware of any intelligent Agents frame works built in ruby? All the ones i know of JADE , JACK , JASON, Agent factory etc are all developed in java ...
Is this a bug or a feature? http://img338.imageshack.us/img338/1351/twitterobservatoryuy7.png
The emails look like this:
Will send a direct message anytime something happens in the Blogosphere regarding 'macbook '
observatory / observatory
Great article Dominiek! Out of curiosity, have you played with Juggernaut and/or Comet specifically? I'd be interested to know your experiences with these toolkits.
@Ilya I did some Juggernaut on my project confabio.com. Wasn't very stable at the time but has great potential! Will do a little cool realtime project soon!
@Illya: Unrelated perhaps, but we are using Comet for our website http://www.marketsimplified.com . For pushing Stock Market Ticks in realtime. I started with Juggernaut, but then wrote my own push_server using EventMachine, then faced some scaling problems and rewrote PushServer in Scala. Working well now.
@dominiek: Perhaps you are still using older version of bdrb, since you need DRb. But I was wondering if you checked out latest code at http://gitorious.org/projects/backgroundrb . I wrote some sample code, where I was managing XMPP connections from bdrb, but well that was just a sample. Excellent writeup BTW.
Great tutorial! I'm starting to use ActiveRDF and DRb appears to be useful for me.
Thks!!!
Twitter is experimenting with XMPP and they already have an XMPP endpoint to which they publish all tweets. It's not public yet but I saw it running on Ralph's laptop so you can have all tweets flying by.
Ralph Meijer at Mediamatic is working on this stuff and he's building an XMPP<->HTTP gateway among other stuff, so you can publish notifications via HTTP, and get callbacks from XMPP events on a HTTP callback URL.
Some preliminary work: http://idavoll.ik.nu/wiki/HTTP_Interface
Hi all, I want to learn Ruby on Rails. So I need a help ? I'm just newbie. Please contact me ?
Regards Zeck