Wigitize.com, a few hours of Creative Coding
on January 02, 2008
Today and yesterday I’ve spent another couple of hours coding and designing away on my (autistic) little project Wigitize.com. I’m keeping an exact log of the hours I’m spending and of what’s happening, so that I can later publish some full-exposure articles about how to build a project like this.
Currently these are the things that still need to be done:
- enabling people to choose a style for their widget
- writing a simple API and a page to show how it works
- optimizing the backend so that a background worker is used
- allowing people to ‘grab the grabber’ so that they can offer widgets of their own content to their users
- renting a machine with bandwidth; I’m thinking of slicehost.com (much cheaper than EC2)
- rethinking the name WIGITIZE. It sounds good to my Dutch-English ears, but it might not to native speakers.
Another little sneak preview:

Wigitize.com, widgitize any web feed
on December 31, 2007

Detecting atom/rss Feeds in Ruby
on June 22, 2007
In the current SNS (Social Networking Site) boom, usability is becoming increasingly important. People have accounts on many different websites, and registering for yet another one is getting more and more tiring. This is one of the main reasons why Confabio.com doesn't require you to sign up and log in, and why websites like Wakoopa.com make their registration as painless as possible. A colleague of mine was even experimenting with omitting the username/email requirement altogether. Meanwhile, OpenID is still too young and Sun's Liberty Alliance is just too corporate and slow.
But for most social networking sites it's pretty simple: they just need people to enter information. So let's make that as easy for the user as possible.
Entering Syndication Feeds
For one of my projects I have to let users enter information about themselves so they can build up their own profile. What I really like about some of the new sites is that they aggregate your blog's contents and your Flickr pictures.
One such website is the Tokyo-based Social Networking Site Asooboo.com. After signing up you can enter your blog feed and Flickr username and it will keep track of all your stories and pictures. I think that's really cool and it's one of the first steps in making the web more ubiquitous. You can later change your Feed URL in your 'edit my profile':
Entering Links instead of Feeds
Entering feeds is nice, but for users who are not tech-savvy, terms like 'Feed', 'RSS' and 'Atom' might raise question marks. Therefore I think it would be nice if users didn't have to worry about feeds and could instead just enter their links, like:
My Websites and Profiles:
- http://blog.dominiek.com/
- http://www.flickr.com/photos/dominiekterheide/
- http://del.icio.us/dominiekth
The site would then show a fancy spinner and convert the links to 'My Blog', 'My Pictures' and 'My Links'. All content will be aggregated automatically if RSS or Atom feeds can be detected on those pages.
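As a rough illustration of the labelling step, here is a minimal sketch. The `label_for` helper and the `LABELS` table are hypothetical and not part of any existing code; the sketch simply assumes that the host name is enough to tell pictures and bookmarks apart and that everything else is a blog.

```ruby
require 'uri'

# Hypothetical host-to-label table; the entries are illustrative assumptions.
LABELS = {
  'flickr.com'  => 'My Pictures',
  'del.icio.us' => 'My Links'
}.freeze

# Hypothetical helper: pick a profile label for a link the user entered.
def label_for(link)
  host = URI.parse(link).host.to_s.downcase.sub(/\Awww\./, '')
  LABELS.fetch(host, 'My Blog') # assumption: anything unknown is a blog
end
```

Calling `label_for('http://www.flickr.com/photos/dominiekterheide/')` would give 'My Pictures' under these assumptions. A real version would probably want a more careful fallback than treating every unknown host as a blog.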
Detecting RSS feeds
When you use a proper browser like Mozilla Firefox you will see a syndication icon every time you visit a website that has RSS feeds:
It does this by reading the <link> tags in the page's HTML that advertise a feed with a type of application/atom+xml or application/rss+xml.
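To make that concrete, here is a minimal sketch of reading those tags with nothing but Ruby's standard library. `feed_links` and `FEED_TYPES` are hypothetical names, not part of the FeedDetector further down, and the sketch assumes that attribute values are quoted. It collects the attributes of every <link> tag first, so the order of href and type does not matter.

```ruby
# Assumed mapping from the advertised MIME type to a feed kind.
FEED_TYPES = {
  'application/atom+xml' => :atom,
  'application/rss+xml'  => :rss
}.freeze

# Hypothetical helper: list the feed <link> tags found in an HTML string.
def feed_links(html)
  html.scan(/<link\b[^>]*>/i).map do |tag|
    # Collect attribute name/value pairs in whatever order they appear.
    attrs = {}
    tag.scan(/([\w-]+)\s*=\s*["']([^"']*)["']/) { |name, value| attrs[name.downcase] = value }
    kind = FEED_TYPES[attrs['type'].to_s.downcase]
    # Skip stylesheets and anything else that is not an alternate feed link.
    next unless kind && attrs['rel'].to_s.downcase.split.include?('alternate') && attrs['href']
    { kind: kind, href: attrs['href'] }
  end.compact
end
```

Each result is a hash with the feed kind (`:atom` or `:rss`) and the href exactly as written in the page, which may still be a relative path.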
After a quick search I couldn't find any code to do this in my own project, so I wrote a little piece of code for it, along with a Ruby on Rails integration test.
You can use it like this:
FeedDetector.fetch_feed_url('http://blog.dominiek.com/')
=> "http://blog.dominiek.com/feed/atom.xml"
FeedDetector.fetch_feed_url('http://blog.dominiek.com/feed/atom.xml')
=> "http://blog.dominiek.com/feed/atom.xml"
FeedDetector.fetch_feed_url('http://www.flickr.com/photos/dominiekterheide/', :rss)
=> "http://api.flickr.com/services/feeds/photos_public.gne?id=71386598@N00&lang=en-us&format=rss_200"
# alternatively you can parse HTML with FeedDetector.get_feed_path(html_data)
# see integration test for more examples
FeedDetector + Test
Excuse the quick-and-dirty code. The FeedDetector (lib/feed_detector.rb):
require 'net/http'
require 'uri'

class FeedDetector

  ##
  # return the feed url for a url
  # for example: http://blog.dominiek.com/ => http://blog.dominiek.com/feed/atom.xml
  # only_detect can force detection of :rss or :atom
  def self.fetch_feed_url(page_url, only_detect=nil)
    url = URI.parse(page_url)
    # copy the host first: appending the port to url.host itself would modify the URI
    host_with_port = url.host.dup
    host_with_port << ":#{url.port}" unless url.port == 80
    # request_uri keeps the query string and falls back to '/' for an empty path
    req = Net::HTTP::Get.new(url.request_uri)
    res = Net::HTTP.start(url.host, url.port) {|http|
      http.request(req)
    }
    feed_url = self.get_feed_path(res.body, only_detect)
    feed_url = "http://#{host_with_port}/#{feed_url.gsub(/^\//, '')}" unless !feed_url || feed_url =~ /^http:\/\//
    # no feed link found: assume the given url is already a feed
    feed_url || page_url
  end

  ##
  # get the feed href from an HTML document
  # for example:
  # ...
  # <link href="/feed/atom.xml" rel="alternate" type="application/atom+xml" />
  # ...
  # => /feed/atom.xml
  # only_detect can force detection of :rss or :atom
  def self.get_feed_path(html, only_detect=nil)
    md = nil
    # atom is preferred over rss; each pattern pair covers href before or after type
    unless only_detect && only_detect != :atom
      md ||= /<link.*href=['"]*([^\s'"]+)['"]*.*application\/atom\+xml.*>/.match(html)
      md ||= /<link.*application\/atom\+xml.*href=['"]*([^\s'"]+)['"]*.*>/.match(html)
    end
    unless only_detect && only_detect != :rss
      md ||= /<link.*href=['"]*([^\s'"]+)['"]*.*application\/rss\+xml.*>/.match(html)
      md ||= /<link.*application\/rss\+xml.*href=['"]*([^\s'"]+)['"]*.*>/.match(html)
    end
    md && md[1]
  end

end
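fetch_feed_url turns a relative feed path into an absolute URL by gluing the host and the path together. A possible alternative, sketched here as an assumption and not as what FeedDetector does, is `URI.join` from Ruby's standard library, which also resolves paths that are relative to a subdirectory. `absolute_feed_url` is a hypothetical helper name.

```ruby
require 'uri'

# Hypothetical helper: resolve a feed href against the page it was found on.
def absolute_feed_url(page_url, feed_href)
  # No feed link found: fall back to the page itself, as fetch_feed_url does.
  return page_url if feed_href.nil?
  # URI.join leaves absolute hrefs alone and resolves relative ones.
  URI.join(page_url, feed_href).to_s
end
```

With this, `absolute_feed_url('http://blog.dominiek.com:8000/', '/feed/atom.xml')` keeps the non-standard port without any special handling.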
The integration test (test/integration/feed_detector_test.rb):
require "#{File.dirname(__FILE__)}/../test_helper"

class FeedDetectorTest < ActionController::IntegrationTest

  def test_fetch_feed_url
    return # remove this line to run the tests that fetch over HTTP

    # test mephisto
    feed_url = FeedDetector.fetch_feed_url('http://blog.dominiek.com/')
    assert_equal('http://blog.dominiek.com/feed/atom.xml', feed_url)

    # test wordpress
    feed_url = FeedDetector.fetch_feed_url('http://digigen.nl/')
    assert_equal('http://digigen.nl/feed/', feed_url)

    # test non conventional port
    feed_url = FeedDetector.fetch_feed_url('http://blog.dominiek.com:8000/')
    assert_equal('http://blog.dominiek.com:8000/feed/atom.xml', feed_url)

    # test only_detect rss/atom on flickr
    feed_url = FeedDetector.fetch_feed_url('http://www.flickr.com/photos/dominiekterheide/', :atom)
    assert_equal('http://api.flickr.com/services/feeds/photos_public.gne?id=71386598@N00&lang=en-us&format=atom', feed_url)
    feed_url = FeedDetector.fetch_feed_url('http://www.flickr.com/photos/dominiekterheide/', :rss)
    assert_equal('http://api.flickr.com/services/feeds/photos_public.gne?id=71386598@N00&lang=en-us&format=rss_200', feed_url)

    # make sure that feeds return themselves
    feed_url = FeedDetector.fetch_feed_url('http://blog.dominiek.com/feed/atom.xml')
    assert_equal('http://blog.dominiek.com/feed/atom.xml', feed_url)
    feed_url = FeedDetector.fetch_feed_url('http://digigen.nl/feed/')
    assert_equal('http://digigen.nl/feed/', feed_url)
  end

  def test_get_feed_path
    body = []
    body << ' <html>'
    body << ' <head>'
    body << ' <link href="/super.css" rel="alternate" type="text/css"/>'
    body << ' <link href="/feed/atom.xml" rel="alternate" type="application/atom+xml" />'
    body << ' </head>'
    body << ' </html>'

    # Mephisto
    feed_path = FeedDetector.get_feed_path(body.join("\n"))
    assert_equal('/feed/atom.xml', feed_path)
    body[3] = ' <link href=\'/feed/atom.xml\' rel="alternate" type="application/atom+xml" />'
    feed_path = FeedDetector.get_feed_path(body.join("\n"))
    assert_equal('/feed/atom.xml', feed_path)

    # FlickR
    body[3] = '<link rel="alternate" type="application/atom+xml" title="Flickr: Photos from dominiekth Atom feed" href="http://api.flickr.com/services/feeds/photos_public.gne?id=71386598@N00&lang=en-us&format=atom">'
    feed_path = FeedDetector.get_feed_path(body.join("\n"))
    assert_equal('http://api.flickr.com/services/feeds/photos_public.gne?id=71386598@N00&lang=en-us&format=atom', feed_path)
    # body[4] temporarily replaces the </head> line with an rss link
    body[4] = '<link rel="alternate" type="application/rss+xml" title="Flickr: Photos from dominiekth RSS feed" href="http://api.flickr.com/services/feeds/photos_public.gne?id=71386598@N00&lang=en-us&format=rss_200">'
    feed_path = FeedDetector.get_feed_path(body.join("\n"))
    assert_equal('http://api.flickr.com/services/feeds/photos_public.gne?id=71386598@N00&lang=en-us&format=atom', feed_path)
    feed_path = FeedDetector.get_feed_path(body.join("\n"), :rss)
    assert_equal('http://api.flickr.com/services/feeds/photos_public.gne?id=71386598@N00&lang=en-us&format=rss_200', feed_path)

    # Wordpress
    body[3] = '<link rel="alternate" type="application/rss+xml" title="Digigen RSS Feed" href="http://digigen.nl/feed/" />'
    body[4] = ' </head>'
    feed_path = FeedDetector.get_feed_path(body.join("\n"), :atom)
    assert_nil(feed_path)
    feed_path = FeedDetector.get_feed_path(body.join("\n"), :rss)
    assert_equal('http://digigen.nl/feed/', feed_path)
  end

end
I'm sure this will be useful to some people, so enjoy!
