Wigitize.com, Post-Digg

on January 09, 2008

Yesterday I hit the Digg front page with my “Building a .com in 24 hours” article. On top of that, I was also number one and number nine in the delicious rankings. Very exciting!

What is also interesting is that my rapidly built website – wigitize.com – got a lot of hits. My blog got about 25000 uniques, Wigitize about 10000. All these hits caused an incredible load on my small 256MB memory server, so I quickly bought a second one and scaled up. Yesterday, I could finally stabilize the backgrounding on the website and everything has been running fine ever since. Someone offered me a free Slicehost slice too last night!

It’s also funny that people commented on how I must have too much time on my hands. I even got offered participation in startups and companies haha. In fact, the opposite is true, but I do love hacking away in my free time. Besides, it was xmas, that means coding, drinking and eating right?!

While combating the performance issues I had yesterday, I learned these technical lessons:

Make sure you’re running the right version of your tools. Apparently I was running a very old backgroundrb version, so I have now upgraded to Ezra’s 0.2.1 version.

Debug things in production mode on your local machine! Some code – especially funky code like backgroundrb – will behave differently in production than in development. Often people like me develop strictly in ‘development’ mode and then publish to staging/production environments. I found a major issue that was screwing up everything in backgroundrb by debugging things in production mode on my MacBook.

Regular expressions can be dangerous too! Sometimes the feed detection workers started eating up 100% CPU. After debugging things for a long time, I found out that my regular expression was going berserk on sites like msn.com. So I changed the expression in the feed detection algorithm from:
/<link.*href=['"]*([^\s'"]+)['"]*.*application\/rss\+xml.*>/.match(html)
to:
/<link.*href=['"]+([^'"]+)['"]+.*application\/rss\+xml.*>/.match(html)
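As a minimal sketch of how the two expressions above could be used, here is a small Ruby helper that pulls the feed URL out of a page. The method name detect_feed_url, the constant names and the sample HTML are assumptions made up for illustration, not the actual Wigitize code. Both patterns return the same URL on a simple page; the difference is presumably that the optional quotes in the old pattern give the regex engine many more ways to split the same text, which can cost a lot of backtracking on large pages.

```ruby
# The two patterns quoted above (old and revised).
OLD_FEED_LINK = /<link.*href=['"]*([^\s'"]+)['"]*.*application\/rss\+xml.*>/
NEW_FEED_LINK = /<link.*href=['"]+([^'"]+)['"]+.*application\/rss\+xml.*>/

# Hypothetical sample page; a real page would be fetched over HTTP.
SAMPLE_HTML = '<head><link rel="alternate" href="http://example.com/feed.xml" ' \
              'type="application/rss+xml" title="RSS"></head>'

# Hypothetical helper: returns the href captured by the pattern,
# or nil when the page has no matching RSS <link> tag.
def detect_feed_url(html, pattern = NEW_FEED_LINK)
  match = pattern.match(html)
  match && match[1]
end

puts detect_feed_url(SAMPLE_HTML)                 # revised pattern
puts detect_feed_url(SAMPLE_HTML, OLD_FEED_LINK)  # old pattern, same result here
```

Note that both patterns assume the href attribute comes before the type attribute, and that they only look for application/rss+xml, so other link layouts and Atom feeds will not be detected.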

10 Responses to “Wigitize.com, Post-Digg”

  1. MikeonTV says:

    How come I cannot use wigitize to see my Digg feeds?

    http://digg.com/design/upcoming/

    for example. Great service BTW!

  2. Dan says:

    Google has an API for widgets...

  3. Dominiek says:

    Mike: good point, that is actually because for some reason my feedeater code is stalling on that domain when I do a request. Most likely, I am not sending my connection type properly. Will fix that when I get around to it :]

  4. Rutger van Waveren says:

    @Dominiek: Maybe the feed stalls because Digg refers to their rss feeds relatively? Mike's example points to "/rss/indexdesigndig.xml", not to "http://www.digg.com/rss/indexdesigndig.xml". Maybe that helps.

  5. Chu Yeow says:

    Congrats on getting Dugg!

    Anyway, you made a comment about BackgrounDRb and yes its stable releases are not really stable (I just submitted a patch for what I consider a major bug for the latest 1.x release), but to give the guy credit he is adding a lot of useful stuff. The thing that scares me though is the lack of tests in the project. Still, it is Open Source so it would probably benefit more from some contributor help rather than an open bashing :)

    I may just consider using the old 0.2.1 version of BackgrounDRb like you recommend since I'm really just using the most basic features :)

  6. MikeonTV says:

    Thanks Dominiek. I would love an update when this is possible.

  7. Chris says:

    That regexp fails on Slate.com (and I imagine several other sites with other types of <link>s than just RSS). I think this might fix it: /<link>]href=['"]+([^'"]+)['"]+[^>]application\/rss\+xml.*>/.match(html) Note the [^>]s in place of the .*s. This ensures you stay inside one tag and will probably save you some additional CPU (I don't know for sure, it might go either way).

  8. Chris says:

    My last post was garbled. The regexp there is incorrect. I'm not sure if it's the software, but here's my second try: /<link>]href=['"]+([^'"]+)['"]+[^>]application\/rss\+xml.*>/.match(html) Note the [^>]s in place of

  9. fLUx says:

    Hey Dominiek,

    Liking the new feature at the bottom of the front page at wigitize, with the top requested feeds.

    But I have found a "bug" which I hope you know about - if you go to the setup page for a feed (in your case it's http://wigitize.com/url/http:%2F%2Fdominiek.com%), and refresh it, it will add +1 to the total of the stat on the front page. Unlimited times.

    Just dumping the URL in something like ApacheBench or some other stress test tool, you can easily get the number up to a very high number.

    For instance I have got your feed up to #1 at 338 in a matter of seconds.

    Just thought I would tell you, and to maybe expect your top requested feed to be "Cheap VIAGARA - clik here!!!"

    Cheers and good work, look forward to your future projects! ;)

    fLUx

    P.s. cheers for the recommendation for Slicehost, although I had problems with it accepting my credit card number, and when I went back I forgot to use your referral link, sorry! ;)

  10. receive says:

    The regexp also fails on mediawiki rss feeds (eg http://en.wikipedia.org/w/index.php?title=Special:Recentchanges&feed=atom)
