I suppose it is a cliche to say that many useful things have been created unexpectedly, even accidentally. Here in Silicon Valley, that principle often becomes a problem, as highly creative people see a thousand products or services in their creations, but fail to focus enough to create a viable business. I know that disease well because I have to fight it constantly. Right now, however, with Tweetsnet, I’m still in the brainstorming and experimentation phase, when the point is to explore the possibilities. If it gives rise to a business of some sort, that’ll be just fine, but that’s not the point yet.
The bit of unexpected goodness I’ve noticed in Tweetsnet over the last few days is in the tagging. The tags and the tag cloud achieve one of my goals – self-organization – even though I didn’t really plan on it. If I had stopped to think about it, I guess I would have realized it would happen. It all started when I realized that since I’m already fetching page titles from popular Twittered URLs, I could also extract any meta keywords found on those pages. I had to hack a Python WordPress XML-RPC library to support tags, but that was no big deal.
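For the curious, here is a minimal sketch of the keyword extraction idea, using only Python's standard library. It is not the actual Tweetsnet code: the names `MetaKeywordsParser` and `extract_meta_keywords` are made up for illustration, and I'm assuming the keywords arrive as a comma-separated `content` attribute on a `<meta name="keywords">` tag.

```python
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collects keywords from <meta name="keywords" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        if (attr.get("name") or "").lower() != "keywords":
            return
        for keyword in (attr.get("content") or "").split(","):
            keyword = keyword.strip().lower()
            # Skip blanks and duplicates, keep first-seen order.
            if keyword and keyword not in self.keywords:
                self.keywords.append(keyword)


def extract_meta_keywords(html):
    """Return the list of meta keywords found in an HTML string."""
    parser = MetaKeywordsParser()
    parser.feed(html)
    return parser.keywords
```

Feeding it the HTML of a fetched page gives back a list of lowercased keywords, which can then be passed along as tags when the post is created.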
Once those tags were working, I realized that I could treat Twitter hashtags as a special case of tagging. In the Tweetsnet database, tags are identified by source – HTML meta keywords or hashtags. On the Tweetsnet pages, they all look the same.
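A rough sketch of what "identified by source" might look like in Python follows. The `Tag` record, the source labels, and the function names are all hypothetical, and the hashtag pattern is deliberately naive (it would also pick up `#fragment` parts of URLs, for instance).

```python
import re
from collections import namedtuple

# Hypothetical record: the tag text plus where it came from.
Tag = namedtuple("Tag", ["name", "source"])

# Naive pattern: a '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def tags_from_hashtags(tweet_text):
    """Treat each hashtag in a tweet as a tag with source 'hashtag'."""
    return [Tag(name.lower(), "hashtag") for name in HASHTAG_RE.findall(tweet_text)]


def tags_from_keywords(keywords):
    """Wrap meta keywords as tags with source 'meta_keywords'."""
    return [Tag(k.strip().lower(), "meta_keywords") for k in keywords if k.strip()]


def display_names(tags):
    """Drop the source so all tags look the same on the page."""
    names = []
    for tag in tags:
        if tag.name not in names:
            names.append(tag.name)
    return names
```

The source only matters for storage. By the time `display_names` runs, a hashtag and a meta keyword with the same text collapse into a single tag.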
When that was working, I found myself staring at the “phrases” that I’m capturing from Twitter. Those are two-word phrases extracted via some very simple rules: end-of-sentence detection, a stopwords list, hashtags and user names excluded, and so forth. I noticed that when the same word showed up in more than one of those phrases, it often would be an appropriate tag. I also noticed that existing tag words often showed up in the phrases, so those get added no matter how frequently they occur. Any word that shows up in at least three of the phrases is also added as a tag, although I’m not storing those in the database, since they are sometimes a bit odd.
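The phrase rule can be sketched in a few lines of Python. This is an illustration under my own assumptions rather than the real thing: `tags_from_phrases` and its parameters are invented names, phrases are assumed to be plain strings of space-separated words, and a word is counted once per phrase.

```python
from collections import Counter


def tags_from_phrases(phrases, existing_tags, min_phrases=3):
    """Pick tag words out of two-word phrases.

    A word becomes a tag if it is already a known tag, or if it
    appears in at least `min_phrases` different phrases.
    """
    existing = {tag.lower() for tag in existing_tags}

    # Count how many phrases each word appears in (once per phrase).
    counts = Counter()
    for phrase in phrases:
        for word in set(phrase.lower().split()):
            counts[word] += 1

    return sorted(
        word
        for word, seen_in in counts.items()
        if word in existing or seen_in >= min_phrases
    )
```

With the phrases "gdrive rumor", "google gdrive", "gdrive launch" and "online storage", plus "storage" as an existing tag, this yields "gdrive" (three phrases) and "storage" (already a tag) and nothing else.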
The result is a set of tags and a tag cloud that do a pretty good job of finding articles related to a particular topic. For example, when an article about the rumored GDrive showed up, it was tagged “gdrive,” which I clicked and found two more articles. Cool. That’s why I recently increased the size of the Tweetsnet tag cloud widget.
As you may have noticed, I have added links to sites that are doing things similar to Tweetsnet. One of those, Twitscoop, offers a tag cloud widget, which gave me the idea that perhaps Tweetsnet should do the same. Soon, I hope. That would be in keeping with my idea that one of the secrets to success is to notice when you’ve invented something useful, then package it well.
I would be remiss if I didn’t point out that none of this would have happened if I weren’t using WordPress as my platform. Although it gets in the way sometimes, the features that come for free, including all the third-party themes and widgets, are terrific. Ditto for Python and all the libraries people write for it.
Tags: self-organizing, tweetsnet, twitter