Social media analytics for decision-making

14 Apr 09 Why “Six Days in Fallujah” should be banned

I have to share my epiphany.  “Six Days in Fallujah” must be banned, since the company creating it says that it will allow players to “become someone else.”  Specifically, the players apparently will become combatants in the battle for Fallujah.  Combatants are maimed.  Combatants are killed.  I know this because I’ve buried one.

No matter how entertaining it might be, we just can’t allow all these game players to be wounded and killed.  To use the classic example, the right to free speech doesn’t allow us to yell “Fire!” falsely  in a crowded theater.

I realized that the game needs to be banned when I received a personal email from the president of Atomic Games, Peter Tamte, in which he said he was misquoted by a reporter who wrote that he said  “The challenge was how to present the horrors of war in a game that is entertaining.”  Perhaps so.

However, Tamte also wrote to me, “We believe it is time for videogames to deal with complex issues and that videogames can give players deeper insight into the events in Iraq than passive forms of media, such as movies or TV, because videogames can make players become someone else.”

That’s when the light bulb came on.  I can become a Marine, like my niece’s husband, fighting in Fallujah.  And if I do what he did and go start up the engine in my AAV near the train station at the wrong time, somebody will fire a rocket at me from a nearby mosque and blow me to bits and I’ll be dead.  Seems harsh for a “game,” but that’s what happens when you walk a mile in somebody else’s boots.  I wonder if I’ll be eligible to be buried in some sort of simulated Arlington cemetery?  Will my wife receive survivor benefits from Atomic Games?  

If the game is published, how many people will it kill and wound?  Let’s calculate!

In the actual battle, there were about 5,000 combatant troops, of which 95 were KIA and 560 were wounded.  That’s about 2 percent killed and 11 percent wounded.  If this game is as successful as Call of Duty 4: Modern Warfare, it will sell around 6.25 million copies a year.  Assuming at least one player who will “become someone else” per copy, about 125,000 of them will die and close to 720,000 will be wounded.  Good news, though – only 55 percent of the wounded in Fallujah were so badly injured that they could not return to duty, which means that about 325,000 of Atomic’s wounded customers will fare well enough to play the game twice!  Let’s just hope they have good health insurance.

Still, that’s a lot of dead and wounded people – and we haven’t even counted those who choose to “become” insurgents, who had a far higher mortality rate.  I doubt if “innocent bystander” is a category that players can “become,” but if so, that’s another 5,000 to 6,000 dead.  In any case, it seems to me that if there ever were a justification for setting aside the First Amendment, killing and wounding this many game players fits the bill.

One thing I’m still wondering about.  If the game really would let me become my niece’s husband, does that mean that the computer will actually scatter pieces of me hundreds of yards in either direction?  That would be messy.  The cleanup costs alone would be prohibitive.

Well, enough bitter satire.   Back to measuring social media.

07 Apr 09 Atomic Games re-traumatizes every survivor of violence

This blog, when I’m keeping up with it, is mostly about measuring social media.  Not today.  Today it is about a game announced by Atomic Games, called Six Days in Fallujah.  Atomic Games President Peter Tamte said this about it: “For us, the challenge was how do you present the horrors of war in a game that is also entertaining.”

Is this guy out of his mind?  The idea of such a “game” is hard for anybody who has lost anyone to violence.  My niece’s husband, a Marine, was killed in action on one of those six days, November 10, 2004.  One of the ways I responded to his death was to become a grief counselor, as part of the Bay Area Critical Incident Stress Management (CISM) Team.  I spent last Thursday and Friday doing crisis intervention at a school whose students included children recently killed in a murder-suicide here in Santa Clara.  I’m also a former paramedic.  I know the reality of violence.

I am angry.  Any sane person who has lived with the horror of deadly violence knows that it cannot become entertainment.  The fact that it is based on real events makes “Six Days” intolerable as a game. Tamte’s boasts about it have re-traumatized hundreds of thousands of survivors, at a time when violence is on the rise in our nation.  In recent days, the news has been full of horrible police and family homicides and suicides.

I know that realistic simulations can save lives on the battlefield.  I know that simulations can help therapists treat post-traumatic stress.  I was writing about and advocating those uses of simulations many years ago when I co-founded Multimedia Computing Corp., a market research and publishing company.

Another quote from Tamte: “Our opportunity for giving people insight goes up dramatically when we can present people with the dilemmas and the choices that faced these soldiers.”

Baloney.  This “game” offers zero insight into what it is like to be in a chaotic situation where people’s real lives are on the line.  It is profoundly disrespectful to claim to know what it was like for those who were there, no matter how many of them may have contributed to it.

Atomic Games has lost me forever as a potential customer and I hope others will follow.

War is not entertainment, and it is not a game.

10 Mar 09 What I’ve been doing

The person at the next desk to the right (my wife) just pointed out that I haven’t updated my blog in quite a while.  Among other things, I’ve been, uh, struggling a bit with Twitter social graph analysis.  The new social graph API calls make it easy to get friend and follower relationships, but they return Twitter IDs, not screen names.  Getting screen names requires a lot of API calls and parsing of JSON or XML… and so far, that’s pretty slow.  I seem to be able to get a few hundred names per minute, which might sound like a lot, but the Twitter social graph is huge.  We have Twitter’s openness to blame.

The fact that for the most part, anybody can follow anybody, along with all the auto-followbacks, makes for a very densely connected graph.  For example, I follow somewhere over 300 people.  They follow several hundred thousand.  The graph that shows the follower relationships of me and all the people I follow has about 1.4 million edges.  That’s a lot to manipulate.  I’ve experimented with removing the people who follow more than 500 or more than 1,000 people, which reduces the graph size considerably, but it is still challenging to analyze on a desktop system.
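The pruning experiment described above can be sketched in a few lines of Python.  This is only an illustration, not the code I actually ran: the edge-list format, the function name and the toy data are all assumptions, and the "following" count here is computed from the edges collected so far rather than from the number Twitter reports for each account.

```python
from collections import defaultdict


def prune_heavy_followers(edges, max_following=500):
    """Drop every account that follows more than max_following accounts.

    edges is an iterable of (follower, followed) pairs.  An account that is
    dropped disappears from both ends of every edge it took part in.
    """
    edges = list(edges)

    # Count how many accounts each user follows within the collected graph.
    following_count = defaultdict(int)
    for follower, _followed in edges:
        following_count[follower] += 1

    heavy = {user for user, n in following_count.items() if n > max_following}
    return [(a, b) for a, b in edges if a not in heavy and b not in heavy]


# Toy example: "bot" follows three accounts, everyone else follows one.
toy_edges = [
    ("bot", "alice"), ("bot", "bob"), ("bot", "carol"),
    ("alice", "bob"), ("bob", "carol"), ("carol", "bot"),
]
pruned = prune_heavy_followers(toy_edges, max_following=2)
print(pruned)
```

A plain edge list keeps memory use predictable, which matters more than elegance once the graph passes a million edges.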

I think my next blog post will focus on the social graph, why it matters and what direction Twitter might take it.  Feel free to ping me with your thoughts; I’m eager to hear them.

I should mention that even as I’m doing all this, I’m still looking for the right “real” job.  I’m focusing mainly on product management related to social media and analytics.


18 Feb 09 That’s the best I can do for you, sir

We had a fairly strong storm come through the Bay Area over the weekend.  On Saturday, a Santa Clara police officer showed up at our door, saying that our phone was dialing 911 and producing static.  I picked up the phone and heard nothing but static.  A few minutes later, it was back to normal and our DSL came back alive.  “Normal” phone service around here has been bad since we bought this house two years ago (bad timing, but I’ll complain about our mortgage later).  The line has had intermittent noise problems that limit our DSL speed and make the voice portion unusable at times.

Monday morning, the phone line died completely.  Dead silence – no dial tone, no DSL.  I looked over our wiring and saw nothing wrong except some insulation rubbed off the CAT5 cable that connects our DSL splitter to the modem.  I taped that over, since it appeared to me that the internal wire pairs were intact.  I called AT&T/Pacific Bell to report the trouble.  Eventually, that is.

Eventually, because we don’t have any phone books in the house and when I tried to use the AT&T web site via my Treo, I found it impossible to navigate. Despite the fact that AT&T is a huge provider of mobile phone services, it doesn’t seem to have a mobile-friendly web site.  You’d think that “mobile.att.com” would be there, but it’s not.  Nor is the even more mobile-friendly “m.att.com”.  What are they thinking?  I finally gave up and dialed 411 and got the number.  After navigating the voice response system, which always seems to request the same information at least twice, I got to a live person, who said they would test the line and if they found a problem, they’d dispatch a technician no later than 8 p.m.  Not 8 p.m. the same day, but 8 p.m. the next day.  Great.

As long as I had them on the phone, I asked to be transferred to the billing department because we have been “crammed” (or “slammed,” I forget which is which) for the second time lately.   This item showed up at the bottom of our billing page:

   Billed on Behalf of BUSINESS TO BUSINESS
   Questions?    Call: 1 888 296-8079

   Item No.  Date   Description                              Amount
   3-01      12-14  BUS. TO BUS. ONLINE,INC-MONTHLY FEE       19.95

   Total The Billing Resource                                 19.95

I called the company and they claimed that we signed up for high-speed Internet service.  Yeah, right.  Just like the crooks who started billing us for voicemail we never ordered.  They promised to refund our money and stop billing us.  But the refund will take two to three months.  Enjoy the interest on my money, crooks. 

Meanwhile, since AT&T had been willing to refund three months of the bogus charges when this happened before, I asked if they would do that for me again.  

No, absolutely impossible, the customer service rep said.  

“But I know it is possible,” I said.  “You did it for me when this happened before.”

“I’m sorry, sir, there’s nothing we can do.”

“But I know that isn’t true, you did it for me before.”

“There’s nothing I can do. We are required by law to allow companies to bill through us.”

Although that bit about the (idiotic) law is true, I pointed out to him that as an individual, I have little influence, but a big company like AT&T surely can lobby to change the law – and surely will if it hits them in the pocketbook.  As long as AT&T can just pass along the charges, they have no incentive to try to make change happen.

Eventually, the customer service rep offered me a $20 refund “as a courtesy.”

“That’s the best you can do?”

“Yes, sir.”

“I guess I’ll take that, then.  But make a note that we’re going to cancel our home phone service because we don’t want this to keep happening.”

“I understand, sir.”  Long pause.  “Sir, since you mentioned cancelling your service and we want to keep you as a customer, I am authorized to refund the entire amount to you.”

Huh????  This from the same guy who told me moments earlier that $20 was the best he could do.  And before that, zero was the best he could do.  I was starting to think that if I hung on longer, they’d give me more than just a refund.  And they did.  With no further prompting, the customer service rep, who sure sounded like he was reading from a script, said, “Since you said that you are going to cancel your service and we want to keep you as a customer, I’m going to give you $5 a month off your phone and internet service for the next 12 months.  Would that be okay?”

Yes, that would be okay.  Except that we’re going to get rid of it anyway, I’m fairly sure.  But I wonder whether, if I had held on longer, they’d eventually have paid me to keep the line.  Crazy.  And no wonder I never believe a customer service rep who says, “There’s nothing more I can do, sir.”

Meanwhile, I figured out how to tether my Treo to my desktop system and go on-line at GPRS speed.  Slow, but better than nothing.  That enabled me to do something I’d been planning anyway – move my mail server and mailing lists to Bluehost.  It was a fire drill, made harder by a lack of access to the Mailman mailing list utilities, but I seem to have gotten it done.  All that’s left at home now is the back end of TwURLed News, which can tolerate brief outages and doesn’t need any incoming connections (which I figure will be a necessity if we switch to a cable modem at home).

The phone service guy didn’t make it by 8 p.m. last night.  He showed up around eight this morning and found a short in my wiring – the spot where the insulation rubbed off.  I was embarrassed that I didn’t replace that wire when I saw it, but hey, it was pouring rain.  I was doing the best I could.  He had us up and running by about 8:30.  About 25 minutes after he left, I got an automated phone call from AT&T saying that they would not be able to send a technician before 8 p.m.  Last night.  Right.

It’s good to be back on-line.  I’m doing everything I can for you from here.  Believe me.

13 Feb 09 Another vertical – Web analytics

I have added a second vertical slice to TwURLed News – web analytics.  It is also available on Twitter at @TwURLedNewsWA.  That’s the good news.

The bad news is that everything was dead for about 12 hours because my hosting company, Bluehost, shut it down for consuming too many CPU cycles.  The culprit was a WordPress plugin that generates XML sitemaps.  It was generating an updated sitemap for every post, with a fairly expensive MySQL query each time.  No more.  The plugin is now set to permit only manual updates and I’ll trigger one every few hours, not at every posting.  That should also make the site more responsive.

Live and learn.


11 Feb 09 First TwURLed News vertical – social media

I’ve just launched a beta version of a “vertical” slice of TwURLed News (formerly Tweetsnet). It is TwURLed News - Social Media. It uses the same infrastructure as the general TwURLed News blog, but focuses on people and pages that tend to be about social media.

I seeded the search system with a number of words, phrases, people and tags that are related to social media.  This includes things such as the phrase “social media,” for example.  The robot periodically searches Twitter for a handful of those terms, which leads it to find people and cited web pages related to the target subject.  I have a fairly long list of other evidence that a tweet or web page is about social media; each tweet, page title and page description is checked against all the evidence.  I’m calling it evidence because, like the other TwURLed News algorithms, this one uses evidential logic to estimate relevancy – the more, the better.  Each bit of evidence is assigned a weight and those weights are combined to measure how related to social media a tweet or page is.
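For readers who like to see such things in code, here is one way weighted evidence could be combined.  It is a sketch under assumptions, not the actual TwURLed News scoring: the terms and weights are made up, and the combining rule (a "noisy-OR," in which each matched item independently reduces the chance that the text is off-topic) is just one choice that fits "the more, the better."

```python
# Hypothetical evidence list: each term maps to a weight between 0 and 1.
EVIDENCE_WEIGHTS = {
    "social media": 0.6,
    "twitter": 0.3,
    "facebook": 0.3,
    "community": 0.2,
}


def relevance(text, weights=EVIDENCE_WEIGHTS):
    """Combine the weights of all matched evidence with a noisy-OR rule.

    The score starts at 0 and approaches 1 as more evidence matches; it never
    decreases when another piece of evidence is found.
    """
    text = text.lower()
    chance_off_topic = 1.0
    for term, weight in weights.items():
        if term in text:  # simple substring match, for illustration only
            chance_off_topic *= 1.0 - weight
    return 1.0 - chance_off_topic


print(relevance("Measuring social media on Twitter"))
print(relevance("Storm knocks out phone lines"))
```

The same function can be applied to a tweet, a page title or a page description, which is how the text above describes the checks being made.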

As the system identifies people, pages and tags that have strong correlations to social media, it should be able to figure out additional evidence, particularly words.  We’ll see how much that can be automated, but I’m hoping that TF/IDF (term frequency/inverse document frequency) will reveal at least some such terms.  One of these days I’ll take a deep breath and use SVD (singular value decomposition) and other clustering techniques to find patterns in the people, pages, tags and words.  I’ve had fairly good success with that in the past, although until lately, I could never figure out a good way to fully automate it.  If Twitter continues to be a place where people retweet and repeat the same URL citations, I have high hopes that a fully automated system will be useful.
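To make the TF/IDF idea concrete, here is a minimal sketch.  The three "documents," the whitespace tokenizer and the function name are assumptions for illustration; a real run would use page titles and descriptions and a proper stopword list.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    Term frequency is the term's share of the document's words; inverse
    document frequency is log(number of documents / documents containing it).
    """
    tokenized = [doc.lower().split() for doc in documents]

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    n_docs = len(tokenized)
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    "social media metrics for brands",
    "social media buzz tracking",
    "phone line repair",
]
doc_scores = tf_idf(docs)
for text, scored in zip(docs, doc_scores):
    print(text, "->", sorted(scored, key=scored.get, reverse=True)[:2])
```

Terms that appear in every document score zero, while terms concentrated in a few documents float to the top, which is the property I am hoping will surface new evidence words.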

If it’s not already obvious, what I’m doing is not very different from Google’s PageRank algorithm, which considers a page more significant if a lot of other pages have links to it.  I’m finding pages cited on Twitter that have a lot of people linked to them, so to speak.  One of Google’s ongoing problems is link spam, which is more or less like the problem TwURLed News faces with aggregators.  It is very easy to spew a ton of URLs, which can make a “person” on Twitter appear more prescient than it really is… but the good news is that it is fairly easy to exclude them.  On the web, it is relatively simple to fake the date and time an article was posted, but not on Twitter.  That means that nobody can fool the system by pretending to have cited a popular page before it became popular.  That’s a big advantage – it prevents quite a few potential spoofing approaches.

I’d love to hear your suggestions for further verticals.  I’m working on one that will cover web analytics, though at first glance, there doesn’t seem to be a lot of #wa talk on Twitter.  (That was a hashtag, for those who don’t Twitter yet.)

08 Feb 09 Technorati profile

Technorati Profile

03 Feb 09 Twitter is a chorus, not a bunch of solos

I have struggled with language to describe the people whose web page citations appear on Tweetsnet.  I started with a simple illustration about influence, the idea that people whose followers have more followers are potentially more influential, that influence is at least a second-order phenomenon.  But that doesn’t really describe what I’m doing.  I also experimented with the word “perceptive,” thinking that people who regularly are among the first to cite web pages that become popular may not be influential, they might just be good at seeing where things are headed.  But that’s not the whole story.

I think I finally found the right phrase when I updated Tweetsnet’s “About” page to say that it looks for people who are “in tune” with what becomes popular.  I see Twitter as a platform where people constantly organize themselves into choruses, amplifying the most pleasing melodies, generating and discovering harmonious ideas.  As with flocking behavior, these choruses have no single leader, but unlike a flock (as far as I know), some people are clearly more “in tune” than others.  Those are the people Tweetsnet seeks to identify – those who most frequently cite web pages that become popular.
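Here is a rough sketch of what counting "in tune" people could look like.  It is not Tweetsnet's actual algorithm: the thresholds, the tuple layout and the example.com URLs are assumptions, and "popular" is simplified to "cited by at least a few different people."

```python
from collections import Counter, defaultdict


def in_tune_scores(citations, min_citers=3, first_k=2):
    """Count how often each user was among the first to cite a popular URL.

    citations is an iterable of (user, url, timestamp) tuples.  A URL counts
    as popular once min_citers distinct users have cited it; the first_k
    earliest distinct citers of each popular URL earn one point apiece.
    """
    citers_by_url = defaultdict(list)
    for user, url, _when in sorted(citations, key=lambda c: c[2]):
        if user not in citers_by_url[url]:
            citers_by_url[url].append(user)

    scores = Counter()
    for users in citers_by_url.values():
        if len(users) >= min_citers:
            scores.update(users[:first_k])
    return scores


citations = [
    ("ann", "http://example.com/a", 1),
    ("bob", "http://example.com/a", 2),
    ("cat", "http://example.com/a", 3),
    ("dan", "http://example.com/a", 4),
    ("bob", "http://example.com/b", 1),
    ("ann", "http://example.com/b", 2),
    ("cat", "http://example.com/b", 5),
    ("dan", "http://example.com/c", 1),
]
scores = in_tune_scores(citations)
print(scores.most_common())
```

Because the timestamps come from Twitter rather than from the person posting, nobody earns points by claiming after the fact to have been early.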

I suspect there is a great opportunity in reporting what the choruses, known and discovered, are singing about.  In other words, monitoring the buzz in sections of an ecosystem of interacting, overlapping shared-interest communities.  This is where I want to take Tweetsnet, generating verticals, starting with known popular subject areas such as social media.  I’m sure there’s a lot of thinking and experimentation to be done about the ways we could define the intertwined borders of such choruses.  One thing I’m sure about – we need to change the way we tend to think about redundancy.

Our left brains tend to think that duplicated effort is inherently wasteful, but the fact is that we are creatures of community.   But here’s the most important idea to take away from the “chorus” metaphor: when a bunch of people act similarly in social media (e.g., post the same URL), it is not redundant, it usually adds value.  That is deeply contrary to the one-to-many 20th century idea of information distribution, in which achieving stardom, not harmony, was the goal.  We still have room for stars, but some of them will be choruses.

Here are some thoughts on features that contribute to Twitter’s “choral value.”

  • Retweeting has very high choral value, as it strengthens the “melody” – people’s deliberate arrangement of information into tweets – and the “harmony” – the commonality among Twitter users that goes beyond simply posting the same information.
  • UI design that makes retweeting easy is good, as long as it doesn’t encourage people to spew everything.
  • Excessive tweeting and retweeting becomes noise – witness the efforts I’ve had to make to remove aggregators from Tweetsnet.  The greatest value is added by the “jazz” tweeters, who have a melody and know how to harmonize, but aren’t afraid to improvise.  In other words, have a focus, but don’t be a robot about it.
  • Anything that shows how much human energy and thought went into a tweet adds value.  Anything that makes it easy to tweet will eventually diminish the value.  This is why the 140-character limit has added value – even headlines often are bigger, forcing people to think about how to squeeze information.
  • Hashtags reduce Twitter’s choral value; as “solos,” they do more to discourage than encourage retweeting.  If a tweet is already tagged, I think people tend to assume there’s no need to retweet because interested people should be monitoring the hashtag.
  • Following somebody only matters if you take action; the main visible action is retweeting.  Blogging about something you found on Twitter would add “choral” value if there were an easy way to discover it.
  • Twitter’s APIs make it fairly easy to track user, URL and word usage, which is good data not just for Twitter’s basic features, but for discovering things we didn’t know to look for.  It’s great that everything is open by default, unlike most other social networking platforms.

What’s the “choral value” you see in Twitter?  What could the company do to further encourage it?

P.S. I’m going to change Tweetsnet’s name to TwURLedNews.


30 Jan 09 Tweetsnet tags: Surprisingly useful

I suppose it is a cliche to say that many useful things have been created unexpectedly, even accidentally.  Here in Silicon Valley, that principle often becomes a problem, as highly creative people see a thousand products or services in their creations, but fail to focus enough to create a viable business.  I know that disease well because I have to fight it constantly.  Right now, however, with Tweetsnet, I’m still in the brainstorming and experimentation phase, when the point is to explore the possibilities.  If it gives rise to a business of some sort, that’ll be just fine, but that’s not the point yet.

The bit of unexpected goodness I’ve noticed in Tweetsnet over the last few days is in the tagging.  The tags and the tag cloud achieve one of my goals – self-organization – even though I didn’t really plan on it.  If I had stopped to think about it, I guess I would have realized it would happen.  It all started when I realized that since I’m fetching page titles from popular Twittered URLs, I could also extract any keywords found on those pages.  I had to hack a Python WordPress XML-RPC library to support tags, but that was no big deal.

Once those tags were working, I realized that I could treat Twitter hashtags as a special case of tagging.  In the Tweetsnet database, tags are identified by source – HTML meta keywords or hashtags.  On the Tweetsnet pages, they all look the same.

When that was working, I found myself staring at the “phrases” that I’m capturing from Twitter.  Those are two-word phrases extracted via some very simple rules – end of sentence detection, a stopwords list, hashtags and user names excluded and so forth.  I noticed that when the same word showed up in more than one of those phrases, it often would be an appropriate tag.  And I noticed that existing tag words often showed up in the phrases, so those get added no matter how frequently they occur.  Any word that shows up in at least three of the phrases is also added as a tag, although I’m not storing those in the database, since they are sometimes a bit odd.
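Those rules are simple enough to sketch.  What follows is an approximation under assumptions, not the production code: the stopword list is tiny, the sentence splitting is naive (it would mangle URLs), and the function names and sample tweets are invented for the example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "to", "is", "on", "in", "and", "for", "from"}


def two_word_phrases(tweet):
    """Return adjacent word pairs, never spanning a sentence end, a
    stopword, a hashtag or a user name."""
    phrases = []
    for sentence in re.split(r"[.!?]+", tweet):
        run = []  # consecutive acceptable words seen so far
        for token in sentence.split():
            word = token.strip(",;:()\"'").lower()
            if token.startswith(("#", "@")) or word in STOPWORDS or not word.isalpha():
                run = []  # an excluded token breaks the phrase
                continue
            run.append(word)
            if len(run) >= 2:
                phrases.append((run[-2], run[-1]))
    return phrases


def tags_from_phrases(phrases, existing_tags, min_phrases=3):
    """Keep words seen in at least min_phrases phrases, plus any existing
    tag word that shows up at all."""
    counts = Counter()
    for phrase in phrases:
        counts.update(set(phrase))
    return {w for w, n in counts.items() if n >= min_phrases or w in existing_tags}


tweets = [
    "Google GDrive rumors heat up.",
    "GDrive launch rumors from @someone #google",
    "Is GDrive storage free?",
]
all_phrases = [p for t in tweets for p in two_word_phrases(t)]
tags = tags_from_phrases(all_phrases, existing_tags={"storage"})
print(sorted(tags))
```

In this toy run, "storage" gets through only because it is already a known tag, which mirrors the rule that existing tag words are added no matter how often they occur.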

The result is a set of tags and a tag cloud that do a pretty good job of finding articles related to a particular topic.  For example, when an article about the rumored GDrive showed up, it was tagged “gdrive,” which I clicked and found two more articles.  Cool.  That’s why I recently increased the size of the Tweetsnet tag cloud widget.

As you may have noticed, I have added links to sites that are doing things similar to Tweetsnet.  One of those, Twitscoop, offers a tag cloud widget, which gave me the idea that perhaps Tweetsnet should do the same.  Soon, I hope.  That would be in keeping with my idea that one of the secrets to success is to notice when you’ve invented something useful, then package it well.

I would be remiss if I didn’t point out that all this would not have happened if I weren’t using WordPress as my platform.  Although it gets in the way sometimes, the features that come for free, including all the third-party themes and widgets, are terrific.  Ditto for Python and all the libraries people write for it.


29 Jan 09 My robot is more popular than I am

I knew this day was coming.  Sometime in the last few hours, Tweetsnet acquired more Twitter followers than my personal Twitter account.  I have 232 and Tweetsnet is now at 256 – and climbing faster than I am.  

I’m happy that there’s increasing evidence that Tweetsnet is useful.  On the other hand, what a strange world this is, in which I can create an automated information source that seems, by one metric, to be more popular than I am.  It seems impersonal and perhaps just plain silly… until I consider that we are creating a world in which increasingly intelligent robots will interact not just with us, but with each other, which will make them (a) stupider, because they will have to deal with rapidly increasing amounts of data and (b) smarter, because we will figure out how to make them take advantage of all that data.

If you’ve been following Tweetsnet or this blog for the last few days, you know that my No. 1 strategic problem (as opposed to various little bugs) is the fact that aggregators – other robots – tend to score quite high in the rankings.  An idealistic part of me wants every Twitter account to self-identify as robot or human… but I know that there’s no hope of compliance with anything like that.  I’m actually more intrigued by the notion that value will arise from writing code that guesses whether or not a user is a robot.  Web analytics has the same problem because some web robots and spiders masquerade as ordinary web browsers.  I spent a lot of time on this problem at LiveWorld, where some of our customers were not too eager to pay for robot page views at the same rate as human page views.

The cool thing about the challenge of distinguishing bots from humans is that we’re essentially collaborating and competing on Turing tests.  People are designing bots to gain influence in the Internet’s social networks, in competition with people who want to filter them out.  As long as bots are dumber than people (and they will be for a long time), this competition will persist and it will drive collaborations that make software smarter.  When we reach the singularity, it will stop mattering… or perhaps it will completely flip, so that the people who were trying to decrease the influence of stupid bots will focus on decreasing the influence of those stupid humans.  Or perhaps it will be a happy collaboration.

Tweetsnet gained its first bunch of followers by following everybody who cited a URL that made it into the feed.  A lot of those people automatically followed it in return.  The recent big spike appears to be driven by the fact that a few Twitter users are now retweeting Tweetsnet items.   That’s a kindness, really, because there’s no reason for them to do so.  They could retweet one of the original tweets.  

I imagine that one reason they give Tweetsnet the credit, so to speak, is that Tweetsnet doesn’t try to drive traffic to itself.  When it posts a tweet, the links in that tweet point directly to the original site, not back to the posting on Tweetsnet.  I get annoyed by tweets that point me to somebody’s site that does nothing more (for me) than provide a link to the site the tweet was really about.  

Meanwhile, today’s project is to keep other people’s robots out of the Tweetsnet scoring – because they are stupid.  The robots, I mean, the robots.
