Most read articles from last week

Here’s a quick list of the most read articles from Poblish over the past week (count in brackets):

I’ll add more statistics to the site when I get the chance.

WordPress plugins: “More like this” from across the blogosphere

Here’s a first look at Poblish’s first WordPress plugin.

It looks at the content of the current blog post, and automatically identifies related content from across all the content hosted at Poblish – currently 216,296 articles from 1,698 working feeds – returning you a list of the most closely matching articles in under a second.
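To give a feel for how this kind of "more like this" matching can work, here is a minimal sketch in Python. It is not the plugin's actual code: the post doesn't describe the matching algorithm, so the sketch assumes a plain TF-IDF weighting with cosine similarity, and the names `tokenize`, `tfidf_vectors`, `cosine`, and `more_like_this` are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict of term -> weight) per document."""
    token_lists = [tokenize(d) for d in docs]
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    n = len(docs)
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            term: count * math.log((1 + n) / (1 + doc_freq[term]))
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def more_like_this(current_post, corpus, top_n=3):
    """Return (corpus index, score) pairs for the closest-matching articles."""
    vectors = tfidf_vectors(corpus + [current_post])
    query = vectors[-1]
    scored = [(i, cosine(query, v)) for i, v in enumerate(vectors[:-1])]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

A real service working across hundreds of thousands of articles would need a prebuilt index rather than recomputing every vector per request; the sketch only shows the scoring idea.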

You can click the name of any blogger or blog to see their live feed (pictured) in a Facebook-style popup frame.

In fact, forget about the screenshot, because you can see the plugin installed on this very blog – just look at the foot of this post, and scroll forward and back through our other posts.

The plugin is stable, but needs to be packaged up a little so that it fits seamlessly into the WordPress world. If you’re impatient to try it out, though, drop me a line and I’ll let you know the two or three steps you need to follow.

Let me know if you have any ideas of your own for developing the plugin. Some of mine are:

  • Ignoring matches from your own blog.
  • Restricting matches by date.
  • Restricting to matches with the same set of tags as the current post – somewhat influenced by Last.fm Radio.
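The three ideas above amount to filters applied to a list of matches. A rough Python sketch follows, assuming a hypothetical `Match` record and a hypothetical `filter_matches` helper (neither exists in the plugin today), and reading "the same set of tags" as "carries at least all of the current post's tags":

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Match:
    """Hypothetical shape for one related-article result."""
    title: str
    blog: str
    published: date
    tags: frozenset = field(default_factory=frozenset)


def filter_matches(matches, own_blog=None, since=None, required_tags=None):
    """Keep only the matches that pass every filter that was supplied."""
    kept = []
    for match in matches:
        if own_blog is not None and match.blog == own_blog:
            continue  # ignore matches from your own blog
        if since is not None and match.published < since:
            continue  # restrict matches by date
        if required_tags is not None and not set(required_tags) <= match.tags:
            continue  # restrict to matches carrying the current post's tags
        kept.append(match)
    return kept
```

Each filter is optional, so a blogger could switch them on one at a time.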

Google Reader integration: share your feeds

All Feed boxes within Poblish now feature a “Subscribe with Google Reader” button.

So, straight away, you can subscribe to:

  • A feed of all activity on Poblish.
  • A feed of all activity for those Actors, Blogs/Feeds, and Groups you follow.
  • A feed for activity for any Actor, Blog/Feed, or Group you choose.
  • A feed of all recent activity (Flags, Favourites, Ratings, Group creations, etc.) on Poblish.
  • Content-tracking feeds, like the one illustrated.

Poblish is all about open data, and interoperability: making it as easy as possible for you to use the content we host, to share it, work with it, build upon it, and to recombine it in new and interesting ways.

I’m currently looking into how we can best use Google Buzz to help us in that mission, as well as finishing the work on our Custom Feeds facility, which will let you build your own combined feeds: some Actors, some Blogs, some content, all your flags and favourites, and so on.

Poblish and the Semantic Web: progress so far

I mentioned last month that Poblish has been using OpenAmplify’s semantic/sentiment analysis service to give technology a shot at making sense of the vast sea of content that is the political blogosphere, in such a way as to help policymakers make better informed decisions. As I’ve said before:

Billions of individual thoughts and personal experiences have been written about, from all conceivable perspectives. No policy process will come up with ideas that have never been thought of before; so existing content represents a knowledge base that should not be ignored

In my piece at Left Foot Forward, earlier this week, I imagined a future in which such tools could take a source article and use this content to automatically, dynamically identify counter-arguments, hopefully before bad policy is made. Well, we have the content, we know that counter-arguments are out there, some of which may very well not yet have crossed the mainstream media’s horizon, and we hope – and believe – that technology can help us find them.

Only a very small percentage of Poblish’s articles have so far been semantically analysed (OpenAmplify are very kindly letting us evaluate their software for free, so the number of articles we process is limited), but all new articles are – and for those articles that have them, Poblish is now displaying the results in the page’s sidebar. Here are the results for the following article.

The way we display the results is simplistic at best, but essentially what we’re showing are the main topics from the article, divided into their relevant category, and coloured as follows:

  • Blue: favourable references (or “polarity”). Dark blue for wholly positive (never negative), light blue for generally positive (but occasionally negative).
  • Red: unfavourable references. Red for wholly negative (never positive), pink for generally negative (but occasionally positive).
  • Grey: neutral references, or a mixture of positive and negative ones.
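As a rough illustration of the colour scheme, here is a small Python sketch that maps counts of positive and negative references for a topic onto a colour, reading red and pink as the negative counterparts of dark and light blue. The function name `polarity_colour`, the use of simple counts, and the 2:1 `dominance` threshold for "generally" are all assumptions for the example; OpenAmplify's actual output and our display code are not shown here.

```python
def polarity_colour(positive, negative, dominance=2.0):
    """Pick a display colour from counts of positive and negative references.

    `dominance` is an assumed threshold: one polarity must outnumber the
    other by at least this factor to count as "generally" that polarity.
    """
    if positive < 0 or negative < 0:
        raise ValueError("reference counts cannot be negative")
    if positive > 0 and negative == 0:
        return "dark blue"   # wholly positive
    if negative > 0 and positive == 0:
        return "red"         # wholly negative
    if positive > 0 and positive >= dominance * negative:
        return "light blue"  # generally positive, occasionally negative
    if negative > 0 and negative >= dominance * positive:
        return "pink"        # generally negative, occasionally positive
    return "grey"            # neutral, or an even mixture
```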

Clearly there are successes and failures in the above list. Sunny Hundal’s name appears as a mere noun, rather than a human name (though I wonder if the fact that his surname was misspelled in the original article is relevant here) and some of the polarities seem a little random.

Bear in mind, though, that each set of results you see was the result of an analysis of one, single article, without any context. Give the tool 200,000 articles, however, and we can be certain that insights will start to massively outweigh mistakes. Context is critical, and – just as we don’t judge people or texts solely on the basis of what we objectively see – semantic applications should not be regarded in isolation, but as part of a vast network of humans and machines, using different techniques to identify and weave links between pieces of information, gradually improving our understanding of them.

All in all, the questions I’m interested in are:

  1. Do we believe semantic analysis can work?
  2. Do we believe that it can reveal insights that it would be impractical for human beings to find?
  3. Do we believe that those insights might be just the ones we need?
  4. Is it worth us investing more in such solutions?

I’d offer a yes to each of those questions, and have had a lot of fun evaluating OpenAmplify, but: what do you think?

Usability improvements

A quick list of recent usability improvements:

  • We now record a “number of times read” count for each article, as well as totals for each blog / blogger [1], which you can see in the sidebar of the article page.
  • All articles now feature a Twitter sharing widget, to go with the one we added for Facebook sharing.
  • We now display links within the content of articles, rather than stripping them out. Command-click (Mac) or control-click (Windows) to open the link target in a new window.

[1] If you’re curious, we count reads from logged-in users, guest users, and search engines. However, an article’s read count only goes up by one per browser session, so refreshing has no effect on the total. Finally, we don’t count reads from the article’s own author or, if the article came from a Group blog, from any of the authors registered as contributors. In short, you can be confident that the read counts are genuine.
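For the curious, the read-counting rules in that footnote can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Article` and `Session` classes and the `record_read` function are hypothetical names, not Poblish's real data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Article:
    """Hypothetical article record."""
    article_id: int
    authors: set  # the author, plus registered contributors for a Group blog
    read_count: int = 0


@dataclass
class Session:
    """Hypothetical browser session; user is None for guests and search engines."""
    user: Optional[str] = None
    seen: set = field(default_factory=set)


def record_read(article, session):
    """Increment the read count if this visit qualifies; return True if it did."""
    if session.user is not None and session.user in article.authors:
        return False  # authors and Group contributors are never counted
    if article.article_id in session.seen:
        return False  # already counted in this session, so a refresh is ignored
    session.seen.add(article.article_id)
    article.read_count += 1
    return True
```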

Aggregated Twitter feeds: @poblishLab and @poblishNI

I’ve added the ability for Poblish to automatically republish articles that it aggregates to single, group Twitter feeds – within minutes of publication. The most obvious use is to connect a Twitter account to a Poblish Group; so we currently have:

  • @poblishLab – the collective output of 801 UK Labour bloggers.
  • @poblishNI – the collective output of 68 political bloggers from Northern Ireland.

For each Tweet we display the original blogger’s Twitter name (else their Poblish username), a summary of their post, a bit.ly link to the original article, plus a group-specific hashtag.
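Putting those pieces together might look something like the Python sketch below. The ordering of the parts, the `compose_tweet` name, and the truncation strategy are assumptions for illustration; the only fixed constraint assumed is Twitter's 140-character limit.

```python
def compose_tweet(summary, link, hashtag,
                  twitter_name=None, poblish_username=None, limit=140):
    """Build a tweet from author, summary, short link and group hashtag."""
    if twitter_name:
        author = "@" + twitter_name
    elif poblish_username:
        author = poblish_username  # fall back to the Poblish username
    else:
        raise ValueError("need a Twitter name or a Poblish username")

    prefix = f"{author}: "
    suffix = f" {link} #{hashtag}"
    room = limit - len(prefix) - len(suffix)
    if room < 1:
        raise ValueError("no room left for a summary")

    # Trim the summary, never the link or hashtag, when the tweet is too long.
    if len(summary) > room:
        summary = summary[:room - 1].rstrip() + "…"
    return prefix + summary + suffix
```

For example, `compose_tweet("Short post", "http://bit.ly/abc", "examplegroup", poblish_username="jane")` gives `jane: Short post http://bit.ly/abc #examplegroup` (the link and hashtag here are placeholders).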

In fact, the facility is flexible enough to republish the results of arbitrary content queries (e.g. all references to ‘Obama’, or ‘Gordon Brown’), custom feeds, or arbitrary collections of blogs.

Overall, the aim is to ‘free up’ the political data that Poblish is curating, and get a wider audience for the bloggers that we feature.