Poblish app 1.0.2

The new version of our free, political feed-reading iPhone app has been submitted, and should be ready for download within the next few days.

What’s new? Well, you can access your favourite blogs and topics, and re-run your favourite searches, all with one click, using the new Tagged Feeds feature. Adding new favourites is also just one click!

I’ve also removed Ads from the app, and you should find that recent topic results are more accurate than ever.

Thanks to OpenAmplify for their link to the current version (1.0.1) of the app. Need to get the word out more!

New Article features

Poblish has always provided a “more articles like this” facility for every article on the system – not just related articles from that blog or Twitter feed, but related articles from all blogs and Twitter feeds. This list used to appear next to each article, crammed into a column that was always just a little too narrow to make the list truly usable, so I’ve moved it to a new screen which you can pop up using the big “Explore…” button.

We’ve also restored the “Similar Bloggers” facility and put it alongside the list of articles, to help you explore other bloggers who deal with similar themes. Finally, if you’re logged in, you’ll find your own individual list of recommended articles. This uses the latest collaborative filtering techniques to suggest a list of articles based on your own ratings, flags, and favourites, as well as those of people with tastes similar to your own.

Above the Explore button, you’ll see what looks like a “tag cloud” for the article. However, what you’re seeing is much cleverer than what 99% of other applications offer. We use semantic analysis to determine the article’s key themes, or “Zones”, rather than simply relying on the categories the blogger chose; we rank them according to how often they have been mentioned during the past 24 hours; and we provide a link to the Zone’s home page, where you can see – and follow – a feed of matching articles.
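As a rough illustration of the ranking step, here is a minimal Python sketch that orders an article's Zones by how often each was mentioned during the past 24 hours. The function name, the shape of the `mentions` data, and the alphabetical tie-break are all assumptions made for the example; this is not Poblish's actual code.

```python
from collections import Counter
from datetime import datetime, timedelta


def rank_zones(article_zones, mentions, now, window=timedelta(hours=24)):
    """Order an article's Zones by recent mention count (hypothetical helper).

    article_zones: Zone names identified in the article.
    mentions: iterable of (zone_name, timestamp) pairs from all articles.
    """
    cutoff = now - window
    # Count only mentions of this article's Zones that fall inside the window.
    counts = Counter(
        zone for zone, seen_at in mentions
        if zone in article_zones and seen_at >= cutoff
    )
    # Most-mentioned first; ties broken alphabetically for a stable order.
    return sorted(article_zones, key=lambda zone: (-counts[zone], zone))
```

Mentions older than the window are simply ignored, so a Zone that was busy last week but quiet today drops down the list.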

The point of all this is to seamlessly weave articles – whether blog posts or Twitter posts – into the greater and wider world of political content, using state-of-the-art techniques, and to make it easier than ever for people to explore and to learn.

Churnalism.com

Churnalism.com is an independent, non-profit website built to help the public distinguish between original journalism and ‘churnalism’ – where what appears to be a genuinely journalistic newspaper or online news article turns out to be a recycled press-release, quite possibly from a special-interest group, or self-interested campaigning organisation.

It’s partly because of this habit of journalists – the BBC Health site has been a particular bugbear in the past – that I created the “Positive Political Blogging” campaign, last year. My goals were to mobilise bloggers against churnalism, and to produce a system by which – with the help of a bit of technology – the output of bloggers could become a replacement news service, one whose output was much higher quality, more varied, and less biased than what journalists of the big newspapers and online news sites could find elsewhere (“Harnessing the distributed intelligence of the blogosphere”, I called it).

Back to Churnalism: their service allows people to run comparisons of press releases – indeed any news article – against the huge Journalisted archive of online newspaper articles. It’ll point out any sections that the journalist seems to have copied and pasted from the article you supply, and give you a score to show just how much of a paste job it was. Why not try some examples? Even better, Churnalism has an API that allows developers of other sites to hook into this service.

With that in mind, I’ve hooked Poblish up to try out the new API, and you can see the results on any article page. You can try this one, for starters. The numbers suggest that there were no fewer than 2662 similarities with the Guardian article, which is pretty convincing evidence of widespread copying and pasting.

*

In case you’re wondering, Poblish strips out all quotations before passing articles to Churnalism – that way we don’t flag up articles that, quite correctly, refer to the original article or press-release. By contrast, pasting without quotations, without analysis, and without evidence of original thought, is pretty much what this campaign is all about. We expect journalists to do this essential part of their job, just as we hope that bloggers apply much the same principles.
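To give a flavour of the quotation-stripping step, here is a minimal Python sketch. It assumes the article is already plain text and that quotations are marked with straight or curly double quotes; the function name is hypothetical, and HTML blockquotes, single-quoted passages, and quotations with a missing closing mark are not handled.

```python
import re

# Straight double quotes and curly double quotes. Single quotes are left
# alone because they are indistinguishable from apostrophes.
_QUOTED = re.compile(r'"[^"]*"|“[^”]*”')


def strip_quotations(text):
    """Remove double-quoted passages and tidy the whitespace left behind."""
    without_quotes = _QUOTED.sub(" ", text)
    return re.sub(r"\s+", " ", without_quotes).strip()
```

Only what survives this step would then be submitted for comparison, so a post that properly quotes a press release is judged on the blogger's own words.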

Now, you might be thinking: “Hang on, is Poblish just comparing blog posts and news articles with other news articles? How about tracing these articles back to the hack or PR who first created them?”  OK, there’s an element of truth there, but as Poblish expands its coverage I’d like to see us aggregating more of the press-releases and think-pieces too, and to use our existing – and Churnalism’s new – analysis tools to make this kind of research a breeze for readers.

*

Final thoughts: yes, I’m very impressed with the Churnalism API, though I’d really like to see, if possible:

  • Article titles, not just URLs (see above).
  • Links back to Churnalism’s own beautifully user-friendly result pages – just showing the number of matches isn’t very compelling – and / or:
  • More results and statistics I could render myself.

All in all, a great start! Hope to show you more developments in due course…

Firefox search plugin for Poblish

Poblish now supports OpenSearch, which means that when visiting the site using Firefox (and possibly other browsers, apparently including Internet Explorer 7 and later), a menu item named “Add Poblish Search” appears in the search engines menu of your browser toolbar.

Once selected, a new Poblish search engine appears in your toolbar’s menu, like so.

This gives you the ability to search Poblish from any other site. Find something political you’re interested in? A person, a place, or a policy? Type them in the box wherever you see our icon, and hit return. Something like this will soon appear:

Visualising political content with Wordle

I’ve been inspired by Leigh Caldwell’s Economics Zeitgeist word clouds to hook Poblish up to the wonderful Wordle.

Now you can visualise any Poblish feed with just a single click.

So, here’s Wordle’s visualisation of our most recent incoming articles from the past two days (click for full-size version).

Here are the results for a group, e.g. our US Political bloggers (past 4 days of activity).

Some other feeds you can try:


All images created by the Wordle.net web application are licensed under a Creative Commons Attribution 3.0 United States License.

WordPress plugins: “More like this” from across the blogosphere

Here’s a first look at Poblish’s first WordPress plugin.

It looks at the content of the current blog post, and automatically identifies related content from across all the content hosted at Poblish – currently 216,296 articles from 1,698 working feeds – returning you a list of the most closely matching articles in under a second.
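The post doesn't go into how the matching works, so purely as an illustration of the general idea, here is a small Python sketch that ranks articles by bag-of-words cosine similarity to the current post. This is a deliberately simple stand-in technique, not Poblish's method, and every name in it is hypothetical.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lower-case the text and count its words."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def more_like_this(post, articles, limit=5):
    """Return the titles of the closest matches; articles maps title to text."""
    target = word_counts(post)
    scored = [(cosine(target, word_counts(text)), title)
              for title, text in articles.items()]
    # Drop articles with no words in common, then sort best match first.
    scored = [(score, title) for score, title in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]
```

A real service covering a couple of hundred thousand articles would need an index rather than a linear scan, and would probably weight rare words more heavily than common ones.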

You can click the name of any blogger or blog to see their live feed (pictured) in a Facebook-style popup frame.

In fact, forget about the screenshot, because you can see the plugin installed on this very blog – just look at the foot of this post, and scroll forward and back through our other posts.

The plugin is stable, but needs to be packaged-up a little so it fits seamlessly into the WordPress world. If you’re impatient to try it out, though, drop me a line and I’ll let you know the two or three steps you need to follow.

Let me know if you have any ideas of your own for developing the plugin. Some of mine are:

  • Ignoring matches from your own blog.
  • Restricting matches by date.
  • Restricting to matches with the same set of tags as the current post – somewhat influenced by Last.fm Radio.

Google Reader integration: share your feeds

All Feed boxes within Poblish now feature a “Subscribe with Google Reader” button.

So, straight away, you can subscribe to:

  • A feed of all activity on Poblish.
  • A feed of all activity for those Actors, Blogs/Feeds, and Groups you follow.
  • A feed for activity for any Actor, Blog/Feed, or Group you choose.
  • A feed of all recent activity (Flags, Favourites, Ratings, Group creations, etc.) on Poblish.
  • Content-tracking feeds, like the one illustrated.

Poblish is all about open data, and interoperability: making it as easy as possible for you to use the content we host, to share it, work with it, build upon it, and to recombine it in new and interesting ways.

I’m currently looking into how we can best use Google Buzz to help us in that mission, as well as finishing the work on our Custom Feeds facility, which will let you build your own combined feeds: some Actors, some Blogs, some content, all your flags and favourites, and so on.

Poblish and the Semantic Web: progress so far

I mentioned last month that Poblish has been using OpenAmplify’s semantic/sentiment analysis service to give technology a shot at making sense of the vast sea of content that is the political blogosphere, in such a way as to help policymakers make better informed decisions. As I’ve said before:

Billions of individual thoughts and personal experiences have been written about, from all conceivable perspectives. No policy process will come up with ideas that have never been thought of before; so existing content represents a knowledge base that should not be ignored

In my piece at Left Foot Forward, earlier this week, I imagined a future in which such tools could take a source article and use this content to automatically, dynamically identify counter-arguments, hopefully before bad policy is made. Well, we have the content, we know that counter-arguments are out there, some of which may very well not yet have crossed the mainstream media’s horizon, and we hope – and believe – that technology can help us find them.

Only a very small percentage of Poblish’s articles have so far been semantically analysed (OpenAmplify are very kindly letting us evaluate their software for free, so the number of articles we process is limited), but all new articles are – and for those articles that have them, Poblish is now displaying the results in the page’s sidebar. Here are the results for the following article.

The way we display the results is simplistic at best, but essentially what we’re showing are the main topics from the article, divided into their relevant category, and coloured as follows:

  • Blue: favourable references (or “polarity”). Dark blue for wholly positive (never negative), light blue for generally positive (but occasionally negative).
  • Red: unfavourable references. Red for wholly negative (never positive), pink for generally negative (but occasionally positive).
  • Grey: neutral references, or a mixture of positive and negative ones.
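The colour scheme can be sketched as a small Python function, reading red as wholly negative and pink as generally negative. The inputs (counts of positive and negative references to a topic) and the simple majority rule for "generally" are assumptions for illustration; the real display logic may use different thresholds.

```python
def polarity_colour(positive, negative):
    """Map counts of positive and negative references to a display colour."""
    if positive > 0 and negative == 0:
        return "dark blue"   # wholly positive
    if negative > 0 and positive == 0:
        return "red"         # wholly negative
    if positive > negative:
        return "light blue"  # generally positive, occasionally negative
    if negative > positive:
        return "pink"        # generally negative, occasionally positive
    return "grey"            # neutral, or an even mixture of both
```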

Clearly there are successes and failures in the above list. Sunny Hundal’s name appears as a mere noun, rather than a human name (though I wonder if the fact that his surname was misspelled in the original article is relevant here) and some of the polarities seem a little random.

Bear in mind, though, that each set of results you see was the result of an analysis of one, single article, without any context. Give the tool 200,000 articles, however, and we can be certain that insights will start to massively outweigh mistakes. Context is critical, and – just as we don’t judge people or texts on the basis of what we objectively see – semantic applications should not be regarded in isolation, but as part of a vast network of humans and machines, using different techniques to identify and weave links between pieces of information, gradually improving our understanding of them.

All in all, the questions I’m interested in are:

  1. Do we believe semantic analysis can work?
  2. Do we believe that it can reveal insights that it would be impractical for human beings to find?
  3. Do we believe that those insights might be just the ones we need?
  4. Is it worth us investing more in such solutions?

I’d offer a yes to each of those questions, and have had a lot of fun evaluating OpenAmplify, but: what do you think?

A new vision for blogging, and content-based policy crowdsourcing

This is the third in a series of posts on the subject of ‘How the semantic web can crowdsource high-quality judgment and improve policymaking’. In part 2, last week, I described how existing content – the blogosphere, in particular – is currently used, or perhaps abused, by policymakers.

This time, I’m going to cover a range of improvements: how we can make better use of existing content, why we’d want to do so, and I’m going to roughly split these into: (a) technical solutions, and (b) human solutions.

(i) Technology: Aggregation vs. isolation

Political blog aggregators are still very rare, especially in the UK. Creating and maintaining an application that is able to monitor hundreds or thousands of feeds, and produce new, aggregated feeds in a timely fashion, is neither trivial nor cheap. Nonetheless, when I created Bloggers4Labour in early 2005, I showed that usable aggregators were both possible, and – certainly at the time – desirable. By providing the media with a single window onto a wide range of blogging opinion, the blogging oligarchy I mentioned last week could perhaps have been broken.

Only when all blogs are aggregated – on an equal footing, and irrespective of their political affiliation and their nationality – can the blogosphere become the comprehensive, fair, and effective knowledge base it needs to be. We don’t want to throw contextual information away, but rather than let it entrench artificial barriers, we should let technology draw its own, more useful inferences.

Thus aggregation should become the norm, rather than the exception – or rather, the least we should expect. Furthermore, bloggers should be encouraged to leave the safety of their partisan networks, and become global political actors.

(ii) Technology: Breaking-down barriers

Rather than being bound by technological limitations and by non-interoperable software tools, and rather than advocating one particular package or way of working, any new crowdsourcing platform should use technology to enable everyone concerned with policy development to participate in a more informed and productive way.

Imagine a knowledge base that not only lets you see related content for any article you read, but that automatically updates you with content as you start to develop a new article. You might discover articles that refuted the argument you just made, that provided you with valuable supportive evidence, or that caused your article to take a different path. Imagine how easy it would be for a policy to have been decided-upon without those crucial points ever having been made, and how expensive and time-consuming a failed policy like that could be.

The old ‘linear’ aggregator model – with its single time-line of unrelated blog posts – is not much help here. Only by bringing all types of expressed opinion together on an equal basis, collapsing the distinctions between the various types, and replacing single time-lines with a web of matched, linked, and related information, can we achieve a really usable knowledge base, that’s easy to visualise and to navigate.

Debategraph-style maps, collaboratively edited documents and Wikis, and aggregated blog content will all be represented in this web. There may well also be a place for Twitter messages and open-source Government data. The overall goal should be to let structured data and mappings bring precision to blog posts, and to let blog posts bring context and detail to structured debates.

(iii) Technology: The Semantic Web

Technical solutions that understand the content they are given will always produce more relevant results than the 99% that don’t. Furthermore, solutions that use sentiment analysis can identify whether a particular individual, or concept, is being talked about in a positive, neutral, or negative light. This opens up the possibility of being able to automatically identify supporting or contradictory evidence for policies mentioned in existing articles, and in new policy documents as they are created. Once again, technology plus existing content can be used to support good policy, strike out bad policy, and save time and effort, not to mention embarrassment.

(iv) Human crowdsourcing: Collaborative editing

Collaborative editing – currently a niche interest – should become the norm, in contrast to the disjointed, sequential model of blog-commenting that is popular today. It is literally vital because it adds value, and adds life to already expressed opinion. The blog post of last year – that was overtaken by events and discredited – can be transformed into the post that acknowledges its original mistakes, assimilates new information, and becomes a valuable addition to the policy debate.

Collaborative editing also accustoms bloggers to a new way of working: by exposing them to scrutiny it encourages more thought and greater responsibility, but at the same time it rewards the extra effort, by giving bloggers – especially new ones, those who are less well-connected, and therefore those who might have the most original ideas – the encouragement that their output is being read and considered by a wider audience than before. While firing off posts into the ether can be cathartic, my experience tells me that bloggers do prefer to be engaged in a greater debate.

In future, contributors will adapt an existing blog post – working within the existing context – and create new branches, or sub-versions, that other contributors can approve and rate, and use as the basis for their own versions. Over time, the most active, the most popular, and the most highly regarded versions will rise to the surface. It may be that these versions will be quite different from one another – after all, while agreement and resolution are fine things, political disagreement can also be valuable, and these versions will be much more useful themselves than the undistilled thoughts of just one blogger.

There is no reason why those used to the current model of blog commenting should not contribute by adding their suggestions at the foot of the original article, rather than working within the framework of the original. Potentially useful insights should not be lost, even if they cannot immediately be related to the existing content. The important thing is that contributors are not limited – or forced to work in a particular way – by technology that dates back to the early days of the Web.

(v) Human crowdsourcing: Juries, assertion-flagging, and data cleanup

There’s a lot more humans can do with a crowdsourcing platform besides creating new content (individually or collectively), flagging, and rating.

The platform can invite – or randomly select – disinterested participants (i.e. those who don’t have a personal connection with the issue at hand) to work together on a particular debate, marking up relevant arguments, marking down irrelevant arguments, linking similar ones, and perhaps trying to find resolutions in other areas: essentially doing things that are just too tricky for a computer to do. The Guardian’s recent, and very successful, crowdsourced MPs’ expenses exercise is a good example of this. Provide users with an incentive to donate their time and brainpower to the community, and great benefits can be reaped.

Another task humans can perform is to manually tag assertions within articles they read, and ask the platform to contact the original author / blogger so that they can respond with supporting evidence. Those who respond satisfactorily will be given credit for having done so, and their response will be attached to the original article, taking its place in the knowledge base for others to consult.

(vi) Conclusion

I hope I’ve succeeded in setting out a brighter vision of how crowdsourcing can improve policymaking, making it better informed and more efficient; how technology can be used more, and more effectively; how political blogging has a potentially enormous part to play; and how bloggers have a lot to gain by getting involved with a new crowdsourcing platform.

(Originally posted here, on January 26th, 2010.)

Crowdsourcing new policies, and why blogging has to change

This is the second in a series of posts on the subject of ‘How the semantic web can crowdsource high-quality judgment and improve policymaking’. Last week I made the case for using existing content – blog posts; Wikis, like Debatepedia; and visual debate-mapping tools, like Debategraph – as a knowledge base to drive new policy exercises, and introduced you to my new project, Poblish, which demonstrates this.

This time, I’m going to cover how existing content – the blogosphere, in particular – is currently used, and just how bad the situation is.

Blogging and personality

Individualistic political blogging dominates the collaborative alternatives because of its quantity rather than its quality, and because of personality rather than because of the arguments made. ‘Reputation’ within the blogging world is too often self-fulfilling, and technological limitations – combined with the laziness of politicians and the media – have created an oligarchy of ‘go to’ bloggers.

While the minds of journalists are not entirely closed to newcomers, it’s undeniable that the opinions of a couple of dozen ‘power bloggers’ carry more weight than all others put together. Where the strong preferences of a small minority dominate the weak preferences of the majority, democracy suffers.

Not only does this conceal the richness and diversity of the blogosphere in favour of accepted wisdom and conventional categories (‘Labour bloggers say…’), it corrupts both readers and writers. The priority of these bloggers gradually turns towards reportage – being ‘newsworthy’, breaking stories, filtering gossip, tracking trends, and developing their own ‘brand’ and influence. As their fame spreads, they draw traffic away from less well-connected blogs, encouraging readers to leave comments among a sea of others, rather than take the time to develop their thoughts more fully elsewhere.

While aggregators held out the possibility of providing readers with a single window onto a wide range of blogging opinion, the result has generally been to tie bloggers to their own political party.

Lack of interactivity

The level of interactivity on blogs has barely advanced during the past five years. Although all blogging platforms now offer a commenting facility, and some allow comments to be nested below others, comments continue to sit apart from the original article. They cannot refer to particular sections in the original, even though useful contributions are far more likely to relate to specific sections of the original rather than the generality. (Services like this do exist, but they are very far from mainstream blog tools.)

By being outside the context of the original, the mental pressure – to understand the original, and to constructively contribute – is taken off the contributor, but shifted onto the original blogger, who must attempt to understand and ‘re-contextualise’ the commenter’s addition before he can move his own argument on. What should be an interactive process becomes a sequential one, and all the slower and more time-consuming as a result.

Finally, the noise-to-signal ratio of comments can become enormous as a blog increases in popularity, unless strict controls or voluntary ‘codes of conduct’ are in place.

Lack of collaboration

Collaborative alternatives potentially provide more valuable content than blogs: more focus; less duplication; less pressure to be ‘journalistic’; a fairer balance between contributors; as well as a less ‘noisy’ experience. However, the very fact that they ask more of contributors makes them more expensive to create, and therefore thinner on the ground. This, in turn, can make collaborative editing seem a lonely experience. This situation will likely continue until there are efforts to break down barriers between the two types of content. Aggregators are of little help here, as they perpetuate the idea of a single time line of unrelated articles, in stark contrast to the ‘world wide web’.

Isolation and insulation

New blogs begin life in complete isolation and need to build connections with others if they are to keep their enthusiasm going. They need blogging friends, and they need encouragement. However, until a true blogging political hub appears, new bloggers often find themselves locked into political party silos, isolating themselves from the much wider external audience. A parallel incentive is for people to insulate themselves in order to avert the discomfort they feel when confronted with deeply contrary opinions and threats to their world-view. More often than not, it is unregulated comment-boxes that fuel this, rather than the behaviour of other bloggers.

Conclusion

When reputation becomes detached from quality; when friendship, like-mindedness, and convention determine the success of a blog and the popularity of its content; and when atomisation rather than interaction is the norm, the result must be a homogenisation of ideas, and a greater chance that rare but brilliant insights will be missed. This is the opposite of what we’re looking for.

In the next post I’ll be explaining how Poblish tries to address each of these problems, and how policy-making can be made more informed, more efficient, more constructive, and also more satisfying.

(Originally posted here, on January 18th, 2010.)