A new vision for blogging, and content-based policy crowdsourcing

This is the third in a series of posts on the subject of ‘How the semantic web can crowdsource high-quality judgment and improve policymaking’. In part 2, last week, I described how existing content – the blogosphere, in particular – is currently used, or perhaps abused, by policymakers.

This time, I’m going to cover a range of improvements: how we can make better use of existing content, and why we’d want to do so. I’ll roughly split these into: (a) technical solutions, and (b) human solutions.

(i) Technology: Aggregation vs. isolation

Political blog aggregators are still very rare, especially in the UK. Creating and maintaining an application that is able to monitor hundreds or thousands of feeds, and produce new, aggregated feeds in a timely fashion, is neither trivial nor cheap. Nonetheless, when I created Bloggers4Labour in early 2005, I showed that usable aggregators were both possible, and – certainly at the time – desirable. By providing the media with a single window onto a wide range of blogging opinion, the blogging oligarchy I mentioned last week could perhaps have been broken.
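To make the basic idea concrete, here is a minimal sketch of the merging step at the heart of an aggregator. It is not the Bloggers4Labour code: the `Post` type, the `aggregate` function, and the sample posts are all hypothetical, and it assumes each feed has already been fetched, parsed, and sorted newest-first.

```python
import heapq
from datetime import datetime
from itertools import islice
from typing import Iterable, NamedTuple


class Post(NamedTuple):
    published: datetime
    blog: str
    title: str


def aggregate(feeds: Iterable[list[Post]], limit: int = 10) -> list[Post]:
    """Merge several feeds (each already newest-first) into one newest-first feed."""
    merged = heapq.merge(*feeds, key=lambda post: post.published, reverse=True)
    return list(islice(merged, limit))


# Hypothetical sample data; a real aggregator would fetch and parse RSS/Atom feeds.
feed_a = [
    Post(datetime(2010, 1, 26, 9, 0), "Blog A", "On crowdsourcing"),
    Post(datetime(2010, 1, 20, 18, 30), "Blog A", "Budget notes"),
]
feed_b = [
    Post(datetime(2010, 1, 25, 12, 0), "Blog B", "A reply on policy"),
]

combined = aggregate([feed_a, feed_b])
for post in combined:
    print(post.published.isoformat(), post.blog, post.title)
```

The expensive parts in practice are the ones this sketch leaves out: polling hundreds of feeds reliably, coping with malformed XML, and de-duplicating posts.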

Only when all blogs are aggregated – on an equal footing, and irrespective of their political affiliation and their nationality – can the blogosphere become the comprehensive, fair, and effective knowledge base it needs to be. We don’t want to throw contextual information away, but rather than let it entrench artificial barriers, we should let technology draw its own, more useful inferences.

Thus aggregation should become the norm, rather than the exception – or rather, the least we should expect. Furthermore, bloggers should be encouraged to leave the safety of their partisan networks, and become global political actors.

(ii) Technology: Breaking down barriers

Rather than being bound by technological limitations and by non-interoperable software tools, and rather than advocating one particular package or way of working, any new crowdsourcing platform should use technology so that everyone concerned with policy development can participate in a more informed and productive way.

Imagine a knowledge base that not only lets you see related content for any article you read, but that automatically updates you with content as you start to develop a new article. You might discover articles that refuted the argument you just made, that provided you with valuable supportive evidence, or that caused your article to take a different path. Imagine how easy it would be for a policy to have been decided-upon without those crucial points ever having been made, and how expensive and time-consuming a failed policy like that could be.

The old ‘linear’ aggregator model – with its single time-line of unrelated blog posts – is not much help here. Only by bringing all types of expressed opinion together on an equal basis, collapsing the distinctions between the various types, and replacing single time-lines with a web of matched, linked, and related information, can we achieve a really usable knowledge base that’s easy to visualise and to navigate.

Debategraph-style maps, collaboratively edited documents and Wikis, and aggregated blog content will all be represented in this web. There may well also be a place for Twitter messages and open-source Government data. The overall goal should be to let structured data and mappings bring precision to blog posts, and to let blog posts bring context and detail to structured debates.

(iii) Technology: The Semantic Web

Technical solutions that understand the content they are given will always produce more relevant results than the 99% that don’t. Furthermore, solutions that use sentiment analysis can identify whether a particular individual, or concept, is being talked about in a positive, neutral, or negative light. This opens up the possibility of being able to automatically identify supporting or contradictory evidence for policies mentioned in existing articles, and in new policy documents as they are created. Once again, technology plus existing content can be used to support good policy, strike out bad policy, and save time and effort, not to mention embarrassment.

(iv) Human crowdsourcing: Collaborative editing

Collaborative editing – currently a niche interest – should become the norm, in contrast to the disjointed, sequential model of blog-commenting that is popular today. It is literally vital because it adds value, and adds life to already expressed opinion. The blog post of last year – that was overtaken by events and discredited – can be transformed into the post that acknowledges its original mistakes, assimilates new information, and becomes a valuable addition to the policy debate.

Collaborative editing also accustoms bloggers to a new way of working: by exposing them to scrutiny it encourages more thought and greater responsibility, but at the same time it rewards the extra effort, by giving bloggers – especially new ones, those who are less well-connected, and therefore those who might have the most original ideas – the encouragement that their output is being read and considered by a wider audience than before. While firing off posts into the ether can be cathartic, my experience tells me that bloggers do prefer to be engaged in a greater debate.

In future, contributors will adapt an existing blog post – working within the existing context – and create new branches, or sub-versions, that other contributors can approve and rate, and use as the basis for their own versions. Over time, the most active, the most popular, and the most highly regarded versions will rise to the surface. It may be that these versions will be quite different from one another – after all, while agreement and resolution are fine things, political disagreement can also be valuable, and these versions will be much more useful themselves than the undistilled thoughts of just one blogger.
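One way to picture this branching model is as a tree of versions, each carrying its own ratings. The sketch below is purely illustrative: the `Version` class, its methods, and the ranking rule (average rating, then number of ratings) are assumptions made for the example, not a description of how any existing platform stores its data.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass(eq=False)
class Version:
    author: str
    text: str
    parent: Optional["Version"] = None
    ratings: list[int] = field(default_factory=list)
    children: list["Version"] = field(default_factory=list)

    def branch(self, author: str, text: str) -> "Version":
        """Create a new sub-version that builds on this one."""
        child = Version(author, text, parent=self)
        self.children.append(child)
        return child

    def rate(self, score: int) -> None:
        self.ratings.append(score)

    def average(self) -> float:
        return sum(self.ratings) / len(self.ratings) if self.ratings else 0.0


def all_versions(root: Version) -> Iterator[Version]:
    """Walk the whole tree, starting from the original post."""
    yield root
    for child in root.children:
        yield from all_versions(child)


def top_versions(root: Version, n: int = 3) -> list[Version]:
    """Rank versions by average rating, then by how many ratings they have."""
    ranked = sorted(
        all_versions(root),
        key=lambda v: (v.average(), len(v.ratings)),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical example: one original post and three branches.
original = Version("alice", "Original post")
evidence = original.branch("bob", "Adds new supporting evidence")
dissent = original.branch("carol", "Argues the opposite case")
followup = evidence.branch("dave", "Tightens Bob's wording")

original.rate(3)
evidence.rate(5)
evidence.rate(4)
dissent.rate(4)

for version in top_versions(original, 2):
    print(version.author, version.average())
```

Note that the dissenting branch is kept and ranked alongside the others, rather than being merged away, which reflects the point that disagreement can be valuable in its own right.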

There is no reason why those used to the current model of blog commenting should not contribute by adding their suggestions at the foot of the original article, rather than working within the framework of the original. Potentially useful insights should not be lost, even if they cannot immediately be related to the existing content. The important thing is that contributors are not limited – or forced to work in a particular way – by technology that dates back to the early days of the Web.

(v) Human crowdsourcing: Juries, assertion-flagging, and data cleanup

There’s a lot more humans can do with a crowdsourcing platform besides creating new content (individually or collectively), flagging, and rating.

The platform can invite – or randomly select – disinterested participants (i.e. those who don’t have a personal connection with the issue at hand) to work together on a particular debate, marking up relevant arguments, marking down irrelevant arguments, linking similar ones, and perhaps trying to find resolutions in other areas: essentially doing things that are just too tricky for a computer to do. The Guardian’s recent, and very successful, crowdsourced MPs’ expenses exercise is a good example of this. Provide users with an incentive to donate their time and brainpower to the community, and great benefits can be reaped.

Another task humans can perform is to manually tag assertions within articles they read, and ask the platform to contact the original author / blogger so that they can respond with supporting evidence. Those who respond satisfactorily will be given credit for having done so, and their response will be attached to the original article, taking its place in the knowledge base for others to consult.

(vi) Conclusion

I hope I’ve succeeded in setting out a brighter vision of how crowdsourcing can improve policymaking, making it better informed and more efficient; how technology can be used more, and more effectively; how political blogging has a potentially enormous part to play; and how bloggers have a lot to gain by getting involved with a new crowdsourcing platform.

(Originally posted here, on January 26th, 2010.)

Crowdsourcing new policies, and why blogging has to change

This is the second in a series of posts on the subject of ‘How the semantic web can crowdsource high-quality judgment and improve policymaking’. Last week I made the case for using existing content – blog posts; Wikis, like Debatepedia; and visual debate-mapping tools, like Debategraph – as a knowledge base to drive new policy exercises, and introduced you to my new project, Poblish, which demonstrates this.

This time, I’m going to cover how existing content – the blogosphere, in particular – is currently used, and just how bad the situation is.

Blogging and personality

Individualistic political blogging dominates the collaborative alternatives because of its quantity rather than its quality, and because of personality rather than because of the arguments made. ‘Reputation’ within the blogging world is too often self-fulfilling, and technological limitations – combined with the laziness of politicians and the media – have created an oligarchy of ‘go to’ bloggers.

While the minds of journalists are not entirely closed to newcomers, it’s undeniable that the opinions of a couple of dozen ‘power bloggers’ carry more weight than all others put together. Where the strong preferences of a small minority dominate the weak preferences of the majority, democracy suffers.

Not only does this conceal the richness and diversity of the blogosphere in favour of accepted wisdom and conventional categories (‘Labour bloggers say…’), it corrupts both readers and writers. The priority of these bloggers gradually turns towards reportage – being ‘newsworthy’, breaking stories, filtering gossip, tracking trends, and developing their own ‘brand’ and influence. As their fame spreads, they draw traffic away from less well-connected blogs, encouraging readers to leave comments among a sea of others, rather than take the time to develop their thoughts more fully elsewhere.

While aggregators held out the possibility of providing readers with a single window onto a wide range of blogging opinion, the result has generally been to tie bloggers to their own political party.

Lack of interactivity

The level of interactivity on blogs has barely advanced during the past five years. Although all blogging platforms now offer a commenting facility, and some allow comments to be nested below others, comments continue to sit apart from the original article. They cannot refer to particular sections in the original, even though useful contributions are far more likely to relate to specific sections of the original rather than the generality. (Services like this do exist, but they are very far from mainstream blog tools.)

By being outside the context of the original, the mental pressure – to understand the original, and to constructively contribute – is taken off the contributor, but shifted onto the original blogger, who must attempt to understand and ‘re-contextualise’ the commenter’s addition before he can move his own argument on. What should be an interactive process becomes a sequential one, and all the slower and more time-consuming as a result.

Finally, the noise-to-signal ratio of comments can become enormous as a blog increases in popularity, unless strict controls or voluntary ‘codes of conduct’ are in place.

Lack of collaboration

Collaborative alternatives potentially provide more valuable content than blogs: more focus; less duplication; less pressure to be ‘journalistic’; a fairer balance between contributors; as well as a less ‘noisy’ experience. However, the very fact that they ask more of contributors makes them more expensive to create, and therefore thinner on the ground. This, in turn, can make collaborative editing seem a lonely experience. This situation will likely continue until there are efforts to break down barriers between the two types of content. Aggregators are of little help here, as they perpetuate the idea of a single time line of unrelated articles, in stark contrast to the ‘world wide web’.

Isolation and insulation

New blogs begin life in complete isolation and need to build connections with others if they are to keep their enthusiasm going. They need blogging friends, and they need encouragement. However, until a true blogging political hub appears, new bloggers often find themselves locked into political party silos, isolating themselves from the much wider external audience. A parallel incentive is for people to insulate themselves in order to avert the discomfort they feel when confronted with deeply contrary opinions and threats to their world-view. More often than not, it is unregulated comment-boxes that fuel this, rather than the behaviour of other bloggers.

Conclusion

When reputation becomes detached from quality; when friendship, like-mindedness, and convention determine the success of a blog and the popularity of its content; and when atomisation rather than interaction is the norm, the result must be a homogenisation of ideas, and a greater chance that rare but brilliant insights will be missed. This is the opposite of what we’re looking for.

In the next post I’ll be explaining how Poblish tries to address each of these problems, and how policy-making can be made more informed, more efficient, more constructive, and also more satisfying.

(Originally posted here, on January 18th, 2010.)

Better Facebook sharing with Poblish

All Poblish articles now feature a handy Facebook Share button, with a few new features to make it as easy and satisfying as possible to post interesting content. Firstly:

  • We provide the exact title of the original post – no ‘Poblish’ prefix.
  • We add a preview image, courtesy of BitPixels.
  • We provide a quick preview of the content of the article – or article version – you selected.
  • Facebook will tell you how many times the article has been shared in the past.

Hope you find this useful – I plan to use it more and more, myself. If you fancy having a go, you could try starting here.

When crowdsourcing new policies, don’t waste existing content

This is the first in a series of posts on the subject of ‘How the semantic web can crowdsource high-quality judgment and improve policymaking’ that Paul introduced yesterday.

Debategraph: One way of mapping arguments

With all the talk about brand new crowdsourcing platforms, and letting the population ‘speak their minds’, it’s easy to forget the mass of already-expressed opinion that exists in electronic form, and that can inform future debates. This includes not only the millions of overtly political blogs, but also regular blogs, online newspapers, Wikis, and visual debate-mapping tools, like Debategraph.

Billions of individual thoughts and personal experiences have been written about, from all conceivable perspectives. No policy process is likely to come up with ideas that have never been thought of before; so expressed opinion represents an archive – a knowledge base – that should not be ignored. Here’s why:

  • It already exists – the mental work has already been done.
  • It happened – it’s a record of what happened when particular policies were tried.
  • It’s not just blogs: thanks to TheyWorkForYou, Hansard reports and transcripts of Select Committees make for highly-detailed content.
  • It can be linked-in: it can be dynamically matched, linked, and related to brand new policy debates.
  • It can be made fresh – it can be given a new lease of life when updated collaboratively.
  • It’s as good a source as any – basing arguments in brand new policy debates around what happens to be current in the mainstream media will inevitably produce less diverse, more error-prone, and less extensively scrutinised results than using sources that have already been run past potentially hundreds of human brains.
  • There may be no alternative – it enables, and bootstraps new policy debates, bringing in the words of those who haven’t yet joined – or even heard of – the new platform.

The challenge of using technology to make sense of all this political information is what concerns us now.

My new project – Poblish.org – aims to put this content to use, and to collapse the distinctions between the worlds of blogging, collaborative editing, and debate mapping. The result will be a collaborative ‘open data’ platform that works for both bloggers and policy-makers, and that will nurture an ecosystem of new political data tools. Hopefully the Labour-themed iPhone app Paul mentioned yesterday will be merely the first of these.

I will be explaining more about Poblish in future posts: the particular problems it was designed to address, the questions it tries to answer, and more about how it can improve policy-making.

(Originally posted here, on January 13th, 2010.)

Taking ‘Possibly Related Posts’ to the next level

Many WordPress bloggers use plugins like these to help people who read their posts find other, related posts.

That’s all well and good if you only want to help readers find your own articles, but perhaps other political bloggers have made your own point better than you have? Let’s now turn that round: perhaps you’ve made another blogger’s point better than he has? Wouldn’t it be great if there was a Related Posts service that let people follow links to similar content from one blog to another, irrespective of who wrote it and where you started reading?

Poblish offers just this. Simply open a blog post from one of the 1500 feeds we monitor, and you’ll see a list of similar posts, ranked by similarity, from across all of those feeds.

If that wasn’t cool enough, the list of related posts updates automatically. So, if you create a new, collaborative article with us, you can watch the list update as your work progresses – literally as you type. That’s very useful – perhaps you make a particular point, then some articles appear that strongly refute that point. You might then reconsider, delete your last paragraph, and move on in a different vein.

I believe that tools like this are an essential part of making the political blogosphere a knowledge base that can not only improve political blogging, but also improve policy-making.

(I should add, as an aside, that all these services essentially use Inverse Document Frequency algorithms. Here’s a worked example. They can work very well – Poblish’s especially, I hope, as our algorithm uses stemming and stopwords – but there’s no attempt by the computer to understand the text, or the context, so there will inevitably be howlers. These are not semantic solutions of the type I mentioned yesterday, but don’t worry: we have big plans in that area.)
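For readers who want to see the mechanics, here is a toy version of this kind of matching. It is a sketch under stated assumptions, not Poblish’s actual algorithm: it assumes term counts weighted by inverse document frequency (commonly called TF-IDF), compared using cosine similarity. The stopword list, the very crude `stem` function, and the three sample posts are all made up for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "on", "for", "it", "that"}


def stem(word: str) -> str:
    """Very crude suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokens(text: str) -> list[str]:
    words = re.findall(r"[a-z]+", text.lower())
    return [stem(w) for w in words if w not in STOPWORDS]


def tfidf_vectors(docs: dict[str, str]) -> dict[str, dict[str, float]]:
    counts = {name: Counter(tokens(text)) for name, text in docs.items()}
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for c in counts.values() for term in c)
    idf = {term: math.log(len(docs) / d) for term, d in df.items()}
    return {name: {t: tf * idf[t] for t, tf in c.items()} for name, c in counts.items()}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def related(target: str, docs: dict[str, str]) -> list[tuple[str, float]]:
    """Rank every other document by similarity to `target`, most similar first."""
    vectors = tfidf_vectors(docs)
    scores = [(name, cosine(vectors[target], vec)) for name, vec in vectors.items() if name != target]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample posts.
posts = {
    "nhs-1": "Funding the NHS: hospitals need more nurses",
    "nhs-2": "Hospital funding and nursing shortages in the NHS",
    "rail": "Rail fares are rising again for commuters",
}

ranked = related("nhs-1", posts)
for name, score in ranked:
    print(name, round(score, 3))
```

Stemming is what lets ‘nurses’ and ‘nursing’ count as the same term here. Nothing in the code knows what a nurse is, which is exactly why howlers are inevitable.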

Semantic Web: Sentiment analysis using OpenAmplify

Buzz-terms like ‘the Semantic Web’ are massively over-used; nonetheless, it really is possible to do what was originally promised: allow a computer to understand the meaning of a particular article or piece of text, without human help. We all know that articles that mention David Cameron are political in essence, but can a computer know? By understand, I mean: be able to determine with a high probability, say, which words in a piece of text are the names of individuals; whether a ‘David Cameron’ referred to is really the Tory Party leader, based on the context of the article; and be able to go on to answer more of our questions about the afore-mentioned Cameron.

Why is this useful? Well, if a computer is capable of taking a mass of seemingly random articles – for example, the entire daily output of the world of blogging – and giving us informed matches, rather than simple ‘textual’ matches, that saves human beings a lot of time. The computer can then use this understanding to tell us new things about the people or items we care about, in a way that conventional search-engines like Google can’t manage.

Sentiment analysis goes a step further. Using the understanding that an article might be political in nature, the computer analyses the precise language used to determine – for each ‘topic’ – whether references are generally ‘positive’ or ‘negative’, i.e. are people being nice to Cameron, or nasty to him? There are important implications for online debates. If you’re currently debating issue X, and happily providing positive coverage of a particular point, Y, you’d be particularly interested to hear about articles that are very critical of Y. If a computer was able to alert you to articles like this, that could save you a huge amount of reading, and reduce the chances that you commit yourself to a policy that has been refuted elsewhere. This approach is more likely to challenge your opinions, and to reduce the likelihood of Groupthink.
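The alerting idea can be sketched in a few lines. To be clear, this is not how OpenAmplify works and it does not use its API: it assumes a deliberately naive approach that counts words from two small, made-up lexicons near each mention of a topic. The function names, word lists, and sample articles are all hypothetical.

```python
import re

# Tiny illustrative lexicons; real sentiment analysis is far more sophisticated.
POSITIVE = {"good", "effective", "success", "works", "welcome", "fair"}
NEGATIVE = {"bad", "failed", "costly", "unfair", "wasteful", "flawed"}


def topic_sentiment(text: str, topic: str, window: int = 5) -> int:
    """Score sentiment words found within `window` words of each mention of `topic`."""
    words = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, word in enumerate(words):
        if word == topic:
            nearby = words[max(0, i - window): i + window + 1]
            score += sum(w in POSITIVE for w in nearby)
            score -= sum(w in NEGATIVE for w in nearby)
    return score


def opposing_articles(draft: str, topic: str, articles: dict[str, str]) -> list[str]:
    """Return titles whose sentiment towards `topic` has the opposite sign to the draft's."""
    mine = topic_sentiment(draft, topic)
    return [
        title
        for title, text in articles.items()
        if mine * topic_sentiment(text, topic) < 0
    ]


# Hypothetical draft and articles.
draft = "Road tolls are an effective and fair way to cut traffic."
articles = {
    "Tolls revisited": "The tolls scheme failed and proved costly for commuters.",
    "In praise of tolls": "Early results suggest tolls are a success.",
    "Rail fares": "Rail fares are unfair.",
}

alerts = opposing_articles(draft, "tolls", articles)
print(alerts)
```

Only the critical article is flagged. The supportive one and the one that never mentions the topic are left out, which is the time saving described above.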

I’m currently trying out OpenAmplify’s software, with a view to adding these capabilities to Poblish. While being able to search, link, and match articles from the political blogosphere is incredibly useful, I believe that we can go much further. The better we can understand articles with the help of technology, the more we can use that technology to inform online debates, to improve their quality, to make them less frustrating, and ultimately to improve decision-making.

Tory blog aggregation

It’s not well-enough known that Poblish’s support for custom groups means that the issue of the missing Conservative Blog Aggregator, which Matt Wardman wrote about last year, has finally been solved, once and for all. Labour bloggers have had one for nearly 5 years.

Clearly this is extremely useful for anyone who’s interested in what UK Conservatives are talking about. So, here’s the Conservative Party group page, where you can watch the live feed. Here it is in JSON format, and in RSS 2.0 format.

The group currently contains 527 members, comprising all Conservative MPs (via They Work For You), plus all the bloggers from the Total Politics directory, minus the broken links and the bloggers who weren’t really Tories on closer inspection.
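The plus-and-minus recipe above is really just set arithmetic. As a rough illustration only, with a hypothetical `build_group` function and made-up example.org feed addresses rather than anything taken from Poblish itself:

```python
def build_group(mps, directory, broken_links, not_members):
    """Union the two sources, then subtract broken feeds and misfiled blogs."""
    return (set(mps) | set(directory)) - set(broken_links) - set(not_members)


# Hypothetical feed addresses, for illustration only.
mps = {"https://mp-one.example.org/feed", "https://mp-two.example.org/feed"}
directory = {
    "https://mp-two.example.org/feed",      # also listed as an MP, counted once
    "https://blogger-a.example.org/feed",
    "https://blogger-b.example.org/feed",
    "https://blogger-c.example.org/feed",
}
broken_links = {"https://blogger-b.example.org/feed"}
not_members = {"https://blogger-c.example.org/feed"}

group = build_group(mps, directory, broken_links, not_members)
print(len(group), sorted(group))
```

Using sets means a blogger who appears in both sources is only counted once.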

Liberal Democrats shouldn’t feel left out, even though we only have 67 members at present. Here’s their group page, their JSON feed, plus the RSS representation. They do, of course, already have a well-known aggregator of their own.