
Our Second Workshop

We completed our second workshop on Wednesday (21st) as planned. It was another fascinating and highly instructive event, very largely down to the energy, enthusiasm and expertise of the attendees. We learned some important things about how useful open data could be, and also became acutely aware of the issues in making it a realistic tool in widespread use.

We used the period between the workshops to try to find open data that addressed the information needs the attendees had identified as really useful. This exercise drove home some lessons that we might have guessed, but there is nothing like experiencing them for real.

* Finding the right data to solve a given problem was very hard. Given some data, it is relatively easy to think of an interesting way to use it. Doing it the other way round was the main task for two of us, who are reasonably specialist in the area, for about a week (we also had suggestions from the Cabinet Office, DCLG and the Hampshire Hub), and while we found some interesting and relevant material, I couldn't honestly say we nailed any of the major information requirements.

* Information about processes was just as important as data. Some of the most important information the attendees needed was to do with diagnosing and managing “legal highs” and understanding the ever-changing regulations. This is not the type of information we typically associate with open data, and the data that does exist often does not make much sense unless you understand the related processes. That seems particularly relevant when examining claims that open data heralds a step change in government transparency: without what David Heald calls process transparency, open data is not going to provide much additional transparency.

* The right level of granularity was crucial. In several cases the attendees needed data about individual clients, landlords or properties; national or even city averages on benefit claimants or empty houses were of marginal value. Whilst aggregation offers protection from identification, there is still a need to be able to interrogate single instances or much smaller groups – more on which later.

The area where open data seemed to have most to offer was in providing evidence of the costs to the public purse of failing to provide intervention and services for the homeless. These might include multiple A&E visits and hospital admissions, or criminal activity requiring police time and affecting other citizens. Establishing these costs could help with funding applications and with making the case for better intervention for individuals. We decided to concentrate on health cost data as an example. We used this dataset, which is publicly available, and examined how useful it would be in practice – or, to put it more positively, what would be needed to make it more useful in practice.

The short answer is that it needed a lot of background knowledge and explanation to be of any use. Both columns and rows made little sense without more explanation. For example, the unit cost of Drug Services, Adult, Admitted Patient is £429. Is this the cost of the first consultation? The average cost of a patient? What qualifies as drug services anyway? There is a 58-page accompanying PDF file which presumably explains all this – but for a busy person with another job to do, this is not much use.
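To make the problem concrete, here is a minimal sketch (the column names and definitions are hypothetical – only the £429 figure comes from the dataset discussed above) of what it could look like if the explanatory material in the PDF travelled with the data as a machine-readable data dictionary, so a row can describe itself:

```python
# Hypothetical unit-cost rows: (service, setting, unit cost in GBP).
# The £429 figure is the one cited in the text; everything else is illustrative.
unit_costs = [
    ("Drug Services, Adult", "Admitted Patient", 429),
]

# A small data dictionary standing in for the 58-page PDF: each field
# carries its own definition, so readers need not hunt for it elsewhere.
data_dictionary = {
    "unit_cost_gbp": "Average cost per finished admission episode, in pounds",
    "setting": "Care setting in which the activity took place",
}

def describe(row):
    """Render a row together with the definition of its cost field."""
    service, setting, cost = row
    return f"{service} ({setting}): £{cost} – {data_dictionary['unit_cost_gbp']}"

for row in unit_costs:
    print(describe(row))
```

The point is not the code itself but the principle: when definitions are published alongside the figures in a structured form, the dataset answers its own "what does this number mean?" questions.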

However, there were also some very positive messages to come out of the workshop. Steve Peters from the UK DCLG (the part of central government responsible for local government), who is a data specialist, was present and told us about the unit cost database from New Economy Manchester (a think tank owned by Manchester local authorities and related bodies). This provides a service to organisations such as the voluntary sector, giving costs based on data such as the dataset we were examining in a way that is relevant and meaningful. While we did not have the database at the workshop, subsequent examination backs up what Steve was saying.

Steve also described pilot services that address the need for information about individuals by allowing authorised people to ask specific questions about an individual and receive simple yes/no replies. While there is no such service for the homeless sector, the attendees could see the real value of such a service.
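The shape of such a service can be sketched in a few lines. This is purely illustrative (all names, records and the authorisation scheme are invented, not drawn from the pilots Steve described): an authorised caller asks one specific question and gets back only a boolean, never the underlying record.

```python
# Toy in-memory stand-ins for a real record store and an authorisation list.
RECORDS = {
    "client-123": {"receiving_benefits": True, "registered_homeless": True},
}
AUTHORISED_CALLERS = {"keyworker-7"}

def ask(caller_id: str, subject_id: str, question: str) -> bool:
    """Answer a single yes/no question about one individual.

    Refuses unauthorised callers; authorised callers get only a boolean,
    so the full record never leaves the service.
    """
    if caller_id not in AUTHORISED_CALLERS:
        raise PermissionError("caller not authorised")
    record = RECORDS.get(subject_id, {})
    # Only a yes/no answer is returned – no fields, no record dump.
    return bool(record.get(question, False))

print(ask("keyworker-7", "client-123", "receiving_benefits"))  # True
```

The design choice that matters here is the narrow interface: because the reply is only yes or no, aggregation-versus-identification trade-offs are eased – the caller learns the single fact they are authorised to ask about and nothing else.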

So where does that leave us?

It is important to emphasise that this is just one pair of workshops. The lessons we learned here do not necessarily apply elsewhere, and it will be very interesting to see the results when we repeat them in India next week. However, based on this group, we are left with some interesting conclusions and challenges. The attendees were extremely competent, not only in their own area of expertise but in IT skills and in understanding and assessing data. Nevertheless, it would have been impractical for them to use the datasets we found as part of their working life: the data was too hard to find and understand. Other considerations, such as timeliness and technical format, faded into insignificance. Among other things, this calls into question the demand for greater data literacy – our attendees were already as IT literate and data literate as anyone could reasonably ask of a non-specialist audience.

I could summarise it by saying that if open data is to be generally usable (at least in this context), it needs someone (or something) to act as an intermediary: to find it, understand it, interpret it and ideally turn it into a service. We see this already with apps, e.g. transport apps, but for this area at least it is clear that an intermediary is also necessary if open data is to be effective in informing decision making and providing transparency. Quite what this means for our research question is a headache for the team to work on!