RSS
 

Archive for August, 2010

data.ac.uk revisited

18 Aug

Following on from Dan’s recent post about data.ac.uk there is a growing consensus that this is a necessary step that needs to be made to facilitate the growth of usable linked data in Higher Education. Next week there will be a public briefing paper from JISC on the topic. We expect that they will announce, among other things, that the data.ac.uk domain name has been ring-fenced for to provide a data.gov.uk-esque repository for HE.

The tools to visualise this type of data have been around for a long time but as yet there has been no long term strategy for maintaining the actual data that drives these tools.

Why data.ac.uk?

So why do we need a centralised datastore for HE data? Why can’t we just host it locally on data.southampton.ac.uk?

In our original post Dan highlighted our stance on the need for institutional agnosticism for the data we are creating for this project:

We also found that it would be best to not use a subdomain of our school (such as musicnet.soton.ac.uk), since this would be seen as partisan to the school/university and is likely to get less uptake that something at a higher level (such as musicnet.ac.uk).

It doesn’t really make sense to expose the MusicNet data (a set of canonical names and datapoints for classical music composers) on a URI which contains the institution (in our case University of Southampton). The data has global significance and it just so happens that we are the ones tasked with exposing it. Our data is a perfect fit for a non politically aligned data.ac.uk implementation.

In his post Time for data.ac.uk? Or a local data.open.ac.uk? Tony Hirst raises some interesting questions about where we should be storing data:

Another possible source of data in a raw form is from the data.gov.uk education datastore (an example can be found via here, which makes me wonder about the extent to which a data.ac.uk website might just be an HE/FE view over that wider datastore? (Related: @kitwallace on University data.) And then maybe, hence: would data.*.ac.uk be a view over data.ac.uk for a particular institution. Or *.sch.ac.uk a view over a data.sch.ac.uk view over the full education datastore?

These questions are yet to be answered but he does make an interesting point drawing on a presentation given by Mike Nolan at mashlib2010. At present a lot of the information that could be (re-)exposed at minimal cost would be the syndicated data that most institutions already make available, RSS, CalDav etc. I would argue however that these institutional specific datasets, being themselves already intrinsically politically aligned, would be a better fit for localised hosting on a data.southampton.ac.uk type URI. The URI then infers ownership and some context/authority to the data held there.

So perhaps there is room for both data.ac.uk AND data.southampton.ac.uk?

How to make data.ac.uk work?

The only way we can make any datastore work is if the data is available in formats that people are able to make use of! For MusicNet we intend to host RDF-XML on our local server (until a data.ac.uk alternative becomes a reality) using the standard content negotiation to allow for a human readable HTML representation to be presented to casual users.

We also intend to investigate the Linked Data API that was announced at the Second London Linked Data Meetup and has been developed by Dave ReynoldsJeni Tennison and Leigh Dodds. The LD API will allow us to also provide our dataset in formats such as JSON & Turtle using a RESTful querying API, which is currently the protocol of choice for mashup/web2.0 developers.

I would like to see a similar infrastructure in place on a centrally hosted data.ac.uk.