
Final Product Post: MusicNet & The Alignment Tool

29 Jun

This is a final report and roundup of the MusicNet project. We’ll mainly be discussing the primary outputs of the project but will also cover an overview of the project as a whole.

The project produced two primary prototype outputs:

  1. The Alignment Tool
  2. The MusicNet Codex

We’ll discuss each of these in turn and address what they are, who they are for and how you can use them in your own projects.

The Alignment Tool

The Alignment Tool is the unexpected victory of the project. Whilst the primary goal from the outset was to produce some meaningful RDF (discussed more later), the tooling we created to enable this has become a powerful re-usable asset.

What is it?

The Alignment Tool is a WebApp designed to enable domain experts to cleanse and align one or more datasets.

Who is it for?

We built the tool with non-technical domain experts in mind, assuming only that the end user has experience of modern OS paradigms such as drag & drop (users of desktop OSs such as OS X & Windows will be familiar with WIMP GUIs).

The tool is designed for users with expert knowledge in the domain of the data they are aligning, in our use case a Musicology Research Fellow.

We also enabled full keyboard control via shortcuts to enable quicker data manipulation for users that had a higher technical proficiency.

NB: The setup/installation of the prototype tool currently requires basic LAMP knowledge

Why would you use it?

This tool is designed to help users clean up their dirty data. It is specifically aimed at disambiguating names/authors or grouping several different instances of an author together.

For example, it could be used to find all occurrences of a single author in an EPrints repository (e.g. “Joe Lambert”, “J. Lambert”, “Lambert, J”) and reduce these into a single canonical reference.

The tool handles two types of operation really well. We call these internal and external alignment.

  • Internal alignment is looking for duplications or spelling errors within a single dataset. E.g. the EPrints example above.
  • External alignment is looking for matches across multiple different datasets, such as between Grove & British Library (as was the case with MusicNet).

The Alignment Tool not only handles both of these operations but it can handle them simultaneously, saving the user the time and effort of multiple iterations of data cleansing.
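As a rough illustration of the idea (not the Alignment Tool's actual matching code, which is not described in this post), the sketch below suggests candidate groups by reducing each name to a surname plus first initial. The function names, the `(source, name)` record shape and the sample records are all assumptions made for the example. Because every record goes through the same pass regardless of source, duplicates within one dataset and matches across datasets surface together. A key this crude will over-match, which is why an expert still needs to confirm each suggestion.

```python
import re
from collections import defaultdict


def name_key(name):
    """Reduce a personal name to (surname, first initial), lower-cased.

    Hypothetical normalisation rule chosen for illustration only.
    """
    name = name.strip()
    if "," in name:
        # "Lambert, J" -> surname before the comma, given names after it
        surname, given = (part.strip() for part in name.split(",", 1))
    else:
        # "Joe Lambert" / "J. Lambert" -> last token is taken as the surname
        tokens = name.split()
        if not tokens:
            return ("", "")
        surname, given = tokens[-1], " ".join(tokens[:-1])
    initial = re.sub(r"[^a-z]", "", given.lower())[:1]
    return (surname.lower(), initial)


def suggest_groups(records):
    """Group (source, name) records that share a key; keep groups of 2+."""
    groups = defaultdict(list)
    for source, name in records:
        groups[name_key(name)].append((source, name))
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical records from two made-up sources
records = [
    ("eprints", "Joe Lambert"),
    ("eprints", "J. Lambert"),
    ("other", "Lambert, J"),
    ("eprints", "David Bretherton"),
]
candidate_groups = suggest_groups(records)
```

Here the three "Lambert" variants end up in one candidate group spanning both sources, while the single "Bretherton" record stays ungrouped.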

How to use it?

The Alignment Tool loads in a web browser (currently Firefox/Safari/Chrome) & is built using a single screen split-pane UI.

Alignment Tool Screenshot

Ungrouped/Grouped Lists

The left-hand side contains two lists: one for all items that have yet to be grouped, and a second showing all grouped items. Each list has an A-Z control for quick indexing into the data and can be fully controlled from the keyboard. By selecting multiple items from the ungrouped list, the user can create a group from the selection with a single keystroke.

Verified Toggle

The verified toggle lets the user switch between expert-verified and unverified groups. During the setup stages the Alignment Tool tries to align as many entities as it can automatically. To ensure no incorrect alignments make it into the final dataset, we mark these automatic alignments as “unverified”.

This process lightens the load for the end user without reducing the quality of the tool’s output.
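One way to picture the verified/unverified distinction is a flag on each group that defaults to unverified and is only set by an expert. The `Group` class and `visible_groups` function below are hypothetical names for this sketch and do not reflect the tool's real data model.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """A set of entities believed to refer to the same thing (hypothetical model)."""
    members: list
    verified: bool = False  # automatic alignments start out unverified

    def verify(self):
        """Record that a domain expert has confirmed this group."""
        self.verified = True


def visible_groups(groups, show_verified):
    """Mimic the toggle: return only verified or only unverified groups."""
    return [g for g in groups if g.verified == show_verified]


auto_group = Group(members=["J. Lambert", "Lambert, J"])  # made by automatic alignment
expert_group = Group(members=["Joe Lambert", "J Lambert"])
expert_group.verify()  # confirmed by the expert

groups = [auto_group, expert_group]
```

With this shape, `visible_groups(groups, show_verified=False)` gives the expert a worklist of automatic alignments still awaiting review.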

Context/Extra Metadata

The right-hand side is where the user can see extended metadata about one or more items. This view displays all relevant information that we know about each entity and can be used by the expert to decide whether several instances are in fact the same entity.

Metadata Match

Metadata Match is an additional visual prompt to help the expert decide whether entities are the same. If multiple entities are selected and their metadata matches they will be highlighted in grey. If the user hovers over the metadata, all matching items are highlighted.
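The comparison behind a prompt like this can be sketched as finding the metadata fields whose values agree across every selected entity. This is an illustrative guess at the logic, not the tool's code; the function name, the dictionary layout and the field names (`birth`, `death`, `label`) are assumptions.

```python
def matching_fields(entities):
    """Return the fields (and values) that are identical across all entities.

    Each entity is a dict of metadata. Fields missing from any entity are ignored.
    """
    if len(entities) < 2:
        return {}
    first, rest = entities[0], entities[1:]
    # Iterating a dict yields its keys, so this is the set of shared field names
    shared = set(first).intersection(*rest)
    return {
        name: first[name]
        for name in sorted(shared)
        if all(other[name] == first[name] for other in rest)
    }


# Two hypothetical records for the same composer from different sources
selected = [
    {"label": "Beethoven, Ludwig van", "birth": "1770", "death": "1827"},
    {"label": "Ludwig van Beethoven", "birth": "1770", "death": "1827", "place": "Bonn"},
]
highlight = matching_fields(selected)
```

In this example the birth and death dates match and would be highlighted, while the differing labels would not.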

Demo

Our prototype for the Alignment Tool uses licensed data as its input, so we are unable to expose it publicly. Instead, our Musicology Research Fellow produced this screencast to demonstrate the tool in use (video best viewed full screen):

The MusicNet Codex

MusicNet was a project funded by the jiscEXPO programme of work, whose primary goal was to expose Linked Data. We exposed canonical composer URIs as RDF; The MusicNet Codex is the human face of that data and an exemplar of the power of Linked Data.

What is it?

The MusicNet Codex is a human-friendly web search interface (like Google) built on top of our rich Composer RDF.

Who is it for?

The MusicNet Codex is for musicologists looking to study classical composers. It is a single repository of information & links to trusted scholarly repositories and is designed to be used by non-technical users. The primary aim for the Codex was to be as simple to use as a Google search.

It could however be generalised for use with other domains as it is a companion product to our Alignment Tool.

Why would you use it?

The Codex offers some benefits over a simple Google search. It is domain specific, so the user knows that the results will be pertinent to their search (e.g. Beethoven the composer vs Beethoven the movie). It is also a single point of search, encapsulating data from all the major Musicology resources. The typical use case undertaken by musicologists involves searching each of these remote resources individually; the Codex removes this overhead & streamlines the researcher’s workflow.

Using the power of Linked Data we are also able to expose non-scholarly resources about a composer from the same search interface. This means that as well as academic information from databases such as Grove & The British Library a user can immediately get articles from the New York Times, usage in films from IMDB or plays on the BBC.

How to use it/Demo?

The Codex is publicly accessible from http://musicnet.mspace.fm/search. To locate a composer a user can either filter alphabetically or search by keyword.

The results of a search are displayed like this:
Codex Search Results

Each search result displays the following information:

  • The part of the composer’s title that matched the search request
  • Basic metadata about birth/death dates (where available)
  • The datasets that the composer appears in

By selecting a search result, all available links are presented to the user:

Composer Screenshot

Along with all non-scholarly links that the Codex has harvested:
Extended Metadata

All information displayed in human-readable form is also available as RDF. To access it, follow the link at the bottom of each page.

Blog Table Of Contents

Over the lifetime of the project, we have published a lot of great content to help inform others of our progress & to try and engage the community in discussion related to issues we feel are important for UKHE/Linked Data.

Below you will find categorised links to our posts and a brief summary of the key themes that emerged:

Guidance on Linked Data publishing

The project’s primary Linked Data contribution was RDF about musical composers as sourced from a number of scholarly repositories. As such we discussed the need for suitable in-perpetuity hosting of UKHE URIs. This led to a discussion within the community about a centralised data.ac.uk hub where URIs could be minted and used without fear of them disappearing after a project has ended.

Especially when considering short lived rapid innovation projects that deal with Linked Data, the assets produced should outlive the life of the project. Enabling this would bring greater return/value on the funding invested in these projects & would also increase the scope of work achievable by such projects, removing the overhead of long term infrastructure commitments.

Alignment Tool & MusicNet Codex

The posts above outline the progress & justifications for how we arrived at the Alignment Tool. Our aims for the tool were that it be re-usable and generic without compromising its power/usefulness. If you have data that needs to be cleaned up or you wish to align similar entities within two datasets then the Alignment Tool is a lightweight way to achieve this. The tool is open source and can be downloaded from Google Code.

The later posts also outline our implementation for the MusicNet Codex, which represents a lightweight method to serve up both HTML & RDF data making use of Content Negotiation.
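Content Negotiation means the same URI returns HTML to a browser and RDF to a Linked Data client, depending on the HTTP `Accept` header of the request. The Codex itself is built on Kohana (PHP), so the Python below is only a simplified sketch of the selection step, not the project's implementation. The function names are hypothetical, and `application/rdf+xml` is an assumption about the RDF serialisation on offer.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when not given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_range, q))
    return ranges


def quality(candidate, ranges):
    """Return the q-value of the most specific range matching the candidate."""
    main_type = candidate.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == candidate:
            specificity = 2          # exact match, e.g. text/html
        elif media_range == main_type + "/*":
            specificity = 1          # type wildcard, e.g. text/*
        elif media_range == "*/*":
            specificity = 0          # full wildcard
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def choose_representation(accept_header,
                          available=("text/html", "application/rdf+xml")):
    """Pick the media type to serve; ties go to the earlier entry in `available`."""
    ranges = parse_accept(accept_header or "*/*")
    scored = [(quality(c, ranges), c) for c in available]
    best_q = max(q for q, _ in scored)
    if best_q <= 0:
        return available[0]  # assumption: fall back to HTML rather than send 406
    return next(c for q, c in scored if q == best_q)
```

A typical browser header such as `text/html,application/xhtml+xml,*/*;q=0.8` selects the HTML page, while a client sending `Accept: application/rdf+xml` receives the RDF for the same composer URI.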

Dissemination & Engagement with the Community

Throughout the project our work was disseminated at a number of workshops and conferences. You can find slides and abstracts from all our appearances in the links above.

We also organised the Music Linked Data Workshop (MLDW) for the music and linked data community at the end of the project. You can find slides from all those that spoke at this event along with an archive of the tweets captured at the event.

Project Proposal

These posts are a breakdown of the original proposal we submitted for the funding call.

Miscellaneous

JISC DaoP

Below is information related to the project as a whole, rather than just our primary output contributions. This includes: the project team, where we are based and the licence under which the project outputs are made available.

Project tag: #musicnet

Project Lifetime: 31 July 2010 – 31 July 2011

Total Funding: £98,068

JISC PIMS: https://pims.jisc.ac.uk/projects/view/1868

This project was funded by JISC under the jiscEXPO stream of the INF11 programme.


Overview

In this project, we propose five deliverables to ensure the longevity and usability of linkable musicological data:

  1. authoritative or “minted” URIs for composers that can confidently be used to align/link related sources, and links to those composers in scholarly data sources;

  2. persistence of this URI resource;

  3. a suite of tools to enable new sources to be easily aligned with these URIs;

  4. a Codex to host pointers to any other linked data sources using our URIs;

  5. exemplary visualisation/exploration interactions to show how the data can be used for knowledge building.

Primary Product Outputs

Codex, Exposed Linked Data, URI Scheme, Workshop, Documentation & Videos.

Lead Institution

University of Southampton
Electronics & Computer Science
SO17 1BJ
(0)23 8059 5415

Project Team

Joe Lambert

Project Manager / Lead Developer

Daniel Alexander Smith

Linked Data Developer

David Bretherton

Musicology Research Fellow

mc schraefel

Principal Investigator

Software Output

The Alignment Tool can be obtained from the project’s Google Code repository at http://code.google.com/p/musicnet/. API documentation generated by PDoc can be found at musicnet.mspace.fm/docs.

It has been released under the terms of the MIT Licence.

The MusicNet Codex is built upon the Kohana 3 framework, which is licensed under the BSD Licence. All additional custom Codex code is licensed under MIT.

Data

The RDF data output from the project is available via the MusicNet Codex & is licensed under the Creative Commons Zero licence.


Documentation

Documentation for installing and using the Alignment Tool is included in the Beta Download and can also be viewed on the web in the Google Code README.

Documentation & blog posts are covered by the Creative Commons Attribution 2.0 UK: England & Wales licence.


 

Posted by Joe Lambert in Uncategorized

 
