Archive for the ‘Project Plan’ Category

Budget

22 Jun

This is the budget submitted for the MusicNet project proposal.

 
 

Project Timeline, Workplan & Methodology

22 Jun

Here is the Gantt chart & work packages submitted with the MusicNet project proposal.

Work package 1 Data Triage

In collaboration with musicologists, decide on the key types of data to expose about composers.

Deliverable 1: Report

Work package 2 Composer Alignment Tool

Tools will need to be developed to automatically recognise composer matches between multiple data sources. These tools should also allow musicology experts to improve matches by manually approving them and by creating patterns for common recurring errors.

The musicSpace project has laid the groundwork for this alignment by mapping each different dataset to a common ontology created as part of the project in collaboration with musicology experts.

Deliverable 1: Tools for alignment

Work package 3 URI Scheme

Following the data.gov.uk and cabinet office review guidelines for publishing, an appropriate and sustainable URI scheme will be created.

Deliverable 1: URI Scheme

Deliverable 2: Justification for scheme

Work package 4 Perform Data Alignment

Using the tools developed in WP2, each of the available data sources will be aligned.

Deliverable 1: Data mappings between each of the data sources

Work package 5 Expose Linked Data

Using the scheme decided upon in WP3, URIs will be minted and structured linked data representing composers will be published.

Deliverable 1: Linked Data

Work package 6 Prototype Development

Once the linked data has been published a prototype timeline visualisation will be produced.

Deliverable 1: Simile timeline using linked data from WP3

Deliverable 2: Codex on top of linked data

Deliverable 3: Documentation demonstrating how the prototypes were achieved

Work package 7 Community Engagement

A one day workshop to encourage use and reuse of the project outputs.

Deliverable 1: Workshop review

Work package 8 Reports & Guidance

Important research discoveries will be disseminated to the wider community via the project website, published deposits into the local EPrints repository as well as monthly posts to the project blog.

Methodology

All project outputs will be deposited into the e-Framework Knowledgebase and/or the JISC InnovationBase. A similar approach to that used in the musicSpace project will be taken to ensure that sound agile software engineering practices are followed. The project will build upon existing specifications and standards from W3C, JISC, and other projects. In particular, it is expected to reference agreed standards such as RDF and follow guidelines from data.gov.uk. Accessibility of Web-based systems and software will be ensured by conforming to the W3C Web Accessibility Initiative level Double-A.

 
 

Project Team Relationships and End User Engagement

22 Jun

These are the members of the MusicNet project team:

mc schraefel is a Reader in Computer Science at the University of Southampton. She has led a number of JISC funded projects including the musicSpace project, on which this proposal builds. More recently she is a CI on the EPSRC funded EnAKTing project, whose main concern is the exposure of UK Government information as Linked Data.

Joe Lambert is a Research Fellow within the Intelligence, Agents and Multimedia group at the University of Southampton. He is the primary UI developer on the mSpace faceted browser and has worked on the JISC funded Richtags project as well as the Arts and Humanities Research Council (AHRC) funded musicSpace. He also worked on the JISC funded OpenPSI project, where he worked with public sector data in SPARQL databases, producing a standalone SPARQL version of the popular faceted browser, mSpace.

Daniel A. Smith is a Research Fellow. He is the primary developer on the mSpace server and has also been lead developer on the JISC funded Richtags project and the JISC/AHRC/EPSRC funded musicSpace. His doctoral thesis focused on the intelligent linking of remotely hosted linked data resources in a process called ‘pivoting’. He is also a regular contributor to the wider Linked Data research community, including engagement with the BBC.

David Bretherton is a Research Fellow with the Music department at the University of Southampton. He has a doctorate in Musicology from Oxford University and has also been the primary musicology consultant on the musicSpace project.

Engagement with the Community

To ensure that the URI resources are both published for use and made accessible for reuse by other tools and services, we will be working with stakeholders throughout the process for regular review and updates of our approach. We have used this kind of development/evaluation approach successfully with projects like musicSpace.

In the first instance, the project technical team will be working regularly throughout the lifecycle of the project with our musicologist colleagues here at the University of Southampton. In particular we will be running standard evaluation and development processes to refine both the Codex and the Visualisation tools – the main outward-facing interfaces in the project.

At regular intervals, we will also be deploying beta prototypes of these services with our distributed stakeholders from Durham University and Royal Holloway who are all on board to participate in these trials. Their letters of support for the project are appended.

A workshop will also be held toward the end of the project to promote the use and uptake of the linked data outputs of the project. The workshop will be suitable for musicologists as well as computer scientists and cater for a range of abilities and range of familiarity with the semantic web.

The aim of the workshop is to give tutorials on the use and reuse of the project outputs and to encourage publishers of existing linked data to link to the minted URIs (we have liaised with Yves Raimond at the BBC, who publish a large amount of linked data including classical music performances from BBC Radio 3, and who has expressed an interest in such a workshop). We have contacted DevCSI, who have agreed to aid us in engagement with the UKHE developer community, and we have contacted the DCC, who have agreed to aid us by linking to our announcements for our proposed workshop.

We anticipate that the outcomes from the project will also continue to be of interest to various research venues where we have published this kind of work in the past, such as the International Society for Music Information Retrieval (ISMIR) and International Association of Music Libraries.

 
 

IPR (Creative Commons Use & Open Source Software License)

22 Jun

While the code will be made available under an appropriate open source agreement as used within any educational establishment and in line with JISC’s requirements, the IPR will remain with the University of Southampton, thereby allowing Southampton to further exploit the IP.

Sustainability of the produced code is through ensuring other universities and JISC projects have access to the code and documentation for the system, through BSD or MIT licences (the code being published in Google Code). Quality factors built in to the work packages will ensure successful Open Source life through achievement of a good OSMM rating, community engagement, and community stated need.

Project memory will be recorded at regular intervals through the publication of reports and blog posts. A community infrastructure will be created during project start up using Google Code, in order to enable public facing source code releases, bug tracking and mailing list to be used for community engagement.

All reports, tools, and code from the project will remain on the project server and the Google Code repository for a minimum period of 2 years. As long term management of the URIs for this project is important, the hosting of the Linked Data will be maintained by ECS central infrastructure.

 
 

Risk Analysis and Success Plan

22 Jun

Risk Analysis

1. Technical (Incomplete or unavailable prior work)

While service descriptions, designs, and implementations will be repurposed from previous projects, it is possible that these projects may be unable to provide the anticipated materials. The risk will be managed by ensuring that these projects will be exploited in such a way that unavailability from one or even two of these projects will not prove fatal to the proposed deliverables.

2. Uptake by target user communities

There is a risk that the target users will fail to engage with the technology provided. We have minimised this risk by creating a network of interested collaborators, and enlisting institutional support. The proposed workshop also aims to mitigate this risk by directly engaging the community.

3. Staffing

The team understands design principles and no one member of the team has any vital piece of knowledge not understood by the others. The design principles are based on techniques used by the project team on previous successful projects. The advantage of this approach is that we are relying on an experienced existing team.

4. Project Sustainability

To ensure project memory, source code will be hosted on Source Forge, and minted URIs will be hosted by ECS infrastructure indefinitely (see attached letter).

5. Acquiring data from partners

The risk of not acquiring data from partners is low as contracts are already in place for musicSpace and the provision for the extension of these already exists.

Success Plan

The project will build on the practices of the musicSpace project to ensure that milestones and deliverables are met in a timely fashion. In addition to a rapid and iterative approach to development, this project will also utilise co-design methodologies wherever appropriate. Involving stakeholders at each incremental phase and from the early stages of development will ensure that the outputs of the project meet the real user needs of our targeted community.

In accordance with JISC guidelines the project team liaised with OSS Watch at the bid-writing stage to ensure an open dialogue is developed should the proposal be successful & to seek advice regarding the proposal writing process. OSS Watch templates will be used to develop a Governance Model for the project, to be published in Week 4, in order to define the scope of the project for third parties.

The project will be led by the School of Electronics and Computer Science (ECS) at Southampton. The project will begin with an initial project start-up face-to-face meeting with all those taking part in the project. A similar team meeting will occur at monthly intervals to monitor progress against objectives. Public versions of the minutes of these meetings will be published on the project website. Financial reports will be supplied by ECS financial management, and a Final Report will be produced at the end of the Project. There will be a final project closure meeting. Each of the work packages will require formal review and sign-off meetings, and these are spaced at monthly intervals. There will be weekly technical meetings of the core project staff.

At the final project meeting and at the regular stakeholder meetings, the project’s output will be evaluated against the requirements of the stakeholders to ensure the outputs fulfil the user needs.

A total of four person-days have been allocated to engage in programme-level activities: programme meetings, relevant special interest groups, communications and dissemination activities and the e-Framework, and budget has been allocated to cover travel to these activities.

Time has been allocated for the organisation, promotion and running of a one day workshop to promote the project outputs to the wider community. Tutorials given during the workshop will be video recorded and published online.

 
 

Wider Benefits to Sector & Achievements for Host Institution

22 Jun

Impact on Academic Researchers, Teachers and Learners – Linking, visualising and exploring Musicology Linked Data (WP6 and WP7)

Both the codex and timeline visualisation offer additional ways to accelerate information discovery and knowledge building. Based on our experience with the musicSpace project and our proposed output here, key figures in musicology anticipate the project’s benefits for researchers and students alike (see supporting letters from Professor Katharine Ellis, Royal Holloway, University of London; Michael Spitzer, President of the Society for Music Analysis, Durham University; Professor Philip Olleson, President, The Royal Musical Association and Emeritus Professor of Historical Musicology, University of Nottingham).

The outputs of MusicNet impact researchers by improving the workflow for musicology research, providing a trusted codex of links to scholarly and commercial data sources where information on specific composers can be located. A Linked Data timeline visualisation allows comparison between musicology and historical events published as linked data. Minted URIs for composers provide a single academic reference for related online materials, suitable for use in teaching environments.

Impact for Data Holders and Data Service Providers of Data Alignment (WP1, WP3, WP5)

Our partners (Grove, RISM, the British Library, etc.; see letters of support), who hold the main musicology sources, will have links to their data exposed on our codex website. By aligning data sources against the identifiers and attributes, MusicNet provides a rich representation of the musicology space. Data Holders benefit because their data is immediately linked to that of other data holders. An ongoing benefit to others is that they can openly associate with the URIs used here, to enable linking to other academic and commercial data sources. In doing so, their visibility to academic researchers and students is increased, because links to their data sources are published as links through our minted URIs. Publishing Linked Data creates more opportunities to link to more kinds of related sources; exposing Linked Data from their sources provides a representation of them in the Semantic Web, eventually meaning that their data can be browsed along with data from other domains, for example allowing Baroque music to be browsed alongside Baroque architecture.

Complementarity to other services:

Although the Library of Congress, for example, provides an authority service for names and items that can be used in the creation of library metadata, it is a commercially run subscription service, and so there is a price barrier to smaller organisations and individual creators of datasets. Many data providers with which we have worked on the musicSpace project have voiced objections to the naming schema used by the Library of Congress and other authority services, which typically use initials followed by surname and therefore do not provide the end user with full name information that may be available. Our proposal to use URIs to identify composers will mean that data partners are no longer limited in how they represent names in their data sources, while still being able to link to other sources that represent names differently. In addition, by including data about name variants in different sources, MusicNet will also address the issue of compatibility with legacy data. The findings from the NAMES project will be exploited to best present the name variations in addition to the core linked data output.

Impact for Linked Data Creators (WP2, WP5 and WP8)

Publishers of linked data recognise the usefulness of authoritative identifiers, in order for their data to be useful outside of its published context. For example, Yves Raimond (Senior Software Engineer, BBC) notes in his letter of support that they are currently required to use DBPedia and MusicBrainz as identifiers in their music linked data output, and that while DBPedia provides URIs for some composers, the coverage of composers is limited. Similarly, MusicBrainz URIs confuse performers and composers, leading to ambiguity when applied to Classical Music. These problems have impacted the usefulness of the BBC’s classical music linked data output. Each of these problems is addressed by MusicNet’s minted URIs and the persistence of the service.

 
 

Aims, Objectives and Final Outputs

22 Jun

Problem In any domain, a key activity of researchers is to search for and synthesise data from multiple sources in order to create new knowledge. In many cases this process is laborious, to the point of making certain questions nearly intractable because the cost of the search outstrips the time available to consider the work. As more resources are published as linked data this should mean that, with appropriate tools, data from multiple heterogeneous sources can be more rapidly discovered and automatically integrated. This will enable previously intractable queries to be explored, and more standard queries to be significantly accelerated. But linked data is not of itself a complete solution. A key challenge of linked data is that its strength is also its weakness: anyone can publish anything. So in classical music, for instance, 17 sources may publish work on Schubert, but there is no de facto way to know that any of these Schuberts are the same. The sources are not aligned. Without alignment, much of the benefit of linked data is diminished: resources can effectively be stranded rather than discovered, or become tangled nets of only guessed associations.

Proposed Solution To address these problems, this project proposes to produce a suite of resources and tools that will support effective linked data exploration with a focus on musicology. The project’s original data contribution will be archival, canonical linked data references, aka “minted” URIs, for classical music composers. These URIs will associate recognised reference data sources in Musicology like COPAC, RISM, Grove, the British Library, etc. (see partner letters) into standard representative pointers for composers. The original tools contribution will be data alignment mechanisms that will easily enable domain experts to associate any linked data resources with our minted reference URIs. The URIs and the alignment tools mean that musicologists as data contributors will be able to harmonise rather than replicate their resources with standard sources. Our instructional prototype contribution will be a Codex and a Visualiser. The codex will act as a dynamic catalogue of any linked data resource that uses our URIs. This prototype will act as a resource hub for musicologists: they will be able to access it with confidence that they are exploring well-aligned, disambiguated resources. Likewise for tool developers, this hub will be a clear data reference point for testing linked data resources. As an example of these features – resource hub, research access, tool demonstrator – we will provide a rich temporal visualisation tool. This visualisation will act as a model and service template, showing both how linked data can be richly visualised and explored by the researcher and how tool developers might take advantage of these affordances to develop new tool resources and interactions.

Domain We are focusing on musicology because we already have strong relationships with both commercial and research resource partners in musicology – Grove, BBC, British Library, COPAC, to name a few – where, through the AHRC musicSpace project we demonstrated how commercial and research developed heterogeneous data resources could be integrated for rapid exploration and knowledge building. Both the data partners of this project and our current musicSpace evaluators are keen to work with us to deliver minted URIs and these associated services that will make both their existing and new data more useful and usable by musicologists.

User Analysis We are focusing on minted URIs and data alignment services within linked data because our extensive experience in musicSpace with stakeholders and with the data resources themselves shows this service to be a sine qua non for linked data resources to be useful and usable.

Deliverables. This project will deliver:

a. An archival, canonical reference set of minted musicology URIs

b. An ongoing commitment to maintain this research for ongoing scholarship

c. A suite of tools to support the alignment and integration of new linked data resources for increased discovery and usefulness

d. A backlinks service that will associate new linked data resources published to our Codex with our minted URIs, making them easy to integrate into new tools and services. A model tool to show how these resources can be dynamically added and explored in a rich hierarchical timeline and visualised alongside other historical events.

These deliverables address the following specific aims of the call:

Make a collection of resources available on the Web as structured linked data

The project will produce and publish linked data about classical music composers using data from publishers partnering on the musicSpace project. This data will be exposed using existing linked data technology and will form the basis of an online source of canonical data about (and, in time, a comprehensive index of) musical composers. It is intended that as well as exposing basic metadata about each composer (for example birth/death date and nationality) the linked data will provide URLs that reference back into the online web catalogues of our data partners so that musicologists can immediately access all relevant data from each partner collection. Composer data is fundamental to the work of musicologists and music educators, and we see this as the essential first step in the provision of linked data services for classical music.
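As a rough illustration of what such composer metadata might look like when serialised, the sketch below emits N-Triples for one composer. It is not the project's actual data model: the choice of the FOAF, schema.org and RDFS properties, the example.org URIs and the function names are all assumptions made for the example.

```python
# Minimal sketch: serialise basic composer metadata as N-Triples.
# The vocabulary choices and all example.org URIs are assumptions.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
BIRTH_DATE = "http://schema.org/birthDate"
DEATH_DATE = "http://schema.org/deathDate"
SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def literal(value, datatype=None):
    """Format a string as an N-Triples literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    if datatype:
        return f'"{escaped}"^^<{datatype}>'
    return f'"{escaped}"'


def composer_triples(uri, name, born, died, catalogue_urls):
    """Return one N-Triples line per statement about the composer."""
    lines = [
        f"<{uri}> <{FOAF_NAME}> {literal(name)} .",
        f"<{uri}> <{BIRTH_DATE}> {literal(born, XSD_DATE)} .",
        f"<{uri}> <{DEATH_DATE}> {literal(died, XSD_DATE)} .",
    ]
    # Links back into partner catalogues (hypothetical URLs in the example).
    for url in catalogue_urls:
        lines.append(f"<{uri}> <{SEE_ALSO}> <{url}> .")
    return lines


if __name__ == "__main__":
    for line in composer_triples(
        "http://example.org/composer/franz-schubert",
        "Franz Schubert",
        "1797-01-31",
        "1828-11-19",
        ["http://example.org/catalogue/record/123"],
    ):
        print(line)
```

In practice an RDF library would normally handle serialisation; the hand-rolled version here only keeps the example free of dependencies.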

Due to the nature of linked data, and the requirement to support the hosting of the data output of the project past the project end date, we have agreed with the ECS systems team (see supporting letter) to develop a best practice for a packaged lightweight linked data deployment strategy, to enable ECS to sustain hosting of the linked data at the permanent URIs into the future. A report on our best practice recommendations for lightweight hosting of linked data will be published.

As part of producing the linked data, unique URIs will need to be minted for each composer that exists within the data partners’ current datasets. The project team comprises experts from both musicology and the Semantic Web, which ensures that the ideal skill sets are available for creating an authoritative and reusable URI scheme. Utilising domain knowledge and data licensed from trusted musicological scholarly catalogues, and in accordance with the ‘Four Rules to Linked Data’ as recommended by data.gov.uk, the project envisages producing the definitive URIs for musical composers, which can be trusted because they are backed by musicology scholars.
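To make the idea of minting concrete, here is one possible sketch of deriving a readable, unique URI from a composer's name. The actual scheme is a deliverable of WP3 and is not defined here; the base URL, the slug format and the use of birth year to separate namesakes are assumptions for illustration only.

```python
import re
import unicodedata

# Hypothetical base URL; the real namespace would be set by the URI scheme.
BASE = "http://example.org/composer/"


def slugify(name):
    """Reduce a name to lowercase ASCII words joined by hyphens."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def mint_uri(name, birth_year, taken):
    """Mint a URI not already in `taken`, adding the birth year for namesakes."""
    candidate = BASE + slugify(name)
    if candidate in taken:
        candidate = f"{candidate}-{birth_year}"
    if candidate in taken:
        raise ValueError(f"cannot mint a unique URI for {name!r}")
    taken.add(candidate)
    return candidate


if __name__ == "__main__":
    minted = set()
    print(mint_uri("Antonín Dvořák", 1841, minted))
    print(mint_uri("Johann Strauss", 1804, minted))
    print(mint_uri("Johann Strauss", 1825, minted))
```

A name-based slug is only one design option; an opaque identifier would avoid baking a preferred spelling into the URI, at the cost of readability.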

In addition to URI minting, the datasets from each data partner will need to be aligned to ensure that composers from one dataset will match up with the same composers from another. This matching should be capable of handling different formatting of names (composer disambiguation) as well as input errors occurring when the data partners digitised their catalogues. A subset of this co-reference alignment has been performed under the musicSpace project, and we propose for this project that the existing alignments are exposed as linked data, and that the alignment work be expanded to all composers within the data sets, by using an expanded version of our prototype alignment tool created for musicSpace.
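The kind of matching described above could be sketched as follows. This is not the musicSpace alignment tool: it is a minimal example, assuming that names are compared after simple normalisation, that similarity is measured with Python's standard `difflib`, and that the two thresholds (which are arbitrary here) decide whether a pair is accepted automatically or sent to a musicology expert for review.

```python
import unicodedata
from difflib import SequenceMatcher


def normalise(name):
    """Lowercase, strip accents, and turn 'Surname, Forenames' into 'forenames surname'."""
    if "," in name:
        surname, _, forenames = name.partition(",")
        name = f"{forenames} {surname}"
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.lower().split())


def similarity(a, b):
    """Similarity in [0, 1] between two names after normalisation."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def classify(a, b, accept=0.95, review=0.80):
    """Label a pair 'match', 'review' (needs expert approval) or 'distinct'.

    The threshold values are illustrative assumptions, not tuned figures.
    """
    score = similarity(a, b)
    if score >= accept:
        return "match"
    if score >= review:
        return "review"
    return "distinct"


if __name__ == "__main__":
    pairs = [
        ("Schubert, Franz", "Franz Schubert"),
        ("Franz Schubert", "Franz Shubert"),
        ("Bach, Johann Sebastian", "Franz Schubert"),
    ]
    for a, b in pairs:
        print(f"{a!r} vs {b!r}: {similarity(a, b):.2f} -> {classify(a, b)}")
```

Name similarity alone cannot separate genuine namesakes, so a real tool would presumably also compare attributes such as birth and death dates before accepting a match.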

Develop a prototype with instructional step-by-step demonstration and documentation

During the musicSpace project and through engagement with musicologists at the University of Southampton and the musicological community more broadly (including stakeholders identified at Durham University and Royal Holloway) through musicSpace’s dissemination and demo activities, it was apparent that a number of crucial research and education tools have yet to be developed. The data required for these tools however is available, albeit in an unhelpful format. A prime example of a cited teaching aid for HE musicology students was that of a timeline visualisation. Currently the passage of time and influence of composers throughout history can only be understood by time-consuming information-triage across the multiple online musicology catalogues. If however linked data were available it would be possible to make use of the popular open source timeline software Simile to better understand the temporal relationship between composers. Students could then use the timeline as an entry point into the multiple online musicology catalogues rather than themselves having to perform an exhaustive search of each. A benefit of using a Linked Data approach here is that any other Linked Data sources can be added to the same timelines, so that correlations between other historical events and music can be shown on the timeline, providing additional context for end-users.
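As a sketch of how composer data might be prepared for such a timeline, the example below turns composer records into a chronologically ordered JSON event list. The field names (`title`, `start`, `end`, `link`) and the example.org URIs are assumptions; they would need to be adapted to whatever event format the chosen timeline software actually expects.

```python
import json


def timeline_events(composers):
    """Build a JSON document of duration events, ordered by birth year."""
    events = [
        {
            "title": c["name"],
            "start": c["born"],
            "end": c["died"],
            "link": c["uri"],  # entry point back to the composer's minted URI
        }
        for c in sorted(composers, key=lambda c: c["born"])
    ]
    return json.dumps({"events": events}, indent=2)


if __name__ == "__main__":
    composers = [
        {"name": "Franz Schubert", "born": "1797", "died": "1828",
         "uri": "http://example.org/composer/franz-schubert"},
        {"name": "Ludwig van Beethoven", "born": "1770", "died": "1827",
         "uri": "http://example.org/composer/ludwig-van-beethoven"},
    ]
    print(timeline_events(composers))
```

Because each event is a plain record, events drawn from other linked data sources (for example historical events) could be appended to the same list before rendering.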

In order to directly unlock the benefits of the Linked Data to non-technical end-users, a Codex will be created on top of the data that will allow musicologists to search for items of interest, and to get links to all references to those items in the partner collections. For example, a user is interested in the works of Beethoven, and searches the Codex. The search finds all Linked Data that references Beethoven and offers links to all of the collections so that the user can quickly explore the data from those providers. These links include the musicSpace data partners as well as Linked Data publishers such as MusicBrainz, DBPedia and the BBC, allowing users to listen to works through the BBC iPlayer, using their existing Linked Data output which includes classical music performances on Radio 3. The Codex will also utilise the backlinks technology, developed by the co-located enAKTing project, to automatically update the codex’s links to show all catalogues that utilise our minted URIs, so that future uses of the URIs are exposed to users.
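The lookup behaviour described for the Codex can be sketched as a small in-memory index. This is an illustration only: the class and method names, the provider labels and the example.org URLs are hypothetical, and a real Codex would presumably query a triple store and receive its backlinks from the backlinks service rather than from direct method calls.

```python
from collections import defaultdict


class Codex:
    """Toy index from minted composer URIs to the resources that reference them."""

    def __init__(self):
        self._names = {}                  # lowercase composer name -> minted URI
        self._links = defaultdict(list)   # minted URI -> [(provider, url), ...]

    def register(self, uri, name):
        """Record a minted URI under the composer's name."""
        self._names[name.lower()] = uri

    def add_backlink(self, uri, provider, url):
        """Record that a provider's resource references a minted URI."""
        if (provider, url) not in self._links[uri]:
            self._links[uri].append((provider, url))

    def search(self, query):
        """Return {uri: [(provider, url), ...]} for names containing the query."""
        q = query.lower()
        return {
            uri: list(self._links.get(uri, []))
            for name, uri in self._names.items()
            if q in name
        }


if __name__ == "__main__":
    codex = Codex()
    uri = "http://example.org/composer/ludwig-van-beethoven"
    codex.register(uri, "Ludwig van Beethoven")
    codex.add_backlink(uri, "Partner catalogue", "http://example.org/catalogue/42")
    codex.add_backlink(uri, "Broadcaster", "http://example.org/programmes/7")
    for found, links in codex.search("beethoven").items():
        print(found)
        for provider, url in links:
            print(f"  {provider}: {url}")
```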

The project intends to produce the above prototype visualisations to meet the specific needs of musicology students and, by proxy, their educators and lecturers. In addition it will also provide rich documentation to allow future projects to make use of the underlying data that is being exposed. Documentation will also be provided to demonstrate how third party datasets, whether currently known or not yet existing, can be joined with the published linked data to add additional information or meaning. Video tutorials on how to find Linked Data on the web, and how to explore it with the data we expose, will be posted to YouTube and the project blog.

Explore and report on the opportunities and barriers in making content structured

It is anticipated that much will be learned through the alignment of multiple data sources and that the tools generated to aid this technique will be useful for other research domains. The project intends to regularly publish findings on the project blog, specifically regarding the discovery of similar resources within dissimilar non-structured datasets and how best to converge these into a single canonical structured linked dataset.

More formalised discovery resulting from the efforts of the project will be deposited into the School of Electronics & Computer Science at the University of Southampton’s EPrints Open Access online repository.

One of the roles of the musicologist on this project is to highlight important sources within musicology that can be further leveraged by conversion into Linked Data.