TaPP 2015

7th International Workshop on Theory and Practice of Provenance
Edinburgh, Scotland
July 8-9, 2015
In cooperation with USENIX, ACM SIGPLAN and ACM SIGMOD

Affiliated with BICOD 2015, July 6-8, Edinburgh, United Kingdom

The Theory and Practice of Provenance workshop series started in San Francisco in 2009 and has since been held in San Jose (2010), Heraklion, Crete (2011), Boston (2012), Lombard, Illinois (2013) and Cologne, Germany (2014, as part of ProvenanceWeek 2014). TaPP aims to be a venue for early-stage and innovative research ideas related to provenance, and a forum to encourage exchange of ideas between researchers working on provenance and practitioners or potential users of such research. Industry and academic participants interested in provenance in any setting are welcome, and workshop contributions describing unsolved problems or new potential application areas for provenance research are particularly welcome.

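We are pleased to announce that TaPP will feature two invited speakers: Professor Renee Miller (University of Toronto) and Professor Trevor Martin (University of Bristol).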

We gratefully acknowledge support from the Scottish Informatics and Computer Science Alliance (SICSA) for this event. Thanks to their generous support, we will be able to offer free registration to up to 6 PhD students at Scottish institutions for TaPP/BICOD.

Call for papers


Provenance provides needed insight into the origins and derivation of data, as well as formal documentation that can be instrumental in data quality assessment, program debugging, and search. Research topics of relevance to TaPP span the entire metadata lifecycle: from modelling to capture, storage, usage, querying and mining, to security and interoperable exchange across systems. TaPP also invites application-oriented contributions, on provenance-aware systems and other practical usage of provenance.

Workshop Format

In keeping with its successful tradition, TaPP'15 is a workshop, as opposed to a mini-conference. We aim to provide a platform for presenting and discussing a range of fresh ideas, and actively encourage inter-disciplinary work beyond the confines of the data management community.

Important dates

Abstract submission: April 20, 2015
Paper submission deadline: May 1, 2015 (extended from April 27, 2015)
Poster submission deadline: May 29, 2015 (extended from May 25, 2015)
Author notification: June 4, 2015 (moved from June 1, 2015)
Final versions due: June 15, 2015
Workshop: July 8-9, 2015

Submission instructions

What to Submit

Research Papers: Contributions are typically 4 and never more than 6 pages long. They should present challenges for provenance research, brief descriptions of new applications, pie-in-the-sky research ideas, and anything else that will help engage the researchers' minds. While brief and readable descriptions of research are encouraged, recycled conference submissions are strongly discouraged. Contributions are collected into online proceedings, hosted by USENIX and indexed by DBLP, Google Scholar, etc.

We expect the program to include a mixture of presentations and discussions. Authors should expect ample opportunity to present their ideas at the workshop.

Submissions should be no more than 6 pages in ACM SIGPLAN (two-column) format. If supporting material is needed, it may be included in an appendix, but the committee will not be obliged to read the appendix. Shorter submissions (under 4 pages) are welcome to describe early-stage work, and are not considered to be part of the formal proceedings.

Research paper contributions should be submitted online at: https://www.easychair.org/conferences/?conf=tapp15. As in previous years, accepted TaPP papers will be open access via a USENIX web site.

Posters: Please submit a 1-page poster proposal (in any reasonable format) summarizing the research topic of your poster presentation to tapp15@easychair.org, by the poster deadline. A draft of the poster itself may also be submitted but is not necessary. Authors of accepted poster proposals will be allocated a poster board area large enough for an A0 or A1 poster. Abstracts of posters will also be included on the TaPP proceedings site.


Conference Chairs
Program Committee


Presenters: Please note that all presentations are 20 minutes plus 5 minutes for questions.

Renee Miller's keynote is held in room G.07, jointly with BICOD. All other sessions are in IF G.07A.

July 8 (Wednesday)
12:00 Registration begins / lunch available
14:00-15:00 Keynote by Renee Miller, Big Data Curation (Chair: Paolo Missier)
15:30-17:10 Session I: Capture (Chair: Boris Glavic)
July 9 (Thursday)
9:00-10:00 Keynote by Trevor Martin, Uncertainty and Provenance in Collaborative Situation Awareness (Chair: Jun Zhao)
10:30-12:10 Session II: Query and System (Chair: Bertram Ludäscher)
12:10-12:40 3-Minute Gong show -- poster pitches (Chair: Simon Miles)
12:40-14:00 Poster session + Lunch
14:00-15:15 Session III: Scientific applications (Chair: Floris Geerts)
15:45-17:00 Session IV: Foundations (Chair: Adriane Chapman)
17:00-17:30 Town Hall and conclusions

Accepted papers (by session):

Session I: Capture (Chair: Boris Glavic)
Session II: Query and System (Chair: Bertram Ludäscher)
Session III: Scientific applications (Chair: Floris Geerts)
Session IV: Foundations (Chair: Adriane Chapman)

Accepted posters:

Invited talks

Professor Renee Miller, University of Toronto, Big Data Curation

More than a decade ago, Peter Buneman used the term curated databases to refer to databases that are created and maintained using the (often substantial) effort and domain expertise of humans. These human experts clean the data, integrate it with new sources, prepare it for analysis, and share the data with other experts in their field. In data curation, one seeks to support human curators in all activities needed for maintaining and enhancing the value of their data over time. Curation includes data provenance, the process of understanding the origins of data, how it was created, cleaned, or integrated. Big Data offers opportunities to solve curation problems in new ways. The availability of massive data is making it possible to infer semantic connections among data, connections that are central to solving difficult integration, cleaning, and analysis problems. Some of the nuanced semantic differences that eluded enterprise-scale curation solutions can now be understood using evidence from Big Data. Big Data Curation leverages the human expertise that has been embedded in Big Data, be it in general knowledge data that has been created through mass collaboration, or in specialized knowledge-bases created by incentivized user communities who value the creation and maintenance of high-quality data. In this talk, I describe our experience in Big Data Curation. This includes our experience over the last five years curating NIH Clinical Trials data that we have published as Open Linked Data at linkedCT.org. I give an overview of how we have adapted some of the traditional solutions for data curation to account for (and take advantage of) Big Data.

Professor Trevor Martin, University of Bristol, Uncertainty and Provenance in Collaborative Situation Awareness

The so-called big data revolution has been characterised by an increase in sources of data as well as in the volume of data to be processed.

In many cases - for example, network behaviour and control, security monitoring, enterprise management information - the data for situation awareness and decision-making is drawn from multiple sources and must be integrated into a coherent whole as far as possible.

This process generally requires both machines and human analysts and experts. It includes compensating for different formats, different granularities and resolutions, identifying and correcting errors (both systematic and intermittent), as well as managing uncertainties and gaps in data. Often the process requires assumptions and choices to be made in arriving at a reasonably robust overview of a situation - for example, in deciding that a failed attempt to access a building is potentially malicious, we might need to take account of someone's recent travel, long-term patterns of behaviour, current schedules of close colleagues, etc., where each of these components may have been derived from lower-level raw data. Provenance in this context refers to the derivation pathways and their overall reliability.

In this talk, I will describe the use of graded (fuzzy) representations in modelling and managing the uncertainties, reliability and granularity of derived data in combining sources for situation awareness.


As in previous years, the proceedings of TaPP 2015 will be published electronically by USENIX. The proceedings will be available to registered participants before the workshop and will be available under open access terms (with no access fees) permanently after the workshop.

USENIX site for TaPP 2015 proceedings


Registration is coordinated with BICOD 2015 and it is possible to register for each event separately or jointly for both events (at a small discount). Registration includes access to the conference sessions, reception, coffee breaks and lunch.

The full registration fee for TaPP 2015 is £150. The student registration fee is £120. There is no early registration discount.

Registration is now closed.

Free registration for students

Up to six PhD students from Scottish universities will be offered free registration for BICOD and TaPP subsidised by SICSA. If you are an eligible student who wishes to register for BICOD and TaPP, please contact James Cheney (jcheney@inf.ed.ac.uk). These free places will be offered on a first-come, first-served basis. Please apply as soon as possible.

Local information

Please see the BICOD 2015 local information page for information about travel to Edinburgh, hotels, the conference venue, and sightseeing options.

If you are in need of a letter to support an application for a UK Visa, please follow the instructions here as soon as possible.


10-Jul-2015: TaPP 2015 is over! TaPP 2016 is planned as part of ProvenanceWeek 2016, in the Washington DC area, June 6-9, 2016!

19-Jun-2015: The program and list of accepted posters are available

4-Jun-2015: The list of accepted papers is available

27-May-2015: Poster deadline extended to May 29

4-May-2015: Registration is open!

22-Apr-2015: Submission deadline extended to May 1

24-Mar-2015: Updated with more information about invited speakers

13-Feb-2015: Updated with link to ACM SIGPLAN proceedings format

16-Jan-2015: Updated page with links to USENIX proceedings page and local information.

16-Dec-2014: Preliminary call for papers.

17-Nov-2014: TaPP 2015 web page posted.