TaPP 2015

Affiliated with BICOD 2015, July 6-8, Edinburgh, United Kingdom

The Theory and Practice of Provenance workshop series started in San Francisco in 2009 and has since been held in San Jose (2010), Heraklion, Crete (2011), Boston (2012), Lombard, Illinois (2013) and Cologne, Germany (2014, as part of ProvenanceWeek 2014). TaPP aims to be a venue for early-stage and innovative research ideas related to provenance, and a forum to encourage exchange of ideas between researchers working on provenance and practitioners or potential users of such research. Industry and academic participants interested in provenance in any setting are welcome, and workshop contributions describing unsolved problems or new potential application areas for provenance research are particularly welcome.

We are pleased to announce that TaPP will feature two invited speakers:

Professor Renee Miller, University of Toronto/IBM (joint invited speaker with BICOD 2015), Big Data Curation
Professor Trevor Martin, University of Bristol, Uncertainty and Provenance in Collaborative Situation Awareness

We gratefully acknowledge support from the Scottish Informatics and Computer Science Alliance (SICSA) for this event. Thanks to their generous support, we will be able to offer free registration to up to 6 PhD students at Scottish institutions for TaPP/BICOD.

Call for papers

Focus

Provenance provides needed insight into the origins and derivation of data, as well as formal documentation that can be instrumental in data quality assessment, program debugging, and search. Research topics of relevance to TaPP span the entire metadata lifecycle: from modelling to capture, storage, usage, querying and mining, to security and interoperable exchange across systems. TaPP also invites application-oriented contributions on provenance-aware systems and other practical uses of provenance.

Workshop Format

In keeping with its successful tradition, TaPP'15 is a workshop, as opposed to a mini-conference. We aim to provide a platform for presenting and discussing a range of fresh ideas, and actively encourage inter-disciplinary work beyond the confines of the data management community.

Important dates

Abstract submission	April 20, 2015
Paper submission deadline	May 1, 2015 (extended from April 27, 2015)
Poster submission deadline	May 29, 2015 (extended from May 25, 2015)
Author notification	June 4, 2015 (moved from June 1, 2015)
Final versions due	June 15, 2015
Workshop	July 8-9, 2015

Submission instructions

What to Submit

Research Papers: Contributions are typically 4 and never more than 6 pages long. They should describe challenges for provenance research, new applications, pie-in-the-sky research ideas, or anything else that will help engage the researchers' minds. While brief and readable descriptions of research are encouraged, recycled conference submissions are strongly discouraged. Contributions are collected into online proceedings, hosted by USENIX and indexed by DBLP, Google Scholar, etc.

We expect the program to include a mixture of presentations and discussions. Authors should expect ample opportunity to present their ideas at the workshop.

Submissions should be no more than 6 pages in ACM SIGPLAN (two-column) format. If supporting material is needed, it may be included in an appendix, but the committee will not be obliged to read the appendix. Shorter submissions (under 4 pages) are welcome to describe early-stage work, and are not considered to be part of the formal proceedings.

Research paper contributions should be submitted online at: https://www.easychair.org/conferences/?conf=tapp15. As in previous years, accepted TaPP papers will be open access via a USENIX web site.

Posters: Please submit a 1-page poster proposal (in any reasonable format) summarizing the research topic of your poster presentation to tapp15@easychair.org, by the poster deadline. A draft of the poster itself may also be submitted but is not necessary. Authors of accepted poster proposals will be allocated a poster board area large enough for an A0 or A1 poster. Abstracts of posters will also be included on the TaPP proceedings site.

Organizers

Conference Chairs

Paolo Missier, Newcastle University, PC co-chair
Jun Zhao, Lancaster University, PC co-chair
James Cheney, The University of Edinburgh, local chair

Program Committee

Vanessa Braganholo, UFF, Brasil
Adriane Chapman, The MITRE Corporation, USA
Sarah Cohen-Boulakia, LRI, Universite Paris-Sud, France
Vasa Curcin, King's College London, UK
Tom De Nies, Ghent University - iMinds, Belgium
Saumen Dey, UC Davis, USA
Lois Delcambre, Portland State University, USA
Alan Fekete, University of Sydney, Australia
Irini Fundulaki, ICS-FORTH, Greece
Floris Geerts, University of Antwerp, Belgium
Ashish Gehani, SRI International, USA
Boris Glavic, Illinois Institute of Technology, USA
Paul Groth, Elsevier, NL
Melanie Herschel, University of Stuttgart, Germany
Bertram Ludäscher, UIUC and NCSA, USA
Simon Miles, King's College London, UK
Luc Moreau, University of Southampton, UK
Paolo Papotti, QCRI, Qatar
Sudeepa Roy, University of Washington, USA
Perdita Stevens, University of Edinburgh, UK

Program

Presenters: Please note that all presentations are 20 minutes plus 5 minutes for questions.

Renee Miller's keynote is held in room G.07, jointly with BICOD. All other sessions are in IF G.07A.

July 8 (Wednesday)
12.00	Registration begins / lunch available
13.55-14.00	Welcome
14.00-15.00	Keynote by Renee Miller, Big Data Curation (Chair: Paolo Missier)
15.00-15.30	Break
15.30-17.10	Session I: Capture (Chair: Boris Glavic)
17.30-19.30	Reception

July 9 (Thursday)
8.55-9.00	Opening
9.00-10.00	Keynote by Trevor Martin, Uncertainty and Provenance in Collaborative Situation Awareness (Chair: Jun Zhao)
10.00-10.30	Break
10.30-12.10	Session II: Query and System (Chair: Bertram Ludäscher)
12.10-12.40	3-Minute Gong show -- poster pitches (Chair: Simon Miles)
12.40-14.00	Poster session + Lunch
14.00-15.15	Session III: Scientific applications (Chair: Floris Geerts)
15.15-15.45	Break
15.45-17.00	Session IV: Foundations (Chair: Adriane Chapman)
17.00-17.30	Town Hall and conclusions

Accepted papers (by session):

Session I: Capture (Chair: Boris Glavic)

Timothy McPhillips, Shawn Bowers, Khalid Belhajjame, and Bertram Ludäscher. Retrospective Provenance Without a Runtime Provenance Recorder (Slides)
Luc Moreau and Paul Groth. Provenance of Publications: A PROV style for LaTeX
Manolis Stamatogiannakis, Paul Groth and Herbert Bos. Decoupling Provenance Capture and Analysis from Execution (Slides)
David Gammack and Adriane Chapman. Provenance Tipping Point

Session II: Query and System (Chair: Bertram Ludäscher)

Amit Chavan, Silu Huang, Amol Deshpande, Aaron Elmore, Samuel Madden and Aditya Parameswaran. Towards a unified query language for provenance and versioning (Slides)
Xing Niu, Raghav Kapoor, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic. Interoperability for Provenance-aware Databases using PROV and JSON (Slides)
Adam Bates, Kevin Butler and Thomas Moyer. Take Only What You Need: Leveraging Mandatory Access Control Policy to Reduce Provenance Storage Costs
Nikilesh Balakrishnan, Thomas Bytheway, Lucian Carata, Oliver R. A. Chick, James Snee, Sherif Akoush, Ripduman Sohan, Margo Seltzer, and Andy Hopper. Recent Advances in Computer Architecture: The opportunities and challenges for provenance (Slides)

Session III: Scientific applications (Chair: Floris Geerts)

Daniel de Oliveira, Vítor Silva and Marta Mattoso. How much domain data should be in provenance databases?
Joao Pimentel, Juliana Freire, Leonardo Murta and Vanessa Braganholo. Collecting and Analyzing Provenance on Interactive Notebooks: when IPython meets noWorkflow (Slides)
Saumen Dey, Khalid Belhajjame, David Koop, Meghan Raul, and Bertram Ludäscher. Linking prospective and retrospective provenance in scripts

Session IV: Foundations (Chair: Adriane Chapman)

Stefan Fehrenbach and James Cheney. Language-integrated Provenance in Links (Slides)
Boris Glavic, Sven Köhler, Sean Riddle, and Bertram Ludäscher. Towards Constraint-based Explanations for Answers and Non-Answers (Slides)
Maxime Debosschere and Floris Geerts. Cell-based causality for data repairs

Accepted posters:

Boris Glavic, Tanu Malik, Quan Pham, Making Database Applications Shareable
Simone I. Conte, Alan Dearle, Graham N. C. Kirby, Adrian O'Lenskie and Ian Paterson, Modelling Context and Provenance in a Sea of Data
Xing Niu, Raghav Kapoor, Boris Glavic, Heuristic and Cost-based Optimization for Provenance Computation
Weili Fu, Provenance for configuration language security
Tanu Malik, Miao Yu, Cristian Vlaescu, PROVaaS: A Pay-as-you-go Service for Storing and Querying Provenance Data
Alessandro Spinuso, Rosa Filgueira and Malcolm Atkinson, User Centered Provenance Management for Data Intensive Platforms
Erisa Karafili, Hanne Riis Nielson and Flemming Nielson, Coordination languages for Provenance
Mojtaba Eskandari, Bruno Crispo and Anderson Santana De Oliveira, A Proposal Architecture for Logical Data Tracking in Cloud
Julien Lacroix and Omar Boucelma, Specification of a Provenance-Based Access Control Approach with PROV-CONSTRAINTS
Mike Mineter, Collaboration for Research Enhancement by Active Metadata

Invited talks

Professor Renee Miller, University of Toronto, Big Data Curation

More than a decade ago, Peter Buneman used the term curated databases to refer to databases that are created and maintained using the (often substantial) effort and domain expertise of humans. These human experts clean the data, integrate it with new sources, prepare it for analysis, and share the data with other experts in their field. In data curation, one seeks to support human curators in all activities needed for maintaining and enhancing the value of their data over time. Curation includes data provenance, the process of understanding the origins of data, how it was created, cleaned, or integrated. Big Data offers opportunities to solve curation problems in new ways. The availability of massive data is making it possible to infer semantic connections among data, connections that are central to solving difficult integration, cleaning, and analysis problems. Some of the nuanced semantic differences that eluded enterprise-scale curation solutions can now be understood using evidence from Big Data. Big Data Curation leverages the human expertise that has been embedded in Big Data, be it in general knowledge data that has been created through mass collaboration, or in specialized knowledge-bases created by incentivized user communities who value the creation and maintenance of high quality data. In this talk, I describe our experience in Big Data Curation. This includes our experience over the last five years curating NIH Clinical Trials data that we have published as Open Linked Data at linkedCT.org. I give an overview of how we have adapted some of the traditional solutions for data curation to account for (and take advantage of) Big Data.

Professor Trevor Martin, University of Bristol, Uncertainty and Provenance in Collaborative Situation Awareness

The so-called big data revolution has been characterised by an increase in sources of data as well as in the volume of data to be processed.

In many cases - for example, network behaviour and control, security monitoring, enterprise management information - the data for situation awareness and decision-making is drawn from multiple sources and must be integrated into a coherent whole as far as possible.

This process generally requires both machines and human analysts and experts. It includes compensating for different formats, different granularities and resolutions, identifying and correcting errors (both systematic and intermittent), as well as managing uncertainties and gaps in data. Often the process requires assumptions and choices to be made in arriving at a reasonably robust overview of a situation - for example, in deciding that a failed attempt to access a building is potentially malicious, we might need to take account of someone's recent travel, long term patterns of behaviour, current schedules of close colleagues, etc., where each of these components may have been derived from lower-level raw data. Provenance in this context refers to the derivation pathways and their overall reliability.

In this talk, I will describe the use of graded (fuzzy) representations in modelling and managing the uncertainties, reliability and granularity of derived data in combining sources for situation awareness.

Proceedings

As in previous years, the proceedings of TaPP 2015 will be published electronically by USENIX. The proceedings will be available to registered participants before the workshop and will be available under open access terms (with no access fees) permanently after the workshop.

USENIX site for TaPP 2015 proceedings

Registration

Registration is coordinated with BICOD 2015 and it is possible to register for each event separately or jointly for both events (at a small discount). Registration includes access to the conference sessions, reception, coffee breaks and lunch.

The full registration fee for TaPP 2015 is £150. The student registration fee is £120. There is no early registration discount.

Registration is now closed.

Free registration for students

Up to six PhD students from Scottish universities will be offered free registration for BICOD and TaPP subsidised by SICSA. If you are an eligible student who wishes to register for BICOD and TaPP, please contact James Cheney (jcheney@inf.ed.ac.uk). These free places will be offered on a first-come, first-served basis. Please apply as soon as possible.

Local information

Please see the BICOD 2015 local information page for information about travel to Edinburgh, hotels, the conference venue, and sightseeing options.

If you are in need of a letter to support an application for a UK Visa, please follow the instructions here as soon as possible.

Announcements

10-Jul-2015: TaPP 2015 is over! TaPP 2016 is planned as part of ProvenanceWeek 2016, in the Washington DC area, June 6-9, 2016!

19-Jun-2015: The program and list of accepted posters are available

4-Jun-2015: The list of accepted papers is available

27-May-2015: Poster deadline extended to May 29

4-May-2015: Registration is open!

22-Apr-2015: Submission deadline extended to May 1

24-Mar-2015: Updated with more information about invited speakers

13-Feb-2015: Updated with link to ACM SIGPLAN proceedings format

16-Jan-2015: Updated page with links to USENIX proceedings page and local information.

16-Dec-2014: Preliminary call for papers.

17-Nov-2014: TaPP 2015 web page posted.