Affiliated with BICOD 2015, July 6-8, Edinburgh, United Kingdom
The Theory and Practice of Provenance workshop series was started in San Francisco in 2009 and held in San Jose (2010), Heraklion, Crete (2011), Boston (2012), Lombard, Illinois (2013) and Cologne, Germany (2014, as part of ProvenanceWeek 2014). TaPP aims to be a venue for early-stage and innovative research ideas related to provenance, and a forum to encourage exchange of ideas between researchers working on provenance and practitioners or potential users of such research. Industry and academic participants interested in provenance in any setting are welcome, and workshop contributions describing unsolved problems or new potential application areas for provenance research are particularly welcome.
We are pleased to announce that TaPP will feature two invited speakers:
- Professor Renee Miller, University of Toronto/IBM (joint invited speaker with BICOD 2015), Big Data Curation
- Professor Trevor Martin, University of Bristol, Uncertainty and Provenance in Collaborative Situation Awareness
We gratefully acknowledge support from the Scottish Informatics and Computer Science Alliance (SICSA) for this event. Thanks to their generous support, we will be able to offer free registration to up to 6 PhD students at Scottish institutions for TaPP/BICOD.
Call for papers
Focus
Provenance provides needed insight into the origins and derivation of data, as well as formal documentation that can be instrumental in data quality assessment, program debugging, and search. Research topics of relevance to TaPP span the entire metadata lifecycle: from modelling to capture, storage, usage, querying and mining, to security and interoperable exchange across systems. TaPP also invites application-oriented contributions on provenance-aware systems and other practical uses of provenance.
Workshop Format
In keeping with its successful tradition, TaPP'15 is a workshop, as opposed to a mini-conference. We aim to provide a platform for presenting and discussing a range of fresh ideas, and actively encourage interdisciplinary work beyond the confines of the data management community.
Important dates
Abstract submission | April 20, 2015 |
Paper submission deadline | |
Poster submission deadline | |
Author notification | |
Final versions due | June 15, 2015 |
Workshop | July 8-9, 2015 |
Submission instructions
What to Submit
Research Papers: Contributions are typically 4 and never more than 6 pages long. They should present challenges for provenance research, brief descriptions of new applications, pie-in-the-sky research ideas, and anything else that will help engage the researchers' minds. While brief and readable descriptions of research are encouraged, recycled conference submissions are strongly discouraged. Contributions are collected into online proceedings, hosted by USENIX and indexed by DBLP, Google Scholar, etc.
We expect the program to include a mixture of presentations and discussions. Authors should expect ample opportunity to present their ideas at the workshop.
Submissions should be no more than 6 pages in ACM SIGPLAN (two-column) format. If supporting material is needed, it may be included in an appendix, but the committee will not be obliged to read the appendix. Shorter submissions (under 4 pages) are welcome to describe early-stage work, and are not considered to be part of the formal proceedings.
Research paper contributions should be submitted online at: https://www.easychair.org/conferences/?conf=tapp15. As in previous years, accepted TaPP papers will be open access via a USENIX web site.
Posters: Please submit a 1-page poster proposal (in any reasonable format) summarizing the research topic of your poster presentation to tapp15@easychair.org, by the poster deadline. A draft of the poster itself may also be submitted but is not necessary. Authors of accepted poster proposals will be allocated a poster board area large enough for an A0 or A1 poster. Abstracts of posters will also be included on the TaPP proceedings site.
Organizers
Conference Chairs
- Paolo Missier, Newcastle University, PC co-chair
- Jun Zhao, Lancaster University, PC co-chair
- James Cheney, The University of Edinburgh, local chair
Program Committee
- Vanessa Braganholo, UFF, Brasil
- Adriane Chapman, The MITRE Corporation, USA
- Sarah Cohen-Boulakia, LRI, Universite Paris-Sud, France
- Vasa Curcin, King's College, London, UK
- Tom De Nies, Ghent University - iMinds, Belgium
- Saumen Dey, UC Davis, USA
- Lois Delcambre, Portland State University, USA
- Alan Fekete, University of Sydney, Australia
- Irini Fundulaki, ICS-FORTH, Greece
- Floris Geerts, University of Antwerp, Belgium
- Ashish Gehani, SRI International, USA
- Boris Glavic, Illinois Institute of Technology, USA
- Paul Groth, Elsevier, NL
- Melanie Herschel, University of Stuttgart, Germany
- Bertram Ludäscher, UIUC and NCSA, USA
- Simon Miles, King's College London, UK
- Luc Moreau, University of Southampton, UK
- Paolo Papotti, QCRI, Qatar
- Sudeepa Roy, University of Washington, USA
- Perdita Stevens, University of Edinburgh, UK
Program
Presenters: Please note that all presentations are 20 minutes plus 5 minutes for questions.
Renee Miller's keynote is held in room G.07, jointly with BICOD. All other sessions are in IF G.07A.
July 8 (Wednesday) | |
12.00 | Registration begins / lunch available |
13.55-14.00 | Welcome |
14.00-15.00 | Keynote by Renee Miller, Big Data Curation (chair: Paolo Missier) |
15.00-15.30 | Break |
15.30-17.10 | Session I: Capture (chair: Boris Glavic) |
17.30-19.30 | Reception |
July 9 (Thursday) | |
8.55-9.00 | Opening |
9.00-10.00 | Keynote by Trevor Martin, Uncertainty and Provenance in Collaborative Situation Awareness (chair: Jun Zhao) |
10.00-10.30 | Break |
10.30-12.10 | Session II: Query and System (chair: Bertram Ludäscher) |
12.10-12.40 | 3-Minute Gong show -- poster pitches (chair: Simon Miles) |
12.40-14.00 | Poster session + Lunch |
14.00-15.15 | Session III: Scientific applications (chair: Floris Geerts) |
15.15-15.45 | Break |
15.45-17.00 | Session IV: Foundations (chair: Adriane Chapman) |
17.00-17.30 | Town Hall and conclusions |
Accepted papers (by session):
Session I: Capture (Chair: Boris Glavic)
- Timothy McPhillips, Shawn Bowers, Khalid Belhajjame, and Bertram Ludäscher. Retrospective Provenance Without a Runtime Provenance Recorder (Slides)
- Luc Moreau and Paul Groth. Provenance of Publications: A PROV style for latex
- Manolis Stamatogiannakis, Paul Groth and Herbert Bos. Decoupling Provenance Capture and Analysis from Execution (Slides)
- David Gammack and Adriane Chapman. Provenance Tipping Point
Session II: Query and System (Chair: Bertram Ludäscher)
- Amit Chavan, Silu Huang, Amol Deshpande, Aaron Elmore, Samuel Madden and Aditya Parameswaran. Towards a unified query language for provenance and versioning (Slides)
- Xing Niu, Raghav Kapoor, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic. Interoperability for Provenance-aware Databases using PROV and JSON (Slides)
- Adam Bates, Kevin Butler and Thomas Moyer. Take Only What You Need: Leveraging Mandatory Access Control Policy to Reduce Provenance Storage Costs
- Nikilesh Balakrishnan, Thomas Bytheway, Lucian Carata, Oliver R. A. Chick, James Snee, Sherif Akoush, Ripduman Sohan, Margo Seltzer, and Andy Hopper. Recent Advances in Computer Architecture: The opportunities and challenges for provenance (Slides)
Session III: Scientific applications (Chair: Floris Geerts)
- Daniel de Oliveira, Vítor Silva and Marta Mattoso. How much domain data should be in provenance databases?
- Joao Pimentel, Juliana Freire, Leonardo Murta and Vanessa Braganholo. Collecting and Analyzing Provenance on Interactive Notebooks: when IPython meets noWorkflow (Slides)
- Saumen Dey, Khalid Belhajjame, David Koop, Meghan Raul, and Bertram Ludäscher. Linking prospective and retrospective provenance in scripts
Session IV: Foundations (Chair: Adriane Chapman)
- Stefan Fehrenbach and James Cheney. Language-integrated Provenance in Links (Slides)
- Boris Glavic, Sven Köhler, Sean Riddle, and Bertram Ludäscher. Towards Constraint-based Explanations for Answers and Non-Answers (Slides)
- Maxime Debosschere and Floris Geerts. Cell-based causality for data repairs
Accepted posters:
- Boris Glavic, Tanu Malik, Quan Pham, Making Database Applications Shareable
- Simone I. Conte, Alan Dearle, Graham N. C. Kirby, Adrian O'Lenskie and Ian Paterson, Modelling Context and Provenance in a Sea of Data
- Xing Niu, Raghav Kapoor, Boris Glavic, Heuristic and Cost-based Optimization for Provenance Computation
- Weili Fu, Provenance for configuration language security
- Tanu Malik, Miao Yu, Cristian Vlaescu, PROVaaS: A Pay-as-you-go Service for Storing and Querying Provenance Data
- Alessandro Spinuso, Rosa Filgueira and Malcolm Atkinson, User Centered Provenance Management for Data Intensive Platforms
- Erisa Karafili, Hanne Riis Nielson and Flemming Nielson, Coordination languages for Provenance
- Mojtaba Eskandari, Bruno Crispo and Anderson Santana De Oliveira, A Proposal Architecture for Logical Data Tracking in Cloud
- Julien Lacroix and Omar Boucelma, Specification of a Provenance-Based Access Control Approach with PROV-CONSTRAINTS
- Mike Mineter, Collaboration for Research Enhancement by Active Metadata
Invited talks
Professor Renee Miller, University of Toronto, Big Data Curation
More than a decade ago, Peter Buneman used the term curated databases to refer to databases that are created and maintained using the (often substantial) effort and domain expertise of humans. These human experts clean the data, integrate it with new sources, prepare it for analysis, and share the data with other experts in their field. In data curation, one seeks to support human curators in all activities needed for maintaining and enhancing the value of their data over time. Curation includes data provenance, the process of understanding the origins of data, how it was created, cleaned, or integrated. Big Data offers opportunities to solve curation problems in new ways. The availability of massive data is making it possible to infer semantic connections among data, connections that are central to solving difficult integration, cleaning, and analysis problems. Some of the nuanced semantic differences that eluded enterprise-scale curation solutions can now be understood using evidence from Big Data. Big Data Curation leverages the human expertise that has been embedded in Big Data, be it in general knowledge data that has been created through mass collaboration, or in specialized knowledge-bases created by incentivized user communities who value the creation and maintenance of high quality data. In this talk, I describe our experience in Big Data Curation. This includes our experience over the last five years curating NIH Clinical Trials data that we have published as Open Linked Data at linkedCT.org. I overview how we have adapted some of the traditional solutions for data curation to account for (and take advantage of) Big Data.
Professor Trevor Martin, University of Bristol, Uncertainty and Provenance in Collaborative Situation Awareness
The so-called big data revolution has been characterised by an increase in sources of data as well as in the volume of data to be processed.
In many cases - for example, network behaviour and control, security monitoring, enterprise management information - the data for situation awareness and decision-making is drawn from multiple sources and must be integrated into a coherent whole as far as possible.
This process generally requires both machines and human analysts and experts. It includes compensating for different formats, different granularities and resolutions, identifying and correcting errors (both systematic and intermittent), as well as managing uncertainties and gaps in data. Often the process requires assumptions and choices to be made in arriving at a reasonably robust overview of a situation. For example, in deciding that a failed attempt to access a building is potentially malicious, we might need to take account of someone's recent travel, long-term patterns of behaviour, current schedules of close colleagues, etc., where each of these components may have been derived from lower-level raw data. Provenance in this context refers to the derivation pathways and their overall reliability.
In this talk, I will describe the use of graded (fuzzy) representations in modelling and managing the uncertainties, reliability and granularity of derived data in combining sources for situation awareness.
Proceedings
As in previous years, the proceedings of TaPP 2015 will be published electronically by USENIX. The proceedings will be available to registered participants before the workshop and will be available under open access terms (with no access fees) permanently after the workshop.
USENIX site for TaPP 2015 proceedings
Registration
Registration is coordinated with BICOD 2015 and it is possible to register for each event separately or jointly for both events (at a small discount). Registration includes access to the conference sessions, reception, coffee breaks and lunch.
The full registration fee for TaPP 2015 is £150. The student registration fee is £120. There is no early registration discount.
Registration is now closed.
Free registration for students
Up to six PhD students from Scottish universities will be offered free registration for BICOD and TaPP subsidised by SICSA. If you are an eligible student who wishes to register for BICOD and TaPP, please contact James Cheney (jcheney@inf.ed.ac.uk). These free places will be offered on a first-come, first-served basis. Please apply as soon as possible.
Local information
Please see the BICOD 2015 local information page for information about travel to Edinburgh, hotels, the conference venue, and sightseeing options.
If you are in need of a letter to support an application for a UK Visa, please follow the instructions here as soon as possible.
Announcements
10-Jul-2015: TaPP 2015 is over! TaPP 2016 is planned as part of ProvenanceWeek 2016, in the Washington DC area, June 6-9, 2016!
19-Jun-2015: The program and list of accepted posters are available
4-Jun-2015: The list of accepted papers is available
27-May-2015: Poster deadline extended to May 29
4-May-2015: Registration is open!
22-Apr-2015: Submission deadline extended to May 1
24-Mar-2015: Updated with more information about invited speakers
13-Feb-2015: Updated with link to ACM SIGPLAN proceedings format
16-Jan-2015: Updated page with links to USENIX proceedings page and local information.
16-Dec-2014: Preliminary call for papers.
17-Nov-2014: TaPP 2015 web page posted.