Algebraic Run-Time Optimization for Multiset Programming
Fritz Henglein, Department of Computer Science (DIKU) University of Copenhagen
We describe Dynamic Symbolic Computation, a simple method for gradually enhancing a base implementation for a domain-specific language with computational performance improvements based on algebraic optimizations executed at run time. It consists of adding constructors for computationally expensive operations if this facilitates asymptotic performance improvements in some contexts while incurring only constant-factor overhead. The resulting implementation can be thought of as executing standard computation steps from the base implementation interleaved with symbolic computation steps on the newly introduced constructors. We illustrate Dynamic Symbolic Computation by evolving list representations of multisets in a generic SQL-style query language into sophisticated data structures with symbolic unions, outer products and scalar multiplication. The resultant data structures, in turn, lend themselves to efficient extensions of multiset operations to fuzzy sets, fuzzy multisets and discrete probability distributions.
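To make the idea concrete, here is a minimal illustrative sketch (not code from the talk, and in Python rather than the functional-language setting the work targets): multisets are either concrete lists or symbolic nodes for union, outer product and scalar multiplication. The symbolic constructors (`Union`, `Product`, `Scale` and the `card`/`enumerate_mset` functions are hypothetical names chosen for this example) cost O(1) to build, and an operation such as cardinality can then exploit algebraic identities instead of materializing the data.

```python
# Sketch of Dynamic Symbolic Computation for multisets (illustrative only).
# A multiset is either a concrete list (Lit) or a symbolic node built by a
# constructor for an expensive operation: Union, Product (outer product),
# or Scale (scalar multiplication). Each constructor is O(1).

class MSet:
    pass

class Lit(MSet):          # concrete base representation: a list of elements
    def __init__(self, elems): self.elems = list(elems)

class Union(MSet):        # symbolic multiset union, built in O(1)
    def __init__(self, a, b): self.a, self.b = a, b

class Product(MSet):      # symbolic outer product, built in O(1)
    def __init__(self, a, b): self.a, self.b = a, b

class Scale(MSet):        # k copies of every element of a, built in O(1)
    def __init__(self, k, a): self.k, self.a = k, a

def card(s):
    """Cardinality, computed symbolically via algebraic identities."""
    if isinstance(s, Lit):     return len(s.elems)
    if isinstance(s, Union):   return card(s.a) + card(s.b)   # |a ∪ b| = |a| + |b|
    if isinstance(s, Product): return card(s.a) * card(s.b)   # |a × b| = |a| · |b|
    if isinstance(s, Scale):   return s.k * card(s.a)         # |k·a| = k · |a|

def enumerate_mset(s):
    """Standard computation step: flatten a symbolic multiset to a list."""
    if isinstance(s, Lit):     return list(s.elems)
    if isinstance(s, Union):   return enumerate_mset(s.a) + enumerate_mset(s.b)
    if isinstance(s, Product): return [(x, y) for x in enumerate_mset(s.a)
                                              for y in enumerate_mset(s.b)]
    if isinstance(s, Scale):   return s.k * enumerate_mset(s.a)

# The outer product of two 1000-element multisets has 10^6 pairs, yet its
# cardinality falls out in O(1) from the operands' cardinalities:
big = Product(Lit(range(1000)), Lit(range(1000)))
assert card(big) == 1_000_000   # no million-pair list is ever built
```

Clients that do need the elements fall back to `enumerate_mset`, so ordinary computation and symbolic computation interleave: only queries that can be answered algebraically skip the expensive enumeration.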
Challenges in Reducing the Pain of Feature Engineering for Trained Systems
Christopher Ré, University of Wisconsin
A new generation of systems, including web search, Google's Knowledge Graph, IBM's Watson/DeepQA, and several recommendation systems, combine rich data sources with software driven by machine learning; we call such systems trained systems. The spectacular successes of these trained systems are among the most notable in all of computing, and are generating excitement in high-value industries like health care, finance, energy, and elsewhere. A pain point when engineering trained systems is extracting features or signals, which are then fed to a machine learning component. Currently, extracting features is a tedious process with little infrastructure support. If trained systems are to make an impact outside of a few high-end applications, a key challenge is to make feature extraction easier, faster, and more effective. We argue that part of the solution is not a machine learning problem per se; instead it is a problem that may require input from several areas of computer science, including software engineering, databases, and programming languages.
This talk will be long on challenges and short on solutions. I will describe some of the challenges that our research group has faced trying to allow people without computer science PhDs to engineer features for trained systems. To make these challenges concrete, I will describe trained systems that the Hazy group has built over the last few years.
Video overviews of initial demonstrations of our systems are available on our group's YouTube channel, www.youtube.com/HazyResearch, or from our website, hazy.cs.wisc.edu (which also lists the students who did the real work).
|9:00-10:00||Algebraic Run-Time Optimization for Multiset Programming|
|10:00-10:30||Enabling Operator Reordering in Data Flow Programs Through Static Code Analysis|
Fabian Hueske, Aljoscha Krettek and Kostas Tzoumas.
Programming for Software Defined Networks
Naga Praveen Katta, Jennifer Rexford and David Walker.
|11:30-12:00||Typing Massive JSON Datasets|
Dario Colazzo, Giorgio Ghelli and Carlo Sartiani.
|12:00-12:30||Haskell DSLs for Interactive Web Services|
Andrew Farmer and Andy Gill.
|2:00-3:00||Challenges in Reducing the Pain of Feature Engineering for Trained Systems|
|3:00-3:30||K3: Language Design for Building Multi-Platform,
Panchapakesan Shyamshankar, Zachary Palmer and Yanif Ahmad.
|3:30-4:00||Coffee / cake break|
Thanks to Torsten Grust for taking some pictures during the workshop, and to Malcolm Wallace for making video recordings of the talks (available on YouTube and via the links above).
|Final papers due:|
|ICFP 2012:||September 10-12|