Towards Incremental Computation of Advanced Analytics
Halls department, Hall 5
Wednesday, 27 December 2017
11:00 - 12:00
Sophisticated analytics requires an advanced language that goes beyond relational calculus. For example, statistical models, machine learning programs, and graph algorithms are usually expressed as linear algebra programs. Existing systems and frameworks optimize such programs for large volumes of data. More recently, the velocity of incoming data has increased dramatically, raising the demand for online analytics. In this setting, re-evaluating the analytics program on every dataset change is prohibitively expensive. To circumvent this, developers build ad-hoc online solutions tailored to dynamic datasets, which requires domain expertise and time-consuming labour. This work targets Incremental View Maintenance (IVM) of workloads expressed as linear algebra programs. Previous work on IVM for relational calculus does not apply to matrix algebra workloads. We first describe the challenges of IVM in this setting, then describe an approach that represents delta changes in a factored, compressed form. Finally, we present LAGO, a unified framework for matrix algebra that automatically generates incremental trigger programs, thereby freeing the user from error-prone manual derivations, performance tuning, and low-level implementation details.
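To make the idea of a factored delta concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not LAGO's implementation or API: every function name is hypothetical, the maintained view is assumed to be B = A·A, and the input change is assumed to be a rank-1 update A → A + u·vᵀ. Under those assumptions the change to B expands to u·(vᵀA) + (A·u + (v·u)·u)·vᵀ, a sum of two outer products that can be computed and applied in O(n²) time instead of recomputing the O(n³) product.

```python
# Illustrative sketch only: hypothetical helper names, not LAGO's API.

def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def matvec(A, x):
    """Return A x for a matrix stored as a list of rows."""
    return [dot(row, x) for row in A]

def vecmat(x, A):
    """Return x^T A as a list."""
    return [dot(x, col) for col in zip(*A)]

def matmul(A, B):
    """Full matrix product, used here only as the non-incremental baseline."""
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]

def add_outer(M, p, q):
    """In-place update M += p q^T."""
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            M[i][j] += p_i * q_j

def delta_square(A, u, v):
    """Factored delta of the view B = A A under the change A -> A + u v^T.

    Returns a list of (p, q) pairs such that delta B = sum of p q^T.
    A must still hold its value from before the update.
    """
    Au = matvec(A, u)
    vA = vecmat(v, A)
    s = dot(v, u)
    second = [au_i + s * u_i for au_i, u_i in zip(Au, u)]
    return [(u, vA), (second, v)]

def apply_update(A, B, u, v):
    """Trigger-style maintenance: refresh B first, then apply the change to A."""
    for p, q in delta_square(A, u, v):
        add_outer(B, p, q)
    add_outer(A, u, v)

if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = matmul(A, A)                      # initial view, computed once
    apply_update(A, B, [1, 0], [0, 1])    # adds 1 to entry A[0][1]
    assert B == matmul(A, A)              # incremental result matches recomputation
```

The delta is kept as a short list of vector pairs rather than as a dense matrix, which is what keeps both its computation and its propagation cheap in this sketch.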
Amir Shaikhha is a fourth-year Ph.D. student at EPFL. His research aims to build efficient data analytics systems using high-level languages. More specifically, he is interested in using compilation techniques to generate efficient low-level code (e.g. C code) from high-level specifications (e.g. Scala code) of performance-critical systems (e.g. database systems). He received his M.Sc. from EPFL and his B.S. from Sharif University of Technology.