BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data
BlinkDB[1] uses two key ideas: (1) an adaptive optimization framework that builds and maintains a set of multi-dimensional stratified samples from the original data over time, and (2) a dynamic sample selection strategy that selects an appropriately sized sample based on a query’s accuracy or response time requirements.
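
The two ideas can be illustrated with a small Python sketch. It is not BlinkDB's implementation: the function names (`stratified_sample`, `estimate_mean`, `pick_sample`), the per-stratum row cap, and the normal-approximation 95% error bound are all assumptions chosen for illustration, and only the accuracy requirement (not the response time requirement) is modeled.

```python
import math
import random
from collections import defaultdict


def stratified_sample(rows, columns, cap, seed=0):
    """Keep at most `cap` rows for each distinct combination of `columns`.

    Returns {stratum_key: (kept_rows, stratum_population_size)}.
    Rare strata are kept whole, so they are never lost to sampling.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[tuple(row[c] for c in columns)].append(row)
    return {
        key: (members if len(members) <= cap else rng.sample(members, cap),
              len(members))
        for key, members in strata.items()
    }


def estimate_mean(sample, stratum, column):
    """Estimate the mean of `column` in one stratum with a 95% half-width.

    Assumption: normal approximation with a finite-population correction.
    """
    kept, population = sample[stratum]
    values = [row[column] for row in kept]
    n = len(values)
    mean = sum(values) / n
    if n == population:
        return mean, 0.0          # the whole stratum was kept: exact answer
    if n < 2:
        return mean, math.inf     # too few rows to estimate the error
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    half_width = 1.96 * math.sqrt(variance / n * (1 - n / population))
    return mean, half_width


def pick_sample(samples_by_cap, stratum, column, max_error):
    """Try samples from smallest to largest; stop at the first one whose
    estimated error is within `max_error`. Falls back to the largest sample.
    """
    result = None
    for cap in sorted(samples_by_cap):
        mean, error = estimate_mean(samples_by_cap[cap], stratum, column)
        result = (cap, mean, error)
        if error <= max_error:
            break
    return result
```

Given samples built at several caps, for example `{cap: stratified_sample(rows, ("city",), cap) for cap in (10, 100, 1000)}`, a call such as `pick_sample(samples, ("NY",), "latency", max_error=0.5)` returns the smallest sample that meets the error bound, which is the trade-off between sample size and accuracy that the second idea describes.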