A Libertine of Computer Science

Tag: Big Data

Reservoir Sampling

Introduction to Disk-oriented DBMS

Spark Deployment

Query Suspend and Resume

Revisit Join

Pull-based vs Push-based Query Engine

Hadoop Deployment

Understanding of Spark Structured Streaming Execution via Source Code

BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data

Compile Hadoop

Event in Yarn [Yarn Event Mechanism]

Setup a Hadoop Standalone via VirtualBox on Ubuntu [Setting Up a VirtualBox-based Hadoop Pseudo-Distributed Cluster]

ZooKeeper Cluster Setup

Introduction to Big Data Systems

Setup a Hadoop 2.x/3.x Distributed Cluster via VirtualBox on Ubuntu

Log-Structured Merge Trees [Introduction to LSM Trees]