A Libertine of Computer Science
Tag: Big Data
Reservoir Sampling
Introduction to Disk-oriented DBMS
Spark Deployment
Query Suspend and Resume
Revisit Join
Pull-based vs Push-based Query Engine
Hadoop Deployment
Understanding of Spark Structured Streaming Execution via Source Code
BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data
Compile Hadoop
Event in Yarn [Yarn Event Mechanism]
Setup a Hadoop Standalone via VirtualBox on Ubuntu [Setting Up a VirtualBox-based Pseudo-Distributed Hadoop Cluster]
ZooKeeper Cluster Setup
Introduction to Big Data Systems
Setup a Hadoop 2.x/3.x Distributed Cluster via VirtualBox on Ubuntu
Log-Structured Merge Trees [Introduction to LSM Trees]