# Designing Data-Intensive Applications

## 1. Reliable, Scalable, and Maintainable Applications

![1. Reliable, Scalable, and Maintainable Applications](https://2432187657-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M61Gh75cV28cjPX0aId%2Fsync%2F64b7110edd9ee2a24ba567e3a8523c0346039b24.png?generation=1589985423509875\&alt=media)

## 2. Data Models and Query Languages

![2. Data Models and Query Languages](https://2432187657-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M61Gh75cV28cjPX0aId%2Fsync%2F843c523a2a2217827948931ce52571b00df877ce.png?generation=1589989470932185\&alt=media)

## 3. Storage and Retrieval

![3. Storage and Retrieval](https://2432187657-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M61Gh75cV28cjPX0aId%2Fsync%2Fe5ed72b52626efe29a209d312be1810ca0a56be7.png?generation=1603951355789680\&alt=media)

## 4. Encoding and Evolution

![4. Encoding and Evolution](https://2432187657-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M61Gh75cV28cjPX0aId%2Fsync%2Fa01af2222525185ba2b882334d85ead3532c6ee6.png?generation=1590105572079604\&alt=media)

A note on the lightweight locks (latches) that B-trees use for concurrency control, a topic from chapter 3: my intuition is that we can take an R/W lock on each node we visit during a traversal.

## 5. Replication

![5. Replication](https://2432187657-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M61Gh75cV28cjPX0aId%2Fsync%2Fb6337243af3fb248fc2536c4eca59d6b280df411.png?generation=1602824314195069\&alt=media)

## 6. Partitioning

![6. Partitioning](https://2432187657-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M61Gh75cV28cjPX0aId%2Fsync%2Fd800602702447ded8a6002eb56d8068347e2c579.png?generation=1602819362769752\&alt=media)

### Consistent hashing

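A hashing strategy that makes rebalancing easier: when a node is added or removed, only the keys next to it in the hash space have to move. It is used for both data partitioning and request load balancing.

Map nodes and keys into the same space using the same hash function. Each key is stored on its successor node, that is, the first node whose hash value is greater than or equal to the key's hash. Join the end of the hash space to its beginning to form a ring, so that every key has a successor node.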
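The description above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `HashRing` class, the `ring_hash` helper, the choice of MD5 truncated to 32 bits, and the node names are all assumptions made for the example, and it omits virtual nodes and replication.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a point in a 32-bit hash space (an arbitrary choice)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    """Hypothetical ring: nodes and keys share one hash space."""

    def __init__(self, nodes=()):
        self._points = []  # sorted hash positions of the nodes
        self._owners = {}  # hash position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        point = ring_hash(node)
        if point in self._owners:
            raise ValueError(f"hash collision for node {node!r}")
        bisect.insort(self._points, point)
        self._owners[point] = node

    def remove_node(self, node: str) -> None:
        point = ring_hash(node)
        if self._owners.get(point) != node:
            raise KeyError(node)
        self._points.remove(point)
        del self._owners[point]

    def node_for(self, key: str) -> str:
        """Return the successor node: first node hash >= key hash."""
        if not self._points:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_left(self._points, ring_hash(key))
        # Past the last node, wrap around to the first one (the "ring").
        return self._owners[self._points[idx % len(self._points)]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    keys = [f"user:{i}" for i in range(10)]
    before = {k: ring.node_for(k) for k in keys}
    ring.add_node("node-d")
    moved = [k for k in keys if ring.node_for(k) != before[k]]
    print("keys that moved to the new node:", moved)
```

Adding a node only takes over keys from its successor, so every key that changes owner lands on the new node and all other keys stay where they were.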

### Another view of partitioning

#### From the Grokking the System Design Interview course, in my own words

* Layers of partitioning
  * Partition by key. Lowest level. This is often what we talk about when discussing partitioning.
  * Partition by feature. Different service storages could be considered to be partitions of one large system.
  * Partition query layer. The level closest to the application. At this level we can add a service that abstracts away the details of the partitioning method and makes life easier for application writers.
* Partitioning criteria
  * Hash (+ mod N round robin)
  * Key range
  * Compound: hash + key range
* Common problems
  * Joins across partitions are usually not supported because they are inefficient. Denormalization (keeping redundant copies of the data) can work around this.
  * Foreign key constraints are usually not supported across partitions, so they have to be enforced in the application.
  * Rebalancing
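
The three partitioning criteria listed above can be sketched as small routing functions. This is only an illustration under assumptions: the function names, the MD5-based `stable_hash`, and the example boundaries are made up for this sketch, and the compound scheme shown (hash one part of the key to pick a partition, keep the other part as a sort key inside it) is one possible reading of "hash + key range".

```python
import bisect
import hashlib
from typing import Sequence, Tuple


def stable_hash(key: str) -> int:
    """Hash that is stable across processes (unlike Python's built-in hash)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def hash_partition(key: str, num_partitions: int) -> int:
    """Hash criterion: hash the key, then take it mod N."""
    return stable_hash(key) % num_partitions


def range_partition(key: str, boundaries: Sequence[str]) -> int:
    """Key-range criterion.

    `boundaries` is a sorted list of split points. Partition i holds keys
    k with boundaries[i-1] <= k < boundaries[i]; the last partition holds
    everything at or above the final boundary.
    """
    return bisect.bisect_right(boundaries, key)


def compound_partition(
    partition_key: str, sort_key: str, num_partitions: int
) -> Tuple[int, str]:
    """Compound criterion (assumed reading): hash picks the partition,
    and rows inside it stay ordered by the sort key for range scans."""
    return hash_partition(partition_key, num_partitions), sort_key


if __name__ == "__main__":
    print(range_partition("apple", ["h", "p"]))  # 0
    print(range_partition("kiwi", ["h", "p"]))   # 1
    print(range_partition("zebra", ["h", "p"]))  # 2
    print(hash_partition("user:42", 4))
    print(compound_partition("user:42", "2020-05-20", 4))
```

Hashing spreads load evenly but loses key order, key ranges keep order but can create hot spots, and the compound form trades between the two. Note that plain `hash mod N` reassigns most keys whenever N changes, which is part of why rebalancing appears in the list of common problems.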

## 7. Transactions

![7. Transactions](https://2432187657-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M61Gh75cV28cjPX0aId%2Fsync%2F2324da303fa85aaca0f026a2bdc979967a275bf1.png?generation=1604029165375064\&alt=media)

## 8. The Trouble with Distributed Systems

![8. The Trouble with Distributed Systems](https://2432187657-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M61Gh75cV28cjPX0aId%2Fsync%2F6803a141824d126a31508b97404948805b694c5d.png?generation=1589953879442552\&alt=media)
