
Designing Data-Intensive Applications


Last updated 4 years ago


1. Reliable, Scalable, and Maintainable Applications

2. Data Models and Query Languages

3. Storage and Retrieval

4. Encoding and Evolution

A note on B-tree lightweight locks (latches) for concurrency control: my intuition is that we can take a read/write lock on each node we visit.

5. Replication

6. Partitioning

Consistent hashing

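A hashing strategy that makes rebalancing easier: when a node joins or leaves, only the keys next to it in the hash space have to move. It is used in both data partitioning and request load balancing.

Map nodes and keys into the same space using the same hash function. A key is stored on its successor node, that is, the first node whose hash value is at or after the key's hash value. Join the end of the hash space back to its beginning, forming a ring, so that every key has a successor node.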
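The successor lookup described above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the names `ring_hash` and `HashRing`, the use of MD5, and the 32-bit ring size are not from the book or the course, and real systems usually place several virtual points per node on the ring, which this sketch leaves out.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a point on the ring (a 32-bit space, chosen arbitrarily)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class HashRing:
    """Hypothetical minimal ring: one point per node, no virtual nodes."""

    def __init__(self, nodes=()):
        self._points = []  # sorted ring positions of all nodes
        self._owners = {}  # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        point = ring_hash(node)
        if point in self._owners:
            raise ValueError(f"position of {node!r} collides with an existing node")
        bisect.insort(self._points, point)
        self._owners[point] = node

    def remove_node(self, node: str) -> None:
        point = ring_hash(node)
        self._points.remove(point)
        del self._owners[point]

    def node_for(self, key: str) -> str:
        """Return the successor node: the first node at or after the key's hash."""
        if not self._points:
            raise LookupError("ring has no nodes")
        idx = bisect.bisect_left(self._points, ring_hash(key))
        # Past the last node, wrap around to the first one (the ring is closed).
        return self._owners[self._points[idx % len(self._points)]]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")  # one of the three nodes
ring.add_node("node-d")           # only keys now preceding node-d change owner
```

Keeping the node positions in a sorted list makes each lookup a binary search. Adding a node only takes over the keys between it and its predecessor on the ring, which is why rebalancing is cheap compared with rehashing everything.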

Another view of partitioning

From the Grokking the System Design Interview course, in my own words:

  • Layers of partitioning:

    • Partition by key. Lowest level. This is often what we talk about when discussing partitioning.

    • Partition by feature. Different service storages could be considered to be partitions of one large system.

    • Partition query layer. The level closest to the application. At this level we can add a service that abstracts away the details of the partitioning method and makes life easier for application writers.

  • Partitioning Criteria

    • Hash (hash mod N), or round robin (i mod N)

    • Key range

    • Compound: hash + key range

  • Common problems

    • Joins across partitions are usually not supported because they are inefficient: this may be solved by denormalization (keeping redundant information).

    • Foreign key constraints across partitions are not supported: enforce them in the application.

    • Rebalancing
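To make the hash and key-range criteria and the rebalancing problem concrete, here is a small sketch. The helper names (`stable_hash`, `hash_partition`, `range_partition`, `moved_fraction`) and the example boundaries are hypothetical choices of mine, not taken from the course.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Hash that is stable across runs (Python's built-in hash() of str is salted)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def hash_partition(key: str, n: int) -> int:
    """Hash criterion: partition index is hash(key) mod N."""
    return stable_hash(key) % n


def range_partition(key: str, boundaries: list) -> int:
    """Key-range criterion: boundaries are sorted split points.

    Partition i holds the keys k with boundaries[i-1] <= k < boundaries[i].
    """
    return bisect.bisect_right(boundaries, key)


def moved_fraction(keys: list, old_n: int, new_n: int) -> float:
    """Fraction of keys whose hash-mod-N partition changes when N changes."""
    moved = sum(1 for k in keys if hash_partition(k, old_n) != hash_partition(k, new_n))
    return moved / len(keys)


boundaries = ["h", "p"]                            # three ranges: < "h", "h".."p", >= "p"
fruit_partition = range_partition("kiwi", boundaries)  # falls in the middle range

keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)              # most keys move with plain mod N
```

With a uniform hash, going from 4 to 5 partitions under plain mod N relocates about 80% of the keys in expectation, since a key stays put only when its hash has the same remainder mod 4 and mod 5. That cost is the rebalancing problem listed above, and it is what consistent hashing is meant to reduce.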

7. Transactions

8. The Trouble with Distributed Systems
