Introduction to Distributed Databases
What is a Distributed Database?
A distributed database is a database system that stores data across multiple physical locations, either on the same network or across different geographic locations. Unlike traditional centralized databases, data is distributed across multiple nodes, servers, or data centers.
Why Distributed Databases?
Modern applications require distributed databases to address several critical challenges:
- Scalability: Handle massive amounts of data and traffic by distributing load
- High Availability: Ensure service continuity even when components fail
- Performance: Reduce latency by placing data closer to users
- Fault Tolerance: Survive hardware failures and disasters
- Geographic Distribution: Serve users globally with low latency
Types of Distributed Databases
1. Homogeneous Distributed Databases
All sites use the same database management system (DBMS) and appear as a single logical database to users.
- Same DBMS across all nodes
- Easier to manage and maintain
- Example: Multiple PostgreSQL instances synchronized together
2. Heterogeneous Distributed Databases
Different sites may use different DBMSs with varying schemas and data models.
- Different DBMSs at different locations
- More complex integration
- Example: MySQL at one site, Oracle at another, integrated through middleware
Distributed Database Architecture
| Architecture | Description | Use Case |
|---|---|---|
| Client-Server | Clients connect to one or more database servers | Web applications, mobile apps |
| Peer-to-Peer | Each node is both client and server | Blockchain, decentralized systems |
| Multi-tier | Multiple layers (presentation, application, database) | Enterprise applications |
Key Challenges in Distributed Databases
1. Data Distribution
Deciding how to distribute data across nodes while maintaining performance and consistency.
2. Transaction Management
Ensuring ACID properties across multiple nodes is complex and requires sophisticated protocols.
3. Network Partitioning
Handling scenarios where network failures split the database into isolated segments.
4. Data Consistency
Maintaining consistent data across all replicas when updates occur.
CAP Theorem Preview
The CAP theorem states that distributed databases can only guarantee two of three properties: Consistency, Availability, and Partition Tolerance. We'll explore this in detail in Lesson 4.
Popular Distributed Database Systems
| Database | Type | Key Features |
|---|---|---|
| Cassandra | NoSQL (Column-family) | High availability, linear scalability |
| MongoDB | NoSQL (Document) | Flexible schema, horizontal scaling |
| CockroachDB | SQL (Relational) | ACID compliant, geo-distribution |
| Amazon DynamoDB | NoSQL (Key-value) | Managed service, auto-scaling |
| Google Spanner | SQL (Relational) | Global consistency, SQL support |
Basic Concepts
Node
An individual database server or instance in the distributed system. Each node stores a portion of the data.
Replica
A copy of data stored on multiple nodes to ensure availability and fault tolerance.
Shard
A horizontal partition of data. Each shard contains a subset of the total data.
Cluster
A collection of interconnected database nodes working together as a single system.
Real-World Example
Netflix uses Apache Cassandra with hundreds of nodes distributed across multiple AWS regions. This allows them to handle billions of requests per day, survive data center failures, and provide low-latency access to users worldwide.
Test Your Knowledge - Lesson 1
Answer the following questions to proceed. You need at least 2 of 3 correct to pass.
Question 1: What is a key benefit of distributed databases?
Question 2: What is a shard in distributed databases?
Question 3: Which database is known for being ACID compliant with geo-distribution?
Replication Strategies
What is Database Replication?
Database replication is the process of copying and maintaining database objects (like tables and databases) in multiple database servers. This creates redundant copies of data to improve availability, fault tolerance, and read performance.
Benefits of Replication
- High Availability: If one server fails, others can continue serving requests
- Load Distribution: Read queries can be distributed across replicas
- Disaster Recovery: Data is preserved even if one location is destroyed
- Geographic Distribution: Place data closer to users for lower latency
Replication Architectures
1. Master-Slave Replication (Primary-Secondary)
One primary node handles all writes, and changes are replicated to one or more secondary nodes that handle reads.
Advantages:
- Simple to implement and understand
- Scales read operations effectively
- Reduced load on the master for read-heavy workloads
Disadvantages:
- Single point of failure for writes (master)
- Replication lag may cause stale reads
- Limited write scalability
2. Master-Master Replication (Multi-Master)
Multiple nodes can accept both read and write operations, with changes synchronized between all masters.
Advantages:
- No single point of failure
- Scales both reads and writes
- Better availability
Disadvantages:
- Complex conflict resolution needed
- Risk of write conflicts
- More difficult to maintain consistency
3. Peer-to-Peer Replication
All nodes are equal peers, and each can accept both reads and writes. This model is used in NoSQL databases like Cassandra.
Synchronous vs Asynchronous Replication
Synchronous Replication
Write operations must be confirmed by replicas before completing the transaction.
Use When: Data consistency is critical (financial transactions, inventory management)
Asynchronous Replication
Write operations are confirmed as soon as they are written to the primary; replicas are updated later.
Use When: Performance is critical and you can tolerate eventual consistency (social media feeds, caching)
Replication Lag
Replication lag is the delay between when data is written to the primary and when it appears on replicas.
Handling Replication Lag
Strategies to mitigate replication lag issues:
- Read-your-writes consistency: Route user's reads to primary after they write
- Monotonic reads: Always read from the same replica for a session
- Monitoring: Track lag metrics and alert when exceeding thresholds
- Connection pooling: Use smart routing to direct traffic appropriately
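The first two strategies above can be sketched as a simple read router: a user who has just written reads from the primary for a short window, and each session is otherwise pinned to a single replica. A minimal Python sketch (node names and the lag window are illustrative assumptions):

```python
import time

class ReadRouter:
    """Routes reads to honor read-your-writes and monotonic-reads."""

    def __init__(self, primary, replicas, lag_window=5.0):
        self.primary = primary
        self.replicas = replicas
        self.lag_window = lag_window    # assumed max replication lag (seconds)
        self.last_write = {}            # user_id -> timestamp of last write
        self.session_replica = {}       # session_id -> pinned replica

    def record_write(self, user_id):
        self.last_write[user_id] = time.time()

    def route_read(self, user_id, session_id):
        # Read-your-writes: recent writers read from the primary.
        if time.time() - self.last_write.get(user_id, 0) < self.lag_window:
            return self.primary
        # Monotonic reads: pin each session to one replica.
        if session_id not in self.session_replica:
            idx = hash(session_id) % len(self.replicas)
            self.session_replica[session_id] = self.replicas[idx]
        return self.session_replica[session_id]

router = ReadRouter("primary", ["replica-1", "replica-2"])
router.record_write("alice")
print(router.route_read("alice", "sess-1"))   # recent writer -> primary
print(router.route_read("bob", "sess-2"))     # pinned to one replica
```

Real proxies such as smart connection poolers apply the same logic at the network layer rather than in application code.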
Replication Topologies
| Topology | Structure | Best For |
|---|---|---|
| Star | Central master, multiple slaves | Simple setups, read scaling |
| Chain | Master → Slave1 → Slave2 → Slave3 | Geographic distribution |
| Ring | Circular replication between nodes | Multi-master scenarios |
| Mesh | All nodes replicate to all others | Maximum redundancy |
Real-World Example: Instagram
Instagram uses PostgreSQL with master-slave replication. They use the master for writes and route read queries to multiple read replicas. During peak times, they can have 12+ read replicas to handle billions of read requests per day.
PostgreSQL Replication Example
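PostgreSQL streams write-ahead log (WAL) records from the primary to standbys, which replay them. Below is a simplified Python simulation of that flow, not actual PostgreSQL configuration; it shows why an asynchronous standby briefly lags the primary:

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.wal = []            # write-ahead log: ordered (key, value) records

    def write(self, key, value):
        self.wal.append((key, value))   # log first, then apply
        self.data[key] = value

class Standby:
    def __init__(self):
        self.data = {}
        self.applied = 0         # position reached in the primary's WAL

    def replay(self, wal):
        # Apply any WAL records not yet seen (the streaming apply loop).
        for key, value in wal[self.applied:]:
            self.data[key] = value
        self.applied = len(wal)

primary = Primary()
standby = Standby()

primary.write("user:1", "alice")
# Until the standby replays the WAL, it lags the primary (replication lag).
print(standby.data)              # {}
standby.replay(primary.wal)
print(standby.data)              # {'user:1': 'alice'}
```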
Test Your Knowledge - Lesson 2
Answer the following questions to proceed. You need at least 2 of 3 correct to pass.
Question 1: In Master-Slave replication, where are write operations handled?
Question 2: What is the main advantage of asynchronous replication over synchronous?
Question 3: What is replication lag?
Partitioning and Sharding
What is Data Partitioning?
Data partitioning is the process of dividing a large database into smaller, more manageable pieces called partitions. Each partition contains a subset of the data and can be stored on different servers or storage devices.
Why Partition Data?
- Scalability: Distribute data across multiple servers to handle more data
- Performance: Smaller datasets mean faster queries
- Availability: A failure in one partition doesn't take down the entire database
- Manageability: Easier to backup, restore, and maintain smaller partitions
Partitioning vs Sharding
While often used interchangeably, there's a subtle difference:
- Partitioning: Splitting data within a single database instance (vertical or horizontal)
- Sharding: Horizontal partitioning across multiple database instances/servers
Types of Partitioning
1. Horizontal Partitioning (Sharding)
Dividing rows of a table across multiple partitions. Each partition has the same schema but different rows.
2. Vertical Partitioning
Dividing columns of a table across multiple partitions. Each partition has different columns but the same rows.
Use Case: Separate frequently accessed columns from rarely accessed ones.
Sharding Strategies
1. Range-Based Sharding
Partition data based on ranges of a key (e.g., date ranges, alphabetical ranges).
Advantages: Simple to implement, easy to add new ranges
Disadvantages: Can lead to unbalanced shards (hotspots)
2. Hash-Based Sharding
Use a hash function on a shard key to determine which shard stores the data.
Advantages: Even data distribution, no hotspots
Disadvantages: Difficult to add/remove shards (requires re-hashing)
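The resharding cost is easy to demonstrate: with shard = hash(key) % N, changing N remaps most keys. A quick Python illustration (hashlib is used instead of the built-in hash() so placement is deterministic):

```python
import hashlib

def shard_for(key, num_shards):
    # Stable hash (hashlib, not hash()) so placement is reproducible.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

keys = [f"user:{i}" for i in range(1000)]
before = {k: shard_for(k, 4) for k in keys}
after = {k: shard_for(k, 5) for k in keys}   # add one shard: 4 -> 5

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys moved")  # typically ~80% move
```

Almost every key changes shard, which is why naive modulo sharding makes cluster resizing so painful, and why consistent hashing (below) exists.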
3. Directory-Based Sharding
Use a lookup table (directory) that maps keys to shards.
Advantages: Flexible, can use complex logic
Disadvantages: Directory is a single point of failure, needs maintenance
4. Geographic Sharding
Partition data based on geographic location for lower latency.
Consistent Hashing
An advanced hashing technique that minimizes data movement when adding or removing shards.
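A minimal consistent-hash ring in Python: nodes and keys hash onto the same ring, and each key belongs to the first node clockwise from its position. Adding a node only moves the keys that now fall on the new node; virtual nodes, which real systems use to balance load, are omitted here for brevity:

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes=()):
        self.ring = []                    # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _pos(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        bisect.insort(self.ring, (self._pos(node), node))

    def node_for(self, key):
        pos = self._pos(key)
        # First node clockwise from the key's position (wrap to ring start).
        idx = bisect.bisect(self.ring, (pos, ""))
        return self.ring[idx % len(self.ring)][1]

ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")                   # grow the cluster by one node
after = {k: ring.node_for(k) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved} of {len(keys)} keys moved")   # only keys claimed by node-d move
```

Contrast this with the modulo scheme above, where resizing moves most keys: here every relocated key lands on the newly added node and nothing else shifts.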
Sharding Challenges
- Cross-shard queries: Queries spanning multiple shards are expensive
- Shard rebalancing: Moving data between shards is complex
- Hotspots: Uneven access patterns can overload specific shards
- Transactions: Distributed transactions across shards are complex
- Application complexity: Application must be shard-aware
Choosing a Shard Key
The shard key determines how data is distributed. A good shard key has the following characteristics:
- High cardinality: Many unique values for even distribution
- Avoid hotspots: No single value accessed much more than others
- Query-friendly: Most queries can target a single shard
- Immutable: Shouldn't change after record creation
Real-World Example: MongoDB Sharding
MongoDB automatically shards collections across multiple servers. For example, Foursquare uses MongoDB sharding to manage billions of check-ins. They shard by user_id to ensure each user's data is on one shard, making queries efficient.
MongoDB Sharding Example
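A sketch of the routing property described above: hashing on user_id sends every check-in for a given user to the same shard, so per-user queries touch one server. The shard count and hash choice are illustrative; in MongoDB the mongos router performs this lookup against config servers:

```python
import hashlib

NUM_SHARDS = 4

def shard_for_user(user_id):
    # Same user_id always hashes to the same shard.
    digest = hashlib.sha1(user_id.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Three check-ins by the same user all route to one shard,
# so "fetch all of u42's check-ins" is a single-shard query.
checkins = [
    {"user_id": "u42", "venue": "cafe"},
    {"user_id": "u42", "venue": "park"},
    {"user_id": "u42", "venue": "gym"},
]
shards = {shard_for_user(c["user_id"]) for c in checkins}
print(len(shards))   # 1
```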
Test Your Knowledge - Lesson 3
Answer the following questions to proceed. You need at least 2 of 3 correct to pass.
Question 1: What is horizontal partitioning?
Question 2: What is the main advantage of hash-based sharding?
Question 3: Which is a good characteristic for a shard key?
Consistency Models & CAP Theorem
The CAP Theorem
The CAP theorem, proposed by Eric Brewer, states that a distributed database system can only guarantee two of the following three properties simultaneously:
The Three Properties
- Consistency (C): All nodes see the same data at the same time
- Availability (A): Every request receives a response (success or failure)
- Partition Tolerance (P): System continues operating despite network partitions
In reality, network partitions are inevitable in distributed systems, so you must choose between Consistency and Availability.
CAP Trade-offs
CP Systems (Consistency + Partition Tolerance)
Sacrifice availability to maintain consistency during network partitions.
AP Systems (Availability + Partition Tolerance)
Sacrifice consistency to maintain availability during network partitions.
CA Systems (Consistency + Availability)
In practice, CA systems don't exist in truly distributed environments because network partitions are inevitable.
CA Systems Reality Check
Traditional relational databases (MySQL, PostgreSQL) running on a single node can be considered CA, but they're not truly distributed. Once you distribute them, you must handle partitions, moving into CP or AP territory.
Consistency Models
1. Strong Consistency (Linearizability)
Once a write completes, all subsequent reads will see that write or a later one.
2. Eventual Consistency
If no new updates are made, all replicas will eventually converge to the same value.
3. Causal Consistency
Writes that are causally related are seen by all nodes in the same order.
4. Read-Your-Writes Consistency
A client will always see their own writes in subsequent reads.
5. Monotonic Reads
Once a client reads a value, it will never see an older value in subsequent reads.
ACID vs BASE
ACID (Traditional Databases)
| Property | Description |
|---|---|
| Atomicity | All or nothing: transaction fully completes or fully fails |
| Consistency | Database moves from one valid state to another |
| Isolation | Concurrent transactions don't interfere with each other |
| Durability | Completed transactions survive system failures |
BASE (Distributed NoSQL)
| Property | Description |
|---|---|
| Basically Available | System guarantees availability (CAP theorem) |
| Soft state | State may change over time without input (convergence) |
| Eventual consistency | System will become consistent over time |
Quorum-Based Consistency
Quorum-based systems use voting among replicas to balance consistency and availability. This approach is common in distributed databases like Cassandra and DynamoDB.
Cassandra Consistency Levels
Cassandra allows you to tune consistency per query:
- ONE: One node responds (fastest, least consistent)
- QUORUM: A majority of replica nodes respond
- ALL: All replica nodes respond (slowest, most consistent)
- LOCAL_QUORUM: A quorum of nodes in the local datacenter respond
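With N replicas, a write acknowledged by W nodes and a read from R nodes are guaranteed to overlap whenever R + W > N, which is what makes a quorum read see the latest write. A small Python check of this rule for the levels above, assuming N = 3:

```python
def is_strongly_consistent(n, w, r):
    # Read and write quorums must overlap in at least one replica.
    return r + w > n

N = 3
levels = {"ONE": 1, "QUORUM": 2, "ALL": 3}   # replica counts for N = 3

for write_level, w in levels.items():
    for read_level, r in levels.items():
        strong = is_strongly_consistent(N, w, r)
        print(f"W={write_level}, R={read_level}: "
              f"{'strong' if strong else 'eventual'}")
# QUORUM writes + QUORUM reads (2 + 2 > 3) is the usual balanced choice:
# it stays strongly consistent while tolerating one failed replica.
```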
Real-World Example: Amazon
Amazon's DynamoDB uses eventual consistency by default for high availability. However, critical operations like checkout and payment use strong consistency settings to prevent inventory overselling and payment errors.
Conflict Resolution
In eventually consistent systems, conflicts can occur when the same data is updated on multiple nodes simultaneously.
Conflict Resolution Strategies:
- Last Write Wins (LWW): Use timestamps, newest write wins (simple but can lose data)
- Version Vectors: Track causality, identify true conflicts
- Application-level resolution: Let application decide (e.g., shopping cart merge)
- CRDTs: Conflict-free Replicated Data Types that merge automatically
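Last Write Wins is the simplest of these to implement: every write carries a timestamp, and merging two versions keeps the newer one, silently discarding the loser (which is exactly how LWW loses data). A minimal Python sketch:

```python
class LWWRegister:
    """Last-Write-Wins register: merging keeps the newest timestamp."""

    def __init__(self, value=None, timestamp=0.0):
        self.value = value
        self.timestamp = timestamp

    def write(self, value, timestamp):
        if timestamp >= self.timestamp:
            self.value, self.timestamp = value, timestamp

    def merge(self, other):
        # Called when replicas sync; the older write is discarded.
        if other.timestamp > self.timestamp:
            self.value, self.timestamp = other.value, other.timestamp

# Two replicas accept concurrent writes to the same key during a partition...
a = LWWRegister(); a.write("draft-from-phone", timestamp=100.0)
b = LWWRegister(); b.write("draft-from-laptop", timestamp=101.5)

# ...then sync in both directions: the newest timestamp wins on both sides.
a.merge(b)
b.merge(a)
print(a.value, b.value)   # both converge to 'draft-from-laptop'
```

Note that the phone's draft is gone after the merge; version vectors or CRDTs exist precisely to avoid that silent loss.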
Test Your Knowledge - Lesson 4
Answer the following questions to proceed. You need at least 2 of 3 correct to pass.
Question 1: According to CAP theorem, which two properties can be guaranteed together in the presence of network partitions?
Question 2: What does eventual consistency guarantee?
Question 3: For strong consistency in a quorum system with N=3 replicas, which setting works?
Database Failover Mechanisms
What is Database Failover?
Database failover is the automatic or manual process of switching to a standby database server when the primary server fails. It's a critical component of high-availability database systems.
Goals of Failover
- Minimize Downtime: Restore service as quickly as possible
- Prevent Data Loss: Ensure no committed transactions are lost
- Maintain Consistency: Keep data consistent across nodes
- Transparent to Users: Users shouldn't notice the switch
Types of Failover
1. Automatic Failover
System automatically detects failure and promotes a standby without human intervention.
2. Manual Failover
Database administrator manually triggers failover after assessing the situation.
3. Planned Failover (Switchover)
Controlled failover during maintenance windows with zero data loss.
Failure Detection
Health Check Methods:
- Heartbeat: Regular "alive" signals between nodes
- TCP Connection: Monitor database port connectivity
- Query-based: Execute test queries periodically
- Replication lag: Monitor replication delay
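The heartbeat method reduces to tracking when each node was last heard from and marking a node suspect after a timeout. A minimal Python sketch (the timeout is illustrative; production systems also require several consecutive misses to avoid flapping):

```python
class HeartbeatMonitor:
    def __init__(self, timeout=3.0):
        self.timeout = timeout       # seconds of silence before a node is suspect
        self.last_seen = {}          # node -> time of last heartbeat

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def suspected_down(self, now):
        return [n for n, t in self.last_seen.items() if now - t > self.timeout]

monitor = HeartbeatMonitor(timeout=3.0)
monitor.heartbeat("primary", now=0.0)
monitor.heartbeat("standby", now=0.0)
monitor.heartbeat("standby", now=2.0)    # standby keeps reporting in

print(monitor.suspected_down(now=4.0))   # ['primary'] -- missed its window
```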
Split-Brain Problem
A dangerous scenario where two nodes both think they're the primary, potentially causing data divergence and corruption.
Split-Brain Scenario
Example: a network partition separates Node A and Node B. Each stops receiving the other's heartbeats, assumes its peer has failed, and promotes itself to primary. Both now accept writes, and when the partition heals the two divergent datasets must be reconciled.
Fencing and STONITH
Techniques to prevent split-brain by forcibly isolating or shutting down the old primary.
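Fencing is often implemented with monotonically increasing tokens: each newly promoted primary receives a higher token, and storage rejects writes carrying a stale one, so a deposed primary cannot corrupt data even if it still believes it is the leader. A hedged Python sketch:

```python
class FencedStorage:
    """Rejects writes from any leader holding an outdated fencing token."""

    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token, key, value):
        if token < self.highest_token:
            return False                 # stale leader: write fenced off
        self.highest_token = token
        self.data[key] = value
        return True

storage = FencedStorage()
old_primary_token = 1
new_primary_token = 2                    # issued when the standby is promoted

storage.write(new_primary_token, "balance", 500)      # new primary writes
ok = storage.write(old_primary_token, "balance", 950) # deposed primary tries
print(ok, storage.data)                  # False {'balance': 500}
```

STONITH ("Shoot The Other Node In The Head") is the blunter alternative: instead of rejecting stale writes, it powers the old primary off entirely.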
Consensus Algorithms
Raft Consensus
A consensus algorithm used in many distributed databases for leader election and maintaining consistency.
Paxos Consensus
An older, more complex consensus algorithm that Raft was designed to improve upon.
Failover Patterns by Database
PostgreSQL with Patroni
Patroni runs alongside each PostgreSQL node and uses a distributed configuration store (etcd, Consul, or ZooKeeper) for leader election, automatically promoting a replica when the primary fails.
MySQL Group Replication
Nodes form a replication group coordinated through a Paxos-based group communication protocol; in single-primary mode, the group automatically elects a new primary when the current one fails.
MongoDB Replica Sets
A replica set has one primary and several secondaries; when the primary becomes unreachable, the remaining members hold an election (a Raft-derived protocol) and promote a secondary.
Monitoring Failover Systems
Key Metrics to Monitor:
| Metric | Description | Alert Threshold |
|---|---|---|
| Replication Lag | Time delay between primary and standby | > 10 seconds |
| Health Check Status | Is the primary responding? | Any failure |
| Connection Count | Active database connections | > 80% of max |
| Failover Count | Number of failovers | > 1 per week |
| Recovery Time | Time to complete failover | > 5 minutes |
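The thresholds in the table can be encoded directly so a monitoring job flags any breached metric. A Python sketch using the table's values (metric names are illustrative):

```python
# Alert thresholds from the table above.
THRESHOLDS = {
    "replication_lag_seconds": lambda v: v > 10,
    "health_check_ok": lambda v: v is False,       # any failure alerts
    "connection_ratio": lambda v: v > 0.80,        # > 80% of max connections
    "recovery_time_seconds": lambda v: v > 300,    # failover took > 5 minutes
}

def check_metrics(sample):
    """Return the names of all metrics breaching their alert threshold."""
    return [name for name, breached in THRESHOLDS.items()
            if name in sample and breached(sample[name])]

sample = {
    "replication_lag_seconds": 14.2,   # lagging: should alert
    "health_check_ok": True,
    "connection_ratio": 0.45,
    "recovery_time_seconds": 90,
}
print(check_metrics(sample))           # ['replication_lag_seconds']
```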
Real-World Example: GitHub
In 2018, GitHub experienced a 24-hour outage due to a split-brain scenario in their MySQL cluster. Network partition caused both data centers to think they were primary. They had to carefully reconcile divergent data. Since then, they've improved their consensus mechanisms and monitoring.
Testing Failover
Regular failover testing is crucial to ensure your system works when needed.
Test Your Knowledge - Lesson 5
Answer the following questions to proceed. You need at least 2 of 3 correct to pass.
Question 1: What is the split-brain problem?
Question 2: What is STONITH in database failover?
Question 3: What is the main advantage of automatic failover over manual failover?
Real-World Implementation & Best Practices
Designing a Distributed Database System
Implementing a production-ready distributed database system requires careful planning and consideration of multiple factors.
Key Design Decisions
- Database Selection: SQL vs NoSQL, specific technology
- Replication Strategy: Master-slave, multi-master, or peer-to-peer
- Partitioning Strategy: How to shard data across nodes
- Consistency Model: Strong vs eventual consistency
- Failover Approach: Automatic, manual, or hybrid
- Geographic Distribution: Single region vs multi-region
Multi-Region Architecture
Distributing databases across geographic regions for global availability and low latency.
Multi-Region Challenges:
- Cross-region latency: Speed of light limits
- Data sovereignty: Legal requirements for data location
- Consistency: CAP theorem trade-offs amplified
- Cost: Data transfer between regions is expensive
Best Practices for High Availability
1. Redundancy at Every Layer
Run redundant load balancers, application servers, database replicas, and network paths so that no single component failure can take the service down.
2. Automate Everything
- Automated failover
- Automated backups
- Automated monitoring and alerting
- Automated scaling
- Infrastructure as Code (Terraform, CloudFormation)
3. Monitor Proactively
Track replication lag, error rates, disk usage, and query latency, and alert on trends before they turn into outages.
4. Test Failure Scenarios
Regularly rehearse node failures, network partitions, and failovers (chaos engineering, game days) so that recovery procedures are proven before a real incident.
Backup Strategies
3-2-1 Backup Rule
- 3 copies of data (original + 2 backups)
- 2 different storage types (disk + cloud)
- 1 offsite backup (different geographic location)
Security Best Practices
Network Security
- Firewall rules: Only allow necessary ports
- VPC/Private networks: Isolate database traffic
- SSL/TLS encryption: Encrypt data in transit
- VPN/Bastion hosts: Secure administrative access
Access Control
- Least privilege: Grant each user and service only the permissions it needs
- Role-based access: Manage permissions through roles rather than per-user grants
- Strong authentication: Require strong passwords or certificate-based authentication
- Credential rotation: Rotate passwords and keys on a regular schedule
Encryption
- At rest: Encrypt database files (LUKS, dm-crypt, AWS EBS encryption)
- In transit: SSL/TLS for all connections
- Backups: Encrypt backup files before storing
- Sensitive data: Application-level encryption for PII
Performance Optimization
Connection Pooling
Reuse a bounded pool of open database connections instead of opening one per request; this avoids connection-setup overhead and protects the database from connection storms.
Query Optimization
Index for your common query patterns, avoid cross-shard scans where possible, and review slow-query logs regularly.
Caching
Put hot data in an in-memory cache such as Redis in front of the database to absorb read traffic and reduce load on replicas.
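The connection-pooling idea can be shown with a toy pool in Python: a bounded set of pre-opened connections is handed out and returned, so request spikes queue instead of opening unbounded connections. The Connection class here is a stand-in for a real driver connection:

```python
import queue

class Connection:
    """Stand-in for a real database driver connection."""
    def __init__(self, conn_id):
        self.conn_id = conn_id

class ConnectionPool:
    def __init__(self, size):
        self._pool = queue.Queue(maxsize=size)
        for i in range(size):                 # open all connections up front
            self._pool.put(Connection(i))

    def acquire(self, timeout=1.0):
        # Blocks (up to timeout) when all connections are checked out,
        # instead of opening a new one and overloading the database.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)

pool = ConnectionPool(size=2)
c1 = pool.acquire()
c2 = pool.acquire()
pool.release(c1)
c3 = pool.acquire()                # reuses the released connection
print(c3.conn_id == c1.conn_id)   # True
```

Production poolers (PgBouncer, HikariCP, and driver-level pools) add health checks, idle timeouts, and per-shard routing on top of this basic mechanism.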
Real-World Case Studies
Case Study 1: Uber
Challenge: Global ride-hailing service needs low latency worldwide.
Solution:
- Schemaless (MySQL-based) for flexible data model
- Geographic sharding by city
- Multi-region deployment
- Eventually consistent for non-critical data
- Strong consistency for billing/payments
Result: Handles millions of trips per day with <50ms latency for critical operations.
Case Study 2: Discord
Challenge: Real-time chat with billions of messages, rapid growth.
Solution:
- Started with MongoDB, migrated to Cassandra
- Partitioned by channel_id
- Eventually consistent (acceptable for chat)
- Custom message ID scheme for ordering
- Aggressive caching with Redis
Result: Scales to 140 million monthly active users, storing trillions of messages.
Case Study 3: Airbnb
Challenge: Complex booking system requiring strong consistency.
Solution:
- MySQL with master-slave replication
- Sharded by listing_id
- Synchronous replication for bookings
- Asynchronous for search/browse
- Read replicas for analytics
Result: Prevents double-bookings while maintaining high availability.
Choosing the Right Database
| Use Case | Recommended Database | Why |
|---|---|---|
| Financial transactions | PostgreSQL, CockroachDB | ACID compliance, strong consistency |
| Social media feed | Cassandra, DynamoDB | High write throughput, eventual consistency OK |
| E-commerce catalog | MongoDB, Elasticsearch | Flexible schema, full-text search |
| Real-time analytics | ClickHouse, TimescaleDB | Columnar storage, time-series optimization |
| Session storage | Redis, Memcached | In-memory, fast reads/writes |
| Graph relationships | Neo4j, Amazon Neptune | Optimized for graph queries |
Migration Strategies
Blue-Green Deployment
Run the old (blue) and new (green) database environments side by side, replicate data from blue to green, validate the green environment, then switch traffic over; if problems appear, switch back to blue.
Common Pitfalls to Avoid
- Over-sharding: Too many small shards increases complexity
- Under-monitoring: You can't fix what you can't see
- Ignoring backups: Test restores regularly!
- No capacity planning: Know your growth trajectory
- Premature optimization: Start simple, scale when needed
- Ignoring security: Security should be built-in from day one
Summary: Key Takeaways
- Distributed databases trade simplicity for scalability and availability
- Choose consistency model based on business requirements (CAP theorem)
- Implement proper replication for fault tolerance
- Use sharding to scale beyond single-server limits
- Automated failover minimizes downtime but requires careful implementation
- Monitor everything and test failure scenarios regularly
- There's no one-size-fits-all solution - understand your requirements
Final Test - Lesson 6
Answer the following questions to complete the course. You need at least 2 of 3 correct to pass.