Unleash the Power of Amazon DynamoDB: A Scalable, Performant, and Fully Managed NoSQL Database Service
Scalable and Performant: Elevate Your Data Management with Amazon DynamoDB, the Fully Managed NoSQL Database Service
Introduction:
Amazon DynamoDB is a NoSQL cloud database service that provides consistent performance at any scale. DynamoDB powers multiple high-traffic Amazon properties and systems including Alexa, the Amazon.com sites, and all Amazon fulfillment centers. It combines consistent performance, high availability, and durability with a fully managed serverless experience.
In 2021, during the 66-hour Amazon Prime Day shopping event, Amazon systems including Alexa, the Amazon.com sites, and Amazon fulfillment centers made trillions of API calls to DynamoDB, peaking at 89.2 million requests per second, while experiencing high availability with single-digit millisecond performance. Reliability is essential, as even the slightest disruption can significantly impact customers.
The goal of the design of DynamoDB is to complete all requests with low single-digit millisecond latencies.
The Six Secrets of DynamoDB’s NoSQL Success:
DynamoDB is a fully managed cloud service - DynamoDB frees developers from the burden of patching software, managing hardware, configuring a distributed database cluster, and managing ongoing cluster operations. DynamoDB handles the resource provisioning, automatically recovers from failures, encrypts data, manages software upgrades, performs backups, and accomplishes other tasks required of a fully-managed service.
DynamoDB is highly available - DynamoDB ensures high availability and durability by replicating data across multiple Availability Zones within an AWS Region. When a replica fails, automatic re-replication restores the lost copy and maintains data integrity. Customers can also create global tables, replicated across multiple AWS Regions, for disaster recovery and low-latency access. DynamoDB offers an availability SLA of 99.99% for regular tables and 99.999% for global tables.
DynamoDB employs a multi-tenant architecture - To maximize resource utilization and pass the cost savings on to customers, DynamoDB stores data from multiple customers on the same physical machines. To ensure isolation between the workloads of co-resident tables, DynamoDB uses resource reservations, tight provisioning, and careful usage monitoring.
DynamoDB achieves boundless scale for tables - DynamoDB places no fixed limit on the amount of data a table can store. Tables grow elastically to meet the demands of customers' applications, and DynamoDB is designed to scale the resources dedicated to a table from several servers to many thousands as needed.
DynamoDB provides predictable performance - DynamoDB latencies are predictable. Even as tables grow from a few megabytes to hundreds of terabytes, latencies remain stable due to the distributed nature of data placement and request routing algorithms in DynamoDB. DynamoDB handles any level of traffic through horizontal scaling and automatically partitions and re-partitions data to meet an application’s I/O performance requirements.
DynamoDB supports flexible use cases - DynamoDB offers flexible data and consistency models for developers. Tables have no fixed schema, allowing for variable attributes and multi-valued options. Developers can choose strong or eventual consistency when reading items, catering to different application needs. DynamoDB's versatility empowers developers with customizable data access patterns.
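The choice between strong and eventual consistency is made per read request. The following minimal Python sketch builds the parameters for a GetItem call without contacting AWS; `ConsistentRead`, `TableName`, and `Key` are real GetItem parameters, while the helper name, the table name "Orders", and the key attribute "pk" are hypothetical and exist only for illustration.

```python
from typing import Any, Dict


def build_get_item_request(
    table_name: str,
    key: Dict[str, Dict[str, str]],
    strongly_consistent: bool = False,
) -> Dict[str, Any]:
    """Build GetItem parameters (hypothetical helper, no network call).

    ConsistentRead=False requests an eventually consistent read (the default);
    ConsistentRead=True requests a strongly consistent read.
    """
    return {
        "TableName": table_name,
        "Key": key,
        "ConsistentRead": strongly_consistent,
    }


# "Orders" and "pk" are assumed names, not taken from the article.
order_key = {"pk": {"S": "order#1001"}}
eventual_read = build_get_item_request("Orders", order_key)
strong_read = build_get_item_request("Orders", order_key, strongly_consistent=True)
```

With boto3, a dictionary like this would typically be passed to a DynamoDB client as `client.get_item(**strong_read)`.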
Architecture:
-> A DynamoDB table is a collection of items, and each item is a collection of attributes. Each item is uniquely identified by a primary key. The schema of the primary key is specified when the table is created. The primary key schema contains either a partition key alone or a partition key and a sort key (a composite primary key). The partition key's value is always used as input to an internal hash function. The output of the hash function and the sort key value (if present) together determine where the item is stored. In a table with a composite primary key, multiple items can share the same partition key value, but those items must have different sort key values.
-> DynamoDB also supports secondary indexes to provide enhanced querying capability. A table can have one or more secondary indexes. A secondary index allows querying the data in the table using an alternate key, in addition to queries against the primary key.
-> Any operation that inserts, updates, or deletes an item can be specified with a condition that must be satisfied for the operation to succeed. DynamoDB supports ACID transactions enabling applications to update multiple items while ensuring atomicity, consistency, isolation, and durability (ACID) across items without compromising the scalability, availability, and performance characteristics of DynamoDB tables.
-> A DynamoDB table is divided into multiple partitions to handle the throughput and storage requirements of the table. Each partition of the table hosts a disjoint and contiguous part of the table’s key-range. Each partition has multiple replicas distributed across different Availability Zones for high availability and durability.
-> In DynamoDB, replicas of a partition form a replication group. This group uses Multi-Paxos for leader election and consensus. Any replica can initiate an election. Once a replica is elected leader, it can maintain leadership by periodically renewing its leadership lease. The leader replica is responsible for handling write requests and strongly consistent read requests. When a write request is received, the leader generates a write-ahead log record for the corresponding key and sends it to the other replicas in the group. The application receives acknowledgement of the write once a quorum of replicas has persisted the log record to their local write-ahead logs.
-> DynamoDB supports strongly and eventually consistent reads. Any replica of the replication group can serve eventually consistent reads. The leader of the group extends its leadership using a lease mechanism. If any peer detects that the leader has failed (that is, considers it unhealthy or unavailable), that peer can propose a new round of elections to elect itself the new leader. The new leader won't serve any writes or consistent reads until the previous leader's lease expires. A replication group consists of storage replicas that contain both the write-ahead logs and the B-tree that stores the key-value data. To improve availability and durability, a replication group can also contain replicas that only persist recent write-ahead log entries.
-> DynamoDB consists of tens of microservices. Some of the core services in DynamoDB are the metadata service, the request routing service, the storage nodes, and the autoadmin service. The metadata service stores routing information about the tables, indexes, and replication groups for keys for a given table or index. The request routing service is responsible for authorizing, authenticating, and routing each request to the appropriate server. The storage service is responsible for storing customer data on a fleet of storage nodes. Each of the storage nodes hosts many replicas of different partitions. The autoadmin service is built to be the central nervous system of DynamoDB. It is responsible for fleet health, partition health, scaling of tables, and execution of all control plane requests. The service continuously monitors the health of all the partitions and replaces any replicas deemed unhealthy (slow or not responsive or being hosted on bad hardware). The service also performs health checks of all core components of DynamoDB and replaces any hardware that is failing or has failed.
-> Other DynamoDB services support features such as point-in-time restore, on-demand backups, update streams, global admission control, global tables, global secondary indexes, and transactions.
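The write path described in the list above (hash the partition key, find the partition that owns that part of the key range, then acknowledge the write once a quorum of replicas has persisted the log record) can be sketched in Python. This is a toy model under stated assumptions, not DynamoDB's implementation: the internal hash function is not public, so MD5 is a stand-in; the 32-bit hash space, the even split into ranges, and the class names `Replica`, `ReplicationGroup`, and `Table` are all hypothetical. Sort keys, leader election, and leases are left out.

```python
import bisect
import hashlib
from typing import List, Tuple

# Assumed hash space size; the real value is not stated in the article.
HASH_SPACE = 2 ** 32


def hash_partition_key(partition_key: str) -> int:
    """Map a partition key to a point in the hash space (MD5 is a stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Replica:
    """One replica of a partition, holding a local write-ahead log."""

    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy
        self.wal: List[Tuple[str, str]] = []

    def persist(self, record: Tuple[str, str]) -> bool:
        """Append the record to the local log; an unhealthy replica cannot."""
        if not self.healthy:
            return False
        self.wal.append(record)
        return True


class ReplicationGroup:
    """All replicas of one partition; replicas[0] is assumed to be the leader."""

    def __init__(self, replicas: List[Replica]) -> None:
        self.replicas = replicas

    def write(self, key: str, value: str) -> bool:
        """Send a log record to every replica; succeed only on a quorum."""
        record = (key, value)
        acks = sum(1 for replica in self.replicas if replica.persist(record))
        quorum = len(self.replicas) // 2 + 1
        return acks >= quorum


class Table:
    """Splits the hash space into contiguous, disjoint ranges, one per group."""

    def __init__(self, groups: List[ReplicationGroup]) -> None:
        count = len(groups)
        self.range_starts = [i * HASH_SPACE // count for i in range(count)]
        self.groups = groups

    def route(self, partition_key: str) -> ReplicationGroup:
        """Find the group whose key range contains the hashed partition key."""
        point = hash_partition_key(partition_key)
        index = bisect.bisect_right(self.range_starts, point) - 1
        return self.groups[index]

    def put(self, partition_key: str, value: str) -> bool:
        return self.route(partition_key).write(partition_key, value)
```

For example, a table built from four groups of three replicas keeps accepting writes to a partition when one of its replicas is marked unhealthy, because two of three still form a quorum; with two replicas down, `put` returns False. In this sketch the surviving replica still appends the unacknowledged record to its log, a detail a real system would have to reconcile.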
Conclusion:
DynamoDB has solidified its position as a leading cloud-native NoSQL database, driving the success of thousands of applications across various industries. With its scalability, consistent performance, high availability, and simplified operations, developers have come to rely on DynamoDB as a foundational component of their projects.
Over more than 10 years, DynamoDB has maintained its core principles while introducing groundbreaking features that have revolutionized application development. The on-demand capacity feature enables developers to optimize resource allocation dynamically, while point-in-time backup and restore capabilities ensure efficient data recovery. The ability to replicate data across multiple regions provides geographic redundancy, bolstering reliability. Furthermore, DynamoDB's support for atomic transactions guarantees data integrity and consistency.