AWS Supported Database Types

  • Hosted Services
    • Relational (OLTP)
      • Multiple database engine providers
        • e.g. Amazon Aurora, SQL Server, Oracle
    • Non-Relational (NoSQL)
      • Amazon DynamoDB
    • Data Warehouse (OLAP)
      • Amazon Redshift
  • Customer Installed
    • “Bring Your Own License” (BYOL)
      • Self-install/ configuration on EC2s
Hosted Services
  • AWS Relational Database Services (RDS)
    • Amazon Aurora MySQL
    • Amazon Aurora PostgreSQL
    • Oracle
    • MS SQL Server
    • MySQL
    • PostgreSQL
    • MariaDB
Custom Instances — EC2 Based
  • Start the instance with the required Operating System (OS)
    • Select the Amazon Machine Image (AMI)
  • Install the database engine/ service
    • From ISO image
  • Create databases
  • Manage and maintain the database service
NoSQL Databases
  • Not based on standard SQL or relational design theory
    • Non-tabular model
    • E.g. a key-value, column-family, document, and graph databases
  • The design supports very fast transactions
  • DynamoDB is the AWS NoSQL Solution
Data Warehouse
  • A large central repository of consolidated data
  • Data can be aggregated from one or more source systems
  • Used for Online Analytical Processing (OLAP)
  • Redshift is the AWS OLAP/ warehousing solution
Relational Database Terminology
  • Rows = Tuples
  • Column = Attributes, Properties
  • Tables = Entities, Objects, Relations
  • Relationships
    • Primary Key
    • Foreign Key
    • Joins
Normalization
  • Process for evaluating and correcting data structures
    • Determines the best assignment of attributes to entities
  • Works thought a series of stages called normal forms
    • 1NF –> 2NF –> 3NF –> 4NF (optional)
  • Higher the normal form, (closer to 4NF)
    • Slower the reads
    • Faster the writes
EC2 Based Hosting
SetupBenefits & Considerations
1. Launch an instance
2. Install the database service
3. Config ports in security groups
4. Connect to the database
Complete control
– Manual performance management
– Manual updates

AWS Service-Based
SetupBenefits & Considerations
1. Launch the database
2. Connect to the database
Less control
– Automatic performance management
– Automatic updates

High-Availability (HA) Solutions

  • Increase availability
  • Increase recoverability
ClusteringStandby Instances
– Multiple servers/ instances
– One database with replication
– Increases availability
– Automatic failover
– Increased cost
– Multiple servers/ instances
– One database with replication
– Increases recoverability
– No automatic failover
– Reduced cost
Single AZ DeploymentMulti AZ Deployment
– One instance in one AZ in one region
– Reduced cost
– Multiple instances in multiple AZs in one region
– Increased availability (replication)
– Increased performance
– Increased cost

Scalability Solutions

  • Increase or decrease capacity
    • Storage
    • Processing
    • Network operations e.g. throughput
Scaling the InstanceRead Replica
– Change the instance type/ class
– Auto-scaling is not supported in RDS. However, it can be scripted with CLI commands
– A read-only copy of the database
– Offload read-only traffic from the main database
– Multiple instances can be in different regions to increase the availability

Database Security

EncryptionPermissions
– RDS databases support “at rest” encryption (storage encryption/ physical)
– Must be encrypted at creation time
– Can be enabled on recovery (manually)
– Admin access is based on IAM
– Data access is based on database capabilities e.g. CRUD, DB Admin roles etc.

Amazon Aurora

  • Relational database service
  • Optimized for Online Transactional Processing (OLTP)
    • Very fast data writes
  • MySQL compatible database system
  • Increased performance over MySQL (5x)
  • Has 6 replicas by default with a max of 15 replicas
Scaling Aurora
  • Initially 10 GB and scales in 10 GB increments
    • Max 64 TB
  • Compute resources
    • Max 32 CPUs
    • Max 244 BiB RAM
Aurora Availability
  • Availability defaults
    • Two (2) database copies in each AZ
    • Min three (3) AZs
  • Write capabilities
    • Continues with up to two (2) copies lost
  • Read capabilities
    • Continues with up to three (3) copies lost
Aurora Replicas
  • Up to 15 Aurora replicas
    • Automatic failover
  • Up to 5 MySQL replicas
    • No automatic failover

Amazon Redshift

  • Amazon’s data warehouse solution
  • Optimized for Online Analytical Processing (OLAP)
  • AWS Managed Service
  • Pricing
    • Entry point of $0.25/h or $1,000 per TB/year
  • Single node implementation
    • 160 GB
  • Multiple nodes
    • Leader node
      • Connections and Queries
    • Compute Node
      • Stores data and executes queries and calculations
Redshift Speed
  • Column based datastore
    • Sequential reads
    • Very fast reads
  • Data compression
  • Massively Parallel Processing (MPP)
Redshift Security
  • SSL in transit encryption
  • AES-256 storage encryption
  • Keys managed through AWS Key Management
Redshift Availability
  • Operates in one AZ
  • Snapshots can be restored in other AZs

Amazon DynamoDB

  • NoSQL database service
  • Eventual consistency model
  • Provides special features
    • Milliseconds latency at any scale
      • Very fast reads and writes
    • Stored on SSD
    • spread across 3 distinct data centers
  • Read consistency types
    • Eventual consistency reads
      • Seconds delay
    • Strongly consistent reads
      • Milliseconds delay
  • You create tables in DynamoDB now databases
  • Partition Key = Primary Key
  • Items that have different values
  • All about small fast transactions
DynamoDB Pricing
  • Storage
    • $0.25/GB per month
  • Throughput
    • Writes: billed per hour for every 10 units
    • Reads: billed per hour for every 50 units
    • Where 1 unit = 1 write per second
  • Use AWS Pricing Calculator for more accurate numbers
Other Notes

Multi-AZ — getting failover nodes
Read Replicas — scaling out for performance
RDS Backups — when retention is 0 backups are disabled

The maximum retention for an RDS database is 35 days! By default, AWS backups the database regularly using snapshots. Snapshots can also be created manually. Restoring a database is creating a new instance.

Database monitoring includes real-time monitoring and log review. You can see performance statistics related to databases. It should be used to improve the performance and availability of any database.