AWS S3, Storage Gateways and Import/Export
Simple Storage Service (S3)
S3 provides secure, durable and highly scalable object storage. The key features of S3 are:
- Objects are replicated across availability zones for redundancy.
- Files (objects) can be between 0 bytes and 5 TB in size
- Provides unlimited storage
- Files are stored in buckets, which are similar to directories
- S3 uses a universal namespace, i.e. bucket names must be globally unique
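- New objects can be read immediately. Historically this was not true of updates and deletes, which took time to propagate. AWS described this as:
  - Read-after-write consistency for PUTs of new objects
  - Eventual consistency for overwrite PUTs and DELETEs
- Since December 2020, S3 provides strong read-after-write consistency for all PUTs and DELETEs.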
There are 4 storage tiers, each offering different levels of availability and durability for your data.
- S3 (Standard) - Data is stored across multiple devices in multiple facilities and is designed to sustain the concurrent loss of 2 facilities. It is designed for 99.99% availability and 99.999999999% durability.
- S3 IA (Infrequent Access) - Designed for data that is accessed less frequently but requires rapid access when needed. Storage costs less than S3, but you are charged a retrieval fee. It is designed for 99.9% availability and 99.999999999% durability.
- Reduced Redundancy Storage - Ideal for non-critical data that can be reproduced. It is designed for 99.99% availability but only 99.99% durability.
- Glacier - Designed for long-term storage of data that is rarely accessed. Retrievals typically take 3-5 hours.
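What does durable mean? Durability is the percentage of objects that, on average, are expected to survive (not be lost) over a given year. The higher the figure, the fewer objects you should expect to lose.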
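To put the durability figures into perspective, the expected number of objects lost per year can be estimated as the object count multiplied by (1 - durability). The sketch below is only a back-of-the-envelope calculation that assumes losses are independent; the function name and the object count of ten million are illustrative.

```python
from fractions import Fraction

def expected_annual_loss(object_count: int, durability_percent: str) -> Fraction:
    """Expected number of objects lost per year for a given durability figure."""
    # Fractions avoid floating-point rounding on values this close to 100%.
    loss_rate = 1 - Fraction(durability_percent) / 100
    return object_count * loss_rate

# Standard S3 (99.999999999%) holding ten million objects
standard = expected_annual_loss(10_000_000, "99.999999999")
# Reduced Redundancy Storage (99.99%) holding the same number of objects
rrs = expected_annual_loss(10_000_000, "99.99")

print(float(standard))  # 0.0001, i.e. roughly one object every 10,000 years
print(float(rrs))       # 1000.0 objects per year
```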
In terms of charges, users are billed for the storage consumed, the requests performed and the bandwidth used.
Versioning lets you store multiple versions of each of your objects, which in turn offers a great backup mechanism. It is worth mentioning, though, that once enabled this feature can only be suspended, not removed.
A deleted object can be restored from the S3 section of the UI by removing its delete marker,
- Versions / Show
- Select the file line with the Delete Marker
- Actions / Delete
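Why removing a delete marker restores an object is easier to see with a small model. The following is a toy in-memory sketch, not the S3 API: the class and method names are made up for illustration, and it only mimics the idea that a delete on a versioned bucket adds a marker on top of the existing versions rather than erasing them.

```python
class VersionedBucket:
    """Toy in-memory model of a versioned bucket (not the S3 API)."""

    DELETE_MARKER = object()

    def __init__(self):
        self._versions = {}  # key -> list of versions, newest last

    def put(self, key, data):
        self._versions.setdefault(key, []).append(data)

    def delete(self, key):
        # A delete adds a marker on top; older versions are kept.
        self._versions.setdefault(key, []).append(self.DELETE_MARKER)

    def get(self, key):
        versions = self._versions.get(key, [])
        if not versions or versions[-1] is self.DELETE_MARKER:
            raise KeyError(key)
        return versions[-1]

    def remove_delete_marker(self, key):
        # Mirrors selecting the delete marker and choosing Actions / Delete.
        versions = self._versions.get(key, [])
        if versions and versions[-1] is self.DELETE_MARKER:
            versions.pop()

bucket = VersionedBucket()
bucket.put("report.txt", "v1")
bucket.put("report.txt", "v2")
bucket.delete("report.txt")                # the object now appears deleted
bucket.remove_delete_marker("report.txt")  # the latest version is visible again
print(bucket.get("report.txt"))            # v2
```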
Cross Region Replication
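Cross Region Replication synchronizes objects from a bucket in one region to a bucket in another. This feature requires versioning to be enabled on both the source and destination buckets. Additionally, it is worth mentioning that only objects stored after this feature is enabled are replicated; existing objects are not copied automatically.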
Lifecycle Rules allow you to manage storage costs by controlling the lifecycle of your objects. You can define when an object should be moved to another storage tier, such as S3 IA or Glacier. Likewise, you can define how long an object lives before it is deleted.
This feature can be used in conjunction with versioning, allowing you to assign rules to both current and previous versions.
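A lifecycle rule might look something like the sketch below. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument, but the rule ID, the `logs/` prefix and all day counts are hypothetical. The `storage_class_at` helper is not part of any AWS library; it is only there to show how the transitions play out for a current version.

```python
import json

# Hypothetical rule: the ID, prefix and day counts are illustrative only.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            # Current versions: S3 IA after 30 days, Glacier after 90 days.
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Current versions expire after one year.
            "Expiration": {"Days": 365},
            # Previous versions (only meaningful when versioning is enabled).
            "NoncurrentVersionTransitions": [
                {"NoncurrentDays": 30, "StorageClass": "GLACIER"},
            ],
            "NoncurrentVersionExpiration": {"NoncurrentDays": 180},
        }
    ]
}

def storage_class_at(rule, age_days):
    """Return the tier a current object version would be in at a given age."""
    if age_days >= rule["Expiration"]["Days"]:
        return "EXPIRED"
    tier = "STANDARD"
    for step in sorted(rule["Transitions"], key=lambda s: s["Days"]):
        if age_days >= step["Days"]:
            tier = step["StorageClass"]
    return tier

rule = lifecycle_configuration["Rules"][0]
print(storage_class_at(rule, 45))  # STANDARD_IA
print(json.dumps(lifecycle_configuration, indent=2))
```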
S3 Transfer Acceleration
S3 Transfer Acceleration improves upload speeds to S3 by making use of the CloudFront edge network. Your file is first uploaded to a nearby Edge Location, and is then sent from the Edge Location to the destination bucket over an optimized network path.
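In practice, accelerated uploads are sent to a dedicated endpoint of the form `bucketname.s3-accelerate.amazonaws.com` rather than the regular bucket endpoint. A minimal sketch of building that URL follows; the helper name and `example-bucket` are hypothetical, and the check reflects the requirement that bucket names used with Transfer Acceleration must not contain periods.

```python
def accelerate_endpoint(bucket_name: str) -> str:
    """Build the Transfer Acceleration endpoint URL for a bucket."""
    if "." in bucket_name:
        # Transfer Acceleration does not support bucket names containing periods.
        raise ValueError("bucket name must not contain periods")
    return f"https://{bucket_name}.s3-accelerate.amazonaws.com"

print(accelerate_endpoint("example-bucket"))
# https://example-bucket.s3-accelerate.amazonaws.com
```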
By default all newly created buckets are private. Access to your buckets can be controlled either at the bucket level (via bucket policies) or at an object level (via ACLs).
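As an illustration of bucket-level control, a bucket policy is a JSON document attached to the bucket. The sketch below builds one that grants a single principal read-only access to the objects in a bucket. The function name, bucket name, account ID and user are all placeholders, and a real policy would need to be reviewed against your own access requirements.

```python
import json

def read_only_policy(bucket_name: str, principal_arn: str) -> str:
    """Return a bucket policy (as JSON) allowing one principal to read objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadOnlyAccess",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject"],
                # The trailing /* applies the statement to objects in the bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)

# Placeholder bucket and principal, for illustration only.
print(read_only_policy("example-bucket",
                       "arn:aws:iam::123456789012:user/example-user"))
```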
The encryption methods available for your data are detailed below,
- In Transit - Data uploaded to or downloaded from your bucket is secured by SSL/TLS.
- At Rest - The following methods apply to data that is not in transit.
  - Server Side Encryption
    - S3 Managed Keys (SSE-S3) - Each object is encrypted with a unique key using AES-256.
    - AWS Key Management Service (SSE-KMS) - An envelope key is used to protect your data's encryption key. An audit trail is also created, recording when these keys are used.
    - Server Side Encryption with Customer Provided Keys (SSE-C) - The customer supplies and fully manages the key, while S3 performs the encryption.
  - Client Side Encryption
    - Data is encrypted on the client side and is then uploaded to S3.
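When uploading with boto3, the server side encryption option is chosen through keyword arguments on `put_object`. The sketch below only builds those arguments so that the three options can be compared side by side; it makes no AWS calls. The helper name and the KMS key alias are hypothetical, while the parameter names (`ServerSideEncryption`, `SSEKMSKeyId`, `SSECustomerAlgorithm`, `SSECustomerKey`) are the ones boto3 uses.

```python
import os

def encryption_arguments(mode, kms_key_id=None, customer_key=None):
    """Return put_object keyword arguments for a server side encryption mode."""
    if mode == "SSE-S3":
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id is not None:
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        if customer_key is None:
            raise ValueError("SSE-C requires a customer provided key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode}")

print(encryption_arguments("SSE-S3"))
print(encryption_arguments("SSE-KMS", kms_key_id="alias/example-key"))  # hypothetical alias
# With SSE-C the customer generates and looks after the 256-bit key.
print(sorted(encryption_arguments("SSE-C", customer_key=os.urandom(32))))
```

The resulting dictionary could then be unpacked into a call such as `s3.put_object(Bucket=..., Key=..., Body=..., **args)`.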
Import/Export is designed for importing and exporting large data sets. There are 2 AWS products within the Import/Export suite,
- Import/Export Disk - Data is copied to a physical disk, which is then shipped to AWS. It is then imported into either EBS, S3 or Glacier. Data can also be exported, but only from S3.
- Import/Export Snowball - Snowball is a highly secure physical enclosure used to transport petabyte-scale data into and out of AWS. It imports to and exports from S3 only.
A Storage Gateway connects an on-premises software appliance (a virtual machine) with AWS's cloud-based storage. This allows you to treat S3 as an extension of your company's local storage infrastructure. There are 3 types of storage gateway,
- Gateway Stored Volumes - All data is stored on site. The Storage Gateway backs up all data to S3.
- Gateway Cached Volumes - Only the most frequently accessed data is stored locally. The entire data set is stored in S3.
- Gateway Virtual Tape Library (VTL) - Integrates with backup software such as NetBackup and Backup Exec to provide either,
  - Virtual Tape Libraries - backed by S3.
  - Virtual Tape Shelf - backed by Glacier.
Amazon CloudFront is Amazon's content delivery network (CDN). It allows files to be delivered rapidly to locations around the globe.
CloudFront consists of Edge Locations, Origins and Distributions. These are explained further below,
- Edge Location - The location where content is cached. Objects are cached for the life of the TTL. Edge Locations do not always correspond to regions and/or AZs.
- Origin - The Origin is the AWS resource whose content will be distributed, e.g. an S3 bucket or an EC2 instance.
- Distribution - A Distribution is the name given to the CDN, which consists of a collection of Edge Locations.
CloudFront works with S3, EC2, Elastic Load Balancing and Route 53.
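The way an Edge Location caches objects for the life of the TTL can be sketched with a toy cache. This is a simplified model for illustration only, not how CloudFront is implemented: the class name is made up, the origin is a plain dictionary, and time is passed in explicitly so the behaviour is easy to follow.

```python
class EdgeLocation:
    """Toy model of an edge cache with a TTL (not CloudFront's implementation)."""

    def __init__(self, origin, ttl_seconds):
        self.origin = origin            # mapping of path -> content
        self.ttl_seconds = ttl_seconds
        self._cache = {}                # path -> (content, expiry time)
        self.origin_fetches = 0

    def get(self, path, now):
        cached = self._cache.get(path)
        if cached is not None and now < cached[1]:
            return cached[0]            # cache hit, still within the TTL
        content = self.origin[path]     # miss or expired: go back to the origin
        self.origin_fetches += 1
        self._cache[path] = (content, now + self.ttl_seconds)
        return content

origin = {"/index.html": "v1"}
edge = EdgeLocation(origin, ttl_seconds=60)
edge.get("/index.html", now=0)           # miss: fetched from the origin
origin["/index.html"] = "v2"             # the origin content changes
print(edge.get("/index.html", now=30))   # v1 (still served from the cache)
print(edge.get("/index.html", now=61))   # v2 (TTL expired, fetched again)
print(edge.origin_fetches)               # 2
```

Note that the stale `v1` keeps being served until the TTL runs out, which is why the TTL is a trade-off between freshness and load on the origin.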