☁️Cloud Computing Architecture

Cloud Storage Types

Why This Matters

Cloud storage isn't just about "where your data lives"; it's about matching storage architecture to workload requirements. Exam questions will test whether you understand the tradeoffs between latency, durability, cost, and access patterns. You'll need to recognize when an application needs block storage's raw performance versus object storage's near-limitless scalability, or why a company might choose cold storage over hot storage for compliance data.

The key insight here is that storage types exist on spectrums: access frequency (hot vs. cold), data structure (structured vs. unstructured), and persistence (durable vs. ephemeral). Don't just memorize definitions—know what concept each storage type illustrates and when an architect would choose one over another.


Storage by Data Structure

How data is organized determines which storage type fits best. Unstructured data (images, logs, backups) needs flexible schemas, while structured data (transactions, user records) requires rigid organization for query efficiency.

Object Storage

  • Stores data as discrete objects with unique identifiers and metadata in a flat namespace (no true directory hierarchy), which lets it scale to very large object counts
  • API-driven access (typically RESTful) makes it ideal for web applications, data lakes, and backup systems
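  • Consistency model varies by provider: many object stores have traditionally offered eventual consistency, trading immediate consistency for massive scalability across distributed systems, though some services (Amazon S3, for example) now provide strong read-after-write consistency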
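
To make the "flat namespace" idea concrete, here is a minimal in-memory sketch. It is not any provider's API: the class `FlatObjectStore`, its methods, and the example keys are all hypothetical. The point it illustrates is that a key such as `photos/2024/cat.png` is just a string, and "folders" are simulated by filtering on a key prefix.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy object store (hypothetical): one flat key space, no real directories."""

    def __init__(self):
        self._objects = {}  # key -> StoredObject

    def put(self, data, metadata=None, key=None):
        # Generate a unique identifier when the caller does not supply a key.
        key = key or uuid.uuid4().hex
        self._objects[key] = StoredObject(data, dict(metadata or {}))
        return key

    def get(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        # "Folders" are only a naming convention: listing filters keys by prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put(b"fake image bytes", {"content-type": "image/png"}, key="photos/2024/cat.png")
store.put(b"fake log bytes", {"content-type": "text/plain"}, key="logs/app.log")
auto_key = store.put(b"anonymous blob")

print(store.list_keys("photos/"))  # ['photos/2024/cat.png']
```

Real object stores expose the same put/get/list operations over HTTP rather than as local method calls, which is what "API-driven access" refers to.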

File Storage

  • Hierarchical structure using directories and files—mimics traditional filesystem organization users already understand
  • Protocol support for NFS and SMB/CIFS enables seamless integration with legacy applications and shared workloads
  • File-level locking supports collaborative access, making it essential for shared home directories and content management systems

Database Storage

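  • Optimized for structured data; relational engines typically provide ACID transactions, which ensure that transactions complete reliably or roll back entirely (NoSQL systems vary in how much of this they guarantee)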
  • Relational (SQL) vs. non-relational (NoSQL) choice depends on whether you need rigid schemas or flexible document structures
  • Query optimization engines enable complex joins and aggregations that would be impractical with raw file or object storage
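
The "complete reliably or roll back entirely" behavior can be seen with Python's built-in `sqlite3` module. This is a small sketch, not a production pattern: the `accounts` table, the `transfer` helper, and the balances are made up for illustration. The connection's context manager commits if the block succeeds and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint if src cannot cover it.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


failed = transfer(conn, "alice", "bob", 500)    # overdraft: both updates undone
after_failure = balances(conn)
succeeded = transfer(conn, "alice", "bob", 30)  # valid: both updates committed
after_success = balances(conn)

print(failed, after_failure)     # False {'alice': 100, 'bob': 50}
print(succeeded, after_success)  # True {'alice': 70, 'bob': 80}
```

Notice that the credit to `bob` in the failed transfer never becomes visible: the transaction is all-or-nothing.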

Compare: Object Storage vs. File Storage—both store unstructured data, but object storage uses flat namespaces with API access while file storage maintains hierarchical paths with protocol-based access. If an exam question mentions "legacy application migration," file storage is usually the answer.


Storage by Access Pattern

Access frequency is the primary cost driver in cloud storage. Architects must predict how often data will be read or written to select the appropriate storage tier.

Hot Storage

  • Low latency (typically milliseconds or less) for frequently accessed data, usually provisioned on SSDs or other high-performance media
  • Highest cost per gigabyte reflects the premium hardware and redundancy required for real-time workloads
  • Use cases include active databases, session data, and real-time analytics where any delay impacts user experience

Cold Storage

  • Optimized for data accessed less than once per month—stored on cheaper, slower media like tape or spinning disk
  • Retrieval delays of minutes to hours are acceptable because the data isn't needed immediately
  • Compliance and regulatory retention often mandates keeping data for years, making cost efficiency critical

Archive Storage

  • Designed for "write once, read rarely" workloads—think legal holds, medical records, and audit logs
  • Lowest cost tier with retrieval times potentially measured in hours and additional fees for data access
  • Lifecycle policies can automatically transition data from hot to archive based on age or access patterns
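
A lifecycle policy is essentially a rule that maps age or idle time to a tier. The sketch below assumes two thresholds taken loosely from this guide (roughly monthly access for cold, roughly annual for archive); the constants and the `choose_tier` function are illustrative, not any provider's defaults.

```python
from datetime import date

# Assumed thresholds for illustration only; real policies are configured per workload.
COLD_AFTER_DAYS = 30
ARCHIVE_AFTER_DAYS = 365


def choose_tier(last_accessed, today):
    """Return 'hot', 'cold', or 'archive' based on days since last access."""
    idle_days = (today - last_accessed).days
    if idle_days >= ARCHIVE_AFTER_DAYS:
        return "archive"
    if idle_days >= COLD_AFTER_DAYS:
        return "cold"
    return "hot"


today = date(2024, 6, 1)
print(choose_tier(date(2024, 5, 25), today))  # hot (7 days idle)
print(choose_tier(date(2024, 3, 1), today))   # cold (92 days idle)
print(choose_tier(date(2022, 6, 1), today))   # archive (731 days idle)
```

In practice the provider evaluates rules like this on a schedule and moves the objects for you; the architect's job is choosing thresholds that match the real access pattern.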

Compare: Cold Storage vs. Archive Storage. Both target infrequent access, but archive storage assumes even rarer retrieval (annually vs. monthly) with longer restoration times. FRQ tip: if the scenario pairs "regulatory compliance" or "7-year retention" with very rare access (such as annual audits), archive storage is usually your answer.


Storage by Performance Requirements

Some workloads demand raw I/O performance that only certain storage architectures can deliver. Block storage operates at the lowest level, providing the speed databases and VMs require.

Block Storage

  • Divides data into fixed-size blocks with unique addresses—the operating system sees it as a raw disk device
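  • Low, predictable latency among persistent options because the service exposes raw blocks with no provider-side file or object layer; the operating system or application builds its own filesystem or data layout on top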
  • Essential for boot volumes, databases, and any workload requiring consistent IOPS—often attached directly to compute instances
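
The idea of "fixed-size blocks with unique addresses" comes down to simple arithmetic: a byte offset maps to a block number by integer division. The sketch below assumes a 4096-byte block size (a common choice, but an assumption here), and the `BlockDevice` class is a hypothetical in-memory stand-in for a raw volume.

```python
BLOCK_SIZE = 4096  # assumed block size for illustration


def blocks_for_range(offset, length, block_size=BLOCK_SIZE):
    """Return the block numbers touched by `length` bytes starting at `offset`."""
    if length <= 0:
        return []
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return list(range(first, last + 1))


class BlockDevice:
    """Toy raw volume (hypothetical): reads and writes whole blocks by index."""

    def __init__(self, num_blocks, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self._blocks = [bytes(block_size) for _ in range(num_blocks)]

    def write_block(self, index, data):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block long")
        self._blocks[index] = bytes(data)

    def read_block(self, index):
        return self._blocks[index]


print(blocks_for_range(4000, 200))   # [0, 1]  (the write straddles a block boundary)
print(blocks_for_range(8192, 4096))  # [2]     (exactly one aligned block)

disk = BlockDevice(num_blocks=8)
disk.write_block(2, b"x" * BLOCK_SIZE)
print(disk.read_block(2)[:4])        # b'xxxx'
```

The device knows nothing about files or records. Whatever sits on top (a filesystem or a database engine) decides what each block means.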

Ephemeral Storage

  • Temporary storage tied to instance lifecycle—created at launch, destroyed at termination with no persistence
  • Highest performance tier because it's typically local SSD directly attached to the physical host
  • Perfect for scratch space, caches, and buffers where data loss is acceptable and speed is paramount

Compare: Block Storage vs. Ephemeral Storage. Both offer high performance, but block storage persists independently of compute instances while ephemeral storage is lost when the instance stops or terminates. Exam questions often test whether you'd use ephemeral storage for a database (you wouldn't, because the data would be lost).


Storage for Distribution and Flexibility

Modern architectures often span multiple locations or combine cloud with on-premises infrastructure. These storage types address geographic distribution and hybrid requirements.

Content Delivery Network (CDN)

  • Edge caching distributes content to servers geographically close to users—reduces round-trip latency dramatically
  • Origin server holds the master copy while edge nodes cache frequently requested content like images, videos, and static assets
  • Cache invalidation strategies (TTL, purge APIs) control how quickly updates propagate across the network
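
The interaction between TTL and purge is easier to see in code. This is a minimal sketch of a single edge node, not a real CDN API: `EdgeCache`, `fetch_from_origin`, and the paths are hypothetical, and time is passed in explicitly as `now` (in seconds) so the behavior is easy to trace.

```python
class EdgeCache:
    """Toy edge node (hypothetical): caches origin responses until a TTL expires."""

    def __init__(self, fetch_from_origin, ttl_seconds):
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._cache = {}  # path -> (body, expires_at)

    def get(self, path, now):
        entry = self._cache.get(path)
        if entry is not None and now < entry[1]:
            return entry[0]  # cache hit: no trip to the origin
        body = self._fetch(path)  # miss or expired: refetch from the origin
        self._cache[path] = (body, now + self._ttl)
        return body

    def purge(self, path):
        # Explicit invalidation: the next request goes back to the origin.
        self._cache.pop(path, None)


origin = {"/logo.png": "v1"}
origin_requests = []


def fetch_from_origin(path):
    origin_requests.append(path)
    return origin[path]


edge = EdgeCache(fetch_from_origin, ttl_seconds=60)

first = edge.get("/logo.png", now=0)       # miss, fetched from origin
origin["/logo.png"] = "v2"                 # origin content changes
stale = edge.get("/logo.png", now=30)      # still within TTL: old copy served
refreshed = edge.get("/logo.png", now=61)  # TTL expired: refetched

origin["/logo.png"] = "v3"
edge.purge("/logo.png")                    # do not wait for the TTL
purged = edge.get("/logo.png", now=62)

print(first, stale, refreshed, purged)  # v1 v1 v2 v3
print(len(origin_requests))             # 3
```

The tradeoff is visible in the `stale` read: a longer TTL means fewer origin requests but a longer window in which users may see outdated content, which is why purge APIs exist.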

Hybrid Storage

  • Bridges on-premises infrastructure with cloud storage—data can live in either location based on policy
  • Cloud tiering automatically moves cold data to cheaper cloud storage while keeping hot data local for performance
  • Enables gradual cloud migration without forcing an all-or-nothing cutover, reducing risk and maintaining business continuity
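
One simple way to picture cloud tiering is as a placement plan: keep the most recently used data on-premises until local capacity runs out, and send the rest to the cloud. The sketch below assumes that policy; the function `plan_placement`, the file names, sizes, and the 50 GB capacity are all made up for illustration.

```python
def plan_placement(files, local_capacity_gb):
    """Assign each file to 'on-premises' or 'cloud'.

    `files` is a list of (name, size_gb, idle_days) tuples. The hottest data
    (fewest idle days) is placed locally first, as long as it fits.
    """
    placement = {}
    used_gb = 0
    for name, size_gb, idle_days in sorted(files, key=lambda f: f[2]):
        if used_gb + size_gb <= local_capacity_gb:
            placement[name] = "on-premises"
            used_gb += size_gb
        else:
            placement[name] = "cloud"
    return placement


files = [
    ("orders.db", 40, 0),
    ("q1-report.pdf", 5, 90),
    ("2019-backup.tar", 200, 1500),
    ("dashboard-cache", 10, 1),
]

plan = plan_placement(files, local_capacity_gb=50)
for name, location in plan.items():
    print(name, "->", location)
# orders.db -> on-premises
# dashboard-cache -> on-premises
# q1-report.pdf -> cloud
# 2019-backup.tar -> cloud
```

Real hybrid storage gateways apply richer policies (age thresholds, pinning, file type), but the underlying decision is the same: data location follows policy rather than being fixed up front.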

Compare: CDN vs. Hybrid Storage—CDN distributes copies of content for read performance, while hybrid storage manages primary data placement across environments. CDN is about latency reduction; hybrid storage is about infrastructure flexibility.


Quick Reference Table

Concept | Best Examples
Unstructured data at scale | Object Storage, File Storage
High-performance workloads | Block Storage, Hot Storage, Ephemeral Storage
Cost-optimized retention | Cold Storage, Archive Storage
Structured data management | Database Storage
Geographic distribution | CDN
Multi-environment flexibility | Hybrid Storage
Temporary/disposable data | Ephemeral Storage
Legacy application support | File Storage (NFS/SMB protocols)

Self-Check Questions

  1. Which two storage types would you compare when deciding how to store millions of user-uploaded images that need web API access—and what's the key differentiator?

  2. A financial services company must retain transaction records for 7 years but only accesses them during annual audits. Which storage type is most cost-effective, and why wouldn't hot storage be appropriate?

  3. Compare and contrast block storage and object storage: what type of workload suits each, and how do their access methods differ?

  4. An application uses temporary files for video transcoding that can be regenerated if lost. The files need maximum I/O speed. Which storage type fits, and what's the critical tradeoff the architect accepts?

  5. A company wants to migrate to the cloud gradually while keeping latency-sensitive data on-premises. Which storage approach supports this, and how does it differ from using a CDN?