File storage is as important in the cloud as it is in any other IT workspace. The convenience and flexibility of the cloud offer many advantages. You can build infrastructure without significant upfront cost—and without the loss of precious floor space.
The good news for admins is that fault tolerance is already built into the cloud storage environment, but they’ll still face some critical decisions. One choice in your cloud journey will be to settle on a file storage format for cloud resources. Options include:
- Block storage
- Object storage
Cloud-based block storage is very much like the block storage on the desktop or in the server room. A block storage device splits the data into fixed-size blocks and writes (or reads) the data a block at a time.
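The fixed-size block idea can be illustrated with a minimal Python sketch. The 4,096-byte block size and the function names are assumptions chosen for the example, not properties of any particular device or cloud service.

```python
BLOCK_SIZE = 4096  # bytes; an illustrative size, real devices and volumes vary


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split data into fixed-size blocks, zero-padding the final block."""
    blocks = []
    for offset in range(0, len(data), block_size):
        chunk = data[offset:offset + block_size]
        # Every block has the same size, so the last one is padded if short.
        blocks.append(chunk.ljust(block_size, b"\x00"))
    return blocks


def read_block(blocks: list[bytes], index: int) -> bytes:
    """Read exactly one block by its index, as a block device would."""
    return blocks[index]


if __name__ == "__main__":
    blocks = split_into_blocks(b"a" * 10_000)
    print(len(blocks), "blocks of", len(read_block(blocks, 0)), "bytes each")
```

Because every block has the same size and its own index, a single block can be read or rewritten without touching the rest of the data, which is part of why block storage suits frequently accessed data.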
Network block storage extends the same block storage technology that has been in use on hard disks and other storage devices for more than 40 years. Block storage is still a strong choice in scenarios that require low latency and in situations where the data requires frequent access. Nevertheless, block storage also has some disadvantages. For instance, block storage usually costs more per gigabyte, and it doesn’t scale well to large datasets. For unstructured data, backups, archives, and other scenarios that don’t require frequent access, object storage often emerges as a better choice.
Object storage stores a single object (such as a file) all at once rather than breaking it into blocks. In many object storage systems, the location of the data is computed using a hash function, which removes the need for a central lookup table or other mapping component that could serve as a bottleneck. Object storage systems also provide rich and expansive metadata that support efficient searches and enable some management capabilities that aren’t possible with block storage.
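As a rough sketch of hash-based placement, the Python snippet below maps an object key to a storage node by hashing the key. The node names and the `locate` function are hypothetical, and the simple modulo scheme is an assumption made for brevity; production systems typically use more elaborate approaches (consistent hashing, or CRUSH in Ceph) so that adding a node does not relocate most of the data.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical storage nodes


def locate(key: str, nodes: list[str] = NODES) -> str:
    """Compute which node holds an object directly from its key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an integer and map it
    # onto the node list. No lookup table is consulted.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    for key in ("backups/2024-01.tar", "videos/intro.mp4"):
        print(key, "->", locate(key))
```

Any client that knows the node list can compute the same location independently, which is why no central mapping service sits in the request path.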
Most object storage systems include automation and self-managing features that allow a single admin to manage more data than the administrator of a block storage system. Greater efficiency leads to lower per-GB cost. The emphasis on automation and self-management also means that object storage is well adapted for DevOps environments.
The S3 API is a popular API for object storage solutions in the cloud. Amazon originally developed S3, but many other storage providers now implement the same API. S3 stores data as objects in buckets, and each object is referenced by URL. A bucket is a container for objects; each object consists of the data itself plus descriptive metadata. An optional feature called bucket versioning lets you store multiple versions of the same data object in a single bucket, which creates a built-in form of version control.
Several cloud-based object storage solutions are compatible with S3, which means if you adopt an S3-compatible solution, you can easily migrate to another S3-compatible solution or add other S3-based services to your existing configuration. Many important network services have already built their interfaces to the S3 API, which means S3 is easy to integrate with Ceph, OpenStack, Kubernetes, and other technologies.
A RESTful, HTTP-based access model means you can easily build custom web applications that connect an S3-compatible data store with other resources on a local network or in the cloud. Compatible interfaces available through the leading container and virtualization tools make it easier to integrate S3-based object storage into the automated orchestration solutions that are essential to the DevOps environment.
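The following Python sketch shows what the RESTful model looks like at the HTTP level: an upload is a `PUT` to the object's URL and a download is a `GET` from it. The endpoint `https://s3.example.com`, the path-style URL layout, and the helper names are assumptions for illustration. The requests are only constructed, not sent, because S3-compatible services normally require signed requests (for example, AWS Signature Version 4) unless an object is public; in practice an SDK handles the signing.

```python
from urllib.parse import quote
from urllib.request import Request

ENDPOINT = "https://s3.example.com"  # hypothetical S3-compatible endpoint


def object_url(bucket: str, key: str, endpoint: str = ENDPOINT) -> str:
    """Build a path-style URL: <endpoint>/<bucket>/<key>."""
    # quote() leaves "/" intact, so keys keep their folder-like structure.
    return f"{endpoint}/{quote(bucket)}/{quote(key)}"


def put_request(bucket: str, key: str, body: bytes,
                content_type: str = "application/octet-stream") -> Request:
    """Prepare (but do not send) an unsigned upload request."""
    return Request(object_url(bucket, key), data=body, method="PUT",
                   headers={"Content-Type": content_type})


def get_request(bucket: str, key: str) -> Request:
    """Prepare (but do not send) an unsigned download request."""
    return Request(object_url(bucket, key), method="GET")


if __name__ == "__main__":
    req = put_request("media", "photos/cat 1.jpg", b"...", "image/jpeg")
    print(req.get_method(), req.full_url)
```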
The higher latency of object storage means that S3-compatible storage might not be the best choice for a structured database, but for unstructured data, backup sets, and similar scenarios, the low cost and scalability of S3-compatible object storage often make it the best solution.
Object storage is frequently used to host libraries of audio, video, and photographic files. Because each file has its own URL, it is easy to embed the files in websites or integrate them into streaming solutions. The low per-GB cost of object storage also makes it a good choice for long-term backup and cold storage scenarios, where it replaces tape file storage as an inexpensive, offsite archive format that is much easier to access than a cold-stored tape.
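Because every object has its own URL, embedding media in a web page can be as simple as pointing an HTML tag at that URL. The Python sketch below picks a tag from the file extension. The extension table, the `embed_tag` helper, and the example URL are hypothetical, and the sketch assumes plain public URLs without query strings.

```python
from html import escape
from pathlib import PurePosixPath

# Illustrative mapping from file extension to HTML element.
TAGS = {".jpg": "img", ".jpeg": "img", ".png": "img",
        ".mp3": "audio", ".mp4": "video"}


def embed_tag(url: str) -> str:
    """Return an HTML snippet that embeds the object at the given URL."""
    suffix = PurePosixPath(url).suffix.lower()
    tag = TAGS.get(suffix)
    safe = escape(url, quote=True)  # safe to place inside an attribute
    if tag == "img":
        return f'<img src="{safe}" alt="">'
    if tag in ("audio", "video"):
        return f'<{tag} controls src="{safe}"></{tag}>'
    # Unknown types fall back to a plain download link.
    return f'<a href="{safe}">{safe}</a>'


if __name__ == "__main__":
    print(embed_tag("https://media.example.com/library/clip.mp4"))
```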
When it comes to websites, a dynamic site that requires a content management system (CMS) or server-side processing language like PHP is probably better off with block storage, but for static websites built around fixed HTML files, object storage is a simple and cost-effective option.
If you are considering adding S3-compatible object storage to your cloud infrastructure, you’ll have the choice of several products and vendors. AWS provides S3 in the hyperscale context; however, midsize and smaller companies without a massive infrastructure often find that an alternative cloud provider offers lower total cost of ownership and better cost-to-performance characteristics. The term alternative cloud refers to a class of cloud vendors that provide an enterprise-grade service catalog for companies that might not have a team of cloud experts on staff. The emphasis is on low cost, simplicity, and more personal customer service.
Companies that are already using hyperscale vendors like AWS also sometimes choose an alternative cloud solution for a specific project that could benefit from the simplicity and lower cost. In that case, an S3-compatible architecture ensures that the applications and services developed around S3 storage in the Amazon cloud will work out of the box in the alternative cloud environment.