Object storage explained: S3, R2, B2 and self-hosted MinIO

Key takeaways

01 The S3 API is the standard. Almost every provider speaks it, so your code is portable in a way it was not in the days of vendor-specific FTP and NFS.
02 Egress is where you actually pay. AWS S3 storage is cheap; AWS S3 egress to the public internet is the most expensive on the list. R2 and B2 have zero or near-zero egress, which inverts the economics.
03 Presigned URLs let you grant a client time-limited download or upload access without giving them credentials. They are the right answer for almost every direct-from-client upload pattern.
04 Lifecycle rules move old objects to cheaper storage tiers or delete them automatically. Set them up once; they pay for themselves over years.
05 Versioning protects against accidental overwrite and ransomware-style mass deletion, at the cost of higher storage bills. Pair it with lifecycle rules so old versions expire.
06 Self-hosting MinIO is the right call when you need on-prem object storage with the S3 API. It is the wrong call when you are doing it to save money on a small workload; the operations cost outstrips the savings.

Object storage is the answer to 'where do I put user-uploaded files'. The S3 API has become the lingua franca, and there are now half a dozen credible providers, each with different pricing, performance and operational profiles. This guide explains the API surface you actually use, the egress pricing trap that has burned a thousand small shops, and how to pick AWS S3, Cloudflare R2, Backblaze B2, Wasabi, Hetzner Object Storage or self-hosted MinIO for the job at hand.

What object storage is

Object storage is a way to store files where each file lives at a URL, identified by a unique key inside a container called a bucket, with no directory structure underneath it in any meaningful sense. You PUT a file, you GET a file, you DELETE a file, you LIST files in a bucket. There is no filesystem, no mounted disk, no notion of seek or random write. The whole thing is HTTP from the outside.

You gain durability and capacity; you give up low latency and in-place writes. Object storage gives you eleven nines of durability across multiple data centers, capacity that scales to petabytes without you doing anything, and per-gigabyte costs in the tenths of a cent per month. In exchange, every request takes tens to hundreds of milliseconds, you cannot modify a file in place (you replace it), and you pay per request for both writes and reads.

The right mental model is a giant filing cabinet that lives at https://your-bucket.example-storage.com/some/path/file.jpg. You write the file, the URL is the address you remember it by, you read it back later. That is the whole API at the top level. Everything else is policy on top.

The S3 API as a lingua franca

Amazon shipped S3 in 2006. The API they invented for it, despite some warts, became the de facto standard for object storage. Almost every provider in the space now offers an S3-compatible endpoint, which means the same code that talks to AWS can also talk to Cloudflare R2, Backblaze B2, Wasabi, Hetzner Object Storage, MinIO running on a box in your living room, and a dozen others.

The portability this gives you is significant. The bad old days of vendor-specific storage APIs meant your application code was coupled to a single provider; switching meant rewriting a layer. With S3-compatible providers, switching is mostly a matter of updating the endpoint URL and the credentials.

The smallest useful AWS SDK example, in Python, that works against any S3-compatible provider:

import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://eu-central-1.linodeobjects.com",   # provider-specific; this one is a Linode endpoint
    aws_access_key_id="ACCESSKEY",
    aws_secret_access_key="SECRETKEY",
    region_name="eu-central-1",
)

s3.upload_file("local/path/photo.jpg", "my-bucket", "uploads/2026/photo.jpg")

That is the whole pattern. Change the endpoint URL and the credentials and the same code points at AWS, R2, B2, Wasabi or your self-hosted MinIO. The S3 SDKs do not need to know which provider they are talking to.
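The other basic operations follow the same shape. The sketch below assumes a client built as above; the function name round_trip and the bucket and key values in the usage note are illustrative, not part of any SDK. It takes the client as a parameter so the same function can be pointed at any S3-compatible provider.

```python
def round_trip(s3, bucket, key, body):
    """PUT, GET, LIST and DELETE one object using a boto3-style S3 client."""
    # PUT: write the whole object; an existing key is replaced, never patched.
    s3.put_object(Bucket=bucket, Key=key, Body=body)

    # GET: the response body is a stream; read() returns the bytes.
    fetched = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    # LIST: "Contents" is absent when nothing matches, hence the default.
    listing = s3.list_objects_v2(Bucket=bucket, Prefix=key)
    keys = [entry["Key"] for entry in listing.get("Contents", [])]

    # DELETE: remove the object again.
    s3.delete_object(Bucket=bucket, Key=key)
    return fetched, keys
```

A call such as round_trip(s3, "my-bucket", "uploads/2026/hello.txt", b"hello") should hand back the bytes it wrote and a one-entry listing, leaving the bucket as it found it.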

The providers compared

The list shifts every year, but six providers are the ones to know in 2026:

AWS S3. The original, still the largest, the most featureful. Storage is around USD 0.023/GB/month for standard tier. Egress to the public internet is the killer (see the next section).
Cloudflare R2. Storage around USD 0.015/GB/month, zero egress. Built to live alongside Cloudflare's CDN. Class B operations (reads) are free up to a generous monthly allowance.
Backblaze B2. Storage around USD 0.006/GB/month. Egress free when used with Cloudflare's CDN under the Bandwidth Alliance arrangement, otherwise USD 0.01/GB beyond a free monthly allowance that scales with how much you store. The economy option that scales.
Wasabi. Storage around USD 0.0069/GB/month, no egress fees. A 90-day minimum retention period (you pay for at least 90 days even if you delete sooner).
Hetzner Object Storage. Storage around EUR 0.0049/GB/month, generous egress allowance (1 TB free per TB stored, then EUR 1/TB). EU-only locations. The European data sovereignty option.
Self-hosted MinIO. Free software, your costs are the disk, the bandwidth and the operations time. A serious option when you need on-prem object storage, an over-engineered one when you don't.

Two cautions about storage tiers. First, AWS S3 Glacier and similar deep-cold tiers cost almost nothing to store but take minutes to hours to retrieve. They are for genuine archives, not active data. Second, pricing-tier confusion is the most common mistake new operators make: they put backups in Standard, paying full price for data they never read, instead of moving them to Infrequent Access or a cold tier via a lifecycle rule.

Egress pricing: the trap nobody warns you about

The thing that bites every newcomer to object storage is not the storage cost but the egress cost: the price you pay per gigabyte to read data out of the provider's network onto the public internet. AWS charges USD 0.09/GB for the first 10 TB per month of egress, which sounds cheap until you serve a single popular video.

A worked example. A site serves user-uploaded images from an S3 bucket. The catalogue is 500 GB. Storage costs USD 11.50 a month, which is fine. The site serves 5 TB of image data per month. Egress costs USD 450. The egress bill is forty times the storage bill, and the operator usually only notices when the first month's invoice arrives.
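The arithmetic above can be checked with a few lines of Python. This is a rough sketch using the AWS rates quoted in this guide; it treats 1 TB as 1,000 GB and ignores request fees, free tiers and volume discounts. The function name monthly_bill is illustrative.

```python
S3_STORAGE_USD_PER_GB = 0.023   # standard tier, per month (figure from this guide)
S3_EGRESS_USD_PER_GB = 0.09     # first 10 TB per month (figure from this guide)

def monthly_bill(stored_gb, egress_gb,
                 storage_rate=S3_STORAGE_USD_PER_GB,
                 egress_rate=S3_EGRESS_USD_PER_GB):
    """Return (storage_cost, egress_cost) in USD for one month."""
    storage_cost = round(stored_gb * storage_rate, 2)
    egress_cost = round(egress_gb * egress_rate, 2)
    return storage_cost, egress_cost

storage, egress = monthly_bill(stored_gb=500, egress_gb=5000)
print(f"storage USD {storage:.2f}, egress USD {egress:.2f}, "
      f"ratio {egress / storage:.0f}x")
```

Swapping in your own volumes before launch is the cheap way to see whether egress, not storage, will dominate the invoice.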

The fix is one of three:

Put a CDN in front of the bucket. The CDN caches the popular objects at the edge, the origin only serves cache misses, the egress bill drops by an order of magnitude. CloudFront in front of S3 follows this pattern, and AWS waives the egress fee for CloudFront origin pulls, although CloudFront bills its own data transfer out to viewers.
Use a provider with zero or near-zero egress. R2, B2 (with Cloudflare in front), Wasabi and Hetzner all change the economics. If you are serving a few terabytes a month of public data, R2 with no egress is dramatically cheaper than S3 with egress.
Re-evaluate whether you actually need the file on the public internet. Authenticated downloads via presigned URLs that expire in minutes do not change the egress arithmetic, but they reduce the surface area for scraping and abuse.

The right answer for new projects without an existing AWS commitment is almost always R2 or B2 plus a CDN. The reason AWS S3 is still the default in many shops is not that it is the best deal, but that the rest of the stack is on AWS.
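To put numbers on that claim, the sketch below runs the same 500 GB stored and 5 TB served workload against three of the providers, using only the per-gigabyte rates quoted in this guide. It is a simplification: request fees, free tiers and CDN arrangements are ignored, and the B2 figure assumes every gigabyte of egress is billed. The names RATES and compare are illustrative.

```python
# USD per GB, as quoted in this guide; request fees and free tiers ignored.
RATES = {
    "AWS S3": {"storage": 0.023, "egress": 0.09},
    "Cloudflare R2": {"storage": 0.015, "egress": 0.0},
    "Backblaze B2": {"storage": 0.006, "egress": 0.01},  # assumes all egress is billed
}

def compare(stored_gb, egress_gb):
    """Return {provider: monthly USD total}, cheapest first."""
    totals = {
        name: round(stored_gb * rate["storage"] + egress_gb * rate["egress"], 2)
        for name, rate in RATES.items()
    }
    return dict(sorted(totals.items(), key=lambda item: item[1]))

for provider, total in compare(stored_gb=500, egress_gb=5000).items():
    print(f"{provider:<14} USD {total:>7.2f}")
```

The ordering will shift with your own mix of storage and traffic, which is the point of running the numbers rather than trusting the headline storage price.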

Buckets, keys and paths

A bucket is a container with a globally unique name. The name must be DNS-valid because each bucket gets its own subdomain at the provider. Once a bucket exists, you write objects to it under keys. The key is the full path from the bucket root: uploads/2026/06/photo.jpg is one key, not three nested directories.

There is no directory structure in the strict sense. The slashes in the key are convention; they look like directories to a human and to the LIST API which can return "prefixes" that group together. But there is no directory you can create or move; only keys you can write or delete.
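The grouping behaviour is easier to see in code. The function below is a local simulation, not any provider's implementation: it mimics how a LIST call with a prefix and a delimiter splits flat keys into objects and rolled-up "prefixes". The name list_with_delimiter and the sample keys are made up for illustration.

```python
def list_with_delimiter(keys, prefix="", delimiter="/"):
    """Mimic how a LIST call with a delimiter splits keys into objects and prefixes."""
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to and including the first delimiter is rolled up.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)

keys = [
    "uploads/2026/06/photo.jpg",
    "uploads/2026/07/scan.pdf",
    "uploads/readme.txt",
    "logo.png",
]
print(list_with_delimiter(keys, prefix="uploads/"))
```

Nothing called uploads/2026/ was ever created; the "folder" exists only because two keys happen to share that beginning.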

Practical implications:

Listing a bucket with many objects is expensive. Each LIST call returns at most a thousand entries; for a million-object bucket you make at least a thousand calls, each billed as a request. Plan for this.
Renaming an object is a copy-then-delete, not a metadata change. If you need to organise objects after the fact, design the keys carefully up front.
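Concentrating heavy request traffic on one key prefix is a performance trap. Providers shard by key prefix, and some document their request-rate limits per prefix, so a hot workload benefits from hash-prefixed keys (ab/cd/ef-some-file) that spread the load.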
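Two of these points translate directly into code. The sketch below shows one way to derive a hash-prefixed key and one way to walk a large bucket page by page. It assumes a boto3-style client passed in as s3; the helper names hashed_key and iter_keys are illustrative, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib

def hashed_key(filename):
    """Build an ab/cd/ef-some-file style key from a hash of the filename."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return f"{digest[0:2]}/{digest[2:4]}/{digest[4:6]}-{filename}"

def iter_keys(s3, bucket, prefix=""):
    """Yield every key under a prefix, following continuation tokens page by page."""
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3.list_objects_v2(**kwargs)
        for entry in page.get("Contents", []):
            yield entry["Key"]
        if not page.get("IsTruncated"):
            break
        # Each page says where the next one starts.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

Because iter_keys is a generator, a caller can stop early and avoid paying for LIST calls it does not need.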

Presigned URLs

A presigned URL is the trick that makes object storage usable from the browser. You generate a URL on the server that grants the holder permission to perform one specific action (usually GET or PUT) on one specific object, for a limited time. You hand the URL to the client. The client uses it directly. The credentials never leave the server.

For downloads, a presigned GET URL lets you serve private files without proxying the bytes through your own server. The browser fetches them straight from R2 or S3; the bandwidth is the provider's, not yours.

For uploads, a presigned PUT URL lets the browser upload directly to the bucket without the file ever passing through your application server. This is the standard pattern for file uploads at any scale beyond hobby use. Your application is responsible only for issuing the URL with the right object key, content-type and expiry, and then for handling the post-upload bookkeeping when the client signals completion.

A presigned-upload flow in Python:

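import uuid

def create_upload_url(s3, user_id):
    # s3 is a boto3 S3 client; uuid4 stands in for any unguessable object name
    key = f"users/{user_id}/{uuid.uuid4().hex}.bin"
    url = s3.generate_presigned_url(
        "put_object",
        Params={
            "Bucket": "uploads",
            "Key": key,
            "ContentType": "application/octet-stream",
        },
        ExpiresIn=300,   # five minutes
    )
    # Return the key too, so the post-upload bookkeeping knows what to look for
    return {"upload_url": url, "key": key}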

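The client PUTs the file body to that URL within five minutes, sending the same Content-Type header that was signed. Anyone who steals the URL after the expiry cannot use it; anyone who steals it before can write only to the one key you specified, with the content type you specified, which constrains the damage. The URL can be replayed until it expires, so keep the expiry short.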
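Why a stolen URL is useless after expiry, or for any other key, comes down to a signature over the request details. The sketch below is a deliberately simplified stand-in for the idea, not AWS Signature Version 4, which signs considerably more (host, headers, region and a date-scoped key). The secret, the storage.example host and the function names are all hypothetical.

```python
import hashlib
import hmac
import time

SECRET = b"server-side-secret"   # hypothetical; never sent to the client

def sign(method, key, expires_at, secret=SECRET):
    """Return a hex signature that binds the method, the key and the expiry."""
    message = f"{method}\n{key}\n{expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def make_url(method, key, ttl_seconds, now=None):
    """Build a toy presigned URL valid for ttl_seconds."""
    expires_at = int(now if now is not None else time.time()) + ttl_seconds
    signature = sign(method, key, expires_at)
    return f"https://storage.example/{key}?expires={expires_at}&sig={signature}"

def verify(method, key, expires_at, signature, now=None):
    """Accept the request only if it is unexpired and the signature matches."""
    current = int(now if now is not None else time.time())
    if current > expires_at:
        return False
    expected = sign(method, key, expires_at)
    return hmac.compare_digest(expected, signature)
```

Changing the method, the key or the expiry invalidates the signature, which is why the holder of the URL cannot widen what it permits.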

Lifecycle rules and versioning

Two policy features that pay for themselves: lifecycle rules and versioning.

Lifecycle rules are server-side scheduled actions on objects. The common ones: move objects in a prefix to a colder tier after 30 days, delete objects in a prefix after 365 days, abort incomplete multipart uploads after a week. You write the rule once; the provider runs it. The cost savings on a bucket with churn are real over a year.

Versioning keeps the previous content of an object when you overwrite or delete it. Turn it on at the bucket level. The advantage is that an accidental overwrite is recoverable; a ransomware attack that mass-deletes is recoverable. The cost is that you now pay for every previous version, which can multiply the storage bill if the bucket sees churn. Always pair versioning with a lifecycle rule that expires non-current versions after a fixed window (typically 30 to 90 days).
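Both ideas can be expressed as one lifecycle configuration. The sketch below uses the dictionary shape that boto3's put_bucket_lifecycle_configuration accepts on AWS; the rule IDs and the logs/ prefix are hypothetical, and STANDARD_IA is an AWS storage class name that other providers may not offer, so check what your provider supports before applying it.

```python
LIFECYCLE = {
    "Rules": [
        {
            "ID": "tier-then-expire-logs",          # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},          # hypothetical prefix
            "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
            "Expiration": {"Days": 365},
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        },
        {
            "ID": "expire-old-versions",            # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},               # empty prefix: whole bucket
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        },
    ]
}

def apply_lifecycle(s3, bucket, config=LIFECYCLE):
    """Send the rules to the bucket; s3 is a boto3 S3 client."""
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )
```

The second rule is the one that keeps a versioned bucket from growing without bound: replaced or deleted versions are removed 30 days after they stop being current.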

Using object storage for backups

Object storage is the right backup target for almost every small to medium operator. The advantages are durability, off-site by default, immutability via object lock, and a clean API that backup tools speak natively.

Three things to plan:

Encryption. Your backup tool should encrypt before upload. Server-side encryption is a useful second line, but the provider key is the wrong key to depend on alone.
Immutability. Object lock (also called WORM mode) means an object cannot be deleted before a retention date, even by your own credentials. This is the answer to ransomware that has obtained your bucket credentials.
Restore drills. A backup you have never restored is a hypothesis, not a backup. Once a quarter, restore a real file to a real environment and check it works. The bug is always in the path you have not exercised.
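One small, automatable piece of a restore drill is confirming that what came back is what went in. The sketch below compares SHA-256 digests of an original file and its restored copy; it assumes you still have the original to compare against, and the function names are illustrative. It checks integrity only, not whether the restored application actually starts.

```python
import hashlib

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def restore_matches(original_path, restored_path):
    """True when the restored file is byte-for-byte identical to the original."""
    return sha256_of(original_path) == sha256_of(restored_path)
```

Recording the digest at backup time, rather than recomputing it from the live file, lets the same check work after the original has changed or gone.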

The companion guide on backup strategy goes into the file-level and database-level details. The point here is that object storage is the right destination for the off-site copy.

Self-hosting MinIO

MinIO is an open-source object storage server that speaks the S3 API. You install it on your own hardware, point it at some disks, and you have a private S3-compatible service. The common reasons to do this:

Data sovereignty rules require storage on-premises or in a specific jurisdiction not served by your cloud provider.
Internal networks need an S3 endpoint that can be reached without going to the public internet.
The workload is large enough that the cloud bill exceeds what hardware plus power costs.

A minimal Docker install:

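# Generate the password first so you can record it; inline generation would hide it
MINIO_ROOT_PASSWORD="$(openssl rand -base64 32)"
echo "MinIO root password: ${MINIO_ROOT_PASSWORD}"
docker run -d --name minio \
  -p 9000:9000 -p 9001:9001 \
  -v /srv/minio/data:/data \
  -e MINIO_ROOT_USER=admin \
  -e MINIO_ROOT_PASSWORD="${MINIO_ROOT_PASSWORD}" \
  minio/minio server /data --console-address ":9001"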

That is a single-node install. A real production deployment is a multi-node cluster with erasure coding across at least four disks, fronted by a load balancer, with TLS termination and proper credential rotation. The operations cost is real. Self-hosting MinIO to save USD 50 a month is bad economics; doing it for petabyte-scale or compliance reasons is sensible.

Picking a provider for a specific job

Four shapes that come up most often:

A new SaaS storing user uploads served via a CDN. Cloudflare R2 with Cloudflare's CDN. Zero egress means the bill stays predictable as you grow.
Backups for a fleet of servers. Backblaze B2. One of the cheapest serious options per gigabyte, with the egress allowance you need for the once-a-quarter restore drill.
You are already on AWS, the rest of your stack lives there. AWS S3 with a CloudFront distribution in front. Pay the AWS premium; the integration savings are worth more than the storage delta.
EU data must stay in the EU. Hetzner Object Storage or Scaleway Object Storage. Both speak S3 and both are based in jurisdictions where the data residency is a clear default.

The one thing you should not do is paper over an undecided choice by writing your own storage layer on top of a filesystem. That path leads to a homebrew object storage with worse durability, no presigned URLs, and a custom API your future self will hate. The S3 API is a standard for a reason. Use it from day one, even if you start with a single bucket on the cheapest provider you can find.

Frequently asked questions

Is object storage the same as a CDN?

No. Object storage holds the files; a CDN caches them at the edge so users get them fast. They compose well: serve user uploads from R2 with the public bucket fronted by Cloudflare's CDN, or store assets in S3 with CloudFront in front. Object storage on its own is for durability and capacity; the CDN is for latency.

Is S3 still the right default in 2026?

S3 is the safe default when you are already on AWS and the rest of your infrastructure expects to talk to it. If you are choosing fresh, R2 or B2 with much lower egress is hard to argue against. Pick S3 because of the surrounding AWS services, not because of the storage itself.

Can I put a database in object storage?

You can store database backups there cleanly. Running a database on top of object storage as live storage is a different exercise and usually a bad idea. Object storage is high-latency and not designed for the read patterns a database performs. Specialised systems like ClickHouse and Iceberg do clever things on top of object storage, but they are clever on purpose.

Are uploaded files private by default?

Yes. Buckets and objects default to private. Public read access is opt-in. Make sure you understand the difference between a bucket policy, an ACL and a presigned URL before you flip the public bit, because the failure mode is famously embarrassing.

What is the smallest useful object?

Object storage is sized for files from kilobytes to terabytes. Storing millions of tiny objects is technically possible but expensive in metadata and per-request fees. If you have a workload that wants millions of small writes per second, look at a key-value store first; object storage is for the megabyte-and-up range.

Do all providers offer S3-compatible APIs?

Almost all. The compatibility level differs in edge cases. Standard PUT, GET, LIST, DELETE, multipart upload and presigned URLs work everywhere. Less universally supported features like SSE-C, conditional headers and S3 Select may or may not be present. Test the features you actually rely on against the provider you actually use.

What happens if my provider disappears tomorrow?

You move. Because the API is standard, the move is mostly running rclone sync from the source bucket to the destination bucket. Plan for it as a contingency; for any data you cannot lose, have a second copy on a different provider at all times. Object storage is durable per provider, but providers are still companies.
