S3

Setting up SQL Server S3 Object Storage Integration using MinIO with Docker Compose (Updated for SQL Server 2025)

Update for SQL Server 2025:
This post and the GitHub repo have been updated for SQL Server 2025 RC1 and Ubuntu 24.04.
New in SQL Server 2025: You no longer need to install the PolyBase service to interact with Parquet files in S3. Previously, with SQL Server 2022, you had to build a custom container or manually install PolyBase. Now, S3 object integration and Parquet support work out-of-the-box!


In this blog post, I’ve implemented two example environments for using SQL Server’s S3 object integration: one for backup and restore to S3-compatible object storage, and the other for data virtualization using PolyBase connectivity to S3-compatible object storage. The aim is to get you up and running with these features as quickly as possible. I implemented this in Docker Compose since it handles all the implementation and configuration steps for you. The complete code is available on my GitHub repo, and I walk you through the implementation in this post.

Setting up MinIO for SQL Server 2022 s3 Object Storage Integration

Introduction

In this post, I will walk you through how to set up MinIO so you can use it with SQL Server 2022’s s3 object integrations. Working with s3 and SQL Server requires a valid and trusted TLS certificate, which can be a pain for some users and environments. I’m writing this post so you can hit the ground running with this new feature set in SQL Server 2022. The certificate we’re working with here is self-signed. You could get a real certificate for your environment, and that’s encouraged, but this walk-through is meant to get you up and running fast so that you can test out SQL Server’s s3 object integrations. For our s3 compatible object storage, we’re using MinIO’s free GNU AGPL v3 edition running in a Docker container, alongside SQL Server 2022 CTP 2.0, which is also running in a container.

Backing up to s3 Compatible Object Storage with SQL Server

Introducing S3 in SQL Server 2022

S3 compatible object storage integration is a new feature introduced in SQL Server 2022. There are two significant areas where SQL Server leverages this: backup and restore, and data virtualization. This article will focus on getting started with using S3 compatible object storage for backups. Now let’s unpack that phrase ‘S3 compatible object storage’ a bit. Amazon Simple Storage Service (S3) is a storage service AWS provides in their cloud. That platform’s REST API is available for others, including my company, Pure Storage, to build their own s3 compatible object storage platforms. At Pure Storage, we have s3 available on FlashBlade, our scale-out file and object platform. This means you can take advantage of s3 object storage anywhere you like, outside AWS’s cloud infrastructure.
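To make the backup side a little more concrete, here is a minimal sketch in Python that builds the two T-SQL statements involved: a credential named after the bucket URL, then a BACKUP DATABASE ... TO URL against that same bucket. The endpoint, bucket, database name, and keys are placeholders I’ve made up for illustration, and the helper names are hypothetical. The statements are only printed, not executed.

```python
def credential_sql(bucket_url: str, access_key: str, secret_key: str) -> str:
    """Build a CREATE CREDENTIAL statement for an S3-compatible bucket.

    The credential is named after the bucket URL, and the secret takes the
    form 'accesskey:secretkey'. Values are interpolated without escaping,
    so treat this as an illustration rather than production code.
    """
    return (
        f"CREATE CREDENTIAL [{bucket_url}]\n"
        f"WITH IDENTITY = 'S3 Access Key',\n"
        f"     SECRET = '{access_key}:{secret_key}';"
    )


def backup_sql(database: str, bucket_url: str) -> str:
    """Build a BACKUP DATABASE ... TO URL statement targeting the bucket."""
    return (
        f"BACKUP DATABASE [{database}]\n"
        f"TO URL = '{bucket_url}/{database}.bak'\n"
        f"WITH FORMAT, COMPRESSION;"
    )


if __name__ == "__main__":
    # Placeholder endpoint and bucket (assumptions, not from the post).
    url = "s3://s3.example.com:9000/sqlbackups"
    print(credential_sql(url, "EXAMPLE_ACCESS_KEY", "EXAMPLE_SECRET_KEY"))
    print(backup_sql("TestDB1", url))
```

The credential name and the backup URL share the same s3:// prefix, which is how SQL Server matches a backup target to the credential it should use.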

s5cmd Authentication Using Environment Variables

At work, I get to work with some fantastic tech that pushes the boundaries of performance. I needed to do some performance testing from a Windows server into a FlashBlade using s3. I reached out to a colleague of mine, Joshua Robinson, who told me about s5cmd. s5cmd is a very fast, parallel s3 compatible command-line client.
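As a rough sketch of what environment-variable authentication looks like, the Python snippet below puts the access key and secret key into the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY variables and points s5cmd at a non-AWS endpoint with --endpoint-url. The endpoint and keys are placeholders I’ve made up, and s5cmd_invocation is a hypothetical helper name. The command is only printed here, not run.

```python
import os


def s5cmd_invocation(endpoint_url, access_key, secret_key, *args):
    """Return (argv, env) for running s5cmd with credentials in env vars."""
    # Copy the current environment so the parent process is left untouched.
    env = dict(os.environ)
    env["AWS_ACCESS_KEY_ID"] = access_key
    env["AWS_SECRET_ACCESS_KEY"] = secret_key
    # --endpoint-url is a global flag, so it goes before the subcommand.
    argv = ["s5cmd", "--endpoint-url", endpoint_url, *args]
    return argv, env


if __name__ == "__main__":
    # Placeholder endpoint and keys (assumptions, not from the post).
    argv, env = s5cmd_invocation(
        "https://flashblade.example.com",
        "EXAMPLE_ACCESS_KEY",
        "EXAMPLE_SECRET_KEY",
        "ls",
    )
    print(" ".join(argv))
```

To actually run the command, pass both values to subprocess.run(argv, env=env) on a machine where s5cmd is installed.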

Check out Joshua’s post for some performance numbers. Here’s a direct quote from his post.