Posts

Update (February 2, 2026): Microsoft has fixed this issue in SQL Server 2025 CU1. The container now runs successfully on Docker Desktop for macOS without needing OrbStack. See my follow-up post for details.

SQL Server 2025 RTM is here, and if you’re running Docker on macOS Tahoe 26, you might have hit a wall trying to get it running. Here’s what happened when I tried spinning up the latest container image and how I worked around it.

Introduction

When you’re running MongoDB at scale with data distributed across multiple Pure Storage FlashArrays, achieving truly consistent backups becomes a critical and interesting technical challenge. In this post, I’m walking through an automated snapshot and recovery solution for a sharded MongoDB cluster running across two separate FlashArrays. While this demonstration uses two nodes for clarity, the same approach scales to N nodes across N arrays. The coordination mechanism remains identical regardless of cluster size.

In my previous post, I showed you how to build a snapshot backup catalog using SQL Server 2025’s new native REST API integration. But what if you’re still running SQL Server 2022? Should you miss out on this powerful capability that SQL Server and Pure Storage provide? Absolutely not.

In this post, I’m going to show you how to build the same snapshot backup catalog functionality using PowerShell with SQL Server 2022’s T-SQL Snapshot Backup feature. While we don’t have the native REST integration yet, we can still leverage the power of FlashArray’s Protection Group Tags to build a queryable snapshot catalog that bridges the gap between database administration and storage management.

Update for SQL Server 2025:
This post and the GitHub repo have been updated for SQL Server 2025 RC1 and Ubuntu 24.04.
New in SQL Server 2025: You no longer need to install the PolyBase service to interact with Parquet files in S3. Previously, with SQL Server 2022, you had to build a custom container or manually install PolyBase. Now, S3 object integration and Parquet support work out of the box!

In this blog post, I’ve implemented two example environments for using SQL Server’s S3 object integration: one for backup and restore to S3-compatible object storage, and the other for data virtualization using PolyBase connectivity to S3-compatible object storage. This work aims to get you up and running with these new features as quickly as possible. I implemented it in Docker Compose since that handles all the implementation and configuration steps for you. The complete code is available on my GitHub repo, and I’m walking you through the implementation here in this post.

SQL Server 2025 introduces native support for vector data types and external AI models. This opens up new scenarios for semantic search and AI-driven experiences directly in the database. But as with any external service integration, performance and scalability are immediate concerns, especially when generating embeddings at scale.

https://github.com/nocentino/ollama-lb-sql

Problem: Bottlenecks in Embedding Generation

When you call out to an external embedding service from T-SQL via REST over HTTPS, you’re limited by the throughput of that backend. If you’re running a single Ollama instance, you’ll quickly hit a ceiling on how fast you can generate embeddings, especially for large datasets. I recently discussed this topic at an event. My first attempt at generating embeddings was for a three-million-row table, and I had access to some world-class hardware to do it. When I arrived at the lab and kicked off embedding generation for this dataset, I quickly realized it would take approximately 9 days to complete. On closer examination, I found that I wasn’t using the GPUs to their full potential; in fact, I was only using about 15% of one GPU’s capacity. So I started to cook up this concept in my head, and here we are: load balancing embedding generation across multiple Ollama instances to make fuller use of the available resources.
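To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It is not code from the repo: the function names and backend names are hypothetical, and it assumes throughput scales linearly with the number of Ollama instances, which is an idealization. It shows the throughput implied by three million rows in 9 days, the best-case runtime with more backends, and a simple round-robin assignment of batches to backends.

```python
from itertools import cycle

SECONDS_PER_DAY = 86_400


def rows_per_second(total_rows, days):
    """Average throughput implied by finishing total_rows in the given days."""
    return total_rows / (days * SECONDS_PER_DAY)


def estimated_days(total_rows, per_backend_rate, backends):
    """Best-case runtime in days.

    Assumption: every backend sustains per_backend_rate and throughput
    scales linearly with the number of backends.
    """
    return total_rows / (per_backend_rate * backends) / SECONDS_PER_DAY


def assign_round_robin(batch_ids, backends):
    """Map each batch to a backend in strict rotation."""
    return {batch: backend for batch, backend in zip(batch_ids, cycle(backends))}


if __name__ == "__main__":
    rate = rows_per_second(3_000_000, 9)
    print(f"single instance: {rate:.2f} rows/sec")
    print(f"four instances:  {estimated_days(3_000_000, rate, 4):.2f} days")
    # Hypothetical backend names, for illustration only.
    print(assign_round_robin(range(5), ["ollama-1", "ollama-2"]))
```

A single instance at that pace works out to roughly 3.9 rows per second, and four equally fast instances would bring the job down to about 2.25 days under the linear-scaling assumption. Real gains depend on how well each instance uses its GPU, so treat this as an upper bound on the benefit, not a prediction.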

I’m excited to announce the release of a new open-source project that fully automates HammerDB benchmarking for SQL Server using Docker. If you’ve ever needed to run TPC-C or TPC-H benchmarks multiple times, you know how time-consuming the manual setup can be. This project removes the hassle and gets you up and running with a single command: ./loadtest.sh.

Why I Built This

In my work, I frequently benchmark SQL Server configurations, whether I’m comparing versions, testing new hardware, or validating performance tuning changes. Setting up HammerDB manually each time became a significant time bottleneck (see what I did there! ;). I needed an automated solution that would work consistently across different environments and reduce the time required to get test results.

In modern IT environments, not all workloads require the same level of storage performance, protection, or cost. Some applications need high performance with aggressive data protection, while others are perfectly fine with lower performance in exchange for cost savings. This tiered approach to storage service delivery is fundamental to efficient infrastructure management.

In my previous post on Fusion, I took an application-centric approach, showing how to deploy SQL Servers using Fusion. Let’s switch gears now and learn how to define a storage service catalog. In this post, I’ll demonstrate how to build a complete storage service catalog using Pure Storage Fusion Presets, offering Bronze, Silver, and Gold tiers with optional replication. We’ll see how to leverage different array types (FlashArray //X and FlashArray //C) to optimize both performance and cost across your fleet.

Ollama SQL FastStart streamlines the deployment of SQL Server 2025 with integrated AI capabilities through a comprehensive Docker-based solution. This project delivers a production-ready environment combining SQL Server 2025, Ollama’s large language model services, and NGINX with full SSL support, all preconfigured to work together seamlessly.

I built this project to eliminate the complex configuration hurdles that typically slow down AI integration projects. Whether you’re a database professional wanting to explore SQL Server 2025’s new vector search capabilities or a developer looking to build AI-powered applications on familiar infrastructure, this solution provides everything you need in a single docker-compose file. The entire stack, including the complex certificate trust chain between SQL Server and the Ollama API, is automatically configured, allowing you to focus on building data-driven AI applications rather than infrastructure setup.

When managing storage infrastructure at scale, one of the most powerful approaches is treating related storage resources as cohesive Workloads rather than individual components. This becomes especially important when dealing with applications like SQL Server that have specific storage patterns and requirements and are often deployed at scale in a datacenter or cloud.

In this post, I’ll walk through a complete workflow for creating and managing application-specific storage Workloads using Pure Storage’s Fusion Fleet capability with PowerShell. We’ll see how we can define storage templates, called Presets, once and deploy them consistently across our entire Fleet of storage arrays.

I am honored to announce that I have been renewed as a Microsoft MVP for the ninth consecutive year, recognized in the Azure SQL and SQL Server technical areas under Data Platform. Thanks for this incredible journey that began in 2017.

Thank You

I want to thank Microsoft for this continued recognition. The MVP program has provided me with numerous opportunities to connect with brilliant minds worldwide, gain early access to cutting-edge technologies, and collaborate with product teams and engineering teams that contribute to the evolution of the data platforms we all rely on to maintain our customers’ most critical asset, data.

Posts

Getting SQL Server 2025 RTM Running in Containers on macOS

App-Consistent MongoDB Snapshots Across Multiple Pure Storage FlashArrays

Build a Snapshot Backup Catalog in Pure Storage - The SQL Server 2022 Edition

Setting up SQL Server S3 Object Storage Integration using MinIO with Docker Compose (Updated for SQL Server 2025)

Scaling SQL Server 2025 Vector Search with Load-Balanced Ollama Embeddings

Automated SQL Server Benchmarking with HammerDB and Docker: A Complete Testing Framework

Managing Enterprise Storage with Pure Storage Fusion in PowerShell - Building Storage Tiers

Getting Started with Vector Search in SQL Server 2025 Using Ollama

Managing Enterprise Storage with Pure Storage Fusion in PowerShell

Microsoft MVP 2025: Continuing the Data Platform Journey
