Best Practices for SQL Server on VMware - Distilled

Page content

The intent of this post is a quick reference guide based on the recommendation made in “Architecting Microsoft SQL Server on VMware vSphere” April 2019 version. The target audience for this blog post is for SQL Server DBAs introducing them to the most impactful configurations and settings for running SQL Server in VMware.

For the explanations for each of these settings and how to configure the base VMware infrastructure, please read the “Architecting Microsoft SQL Server on VMware vSphere” guide and consult with your VMware administrators and experts.

Click here to watch a presentation of this content from SQLBits 2023

Physical Host Configuration

Due to the workload patterns in SQL Server, it is generally better to have a dedicated cluster for your SQL Server workloads. If you cannot have a dedicated cluster. At a minimum, you will want dedicated hosts and isolate VMs running SQL Server onto these hosts and ensure your other workload is not running on these hosts.

  • Enable Turbo Boost
  • Enable Hyper-threading
  • NUMA Enabled
  • Enable VT-x/VT-d
  • Power Management set to OS Controlled
  • Disable all processor C-states (C1E halt state)

ESX Configuration

  • Power Management set to High Performance
  • DRS - Watch out for DRS which can VMotion your workloads based its resource usage. Ensure that you work with your VMware admins, so they understand your workload patterns and don’t move your VMs around between hosts if they start running hot.

VM Configuration

Create a Virtual Machine Template using the following settings. Then when needed, you can provision your virtual machines from the template. At this time, you can set the CPU, Memory, and Disk configurations required for the workload.

  • vCPU Allocation
    • VMware takes the stance of deploy and measure, initially exceeding the number of pCPUs available.
    • Best practices are defined as follows
      • High-performance systems – total vCPU allocation not to exceed total pCPUs available in the cluster
      • Lower tier systems – it’s ok to over-allocate vCPUs to pCPUs, monitor for workload impact.
  • Cores per socket - on VMware 6.0+ VMware internally configures the appropriate NUMA configuration. You can assert control over this with an advanced configuration if needed.
    • In VMware 6.5+ Cores Per Socket is ignored when creating a vNUMA topology for a VM.
    • Rule of thumb - match the underlying hardware topology. For example, if you’re using 6 core machines, don’t create all your VMs with 4 and 8 cores.
    • For SQL Server Standard edition can only use the lesser of 4 sockets and 24 cores and is not NUMA aware. You can use this setting to enforce that configuration.
    • Once online, check the NUMA topology in SQL Server in Server Property
  • CPU Hot Plug - Disabled
    • This can have a significant performance impact. Some report as much as 30% CPU costs if enabled as it disables vNUMA.
  • Memory Configuration
    • Avoid over-allocating memory, exceeding the capacity of the hosts
    • Consider setting memory reservations for critical workloads to prevent ballooning.
  • Disk Configuration
    • Add 4 PVSCSI disk adapters to the VM
    • Create and 100GB virtual attached to a PVSCSI adapter for the OS
    • As for which type of disk to use? VMDK, VMware Virtual Volume, or Raw Device Mapping? You should talk with your storage administrators and storage vendors about the capabilities available in your environment. Also, check out Cody’s post Comparing VVols to VMDKs and RDMs, which dives deep into the topic and compares and contrasts each type of disk.
      • If you are using VMDK
        • Provision them Eager Zeroed Thick
        • Take into account VMDK and Datastore size limits
          • For high-end workloads, use a 1:1 mapping of Data file to VMDK to Datastore
      • Create and attach four additional 10GB Volumes.
        • Dedicate disks to your types of data, database files, transaction log files, system databases, and TempDB. If you’re doing backups to a local disk, add an additional dedicated disk for that too.
        • Attach one disk to each PVSCSI adapter. You will have to double up one virtual disk on the PVSCI adapter with the OS virtual disk. The OS should not produce too much IO during normal operations that this is significant.
          • Example Virtual Disk to PVSCSI layout for an OLTP workloads
          • PVSCI Adapter Virtual Disk Workload
            0 Operating System
            System Databases
            Optionally Database Backups
            1 Data files - mdf/ndf
            2 Transaction log files - ldf
            3 TempDB - both data and log
        • If more disks are needed, spread the workload out evenly across the PVSCSI adapters based on the IO demands of the database files that will be on those disks.
          • Monitor file latency, IOs and queuing over time to measure and if there is disk contention consider rebalancing the workload across the 4 PVSCSI adapators.
            • See my post here for how to measure latency and IOs.
            • See Cody Hosterman’s Post here for how to measure queuing.
        • At provisioning time, scale the disk size to that needed for that VM’s workload.

Windows Configuration

  • Power Plan - High Performance
  • Page file size - configure as a fixed 8GB allocation
  • Windows Updates - before installation of SQL Server, ensure all updates are applied
  • Instant file initialization - Configure Local Security Policy to “Perform Volume Management” for the SQL Server Service account
  • Lock pages in memory - Configure Local Security Policy to “Lock Pages In Memory” for the SQL Server Service account
  • Volume format - format all SQL Server volumes with 64KB NTFS Allocation Units
  • Block Alignment - Partitions have been volume aligned by default since Windows 2008. But it is still worth checking to ensure it’s configured properly. See this guide Windows File Systems for FlashArray for more details
  • SQL Server folder structure - create the following folder structure for SQL Server. This structure is helpful with the granularity of snapshots. Additional volumes exposed as mount points can be attached if needed.
    • D:\DATA
    • L:\LOGS
    • S:\SYSTEM
    • T:\TempDB
  • Anti-Virus - ensure your AV system excludes MDF, NDF, LDF, SDF and BAK

SQL Server Instance Configuration

  • Max Memory - tier 1 workloads should have MIN and MAX values
  • MAXDOP <= Maximum Number of Cores in a NUMA Node or <= 8
  • CTFP = 50 (start at 50 then measure again once you have workload running)
  • Enable Optimize for Ad-hoc Workloads

Database Maintenance and Configuration

  • Backups - Yes, please have backups. And try to stagger them throughout your backup window if possible. You can use a layered approached combining array based snapshots and SQL Server native backups.
  • Index and Statistics Maintenance - Also consider staggering maintenance jobs and using an intelligent maintenance framework like Ola Hallengren’s
  • Data File Layout - For large and demanding databases, spread data out into file groups and data files that map back to the disk topology above. Add additional disks if needed for the workload.
  • TempDB - Create one TempDB file per core up to 8 cores.
  • Monitoring - have a SQL Server specific monitoring strategy to capture workload and system metrics. Check out Pure1 VM Analytics for free.
  • Enable Query Store - This can be your flight data recorder when you need to find performance issues. Consider looking at the Channel Advisor QDS Toolbox for an effective way to extract performance data.
  • Leverage community tools - such as sp_whoisactive for monitoring and dbatools for automation
  • Invest in Automation and Configuration Management - Check out this session I did at PASS Summit 2020 on Deploying and Managing SQL Server with dbatools!

Queue Depth Settings

  • On Pure Storage, most customers can benefit from changing from the default of 32 to 64.
  • See Cody Hosterman’s Post here for more details on configuring queue depth in VMware.