SQL Server on Linux – How I think they did it!

Page content

OK, so everyone wants to know how Microsoft did it…how they got SQL Server running on Linux. In this article, I’m going to try to figure out how.

Update: Since the publication of this post, Microsoft has published a blog post detailing the implementation here

There’s a couple of approaches they could take…a direct port or some abstraction layer…A direct port would have been hard, basically any OS interaction would have had to been looked at and that would have been time consuming and risk prone. Who comes along to save the day? Abstraction. The word you hear about a million times when you take Operating Systems classes in undergrad and grad computer science courses. :) 

Well things are finally starting to come to light on how it was done. I had a Twitter conversation this weekend with Slava Oks, who is a leader on the project team and several other very active people in the SQL Community Klaus AschenbrennerEwald Cress, and Lonny Niederstadt. This got my gears turning…to find out…how they did it!

What do we know so far?

So here’s what we know, there’s some level of abstraction going on using a layer called SQL Platform Abstraction Layer (SQLPAL) and also some directly ported code via SQLOSv2. From a design standpoint this it a pretty good compromise. Check out Figure 1, here you can see SQLPAL sits between the Database Engine and the underlying operating system. Whichever one it may be, Windows, Linux and oh yeah “other OS in Future” :)

SQLServer on Linux

Figure 1 – SQL Server on Linux – source @SQLRockstar

Background information

So to understand how we got here, it’s worth looking at the Drawbridge project from Microsoft Research. Drawbridge is basically application, or more specifically, process virtualization with a contained OS inside that process space. This is called a picoprocess. Since the process is abstracted away from the underlying operating system, the process will need some part of an OS inside its address space. This is called the Library OS. With that abstracted away…each process has a consistent view of it’s own operating environment. In figure 2, you can see the Library OS and it’s roots into ntoskrnl.dll, which is an NT user-mode kernel. This provides a consistent OS interface for the application. Essentially program code doesn’t need to change.

Now it’s up to the picoprocess as a whole to provide some abstraction back to the actual operating system and that’s where the Platform Abstraction Layer (PAL) comes in. All that’s left is to provide an application binary interface for the picoprocess and you have a completely self-contained process without the need to interact directly the host operating system. This is amazing stuff!

Drawbridge

Figure 2 – Drawbridge Architecture – Source MS Research

 

SQLPAL – SQL Server Platform Abstraction Layer

So, I wanted to see this in action. In the Windows world, hard core SQL people are familiar with attaching a debugger to a SQL process and loading debug symbols to get a view into what’s going on inside of SQL Server. Well in Linux, we can do the same, and it’s a LOT easier. On Linux, there’s a tool called strace, which will give you a view into your programs execution and any interactions it has with the OS. So I launched SQL Server and strace and here’s what I found.

So to launch strace and SQL Server, we add the SQL Server binary as a parameter to strace. Caution, do not do this as root as it may cause a permission issue with log files generated by the sqlservr process. Use sudo to change to the msssql user.

[mssql@rhel1 ~]$ strace /opt/mssql/bin/sqlservr


The first thing you’ll see is a call to execve, which is a LINUX system call to start a new process. A regular old Linux process. So that means that sqlservr is a program binary compiled for Linux.

execve(“/opt/mssql/bin/sqlservr”, [“/opt/mssql/bin/sqlservr”], [/* 24 vars */]) = 0


At this point we see it loading all the local natively compiled libraries required for the program. Here’s one example, open is a system call to open a file, subsequent reads will occur when needed. There are many more libraries loaded.

open(“/lib64/libstdc++.so.6”, O_RDONLY|O_CLOEXEC) = 3


Now, we see something interesting, a load of a library called libc++abi.so.1. This file is in the /opt/mssql/lib/ directory and is shipped in the SQL Server package. This is a C++ standard library.

open(“/opt/mssql/bin/../lib/libc++abi.so.1”, O_RDONLY|O_CLOEXEC) = 3

<div>
  <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
Now we see a transition into Drawbridge like functionality, with the system.sfp open. This looks like it’s responsible for setting up the OS like substrate for the application’s execution environment. 

open(“/opt/mssql/lib/system.sfp”, O_RDONLY) = 3

<div>
  <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>

<div>
  <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">During the load of system.sfp, we see several libraries, registry and DLL loads that look like they’re responsible for setting up the kernel level abstraction.</span>
</div>

<div>
  <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>

<div>
  <p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
    <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">pread(3, &#8220;Win8.dbmanifest\0&#8221;, 16, 4704) = 16</span>
  </p>
  
  <div>
    <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
  </div>
  
  <div>
    <span style="font-size: 12px;">Reading in the registry? Man that’s never going away :)</span>
  </div>
  
  <div>
     
  </div>
  
  <div>
    <p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
      <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">pread(3, &#8220;windows.hiv\0&#8221;, 12, 4753)     = 12</span>
    </p>
  </div>
  
  <div>
    <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
  </div>
  
  <div>
    <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">Reading in the NtOsKrn.dll, the NT user-mode kernel</span>
  </div>
  
  <div>
    <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
  </div>
  
  <div>
    <p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
      <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">pread(3, &#8220;NtOsKrnl.dll\0&#8221;, 13, 5123)    = 13</span>
    </p>
    
    <div>
      <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
    </div>
  </div>
</div>
Next SFP we see load is system.common.sfp. This looks to be a second stage boot process, perhaps Drawbridge’s library OS? 

open(“/opt/mssql/lib/system.common.sfp”, O_RDONLY) = 4


During this phase we see many other DLLs loading. Looks like we’re setting up an environment…here’s an example of something loaded at this time. Clearly higher level OS provided functionality.
 

pread(4, “kerberos.dll\0”, 13, 15055)   = 13

 
After a few more SFP files are opened for certificates and NetFX4, and then we end up at sqlservr.sfp. And inside here, it loads things familiar to deep dive SQL Server pros…first we see the program binary load sqlservr.exe, SqlDK.dll, sqllang.dll, SQLOS.dll, and sqlmin.dll. I omitted some output for readability.

open(“/opt/mssql/lib/sqlservr.sfp”, O_RDONLY) = 7

…omitted

pread(7, “sqlservr.exe\0”, 13, 13398)   = 13

…omitted

pread(7, “SqlDK.dll\0”, 10, 14079)      = 10

…omitted

pread(7, “sqllang.dll\0″, 12, 14382)    = 12

…omitted

pread(7, “SQLOS.dll\0”, 10, 14418)      = 10

…omitted

pread(7, “sqlmin.dll\0”, 11, 14511)     = 11


And finally, we end up with application output, something we’ve all seen…SQL Server starting up.

nanosleep({999999999, 0}, 2016-11-17 14:11:37.53 Server      Microsoft SQL Server vNext (CTP1) – 14.0.1.246 (X64) 

<p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
  <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">Nov  1 2016 23:24:39 </span>
</p>

<p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
  <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">Copyright (c) Microsoft Corporation</span>
</p>

<p style="margin: 0px; font-size: 18px; line-height: normal; font-family: 'Andale Mono'; color: #28fe14; background-color: rgba(0, 0, 0, 0.901961);">
  <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;">on Linux (Red Hat Enterprise Linux)</span>
</p>

<div>
  <span style="font-variant-ligatures: no-common-ligatures; font-size: 12px;"><br /></span>
</div>
Oh, and now it makes much more sense why SQL Server on Linux is using Windows like file pathing inside the application, right? Well, think it through, SQL Server is interacting with an operating system that it thinks is still Windows, via the platform abstraction layer.
 

2016-11-17 14:11:37.53 Server      Logging SQL Server messages in file ‘C:\var\opt\mssql\log\errorlog’.

<h2>
  <span style="font-size: 12px;"><strong>SQLOSv2</strong></span>
</h2>

So in that Twitter conversation I had with Slava and others, we learned it’s not straight PAL, but a SQL Server specific PAL. This allows the product team to provide another path to the underlying OS for performance sensitive code. Look back at figure 1, you’ll see two paths from SQL Sever into SQLPAL. One uses the Win32 APIs, likely provided by Drawbridge (or some variant), and the other is perhaps natively compiled code…really that’s just a guess on my part. 

Final thoughts

All, this is a pretty awesome time we’re getting into…Microsoft embracing Linux, SQL on Linux, PowerShell on Linux. I’ve said this many times…Windows, Linux…it’s just an OS. I would like to thank Slava for his insight and also the product team for a fantastic preview release. It’s amazing how seamless this really is.

In a sidebar conversation with Ewald, he made the point that as SQL Server professionals that our investment in the understanding of SQL Server’s internals will persist with this implementation. Which I think is a huge relief for those that have invested years into understanding it’s internals! 

Please leave some comments on what your thoughts are on how this works. If you want to contact me directly, you can reach me at aen@centinosystems.com or @nocentino

 

Disclaimer

Well, if you made it this far…awesome! I want you to know, I don’t have any inside knowledge of how this was developed. I basically sat down and traced the code with the techniques I showed here.  

References 

https://www.microsoft.com/en-us/research/project/drawbridge/

https://blogs.msdn.microsoft.com/wsl/2016/05/23/pico-process-overview/