Nuxi The CloudABI Development Blog

CloudABI for Mac OS X, part one: Position Independent Executables

April 21, 2016 by Ed Schouten

Given that a lot of interesting things are happening with CloudABI’s development lately, I thought I’d finally set up a blog. That way I can publish some articles once every so often, some of them being very technical, others hopefully a bit less. Today I’m starting off with the first of a three-part series on how we managed to port CloudABI over to Mac OS X. Enjoy!

An introduction to CloudABI

CloudABI allows you to develop applications that can be run without modifications on a variety of UNIX-like operating systems. What makes CloudABI unique compared to, say, your average program running on Linux or BSD is that CloudABI makes exclusive use of capability-based security. In a nutshell, it means that the rights of a process are purely determined by the file descriptors it possesses. This makes it possible to easily develop applications that are very strongly sandboxed. If an attacker manages to take over execution of your program, they will only be able to interact with the small number of resources that your program happened to own at the time, rather than compromise the entire system. The use of capability-based security also has some nice advantages in the domain of testability and maintainability. Be sure to check out a talk I gave at 32C3 if you’re interested in a more complete introduction.

CloudABI consists of a couple of separate components. First of all, there is the CloudABI specification that formally describes the system calls that a CloudABI program can use to communicate with the operating system. On top of this there is the CloudABI C library, which implements many standard C/POSIX programming interfaces. Finally, there is the CloudABI Ports Collection where we’ve ported a number of existing Open Source libraries and applications over to CloudABI and packaged them into several formats (Arch Linux, Cygwin, Debian, FreeBSD, Homebrew). This allows you to develop applications for CloudABI in a uniform way, regardless of the operating system you’re using.

To run CloudABI executables, you’ll obviously need to use an operating system that supports loading them. For example, support has already been integrated into FreeBSD 11 and we’ve also developed a patchset for the Linux kernel.

What about Mac OS X?

As there are quite a lot of engineers out there who use Apple hardware on a day-to-day basis, we think that being able to use CloudABI on Mac OS X makes a lot of sense. The only problem is that although the kernel used by Mac OS X is Open Source Software, expecting our users to replace the kernel on their shiny MacBooks makes little sense.

This is why we’ve developed a light-weight emulator that can map CloudABI executables inside the virtual memory of a Mac OS X process and start executing them directly, forwarding any system calls on behalf of the CloudABI application to Mac OS X. Though this emulator does not provide any of the security guarantees that CloudABI normally offers, it’s an elegant tool for developing and testing CloudABI software on Mac OS X.

In this article I’m going to look at one specific aspect of how the emulation works from a technical point of view, namely why our emulator requires that we use Position Independent Executables (PIE). In the next articles I will talk about our use of virtual Dynamic Shared Objects (vDSO) and how Thread Local Storage (TLS) has been set up to work efficiently.

What are Position Independent Executables?

Something that typically needs little thought when writing code in languages like C and C++ is that the resulting executables often hardcode the memory addresses of other locations in the program. For example, consider the following function that converts the abbreviated name of a month to a number between 1 and 12:

#include <string.h>

const char *const months[12] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
};

int lookup_month(const char *month) {
  for (int i = 0; i < 12; ++i)
    if (strcmp(months[i], month) == 0)
      return i + 1;
  return -1;
}

In this small piece of code we already need to hardcode memory addresses in 14 locations. First of all, our lookup_month() function needs to know both the address at which the strcmp() function starts and where the months[] array is placed. Second, the months[] array also needs to be initialized to point to all of the individual strings stored within.

For every memory address that a compiler needs to hardcode, it generates a relocation entry that is written into the object file’s headers. Let’s take a look at what these relocation entries look like by running the readelf command:

$ x86_64-unknown-cloudabi-cc -o months.o -c months.c
$ x86_64-unknown-cloudabi-readelf -r months.o
Relocation section '.rela.text.lookup_month' at offset 0x238 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000020  000500000009 R_X86_64_GOTPCREL 0000000000000000 months - 4
000000000031  000600000004 R_X86_64_PLT32    0000000000000000 strcmp - 4
Relocation section '.rela.data.rel.ro.months' at offset 0x268 contains 12 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000000  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 0
000000000008  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 4
000000000010  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 8
000000000018  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + c
000000000020  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 10
000000000028  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 14
000000000030  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 18
000000000038  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 1c
000000000040  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 20
000000000048  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 24
000000000050  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 28
000000000058  000300000001 R_X86_64_64       0000000000000000 .rodata.str1.1 + 2c
...

The first two relocations, R_X86_64_GOTPCREL and R_X86_64_PLT32, are used to patch up the lookup_month() function to refer to months[] and strcmp() through the Global Offset Table (GOT) and the Procedure Linkage Table (PLT). The R_X86_64_64 relocations are used to initialize the months[] array with 64-bit pointers to the strings. Note that these relocations could have been avoided by declaring months[] as a two-dimensional array, containing twelve four-byte strings.
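To illustrate that last remark, here is a minimal sketch of the two-dimensional variant. The array now stores the characters themselves instead of pointers to them, so its initializer no longer contains any addresses. The static_assert is only there to make the size explicit; it is not part of the original example.

```c
#include <assert.h>
#include <string.h>

/* Twelve four-byte strings stored inline: three letters plus a
 * terminating NUL each. No pointers are stored, so the initializer
 * needs no R_X86_64_64 relocations. */
const char months[12][4] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec",
};

/* The whole table is 48 bytes of character data. */
static_assert(sizeof(months) == 12 * 4, "months[] should hold only characters");

int lookup_month(const char *month) {
  for (int i = 0; i < 12; ++i)
    if (strcmp(months[i], month) == 0)
      return i + 1;
  return -1;
}
```

The function body is unchanged; only the declaration of months[] differs. The references from lookup_month() to months[] and strcmp() still need their own relocations in the object file.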

When the linker is used to generate an executable consisting of one or more object files, it comes up with a fixed memory layout at which all of the functions and variables are placed. This allows the linker to apply and eliminate all of these relocations. Well, almost all of them. When linking an application dynamically, relocations referring to symbols that are provided by external libraries (such as libc’s strcmp()) are retained. These are then processed by the run-time linker (ld-linux.so on Linux) on process startup.

This now brings us to the answer to our question. A Position Independent Executable is nothing more than an executable for which none of the relocations for hardcoding absolute memory addresses have been eliminated by the linker; they are all still stored in the headers of the executable. This makes it possible to load the executable at any offset in memory, as long as all of its relocations are applied on startup.

In the Linux and BSD ecosystem, Position Independent Executables have become popular as they are a requirement for applying full address space layout randomization (ASLR). For running CloudABI executables on Mac OS X, we need to make use of this technique for a different reason. If CloudABI executables had to be loaded at a fixed memory address, the memory regions used by the executable could overlap with the regions at which Mac OS X decided to load the emulator. PIE solves this by making it possible for the emulator to load the executable at virtually any offset, simply by performing a couple of mmap() calls.
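As a rough sketch of what "a couple of mmap() calls" could look like (this is not the emulator's actual code): first reserve a range of address space wherever the kernel has room, then place each segment at its offset inside that range. The helper names reserve_image() and place_segment() are hypothetical, and a real loader would map the segments from the executable file rather than copy from a buffer.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc; ignored elsewhere */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON /* older BSD-style spelling of the flag */
#endif

/* Hypothetical helper: reserves `size` bytes of inaccessible address
 * space at an address chosen by the kernel. Returns NULL on failure. */
void *reserve_image(size_t size) {
  void *base = mmap(NULL, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return base == MAP_FAILED ? NULL : base;
}

/* Hypothetical helper: maps writable memory at `offset` within the
 * reservation and copies `len` bytes of segment data into it. The
 * offset must be a multiple of the page size. Returns NULL on failure. */
void *place_segment(void *base, size_t offset, const void *data, size_t len) {
  size_t page = (size_t)sysconf(_SC_PAGESIZE);
  assert(offset % page == 0);
  /* Round the mapping length up to a whole number of pages. */
  size_t maplen = (len + page - 1) / page * page;
  void *seg = mmap((char *)base + offset, maplen, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
  if (seg == MAP_FAILED)
    return NULL;
  memcpy(seg, data, len);
  return seg;
}
```

Because the base address comes from the kernel, it cannot collide with memory the host process already uses. MAP_FIXED is only applied inside the range that was reserved first.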

Position Independent Executables versus LLVM’s new linker

One of the things that got us excited earlier this year is the new linker the LLVM developers are working on, simply called LLD. Not only is it very fast, like Clang it has the advantage over the GNU tools that by default a single installation provides support for all hardware architectures. This is why we switched over to LLD for package builds when LLVM 3.8 was released.

Though LLD 3.8 has worked pretty well for us thus far, it turned out that its support for PIE still had a number of small bugs in it. These bugs caused the linker to either generate relocations in places it shouldn’t, or to not generate them in places it should. We managed to fix all of these bugs in the meantime (commits: #1, #2, #3), so LLVM 3.9 will be the first release to ship with a linker that does PIE properly, both for x86-64 and ARM64. At the same time, we’ve adjusted Clang’s frontend to enable the use of PIE by default.

The cloudabi-toolchain Homebrew package for Mac OS X currently installs the latest development snapshot of LLVM, but will of course be switched over to LLVM 3.9 once released.

Applying the relocations on startup

The next piece of the puzzle of getting PIE to work is to extend the program startup process. Immediately after starting up, CloudABI executables now need to apply their own relocations to their memory image. This entire process needs to be implemented in such a way that it doesn’t depend on any code that requires relocations itself, of course.

Though this may sound as hard as replacing your tires while driving, modern CPU architectures like x86-64 and ARM64 have some features that make this relatively easy, such as PC-relative addressing (known as RIP-relative addressing on x86-64). With PC-relative addressing, global variables, other functions and jump targets may all be addressed not by using their absolute memory addresses, but by using addresses relative to the program counter of the CPU. You’ll therefore see that on those architectures relocations are only needed to patch up global variables and constants, but not machine code. This means that you can get a relocator to work quite easily, as long as you stay away from global variables until relocation has finished. The relocator that’s part of cloudlibc is only about 50 lines of code in size.

As you can see, cloudlibc’s relocator is only capable of handling a single relocation type (R_X86_64_RELATIVE on x86-64, R_AARCH64_RELATIVE on ARM64). This relocation type can be used to set a 64-bit pointer to the address at which the executable was loaded, with an offset added to it (the addend). There is no need to handle any other relocation types, such as the ones we saw previously. Unlike the GNU linker, LLD seems to normalize all relocations to this single type, which simplifies things a lot.
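To make the "load address plus addend" rule concrete, here is a minimal sketch of such a relocation loop for x86-64. It is not cloudlibc's actual code. The rela64_t type and apply_relative_relocs() are hypothetical names, and the struct is declared locally (following the ELF64 layout of a relocation entry with addend) so the sketch does not depend on a system header. A real relocator would also have to find the relocation table itself, presumably through the executable's dynamic section; that part is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF64 relocation entry with addend; the type name is hypothetical. */
typedef struct {
  uint64_t r_offset; /* where to store the pointer, relative to the base */
  uint64_t r_info;   /* relocation type in the low 32 bits */
  int64_t r_addend;  /* offset of the target, relative to the base */
} rela64_t;

#define RELA_TYPE(info) ((uint32_t)((info) & 0xffffffff))
#ifndef R_X86_64_RELATIVE
#define R_X86_64_RELATIVE 8
#endif

/* Applies `count` relocations to the image loaded at `base`. Returns the
 * number of relocations applied, or -1 on an unsupported type. */
int apply_relative_relocs(unsigned char *base, const rela64_t *rela,
                          size_t count) {
  assert(base != NULL);
  for (size_t i = 0; i < count; ++i) {
    if (RELA_TYPE(rela[i].r_info) != R_X86_64_RELATIVE)
      return -1;
    /* The stored pointer is the load address plus the addend. */
    uint64_t value = (uint64_t)(uintptr_t)base + (uint64_t)rela[i].r_addend;
    /* memcpy() avoids alignment assumptions in this sketch. */
    memcpy(base + rela[i].r_offset, &value, sizeof(value));
  }
  return (int)count;
}
```

For an image loaded at address B, an entry with offset 0x8 and addend 0x10 stores the value B + 0x10 at address B + 0x8.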

A downside of Position Independent Executables is that in order to apply the relocations, we sometimes have to make certain constants writable, only so we can apply relocations to them. To solve this, both the GNU linker and LLD can, when given the -zrelro command line option, place all of these constants together in one section and generate a PT_GNU_RELRO header. We can use the information stored in this header to make this section read-only afterwards using mprotect().

Once all of that is done, the program has been properly relocated, meaning that startup can finally continue as usual. And there was much rejoicing.

Closing words

I hope you enjoyed reading this article as much as I enjoyed writing it. Be sure to send an email to info@nuxi.nl or send me a message on Twitter at @EdSchouten if you have any feedback. Stay tuned for part two!