This post kicks off a short series on reversing the Adobe Reader sandbox. I initially started this research early last year and have been working on it off and on since. This series will document the Reader sandbox internals, present a few tools for reversing and interacting with it, and describe the results of this research. There may be quite a bit of content here, but I’ll be doing a lot of braindumping. As a researcher, I find posts that document process, failure, and attempts to be far more insightful than pure technical results.
I’ve broken this research up into two posts. Maybe more, we’ll see. The first here will detail the internals of the sandbox and introduce a few tools developed, and the second will focus on fuzzing and the results of that effort.
This post focuses primarily on the IPC channel used to communicate between the sandboxed process and the broker. I do not delve into how the policy engine works or many of the restrictions enabled.
This is by no means the first dive into the Adobe Reader sandbox. Here are a few prior examples of great work:
2011 – A Castle Made of Sand (Richard Johnson)
2011 – Playing in the Reader X Sandbox (Paul Sabanal and Mark Yason)
2012 – Breeding Sandworms (Zhenhua Liu and Guillaume Lovet)
2013 – When the Broker is Broken (Peter Vreugdenhil)
Breeding Sandworms was a particularly useful introduction to the sandbox, as it describes in some detail the internals of a transaction and how the authors approached fuzzing the sandbox. I’ll detail my approach and improvements in part two of this series.
After evaluating existing research, however, it seemed like there was more work to be done in a more open source fashion. Most sandbox escapes in Reader these days opt instead to target Windows itself via win32k/dxdiag/etc and not the sandbox broker. This makes some sense, but leaves a lot of attack surface unexplored.
Note that all research was done on Acrobat Reader DC 20.6.20034 on a Windows 10 machine. You can fetch installers for old versions of Adobe Reader here. I highly recommend bookmarking this. One of my favorite things to do on a new target is pull previous bugs and affected versions and run through root cause and exploitation.
Sandbox Internals Overview
Adobe Reader’s sandbox is known as protected mode and is on by default, but can be toggled on/off via preferences or the registry. Once Reader launches, a child process is spawned under low integrity and a shared memory section mapped in. Inter-process communication (IPC) takes place over this channel, with the parent process acting as the broker.
Adobe actually published some of the sandbox source code to GitHub over 7 years ago, but it does not contain any of their policies or modern tag interfaces. It’s useful for figuring out variable and function names during reversing, and the source code is well written and full of useful comments, so I recommend pulling it up.
Reader uses the Chromium sandbox (pre Mojo), and I recommend the following resources for the specifics here:
These days it’s known as the “legacy IPC” and has been replaced by Mojo in Chrome. Reader actually uses Mojo to communicate between its RdrCEF (Chromium Embedded Framework) processes which handle cloud connectivity, syncing, etc. It’s possible Adobe plans to replace the broker legacy API with Mojo at some point, but this has not been announced/released yet.
We’ll start by taking a brief look at how a target process is spawned, but the main focus of this post will be the guts of the IPC mechanisms in play. Execution of the child process first begins with
This function crafts the target process and its restrictions. Some of these are described here in greater detail, but they are as follows:
From here, the policy manager applies interceptions through the InterceptionManager, which hooks various Win32 functions in the target process and rewires them to the broker. According to the documentation, this is not for security, but rather:
From here we can now take a look at how the IPC mechanisms between the target and broker process actually work.
The broker process is responsible for spawning the target process, creating a shared memory mapping, and initializing the requisite data structures. This shared memory mapping is the medium in which the broker and target communicate and exchange data. If the target wants to make an IPC call, the following happens at a high level:
- The target finds a channel in a free state
- The target serializes the IPC call parameters to the channel
- The target then signals an event object for the channel (ping event)
- The target waits until a pong event is signaled
At this point, the broker executes ThreadPingEventReady, the IPC processor entry point, where the following occurs:
- The broker deserializes the call arguments in the channel
- Sanity checks the parameters and the call
- Executes the callback
- Writes the return structure back to the channel
- Signals that the call is completed (pong event)
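The two lists above can be modeled in a few lines. This is only a rough sketch: Python threads stand in for the two processes, `threading.Event` stands in for the Windows event objects, and the `Channel` class, `broker_worker`, `target_call`, and the toy tag 3 handler are all hypothetical names invented for illustration, not Reader’s code.

```python
import threading

FREE, BUSY, ACK = 1, 2, 3  # loosely mirror kFreeChannel, kBusyChannel, kAckChannel


class Channel:
    """Toy stand-in for one IPC channel (hypothetical, not Reader's layout)."""

    def __init__(self):
        self.state = FREE
        self.buffer = None             # stands in for the channel buffer
        self.ping = threading.Event()  # target -> broker
        self.pong = threading.Event()  # broker -> target
        self._lock = threading.Lock()  # emulates an atomic compare-exchange

    def try_claim(self):
        with self._lock:
            if self.state != FREE:
                return False
            self.state = BUSY
            return True


def broker_worker(channel, handlers, stop):
    while not stop.is_set():
        if not channel.ping.wait(timeout=0.05):
            continue
        channel.ping.clear()
        channel.state = ACK
        tag, args = channel.buffer                  # "deserialize" the call
        handler = handlers.get(tag)                 # sanity check the tag
        channel.buffer = handler(*args) if handler else None  # run callback, write result
        channel.pong.set()                          # signal completion


def target_call(channels, tag, *args):
    channel = next(c for c in channels if c.try_claim())  # find a free channel
    channel.buffer = (tag, args)                          # "serialize" the call
    channel.ping.set()
    if not channel.pong.wait(timeout=5):
        raise TimeoutError("broker did not answer")
    channel.pong.clear()
    result = channel.buffer
    channel.state = FREE
    return result


channels = [Channel() for _ in range(16)]
stop = threading.Event()
handlers = {3: lambda path: "opened " + path}  # made-up handler for tag 3
for c in channels:
    threading.Thread(target=broker_worker, args=(c, handlers, stop), daemon=True).start()
reply = target_call(channels, 3, "example.txt")
stop.set()
```

The real implementation uses shared memory and kernel events rather than Python objects, so this only captures the ordering of the handshake.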
There are 16 channels available for use, meaning that the broker can service up to 16 concurrent IPC requests. The following diagram describes a high level view of this architecture:
From the broker’s perspective, a channel can be viewed like so:
In general, this describes what the IPC communication channel between the broker and target looks like. In the following sections we’ll take a look at these in more technical depth.
The IPC facilities are established via TargetProcess::Init, which is really what we’re most interested in. The following snippet describes how the shared memory mapping is created and established between the broker and target:
shared_mem_size in the source code here comes out to 65536 bytes, which doesn’t match Reader. The shared section is actually 0x20000 bytes in modern Reader binaries.
Once the mapping is established and policies copied in, the SharedMemIPCServer is initialized, and this is where things finally get interesting.
SharedMemIPCServer initializes the ping/pong events for communication, creates channels, and registers callbacks.
The previous architecture diagram provides an overview of the structures and layout of the section at runtime. In short, a ServerControl is a broker-side view of an IPC channel. It contains the server side event handles, pointers to both the channel and its buffer, and general information about the connected IPC endpoint. This structure is not visible to the target process and exists only in the broker.
ChannelControl is the target process version of a ServerControl; it contains the target’s event handles, the state of the channel, and information about where to find the channel buffer. This channel buffer is where the CrossCallParams can be found as well as the call return information after a successful IPC dispatch.
Let’s walk through what an actual request looks like. Making an IPC request requires the target to first prepare a CrossCallParams structure. This is defined as a class, but we can model it as a struct:
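One way to sketch that layout is with `ctypes`. The field names below follow the open-source Chromium sandbox headers, and every handle or pointer is assumed to be 4 bytes because Reader is an x86 process; Reader’s own build may name or order things differently, so treat this as an approximation rather than the definitive layout.

```python
import ctypes

K_EXTENDED_RETURN_COUNT = 8  # extended return slots, per the Chromium headers


class _Result(ctypes.Union):
    _fields_ = [("nt_status", ctypes.c_int32), ("win32_result", ctypes.c_uint32)]


class CrossCallReturn(ctypes.Structure):
    _anonymous_ = ("result",)
    _fields_ = [
        ("tag", ctypes.c_uint32),
        ("call_outcome", ctypes.c_uint32),
        ("result", _Result),
        ("extended_count", ctypes.c_uint32),
        ("handle", ctypes.c_uint32),  # assumption: 4-byte HANDLE (x86)
        ("extended", ctypes.c_uint32 * K_EXTENDED_RETURN_COUNT),
    ]


class CrossCallParams(ctypes.Structure):
    _fields_ = [
        ("tag", ctypes.c_uint32),
        ("is_in_out", ctypes.c_uint32),
        ("call_return", CrossCallReturn),  # return block lives inside the params
        ("params_count", ctypes.c_uint32),
    ]


class ParamInfo(ctypes.Structure):
    _fields_ = [
        ("type", ctypes.c_uint32),
        ("offset", ctypes.c_uint32),  # delta from the start of CrossCallParams
        ("size", ctypes.c_uint32),
    ]


header_size = ctypes.sizeof(CrossCallParams)
```

Under these assumptions the header is 0x40 bytes, with `params_count` at offset 0x3c and the `ParamInfo` array starting right after it.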
I’ve also gone ahead and defined a few other structures needed to complete the picture. Note that the return structure,
CrossCallReturn, is embedded within the body of the
There’s a great ASCII diagram provided in the sandbox source code that’s highly instructive, and I’ve duplicated it below:
A tag is a dword indicating which function we’re invoking (just a number between 1 and approximately 255, depending on your version). This is handled server side dynamically, and we’ll explore that further later on.
Each parameter is then sequentially represented by a
The offset is the delta value to a region of memory somewhere below the
CrossCallParams structure. This is handled in the Chromium source code via the
Let’s look at a call in memory from the target’s perspective. Assume the channel buffer is at
0x2a10134 shows we’re invoking tag 3, which carries 7 parameters (
The first argument is type 0x1 (we’ll describe types later on), is at delta offset 0xa0, and is 0x86 bytes in size. Thus:
This shows the delta of the parameter data and, based on the parameter type, we know it’s a unicode string.
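The same walk can be done programmatically. The sketch below assumes the x86 layout from the Chromium headers (a 0x40 byte header with `params_count` at 0x3c, followed by 12-byte type/offset/size entries); `parse_call` and the sample path are made up for illustration. Note that with 7 parameters the first data offset works out to 0x40 + 8 * 12 = 0xa0, which agrees with the delta offset quoted above.

```python
import struct

HEADER_SIZE = 0x40                  # tag, is_in_out, 52-byte return block, params_count
PARAM_INFO = struct.Struct("<III")  # type, offset, size
WCHAR_TYPE = 1


def parse_call(buf):
    """Return (tag, [(type, offset, size, raw_bytes), ...]) for a channel buffer."""
    tag = struct.unpack_from("<I", buf, 0)[0]
    count = struct.unpack_from("<I", buf, 0x3C)[0]
    params = []
    for i in range(count):
        ptype, offset, size = PARAM_INFO.unpack_from(buf, HEADER_SIZE + i * PARAM_INFO.size)
        params.append((ptype, offset, size, bytes(buf[offset:offset + size])))
    return tag, params


# Build a synthetic one-parameter call to exercise the parser.
text = "C:\\example\\file.txt".encode("utf-16-le")
data_start = HEADER_SIZE + 2 * PARAM_INFO.size  # one parameter plus the extra entry
buf = bytearray(data_start + len(text))
struct.pack_into("<I", buf, 0, 3)               # tag
struct.pack_into("<I", buf, 0x3C, 1)            # params_count
PARAM_INFO.pack_into(buf, HEADER_SIZE, WCHAR_TYPE, data_start, len(text))
buf[data_start:] = text

tag, params = parse_call(buf)
first_type, first_offset, first_size, first_data = params[0]
decoded = first_data.decode("utf-16-le")
```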
With this information, we can craft a buffer targeting IPC tag 3 and move onto
sending it. To do this, we require the
structure. This is a simple structure defined at the start of the IPC shared memory section:
And in the IPC shared memory section:
So we have 16 channels, a handle to
server_alive, and the start of our
The server_alive handle is a mutex used to signal if the server has crashed. It’s used during tag invocation in SharedMemIPCClient::DoCall, which we’ll describe later on. For now, assume that if we WaitForSingleObject on this and it returns WAIT_ABANDONED, the server has crashed.
ChannelControl is a structure that describes a channel, and is again defined as:
channel_base describes the channel’s buffer, i.e. where the CrossCallParams structure can be found. This is an offset from the base of the shared memory section.
state is an enum that describes the state of the channel:
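As a rough model of the control structures at the start of the section, the sketch below uses `ctypes` with the field names and state values from the open-source Chromium headers. Handles and `size_t` are assumed to be 4 bytes (x86), the fixed 16-entry array reflects the channel count mentioned in this post, and `free_channels` is a hypothetical helper.

```python
import ctypes
from enum import IntEnum


class ChannelState(IntEnum):
    kFreeChannel = 1       # channel is free
    kBusyChannel = 2       # IPC in progress, client side
    kAckChannel = 3        # IPC in progress, server side
    kReadyChannel = 4      # not used
    kAbandonedChannel = 5  # IPC abandoned by the client


class ChannelControl(ctypes.Structure):
    _fields_ = [
        ("channel_base", ctypes.c_uint32),  # offset of the buffer from the section base
        ("state", ctypes.c_int32),
        ("ping_event", ctypes.c_uint32),    # assumption: 4-byte HANDLE values
        ("pong_event", ctypes.c_uint32),
        ("ipc_tag", ctypes.c_uint32),
    ]


class IPCControl(ctypes.Structure):
    _fields_ = [
        ("channels_count", ctypes.c_uint32),
        ("server_alive", ctypes.c_uint32),
        ("channels", ChannelControl * 16),
    ]


def free_channels(section):
    """List the indexes of channels currently marked free in a raw section dump."""
    ctrl = IPCControl.from_buffer_copy(section)
    count = min(ctrl.channels_count, 16)
    return [i for i in range(count) if ctrl.channels[i].state == ChannelState.kFreeChannel]


# Synthetic section: every channel free except the first.
demo = IPCControl()
demo.channels_count = 16
for i in range(16):
    demo.channels[i].state = int(ChannelState.kFreeChannel)
demo.channels[0].state = int(ChannelState.kBusyChannel)
available = free_channels(bytes(demo))
```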
The ping and pong events are, as previously described, used to signal to the opposite endpoint that data is ready for consumption. For example, when the client has written out its CrossCallParams and is ready for the server, it signals:
When the server has completed processing the request, the pong_event is signaled and the client reads back the call result.
A channel is fetched via SharedMemIPCClient::LockFreeChannel, which is invoked when GetBuffer is called. This simply identifies a channel in the IPCControl array wherein state == kFreeChannel, and sets it to kBusyChannel. With a channel, we can now write out our CrossCallParams structure to the shared memory buffer. Our target buffer begins at
Writing out the CrossCallParams has a few nuances. First, the number of actual parameters is NUMBER_PARAMS+1. According to the source:
This can be observed in the
Note the offset written is the offset for index+1. In addition, this offset is aligned by a pretty simple function that byte aligns the delta inside the channel buffer:
Because the Reader process is x86, the alignment is always 8.
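The rounding itself is the usual round-up-to-a-multiple idiom. A minimal sketch, assuming the fixed alignment of 8 stated above (the `align` name and default argument are just for illustration):

```python
def align(value, alignment=8):
    """Round value up to the next multiple of alignment."""
    return ((value + alignment - 1) // alignment) * alignment


# The first parameter described earlier sits at 0xa0 and is 0x86 bytes long,
# so the next parameter would start at the aligned end of that data.
next_offset = align(0xA0 + 0x86)
```

For that parameter the unaligned end is 0x126, which rounds up to 0x128.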
The pseudo-code for writing out our CrossCallParams can be distilled into the following:
Once the CrossCallParams structure has been written out, the sandboxed process signals the ping_event and the broker is triggered.
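Putting the pieces together, serialization can be sketched as below. This assumes the x86 header layout from the Chromium headers (0x40 byte header, 12-byte entries, NUMBER_PARAMS+1 entries in total) and the alignment of 8 described above; `build_call` is a hypothetical helper, the return block is simply left zeroed, and the sample arguments are invented.

```python
import struct

HEADER_SIZE = 0x40
PARAM_INFO_SIZE = 12
WCHAR_TYPE, ULONG_TYPE = 1, 2


def align(value, alignment=8):
    return ((value + alignment - 1) // alignment) * alignment


def build_call(tag, params):
    """Serialize a call; params is a list of (arg_type, raw_bytes) tuples."""
    count = len(params)
    offset = HEADER_SIZE + (count + 1) * PARAM_INFO_SIZE  # data follows the entry array
    infos, blob = [], bytearray()
    for arg_type, raw in params:
        infos.append((arg_type, offset, len(raw)))
        next_offset = align(offset + len(raw))
        blob += raw + b"\x00" * (next_offset - offset - len(raw))  # pad to alignment
        offset = next_offset
    infos.append((0, offset, 0))  # extra entry: only its offset (end of data) matters
    buf = bytearray(HEADER_SIZE)
    struct.pack_into("<I", buf, 0, tag)
    struct.pack_into("<I", buf, 0x3C, count)
    for info in infos:
        buf += struct.pack("<III", *info)
    return bytes(buf + blob)


# A tag 3 call shaped like the one above: one wide string and six 32-bit values.
path = "C:\\example.txt".encode("utf-16-le")
call = build_call(3, [(WCHAR_TYPE, path)] + [(ULONG_TYPE, struct.pack("<I", 0))] * 6)
first = struct.unpack_from("<III", call, HEADER_SIZE)
second = struct.unpack_from("<III", call, HEADER_SIZE + PARAM_INFO_SIZE)
last = struct.unpack_from("<III", call, HEADER_SIZE + 7 * PARAM_INFO_SIZE)
```

The resulting bytes would still need to be copied into a claimed channel buffer before signaling the ping event.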
Broker side handling is fairly straightforward. The server registers a
ping_event handler during
RegisterWait is just a thread pool wrapper around a call to
The ThreadPingEventReady function marks the channel as kAckChannel, fetches a pointer to the provided buffer, and invokes InvokeCallback. Once this returns, it copies the CrossCallReturn structure back to the channel and signals the
InvokeCallback parses out the buffer and handles high-level validation of the data (ensuring strings are strings, buffers and sizes match up, etc.). This is probably a good time to document the supported argument types. There are 10 types in total, two of which are placeholders:
These are taken from
but you’ll notice there are two additional types:
MEM_TYPE, and are unique to Reader.
ASCII_TYPE is, as expected, a simple 7-bit ASCII string. MEM_TYPE is a memory structure used by the broker to read data out of the sandboxed process, i.e. for more complex types that can’t be trivially passed via the API. It’s additionally used for data blobs, such as PNG images, enhanced-format datafiles, and more.
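For reference while reading dumps, the type list can be written down as an enum. The first seven values follow the open-source Chromium enum (using the older ULONG_TYPE name); the numeric values shown for ASCII_TYPE, MEM_TYPE, and LAST_TYPE are an assumption that Reader slots its two additions in just before the closing placeholder, and `is_valid_arg_type` is a hypothetical helper.

```python
from enum import IntEnum


class ArgType(IntEnum):
    INVALID_TYPE = 0   # placeholder
    WCHAR_TYPE = 1
    ULONG_TYPE = 2
    UNISTR_TYPE = 3
    VOIDPTR_TYPE = 4
    INPTR_TYPE = 5
    INOUTPTR_TYPE = 6
    ASCII_TYPE = 7     # Reader-specific; numeric value assumed
    MEM_TYPE = 8       # Reader-specific; numeric value assumed
    LAST_TYPE = 9      # placeholder; numeric value assumed


def is_valid_arg_type(value):
    """True for any type strictly between the two placeholders."""
    return ArgType.INVALID_TYPE < value < ArgType.LAST_TYPE


usable = [t.name for t in ArgType if is_valid_arg_type(t)]
```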
Some of these types should be self-explanatory; WCHAR_TYPE is naturally a wide-character string, ASCII_TYPE an ASCII string, and ULONG_TYPE a ulong. Let’s look at a few of the non-obvious types, however:
VOIDPTR_TYPE, this is a standard type in the Chromium sandbox so we can just refer to the source code.
GetParameterVoidPtr. Simply, once the value itself is extracted it’s cast to a void ptr:
This allows tags to reference objects and data within the broker process itself. An example might be NtOpenProcessToken, whose first parameter is a handle to the target process. This would be retrieved first by a call to OpenProcess, handed back to the child process, and then supplied in any future calls that may need to use the handle as a
In the Chromium source code, INPTR_TYPE is extracted as a raw value via GetRawParameter and no additional processing is performed. However, in Adobe Reader, it’s actually extracted in the same way
INOUTPTR_TYPE is wrapped as a CountedBuffer and may be written to during the IPC call. For example, if CreateProcessW is invoked, the PROCESS_INFORMATION pointer will be of type
The final type is MEM_TYPE, which is unique to Adobe Reader. We can define the structure as:
As mentioned, this type is primarily used to transfer data buffers to and from the broker process. It seems crazy. Each tag is responsible for performing its own validation of the provided values before they’re used in any
Once the broker has parsed out the passed arguments, it fetches the context dispatcher and identifies our tag handler:
The handler is fetched from
which winds up calling
This is a pretty simple function that crawls the registered IPC tag list for the correct handler. We finally hit InvokeCallbackArgs, unique to Reader, which invokes the handler with the proper argument count:
In total, Reader supports tag functions with up to 17 arguments. I have no idea why that would be necessary, but it is. Additionally note the first two arguments to each tag handler: context handler (dispatcher) and
CrossCallParamsEx. This last structure is actually the broker’s version of a CrossCallParams with more paranoia.
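The lookup and invocation steps can be approximated as follows. The `Dispatcher` class, `find_handler`, and `invoke_callback_args` here are hypothetical stand-ins; matching on both the tag and the argument types mirrors how the open-source Chromium dispatcher compares registered calls, and it is an assumption that Reader does the same.

```python
MAX_ARGS = 17  # upper bound on handler arguments mentioned above


class Dispatcher:
    """Hypothetical stand-in for a broker-side dispatcher."""

    def __init__(self):
        self.calls = []  # registered (tag, arg_types, handler) entries

    def register(self, tag, arg_types, handler):
        self.calls.append((tag, tuple(arg_types), handler))

    def find_handler(self, tag, arg_types):
        # Crawl the registered list for a matching tag and signature.
        for reg_tag, reg_types, handler in self.calls:
            if reg_tag == tag and reg_types == tuple(arg_types):
                return handler
        return None


def invoke_callback_args(dispatcher, params, handler, args):
    if len(args) > MAX_ARGS:
        raise ValueError("too many arguments for a tag handler")
    # Every handler receives the dispatcher and the parsed call first.
    return handler(dispatcher, params, *args)


dispatcher = Dispatcher()
dispatcher.register(3, [1, 2, 2], lambda disp, params, name, a, b: (name, a + b))
handler = dispatcher.find_handler(3, [1, 2, 2])
result = invoke_callback_args(dispatcher, {"tag": 3}, handler, ["file.txt", 1, 2])
missing = dispatcher.find_handler(3, [1])
```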
A single function is used to register IPC tags, called from a single initialization function, making it relatively easy for us to scrape them all at runtime. Pulling out all of the IPC tags can be done both statically and dynamically; the former is far easier, the latter is more accurate. I’ve implemented a static generator using IDAPython, available in this project’s repository (ida_find_tags.py), which can be used to pull all supported IPC tags out of Reader along with their parameters. This is not going to be wholly indicative of all possible calls, however. During initialization of the sandbox, many feature checks are performed to probe the availability of certain capabilities. If these fail, the tag is not registered.
Tags are given a handle to CrossCallParamsEx, which gives them access to the CrossCallReturn structure. This is defined here and, repeated from above, looks like:
This 52 byte structure is embedded in the CrossCallParams transferred by the sandboxed process. Once the tag has returned from execution, the following occurs:
and the sandboxed process can finally read out its result. Note that this mechanism does not allow for the exchange of more complex types, hence the availability of MEM_TYPE. The final step is signaling the pong_event, completing the call and freeing the channel.
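The copy-back step amounts to overwriting the 52-byte return block inside the channel buffer. A minimal sketch, assuming the block sits at offset 8 (right after the tag and in/out flag, as in the Chromium headers) and that a call outcome of 0 means success; the helper names and the dictionary shape are invented for illustration.

```python
import struct

CALL_RETURN_OFFSET = 8   # follows tag and is_in_out
CALL_RETURN_SIZE = 52
SBOX_ALL_OK = 0
STATUS_ACCESS_DENIED = 0xC0000022 - (1 << 32)  # NTSTATUS as a signed 32-bit value


def write_call_return(channel_buffer, tag, call_outcome, nt_status, handle=0):
    """Broker side: copy the return block back into the channel buffer."""
    ret = struct.pack("<IIiII8I", tag, call_outcome, nt_status, 0, handle, *([0] * 8))
    channel_buffer[CALL_RETURN_OFFSET:CALL_RETURN_OFFSET + CALL_RETURN_SIZE] = ret


def read_call_return(channel_buffer):
    """Target side: read the result once the pong event fires."""
    tag, outcome, status, ext_count, handle = struct.unpack_from(
        "<IIiII", channel_buffer, CALL_RETURN_OFFSET)
    return {"tag": tag, "call_outcome": outcome, "nt_status": status,
            "extended_count": ext_count, "handle": handle}


buf = bytearray(0x100)
struct.pack_into("<I", buf, 0, 3)  # request tag stays untouched
write_call_return(buf, 3, SBOX_ALL_OK, STATUS_ACCESS_DENIED)
ret = read_call_return(buf)
```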
Now that we understand how the IPC mechanism itself works, let’s examine the implemented tags in the sandbox. Tags are registered during initialization by a function we’ll call InitializeSandboxCallback. This is a large function that handles allocating sandbox tag objects and invoking their respective initializers. Each initializer uses a function, RegisterTag, to construct and register individual tags. A tag is defined by a
The arguments array is initialized to INVALID_TYPE and ignored if the tag does not use all 17 slots. Here’s an example of a tag structure:
Here we see tag 3 with 7 arguments; the first is WCHAR_TYPE and the remaining 6 are ULONG_TYPE. This lines up with what we know to be the NtCreateFile tag handler.
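A tag description like this one is easy to model for tooling. In the sketch below the `TagDescriptor` class and its fields are hypothetical (the real structure’s layout is not reproduced here); it only captures the two facts stated above, namely 17 argument slots defaulting to INVALID_TYPE and tag 3 taking one WCHAR_TYPE followed by six ULONG_TYPE arguments.

```python
from dataclasses import dataclass, field

INVALID_TYPE, WCHAR_TYPE, ULONG_TYPE = 0, 1, 2
MAX_ARGS = 17


@dataclass
class TagDescriptor:
    tag: int
    arg_types: list = field(default_factory=lambda: [INVALID_TYPE] * MAX_ARGS)

    def arg_count(self):
        # Unused slots keep their INVALID_TYPE default.
        return sum(1 for t in self.arg_types if t != INVALID_TYPE)

    def matches(self, tag, types):
        padded = list(types) + [INVALID_TYPE] * (MAX_ARGS - len(types))
        return tag == self.tag and padded == self.arg_types


def make_descriptor(tag, types):
    desc = TagDescriptor(tag)
    desc.arg_types[:len(types)] = types
    return desc


nt_create_file = make_descriptor(3, [WCHAR_TYPE] + [ULONG_TYPE] * 6)
ok = nt_create_file.matches(3, [WCHAR_TYPE] + [ULONG_TYPE] * 6)
bad = nt_create_file.matches(3, [WCHAR_TYPE])
```

A check like `matches` is handy for rejecting malformed test cases before they are ever written to a channel.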
Each tag is part of a group that denotes its behavior. There are 20 groups in total:
The names were extracted either from the Reader binary itself or through correlation with Chromium. Each dispatcher implements an initialization routine that invokes RegisterDispatchFunction for each tag. The number of registered tags will differ depending on the installation, version, features, etc. of the Reader process. The tag count for SandboxBrokerServerDispatcher, for example, can vary by approximately 25.
Instead of providing a description of each dispatcher in this post, I’ve put together a separate page, which can be found here. This page can be used as a tag reference and has some general information about each. Over time I’ll add my notes on the calls. I’ve additionally pushed the scripts used to extract tag information from the Reader binary and generate the table to the sander repository detailed below.
Over the course of this research, I developed a library and set of tools for examining and exercising the Reader sandbox. The library, libread, was developed to programmatically interface with the broker in real time, allowing for quickly exercising components of the broker and dynamically reversing various facilities. In addition, the library was critical during my fuzzing expeditions. All of the fuzzing tools and data will be available in the next post in this series.
libread is fairly flexible and easy to use, but still pretty rudimentary and, of course, built off of my reverse engineering efforts. It won’t be feature complete nor even completely accurate. Pull requests are welcome.
The library implements all of the notable structures and provides a few helper functions for locating the ServerControl from the broker process. As we’ve seen, a ServerControl is a broker’s view of a channel and it is held by the broker alone. This means it’s not somewhere predictable in shared memory and we’ve got to scan the broker’s memory hunting for it. From the sandbox side there is also a find_memory_map helper for locating the base address of the shared memory map.
In addition to this library I’m releasing sander. This is a command line tool that consumes libread to provide some useful functionality for inspecting the sandbox:
The most useful functionality provided here is the -m flag. This allows one to monitor the IPC calls and their arguments in real time:
We’re also able to dump all IPC calls in the broker’s channels (-d), which can help debug threading issues when fuzzing, and trigger a test IPC call (-t). This latter function demonstrates how to send your own IPC calls via libread and lets you test out additional tooling.
The last available feature is the -c flag, which captures all IPC traffic and logs the channel buffer to a file on disk. I used this primarily to seed part of my corpus during fuzzing efforts, as well as to aid some reversing efforts. It’s extremely useful for replaying requests and gathering a baseline corpus of real traffic. We’ll discuss this further in forthcoming posts.
That about concludes this initial post. Next up I’ll discuss the various fuzzing strategies used on this unique interface, the frustrating amount of failure, and the bugs shaken out.