While working on another research project (post to be released soon, will update here), I stumbled onto a very Hexacorn[0]-inspired code injection technique that fit my situation perfectly. Instead of tainting the other post with its description and code, I figured I’d release a separate post describing it here.
When I say that it’s Hexacorn inspired, I mean that the bulk of the strategy is similar to everything else you’ve probably seen: we open a handle to the remote process, allocate some memory, and copy our shellcode into it. At this point we simply need to gain control over execution flow, and this is where most of Hexacorn’s techniques come in handy: PROPagate via window properties, WordWarping via rich edit controls, DnsQuery via code pointers, etc. Another great example is Windows Notification Facility via user subscription callbacks (at least in modexp’s proof of concept), though this one isn’t Hexacorn’s.
These strategies are also predicated on the process having certain capabilities (DDE, private clipboards, WNF subscriptions), but more importantly, most, if not all, do not work across sessions or integrity levels. This is obvious and expected and frankly quite niche, but in my situation, a requirement.
Fibers
Fibers are “a unit of execution that must be manually scheduled by the application”[1]. They are essentially register and stack states that can be swapped in and out at will, and they run in the context of the thread that schedules them. A single thread can run at most one fiber at a time, but fibers can be hot swapped during execution, and their quantum is user controlled.
Fibers can also create and use fiber data. A pointer to this is stored in TEB->NtTib.FiberData and is a per-thread structure. This is initially set during a call to ConvertThreadToFiber, which is worth a quick look.
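A minimal sketch of such a test, assuming a Windows toolchain: the function names `fiber_test` and `run_fiber_test` and the marker string are hypothetical, chosen only for illustration. It converts a fresh thread into a fiber and reads the fiber data back with `GetFiberData`, which goes through the current fiber pointer returned by `GetCurrentFiber`.

```c
#ifdef _WIN32   /* Windows-only: the fiber APIs live in kernel32 */
#include <windows.h>

/* Thread body: turn this thread into a fiber and read back its fiber data. */
static DWORD WINAPI fiber_test(LPVOID param)
{
    /* ConvertThreadToFiber records its argument as this fiber's data. */
    LPVOID fiber = ConvertThreadToFiber(param);
    if (fiber == NULL)
        return 0;

    /* GetFiberData reads the data of the currently running fiber. */
    DWORD ok = (GetFiberData() == param);

    ConvertFiberToThread();
    return ok;
}

/* Runs the test on a new thread; returns 1 if the fiber data matched. */
int run_fiber_test(void)
{
    static char marker[] = "fiber data";   /* hypothetical payload */
    DWORD ok = 0;

    HANDLE thread = CreateThread(NULL, 0, fiber_test, marker, 0, NULL);
    if (thread == NULL)
        return 0;

    WaitForSingleObject(thread, INFINITE);
    GetExitCodeThread(thread, &ok);
    CloseHandle(thread);
    return (int)ok;
}
#endif
```

Calling `run_fiber_test()` from `main` is enough to exercise it. To inspect the TEB by hand, a `DebugBreak()` call placed right after `ConvertThreadToFiber` gives a convenient stopping point.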
We need to spawn off the test in a new thread, as the main thread will always have a fiber instantiated and the call will fail. If we run this in a debugger, we can inspect the data after the break.
In addition to fiber data, fibers also have access to fiber local storage (FLS). For all intents and purposes, this is identical to thread local storage (TLS)[2]. It gives all of a thread’s fibers access to shared data via a global index. The API for this is pretty simple, and very similar to TLS. Using our previous example as a base, we can allocate an index and toss some values in it.
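As a rough sketch of that API, assuming a toolchain that targets Windows Vista or later: the function name `fls_round_trip` and the stored string are hypothetical. It allocates an index without a callback, stores a value in the calling thread’s slot, and reads it back.

```c
#ifdef _WIN32   /* Windows-only: FlsAlloc and friends live in kernel32 */
#include <windows.h>

/* Allocates an FLS index, stores a value in the current slot,
   and reads it back. Returns 1 on a successful round trip. */
int fls_round_trip(void)
{
    static char value[] = "fls value";     /* hypothetical payload */

    DWORD index = FlsAlloc(NULL);          /* no callback for now */
    if (index == FLS_OUT_OF_INDEXES)
        return 0;

    /* The slot belongs to the fiber (or thread) that is currently running. */
    int ok = FlsSetValue(index, value) && FlsGetValue(index) == value;

    FlsFree(index);
    return ok;
}
#endif
```

The value returned by `FlsAlloc` is the index referred to in the rest of the post; it is not guaranteed to be any particular number.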
A pointer to this data is stored in the thread’s TEB, and can be extracted from TEB->FlsData. From the above example, assume the returned FLS index for this data is 6.
Note that the offset is always the index + 2.
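The index + 2 rule can be written as a small address calculation. The sketch below assumes the offset is counted in pointer-sized entries from the start of the block that TEB->FlsData points to, with two pointer-sized header fields ahead of the slot array. That layout is undocumented and may differ between Windows builds, and the helper name `fls_slot_offset` is hypothetical.

```c
#include <stddef.h>

/* Assumed layout (undocumented, version dependent): two pointer-sized
   header fields precede the slot array in the FLS data block. */
#define FLS_HEADER_ENTRIES 2

/* Byte offset of the slot for `index`, given the target's pointer size. */
size_t fls_slot_offset(unsigned index, size_t pointer_size)
{
    return ((size_t)index + FLS_HEADER_ENTRIES) * pointer_size;
}
```

For index 6 in a 64-bit process this gives entry 8, that is, byte offset 0x40 from the FlsData pointer.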
Abusing FLS Callbacks to Obtain Execution Control
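Let’s return to that FlsAlloc call from the above example. Its first parameter is a PFLS_CALLBACK_FUNCTION[3], which, according to MSDN, is:
“An application-defined function. If the FLS slot is in use, FlsCallback is called on fiber deletion, thread exit, and when an FLS index is freed.”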
Well, isn’t that lovely. These callbacks are stored process wide in PEB->FlsCallback. Let’s try it out.
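One way to watch a callback fire without waiting for process exit is to let a worker thread exit. This is only a sketch under assumptions: the names `on_fls_cleanup`, `worker`, and `callback_hits_after_thread_exit` are hypothetical, and the stored string is a placeholder whose only job is to make the slot non-NULL.

```c
#ifdef _WIN32   /* Windows-only */
#include <windows.h>

static volatile LONG g_callback_hits = 0;

/* FLS callback: receives the value of the slot being torn down. */
static VOID WINAPI on_fls_cleanup(PVOID slot_value)
{
    (void)slot_value;
    InterlockedIncrement(&g_callback_hits);
}

/* Worker: the slot must hold a non-NULL value for the callback to run. */
static DWORD WINAPI worker(LPVOID param)
{
    DWORD index = *(DWORD *)param;
    FlsSetValue(index, (PVOID)"in use");
    return 0;                              /* thread exit runs the callback */
}

/* Returns how many times the callback ran once the worker had exited. */
int callback_hits_after_thread_exit(void)
{
    InterlockedExchange(&g_callback_hits, 0);

    DWORD index = FlsAlloc(on_fls_cleanup);
    if (index == FLS_OUT_OF_INDEXES)
        return -1;

    HANDLE thread = CreateThread(NULL, 0, worker, &index, 0, NULL);
    if (thread == NULL) {
        FlsFree(index);
        return -1;
    }
    WaitForSingleObject(thread, INFINITE);
    CloseHandle(thread);

    int hits = (int)g_callback_hits;
    FlsFree(index);   /* the calling thread never set a value in this slot */
    return hits;
}
#endif
```

The callback runs in the context of the exiting worker thread, which is why the counter is updated with an interlocked operation.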
And we can fetch it back from there (assuming again an index of 6).
What happens when we let this run to process exit?
Recall the MSDN comment about when the FLS callback is invoked: “...on fiber deletion, thread exit, and when an FLS index is freed”. This means that, worst case, our code executes once the process exits and, best case, following a thread’s exit or a call to FlsFree. It’s worth reiterating that the primary thread for each process will have a fiber instantiated already; it’s quite possible that this thread isn’t around anymore, but this doesn’t matter as the callbacks are at the process level.
Another salient point here is the first parameter to the callback function. This parameter is the value of whatever was in the indexed slot and is also stashed in ECX/RCX before invoking the callback. Under specific circumstances, this can be quite useful.
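The parameter passing can be observed in-process with a small sketch. The `slot_context` struct and the function names are hypothetical; the only behavior relied on is the documented one, namely that freeing an index invokes the callback for slots holding a non-NULL value.

```c
#ifdef _WIN32   /* Windows-only */
#include <windows.h>

/* Hypothetical context handed to the callback through the FLS slot. */
struct slot_context {
    int seen;
};

/* The argument is whatever value the slot held when it was torn down. */
static VOID WINAPI on_slot_freed(PVOID slot_value)
{
    struct slot_context *ctx = (struct slot_context *)slot_value;
    ctx->seen = 1;
}

/* Returns 1 if the callback received the pointer stored in the slot. */
int callback_receives_slot_value(void)
{
    struct slot_context ctx = { 0 };

    DWORD index = FlsAlloc(on_slot_freed);
    if (index == FLS_OUT_OF_INDEXES)
        return 0;

    if (!FlsSetValue(index, &ctx)) {
        FlsFree(index);                    /* slot still NULL: no callback */
        return 0;
    }

    /* Freeing the index invokes the callback with the stored pointer. */
    FlsFree(index);
    return ctx.seen;
}
#endif
```

Because the slot value is an arbitrary pointer chosen by whoever wrote the slot, the callback effectively receives one caller-controlled argument.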
Anyway, PoC||GTFO. In the proof of concept, we overwrite the msvcrt!_freefls callback used to free the FLS buffer.
I tested this on an updated Windows 10 x64 against notepad and mspaint; on process exit, the callback is executed and we gain control over execution flow. Pretty useful in the end; more on this soon…
References
[0] http://www.hexacorn.com
[1] https://docs.microsoft.com/en-us/windows/win32/procthread/fibers
[2] https://docs.microsoft.com/en-us/windows/win32/procthread/thread-local-storage
[3] https://docs.microsoft.com/en-us/windows/win32/api/winnt/nc-winnt-pfls_callback_function