Hooking covers a range of techniques used for purposes such as debugging, monitoring, intercepting messages, and extending functionality. Many rootkits also use hooking to camouflage themselves on a system: to hide a process or a network port, redirect file writes to different files, prevent an application from opening a handle to a particular process, and more. In this article I will explain the API hooking techniques used by some advanced rootkits. There are many code injection techniques, but here I will concentrate on DLL injection, because it is what places the malicious code that hooking later executes into the target process.
DLL injection is a technique for running code within the address space of another process by forcing it to load a DLL. A great deal of malware uses it to place malicious routines in user-mode memory. Injection by itself only places the DLL in memory; the code in the DLL is triggered once API hooking is in place. Let’s have a look at the various methods for injecting DLLs.
a) AppInit_DLLs and LoadAppInit_DLLs
The AppInit_DLLs infrastructure provides an easy way to hook system APIs by allowing custom DLLs to be loaded into the address space of every interactive application.
The AppInit_DLLs registry value holds a list of DLLs that are loaded into a process when that process loads User32.dll. Malware often adds its own DLL to this list by modifying the registry. Because almost every interactive user-mode process imports User32.dll, the technique has wide reach. In addition, the LoadAppInit_DLLs value must be set to 1 for User32.dll to honor the AppInit_DLLs list at all.
From Windows 7 onwards, a code-signing requirement applies: developers must code-sign a DLL before it can be included in the list, so that users can trust the application. For further protection, Windows 8 adopted Secure Boot; when Secure Boot is enabled, the AppInit_DLLs mechanism is disabled as part of a no-compromise approach. According to Microsoft, AppInit_DLLs is not a recommended approach for legitimate applications because it can lead to system deadlocks and performance problems.
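From a defender's point of view, the interaction of the two values can be sketched as below. This is a minimal Python model, not a registry scanner: it assumes the two values have already been read (they are commonly documented under HKLM\Software\Microsoft\Windows NT\CurrentVersion\Windows), and the function names and the allow-list are hypothetical.

```python
import re


def appinit_dlls_in_effect(appinit_dlls, load_appinit_dlls):
    """Return the DLL names User32.dll would be asked to load, or [] if disabled."""
    # The list is only honored when LoadAppInit_DLLs is set to 1.
    if load_appinit_dlls != 1:
        return []
    # The value is a list of DLL names separated by spaces or commas.
    return [name for name in re.split(r"[ ,]+", appinit_dlls.strip()) if name]


def unexpected_dlls(appinit_dlls, load_appinit_dlls, allow_list):
    """Return the active entries that are not on a (hypothetical) allow-list."""
    allowed = {name.lower() for name in allow_list}
    active = appinit_dlls_in_effect(appinit_dlls, load_appinit_dlls)
    return [name for name in active if name.lower() not in allowed]
```

For instance, `unexpected_dlls("Good.dll evil.dll", 1, ["good.dll"])` reports only `evil.dll`, while the same list with LoadAppInit_DLLs set to 0 reports nothing because the mechanism is switched off.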
SetWindowsHookEx installs an application-defined hook procedure into a hook chain. We use it to install a hook procedure that monitors the system for certain types of events. These events are associated either with a specific thread or with all threads in the same desktop as the calling thread. The best-known use of this function is a keylogger. To install the hook, we need a malicious DLL that exports one or more functions; these functions are called whenever the hooked events occur. We then create a program that loads this DLL into memory using LoadLibrary and calls SetWindowsHookEx. The first parameter of the function is the type of event to be hooked; for keyloggers, this is WH_KEYBOARD. The remaining parameters are the address of the exported hook procedure (which can be found using GetProcAddress), a handle to the DLL that contains it, and the identifier of the thread to hook. A detailed PoC of a SetWindowsHookEx implementation can be found in Dejan Lukan’s article.
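The idea of a hook chain can be illustrated with a toy Python model. It calls no Windows API; the class and function names are invented for illustration, and the comments state which real function each method loosely stands in for.

```python
class HookChain:
    """Toy stand-in for the hook chain of one hook type (e.g. keyboard events)."""

    def __init__(self):
        self._procs = []  # index 0 is the most recently installed procedure

    def install(self, proc):
        # Loosely models SetWindowsHookEx: the new procedure goes to the
        # front of the chain. The return value stands in for the hook handle.
        self._procs.insert(0, proc)
        return proc

    def uninstall(self, handle):
        # Loosely models UnhookWindowsHookEx.
        self._procs.remove(handle)

    def dispatch(self, event):
        # Each procedure receives the event plus a call_next callable
        # (loosely models CallNextHookEx). The end of the chain returns
        # the event unchanged.
        def call_at(index, current):
            if index == len(self._procs):
                return current
            return self._procs[index](current, lambda e: call_at(index + 1, e))

        return call_at(0, event)


captured = []


def keylogger_proc(event, call_next):
    captured.append(event)   # record the keystroke
    return call_next(event)  # pass it on so the application still receives it


chain = HookChain()
handle = chain.install(keylogger_proc)
result = chain.dispatch("A")
```

After the dispatch, `captured` holds the event while the application still receives it unchanged, which is why a keylogger built this way is not visible to the user.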
The CreateRemoteThread function creates a thread in the virtual address space of an arbitrary process. It can be used to inject a custom DLL in the process memory of a remote process.
The approach consists of the following steps:
1. Call OpenProcess to get a handle to the target process. In the parameters, request all process access rights so that the local process is privileged enough to perform the write operations later. If the process cannot be opened with these rights, there is no point in proceeding, because the later steps will fail.
2. Get the address of Kernel32.LoadLibraryA using GetProcAddress. The reason for needing this address becomes clear in step 5.
3. Allocate memory inside the target process’s address space using VirtualAllocEx. The region must be large enough to store the full path string of the DLL to be injected.
4. Write the argument for LoadLibrary, the full path string of the DLL, into the newly allocated memory using WriteProcessMemory. The string has to be written into the target process’s memory because the target cannot follow a pointer into the memory of a different process.
5. Finally, call CreateRemoteThread with the address of LoadLibrary as the start routine and the address of the DLL path string as its argument. This results in a call to LoadLibrary inside the target process, which loads our DLL. Note that this works only because LoadLibrary takes a single argument: a thread start routine receives exactly one parameter, so only single-argument functions can be called this way through CreateRemoteThread.
This program would implement all the above mentioned steps.
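Separately from any real implementation, the data flow of the five steps can be modeled in Python. This is a toy sketch under stated assumptions, not injection code: ToyProcess and inject are hypothetical names, no Windows API is called, and steps 1 and 2 (obtaining a handle and the LoadLibraryA address) are implicit in holding a reference to the object and its method.

```python
class ToyProcess:
    """Toy model of a process: private memory plus its own module list."""

    def __init__(self, name):
        self.name = name
        self.memory = {}           # address -> bytes, private to this process
        self.loaded_modules = []
        self._next_addr = 0x10000

    def alloc(self, size):
        # Loosely models VirtualAllocEx: reserve a region inside this process.
        addr = self._next_addr
        self.memory[addr] = bytes(size)
        self._next_addr += 0x1000 * ((size + 0xFFF) // 0x1000)
        return addr

    def write(self, addr, data):
        # Loosely models WriteProcessMemory.
        if addr not in self.memory or len(data) > len(self.memory[addr]):
            raise MemoryError("write outside allocated region")
        self.memory[addr] = data.ljust(len(self.memory[addr]), b"\x00")

    def load_library(self, path_addr):
        # Loosely models LoadLibraryA running inside the target: the pointer
        # is dereferenced in the target's own memory, not the injector's.
        raw = self.memory[path_addr]
        path = raw.split(b"\x00", 1)[0].decode("ascii")
        self.loaded_modules.append(path)
        return len(self.loaded_modules)

    def create_thread(self, start_routine, argument):
        # Loosely models CreateRemoteThread: the start routine receives
        # exactly one argument.
        return start_routine(argument)


def inject(target, dll_path):
    encoded = dll_path.encode("ascii") + b"\x00"
    remote_addr = target.alloc(len(encoded))                        # step 3
    target.write(remote_addr, encoded)                              # step 4
    return target.create_thread(target.load_library, remote_addr)  # step 5
```

The model makes the reason for step 4 visible: `load_library` looks the address up in the target's own `memory`, so a pointer that is valid only in the injector's memory would not resolve.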
From Windows 7 onwards, session separation limits the CreateRemoteThread injection method. It ensures that core system processes, including services, always run in session 0, while user processes run in other sessions. However, the NtCreateThreadEx API works around this: it allows a process with sufficient privileges to inject a DLL into another process regardless of the session in which that process is running. Refer to Nagareshwar’s article for more on NtCreateThreadEx.
The Import Address Table (IAT) is an array of function pointers, filled in by the PE loader during process initialization, with one entry for each function the executable imports from other DLLs. IAT hooking replaces the address of a particular imported function in the IAT with the address of a hook function. Before performing IAT hooking, we must make sure the hook function is already in the target process’s address space, using any of the DLL injection methods. IAT hooking is of no use if the target program performs run-time dynamic linking through the LoadLibrary and GetProcAddress APIs to get the real address of each DLL function. The only way around this is to hook GetProcAddress itself, which is a much tougher job.
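A toy Python model can show both the hook and its blind spot. Nothing here parses a real PE file; the names (ToyModule, install_iat_hook, hide_secret) are hypothetical, and the dictionary merely stands in for the table of import slots.

```python
def real_create_file(path):
    return "opened " + path


# Stand-in for the export table of the real DLL.
EXPORTS = {("kernel32.dll", "CreateFileA"): real_create_file}


def get_proc_address(dll, name):
    # Models run-time dynamic linking: the exporting module is asked directly.
    return EXPORTS[(dll, name)]


class ToyModule:
    def __init__(self):
        # Models the loader filling in the import slots at process start.
        self.iat = dict(EXPORTS)

    def call_import(self, dll, name, *args):
        # Calls made through the import table use whatever the slot holds.
        return self.iat[(dll, name)](*args)


def install_iat_hook(module, dll, name, hook):
    # Overwrite one slot; the hook receives the original so it can forward.
    original = module.iat[(dll, name)]
    module.iat[(dll, name)] = lambda *args: hook(original, *args)
    return original


def hide_secret(original, path):
    if "secret" in path:
        return "file not found"
    return original(path)
```

After `install_iat_hook(module, "kernel32.dll", "CreateFileA", hide_secret)`, calls through `call_import` are filtered, but a caller that resolves the function with `get_proc_address` still reaches the unhooked routine, which is the limitation described above.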
Inline hooking is seen more often in userland processes than in kernel mode. Typically, an inline hook is implemented by overwriting the beginning of the target function with an unconditional jump to a detour function. The detour function calls a trampoline function, which contains the overwritten bytes of the original target function and then jumps back into the target function just past those bytes. The target function returns to the detour function, which finally gives control back to the source function. The diagram below makes this flow clearer.
Inline hooking is easy on XP because the prologue of a typical API function on XP is 5 bytes long, and a near jump instruction also requires 5 bytes (1 byte for the JMP opcode and 4 bytes for the relative address).
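The 5-byte jump can be worked out concretely. The sketch below encodes an x86 near jump (opcode 0xE9 followed by a 32-bit displacement measured from the end of the instruction); the function names and the example addresses are illustrative assumptions, and the detection helper is only a crude heuristic.

```python
import struct

JMP_REL32_OPCODE = 0xE9
JMP_REL32_SIZE = 5  # 1 opcode byte + 4 displacement bytes


def encode_jmp_rel32(hook_site, detour):
    """Build the 5 bytes that, placed at hook_site, jump to detour."""
    # The displacement is relative to the address of the next instruction.
    displacement = detour - (hook_site + JMP_REL32_SIZE)
    if not -2**31 <= displacement < 2**31:
        raise ValueError("detour is out of rel32 range")
    return bytes([JMP_REL32_OPCODE]) + struct.pack("<i", displacement)


def looks_inline_hooked(first_bytes):
    """Crude check: does the function now start with a near JMP?"""
    return len(first_bytes) >= JMP_REL32_SIZE and first_bytes[0] == JMP_REL32_OPCODE
```

For a hypothetical function at 0x1000 and a detour at 0x2000, the displacement is 0x2000 - 0x1005 = 0xFFB, so the patch bytes are `e9 fb 0f 00 00`. An unpatched prologue such as `8b ff 55 8b ec` (mov edi, edi; push ebp; mov ebp, esp) does not trigger the check.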
The System Service Dispatch Table (SSDT) is an array located in kernel space that stores function pointers to kernel routines. Each routine is identified by a syscall (service) number; a userland process supplies that number, and the kernel uses it as an index into the SSDT to find the actual address. To hook a syscall, we therefore have to replace its address in the SSDT with the address of our own function.
The SSDT uses a structure called the System Service Table (SST). In the structure below, ServiceTable is the pointer to our SSDT array.
The SSDT is accessed through the KeServiceDescriptorTable variable. This is the main SSDT, and it stores function pointers to kernel routines present in ntoskrnl.exe. Similarly, the KeServiceDescriptorTableShadow variable holds two SSDT arrays: the first is a copy of the main array, and the second stores function pointers to kernel routines present in the Win32k.sys kernel-mode driver. Every thread holds a service table pointer in its Thread Control Block. The SSDT and the Shadow SSDT can be viewed in WinDbg using the “dps KiServiceTable” and “dps Win32k!W32pServiceTable” commands respectively, which give a long list of all the APIs from ntoskrnl and win32k. Finding out whether the SSDT is hooked is then simple: if any function pointer in the list points to an address outside the kernel address range, the SSDT is hooked.
To understand how SSDT hooking is implemented in malware in practice, go through this program from rohitab.com. In that implementation, write protection is first disabled by modifying control register CR0 so that the SSDT entries can be changed. The service number of the API to hook is then obtained using GetServiceNumber, and this number is used to calculate the address of the corresponding function pointer. Finally, the kernel address stored there is replaced with the address of our hooking function.
PatchGuard (or Kernel Patch Protection) was created for 64-bit versions of Windows to prevent patching of the kernel. This makes SSDT hooking impossible unless PatchGuard is disabled by some external tool. The SSDT structure and format were also changed slightly to further complicate hooking.
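The range check described above can be sketched in a few lines of Python. It assumes the table entries and the owning image's base and size have already been collected (for example, copied out of a debugger session); the function name and all values in the example are made up.

```python
def find_ssdt_hooks(service_table, image_base, image_size):
    """Return (index, pointer) pairs whose target lies outside the owning image."""
    image_end = image_base + image_size
    return [
        (index, pointer)
        for index, pointer in enumerate(service_table)
        if not image_base <= pointer < image_end
    ]


# Hypothetical values: an image mapped at 0x80400000 with size 0x200000,
# and a table whose entry 2 points into some other module.
example_table = [0x80401000, 0x80455AA0, 0xF8A01230, 0x805FF000]
suspicious = find_ssdt_hooks(example_table, 0x80400000, 0x200000)
```

In this made-up table only index 2 is reported. A pointer inside the expected range is not proof of a clean entry, since the routine it points to can itself be inline hooked.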
PatchGuard protects the OS in the following ways:
- protects system modules (NTOS, NDIS, HAL)
- protects System Service Dispatch Table
- protects Global Descriptor Table
- protects Interrupt Descriptor Table
- prevents the use of kernel stacks that are not allocated by the kernel
- prevents patch of any part of the kernel
However, G Data’s red paper on the recently exposed Uroburos rootkit describes how it bypassed the PatchGuard mechanism. PatchGuard uses a function named KeBugCheckEx to deliberately crash Windows when it detects this kind of kernel hooking activity (or several other suspect activities). So, naturally, Uroburos hooks KeBugCheckEx to hide its other activities. It also turns off the driver signing policy by exploiting a known vulnerability in a legitimate driver, which allows the rootkit to load its own driver for hooking.
An I/O Request Packet (IRP) is the basic I/O manager structure used to communicate with drivers and to allow drivers to communicate with each other. Each driver in Windows creates a number of devices that are responsible for handling IRPs of varying types, depending on the underlying system. When a new driver is loaded for a particular device, its DriverEntry routine is called to initialize the driver. The driver creates a Device Object for each physical, logical, or virtual device for which it handles I/O requests.
At the same time, the I/O manager creates a Driver Object and passes a pointer to it to the DriverEntry routine. DriverEntry is expected to fill in the DispatchXxx entry points in the Driver Object with the addresses of the driver’s standard routines. This is done because only the driver knows the addresses of its own routines.
When user-mode applications want to communicate with device drivers and file system drivers, they issue a call through the DeviceIoControl API. On receiving the call, the I/O manager, which is part of the kernel Executive, creates an I/O Request Packet (IRP) and delivers it to the driver concerned. IRPs are also created when a high-level driver wants to communicate with a lower-level driver. The function code in an IRP denotes which driver routine is to be called. For example, the IRP_MJ_READ function code maps to the address of the DispatchRead routine in the Driver Object. IRP hooking is performed by modifying the addresses of the driver’s routines in the Driver Object, so that when an IRP for a particular operation is sent, the hooked routine gets executed instead.
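The dispatch table mechanism can be modeled with a short Python sketch. It is a toy under stated assumptions: the IRP_MJ_* numbers match the values in the Windows driver headers, but ToyDriverObject, io_call_driver, hook_irp and the dictionary used as an IRP are hypothetical simplifications of the real structures.

```python
IRP_MJ_CREATE = 0x00
IRP_MJ_READ = 0x03
IRP_MJ_WRITE = 0x04
IRP_MJ_DEVICE_CONTROL = 0x0E
IRP_MJ_MAXIMUM_FUNCTION = 0x1B


def invalid_request(irp):
    # Default handler for major functions the driver does not implement.
    return "STATUS_INVALID_DEVICE_REQUEST"


class ToyDriverObject:
    def __init__(self):
        # One dispatch slot per major function code.
        self.major_function = [invalid_request] * (IRP_MJ_MAXIMUM_FUNCTION + 1)


def dispatch_read(irp):
    return "data from " + irp["file"]


def driver_entry(driver_object):
    # Models DriverEntry filling in the dispatch entry points.
    driver_object.major_function[IRP_MJ_READ] = dispatch_read


def io_call_driver(driver_object, irp):
    # Models the I/O manager routing an IRP by its major function code.
    return driver_object.major_function[irp["major"]](irp)


def hook_irp(driver_object, major, make_hook):
    # Swap one dispatch slot; the hook keeps the original for forwarding.
    original = driver_object.major_function[major]
    driver_object.major_function[major] = make_hook(original)
    return original


def hiding_read(original):
    def hooked(irp):
        if irp["file"].startswith("rootkit"):
            return "STATUS_NO_SUCH_FILE"
        return original(irp)
    return hooked
```

Once `hook_irp(driver, IRP_MJ_READ, hiding_read)` has run, read requests for files whose names start with `rootkit` fail while all other reads are forwarded to the original routine. Comparing each slot against the address range of the owning driver is the matching detection idea.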
The Interrupt Descriptor Table (IDT) contains pointers to Interrupt Service Routines (ISRs); its base address and limit are held in the IDT register (IDTR). IDT hooking, as the name suggests, modifies IDT entries so that the hook function executes each time the corresponding interrupt is received. Because each processor has its own IDT register, we must make sure that the IDT entry we want to hook points to the same hooked ISR on all processor cores, or else the hook will execute only some of the time. The IDT register is accessed with the SIDT (Store IDT) and LIDT (Load IDT) instructions: SIDT reads the current contents of the IDTR, and LIDT, a privileged instruction, loads a new value into it. A sample program that performs IDT hooking can be found here.
Global Descriptor Table (GDT) hooks are similar to IDT hooks. The SGDT and LGDT instructions read and load the GDT register, respectively. These descriptor structures are protected by Kernel Patch Protection, as described earlier.
System calls give userland processes a way to request services from the kernel. The SYSENTER instruction (and the equivalent SYSCALL on AMD) enables fast entry to the kernel, avoiding interrupt overhead. SYSENTER is faster than the older INT 0x2e mechanism because it takes its target directly from Model Specific Registers (MSRs) such as SYSENTER_EIP, SYSENTER_ESP and SYSENTER_CS. For more on SYSENTER, including the significance of each MSR and how they are used to fetch the addresses, this FireEye blog is a good reference. One important point is that SYSENTER is executed in Ntdll.dll and jumps to the value held in the SYSENTER_EIP register, also known as MSR 176h. For SYSENTER hooking, therefore, we have to modify the SYSENTER_EIP register. MSRs are modified using the “wrmsr” instruction. The easiest way to undo a SYSENTER hook is to write the original value back to the register; however, because KiFastCallEntry is not exported by ntoskrnl, obtaining that address can be tricky.
Being aware of API hooking techniques helps us understand how malware enters a system and hides its activities from the user. It also gives a fair idea of what symptoms to look for, and where to look, when checking an OS for possible malware.
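To make the per-core requirement concrete, the sketch below decodes the handler address from an 8-byte 32-bit x86 gate descriptor and checks that every core's entry agrees. It works on raw bytes that are assumed to have been dumped already; the function names and the example addresses are hypothetical.

```python
import struct


def parse_idt_gate32(entry):
    """Decode an 8-byte 32-bit gate: the handler address is split in two halves."""
    if len(entry) != 8:
        raise ValueError("a 32-bit gate descriptor is 8 bytes long")
    offset_low, selector, _reserved, type_attr, offset_high = struct.unpack(
        "<HHBBH", entry
    )
    return {
        "handler": (offset_high << 16) | offset_low,
        "selector": selector,
        "present": bool(type_attr & 0x80),
    }


def build_idt_gate32(handler, selector, type_attr=0x8E):
    """Inverse of parse_idt_gate32 (0x8E: present, ring 0, 32-bit interrupt gate)."""
    return struct.pack(
        "<HHBBH", handler & 0xFFFF, selector, 0, type_attr, handler >> 16
    )


def handlers_consistent(per_cpu_entries):
    """True if the same vector points to the same handler on every core."""
    return len({parse_idt_gate32(e)["handler"] for e in per_cpu_entries}) == 1


# Hypothetical dump of one vector on two cores, where only core 0 was patched.
core0 = build_idt_gate32(0xF8A01230, 0x08)
core1 = build_idt_gate32(0x80541234, 0x08)
mismatch = not handlers_consistent([core0, core1])
```

A mismatch between cores for the same vector, as in this made-up dump, is both the failure mode of an incomplete hook and a cheap indicator for a detector.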
2. The Rootkit Arsenal, Second Edition