Overview

We have a Malpedia Yara rule for the targeted campaign dubbed "SoulSearcher", detailed in a Fortinet blog, that is matching on thousands of samples!

These samples seem to be very similar (same metadata etc.) but have different hashes. Our assumption is that these samples are part of some self-propagating worm, but we need to figure out...

1) Is this worm-set of samples related to SoulSearcher?
2) What is going on with the worm-set of samples? Are they actually a worm or something else?

Malpedia Yara Rule

rule win_soul_auto {

    meta:
        author = "Felix Bilstein - yara-signator at cocacoding dot com"
        date = "2023-01-25"
        version = "1"
        description = "Detects win.soul."
        info = "autogenerated rule brought to you by yara-signator"
        tool = "yara-signator v0.6.0"
        signator_config = "callsandjumps;datarefs;binvalue"
        malpedia_reference = "https://malpedia.caad.fkie.fraunhofer.de/details/win.soul"
        malpedia_rule_date = "20230124"
        malpedia_hash = "2ee0eebba83dce3d019a90519f2f972c0fcf9686"
        malpedia_version = "20230125"
        malpedia_license = "CC BY-SA 4.0"
        malpedia_sharing = "TLP:WHITE"

    /* DISCLAIMER
     * The strings used in this rule have been automatically selected from the
     * disassembly of memory dumps and unpacked files, using YARA-Signator.
     * The code and documentation is published here:
     * https://github.com/fxb-cocacoding/yara-signator
     * As Malpedia is used as data source, please note that for a given
     * number of families, only single samples are documented.
     * This likely impacts the degree of generalization these rules will offer.
     * Take the described generation method also into consideration when you
     * apply the rules in your use cases and assign them confidence levels.
     */


    strings:
        $sequence_0 = { 732f 8b55fc 3b5510 7327 8b55b0 8b5dec }
            // n = 6, score = 400
            //   732f                 | mov                 ecx, dword ptr [ebp - 0x18]
            //   8b55fc               | mov                 dword ptr [ebp - 0x2c], ecx
            //   3b5510               | mov                 ecx, dword ptr [ebp - 0x3c]
            //   7327                 | mov                 dword ptr [ebp - 0x30], edx
            //   8b55b0               | add                 ebx, edi
            //   8b5dec               | mov                 edi, dword ptr [ebp - 0x44]

        $sequence_1 = { c1e90b 0faf4df0 3bf1 7306 }
            // n = 4, score = 400
            //   c1e90b               | mov                 eax, dword ptr [ebp + 8]
            //   0faf4df0             | cmp                 eax, dword ptr [ebp - 4]
            //   3bf1                 | jae                 0x202
            //   7306                 | movzx               ecx, byte ptr [eax]

        $sequence_2 = { d3e2 8515???????? 7405 e8???????? }
            // n = 4, score = 400
            //   d3e2                 | shl                 edx, cl
            //   8515????????         |                     
            //   7405                 | je                  7
            //   e8????????           |                     

        $sequence_3 = { 81c7680a0000 81fa00000001 7318 3b45fc }
            // n = 4, score = 400
            //   81c7680a0000         | mov                 word ptr [edi], bx
            //   81fa00000001         | add                 ecx, ecx
            //   7318                 | not                 edx
            //   3b45fc               | jmp                 0x21

        $sequence_4 = { 03d2 668939 8d7cd104 c745f800000000 c745e408000000 e9???????? 2bc7 }
            // n = 7, score = 400
            //   03d2                 | dec                 eax
            //   668939               | lea                 ecx, [ebp - 0x10]
            //   8d7cd104             | je                  0x5e
            //   c745f800000000       | jne                 0x55
            //   c745e408000000       | dec                 eax
            //   e9????????           |                     
            //   2bc7                 | lea                 ecx, [0x1b354]

        $sequence_5 = { ff25???????? ff25???????? 48895c2408 4889742410 }
            // n = 4, score = 400
            //   ff25????????         |                     
            //   ff25????????         |                     
            //   48895c2408           | dec                 eax
            //   4889742410           | mov                 dword ptr [esp + 8], ebx

        $sequence_6 = { 8b5dec 8b7d08 e9???????? 5f 5e b801000000 5b }
            // n = 7, score = 400
            //   8b5dec               | add                 ecx, edx
            //   8b7d08               | mov                 edx, ecx
            //   e9????????           |                     
            //   5f                   | mov                 ecx, dword ptr [ebp - 8]
            //   5e                   | mov                 word ptr [ecx + edi*2], dx
            //   b801000000           | cmp                 eax, dword ptr [ebp - 4]
            //   5b                   | jae                 0x1ff

        $sequence_7 = { 5d c3 57 eb05 }
            // n = 4, score = 400
            //   5d                   | movzx               ecx, byte ptr [eax]
            //   c3                   | shl                 esi, 8
            //   57                   | mov                 dword ptr [ebp - 0x20], eax
            //   eb05                 | mov                 eax, dword ptr [edi + 0x40]

        $sequence_8 = { b801000000 e9???????? 8b4f24 8b5f38 3bcb 7305 8b4728 }
            // n = 7, score = 400
            //   b801000000           | lea                 edi, [edi + edi + 1]
            //   e9????????           |                     
            //   8b4f24               | cmp                 edi, 0x40
            //   8b5f38               | jb                  0xffffffc1
            //   3bcb                 | sub                 edi, 0x40
            //   7305                 | mov                 dword ptr [ebp + 8], eax
            //   8b4728               | cmp                 edi, 4

        $sequence_9 = { 83c10b 894dec e9???????? 2bc7 2bf7 8bfa }
            // n = 6, score = 400
            //   83c10b               | mov                 edi, eax
            //   894dec               | mov                 dword ptr [ebp - 8], edx
            //   e9????????           |                     
            //   2bc7                 | movzx               edx, word ptr [edx]
            //   2bf7                 | jae                 0xad
            //   8bfa                 | cmp                 dword ptr [ebp - 0x14], 0x13

        $sequence_10 = { e8???????? e9???????? 48c7458007000000 4c89742478 664489742468 33c0 4883c9ff }
            // n = 7, score = 200
            //   e8????????           |                     
            //   e9????????           |                     
            //   48c7458007000000     | cmp                 edx, dword ptr [ebp + 0x10]
            //   4c89742478           | jae                 0x2f
            //   664489742468         | mov                 edx, dword ptr [ebp - 0x50]
            //   33c0                 | mov                 ebx, dword ptr [ebp - 0x14]
            //   4883c9ff             | add                 edi, 0xa68

        $sequence_11 = { 488d4dd0 e8???????? ff15???????? 448bc0 488d153e250200 488d4dd0 ff15???????? }
            // n = 7, score = 200
            //   488d4dd0             | test                ecx, ecx
            //   e8????????           |                     
            //   ff15????????         |                     
            //   448bc0               | dec                 eax
            //   488d153e250200       | mov                 dword ptr [esp + 8], ebx
            //   488d4dd0             | dec                 eax
            //   ff15????????         |                     

        $sequence_12 = { 745c 833d????????01 7553 488d0d54b30100 e8???????? 33ff }
            // n = 6, score = 200
            //   745c                 | sub                 esp, 0x30
            //   833d????????01       |                     
            //   7553                 | dec                 eax
            //   488d0d54b30100       | mov                 ebx, ecx
            //   e8????????           |                     
            //   33ff                 | dec                 eax

        $sequence_13 = { 4533c0 488d55ef 488d4da7 e8???????? }
            // n = 4, score = 200
            //   4533c0               | jae                 0xe
            //   488d55ef             | jae                 0x31
            //   488d4da7             | mov                 edx, dword ptr [ebp - 4]
            //   e8????????           |                     

        $sequence_14 = { 4533c0 488d55c8 488d4df0 e8???????? 488d542448 488d4df0 e8???????? }
            // n = 7, score = 200
            //   4533c0               | mov                 dword ptr [esp + 8], ebx
            //   488d55c8             | dec                 eax
            //   488d4df0             | mov                 dword ptr [esp + 0x10], esi
            //   e8????????           |                     
            //   488d542448           | push                edi
            //   488d4df0             | dec                 eax
            //   e8????????           |                     

        $sequence_15 = { 498bcc e8???????? 85c0 0f84e9000000 488d15ca0f0200 488bcb e8???????? }
            // n = 7, score = 200
            //   498bcc               | dec                 eax
            //   e8????????           |                     
            //   85c0                 | mov                 dword ptr [esp + 8], ebx
            //   0f84e9000000         | dec                 eax
            //   488d15ca0f0200       | mov                 dword ptr [esp + 0x10], esi
            //   488bcb               | push                edi
            //   e8????????           |                     

        $sequence_16 = { 4883ec20 4d8b21 4533ff 4d8bf1 }
            // n = 4, score = 200
            //   4883ec20             | push                edi
            //   4d8b21               | jmp                 9
            //   4533ff               | mov                 dword ptr [ebp - 0x1c], ecx
            //   4d8bf1               | jmp                 0x25

        $sequence_17 = { 66f2af 6689442420 48f7d1 4c8d41ff 488d4c2420 e8???????? 488d4c2420 }
            // n = 7, score = 200
            //   66f2af               | sub                 eax, edi
            //   6689442420           | sub                 esi, edi
            //   48f7d1               | mov                 edi, edx
            //   4c8d41ff             | shr                 edi, 5
            //   488d4c2420           | mov                 edx, dword ptr [ebp - 8]
            //   e8????????           |                     
            //   488d4c2420           | mov                 word ptr [edx], cx

        $sequence_18 = { 03d0 8d0452 488b542418 c1e008 4d8d1442 }
            // n = 5, score = 200
            //   03d0                 | sub                 edx, edi
            //   8d0452               | mov                 word ptr [ecx + 0x646], dx
            //   488b542418           | add                 edi, edx
            //   c1e008               | mov                 word ptr [ecx + 2], di
            //   4d8d1442             | mov                 edx, 2

        $sequence_19 = { ffe1 418b5610 85d2 7509 45895e08 e9???????? 83ff10 }
            // n = 7, score = 200
            //   ffe1                 | mov                 dword ptr [esp + 0x10], esi
            //   418b5610             | push                edi
            //   85d2                 | dec                 eax
            //   7509                 | sub                 esp, 0x30
            //   45895e08             | dec                 eax
            //   e9????????           |                     
            //   83ff10               | mov                 dword ptr [esp + 8], ebx

        $sequence_20 = { e8???????? 4c8d1d111f0100 4c895c2428 488d154d6d0100 488d4c2428 }
            // n = 5, score = 200
            //   e8????????           |                     
            //   4c8d1d111f0100       | jmp                 0x1f
            //   4c895c2428           | shr                 ecx, 0xb
            //   488d154d6d0100       | imul                ecx, dword ptr [ebp - 0x10]
            //   488d4c2428           | cmp                 esi, ecx

        $sequence_21 = { 488d0587fdffff 48894740 488d058cfdffff 48894748 8b442424 4c896770 894768 }
            // n = 7, score = 200
            //   488d0587fdffff       | mov                 ecx, dword ptr [ebp - 0x20]
            //   48894740             | movzx               edx, word ptr [ecx + ebx*2 + 0x180]
            //   488d058cfdffff       | cmp                 eax, 0x1000000
            //   48894748             | sub                 eax, edi
            //   8b442424             | sub                 esi, edi
            //   4c896770             | mov                 edi, edx
            //   894768               | shr                 edi, 5

        $sequence_22 = { 4885c0 746a 4c8bc0 0fb7d3 8bce e8???????? }
            // n = 6, score = 200
            //   4885c0               | dec                 eax
            //   746a                 | sub                 esp, 0x30
            //   4c8bc0               | dec                 eax
            //   0fb7d3               | mov                 ebx, ecx
            //   8bce                 | dec                 eax
            //   e8????????           |                     

    condition:
        7 of them and filesize < 1400832
}

SoulSearcher Samples

  • 579fa00bc212a3784d523f8ddd0cfc118f51ca926d8f7ea2eb6e27157ec61260
  • 69a9ab243011f95b0a1611f7d3c333eb32aee45e74613a6cddf7bcb19f51c8ab
  • 0f7af0cad4aade0e7058051a449059b35358ddda075d88b2d289625adc02deef
  • 1af5252cadbe8cef16b4d73d4c4886ee9cecddd3625e28a59b59773f5a2a9f7f
  • a6f75af45c331a3fac8d2ce010969f4954e8480cbe9f9ea19ce3c51c44d17e98
  • c4efb58723fd75d51eb92302fbd7541e4462f438282582b5efa3c6c7685e69fd

Comparison

  • The SoulSearcher samples are either DLLs or 64-bit; this does not match our worm-set of 32-bit exe files
  • The metadata of the SoulSearcher samples sometimes spoofs Microsoft files, but none of them spoof Word like our worm-set
  • There are three functions in common between the worm-set and the SoulSearcher sample 0F7AF0CAD4AADE0E7058051A449059B35358DDDA075D88B2D289625ADC02DEEF. These appear to be part of a compression algorithm, but it's uncertain (maybe just an open source statically compiled lib, or... same dev!)
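
The first comparison point (DLL or 64-bit versus 32-bit exe) can be read straight from the PE file header. The following is a minimal sketch using only the standard library; the function name `pe_triage` and the returned dictionary keys are illustrative assumptions, not part of any existing tool.

```python
import struct
from datetime import datetime, timezone

# Constants from the PE/COFF file header definition.
IMAGE_FILE_MACHINE_I386 = 0x014C
IMAGE_FILE_MACHINE_AMD64 = 0x8664
IMAGE_FILE_DLL = 0x2000

def pe_triage(data: bytes) -> dict:
    """Return bitness, DLL flag and compile time from raw PE bytes.

    Hypothetical helper for illustration only.
    """
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset of the PE signature) lives at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF file header follows the 4-byte signature.
    machine, _nsect, timestamp, _symptr, _nsyms, _optsize, characteristics = \
        struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    bits = {IMAGE_FILE_MACHINE_I386: 32, IMAGE_FILE_MACHINE_AMD64: 64}.get(machine)
    return {
        "bits": bits,  # None for other machine types
        "is_dll": bool(characteristics & IMAGE_FILE_DLL),
        "compile_time": datetime.fromtimestamp(timestamp, tz=timezone.utc),
    }

# Usage (path is a placeholder):
# with open("sample.bin", "rb") as fh:
#     print(pe_triage(fh.read()))
```

Running this over both sets gives a quick table of bitness, DLL flag and compile time to compare side by side. The compile time is only as trustworthy as the header itself, which the author of a sample can set freely.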

Question 1 - Is The Worm-Set Part of SoulSearcher?

It appears as though there is a shared library between the worm-set and the SoulSearcher set but there is no other overlap. We can say with medium confidence that these samples are not related and that the overlap is highly likely a shared statically compiled library.

Based on this we should update the Malpedia rule: designate the shared library code sigs as "shared" and exclude them from counting toward a full match of the yara rule.
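
The proposed change can be sketched as matching logic in which hits on "shared" signatures are never enough on their own. This is only an illustration of the idea, not the updated rule: the membership of `SHARED_LIBRARY_SIGS` is an assumption (it is simply the set of sequences that hit on the worm-set), and `family_match` and `min_unique` are hypothetical names.

```python
# Assumed "shared" set: the seven sequences that matched the worm-set.
# Whether all of them really belong to the shared library is unverified.
SHARED_LIBRARY_SIGS = frozenset({
    "sequence_0", "sequence_1", "sequence_3", "sequence_4",
    "sequence_6", "sequence_8", "sequence_9",
})

def family_match(matched, shared=SHARED_LIBRARY_SIGS, threshold=7, min_unique=1):
    """Return True only if enough strings hit AND some hit is family-specific.

    matched    -- iterable of string names that hit on a sample
    threshold  -- total hits required (the original rule uses 7)
    min_unique -- hypothetical extra requirement: hits outside the shared set
    """
    hits = set(matched)
    unique_hits = hits - shared
    return len(hits) >= threshold and len(unique_hits) >= min_unique

worm_hits = SHARED_LIBRARY_SIGS                              # only library code hit
other_hits = SHARED_LIBRARY_SIGS | {"sequence_10"}           # hypothetical sample

print(family_match(worm_hits))   # library-only overlap is rejected
print(family_match(other_hits))  # at least one family-specific hit is accepted
```

In YARA the same idea could be expressed by giving the two groups different name prefixes and counting each group separately in the condition; the exact split would need to be confirmed against the known SoulSearcher samples first.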

Worm-Set Samples

  • 7a9397dc47ce8a604c359da17dded029580b56cc6d988e841eaaabe622b23fa4
  • b3ba13bbd74e20a40d9c02c0398a474c93a352322211561f35d60cdc26739477
  • a36f318370a5ace5f9c4611c1f8edfffdb05dd8e109a3dd2127ca3881670d7bd
  • a803aa42fe614cafdb86da10d730cecbc8a2432eb7080fd004e640cc536b595c
  • fc2642273c70d99a172ce45f57fe02d7f65b0ecd2b29b882ce84d0f3e01c8197

Older sample that seems to be related? (2014)

  • 4824d76a4d15840e30bb0a3da01757c22502f0d9cfeac05e9219100c477beeee

Common Artifacts

Metadata

These samples all share many metadata artifacts.

LegalCopyright © 2006 Microsoft Corporation. All rights reserved.
InternalName WinWord
FileVersion 12.0.4518.1014
CompanyName Microsoft Corporation
LegalTrademarks1 Microsoft® is a registered trademark of Microsoft Corporation.
LegalTrademarks2 Windows® is a registered trademark of Microsoft Corporation.
ProductName 2007 Microsoft Office system
ProductVersion 12.0.4518.1014
FileDescription Microsoft Office Word
OriginalFilename WinWord.exe
charsetID 1252
Translation 0x0000 0x04e4
LangID 0x0000

Compile Time

Tue Nov 11 14:39:16 2003 UTC

Imports

  • KERNEL32.dll
  • ole32.dll
  • SHLWAPI.dll

Yara Match Strings

0x7a6d:$sequence_0: 73 2F 8B 55 FC 3B 55 10 73 27 8B 55 B0 8B 5D EC
0x80c8:$sequence_1: C1 E9 0B 0F AF 4D F0 3B F1 73 06
0x7f71:$sequence_3: 81 C7 68 0A 00 00 81 FA 00 00 00 01 73 18 3B 45 FC
0x7415:$sequence_4: 03 D2 66 89 39 8D 7C D1 04 C7 45 F8 00 00 00 00 C7 45 E4 08 00 00 00 E9 84 00 00 00 2B C7
0x7a7a:$sequence_6: 8B 5D EC 8B 7D 08 E9 0D F5 FF FF 5F 5E B8 01 00 00 00 5B
0x7d2c:$sequence_8: B8 01 00 00 00 E9 89 04 00 00 8B 4F 24 8B 5F 38 3B CB 73 05 8B 47 28
0x72c9:$sequence_9: 83 C1 0B 89 4D EC E9 90 07 00 00 2B C7 2B F7 8B FA
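
The match output above can be tallied against the rule's `7 of them` condition. The sketch below parses the lines as printed here (offset, string name, hex bytes); the regular expression and the `parse_matches` helper are assumptions about this particular output layout, not a general parser for yara output.

```python
import re

# Match output copied from the worm-set scan above.
MATCH_OUTPUT = """\
0x7a6d:$sequence_0: 73 2F 8B 55 FC 3B 55 10 73 27 8B 55 B0 8B 5D EC
0x80c8:$sequence_1: C1 E9 0B 0F AF 4D F0 3B F1 73 06
0x7f71:$sequence_3: 81 C7 68 0A 00 00 81 FA 00 00 00 01 73 18 3B 45 FC
0x7415:$sequence_4: 03 D2 66 89 39 8D 7C D1 04 C7 45 F8 00 00 00 00 C7 45 E4 08 00 00 00 E9 84 00 00 00 2B C7
0x7a7a:$sequence_6: 8B 5D EC 8B 7D 08 E9 0D F5 FF FF 5F 5E B8 01 00 00 00 5B
0x7d2c:$sequence_8: B8 01 00 00 00 E9 89 04 00 00 8B 4F 24 8B 5F 38 3B CB 73 05 8B 47 28
0x72c9:$sequence_9: 83 C1 0B 89 4D EC E9 90 07 00 00 2B C7 2B F7 8B FA
"""

LINE_RE = re.compile(r"^(0x[0-9a-fA-F]+):\$(\w+):\s*(.+)$")

def parse_matches(text):
    """Return {string_name: (file_offset, matched_bytes)} per output line."""
    matches = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        offset, name, hex_bytes = m.groups()
        matches[name] = (int(offset, 16), bytes.fromhex(hex_bytes))
    return matches

matches = parse_matches(MATCH_OUTPUT)
THRESHOLD = 7  # from the rule condition "7 of them"

offsets = [off for off, _ in matches.values()]
print("strings matched:", len(matches), "of threshold", THRESHOLD)
print("offset span: 0x%x - 0x%x" % (min(offsets), max(offsets)))
```

The seven hits only just reach the threshold, all of them come from the `$sequence_0` to `$sequence_9` group, and they sit within a few kilobytes of each other in the file. That is consistent with the overlap being one compact piece of code such as a statically linked library, although it does not prove it.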

Worm Differences (Polymorphic)

  • The DOS header at the start of the PE file contains what appears to be junk that changes between samples
  • The first .data section of the PE (0x42b000) contains data blobs that change between samples
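
Differences like these can be located by comparing two samples byte by byte and reporting the ranges that differ. The following is a minimal sketch; `differing_ranges` and its `merge_gap` parameter are hypothetical names, and the ranges it returns are raw file offsets, which still need to be mapped to sections and virtual addresses such as 0x42b000 by hand or with a PE parser.

```python
def differing_ranges(a: bytes, b: bytes, merge_gap: int = 16):
    """Return half-open (start, end) file-offset ranges where a and b differ.

    Differing bytes whose offsets are at most `merge_gap` apart are merged
    into one range so that a changed blob is reported once.
    """
    n = min(len(a), len(b))
    ranges = []
    start = None      # start of the range currently being built
    last_diff = None  # offset of the most recent differing byte
    for i in range(n):
        if a[i] != b[i]:
            if start is None:
                start = i
            elif i - last_diff > merge_gap:
                ranges.append((start, last_diff + 1))
                start = i
            last_diff = i
    if start is not None:
        ranges.append((start, last_diff + 1))
    # Any length difference is reported as one trailing range.
    if len(a) != len(b):
        ranges.append((n, max(len(a), len(b))))
    return ranges

# Usage (paths are placeholders):
# with open("sample_a.bin", "rb") as fa, open("sample_b.bin", "rb") as fb:
#     for start, end in differing_ranges(fa.read(), fb.read()):
#         print("0x%x-0x%x (%d bytes)" % (start, end, end - start))
```

Comparing several pairs rather than one helps separate regions that always change from regions that only happen to differ between two particular copies.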

Reverse engineering the program flow, we can see that if the binary is executed with no command line arguments it will create a polymorphic copy of itself in the %TEMP% directory and launch the copy. Because the copy is also launched with no arguments, it replicates the same behaviour, creating an endless loop that fills the %TEMP% directory with copies of the malware.

Question 2 - Where Are All These Samples Coming From?

Why Does VT have 500,000 copies of this malware?!

Because VT runs some submissions in a sandbox and collects the files dropped during the run for analysis (which in turn means running some of those in a sandbox), a loop is created in which this worm endlessly creates new copies of itself. This issue is not limited to VT; it can affect any intel feed that reprocesses its own samples.
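
One way a feed could break such a loop is to key dropped files on features that stay constant across the polymorphic copies instead of on the file hash. The sketch below is purely illustrative: `StableFeatures`, `ReprocessingGuard` and the choice of fields are assumptions based on the common artifacts listed above, not a description of how VT or any other feed works.

```python
from typing import NamedTuple, Tuple

class StableFeatures(NamedTuple):
    """Features observed to be identical across the worm-set copies."""
    original_filename: str
    file_version: str
    compile_time: str
    imports: Tuple[str, ...]

class ReprocessingGuard:
    """Hypothetical filter: detonate each stable fingerprint only once."""

    def __init__(self):
        self._seen = set()

    def should_detonate(self, features: StableFeatures) -> bool:
        if features in self._seen:
            return False  # same fingerprint already processed, skip
        self._seen.add(features)
        return True

# Two dropped copies with different hashes but the same stable features.
copy_a = StableFeatures("WinWord.exe", "12.0.4518.1014",
                        "Tue Nov 11 14:39:16 2003 UTC",
                        ("KERNEL32.dll", "ole32.dll", "SHLWAPI.dll"))
copy_b = StableFeatures(*copy_a)

guard = ReprocessingGuard()
print(guard.should_detonate(copy_a))  # first copy is processed
print(guard.should_detonate(copy_b))  # second copy is skipped
```

A filter this coarse would also suppress unrelated files that happen to share all four fields, so in practice it would need to be limited to files dropped by a sandbox run rather than applied to every submission.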