Is macOS under the biggest malware attack ever?

No. I just clickbaited you but don’t leave yet, keep reading for something fun!

A couple of days ago I found something curious on VirusTotal. There were more than 40 thousand binaries with the same size in a single day. That seemed very odd so I loaded two random binaries and compared their contents. The only difference was in the strings section.

VirusTotal detections were very low (two to three) and identified the samples as EvilQuest/ThiefQuest malware.

To prove that all the binaries were the same except for strings, I wrote a quick Mach-O stats utility in Go (yes, 2020 is this crazy!) to hash the code and strings sections separately. The hypothesis is that the code section would have the same hash for all the samples, and the strings section would have a unique hash for each sample. The output confirmed that this was indeed the case - same code, different strings.

Running this program against 206091 binaries totalling 34GB of data:

bash

Mach-O Stats
(c) 2020 Pedro Vilaca. All Rights Reserved
 100% |███████████████████████████████████████| (206091/206091, 1563 it/s) [2m11s:0s]
__text map
cd87dfd659fc2334ccc59093c1f41ba9abf4c88046d438ddd8bc2d82f55859d7 206091
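
The per-section hashing idea described above can be sketched with Go's standard `debug/macho` package. This is a minimal illustration and not the actual Mach-O Stats utility: the function names `hashBytes` and `sectionHash` are made up for this example, and it assumes thin (non-fat) Mach-O files passed as command line arguments.

```go
package main

import (
	"crypto/sha256"
	"debug/macho"
	"encoding/hex"
	"fmt"
	"os"
)

// hashBytes returns the hex-encoded SHA-256 digest of data.
func hashBytes(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// sectionHash opens a thin Mach-O file and hashes the raw bytes of the
// first section with the given name (for example "__text" or "__cstring").
func sectionHash(path, name string) (string, error) {
	f, err := macho.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	sec := f.Section(name)
	if sec == nil {
		return "", fmt.Errorf("%s: no %s section", path, name)
	}
	data, err := sec.Data()
	if err != nil {
		return "", err
	}
	return hashBytes(data), nil
}

func main() {
	names := []string{"__text", "__cstring"}
	// counts[section][digest] = number of files sharing that digest.
	counts := map[string]map[string]int{}
	for _, n := range names {
		counts[n] = map[string]int{}
	}
	for _, path := range os.Args[1:] {
		for _, n := range names {
			h, err := sectionHash(path, n)
			if err != nil {
				fmt.Fprintln(os.Stderr, err)
				continue
			}
			counts[n][h]++
		}
	}
	for _, n := range names {
		fmt.Println(n, "map")
		for h, c := range counts[n] {
			fmt.Println(h, c)
		}
	}
}
```

If the hypothesis holds, the `__text` map collapses to a single digest whose count equals the number of files, while the `__cstring` map has one entry per file.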

Given that the strings are encrypted/obfuscated, my first idea was that this could be a new version with mutated copies being distributed through different sources. That doesn’t make much sense given that the code was the same, but since EvilQuest has ransomware features, each sample could for example carry a different Bitcoin wallet.

Now it was time to load one of the samples into a disassembler and take a look at its contents. Assuming that the VirusTotal detections were correct even if too low, I grabbed the known sample of EvilQuest for comparison. That sample contains debugging symbols so it’s very easy to navigate, since most function names are explicit about their intent. The new sample fixed that mistake and had that information removed.

Before bringing out the heavy diffing guns such as BinDiff and Diaphora I like to take a look around to get a feel for what’s going on. In this case the code had differences but was very similar. I could see what were clearly obfuscated/encrypted strings like in the original sample. So, I tried to find those functions using the symbols from the first sample. That was fast and easy and confirmed that the code was related (either from the same author or someone reusing it - attribution is hard :P).

Scott Knight released a script to decrypt/encrypt the original sample’s strings, but it doesn’t work with the new samples. That makes sense given that there are keys and tables that could have changed, and also what appears to be a new type of obfuscated/encrypted string format.

000Bg{0000090nQ4XL1qPsnl1ZjpKX0lkFoa0000053

The new string type appears to always start with 000Bg{.

Learning a new programming language is easier when you have things to do with it, so I decided to write a decrypter/deobfuscator in Go. In hindsight it wasn’t a smart decision because it’s kind of ugly to deal with buffers in Go and much easier in C (or I don’t know yet the best way to do it in Go).

bash

$ ./evilquest_deobfuscator -s "000Bg{0000090nQ4XL1qPsnl1ZjpKX0lkFoa0000053"
EvilQuest String Deobfuscator
(c) 2020 Pedro Vilaca. All Rights Reserved
000Bg{0000090nQ4XL1qPsnl1ZjpKX0lkFoa0000053 -> rb+

Meanwhile, the next day there were again more than 40 thousand new samples with the same size. I confirmed again that the only difference was in the strings. While reversing and writing the strings decrypter I noticed that the hash of the sample I was using had changed. That generated a brain click and I went to bed thinking that this wasn’t a big malware campaign (very sad!), because that didn’t make sense with so many samples, but rather a VirusTotal issue: the VirusTotal sandbox had simply gotten trapped in an analysis loop. This idea was reinforced by the fact that the sample had been submitted from the ZZ country code, meaning unknown origin. Connecting these two observations convinced me that this was the right path.

After I finished the strings decrypter I could verify that my unique samples campaign hypothesis wasn’t valid. The strings were all the same, just encrypted/obfuscated with different keys.

So, the next step was to verify the code to see what was happening there. This was very easy to find since it’s the first thing the sample does.

At the entrypoint we can observe the mutation function being called first with argv[0] as its argument.

IDA

10001A8D0         public start
10001A8D0 start   proc near
(...)
10001A8D0         push    rbp
10001A8D1         mov     rbp, rsp
10001A8D4         sub     rsp, 2F0h
10001A8DB         mov     rax, cs:___stack_chk_guard_ptr
10001A8E2         mov     rax, [rax]
10001A8E5         mov     [rbp+var_8], rax
10001A8E9         mov     [rbp+var_94], 0
10001A8F3         mov     [rbp+var_98], edi
10001A8F9         mov     [rbp+var_A0], rsi
10001A900         mov     rax, [rbp+var_A0]
10001A907         mov     rdi, [rax]      ; argv[0]
10001A90A         call    fg_open_and_reencrypt_cstrings ; binary self modifies here
(...)

Next, the executable opens itself in rb+ mode (reading and writing). Funnily enough, there is a memory leak because the decrypted string buffer is heap-allocated in the decryptor function and never freed. One of the differences between this sample and the previous one is the increased usage of dynamically allocated memory, increasing the potential for memory leaks. There are a lot more memory leaks all over the code. Xcode Instruments has a nice leak detector (hint, hint).

IDA

10001A840 fg_open_and_reencrypt_cstrings proc near
10001A840                                         ; CODE XREF: start+3A↓p
10001A840
10001A840 var_24          = dword ptr -24h
10001A840 __filename      = qword ptr -20h
10001A840 FILE_pointer    = qword ptr -18h
10001A840 var_10          = qword ptr -10h
10001A840 var_4           = dword ptr -4
10001A840
10001A840         push    rbp
10001A841         mov     rbp, rsp
10001A844         sub     rsp, 30h
10001A848         mov     [rbp+var_10], rdi
10001A84C         mov     rdi, [rbp+var_10]
10001A850         lea     rax, a000bg0000090nq_18 ; "000Bg{0000090nQ4XL1qPsnl1ZjpKX0lkFoa000"...
10001A857         mov     [rbp+__filename], rdi
10001A85B         mov     rdi, rax
10001A85E         call    fg_decrypt_0000Bg_string ; decrypt/decode string
10001A863         mov     rdi, [rbp+__filename]
10001A867         mov     rsi, rax  ; "rb+"
10001A867                           ; memleak here since the returned ptr was calloc'ed
10001A86A         call    _fopen
10001A86F         mov     [rbp+FILE_pointer], rax
10001A873         cmp     [rbp+FILE_pointer], 0
10001A878         jz      loc_10001A890
10001A87E         mov     rdi, [rbp+FILE_pointer] ; FILE *
10001A882         call    _ftrylockfile
10001A887         cmp     eax, 0
10001A88A         jz      loc_10001A89C
10001A890
10001A890 loc_10001A890:            ; CODE XREF: fg_open_and_reencrypt_cstrings+38↑j
10001A890         mov     [rbp+var_4], 0FFFFFFFFh
10001A897         jmp     loc_10001A8C1
10001A89C ; ---------------------------------------------------------------------------
10001A89C
10001A89C loc_10001A89C:            ; CODE XREF: fg_open_and_reencrypt_cstrings+4A↑j
10001A89C         mov     rdi, [rbp+FILE_pointer] ; FILE* handle
10001A8A0         call    fg_reencrypt_cstrings
10001A8A5         mov     rdi, [rbp+FILE_pointer] ; FILE *
10001A8A9         call    _funlockfile
10001A8AE         mov     rdi, [rbp+FILE_pointer] ; FILE *
10001A8B2         call    _fclose
10001A8B7         mov     [rbp+var_4], 0
10001A8BE         mov     [rbp+var_24], eax
10001A8C1
10001A8C1 loc_10001A8C1:            ; CODE XREF: fg_open_and_reencrypt_cstrings+57↑j
10001A8C1         mov     eax, [rbp+var_4]
10001A8C4         add     rsp, 30h
10001A8C8         pop     rbp
10001A8C9         retn
10001A8C9 fg_open_and_reencrypt_cstrings endp

The fg_reencrypt_cstrings function in the previous listing is where the mutation occurs. The function finds the __cstring section and iterates over its contents, decrypting and re-encrypting each string and writing the result back to the binary. The binary on disk is already modified by the time fg_open_and_reencrypt_cstrings returns.

(...)
for ( j = 0; j < sg->nsects; ++j ) {
    v12 = (__int64)sub_100006580(a1, v17, 80LL);
    // obfuscated string is "__cstring"
    v2 = fg_decrypt_0000Bg_string("000Bg{00000H0nQ4XL1qPsnl3oBkir1CDCUq3Z{iy|22B2MZ0000073");
    if ( !strcmp((const char *)v12, v2) ) {
        v11 = (__int64)sub_100006580(a1, *(unsigned int *)(v12 + 48), *(_QWORD *)(v12 + 40));
        v10 = 0LL;
        v9 = 0LL;
        v8 = 0;
        fseek(a1, *(unsigned int *)(v12 + 48), 0);
        while ( (unsigned __int64)v8 < *(_QWORD *)(v12 + 40) ) {
            if ( *(_BYTE *)(v11 + v8) ) {
                ++v9;
            }
            else if ( v9 ) {
                v7 = (char *)calloc(1uLL, v9 + 1);
                __memcpy_chk(v7, v10 + v11, v9, -1LL);
                v6 = fg_decrypt_0000Bg_string(v7);
                __s = (char *)fg_encrypt_0000Bg_string(v6);
                if ( v7 != v6 ) {
                    v3 = strlen(__s);
                    if ( v3 == strlen(v7) ) {
                        fseek(a1, v10 + *(unsigned int *)(v12 + 48), 0);
                        fwrite(__s, 1uLL, v9, a1);
                        free(v6);
                    }
                }
                free(v7);
                free(__s);
                v10 += v9 + 1;
                v9 = 0LL;
            }
            else {
                ++v10;
            }
            ++v8;
        }
    }
    v17 += 80LL;
}
(...)
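
The walk in the decompiled loop above can be sketched in Go as follows. This is only an illustration under stated assumptions: `rewriteCStrings` and `placeholderReencode` are hypothetical names, and `placeholderReencode` is a stand-in that just reverses the payload after the 000Bg{ marker. It is not the malware's real cipher, which is not reproduced here. The point is the shape of the algorithm: split the section on NUL bytes and overwrite a string in place only when the replacement has exactly the same length, so nothing else in the file moves.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// marker is the prefix observed on the new string format.
const marker = "000Bg{"

// rewriteCStrings treats buf as a run of NUL-terminated strings and replaces
// each non-empty string with transform(s), but only when the result has the
// same length as the original. It returns the number of strings rewritten.
func rewriteCStrings(buf []byte, transform func(string) string) int {
	changed := 0
	start := 0
	for start < len(buf) {
		end := bytes.IndexByte(buf[start:], 0)
		if end < 0 {
			break // trailing bytes without a terminator are left alone
		}
		if end > 0 {
			old := string(buf[start : start+end])
			if repl := transform(old); len(repl) == len(old) && repl != old {
				copy(buf[start:start+end], repl)
				changed++
			}
		}
		start += end + 1 // skip the string and its NUL terminator
	}
	return changed
}

// placeholderReencode is a hypothetical stand-in for "decrypt, then encrypt
// with a fresh key". It reverses the payload after the marker so the output
// differs but keeps its length. Strings without the marker are untouched.
func placeholderReencode(s string) string {
	if !strings.HasPrefix(s, marker) {
		return s
	}
	payload := []byte(s[len(marker):])
	for i, j := 0, len(payload)-1; i < j; i, j = i+1, j-1 {
		payload[i], payload[j] = payload[j], payload[i]
	}
	return marker + string(payload)
}

func main() {
	section := []byte("000Bg{abc\x00" + "plain\x00" + "\x00" + "000Bg{xyz\x00")
	n := rewriteCStrings(section, placeholderReencode)
	fmt.Printf("rewrote %d strings: %q\n", n, section)
}
```

Because every rewritten string keeps its length, the section size and all string offsets stay the same. That is consistent with the observation that every mutated sample has the same file size and an identical __text section.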

At this point it was very clear that the sandbox loop was happening. The original sample submitted to VirusTotal mutated itself, generating a new sample that was also submitted to the sandbox because it was a “new” executable file (the sandbox sees the original as a dropper), and so on. This explains why there have been more than 40k new samples on most days since 5th September 2020.

Date	Total
2020-09-05	14033
2020-09-06	47782
2020-09-07	47887
2020-09-08	47849
2020-09-09	48540
2020-09-10	48819
2020-09-11	45681
2020-09-12	45263
2020-09-13	46797
2020-09-14	16437
2020-09-15	43440
2020-09-16	40616
2020-09-17	40165
2020-09-18	40382
2020-09-19	39901

It’s also easy to see in the VirusTotal graph feature, which shows the relationships between the samples. This is a simple example but you can draw graphs with more items that show this relationship.

[Figure: VirusTotal graph of the relationships between the samples]

Given all these assumptions it should be easy to find the patient zero sample.

As far as I can see this started on 2020-09-05. I looked up the earliest submission date for that day and there were two samples with the following hashes:

9efc7a1f373026a266a642b8417544b92de08e25b6bcdc12d7bfd44bb8993721
2f1fbd634ebac9079c29e1e659fe3e3f3fd7f3d0aefd4d513563d371b558d22c

Both have ookcucythguan as submission name, and were submitted from Germany via the web interface. A minute later we can observe other samples being analyzed for the first time.

2020-09-05 17:06:57 9efc7a1f373026a266a642b8417544b92de08e25b6bcdc12d7bfd44bb8993721 (P0)
2020-09-05 17:07:47 2f1fbd634ebac9079c29e1e659fe3e3f3fd7f3d0aefd4d513563d371b558d22c (P0)
2020-09-05 17:08:32 3499d8119db3bd9365a7b1d0b3f677cc9adc5efe9097234ac92e1aa915ef11b1
2020-09-05 17:08:33 a3f9a98d8a60c77666d4bf73b9ae2b72dafa32251813cba3a79a2aeb7511037c
2020-09-05 17:08:35 ade3e5d2bc094dd2835905aea82d58801a8fb53aa6449bd520d404fcfbc19e88
2020-09-05 17:08:36 ca984bcda781d11cc220d45bc01b2e34bd8349e83139f6a9fdd9dd55ddbad4fd
2020-09-05 17:08:38 d1a11b45b807a9f7da05db69ba1706a850064497376a4775cbb15b1d94b95588
2020-09-05 17:08:39 60db8eb741601aba3514e809775a0514a47a05117f50a93773e9c38ce868326d
2020-09-05 17:09:16 bd6e1b8ee1c01cb0326d01e026d9d9adcce64c1d021a97b18416680f2e774ca1
2020-09-05 17:09:18 5e8a0a3b6aeb4a37fc1d949ebe4846accd5842d4eb145d37fa7af41b0acbc70c
(...)

Let’s see if the code for those initial samples is identical:

bash

Mach-O Stats
(c) 2020 Pedro Vilaca. All Rights Reserved
 
__text map
cd87dfd659fc2334ccc59093c1f41ba9abf4c88046d438ddd8bc2d82f55859d7 10
__cstring map
c8bcd6734f292c094295c8902432e57bfd7e040e52b684966298789e79280a17 1
6e663fe4412847efc35eba032dd931ac47d68296a5498a2b32888ce721f52ae4 1
8feeda9ad8667378faa4f18c593941b25a9778f4dfb9d4c158af9dc960b0f3f8 1
5992aa7be51662a1aa4c9a8c817d914c8ae5e1af178ef440cb38d553f7ff626a 1
51245ddb85164d78d0e968a5b0ed9607b6ee54b11a6a43622018d7260fff1c95 1
f475383e966261ee28209a636bda87280640b34198adcb507d4518bad93e1728 1
112c82eaa856d4594d9c2e61020fb922e0f203fc0a7791fe2cc98f1a829f7bcd 1
6cdd2f844a19db4a9dafc95883a7bbc53b205637fc3a71b823ceda15c45c160e 1
f083c6df6305f979fd9228ea520997327fd3bda7dea7cbf98291235edb3c5683 1
3c444bd356c468001b1d2c79d4df2f9efd676302f4203b8342ecdb63334b0fa5 1

What this tells us is that the __text section is identical (same SHA256 for all 10 samples), while each __cstring section is unique (10 different hashes).

VirusTotal has information about the execution parents of processes submitted to the sandbox, for dropper analysis. Let’s check the execution parents of these samples:

First the patient zero samples:

{
  "sha256": "9efc7a1f373026a266a642b8417544b92de08e25b6bcdc12d7bfd44bb8993721",
  "submission_names": [
    "ookcucythguan"
  ],
  "execution_parents": null,
  "first_seen": "2020-09-05 17:06:57"
}

{
  "sha256": "2f1fbd634ebac9079c29e1e659fe3e3f3fd7f3d0aefd4d513563d371b558d22c",
  "submission_names": [
    "ookcucythguan"
  ],
  "execution_parents": null,
  "first_seen": "2020-09-05 17:07:47"
}

And next patient zero “children”:

{
  "sha256": "3499d8119db3bd9365a7b1d0b3f677cc9adc5efe9097234ac92e1aa915ef11b1",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmdd"
  ],
  "execution_parents": [
    "9efc7a1f373026a266a642b8417544b92de08e25b6bcdc12d7bfd44bb8993721"
  ],
  "first_seen": "2020-09-05 17:08:32"
}

{
  "sha256": "a3f9a98d8a60c77666d4bf73b9ae2b72dafa32251813cba3a79a2aeb7511037c",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmgd"
  ],
  "execution_parents": [
    "9efc7a1f373026a266a642b8417544b92de08e25b6bcdc12d7bfd44bb8993721"
  ],
  "first_seen": "2020-09-05 17:08:33"
}

{
  "sha256": "ade3e5d2bc094dd2835905aea82d58801a8fb53aa6449bd520d404fcfbc19e88",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmjd"
  ],
  "execution_parents": [
    "9efc7a1f373026a266a642b8417544b92de08e25b6bcdc12d7bfd44bb8993721"
  ],
  "first_seen": "2020-09-05 17:08:35"
}

{
  "sha256": "ca984bcda781d11cc220d45bc01b2e34bd8349e83139f6a9fdd9dd55ddbad4fd",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmld"
  ],
  "execution_parents": [
    "9efc7a1f373026a266a642b8417544b92de08e25b6bcdc12d7bfd44bb8993721"
  ]
}

{
  "sha256": "d1a11b45b807a9f7da05db69ba1706a850064497376a4775cbb15b1d94b95588",
  "submission_names": [
    "/Users/user1/Library/osxmobiledata/com.apple.afsvcpd"
  ],
  "execution_parents": [
    "9efc7a1f373026a266a642b8417544b92de08e25b6bcdc12d7bfd44bb8993721"
  ],
  "first_seen": "2020-09-05 17:08:38"
}

{
  "sha256": "bd6e1b8ee1c01cb0326d01e026d9d9adcce64c1d021a97b18416680f2e774ca1",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmdd"
  ],
  "execution_parents": [
    "2f1fbd634ebac9079c29e1e659fe3e3f3fd7f3d0aefd4d513563d371b558d22c"
  ],
  "first_seen": "2020-09-05 17:09:16"
}

A couple of samples later and we can already observe “grandsons” of patient zero:

{
  "sha256": "5e8a0a3b6aeb4a37fc1d949ebe4846accd5842d4eb145d37fa7af41b0acbc70c",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmhd"
  ],
  "execution_parents": [
    "3499d8119db3bd9365a7b1d0b3f677cc9adc5efe9097234ac92e1aa915ef11b1",
    "2f1fbd634ebac9079c29e1e659fe3e3f3fd7f3d0aefd4d513563d371b558d22c"
  ],
  "first_seen": "2020-09-05 17:09:18"
}

{
  "sha256": "6ecdd0f33349a66635ed29b57afd9eafc3391c5a6b2267a3a866b66045290efc",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmfd"
  ],
  "execution_parents": [
    "5e8a0a3b6aeb4a37fc1d949ebe4846accd5842d4eb145d37fa7af41b0acbc70c",
    "cda3796c74c9047466384fda223f618d7efe5c00390ae1654e9dff7b3ab07f36",
    "d1a11b45b807a9f7da05db69ba1706a850064497376a4775cbb15b1d94b95588"
  ],
  "first_seen": "2020-09-05 17:10:05"
}

If my theory is correct, we should be able to see the mutated samples analyzed all day long and the next day. This hypothesis holds true if we check the next day’s execution timeline:

2020-09-06 00:00:00
2020-09-06 00:00:01
2020-09-06 00:00:25
2020-09-06 00:00:26
2020-09-06 00:00:30
2020-09-06 00:00:31
2020-09-06 00:00:33
2020-09-06 00:01:00
(...)
2020-09-06 23:59:37
2020-09-06 23:59:42
2020-09-06 23:59:43
2020-09-06 23:59:45
2020-09-06 23:59:48
2020-09-06 23:59:49
2020-09-06 23:59:58

We can find the first samples of the next day and see if their parent(s) belong to the previous day:

{
  "sha256": "1a1e84793d5e68259e276d5728736f8dcdadbc10beaf579f3b56c415055f1474",
  "submission_names": [
    "/Users/user1/Library/osxmobiledata/com.apple.afsvcpd"
  ],
  "execution_parents": [
    "ee7075bd8f20e94b61436e6344631d0bbe7380a30c73f6dca39f14b402d5672f"
  ],
  "first_seen": "2020-09-06 00:00:00"
}

{
  "sha256": "f34f8e458fd98006f454c8e327c1df1872ce8cd93989ed9a41a914507061487e",
  "submission_names": [
    "/Users/user1/client/tmp/8e3b432bab64466a202b0557a8273f9eab1a63cff9b1ba62292a5180136cc95d/sample.bin"
  ],
  "execution_parents": [
    "d780db5b3796dfd39d6bb40aac51d94e4a758308be8414d6e1e76fa2c0c22f7f",
    "8e3b432bab64466a202b0557a8273f9eab1a63cff9b1ba62292a5180136cc95d"
  ],
  "first_seen": "2020-09-06 00:00:00"
}

Their parents from the previous day:

{
  "sha256": "ee7075bd8f20e94b61436e6344631d0bbe7380a30c73f6dca39f14b402d5672f",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmtd",
    "/Library/osxmobiledata/com.apple.afsvcpd",
    "/Users/user1/Library/osxmobiledata/com.apple.afsvcpd",
    "com.apple.afsvcpd0"
  ],
  "execution_parents": [
    "7ccd8fe515bfc9316f9370e9191b971f231d5d4f28a865e549289b5e25a67f14"
  ],
  "first_seen": "2020-09-05 18:12:03"
}

{
  "sha256": "d780db5b3796dfd39d6bb40aac51d94e4a758308be8414d6e1e76fa2c0c22f7f",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmkd"
  ],
  "execution_parents": [
    "e9a3d60e34b380fee9a3910ad4c652589fa683c32f471fc6dcac99759541b1ea"
  ],
  "first_seen": "2020-09-05 18:11:32"
}

{
  "sha256": "8e3b432bab64466a202b0557a8273f9eab1a63cff9b1ba62292a5180136cc95d",
  "submission_names": [
    "/Users/user1/Library/com.apple.fmrd"
  ],
  "execution_parents": [
    "dd3d15e7cf6a1f62922041f7213d4b1eef7595dee613c1663a98f6015a4f936f"
  ],
  "first_seen": "2020-09-05 19:45:30"
}
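
The "parent belongs to the previous day" check can also be done mechanically on the first_seen values. A small sketch, assuming all timestamps use the `YYYY-MM-DD hh:mm:ss` format shown above and share the same time zone (the helper name `seenOnEarlierDay` is mine):

```go
package main

import (
	"fmt"
	"time"
)

// vtLayout matches the first_seen format used in the records above.
const vtLayout = "2006-01-02 15:04:05"

// seenOnEarlierDay reports whether the parent timestamp falls on a calendar
// day strictly before the child timestamp.
func seenOnEarlierDay(parent, child string) (bool, error) {
	p, err := time.Parse(vtLayout, parent)
	if err != nil {
		return false, err
	}
	c, err := time.Parse(vtLayout, child)
	if err != nil {
		return false, err
	}
	// ISO dates compare correctly as strings.
	return p.Format("2006-01-02") < c.Format("2006-01-02"), nil
}

func main() {
	child := "2020-09-06 00:00:00"
	// first_seen values of the three parents listed above.
	parents := []string{
		"2020-09-05 18:12:03",
		"2020-09-05 18:11:32",
		"2020-09-05 19:45:30",
	}
	for _, p := range parents {
		ok, err := seenOnEarlierDay(p, child)
		if err != nil {
			fmt.Println("bad timestamp:", err)
			continue
		}
		fmt.Println(p, "-> earlier day:", ok)
	}
}
```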

Lastly, verify that it is still the same code with different strings, which holds true:

bash

Mach-O Stats
(c) 2020 Pedro Vilaca. All Rights Reserved

__text map
cd87dfd659fc2334ccc59093c1f41ba9abf4c88046d438ddd8bc2d82f55859d7 5
__cstring map
6ac22c26e24ceb5a922b3e4b57365757b12211eb858cfca5746a8a30846757bd 1
ef848f3c91e9ad50f02ae61970289b33b5398b2a48129bbd78c98ba85ae40e74 1
0239a72b6f4b339aa8e82504d2be8acc42dc089ddafbbb44330bb8c32d63f5ec 1
318e952c78afe7b772b54e93aded05857705114e71cf53d7520f1f3814730249 1
9ccedba8c6262fd7e1ac78380266c095308179862d547e9227d33aa6dd3fb24e 1

Given all this it seems that there is a “bug” in the VirusTotal macOS sandbox that allows someone to “fork bomb” it. Resubmitting dropped or modified executables is a feature to tackle polymorphic code and it makes sense for it to exist. But in this case there is no code polymorphism, just strings being mutated. From my point of view there should be some kind of trigger to stop this after a day or two. But it can be a complicated decision and problem to solve. Where is the balance?

This could lead to a possible DoS or wasteful usage of the VirusTotal macOS sandbox by submitting a couple of different Mach-O samples that modify themselves. If the files are big enough it could consume a lot of disk space, and it floods everyone else looking at the daily feeds.

The sandbox appears to be executing a sample every 2 seconds so we might be able to infer VirusTotal macOS sandbox analysis capacity.
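
The two-second figure follows from the daily totals in the table. A quick back-of-the-envelope check, assuming samples arrive evenly over 24 hours and that each new sample corresponds to one sandbox execution (both are simplifications, and `secondsPerSample` is just a name for this example):

```go
package main

import "fmt"

// secondsPerSample converts a daily sample count into the average number of
// seconds between samples, assuming an even spread over 24 hours.
func secondsPerSample(daily int) float64 {
	return 86400.0 / float64(daily)
}

func main() {
	// Daily totals for 2020-09-06 through 2020-09-13, copied from the table.
	totals := []int{47782, 47887, 47849, 48540, 48819, 45681, 45263, 46797}
	sum := 0
	for _, t := range totals {
		sum += t
	}
	avg := sum / len(totals)
	fmt.Printf("average %d samples/day, one every %.2f seconds\n",
		avg, secondsPerSample(avg))
}
```

For these days the interval comes out a little under two seconds, so "every 2 seconds" is a fair rounding.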

Regarding the sample itself, it appears to be a new version of EvilQuest/ThiefQuest. There is a command line switch to display the version number, currently 3.105. I haven’t yet analyzed its capabilities to understand if there are any new features or improvements over the initial public sample found last June. Its development appears to be active, so this threat might grow in the future.

The hardcoded C2 is still the same 159.65.147.28 as described in this post about an updated version back in July.

In the first days it had very few detections (2 to 3) but it seems AVs finally caught up. On 2020-09-09 the number of detections grew to 5, doubling the next day, with most vendors catching up over the following days. A reanalysis of the initial samples returns 21 detections at the time of writing.

I guess those detection signatures for the first version weren’t that good.

{
  "sha256": "9efc7a1f373026a266a642b8417544b92de08e25b6bcdc12d7bfd44bb8993721",
  "positives": 2,
  "scan_date": "2020-09-05 17:06:57"
}

{
  "sha256": "2f1fbd634ebac9079c29e1e659fe3e3f3fd7f3d0aefd4d513563d371b558d22c",
  "positives": 2,
  "scan_date": "2020-09-05 17:07:47"
}

The question is whether the malware author was trying to mess around with the sandbox or whether it was just a coincidence. If the latter, I don’t understand the benefit of mutating the strings when the code is still the same. It might be able to fool lame AV signatures. Besides that, a significant amount of encrypted/obfuscated strings will just put the spotlight on this type of binary. It doesn’t make much sense.

The conclusion is that unfortunately it’s not the biggest malware attack ever, despite nearly half a million samples on VirusTotal, but a new version of EvilQuest/ThiefQuest that triggered a cute loop in the VirusTotal sandbox.

Hope you have enjoyed this little adventure and that it was worth the clickbait.

Have fun,
fG!

P.S.: The Go code is already pushed to GitHub. Here and here.