
July 30, 2014

Practical Suggestions for Writing a Pintool

This is my list of practical suggestions for people developing a pintool. I have dealt with each of these issues before, so I thought I would jot them down to help others. Applying them should bring you somewhat closer to keeping your pintool from terminating unexpectedly.

Start from scratch. You may be using a sample pintool as the basis for your own. Rather than modifying the sample, start with an empty project and gradually build it up by taking elements from the sample.

Simplicity. Keep the code-base small and easy to understand.

Testing. As part of development, aim to test that all blocks have been exercised. Refrain from adding unreachable blocks.

Errors. Check for errors as early as possible, especially when returning from a Pin API call.

Safe memory dereference. Whenever you have to dereference the target's memory, use PIN_SafeCopy. Even to read a single integer, use this function rather than the dereference "*" operator.

Thread safety. Be aware that the target may be running multiple threads, so you probably want your pintool to be thread safe.

Multi-threading. Sometimes you want your variables stored in the thread context so that the analysis can distinguish between threads. In that case the sample inscount_tls.cpp is a good starting point.

Probe mode. Prefer probe mode where it is sufficient, as it gives better performance. However, only a limited set of Pin facilities is available in probe mode.

Limit instrumentation. Consider restricting the instrumentation to selected routines or libraries. You can even skip the instrumentation of shared libraries to get better performance.

Standard library. It's a good idea to use the C++ standard library in a pintool, as it provides the most frequently needed data structures.

Visual Studio. A Visual C++ project file ships with the Pin framework in the MyPinTool folder. Alternatively, you can create one yourself by following an earlier post.

Trace vs Ins. Instruction instrumentation is practically the same as trace instrumentation. You can do instruction-level analysis from a trace by iterating through the instructions of the trace.

Output. Writing the output in Fini makes the application run faster than writing it in analysis functions. However, if the application terminates unexpectedly and Fini is never called, no results are shown. Writing the output in analysis functions makes the application run slower, but partial results may still be shown if the application terminates unexpectedly.

July 26, 2014

Inspection of SAR Instructions

SAR stands for Shift Arithmetic Right, and the instruction performs an arithmetic shift. It preserves the sign of the value being shifted, so the vacated bits are filled with copies of the sign bit.

Compilers generate a SAR instruction when the right shift operator ">>" is applied to a signed integer.

The use of a SAR instruction can lead to a signedness bug if the shift is assumed to be unsigned.

Consider the following simplified example.

char retItem(char* arr, int value)
{
    return arr[value>>24];
}

If value is non-negative, the code works as expected. However, if value is negative, the shifted index is negative too, and the program can read out of the bounds of arr.
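
The effect can be reproduced with a minimal sketch. The function names below are hypothetical and not part of the original example, and the unsigned variant is only one possible fix. The sketch assumes a compiler that implements ">>" on a negative signed integer as an arithmetic shift, which C++20 guarantees and mainstream compilers did before that.

```cpp
#include <cstdint>

// Index computed the way retItem computes it: a signed right shift,
// which compilers typically translate to SAR.
std::int32_t signed_index(std::int32_t value) {
    return value >> 24;
}

// Hypothetical fix: shift the same bit pattern as an unsigned value
// (a logical shift), so the index always falls in the range 0..255.
std::uint32_t unsigned_index(std::int32_t value) {
    return static_cast<std::uint32_t>(value) >> 24;
}
```

With the smallest 32-bit value as input, signed_index yields -128 (an index before the start of the array), while unsigned_index yields 128.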

Another example is comparing the signed result of the shift with an unsigned value: the implicit conversion to unsigned may trigger a bug.
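
A small sketch of the comparison case follows. Both function names are hypothetical. The explicit cast in the first function only spells out what the implicit conversion does when a signed and an unsigned 32-bit value are compared directly.

```cpp
#include <cstdint>

// The comparison as the compiler evaluates it: the signed shift result
// is converted to unsigned before comparing, so a negative result
// becomes a very large number.
bool below_limit_as_compiled(std::int32_t value, std::uint32_t limit) {
    std::int32_t shifted = value >> 24;
    return static_cast<std::uint32_t>(shifted) < limit;
}

// The comparison the programmer probably had in mind: both sides are
// widened to a signed 64-bit type, so negative values stay negative.
bool below_limit_intended(std::int32_t value, std::uint32_t limit) {
    std::int64_t shifted = value >> 24;
    return shifted < static_cast<std::int64_t>(limit);
}
```

For a negative value the two functions disagree, and any code that relies on the intended meaning takes the wrong branch.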

In my experiments, I saw several cases where a memory dereference involved the result of a SAR instruction. These places may be worth checking for bugs, especially if the value being shifted is user input or otherwise controlled.

An unsigned conditional jump (such as JA or JB) followed by a signed shift is another pattern that may be worth checking for bugs.

Regular expressions or scripts can be used to search for occurrences of SAR instructions. When it's not feasible to review all occurrences of SAR, a pintool may be used to highlight which SAR instructions have been executed, so that you can focus on those only.
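
As a rough sketch of the regular expression approach, the function below scans a disassembly listing for SAR instructions. It assumes the WinDbg-style line layout "address bytes mnemonic operands"; the function name is hypothetical, and other disassemblers will need a different expression.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Return the address column of every line whose mnemonic is SAR.
// Assumed line layout: "<address> <opcode bytes> <mnemonic> <operands>".
std::vector<std::string> find_sar_addresses(const std::string& listing) {
    static const std::regex sar_line(
        R"(^\s*([0-9a-fA-F]+)\s+[0-9a-fA-F]+\s+sar\s+.*$)",
        std::regex::icase);
    std::vector<std::string> addresses;
    std::istringstream in(listing);
    std::string line;
    std::smatch match;
    while (std::getline(in, line)) {
        if (std::regex_match(line, match, sar_line)) {
            addresses.push_back(match[1].str());
        }
    }
    return addresses;
}
```

The returned addresses can then be reviewed by hand or compared with the set of addresses a pintool reports as executed.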

July 22, 2014

Examining Native Code by Looking for Patterns

Earlier this year a post was published on examining a data format without using the program that reads the format. That post discusses patterns to look for in order to identify certain constructs. This post focuses on static methods of examining code, which can be the complete code section of a file, a memory dump, or just a fragment. It also describes selected ideas about what patterns to look for when examining a given piece of code.

The reason one may look for patterns in code is to locate certain functionality or to get a high-level understanding of what the code does. Others may look for a certain construct that may be a key part of the program from a security point of view.

One can expect this to be a rapid method compared to other methods such as line-by-line instruction analysis.

Still, it's always good to read the documentation, if any is available, to get an overview of what to expect.

Some methods are more effective when performed on a small region. It is therefore a good idea to narrow the scope of the search at the beginning of the analysis, although it's not always feasible to do so with enough certainty. In any case, one can always widen the search region at a later stage if required.

Compilers tend to produce executable files with a particular layout. Some have the library code at the beginning of the code section, while others have it at the end of the code section.

If there is no information about the compiler or about the layout, there are other ways to locate the library code.

You may look for library function calls, which can be visible in a disassembler. Library code may also have a distinct color in the disassembler.

Library/runtime code often has several implementations of the same function in order to take advantage of the latest hardware. The MSVC runtime is an example. So SSE instructions may indicate the presence of library/runtime code.

Library code can be spotted by looking for strings that can be associated with particular libraries.

Library/runtime code can also be spotted by looking for constant values that can be associated with particular libraries. Cryptographic libraries, for example, tend to have many constants.

It is possible to guess the compiler that was used to generate the code by analyzing the library/runtime code.

If the code is just a fragment of user code, consider examining how the instructions are encoded. Intel encodings are redundant, and one instruction can have multiple encodings. Which encoding appears is a hint about which compiler was used.

If multiple encodings of the same instruction are found in a binary, the code could have been generated with a polymorphic encoder.
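
To make the redundancy concrete, here is a minimal sketch that decodes the two encodings of a 32-bit register-to-register MOV. The function name is hypothetical, and the decoder deliberately handles nothing but this one case (no prefixes, no memory operands).

```cpp
#include <cstdint>
#include <string>

// Decode a two-byte, register-to-register MOV with 32-bit operands.
// Opcode 89 is "MOV r/m32, r32" and opcode 8B is "MOV r32, r/m32".
// Returns an empty string for anything else.
std::string decode_reg_mov(std::uint8_t opcode, std::uint8_t modrm) {
    static const char* const regs[8] = {
        "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"};
    if ((modrm >> 6) != 3) {
        return "";  // mod field must be 11b for a register operand
    }
    const char* reg = regs[(modrm >> 3) & 7];  // ModRM.reg field
    const char* rm = regs[modrm & 7];          // ModRM.rm field
    if (opcode == 0x89) {
        return std::string("mov ") + rm + "," + reg;
    }
    if (opcode == 0x8B) {
        return std::string("mov ") + reg + "," + rm;
    }
    return "";
}
```

Both byte pairs 89 D8 and 8B C3 decode to mov eax,ebx. A compiler normally sticks to one form, so seeing both forms in the same fragment is the kind of signal described above.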

Also, code has other characteristics that may differ between compilers such as padding and stack allocation.

Imports and exports as well as strings can tell a lot. You may check where they are referenced in the code.

Debugging symbols can help an awful lot if the disassembler can handle them. Sometimes they're available, sometimes they're not.

No matter what code you're looking at, it most likely deals with input data. In that case it may get the data from a file or from the network via standard API calls. These are valuable areas to audit for security problems, and it's possible to follow how the data returned by these APIs is used. This may require analyzing the caller functions, as these APIs are usually wrapped in several layers of calls before the input is used.

Just as it reads data, the code may write data or send data via standard API calls. These areas may be security-sensitive, too.

Programs have centralized, well-established functions. These functions, for example, read dword values, read data into structures, and populate other internal storage. Discovering these functions is not considered hard: they are normally small and consist of memory read and write instructions. By looking at where they are referenced from, we can find good attack surfaces.

It's good to keep in mind that code sections can contain data besides code, although data is normally stored in a data section. In the disassembler it's convenient to see how the data is referenced, and you may decide whether there is an attack surface nearby.

CRC and hash constants may indicate that some data is being CRC'd or hashed. You may figure out where that data comes from and how you can perform security testing around it.
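
A search for such constants can be sketched as a scan for a little-endian dword in a raw code or data image. The function name is hypothetical. The value 0xEDB88320 mentioned below is the reflected CRC-32 polynomial used by many CRC-32 implementations; it is given only as an example of a constant to search for.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Return every offset at which `constant` is stored as a little-endian
// 32-bit value in `image`. Hits may be data or instruction immediates.
std::vector<std::size_t> find_dword_le(const std::vector<std::uint8_t>& image,
                                       std::uint32_t constant) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i + 4 <= image.size(); ++i) {
        std::uint32_t value =
            static_cast<std::uint32_t>(image[i]) |
            (static_cast<std::uint32_t>(image[i + 1]) << 8) |
            (static_cast<std::uint32_t>(image[i + 2]) << 16) |
            (static_cast<std::uint32_t>(image[i + 3]) << 24);
        if (value == constant) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

Calling find_dword_le(image, 0xEDB88320u) on a dumped section gives candidate locations, and the code that references them is where the CRC'd data is handled.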

When a library uses a hardcoded parameter, it's often encoded as part of the instruction rather than stored in the data section of the executable. An example encoding looks like mov eax, <param> or mov al, <param>.

When a data format is parsed, a magic value is often tested. Looking for instructions like cmp reg, <magic> or cmp dword ptr [addr], <magic> can help to locate attack surfaces.

Longer strings may be broken into immediate values and compared with multiple cmp instructions.
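
To know which immediates to search for, a magic string can be converted into the 32-bit little-endian values that would appear in such cmp instructions. This is a minimal sketch with a hypothetical function name, and the strings mentioned after it are only examples. It assumes the comparison is done four bytes at a time.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Split a string into 32-bit little-endian immediates, four characters
// per value. Trailing characters that do not fill a dword are ignored.
std::vector<std::uint32_t> string_to_immediates(const std::string& text) {
    std::vector<std::uint32_t> immediates;
    for (std::size_t i = 0; i + 4 <= text.size(); i += 4) {
        std::uint32_t value = 0;
        for (std::size_t j = 0; j < 4; ++j) {
            std::uint32_t byte = static_cast<unsigned char>(text[i + j]);
            value |= byte << (8 * j);
        }
        immediates.push_back(value);
    }
    return immediates;
}
```

For the string "RIFF" this gives 0x46464952, and an eight-character string such as "WAVEfmt " gives two values, one per cmp instruction.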

Looking for strcmp function calls is a good idea if you want to find code that tests for a data format, as strcmp functions are often used for this purpose.

There are many ways to confirm that code is optimized for speed. Normally the code is hard to read, for example where it performs division or uses the same memory address for multiple variables. If the EBP register is used in arithmetic, or for anything other than storing the stack base address, that could indicate the code is optimized.

There may be circumstances where looking at the frequency of instructions, or looking for undocumented instructions, rare instructions, or instructions that are absent, gives us valuable clues that help the examination.
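
A frequency count is easy to sketch when a text listing is available. The function below is a hypothetical helper that assumes every line has the layout "address bytes mnemonic operands" and that the listing contains instruction lines only.

```cpp
#include <map>
#include <sstream>
#include <string>

// Count how often each mnemonic occurs in a disassembly listing.
// The mnemonic is taken to be the third whitespace-separated field.
std::map<std::string, int> mnemonic_histogram(const std::string& listing) {
    std::map<std::string, int> counts;
    std::istringstream in(listing);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string address, bytes, mnemonic;
        if (fields >> address >> bytes >> mnemonic) {
            ++counts[mnemonic];
        }
    }
    return counts;
}
```

Mnemonics that are missing from the resulting map are as informative as the frequent ones, for example when a fragment contains no floating-point or SSE instructions at all.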

Intuitively going through the code and looking for patterns that were not defined in advance can be a good idea once the systematic approaches have been exhausted.

July 16, 2014

251 Potential NULL Pointer Dereferences in Flash Player

251 potential NULL pointer dereference issues have been identified in Flash Player 14 using a pattern matching approach. The file examined is NPSWF32_14_0_0_145.dll (17,029,808 bytes).

The issues are classified as CWE-690: Unchecked Return Value to NULL Pointer Dereference.
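
The post does not describe the matching rules that were used, so the following is only a sketch of what such a pattern could look like, with hypothetical names throughout. It flags a function that can return with EAX zeroed, and a call site that dereferences EAX in the very next instruction. It assumes WinDbg-style lines of the form "address bytes mnemonic operands".

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Insn {
    std::string address, mnemonic, operands;
};

// Parse "<address> <bytes> <mnemonic> [operands]" lines; skip the rest.
std::vector<Insn> parse_listing(const std::string& listing) {
    std::vector<Insn> insns;
    std::istringstream in(listing);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        Insn insn;
        std::string bytes;
        if (!(fields >> insn.address >> bytes >> insn.mnemonic)) continue;
        std::getline(fields, insn.operands);
        std::size_t start = insn.operands.find_first_not_of(" \t");
        insn.operands =
            (start == std::string::npos) ? "" : insn.operands.substr(start);
        insns.push_back(insn);
    }
    return insns;
}

// True if the function contains "xor eax,eax" directly followed by "ret".
bool may_return_null(const std::vector<Insn>& function) {
    for (std::size_t i = 0; i + 1 < function.size(); ++i) {
        if (function[i].mnemonic == "xor" &&
            function[i].operands.compare(0, 7, "eax,eax") == 0 &&
            function[i + 1].mnemonic == "ret") {
            return true;
        }
    }
    return false;
}

// Addresses of instructions that use [eax...] right after a call,
// that is, without testing the returned pointer first.
std::vector<std::string> unchecked_derefs(const std::vector<Insn>& caller) {
    std::vector<std::string> hits;
    for (std::size_t i = 0; i + 1 < caller.size(); ++i) {
        if (caller[i].mnemonic == "call" &&
            caller[i + 1].operands.find("[eax") != std::string::npos) {
            hits.push_back(caller[i + 1].address);
        }
    }
    return hits;
}
```

A real scan would also have to match each call to its target and handle checks that happen a few instructions later, so the hits are candidates to review, not confirmed bugs.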

I won't copy and paste all the issues into this blog post, but here are a few examples.

First Example

0:012> uf 5438a1d0
5438a1d0 f6410810        test    byte ptr [ecx+8],10h
5438a1d4 8b4104          mov     eax,dword ptr [ecx+4]
5438a1d7 7411            je      NPSWF32_14_0_0_145!BrokerMainW+0xf6f85 (5438a1ea)

5438a1d9 85c0            test    eax,eax
5438a1db 740b            je      NPSWF32_14_0_0_145!BrokerMainW+0xf6f83 (5438a1e8)

5438a1dd 8b4c2404        mov     ecx,dword ptr [esp+4]
5438a1e1 8b448808        mov     eax,dword ptr [eax+ecx*4+8]
5438a1e5 c20400          ret     4

5438a1e8 33c0            xor     eax,eax <--Set return value to NULL

5438a1ea c20400          ret     4 <--Return with NULL
0:012> u 5438a47b L2
5438a47b e850fdffff      call    NPSWF32_14_0_0_145!BrokerMainW+0xf6f6b (5438a1d0)
5438a480 8a580c          mov     bl,byte ptr [eax+0Ch] <--Dereference NULL 

Second Example

0:012> uf 54362e60
54362e60 8b4128          mov     eax,dword ptr [ecx+28h]
54362e63 8b4c2404        mov     ecx,dword ptr [esp+4]
54362e67 3b4804          cmp     ecx,dword ptr [eax+4]
54362e6a 7205            jb      NPSWF32_14_0_0_145!BrokerMainW+0xcfc0c (54362e71)

54362e6c 33c0            xor     eax,eax <--Set return value to NULL
54362e6e c20400          ret     4 <--Return with NULL

54362e71 56              push    esi
54362e72 8b748808        mov     esi,dword ptr [eax+ecx*4+8]
54362e76 56              push    esi
54362e77 e8e4b0faff      call    NPSWF32_14_0_0_145!BrokerMainW+0x7acfb (5430df60)
54362e7c 83c404          add     esp,4
54362e7f 85c0            test    eax,eax
54362e81 7407            je      NPSWF32_14_0_0_145!BrokerMainW+0xcfc25 (54362e8a)

54362e83 8b4010          mov     eax,dword ptr [eax+10h]
54362e86 5e              pop     esi
54362e87 c20400          ret     4

54362e8a 8bc6            mov     eax,esi
54362e8c 83e0f8          and     eax,0FFFFFFF8h
54362e8f 5e              pop     esi
54362e90 c20400          ret     4
0:012> u NPSWF32_14_0_0_145+006b4eb2 L2
54364eb2 e8a9dfffff      call    NPSWF32_14_0_0_145!BrokerMainW+0xcfbfb (54362e60)
54364eb7 8b7004          mov     esi,dword ptr [eax+4] <--Dereference NULL

Third Example

0:012> uf 5429979a
5429979a 0fb74108        movzx   eax,word ptr [ecx+8]
5429979e 48              dec     eax
5429979f 48              dec     eax
542997a0 740c            je      NPSWF32_14_0_0_145!BrokerMainW+0x6549 (542997ae)

542997a2 83e815          sub     eax,15h
542997a5 7403            je      NPSWF32_14_0_0_145!BrokerMainW+0x6545 (542997aa)

542997a7 33c0            xor     eax,eax <--Set return value to NULL
542997a9 c3              ret <--Return with NULL

542997aa 8d4110          lea     eax,[ecx+10h]
542997ad c3              ret

542997ae 8d410c          lea     eax,[ecx+0Ch]
542997b1 c3              ret
0:012> u NPSWF32_14_0_0_145+005f3423 L2
542a3423 e87263ffff      call    NPSWF32_14_0_0_145!BrokerMainW+0x6535 (5429979a)
542a3428 8038fe          cmp     byte ptr [eax],0FEh <--Dereference NULL

You can find a list of 251 potential NULL pointer dereferences in Flash Player here.

July 14, 2014

Issues with Flash Player & Firefox in Non-default Configurations

A few months ago I encountered a bug that occurs when a fuzzed Flash file is rendered by Flash Player in Firefox. The bug can be reached only in the non-default configuration described below, so it is very unlikely that you are affected by it.

To trigger the bug, the Flash Player module has to be loaded into Firefox's virtual address space. This can be achieved by disabling both Flash Player protected mode and the Firefox plugin-container process.

The bug involves dereferencing an arbitrary memory address via a CALL instruction in the vtable dispatcher. Below you can see the bug in the exception state.

0:048> g
Implementation limit exceeded: attempting to allocate too-large object
error: out of memory
(170fc.16998): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=00000001 ebx=00000000 ecx=0034f670 edx=00000000 esi=1600f2c8 edi=0000001c
eip=5996bd5f esp=0034f638 ebp=0034f668 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
*** ERROR: Symbol file could not be found.  Defaulted to export symbols for C:\Windows\SysWOW64\Macromed\Flash\NPSWF32_14_0_0_145.dll - 
5996bd5f 8b461c          mov     eax,dword ptr [esi+1Ch] ds:002b:1600f2e4=????????
0:000> u eip L10
5996bd5f 8b461c          mov     eax,dword ptr [esi+1Ch] <--Read unmapped address
5996bd62 a801            test    al,1
5996bd64 7420            je      NPSWF32_14_0_0_145!unuse_netscape_plugin_Plugin+0x5e9 (5996bd86)
5996bd66 33d2            xor     edx,edx
5996bd68 39550c          cmp     dword ptr [ebp+0Ch],edx
5996bd6b 7519            jne     NPSWF32_14_0_0_145!unuse_netscape_plugin_Plugin+0x5e9 (5996bd86)
5996bd6d 8b4e04          mov     ecx,dword ptr [esi+4]
5996bd70 83e0fe          and     eax,0FFFFFFFEh
5996bd73 89461c          mov     dword ptr [esi+1Ch],eax
5996bd76 8b06            mov     eax,dword ptr [esi] <--Read unmapped address
5996bd78 51              push    ecx
5996bd79 8bce            mov     ecx,esi
5996bd7b 895604          mov     dword ptr [esi+4],edx
5996bd7e 895618          mov     dword ptr [esi+18h],edx
5996bd81 ff500c          call    dword ptr [eax+0Ch] <--Dereference arbitrary memory content
5996bd84 eb06            jmp     NPSWF32_14_0_0_145!unuse_netscape_plugin_Plugin+0x5ef (5996bd8c)

I reported this bug to Adobe, and they opened case PSIRT-2707 on 14 April 2014, but so far Adobe hasn't confirmed whether it was able to reproduce the bug or the reported exception state.

Again, the bug doesn't affect the default configuration, so it is very unlikely that you're affected by it. However, users running Firefox with plugin-container disabled and the Flash Player plugin with protected mode disabled are affected by this issue.

The original report is about Flash Player 13_0_0_182 and Firefox 28.0, but the testcase still triggers the bug with Flash Player 14_0_0_145 and Firefox 30.0 (the latest versions available as of today).

These are the steps to reproduce the bug.
  • Edit mms.cfg to have ProtectedMode=0 to disable protected mode in Flash Player
  • Start cmd.exe and type "set MOZ_DISABLE_OOP_PLUGINS=1" to disable plugin-container in Firefox
The two settings above are required to get the Flash Player plugin loaded into firefox.exe's address space.
  • Start Firefox from the command prompt opened previously
  • Open fuzzed.swf in Firefox (drag and drop should work)
  • Attach Windbg to the firefox.exe process when you notice that Firefox is hanging
  • The exception should occur within a few seconds. If you see the out-of-memory error in the debugger log without an exception, restart the browser and try again.

The fuzzed Flash file has the following changes compared to the template file. The value of the first item in the integer pool has been changed to a large value. The TagLength of the DoAbc tag and the FileSize in the main header have been updated accordingly to maintain the integrity of the Flash file.
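
The length fields need updating because integers in the ABC constant pool are stored in a variable-length encoding, so a larger value can occupy more bytes. The sketch below assumes the usual scheme of 7 payload bits per byte with the high bit as a continuation flag. The function names and the values mentioned after it are hypothetical; they are not the ones used in the testcase.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode a 32-bit value with 7 payload bits per byte, least significant
// group first. The high bit of a byte is set when more bytes follow.
std::vector<std::uint8_t> encode_u32(std::uint32_t value) {
    std::vector<std::uint8_t> out;
    do {
        std::uint8_t byte = static_cast<std::uint8_t>(value & 0x7F);
        value >>= 7;
        if (value != 0) {
            byte |= 0x80;
        }
        out.push_back(byte);
    } while (value != 0);
    return out;
}

// Number of bytes by which the file grows (or shrinks, if negative)
// when old_value is replaced with new_value.
std::ptrdiff_t size_delta(std::uint32_t old_value, std::uint32_t new_value) {
    return static_cast<std::ptrdiff_t>(encode_u32(new_value).size()) -
           static_cast<std::ptrdiff_t>(encode_u32(old_value).size());
}
```

Replacing a value of 1 with 0x7FFFFFFF, for instance, grows the encoding from one byte to five, so both length fields would have to increase by four.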

Drop me an email if you think you need the testcase.
  This blog is written and maintained by Attila Suszter.