December 24, 2011

Analysis of Data Formats by Guessing - Part II

I'm writing about how to find possible TIMESTAMP values in raw binary data.

Here is the first part of data format analysis by guessing, but you can read this entry independently.

TIMESTAMP is a DWORD (32-bit unsigned) value that represents Unix time, i.e. the number of seconds since 1970-01-01 00:00:00 UTC, and we assume it is stored in little-endian byte order somewhere in the binary.
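To make the encoding concrete, here is a minimal Python sketch that decodes four bytes as such a TIMESTAMP. The function name `decode_timestamp` and the sample bytes are my own illustration, not part of the prototype.

```python
import struct
from datetime import datetime, timezone

def decode_timestamp(raw: bytes) -> datetime:
    """Interpret exactly 4 bytes as an unsigned little-endian DWORD holding Unix time."""
    (value,) = struct.unpack("<I", raw)  # "<I" = little-endian, unsigned 32-bit
    return datetime.fromtimestamp(value, tz=timezone.utc)

# 1324684800 is 0x4EF51600, so on disk the bytes appear as 00 16 F5 4E.
print(decode_timestamp(bytes.fromhex("0016f54e")))  # 2011-12-24 00:00:00+00:00
```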

The prototype program reads the binary data into a byte array. It iterates through the array and, at each position, reads four bytes, interprets them as a DWORD, and checks whether the value falls into the date range we are after.
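The scan described above could look roughly like the following Python sketch. It is not the original prototype; the name `find_timestamps` and the inclusive range check are assumptions made for illustration.

```python
import struct
from datetime import datetime, timezone

def find_timestamps(data: bytes, start: datetime, end: datetime):
    """Return (offset, datetime) pairs for every DWORD whose value lies in [start, end].

    start and end should be timezone-aware; naive datetimes would be
    interpreted as local time by datetime.timestamp().
    """
    lo, hi = int(start.timestamp()), int(end.timestamp())
    hits = []
    # Slide one byte at a time; the last valid start offset is len(data) - 4.
    for offset in range(len(data) - 3):
        (value,) = struct.unpack_from("<I", data, offset)
        if lo <= value <= hi:
            hits.append((offset, datetime.fromtimestamp(value, tz=timezone.utc)))
    return hits

# A made-up buffer with one TIMESTAMP (2011-12-24 00:00:00 UTC) at offset 5.
sample = b"\xff" * 5 + struct.pack("<I", 1324684800) + b"\xff" * 3
december = (datetime(2011, 12, 1, tzinfo=timezone.utc),
            datetime(2011, 12, 31, tzinfo=timezone.utc))
for offset, ts in find_timestamps(sample, *december):
    print(f"0x{offset:08x}  {ts}")
```

Stepping one byte at a time makes no assumption about alignment, at the cost of checking overlapping DWORDs.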

The looser the date range you set, the more likely you are to find false positives in your search results.
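For intuition about how the number of false positives scales with the range, here is a rough estimate under the assumption that every byte of the input is uniformly random. Real files are not uniformly random, so treat this as a scaling argument rather than a prediction; the function name `expected_random_hits` is hypothetical.

```python
def expected_random_hits(data_len: int, range_seconds: int) -> float:
    """Approximate expected count of in-range DWORDs in uniformly random data.

    Each of the (data_len - 3) positions yields one of 2**32 equally likely
    values, and roughly range_seconds of those values fall inside the range.
    """
    positions = max(data_len - 3, 0)
    return positions * range_seconds / 2**32

one_month = 31 * 24 * 60 * 60
one_year = 365 * 24 * 60 * 60
# Under this model the expected count grows linearly with the width of the range.
print(expected_random_hits(1024 * 1024, one_month))
print(expected_random_hits(1024 * 1024, one_year))
```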

I've tested the code with a relatively strict date range, i.e. one month, and didn't find any false positives but did find the TIMESTAMP. There were cases where multiple TIMESTAMPs were found in PE files; for example, one in the header and one in the .rdata section.

If you cannot set a strict date range, and you are likely to come across false positives, you can filter the results by asking some of the questions below.
  • Do you know more about the date, e.g. that it cannot be a Saturday or Sunday?
  • Should it be in a high-entropy region of the dump or, on the contrary, in a low-entropy one?
  • There might be other values stored around it. Do you know more about those values? Are there scattered zeroes nearby?
  • Should it be aligned to some value?
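
Some of the questions above can be turned into simple predicates over the candidate offsets. The following Python sketch shows one possible shape; the helper names, the 4-byte default alignment, and the 32-byte window are assumptions for illustration, and the thresholds you apply depend on the format you are analysing.

```python
import math
from collections import Counter
from datetime import datetime, timezone

def is_weekday(ts: datetime) -> bool:
    """True for Monday to Friday; weekday() returns 0 for Monday, 6 for Sunday.

    The result depends on the time zone of ts, so decode consistently.
    """
    return ts.weekday() < 5

def is_aligned(offset: int, alignment: int = 4) -> bool:
    """True if the candidate starts on a multiple of `alignment` bytes."""
    return offset % alignment == 0

def shannon_entropy(chunk: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for a uniform byte mix."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def local_entropy(data: bytes, offset: int, radius: int = 32) -> float:
    """Entropy of the window covering the candidate DWORD plus `radius` bytes each side."""
    start = max(offset - radius, 0)
    return shannon_entropy(data[start:offset + 4 + radius])

# 2011-12-24 was a Saturday, so a weekday-only filter would reject it.
print(is_weekday(datetime(2011, 12, 24, tzinfo=timezone.utc)))  # False
```

Each predicate can be applied to the list of candidates to discard hits that contradict what you know about the format.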
This blog is written and maintained by Attila Suszter.