Dude, Where's My Data?


Harlan's tweet (view picture to the right) got me thinking, and I would like to share a case example that I feel drove this particular point home for me.

 Many of the 'Swiss Army' forensics tools will parse data for you and automate various tasks. For example, X-Ways will parse link files and EnCase will  parse (or mount, whatever term you prefer) PST files. Instead of exporting out these files and working with them in separate programs, these Swiss Army knives will display the data in a more readable format within their GUI.

Cell phone forensic programs work in a similar fashion. They will first acquire the phone (if you’re lucky that day) then parse typical data such as SMS and MMS messages, call logs and contact information. These programs can also generate a pretty report for you to turn over to your clients (whether it’s a prosecutor,  a defense attorney or your cousin Vinnie). As an examiner, this can be great thing.  No need to locate and export the database, run queries and convert timestamps.

Each of these programs has taken a task that is repetitive and automated it - in most cases, saving the examiner time. But where is the Swiss Army knife getting its data from, how is it interpreting it and  is it getting all the data?
 
Now, to get to my case example.  I had an iPhone where I was tasked with getting the voicemails.  In order to do this I had three “Swiss Army knife” tools at my disposal:
  • Swiss Army Knife A – $$$$$
  • Swiss Army Knife B – $$$
  • Swiss Army Knife C – $
All three programs were able to acquire a file system image of the cell phone as well as parsing the SMS, MMS and call logs, but what I needed were the voicemails.  Only one of these tools (A) automatically parsed the voicemail.db file which contained information regarding the voicemails. I did locate the voicemail.db file within the file system of the two other programs (B and C) - the programs just didn’t parse this database automatically.

Now, if an examiner had just the option of B or C that did not automatically parse the voicemails and point them out – would they have assumed there were no voicemails? Ok, this may not be the best example as voicemails are a pretty common thing, but what if it were a not so common artifact?

I decided to use A to conduct the remainder of the exam since it had already parsed the voicemails saving me the time of exporting out the database, running quires and converting timestamps. I could now get a pretty report.   I went to generate the report and the program threw an error. No report, no exported voicemails.  Dude, where's my data?
 
In my quest to find out why the $$$$$ Swiss Army knife threw an error, I went to view the contents of the voicemail.db file to see if there was some abnormal data causing issues. I opened the voicemail.db file with an SQLite viewer and noted several columns of data NOT displayed by the $$$$$ program.

Included were two columns I thought right off the bat could be important – a "flag" column and a "trashed" column. The flag column designates certain statuses of the voicemail such as heard, unheard or deleted. The trashed column is the date that the voicemails where placed in the deleted folder. What if the examiner needs to prove the suspect had listened to a voicemail?  I know I don't always listen to my voicemails (sorry Kim) and opt to just call the person back instead. (Now I know you can't prove they listened to it, per say. Maybe their speaker was busted,  or their nephew had their phone but this is just an example for illustrative purposes so just roll with me).

After some more testing, I determined it was blank values in the database that were causing errors with the reporting. Was I going to wait for the next software update to get a report? No. Time to work with the data myself. I had three options:

Good → Export the data from the database into an Excel sheet, use formulas to convert the timestamps

Better → Write a script to parse the data as I would probably need it again

Best → Have someone else write a script to parse the data

Now, arguably, Better and Best could be switched.  It’s always good to write your own tools so you gain a deeper understanding of the data. However, in my case I had someone (Cheeky4n6Monkey) reach out to me when I was working on iParser asking if I needed any help.  I know he enjoys learning and is always looking for a forensic project to help on. So rather then write my own parser  I thought this would be a nice project to involve him in.

A few emails later I had a custom tool written by him that gave me the exact data I wanted and hopefully he got to learn something in the process too.

So in summary, I am going to quote Harlan’s tweet again:

 How much do you know about what your tools do for you?  May I make the following suggestions?  Look at the data with different tools to see what your tool may not be doing. Look at the raw data, or look at the data in its native format to see how your tool interprets the data and what it may be missing. Read the forums associated with your tool, see what it may be capable of that you are missing out on based upon how others use it.

Do they get the data you need? In my case, not always. Sometimes I need to roll up my sleeves and do the dirty work by myself (err, in this case I asked someone else to join in with me).

Do you know what you need? How do you know what data you need, if you don’t even know it exists? Keep researching, reading blogs, watching webcasts and asking questions.  Don’t assume that everything will be handed to you on a silver platter by your tools.

Umm, in case you don't get my movie reference, Google "Dude, Where's My Car"  :-)


Comments