Read The Codes Before The Wires
I spent the early part of adulthood as an avionics technician in the US Navy, fixing F/A-18 Hornets. I spent a lot of time learning the systems and advanced troubleshooting methodologies, and my skills were in high demand when very complicated gripes were discovered in the Fleet. While I was very good at troubleshooting advanced problems, I was horrible at fixing routine ones. Most avionics techs took a tool pouch to the aircraft. I took a multimeter and a time-domain reflectometer. It probably took me three years to realize that most gripes could be fixed in under half an hour if I used the standard flowcharts and didn't do a "deep dive" on every problem.
$30k Investigations
As my career moved from avionics to InfoSec, I found myself jumping right into memory and disk images, reverse engineering binaries, and parsing packet captures. Again, I was quite good at advanced problems but was turning routine events into time-intensive (and expensive) projects. I quickly realized that this approach to investigations was neither scalable nor affordable. While I still have to do the advanced forensics, I don't start there anymore.
Foo vs. Flow Charts
New threats will always require strong foo from investigators to resolve. Most incidents, however, are routine. Aside from reducing the time spent on investigations, standardized (flow-chart-based) investigations also scale well. Entry-level investigators can easily be taught standard processes. When incidents warrant escalation, the more skilled responders can be handed the investigation with the routine steps already completed.
Build the Flow Chart
The standard steps I use in building flow charts are as follows:
- Eliminate the event as a false positive. Methods for doing this will vary from tool to tool, but they are pretty standard. The most junior member of the team should be able to handle these steps.
- Determine Success of Breach. The main thing to determine here is whether or not the incident was successful. If enforcement mechanisms stopped the event before breach could occur, the only action to take is to look for ways of thwarting the event type earlier in the attack cycle.
- Define Impact. Metadata (network and endpoint logging) can normally answer the question "Did the breached asset access or disclose protected data?" This is normally the step where the most time is wasted on PCAP and endpoint forensics. NetFlow/IPFIX records will tell you whether bidirectional communication occurred between the breached machine and protected assets. If there is no record of access, leave the packets alone. Adjust security mechanisms to close off the vulnerabilities that led to the breach and move on to the next fire. You can reverse engineer the binaries on the compromised machine if it will help in fixing vulnerabilities (but only if it will).
- Unleash the Foo. If metadata reveals that protected assets were "touched" by compromised endpoints, it's time to start looking at the bits. Fire up EnCase and IDA Pro, crack open the PCAP, and have your 3l337 responders unleash their foo. These steps belong at the bottom of the flow chart, not at the top.
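The four steps above can be sketched as a small decision function. This is only an illustration, not a real tool: the `Flow` record is a hypothetical simplification (real NetFlow/IPFIX records carry many more fields), and the names `Outcome`, `touched_protected`, and `triage` are made up for this example. The false-positive and blocked checks are assumed to come from whatever tooling raised the event.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Set


class Outcome(Enum):
    FALSE_POSITIVE = "close as false positive"
    BLOCKED = "stopped before breach: look for ways to thwart it earlier"
    CONTAINED = "breached, no protected access: fix vulnerabilities, move on"
    ESCALATE = "protected assets touched: hand off for deep forensics"


@dataclass(frozen=True)
class Flow:
    """Hypothetical, simplified flow record (not a real NetFlow/IPFIX schema)."""
    src: str
    dst: str


def touched_protected(host: str, flows: Iterable[Flow], protected: Set[str]) -> Set[str]:
    """Return protected assets that exchanged traffic with host in both directions."""
    outbound: Set[str] = set()
    inbound: Set[str] = set()
    for flow in flows:
        if flow.src == host and flow.dst in protected:
            outbound.add(flow.dst)
        if flow.dst == host and flow.src in protected:
            inbound.add(flow.src)
    # Bidirectional means the asset appears on both sides of the conversation.
    return outbound & inbound


def triage(is_false_positive: bool, was_blocked: bool, host: str,
           flows: Iterable[Flow], protected: Set[str]) -> Outcome:
    """Walk the flow chart top to bottom; deep forensics is the last branch."""
    if is_false_positive:
        return Outcome.FALSE_POSITIVE
    if was_blocked:
        return Outcome.BLOCKED
    if touched_protected(host, flows, protected):
        return Outcome.ESCALATE
    return Outcome.CONTAINED


# Example with made-up addresses: only 10.0.9.1 answered the breached host.
flows = [
    Flow("10.0.0.5", "10.0.9.1"),
    Flow("10.0.9.1", "10.0.0.5"),
    Flow("10.0.0.5", "10.0.9.2"),
]
protected = {"10.0.9.1", "10.0.9.2"}
print(triage(False, False, "10.0.0.5", flows, protected).value)
```

The ordering is the point of the sketch: the cheap checks run first, and the expensive branch is reached only when the flow metadata shows a two-way conversation with a protected asset.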
Wrap Up
Detailed investigations can reveal useful information for determining event impact and shoring up defenses. Well-defined incident response processes allow your best responders to use their skills when appropriate and offload the routine work to the next generation of responders. While quality investigations will deliver good intelligence, efficiently processing more events will provide a much higher return than a handful of masterful deep dives.