Advanced Reverse Engineering Techniques for Binary Code Security Retrofitting and Analysis

Open Access
- Author:
- Wang, Shuai
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- June 06, 2018
- Committee Members:
- Dinghao Wu, Dissertation Advisor/Co-Advisor
Dinghao Wu, Committee Chair/Co-Chair
Peng Liu, Committee Member
Sencun Zhu, Committee Member
Trent Jaeger, Outside Member
Danfeng Zhang, Special Member
- Keywords:
- reverse engineering
software security
binary code analysis
- Abstract:
- In software security, many techniques and applications depend on binary code reverse engineering, i.e., analyzing and retrofitting executables whose source code is unavailable. Although many security hardening techniques rely heavily on reverse engineering, modern binary disassembly and reconstruction techniques still cannot adequately meet many of their requirements. In particular, no existing reverse engineering tool can disassemble an executable into assembly code that can be reassembled in a fully automated manner, especially when the inputs are Commercial-Off-The-Shelf (COTS) binaries with most symbol and relocation information stripped. Because direct reassembly is not supported, existing binary instrumentation tools rely on patch-based or replica-based rewriting techniques to guarantee the correct functionality of the instrumented outputs, which usually incurs high execution slowdown and increased binary code size. We present Uroboros, a tool that can disassemble legacy executables to the extent that the generated code can be assembled back into working binaries without manual effort. The key technique proposed in Uroboros is reassembleable disassembling, in which we develop a set of methods to precisely recover each component of a binary executable, including code, data, and meta-information. In particular, Uroboros is the first tool capable of not only recovering the assembly program but also enabling the disassembled output to be reassembled with the correct functionality. We further extend Uroboros into a general-purpose binary instrumentation platform with a rich set of binary instrumentation APIs and utilities. Our evaluation on widely used program binaries shows that Uroboros can support reassembly and instrumentation of legacy binary executables with better performance, lower labor cost, and a broader scope of applications.
In addition, we build advanced binary analysis and instrumentation applications for security purposes. Function recognition in program binaries serves as the foundation for many security retrofitting and analysis tasks. However, because binaries are usually stripped before distribution, function information is absent from most of them. We develop FID to recognize functions through machine learning techniques. FID extracts semantic information from binary code and trains a machine learning model for recognition. Our evaluation demonstrates that FID achieves high recognition accuracy on commonly used program binaries as well as on obfuscated code. We further build program diversification tools. By transforming software into different forms before deployment, software diversification can effectively mitigate many attacks. Inspired by research in other areas, we seek to apply different diversifications to the same program for a synergistic effect, such that the resulting hybrid transformations achieve boosted diversification effects at modest cost. Given a set of commonly used diversification passes, we propose a novel selection strategy to quickly construct a transformation composition that performs better than any single transformation in the set.