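To address the low accuracy of existing protocol format reverse-engineering methods in real-world settings with complex semantics, we exploit the regularity that protocol structure and the data structures in code map onto each other, and propose a protocol format reverse-engineering method that identifies data structures. Using fine-grained taint tracking, the method records and analyzes the memory accesses made while the protocol is processed during dynamic execution, tracks base addresses in memory to capture the data structures that correspond to different fields of the input, and finally, based on the independence of these data structures, infers format information such as the fields of the protocol. Experimental results show that, compared with traditional methods that assume a protocol field is accessed as a whole during parsing, the proposed method effectively identifies the data structures in a protocol and thus infers the protocol format more accurately.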
To address the limited accuracy of current techniques under complex program semantics, we propose a reverse-engineering technique based on the mapping between the protocol format and program data structures. Using fine-grained dynamic taint tracking, we analyze the execution contexts and memory segments for each tainted byte, identify the base memory address, and identify fields via independent program data structures. Experimental results show that, compared with current techniques, which are based on the observation that one field of the input should be accessed as one unit, this technique identifies data structures in the protocol effectively and therefore recovers the protocol format more accurately.
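The grouping step described above can be illustrated with a small sketch. It is not the authors' implementation. It assumes that a taint-tracking tool has already produced a trace in which each memory access carries the base address used to compute the accessed location and the offset of the input byte whose taint reached it. The names `Access` and `infer_fields` and the trace layout are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Access(NamedTuple):
    """One tainted memory access (hypothetical trace record)."""
    base: int          # base address from which the accessed address was derived
    addr: int          # memory address actually accessed
    input_offset: int  # offset of the input byte whose taint reached this address


def infer_fields(trace):
    """Group input offsets by base address, then merge consecutive offsets.

    Returns a sorted list of (start, end) half-open input ranges, one per
    candidate field. Assumption: bytes reached through the same base address
    belong to the same data structure.
    """
    by_base = defaultdict(set)
    for acc in trace:
        by_base[acc.base].add(acc.input_offset)

    fields = []
    for offsets in by_base.values():
        run_start = prev = None
        for off in sorted(offsets):
            if prev is None or off != prev + 1:
                # A gap closes the current run of consecutive offsets.
                if run_start is not None:
                    fields.append((run_start, prev + 1))
                run_start = off
            prev = off
        if run_start is not None:
            fields.append((run_start, prev + 1))
    return sorted(fields)


# Made-up trace: a 2-byte header copied to a buffer at 0x1000 and a 4-byte
# payload copied to a buffer at 0x2000, each accessed one byte at a time.
trace = [Access(0x1000, 0x1000 + i, i) for i in range(2)]
trace += [Access(0x2000, 0x2000 + i, 2 + i) for i in range(4)]
fields = infer_fields(trace)
print(fields)
```

In this made-up trace every byte is accessed individually, so an approach that relies on a field being accessed as one unit would see six one-byte accesses, while grouping by base address yields two candidate fields.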