Motivation
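Many Linux drivers have not been extensively tested, and many security vulnerabilities are found in device drivers. Existing fuzzing approaches either require real devices, which prevents drivers from being tested at scale, or rely on device emulation. Existing emulation methods, however, are built by hand, cover few functions, and likewise cannot be used for large-scale testing. The authors therefore propose an automated approach to device modeling.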
Related Work
Linux Fuzz Method
- SyzVegas uses multi-armed-bandit algorithms to adapt the task selection and seed selection algorithms in Syzkaller, improving the performance of Syzkaller.
- HEALER [41] utilizes system call relation learning to improve the effectiveness of fuzzing.
- kAFL [36] uses Intel-PT to collect coverage from the processor and performs coverage-guided fuzzing.
- HFL [18] combines fuzzing with symbolic execution for hybrid kernel fuzzing.
- IMF [10] discovers deep kernel bugs by leveraging an inferred dependence model between kernel API functions.
- MoonShine [27] performs static analysis to detect dependencies between system calls and collects system call traces from real-world programs for distilling seeds.
- Digtool [30] designs a hypervisor to catch kernel behaviors dynamically and leverages logs to discover kernel vulnerabilities.
- NTFuzz [5] utilizes static binary analysis to infer the system call types and performs type-aware fuzzing on Windows.
Linux SA & SE Method
- SADA [3] recognizes both unsafe streaming DMA access and coherent DMA access in device drivers.
- DCUAF [2] takes a local-global analysis to extract function pairs that may be executed concurrently and performs a lockset analysis to detect concurrent use-after-free bugs. (Concurrency-related issues.)
- DSAC [4] uses a heuristics-based method to extract functions that may sleep at runtime in order to detect sleep-in-atomic-context bugs.
- HERO [48] finds disordered error handling bugs by pairing error-handling functions based on their unique structures.
- EECATCH [28] can detect exaggerated error handling bugs, in which the consequences of the error handling are worse than the error itself.
- CRIX [21] is designed to detect missing-check bugs in drivers.
- DR.CHECKER [22] performs a soundy static analysis to find bugs with pointer analysis and taint analysis. However, static analysis has its shortcomings. First, manual analysis is needed to determine whether the bugs reported by the tool are actual bugs. Second, static analysis is usually heuristics-based, so manual effort is also needed to formulate rules for each specific type of bug.
- SymDrive [35] removes the need for hardware via symbolic execution. However, besides the common problems of symbolic execution, SymDrive also requires manual annotation of the driver code and manual configuration.
Solution
Virtual Device Modeling
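Before a device can interact with user-space programs it must be initialized, and during initialization the driver checks whether the hardware is in working order. A virtual device model must therefore first satisfy the checks performed during initialization.
Data Space Modeling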
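During initialization, the device driver reads data from device registers and checks it against constraints.
The authors handle this by first using heuristic static analysis to find the read-like functions in the driver, and then performing data-flow analysis and constraint solving on the values they return.
To identify read-like functions, the authors first collect the read APIs provided by Linux and then check whether the driver defines wrapper functions around those APIs. Their criteria for a wrapper function are as follows:
- The function name contains “read”.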
- The number of basic blocks of the function is less than five.
- The function calls the built-in functions capable of reading.
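The three criteria above can be sketched as a small classifier. This is only an illustration: `struct func_summary`, the `builtin_reads` list, and the choice to require all three criteria at once are assumptions made for this sketch, not details taken from PrIntFuzz itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-function summary, as a static analyzer might emit it. */
struct func_summary {
    const char *name;            /* function name */
    size_t num_basic_blocks;     /* basic blocks in the function's CFG */
    const char *const *callees;  /* names of directly called functions */
    size_t num_callees;
};

/* A few kernel read primitives; illustrative, not an exhaustive list. */
static const char *const builtin_reads[] = {
    "readb", "readw", "readl", "inb", "inw", "inl",
    "ioread8", "ioread16", "ioread32",
};

static bool is_builtin_read(const char *name)
{
    size_t n = sizeof(builtin_reads) / sizeof(builtin_reads[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(name, builtin_reads[i]) == 0)
            return true;
    return false;
}

/* Assumption: a function is a read wrapper only if all three criteria hold. */
static bool is_read_wrapper(const struct func_summary *fn)
{
    assert(fn != NULL && fn->name != NULL);

    if (strstr(fn->name, "read") == NULL)   /* name contains "read" */
        return false;
    if (fn->num_basic_blocks >= 5)          /* fewer than five basic blocks */
        return false;
    for (size_t i = 0; i < fn->num_callees; i++)
        if (is_builtin_read(fn->callees[i])) /* calls a built-in read */
            return true;
    return false;
}
```

A small function whose name contains "read" and that calls `readl` would be accepted, while a large function or one that never reaches a built-in read would be rejected.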
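Once the read functions are identified, a def-use analysis is performed on the values they return. Not every constraint has to be satisfied, however: the subsequent control flow is analyzed further to decide whether a given constraint needs to be satisfied.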
I/O and Memory Space Modeling
Each PCI device implements at most six PCI address regions, and the driver checks that a region has the type it expects:
    if (!(pci_resource_flags(dev, 0) & IORESOURCE_IO))
            return -ENODEV;
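PrIntFuzz first locates the relevant macros or functions, then extracts the region index and the expected resource type so that the check can be bypassed. It then records the position of the region and the corresponding resource type.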
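One way to picture the recorded information is a per-device table with one slot per address region. The sketch below is a user-space illustration only; `struct bar_model`, `enum bar_type`, and both helper functions are hypothetical names and do not come from PrIntFuzz or the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NUM_BARS 6  /* a PCI device has at most six address regions */

/* Hypothetical resource types a driver check can demand for a region. */
enum bar_type { BAR_UNUSED = 0, BAR_IO, BAR_MEM };

/* Hypothetical virtual-device record: the expected type of each region. */
struct bar_model {
    enum bar_type type[MODEL_NUM_BARS];
};

/*
 * Record that the driver expects region `bar` to have type `type`.
 * Returns 0 on success, -1 for an invalid index or a conflicting record.
 */
static int bar_model_require(struct bar_model *m, size_t bar, enum bar_type type)
{
    assert(m != NULL);

    if (bar >= MODEL_NUM_BARS || type == BAR_UNUSED)
        return -1;
    if (m->type[bar] != BAR_UNUSED && m->type[bar] != type)
        return -1;
    m->type[bar] = type;
    return 0;
}

/* Model-side analogue of `pci_resource_flags(dev, bar) & IORESOURCE_IO`. */
static int bar_model_is_io(const struct bar_model *m, size_t bar)
{
    assert(m != NULL);
    return bar < MODEL_NUM_BARS && m->type[bar] == BAR_IO;
}
```

For the check shown earlier, the analysis would record region 0 as an I/O region, so a virtual device generated from this table passes the test instead of failing with -ENODEV.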
Configuration Space Modeling
PCI devices contain several configuration registers that hold configuration information about the hardware, and drivers can read or write to these registers.
Among these configuration registers, five standard registers are critical: vendorID, deviceID, class, subsystem vendorID, and subsystem deviceID. These registers are read by the PCI bus when scanning for devices to determine which driver this hardware should match. Correspondingly, on the driver side, the pci_device_id structure is provided to define the list of different types of devices supported by this driver [25], as shown in Listing 3.
    struct pci_device_id {
            __u32 vendor, device;           /* Vendor and device ID or PCI_ANY_ID */
            __u32 subvendor, subdevice;     /* Subsystem ID's or PCI_ANY_ID */
            __u32 class, class_mask;        /* (class,subclass,prog-if) triplet */
            kernel_ulong_t driver_data;     /* Data private to the driver */
    };
The pci_device_id table is referenced by a field of pci_driver (id_table), and the authors perform a field-sensitive analysis to locate the pci_driver and its pci_device_id table.
Besides the standard registers above, drivers also check other, device-specific registers:
    pci_read_config_dword(pdev, 0x80, &reg);
    if (reg != ADM8211_SIG1 && reg != ADM8211_SIG2) {
            printk("%s : Invalid signature (0x%x)\n", pci_name(pdev), reg);
            err = -EINVAL;
            goto err_disable_pdev;
    }
For these checks, PrIntFuzz first extracts the corresponding functions and then handles them with an approach similar to Data Space Modeling.
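As a rough illustration of the outcome, the solved constraints can be thought of as (offset, value) pairs that the virtual device returns on configuration reads. Everything in the sketch below is hypothetical: `struct cfg_model`, the helper functions, and the `EXAMPLE_SIG*` constants are placeholders, not the real ADM8211 signature values or PrIntFuzz data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder signatures; NOT the real ADM8211_SIG1 / ADM8211_SIG2 values. */
#define EXAMPLE_SIG1 0x11111111u
#define EXAMPLE_SIG2 0x22222222u

/* One solved constraint: a read at `offset` should yield `value`. */
struct cfg_constraint {
    uint32_t offset;
    uint32_t value;
};

/* Hypothetical configuration-space model of a virtual device. */
struct cfg_model {
    const struct cfg_constraint *entries;
    size_t count;
};

/* Model-side analogue of a 32-bit configuration read. */
static uint32_t cfg_model_read_dword(const struct cfg_model *m, uint32_t offset)
{
    assert(m != NULL);

    for (size_t i = 0; i < m->count; i++)
        if (m->entries[i].offset == offset)
            return m->entries[i].value;
    return 0;  /* unconstrained registers read as zero in this sketch */
}

/* The driver's signature check, restated against the model. */
static int probe_signature_ok(const struct cfg_model *m)
{
    uint32_t reg = cfg_model_read_dword(m, 0x80);

    return reg == EXAMPLE_SIG1 || reg == EXAMPLE_SIG2;
}
```

With a constraint recorded for offset 0x80 the restated check succeeds, whereas an empty model reads zero and takes the error path, which is why such registers have to be modeled at all.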