Arya

Getting Coverage(二)
2019/01/19

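[+] Another nightmare; this time it was an earthquake that hit like a tsunami
[+] The only thing worth being happy about is that I finally got to eat chestnuts today
[+] I want garlic-sauce sliced pork (蒜泥白肉)!!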

0x01 Getting basic coverage from fuzzing

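We can use the Coverage class to track and evaluate how effective our earlier test methods are. For example, we can take the fuzzer() we wrote before, generate a string of random length, feed it to cgi_decode() as input, and use Coverage to record which lines of the function that random string reaches.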

from Fuzzer import fuzzer
sample = fuzzer()
sample

This produces a random string:

!7#%"*#0=)$;%6*;>638:*>80"=</>(/*:-(2<4 !:5*6856&?""11<7+%<%7,4.8,*+&,,$,."

Of course, we first wrap the call to cgi_decode() in try/except, so that an error raised by an invalid input does not terminate the program.

with Coverage() as cov_fuzz:
    try:
        cgi_decode(sample)
    except:
        pass  # invalid input; we only care about the lines that ran
print(cov_fuzz.coverage())

Output:
{('cgi_decode', 17), ('cgi_decode', 28), ('cgi_decode', 18), ('cgi_decode', 29), ('__exit__', 123), ('cgi_decode', 9), ('cgi_decode', 20), ('cgi_decode', 10), ('cgi_decode', 21), ('cgi_decode', 11), ('cgi_decode', 12), ('cgi_decode', 14), ('cgi_decode', 25), ('cgi_decode', 15), ('cgi_decode', 16)}
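
The Coverage class used here comes from the previous part of this series and is not repeated in this post. As a reminder of how such a class can work, here is a minimal sketch built on sys.settrace; the internal names (_tracer, _trace, _previous) and the sign() function under test are assumptions for illustration and may differ from the class written earlier. It also suggests why an ('__exit__', ...) entry shows up in the output above: the trace function is still active when __exit__ starts running.

```python
import sys


class Coverage:
    """Record (function name, line number) pairs executed inside a `with` block."""

    def __init__(self):
        self._trace = []

    def _tracer(self, frame, event, arg):
        # The interpreter calls this for every traced event; keep only 'line' events.
        if event == "line":
            self._trace.append((frame.f_code.co_name, frame.f_lineno))
        return self._tracer

    def __enter__(self):
        self._previous = sys.gettrace()
        sys.settrace(self._tracer)
        return self

    def __exit__(self, exc_type, exc_value, tb):
        sys.settrace(self._previous)  # restore whatever was installed before
        return False                  # do not swallow exceptions

    def coverage(self):
        return set(self._trace)


def sign(x):  # hypothetical function under test
    if x < 0:
        return -1
    return 1


with Coverage() as cov:
    sign(5)

print(sorted(cov.coverage()))
```

Because the result is a set, a line that runs many times inside a loop still counts only once.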

We can compare the coverage of this string against the maximum coverage to see which lines it missed:

cov_max.coverage() - cov_fuzz.coverage()

{('cgi_decode', 22), ('cgi_decode', 23), ('cgi_decode', 30), ('cgi_decode', 19), ('cgi_decode', 26)}
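
Since coverage is a plain Python set, the same comparison also gives a coverage ratio. The sketch below uses small made-up sets (the names max_lines and fuzz_lines and the line numbers are illustrative assumptions, not the values from the run above).

```python
# Toy coverage sets; the line numbers are invented for illustration.
max_lines = {("cgi_decode", n) for n in range(9, 13)}   # lines 9, 10, 11, 12
fuzz_lines = {("cgi_decode", 9), ("cgi_decode", 11)}

missed = max_lines - fuzz_lines                          # reached only by the best case
ratio = len(fuzz_lines & max_lines) / len(max_lines)     # share of lines the fuzz run hit

print(sorted(missed))   # [('cgi_decode', 10), ('cgi_decode', 12)]
print(f"{ratio:.0%}")   # 50%
```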

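Next, we can measure the coverage reached when cgi_decode() is tested with 100 random strings produced by fuzzer().

We define a list cumulative_coverage that stores how coverage accumulates over time. cumulative_coverage[0] is the number of lines covered after the first input, cumulative_coverage[1] is the number of lines in the union of what the first and second inputs covered, and in general cumulative_coverage[n] is the number of lines in the union of what the first n+1 inputs covered.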

trials = 100

def population_coverage(population, function):
    cumulative_coverage = []
    all_coverage = set()

    for s in population:
        with Coverage() as cov:
            try:
                function(s)
            except:
                pass
        all_coverage |= cov.coverage()  # union with everything covered so far
        cumulative_coverage.append(len(all_coverage))  # record the running total to observe its growth

    return all_coverage, cumulative_coverage
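
To make the indexing concrete, the same accumulation can be run on a few hand-written sets. This is only a toy sketch: cumulative_counts and the sets in toy_runs are hypothetical stand-ins for the per-input coverage sets.

```python
def cumulative_counts(coverage_sets):
    """Return the size of the running union after each coverage set."""
    seen = set()
    counts = []
    for lines in coverage_sets:
        seen |= lines            # everything covered so far
        counts.append(len(seen))
    return counts


# Four made-up inputs and the line numbers each one covered.
toy_runs = [{1, 2}, {2, 3}, {1}, {4, 5}]
print(cumulative_counts(toy_runs))  # [2, 3, 3, 5]
```

The list never decreases, and it stays flat whenever an input covers nothing new (the third input here).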

Now we can create the random input strings:

def hundred_inputs():
    population = []
    for i in range(trials):
        population.append(fuzzer())
    return population
all_coverage, cumulative_coverage = population_coverage(hundred_inputs(), cgi_decode)

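We use the matplotlib.pyplot module to see how the cumulative number of covered lines develops:

import matplotlib.pyplot as plt

plt.plot(cumulative_coverage)  # draw a line chart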
plt.title('Coverage of cgi_decode() with random inputs')
plt.xlabel('# of inputs')
plt.ylabel('lines covered')
plt.show()

A chart window pops up,
showing the cumulative trend after each input
Image5-1.png

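Of course, we can also collect more data. The chart above shows the growth for a single group of 100 inputs; now we run 100 groups of 100 inputs each and look at the average growth. The list sum_coverage stores, for each position i, the sum over all runs of the cumulative coverage after input i, and average_coverage stores sum_coverage[i] / 100, the average cumulative coverage at that position, which we then plot.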

runs = 100

# Create an array with TRIALS elements, all zero
sum_coverage = [0] * trials

for run in range(runs):
    all_coverage, coverage = population_coverage(hundred_inputs(), cgi_decode)
    assert len(coverage) == trials
    for i in range(trials):
        sum_coverage[i] += coverage[i]

average_coverage = []
for i in range(trials):
    average_coverage.append(sum_coverage[i] / runs)
plt.plot(average_coverage)
plt.title('Average coverage of cgi_decode() with random inputs')
plt.xlabel('# of inputs')
plt.ylabel('lines covered')
plt.show()

Image5-2.png

0x02 Getting coverage from external programs

Next, the author demonstrates how to obtain coverage for C code from Python. First we need to write a C implementation of cgi_decode:

cgi_c_code = """/* CGI decoding as C program */

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
"""

cgi_c_code += r"""
int hex_values[256];

void init_hex_values() {
    for (int i = 0; i < sizeof(hex_values) / sizeof(int); i++) {
        hex_values[i] = -1;
    }
    hex_values['0'] = 0;
    hex_values['1'] = 1;
    hex_values['2'] = 2;
    hex_values['3'] = 3;
    hex_values['4'] = 4;
    hex_values['5'] = 5;
    hex_values['6'] = 6;
    hex_values['7'] = 7;
    hex_values['8'] = 8;
    hex_values['9'] = 9;

    hex_values['a'] = 10;
    hex_values['b'] = 11;
    hex_values['c'] = 12;
    hex_values['d'] = 13;
    hex_values['e'] = 14;
    hex_values['f'] = 15;

    hex_values['A'] = 10;
    hex_values['B'] = 11;
    hex_values['C'] = 12;
    hex_values['D'] = 13;
    hex_values['E'] = 14;
    hex_values['F'] = 15;
}
"""

cgi_c_code += r"""
int cgi_decode(char *s, char *t) {
    while (*s != '\0') {
        if (*s == '+')
            *t++ = ' ';
        else if (*s == '%') {
            int digit_high = *++s;
            int digit_low = *++s;
            if (hex_values[digit_high] >= 0 && hex_values[digit_low] >= 0) {
                *t++ = hex_values[digit_high] * 16 + hex_values[digit_low];
            }
            else
                return -1;
        }
        else
            *t++ = *s;
        s++;
    }
    *t = '\0';
    return 0;
}
"""
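
The pieces assembled so far define init_hex_values() and cgi_decode() but no main(), while the compile and run steps that follow need an entry point that reads the string from argv[1]. The fragment below is a minimal sketch of such a main(), kept in the same Python-string style; the name main_c_code and the exact body are assumptions, not the listing from the original article.

```python
# Hypothetical main() for the C program; not part of the listing above.
main_c_code = r"""
int main(int argc, char *argv[]) {
    init_hex_values();

    if (argc >= 2) {
        char *s = argv[1];
        /* The decoded output is never longer than the input; calloc zero-fills
           the buffer so it is printable even when decoding stops early. */
        char *t = calloc(strlen(s) + 1, 1);
        if (t == NULL)
            return 1;
        int ret = cgi_decode(s, t);
        printf("%s\n", t);
        free(t);
        return ret;
    }
    printf("usage: cgi_decode STRING\n");
    return 1;
}
"""

# To use it, append it to the source built so far before writing the file:
#     cgi_c_code += main_c_code
print(main_c_code)
```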

Write the code out as a C source file:

with open("cgi_decode.c", "w") as f:
    f.write(cgi_c_code)

Next, compile this C code into an executable with coverage instrumentation:

cc --coverage -o cgi_decode cgi_decode.c

This produces an executable named cgi_decode.
We can now run it:

./cgi_decode 'Send+mail+to+me%40fuzzingbook.org'

Send mail to me@fuzzingbook.org

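The article then introduces a new tool, gcov, which collects information about which code was covered and reports the code coverage. For each source file it is given, it creates a new .gcov file holding the coverage information.

Create the .gcov file: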

!gcov cgi_decode.c

File 'cgi_decode.c'
Lines executed:91.89% of 35
cgi_decode.c:creating 'cgi_decode.c.gcov'
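
The compile, run, and gcov steps above were typed as shell commands. They can also be driven from Python with subprocess, which is convenient when many inputs need to be tried. The following is a sketch under the assumption that cc and gcov are on the PATH; the helper names coverage_commands and run_with_coverage are made up for this example.

```python
import subprocess


def coverage_commands(c_file, program, argument):
    """Return the compile, run, and gcov command lines as argument lists."""
    return [
        ["cc", "--coverage", "-o", program, c_file],  # instrumented build
        ["./" + program, argument],                   # one run records the counters
        ["gcov", c_file],                             # writes c_file + ".gcov"
    ]


def run_with_coverage(c_file, program, argument):
    """Run all three steps and return their standard output (needs cc and gcov)."""
    outputs = []
    for command in coverage_commands(c_file, program, argument):
        # check=False: the program under test may legitimately exit nonzero.
        # errors="replace": decoded output may contain bytes that are not valid text.
        result = subprocess.run(command, capture_output=True, text=True,
                                errors="replace", check=False)
        outputs.append(result.stdout)
    return outputs


# Show the commands without executing them.
for command in coverage_commands("cgi_decode.c", "cgi_decode", "Send+mail"):
    print(" ".join(command))
```

Passing the arguments as a list, rather than one shell string, means a fuzzed input containing quotes or spaces reaches the program unchanged.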

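In this file, the prefix of each line is the number of times that line was executed (- means the line is not executable, and ##### means the line was never executed). Looking at the cgi_decode() function, we can see that the only code that was not executed is return -1.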
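
The prefix format just described can be decoded one line at a time. The sketch below assumes the plain output of `gcov file.c` (no branch or block options), where every line has the shape `count:line_number:source`; the helper name parse_gcov_line and the sample lines are made up for illustration, and any non-numeric marker other than - is treated as "not executed".

```python
def parse_gcov_line(line):
    """Split one .gcov line into (count, line_number, source_text).

    count is an int for executed lines, 0 for never-executed lines,
    and None for lines that are not executable.
    """
    # Split on the first two colons only; the source text may contain colons.
    prefix, number, source = line.rstrip("\n").split(":", 2)
    prefix = prefix.strip()
    if prefix == "-":
        count = None                      # not executable
    elif prefix[0].isdigit():
        count = int(prefix.rstrip("*"))   # some gcov versions append '*' to a count
    else:
        count = 0                         # e.g. '#####': executable but never run
    return count, int(number), source


# Made-up sample lines in the .gcov layout.
samples = [
    "        -:    0:Source:cgi_decode.c\n",
    "        3:   31:        if (*s == '+')\n",
    "    #####:   40:                return -1;\n",
]
for sample_line in samples:
    print(parse_gcov_line(sample_line))
```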

The following code prints part of the .gcov file:

lines = open('cgi_decode.c.gcov').readlines()
for i in range(30, 50):
    print(lines[i], end='')

Image5-3.png

We can also use the .gcov file to build a coverage set:

def read_gcov_coverage(c_file):
    gcov_file = c_file + ".gcov"
    coverage = set()
    with open(gcov_file) as file:
        for line in file.readlines():
            elems = line.split(':')
            covered = elems[0].strip()
            line_number = int(elems[1].strip())
            if covered.startswith('-') or covered.startswith('#'):
                continue
            coverage.add((c_file, line_number))
    return coverage
 
coverage = read_gcov_coverage('cgi_decode.c')

list(coverage)[:5]  # show the first 5 entries
[('cgi_decode.c', 15),
 ('cgi_decode.c', 20),
 ('cgi_decode.c', 55),
 ('cgi_decode.c', 29),
 ('cgi_decode.c', 34)]

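At the end, the author briefly grumbles that manual testing cannot cover every statement. Only once we fuzz the program later on can we reach the remaining statements, including the error-handling ones, and so cover as much of the code as possible.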

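[+] This section mainly presents the ideas behind black-box and white-box testing, which in the end both come down to measuring coverage
[+] Among the many coverage metrics, the most important are statement coverage and branch coverage
[+] Only by running as many random tests as possible can we cover as many statements as possible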

Last modification:January 19th, 2019 at 10:19 pm