Arya

Fuzzing: Breaking Things with Random Inputs (Part 2)
17
2019/01

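[+] Starting today I am definitely done with staying up late and with falling behind schedule
[+] It really hurts to see the good habits from my grad-school exam prep fall apart (sob)
[+] Still forgot to buy chestnuts
[+] Sometimes I want to play for a whole day with no studying and no guilt; that would be great
[+] Also, this section is really long; I'll paste the remaining code when I get back tonight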

0x04 Catching Errors

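In most cases, when a program crashes or hangs, it is easy to tell that something has gone wrong. Many failures are harder to spot, though, so additional checks are needed.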

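Buffer overflows are a common example. In C and C++, a program can read and write memory through unchecked pointers and array indices, so memory access errors at run time are easy to make. To detect them, we can use the AddressSanitizer tool.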

AddressSanitizer is a memory error detector built into clang (and also available in gcc). It can detect the following problems:
Out-of-bounds accesses to heap, stack and globals
Use-after-free
Use-after-return (to some extent)
Double-free, invalid free
Memory leaks (experimental)

Next, install clang and try AddressSanitizer.
Version requirement: LLVM 3.1 or gcc 4.8, or later
(the default gcc on Ubuntu 14.04 is 4.8)

sudo apt-get install clang-3.3

Write a small C program named program.c. It compiles cleanly, but it never checks the index it is given:


#include <stdlib.h>
#include <string.h>

int main(int argc, char** argv) {
    /* Create an array with 100 bytes, initialized with 42 */
    char *buf = malloc(100);
    memset(buf, 42, 100);

    /* Read the N-th element, with N being the first command-line argument */
    int index = atoi(argv[1]);
    char val = buf[index];

    /* Clean up memory so we don't leak */
    free(buf);
    return val;
}

Compile program.c with address sanitization enabled:

clang -fsanitize=address -g -o program program.c

Access buf[99]:

./program 99; echo $?

42
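Next, access buf[110] by passing 110 as the argument, even though the source code allocates only 100 bytes for buf.
AddressSanitizer reports an error:
Image3-1.png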

One line of the report stands out:
ERROR: AddressSanitizer: heap-buffer-overflow on address
This is clearly an out-of-bounds access.
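
When a fuzzer runs the instrumented program many times, nobody reads each report by hand, so the marker line quoted above is what a script would look for. Below is a minimal sketch of that idea. The function name asan_error_kind and the sample report text (process id and addresses) are made up for illustration; only the "ERROR: AddressSanitizer: ..." prefix comes from the real report.

```python
import re

# The marker that AddressSanitizer prints, followed by the kind of error.
ASAN_ERROR = re.compile(r"ERROR: AddressSanitizer: ([\w-]+)")


def asan_error_kind(stderr_text):
    """Return the error kind named in an AddressSanitizer report, or None."""
    match = ASAN_ERROR.search(stderr_text)
    return match.group(1) if match else None


# Hypothetical report fragment; the pid and addresses are invented.
sample = "==1234==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x1234 at pc 0x5678"
print(asan_error_kind(sample))      # the error kind, if a report is present
print(asan_error_kind("all good"))  # None when there is no report
```

In practice the text would come from the stderr of the compiled program rather than from a hard-coded string.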

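The author gives the Heartbleed bug as an example of an out-of-bounds read and explains it with a short comic. Heartbleed is a security bug in the OpenSSL library: its implementation of the TLS heartbeat extension did not validate its input properly (a bounds check was missing).
In the comic, the client asks the server to reply with the 3-letter word HAT but claims a length of 500 letters, so the server reads 500 bytes, far beyond the word it was sent.
Image3-2.png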
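
In Python terms, the missing check boils down to comparing the length the client claims with the length of what it actually sent. The sketch below is my own illustration, not OpenSSL's fix; the function name checked_heartbeat and the choice to raise ValueError are assumptions.

```python
def checked_heartbeat(payload, claimed_length):
    """Echo `payload` back, but only if the claimed length fits what was sent."""
    if claimed_length < 0 or claimed_length > len(payload):
        # This is the bounds check whose absence caused the leak.
        raise ValueError("claimed length %d does not fit payload of size %d"
                         % (claimed_length, len(payload)))
    return payload[:claimed_length]


print(checked_heartbeat("HAT", 3))
try:
    checked_heartbeat("HAT", 500)
except ValueError as err:
    print("rejected:", err)
```

With the check in place, the request from the comic is refused instead of being answered with whatever happens to sit behind the word in memory.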

Next, the author reproduces this out-of-bounds information leak in Python.

First, build a string that mixes actual data with random data:

s = ("<space for reply>" + fuzzer(100)
     + "<secret-certificate>" + fuzzer(100)
     + "<secret-key>" + fuzzer(100) + "<other-secrets>")

Then define a string constant uninitialized_memory_marker that stands for uninitialized memory, and append it until s is at least 2048 characters long:

uninitialized_memory_marker = "deadbeef"
while len(s) < 2048:
    s += uninitialized_memory_marker

Next, create the memory by turning the string into a list, which can be modified:

memory = []
for c in s:
    memory.append(c)

Define a function that simulates a server with the Heartbleed vulnerability:

def heartbeat(reply, length):
    global memory

    # Store reply in memory
    for i in range(len(reply)):
        memory[i] = reply[i]
    memory[i + 1] = '\n'

    # Send back heartbeat
    s = ""
    for i in range(length):
        s += memory[i]
    return s

Call the function:

print(heartbeat("potato", 6))  # send "potato", get 6 bytes back

potato
The reply is correct.

print(heartbeat("bird", 4))

bird

But with the following call

heartbeat("hat", 500)

500 bytes come back: the actual reply, the secrets stored after it, and the uninitialized memory.
hat\n\no\nfor reply>#,,!3?30>#61)$4--8=<7)4 )03/%,5+! "4)0?.9+?3();<42?=?0<secret-certificate>7(+/+((1)#/0\'4!>/<#=78%6$!!$<-"3"\'-?1?85!05629%/); *)1\'/=9%<secret-key>.(#.4%<other-secrets>deadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadbeefdeadb

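That completes the Python simulation of the Heartbleed bug.
So how can this kind of information leak be caught?
The author gives a short snippet that uses assert statements to check each reply and raise an error: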

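from ExpectError import ExpectError
with ExpectError():
    for i in range(10):
        s = heartbeat(fuzzer(), random.randint(1, 500))
        # Caution: str.find() returns -1 when the marker is absent, and -1 is
        # truthy, so `not s.find(...)` is False for clean replies too;
        # `marker not in s` expresses the intended check.
        assert not s.find(uninitialized_memory_marker)
        assert not s.find("secret")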

This raises an error:
Traceback (most recent call last):
File "<ipython-input-43-8e163fe1f7be>", line 4, in <module>
assert not s.find(uninitialized_memory_marker)
AssertionError (expected)

So the first part showed how to guard against invalid input, and this part adds a way to check for invalid output.
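
One detail about the output check is worth spelling out: str.find() returns -1 when the text is absent and 0 when it sits at the very start, so `not s.find(marker)` is not an "is absent" test. A membership test with `in` says what is meant. Below is a minimal sketch; the helper name leaks is my own, and the two markers are the ones used in the simulation above.

```python
def leaks(reply, forbidden=("secret", "deadbeef")):
    """Return the forbidden markers that occur anywhere in `reply`."""
    return [marker for marker in forbidden if marker in reply]


# find() reports "absent" as -1, which is truthy.
assert "hat".find("secret") == -1

# A clean reply leaks nothing; a reply that ran past its bounds leaks both markers.
assert leaks("hat") == []
assert leaks("hat<secret-key>deadbeef") == ["secret", "deadbeef"]
```

A check like `assert leaks(s) == []` then fails only for replies that really contain leaked data.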

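The author also points out that early approaches to detecting errors and vulnerabilities relied on assertions over a program's inputs or outputs. One of the most important uses of assertions for finding errors is checking the integrity of complex data structures.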

The author's example is a mapping from airport codes to airport names:

airport_codes = {
    "YVR": "Vancouver",
    "JFK": "New York-JFK",
    "CDG": "Paris-Charles de Gaulle",
    "CAI": "Cairo",
    "LED": "St. Petersburg",
    "PEK": "Beijing",
    "HND": "Tokyo-Haneda",
    "AKL": "Auckland"}  # plus many more

Query it:

airport_codes["YVR"]

Vancouver

"AKL" in airport_codes

True

Because the mapping from codes to airports is critical, we want to check that its existing entries are well-formed:

def code_repOK(code):
    assert len(code) == 3, "Airport code must have three characters: " + repr(code)  # exactly three characters
    for c in code:
        assert c.isalpha(), "Non-letter in airport code: " + repr(code)  # letters only
        assert c.isupper(), "Lowercase letter in airport code: " + repr(code)  # uppercase only
    return True

assert code_repOK("SEA")

No error is raised: the code is well-formed.

Next, call code_repOK(code) on every entry in airport_codes:

def airport_codes_repOK():
    for code in airport_codes:
        assert code_repOK(code)
    return True
  
with ExpectError():
    assert airport_codes_repOK()  

If we now add a new entry to the mapping

airport_codes["YMML"] = "Melbourne"

then re-running the check raises an exception:
Traceback (most recent call last):
File "<ipython-input-52-21eb3b08ef3e>", line 2, in <module>
assert airport_codes_repOK()
File "<ipython-input-49-f8128f7dc918>", line 3, in airport_codes_repOK
assert code_repOK(code)
File "<ipython-input-47-345123a45730>", line 2, in code_repOK
assert len(code) == 3, "Airport code must have three characters: " + repr(code)
AssertionError: Airport code must have three characters: 'YMML' (expected)

The newly added code has four characters, which violates the requirement.

Add a function that inserts new entries and asserts that each new code is valid:

def add_new_airport(code, city):
    assert code_repOK(code)
    airport_codes[code] = city

with ExpectError():  # For BER, ExpectTimeout would be more appropriate
    add_new_airport("BER", "Berlin")

No error is raised.

with ExpectError():
    add_new_airport("London-Heathrow", "LHR")

Traceback (most recent call last):
File "<ipython-input-55-6aeb45bf2b91>", line 2, in <module>
add_new_airport("London-Heathrow", "LHR")
File "<ipython-input-53-f4d30ab4bf9e>", line 2, in add_new_airport
assert code_repOK(code)
File "<ipython-input-47-345123a45730>", line 2, in code_repOK
assert len(code) == 3, "Airport code must have three characters: " + repr(code)
AssertionError: Airport code must have three characters: 'London-Heathrow' (expected)

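Here the two arguments are swapped, so the value passed as the code is not three characters long and the assertion fires again.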

Finally, the author cites a red-black tree example to show how a repOK() method catches an invalid state in a more complex data structure.
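
To get a feel for what such a check looks like, here is a minimal sketch of a repOK-style check for the usual red-black tree invariants (black root, ordered keys, no red node with a red child, equal black height on every path). This is my own illustration and not the code from the book; the Node class and the names black_height and tree_repOK are assumptions.

```python
class Node:
    """Hypothetical tree node; `red` is True for red nodes, False for black ones."""
    def __init__(self, key, red, left=None, right=None):
        self.key = key
        self.red = red
        self.left = left
        self.right = right


def black_height(node, low=None, high=None):
    """Check the subtree at `node` and return its black height."""
    if node is None:
        return 1  # empty leaves count as black
    assert low is None or low < node.key, "key order violated at %r" % node.key
    assert high is None or node.key < high, "key order violated at %r" % node.key
    if node.red:
        for child in (node.left, node.right):
            assert child is None or not child.red, "red node %r has a red child" % node.key
    left = black_height(node.left, low, node.key)
    right = black_height(node.right, node.key, high)
    assert left == right, "unequal black heights below %r" % node.key
    return left + (0 if node.red else 1)


def tree_repOK(root):
    assert root is None or not root.red, "root must be black"
    black_height(root)
    return True


good = Node(2, red=False, left=Node(1, red=True), right=Node(3, red=True))
assert tree_repOK(good)

# A red node with a red child breaks the invariant.
bad = Node(2, red=False, left=Node(1, red=True, left=Node(0, red=True)))
try:
    tree_repOK(bad)
except AssertionError as err:
    print("invalid tree:", err)
```

Calling such a check after every insertion or deletion turns a silent corruption into an immediate assertion failure, which is exactly what a fuzzer needs in order to notice it.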

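Some of what repOK() checks at run time can also be checked statically, before the program runs, for example with the MyPy type checker: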

from typing import Dict
airport_codes = {
    "YVR": "Vancouver",  # etc
}  # type: Dict[str, str]
airport_codes[1] = "First"

Install mypy and run it on the file:

$ python3 -m pip install -U mypy
$ mypy airports.py
airports.py: error: Invalid index type "int" for "Dict[str, str]"; expected type "str"

0x05 Fuzzing Architecture

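So that parts of this chapter's code can be reused in later chapters, the author defines a Runner class as the interface and a PrintRunner subclass that inherits from it: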

class Runner(object):
    # Test outcomes
    PASS = "PASS"  # the program ran correctly
    FAIL = "FAIL"  # the program produced an error
    UNRESOLVED = "UNRESOLVED"  # the test could not run properly, e.g. on invalid input

    def __init__(self):
        """Initialize"""
        pass

    def run(self, inp):
        """Run the runner with the given input"""
        return (inp, Runner.UNRESOLVED)
        
 
class PrintRunner(Runner):
    def run(self, inp):
        """Print the given input"""
        print(inp)
        return (inp, Runner.UNRESOLVED)
        
p = PrintRunner()
(result, outcome) = p.run("Some input")

Calling it prints the input:
Some input
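
The point of the Runner interface is that anything with a run(inp) method returning a (result, outcome) pair can be plugged in. As an illustration, here is a hypothetical FunctionRunner that calls a Python function in-process. It is my own sketch, and treating every exception as FAIL is a simplifying assumption. The small Runner class is repeated so that the snippet runs on its own.

```python
class Runner:
    """Same interface as the Runner above, repeated to keep this snippet standalone."""
    PASS = "PASS"
    FAIL = "FAIL"
    UNRESOLVED = "UNRESOLVED"

    def run(self, inp):
        return (inp, Runner.UNRESOLVED)


class FunctionRunner(Runner):
    """Hypothetical runner: call a Python function and classify what happens."""
    def __init__(self, function):
        self.function = function

    def run(self, inp):
        try:
            result = self.function(inp)
        except Exception as exc:
            # Simplification: any exception counts as a failure.
            return (exc, Runner.FAIL)
        return (result, Runner.PASS)


int_runner = FunctionRunner(int)
print(int_runner.run("42"))
print(int_runner.run("4x")[1])
```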

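Next comes the ProgramRunner subclass, which passes the input to an external program and derives the outcome from its return code: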

import subprocess

class ProgramRunner(Runner):
    def __init__(self, program):
        """Initialize.  `program` is a program spec as passed to `subprocess.run()`"""
        self.program = program

    def run_process(self, inp=""):
        """Run the program with `inp` as input.  Return result of `subprocess.run()`."""
        return subprocess.run(self.program,
                              input=inp,
                              stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE,
                              universal_newlines=True)

    def run(self, inp=""):
        """Run the program with `inp` as input.  Return test outcome based on result of `subprocess.run()`."""
        result = self.run_process(inp)

        if result.returncode == 0:
            outcome = self.PASS
        elif result.returncode < 0:
            outcome = self.FAIL
        else:
            outcome = self.UNRESOLVED

        return (result, outcome)

Another subclass handles programs that take binary input and produce binary output:

class BinaryProgramRunner(ProgramRunner):
    def run_process(self, inp=""):
        """Run the program with `inp` as input.  Return result of `subprocess.run()`."""
        return subprocess.run(self.program,
                              input=inp.encode(),
                              stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE)

To see how this works, use a ProgramRunner to call the cat command with the string hello as input:

cat = ProgramRunner(program="cat")
print(cat.run("hello"))

(CompletedProcess(args='cat', returncode=0, stdout='hello', stderr=''), 'PASS')
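
The three outcomes follow from the return code alone: 0 means PASS, a negative value means FAIL, and anything else means UNRESOLVED. On POSIX systems, subprocess reports a negative return code -N when the child was terminated by signal N, which is why a crash ends up as FAIL. The sketch below replays that mapping with a child Python interpreter instead of cat; the helper names classify and run_python are my own.

```python
import subprocess
import sys


def classify(returncode):
    """Map a process return code to an outcome, as ProgramRunner.run() does."""
    if returncode == 0:
        return "PASS"
    if returncode < 0:
        return "FAIL"
    return "UNRESOLVED"


def run_python(snippet):
    """Run `snippet` in a child Python interpreter and classify its exit status."""
    result = subprocess.run([sys.executable, "-c", snippet],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE,
                            universal_newlines=True)
    return classify(result.returncode)


print(run_python("pass"))                     # clean exit
print(run_python("import sys; sys.exit(3)"))  # non-zero exit status
print(classify(-11))                          # return code of a child killed by signal 11
```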

Next, define a fuzzer base class:

class Fuzzer(object):
    def __init__(self):
        pass

    def fuzz(self):
        """Return fuzz input"""
        return ""

    def run(self, runner=Runner()):
        """Run `runner` with fuzz input"""
        return runner.run(self.fuzz())

    def runs(self, runner=PrintRunner(), trials=10):
        """Run `runner` with fuzz input, `trials` times"""
        # Note: the list comprehension below does not invoke self.run() for subclasses
        # return [self.run(runner) for i in range(trials)]
        outcomes = []
        for i in range(trials):
            outcomes.append(self.run(runner))
        return outcomes

Then define a subclass that inherits from Fuzzer:

class RandomFuzzer(Fuzzer):
    def __init__(self, min_length=10, max_length=100,
                 char_start=32, char_range=32):
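        self.min_length = min_length  # minimum length of each generated string
        self.max_length = max_length  # maximum length of each generated string
        self.char_start = char_start  # code point of the first character to draw from
        self.char_range = char_range  # number of consecutive code points to draw from

    def fuzz(self):
        """A string of `min_length` to `max_length` characters
           in the range [`char_start`, `char_start` + `char_range`)"""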
        string_length = random.randrange(self.min_length, self.max_length + 1)
        out = ""
        for i in range(0, string_length):
            out += chr(random.randrange(self.char_start,
                                        self.char_start + self.char_range))
        return out

Use RandomFuzzer to generate some random strings:

random_fuzzer = RandomFuzzer(min_length=20, max_length=20)
for i in range(10):
    print(random_fuzzer.fuzz())

Output:

'>23>33)(&"09.377.*3
*+:5 ? (?1$4<>!?3>.'
4+3/(3 (0%!>!(+9%,#$
/51$2964>;)2417<9"2&
907.. !7:&--"=$7',7*
(5=5'.!*+&>")6%9)=,/
?:&5) ";.0!=6>3+>)=,
6&,?:!#2))- ?:)=63'-
,)9#839%)?&(0<6("*;)
4?!(49+8=-'&499%?< '
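
The output contains only punctuation and digits because of the defaults char_start=32 and char_range=32: random.randrange excludes its upper bound, so the fuzzer draws from code points 32 to 63. A quick sketch that lists this alphabet (the variable names are mine):

```python
char_start, char_range = 32, 32  # the RandomFuzzer defaults

# All characters the fuzzer can produce with these settings.
alphabet = "".join(chr(code) for code in range(char_start, char_start + char_range))
print(repr(alphabet))

# One of the lines generated above uses only characters from this alphabet.
sample = "*+:5 ? (?1$4<>!?3>.'"
print(all(c in alphabet for c in sample))
```

Letters start at code point 65, so getting them into the fuzz input means passing a larger char_range.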

Now feed fuzz input to cat and check both the output and the outcome:

for i in range(10):
    inp = random_fuzzer.fuzz()
    result, outcome = cat.run(inp)
    assert result.stdout == inp
    assert outcome == Runner.PASS

random_fuzzer.run(cat)

(CompletedProcess(args='cat', returncode=0, stdout='?:+= % <1<6$:(>=:9)5', stderr=''), 'PASS')

We can also run cat repeatedly:

random_fuzzer.runs(cat, 10)

[(CompletedProcess(args='cat', returncode=0, stdout='3976%%&+%6=(1)3&3:<9', stderr=''), 'PASS'),
 (CompletedProcess(args='cat', returncode=0, stdout='33$#42$ 11=*%$20=<.-', stderr=''), 'PASS'),
 (CompletedProcess(args='cat', returncode=0, stdout='"?<'#8 </:*%9.--'97!', stderr=''), 'PASS'),
 (CompletedProcess(args='cat', returncode=0, stdout="/0-#(03/!#60'+6>&&72", stderr=''), 'PASS'),
 (CompletedProcess(args='cat', returncode=0, stdout="=,+:,6'5:950+><3(*()", stderr=''), 'PASS'),
 (CompletedProcess(args='cat', returncode=0, stdout=" 379+0?'%3137=2:4605", stderr=''), 'PASS'),
 (CompletedProcess(args='cat', returncode=0, stdout="02>!$</'*81.#</22>+:", stderr=''), 'PASS'),
 (CompletedProcess(args='cat', returncode=0, stdout="=-<'3-#88%&9< +1&&", stderr=''), 'PASS'),
 (CompletedProcess(args='cat', returncode=0, stdout='2;;0=3&6=8&30&<-;?*;', stderr=''), 'PASS'),
 (CompletedProcess(args='cat', returncode=0, stdout='/#05=*3($>::#7!0=12+', stderr=''), 'PASS')]
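
With more than a handful of trials, a list like this is easier to read as a tally. Below is a minimal sketch that counts outcomes in a list of (result, outcome) pairs such as the one runs() returns. The helper name summarize is my own, and the data here is a stand-in rather than real cat output.

```python
from collections import Counter


def summarize(results):
    """Count the outcomes in a list of (result, outcome) pairs."""
    return Counter(outcome for _, outcome in results)


# Stand-in data; with the classes above this would be random_fuzzer.runs(cat, 10).
results = [("out1", "PASS"), ("out2", "PASS"), ("out3", "FAIL"), ("out4", "UNRESOLVED")]
summary = summarize(results)
print(summary["PASS"], summary["FAIL"], summary["UNRESOLVED"])
```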

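[+] Finally done with this section; along the way I learned what caused the Heartbleed bug
[+] I also see now that the bugs fuzzing finds mostly show up through inputs and outputs
[+] I promise I'll study hard tomorrow (but 2 Broke Girls is just too good)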

Last modification:January 17th, 2019 at 09:53 pm