angr_ctf Practice Notes


(&& a beginner's intro to angr?)

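A very long time ago, while slacking off, I found a challenge set for learning angr step by step: angr_ctf.
Since I'm not fluent at writing angr scripts, I started learning (more or less from scratch) and am keeping notes here (so that those who come after can avoid the pitfalls I hit).

Still being updated...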

0xFF Where It All Begins

Environment

Environment: Windows 10 (1909) + WSL (Ubuntu 18.04)

Python: Python (3.6.9) + angr (9.0.4495) + templite (0.2.1)

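  • The GitHub challenge repository above targets Python 2, but angr dropped Python 2 support starting with 8.0, so the repository's scripts need minor tweaks (py2 -> py3; for example, whatever you print must be wrapped in parentheses).
  • Search online for installation details. I recommend installing angr inside a Python virtual environment (virtualenvwrapper).
  • If running XX_angr_find fails with Exec format error, see this blog post; once qemu is installed, run sudo service binfmt-support start to enable the service each time the error appears.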

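Project file structure

First download and extract the whole package. It contains one folder per challenge (the generator package used to build the executable; probably useless if you only want to solve the challenges), plus two more folders: dist and solutions.

dist contains:

  1. XX_xxxx: the ELF for each challenge.
  2. scaffoldXX.py: the solution script template with blanks (this is the actual task: fill in the blanks; its comments explain what each statement does in great detail).

Each challenge's folder under solutions contains:

  1. XX_xxxx: the ELF for each challenge.
  2. scaffoldXX.py: the solution script template with blanks (same as above).
  3. solveXX.py: the solution script (the correct answer =v=).

I just worked directly inside the dist folder (too lazy to switch directories).

p.s. To actually learn how to write the scripts, read the comments in scaffoldXX.py or this series; this post is only a record of my solves :/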

SymbolicExecution.pptx

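Don't understand the theory behind angr?

No problem: the SymbolicExecution.pptx file in the repository root explains it in great detail!

(Which is why I'm singling this ppt out.)

It covers roughly:

  • An introduction to symbolic execution
  • How angr works
  • Using angr in CTFs
  • Walkthroughs of the example challenges (the ones in the archive)
  • ...

The language is also humorous, and the pictures & animations completely dispel the dread of pure theory (?) hhhh. Once you reach slide 53 of the ppt (where the example challenges begin), I suggest trying the challenges yourself first and coming back to the walkthrough only for the parts you don't understand.

Of course you can also start learning angr straight from the comments in scaffoldXX.py, but once the comments alone stop making sense later on, just come back and read the ppt~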


0x00 angr_find

Open the file in IDA and locate the key branch ("Good Job"):

Put the address 0x8048678 (where Good Job is printed) into print_good_address; putting it all together gives the script:

import angr
import sys

path_to_binary = './00_angr_find'
project = angr.Project(path_to_binary)
initial_state = project.factory.entry_state()
simulation = project.factory.simgr(initial_state)

print_good_address = 0x8048678
simulation.explore(find=print_good_address)

if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()))
else:
    raise Exception('Could not find the solution')

0x01 angr_avoid

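Opening it in IDA takes a long time, and a prompt pops up:

And once analysis finishes, pressing F5 pops up this as well:

[tiny head, big question marks.jpg]

Truly a challenge designed for avoid.

Since there is no graph view (too lazy to edit the config file) and decompilation fails too, let's take a shortcut.

Press Shift+F12 to open the Strings window, then double-click "Good job".

Check its cross-references.

Locate the function's branch in IDA View, fill in the address, over.

Same for avoid:

Note that decompiling maybe_good() shows that should_succeed must be 1, while calling avoid_me() sets should_succeed to 0, so the avoid addresses must also include the address of avoid_me().

Also, my first angr run failed (careless of me): decompiling complex_function() shows that it is always on the path, so only the jump into "try again" at the if needs to be avoided.

The final script (with a timer added):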

import angr
import sys
import time

# time.clock() was removed in Python 3.8; time.perf_counter() is the replacement.
t = time.perf_counter()
path_to_binary = './01_angr_avoid'
project = angr.Project(path_to_binary)
initial_state = project.factory.entry_state()
simulation = project.factory.simgr(initial_state)

# Address that prints "Good Job", and the addresses to stay away from.
print_good_address = 0x80485DD
will_not_succeed_address = [0x80485EF, 0x804855B, 0x80485A8]
simulation.explore(find=print_good_address, avoid=will_not_succeed_address)

if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()))
else:
    raise Exception('Could not find the solution')

print('time:', round(time.perf_counter() - t, 2), 's')
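To make the find/avoid idea concrete without needing angr or the binary, here is a toy sketch in plain Python. It is only an analogy: the explore function, the toy_cfg graph, and its node names are all made up for illustration, and a plain graph search is not symbolic execution (angr tracks program states and their constraints, not just nodes). It just shows how pruning the avoid targets leaves only paths that can still reach the find target.

```python
from collections import deque

def explore(graph, start, find, avoid=()):
    """Breadth-first search for a path to `find` that never enters an `avoid` node."""
    avoid = set(avoid)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == find:
            return path
        for nxt in graph.get(node, []):
            # Prune avoided nodes and nodes that were already queued.
            if nxt in avoid or nxt in seen:
                continue
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

# Hypothetical control-flow graph; the names are illustrative, not real addresses.
toy_cfg = {
    'entry': ['avoid_me', 'check'],
    'avoid_me': ['check'],
    'check': ['good_job', 'try_again'],
}

path = explore(toy_cfg, 'entry', find='good_job', avoid=['avoid_me', 'try_again'])
print(path)  # ['entry', 'check', 'good_job']
```

If a node that every path must pass through ends up in the avoid list (like 'check' here), the search returns None, which mirrors the failed first attempt described above.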

0x02 angr_find_condition

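In IDA View there seems to be an endless number of branches, yet oddly the decompiled output doesn't show them (puzzling).

Like this:

So just fill in the blanks following the hints in the scaffold script. Script: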

import angr
import sys
import time

def is_successful(state):
    # Everything the program has written to stdout in this state.
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b"Good Job." in stdout_output

def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b"Try again." in stdout_output

# time.clock() was removed in Python 3.8; time.perf_counter() is the replacement.
t = time.perf_counter()
path_to_binary = './02_angr_find_condition'
project = angr.Project(path_to_binary)
initial_state = project.factory.entry_state()
simulation = project.factory.simgr(initial_state)

simulation.explore(find=is_successful, avoid=should_abort)
if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()))
else:
    raise Exception('Could not find the solution')

print('time:', round(time.perf_counter() - t, 2), 's')

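A small pitfall:

For the boolean check I first wrote return 1 if stdout_output==b"Good Job." else 0, and then found that "Good Job." doesn't match stdout_output exactly (printing stdout_output shows something like b'Enter the password: Good Job.\n'), so rather than bother with an exact match I just used in.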
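The difference is easy to reproduce outside angr. The snippet below uses the sample output quoted above; the variable names exact_match and substring_match are just for illustration.

```python
# Sample stdout content, as quoted in the pitfall above.
stdout_output = b'Enter the password: Good Job.\n'

# Equality fails because the dump also holds the prompt and the trailing newline.
exact_match = stdout_output == b'Good Job.'

# A containment check only needs the marker to appear somewhere in the dump.
substring_match = b'Good Job.' in stdout_output

print(exact_match)      # False
print(substring_match)  # True
```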


0x03 angr_symbolic_registers

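The point of this challenge is that angr can inject data directly into registers (angr's power +1).

The decompiled view of get_user_input() isn't much help, so I went with the disassembly.

You can see that the three inputs are passed into eax, ebx, and edx respectively.

The ppt also has a tip for this challenge:

So the exp: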

import angr
import claripy
import sys
import time

def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Good Job.' in stdout_output

def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Try again.' in stdout_output

# time.clock() was removed in Python 3.8; time.perf_counter() is the replacement.
t = time.perf_counter()
path_to_binary = './03_angr_symbolic_registers'
project = angr.Project(path_to_binary)
start_address = 0x8048980
initial_state = project.factory.blank_state(addr=start_address)

# One 32-bit symbolic bitvector per input.
password0_size_in_bits = 8*0x4
password0 = claripy.BVS('password0', password0_size_in_bits)
password1_size_in_bits = 8*0x4
password1 = claripy.BVS('password1', password1_size_in_bits)
password2_size_in_bits = 8*0x4
password2 = claripy.BVS('password2', password2_size_in_bits)

# Inject the symbolic values into the registers that receive the inputs.
initial_state.regs.eax = password0
initial_state.regs.ebx = password1
initial_state.regs.edx = password2

simulation = project.factory.simgr(initial_state)
simulation.explore(find=is_successful, avoid=should_abort)

if simulation.found:
    solution_state = simulation.found[0]
    # state.se is the deprecated alias of state.solver.
    solution0 = format(solution_state.solver.eval(password0), 'x')
    solution1 = format(solution_state.solver.eval(password1), 'x')
    solution2 = format(solution_state.solver.eval(password2), 'x')
    solution = solution0 + " " + solution1 + " " + solution2
    print(solution)
else:
    raise Exception('Could not find the solution')

print('time:', round(time.perf_counter() - t, 2), 's')
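A note on the format(..., 'x') calls in the exp: they turn each solved integer into lowercase hex without a 0x prefix, and the three pieces are joined with spaces. Presumably that is the form the binary expects on input, though that is my assumption and not something shown here. A tiny sketch with made-up values (the numbers in values are hypothetical, not real solver output):

```python
# Hypothetical solver results, standing in for the three eval() values.
values = [0xdeadbeef, 0x1234, 0xff]

# format(v, 'x') gives lowercase hexadecimal digits with no '0x' prefix.
solution = ' '.join(format(v, 'x') for v in values)
print(solution)  # deadbeef 1234 ff
```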

0x04 angr_symbolic_stack

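By normal reversing logic this would be a super trivial challenge.

But since this is an angr challenge set, I dutifully went to read the comments in the scaffold exp and found that a large part of them explains the stack layout (in AT&T syntax too, orz; awkward for someone used to Intel syntax).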

# For this challenge, we want to begin after the call to scanf. Note that this
# is in the middle of a function.
#
# This challenge requires dealing with the stack, so you have to pay extra
# careful attention to where you start, otherwise you will enter a condition
# where the stack is set up incorrectly. In order to determine where after
# scanf to start, we need to look at the dissassembly of the call and the
# instruction immediately following it:
#   sub    $0x4,%esp
#   lea    -0x10(%ebp),%eax
#   push   %eax
#   lea    -0xc(%ebp),%eax
#   push   %eax
#   push   $0x80489c3
#   call   8048370 <__isoc99_scanf@plt>
#   add    $0x10,%esp
# Now, the question is: do we start on the instruction immediately following
# scanf (add $0x10,%esp), or the instruction following that (not shown)?
# Consider what the 'add $0x10,%esp' is doing. Hint: it has to do with the
# scanf parameters that are pushed to the stack before calling the function.
# Given that we are not calling scanf in our Angr simulation, where should we
# start?
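To work through the hint, one can simply track how far esp moves across the quoted instructions. The sketch below is my own arithmetic, not part of the scaffold: esp is modelled as an offset from its value before the sequence, and WORD (4 bytes per push on 32-bit x86) is a name introduced just for this example.

```python
WORD = 4  # bytes moved by one push on 32-bit x86

esp = 0          # offset relative to esp before the sequence
esp -= 0x4       # sub  $0x4,%esp
esp -= WORD      # push %eax        (address of -0x10(%ebp))
esp -= WORD      # push %eax        (address of -0xc(%ebp))
esp -= WORD      # push $0x80489c3  (third pushed argument)
# call pushes a return address and the callee's ret pops it again,
# so the call itself leaves esp unchanged on return.
after_call = esp
esp += 0x10      # add  $0x10,%esp
print(after_call, esp)  # -16 0
```

The add exactly undoes the 0x10 bytes reserved for the scanf call. Since the simulation never performs that setup, this suggests starting at the instruction after the add, so that esp is not shifted by cleanup for pushes that never happened.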

(……)