September 2014

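I wrote these notes more than two months ago and never got around to posting them. By now our friend in Japan has long since published the reverse-engineered code. When I wrote the notes, though, nothing was available except towelroot V1. Even so, I worked out the exploitation details quickly, mainly because I never intended to reverse towelroot in the first place: I located the exploitation method directly by tracing.

When reposting, please credit http://retme.net/index.php/2014/09/19/cve-2014-3153.html

Hats off to geohot. The notes from back then follow: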

I. Start with the patch

https://github.com/torvalds/linux/commit/e9c243a5a6de0be8e584c604d353412584b592f8

    if (requeue_pi) {
       /*
 +      * Requeue PI only works on two distinct uaddrs. This
 +      * check is only valid for private futexes. See below.
 +      */
 +     if (uaddr1 == uaddr2)
 +         return -EINVAL;
 +
 +     /*

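The patch requires the two futex addresses to be different. What happens if they are the same?

II. Relevant data structures

When a futex enters the kernel, a key is computed for it (get_futex_key) and it is inserted into the hash table futex_queues, which is laid out as follows: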

static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];

static struct futex_hash_bucket *hash_futex(union futex_key *key)
{
    u32 hash = jhash2((u32*)&key->both.word,
             (sizeof(key->both.word)+sizeof(key->both.ptr))/4,
             key->both.offset);
    return &futex_queues[hash & ((1 << FUTEX_HASHBITS)-1)];
}
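To make the bucket selection concrete, here is a minimal user-space sketch of the masking step at the end of hash_futex(). The names `FUTEX_HASHBITS_DEMO` and `bucket_index` are hypothetical, and the value 8 is only an assumption for illustration; the real hash value comes from jhash2(), which is not reproduced here.

```c
#include <stdint.h>

#define FUTEX_HASHBITS_DEMO 8  /* assumed value, for illustration only */

/* Map a 32-bit hash to a bucket index the way the last line of
 * hash_futex() does: keep only the low FUTEX_HASHBITS bits. */
uint32_t bucket_index(uint32_t hash)
{
    return hash & ((1u << FUTEX_HASHBITS_DEMO) - 1);
}
```

Because only the low bits survive, unrelated futex keys can land in the same bucket, which is why each queued entry also stores its own key.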

futex_hash_bucket is one bucket of the hash table. Its structure is:

struct futex_hash_bucket {
    spinlock_t lock;
    struct plist_head chain;
};

It holds a spinlock and a queue. chain is a priority-sorted list: the higher a waiting thread's priority, the closer it sits to the head of the list.
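The ordering rule can be sketched with a much simpler list than the kernel's plist. The type `demo_node` and the function `demo_plist_add` below are hypothetical stand-ins, not kernel code; the sketch assumes the kernel convention that a lower prio value means a higher priority, and that entries with equal priority keep arrival order.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a plist node: a singly linked
 * list kept sorted by ascending prio value. */
struct demo_node {
    int prio;                 /* lower value = higher priority */
    struct demo_node *next;
};

/* Insert node so the list stays sorted; equal priorities keep FIFO order.
 * Returns the (possibly new) head of the list. */
struct demo_node *demo_plist_add(struct demo_node *head, struct demo_node *node)
{
    struct demo_node **link = &head;

    /* Walk past every entry that must stay in front of the new node. */
    while (*link != NULL && (*link)->prio <= node->prio)
        link = &(*link)->next;
    node->next = *link;
    *link = node;
    return head;
}
```

The insertion has to read the neighbors' pointers to find its place, which is what makes a stale entry on such a list dangerous.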

The entries on the plist_head are futex_q structures. Each one is the kernel object representing a futex waiter:

/**
 * struct futex_q - The hashed futex queue entry, one per waiting task
 * @list:     priority-sorted list of tasks waiting on this futex
 * @task:     the task waiting on the futex
 * @lock_ptr:     the hash bucket lock
 * @key:      the key the futex is hashed on
 * @pi_state:     optional priority inheritance state
 * @rt_waiter:       rt_waiter storage for use with requeue_pi
 * @requeue_pi_key:  the requeue_pi target futex key
 * @bitset:       bitset for the optional bitmasked wakeup
 *
 * We use this hashed waitqueue, instead of a normal wait_queue_t, so
 * we can wake only the relevant ones (hashed queues may be shared).
 *
 * A futex_q has a woken state, just like tasks have TASK_RUNNING.
 * It is considered woken when plist_node_empty(&q->list) || q->lock_ptr == 0.
 * The order of wakeup is always to make the first condition true, then
 * the second.
 *
 * PI futexes are typically woken before they are removed from the hash list via
 * the rt_mutex code. See unqueue_me_pi().
 */
struct futex_q {
    struct plist_node list;

    struct task_struct *task;
    spinlock_t *lock_ptr;
    union futex_key key;
    struct futex_pi_state *pi_state;
    struct rt_mutex_waiter *rt_waiter;
    union futex_key *requeue_pi_key;
    u32 bitset;
};

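It contains several PI-related members. Their purpose is not obvious yet; walking through a few functions below makes it clearer.

For now it is enough to know that futexes come in PI and non-PI flavors. The futex_q of a PI futex uses a few extra members: futex_q->pi_state->pi_mutex is an rt_mutex, and rt_mutex_waiter is the structure that waits on it, usually allocated on the waiting thread's stack.

III. Function execution flow
1. futex_lock_pi

In effect this inserts a stack-allocated rt_mutex_waiter into the wait list of futex_q.pi_state->pi_mutex, which is an rt_mutex.

Call chain: futex_lock_pi -> rt_mutex_timed_lock -> rt_mutex_timed_fastlock -> rt_mutex_slowlock -> task_blocks_on_rt_mutex

debug_rt_mutex_init_waiter(&waiter); here waiter is a temporary structure that rt_mutex_slowlock allocates on its stack.

Next, futex_lock_pi -> rt_mutex_timed_lock -> rt_mutex_timed_fastlock -> rt_mutex_slowlock -> __rt_mutex_slowlock

waits indefinitely until it is woken: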

static int __sched
__rt_mutex_slowlock(struct rt_mutex *lock, int state,
           struct hrtimer_sleeper *timeout,
           struct rt_mutex_waiter *waiter)
{
    int ret = 0;

    for (;;) {
       /* Try to acquire the lock: */
       if (try_to_take_rt_mutex(lock, current, waiter))
           break;

       /*
        * TASK_INTERRUPTIBLE checks for signals and
        * timeout. Ignored otherwise.
        */
       if (unlikely(state == TASK_INTERRUPTIBLE)) {
           /* Signal pending? */
           if (signal_pending(current))
              ret = -EINTR;
           if (timeout && !timeout->task)
              ret = -ETIMEDOUT;
           if (ret)
              break;
       }

2. futex_wait_requeue_pi

    /*
     * The waiter is allocated on our stack, manipulated by the requeue
     * code while we sleep on uaddr.
     */
    debug_rt_mutex_init_waiter(&rt_waiter); // temporary rt_waiter on the stack, as in futex_lock_pi
    rt_waiter.task = NULL;

    ret = get_futex_key(uaddr2, flags & FLAGS_SHARED, &key2, VERIFY_WRITE);
    if (unlikely(ret != 0))
       goto out;

    q.bitset = bitset;
    q.rt_waiter = &rt_waiter;   //for use with requeue_pi
    q.requeue_pi_key = &key2;  //requeue pi target key

    if(is_my_process){
       printk("[%d] futex_wait_requeue_pi:Prepare to wait on uaddr.\n",
           task_pid_vnr(current_task));

       futex_dump_futex_q(&q);
    }
    /*
     * Prepare to wait on uaddr. On success, increments q.key (key1) ref
     * count.
     */ /* wait on uaddr (futex1) until woken */
    ret = futex_wait_setup(uaddr, val, flags, &q, &hb);
    if (ret)
       goto out_key2;

    if(is_my_process){
       printk("[%d] futex_wait_requeue_pi:before Queue the futex_q.\n",
           task_pid_vnr(current_task));

       futex_dump_futex_q(&q);
    }

    /* Queue the futex_q, drop the hb lock, wait for wakeup. */
    futex_wait_queue_me(hb, &q, to);   // queue q on futex1's bucket and sleep; q.rt_waiter is what a requeue later places on futex2

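3. futex_requeue_pi(futex1, futex2) wakes the waiter on futex1 and requeues it onto futex2

If the two addresses are equal, waking the waiter on futex1 wakes the thread sleeping in futex_wait_queue_me, yet the same waiter is also inserted into futex2.

Because the thread in futex_wait_requeue_pi has been woken and returns, the rt_mutex wait list of futex2 is left holding an rt_mutex_waiter that has already been released. This is a use-after-free on the kernel stack.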

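IV. How to exploit it?

The UAF lives on the kernel stack of the thread that called futex_wait_requeue_pi. That thread can then use sendmmsg to control the contents of its own kernel stack.

We choose to control the rt_mutex_waiter structure. It contains two list nodes, and after the UAF both of them are under our control.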

struct rt_mutex_waiter {
    struct plist_node list_entry;
    struct plist_node pi_list_entry;
    struct task_struct   *task;
    struct rt_mutex      *lock;
};

Calling futex_lock_pi then reaches task_blocks_on_rt_mutex and triggers a plist_add. That leaks kernel stack information and also gives us one chance to write to an arbitrary address.
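Why a list insertion yields both a leak and a write can be shown with a plain doubly linked list. This is a minimal user-space sketch, not the kernel's plist_add: `demo_list`, `demo_list_add`, and `demo_insert_before` are hypothetical names, and the four stores simply follow the classic kernel-style list insertion.

```c
#include <stddef.h>

struct demo_list {
    struct demo_list *next;
    struct demo_list *prev;
};

/* The four stores of a classic list insertion: link new_node
 * between prev and next. */
void demo_list_add(struct demo_list *new_node,
                   struct demo_list *prev, struct demo_list *next)
{
    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node;   /* store through a pointer read from the list */
}

/* Insert new_node in front of an existing node, trusting node->prev. */
void demo_insert_before(struct demo_list *new_node, struct demo_list *node)
{
    demo_list_add(new_node, node->prev, node);
}
```

If the existing node is stale and its prev field is attacker-chosen, the last store writes the new node's address to the chosen location, while the first store leaves that same address inside the stale node, where it can be read back.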

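We choose to overwrite thread_info->addr_limit, which sits on the kernel stack. An address on the stack gets written into addr_limit, and that gives us a way to write kernel memory from user mode.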
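The effect of a larger addr_limit can be modeled with a simplified range check. This is a sketch under stated assumptions: `demo_access_ok` is a hypothetical name, the real kernel check is architecture-specific, and the addresses used in the usage note are illustrative 32-bit values rather than values taken from a real device.

```c
#include <stdint.h>
#include <stdbool.h>

/* Simplified model of the user-pointer range check: the whole range
 * [addr, addr + size) must lie at or below addr_limit. */
bool demo_access_ok(uint32_t addr, uint32_t size, uint32_t addr_limit)
{
    uint64_t end = (uint64_t)addr + size;  /* 64-bit sum avoids wraparound */
    return end <= addr_limit;
}
```

With an assumed user limit of 0xbf000000, a pointer such as 0xc0008000 is rejected. Once addr_limit holds a kernel stack address, for example 0xd5a43e10, the same pointer passes the check.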

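This effectively recreates CVE-2013-6282: reading and writing arbitrary addresses.

Note: with this method the process must not exit. Releasing the exploited thread would crash the kernel.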

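The exploitation technique for this vulnerability (CVE-2014-5119) is a recent masterpiece from Project Zero [1]. Unfortunately it has some limitations, so I did not set up an environment to debug it and only studied the approach.

There may be mistakes and things I have not fully understood. This post is only a set of notes; I recommend reading the original article [1].

First, the source code [2]: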

      newp = (struct known_trans *) malloc (sizeof (struct known_trans)
                        + (__gconv_max_path_elem_len
                           + name_len + 3)
                        + name_len);
      if (newp != NULL)
    {
      char *cp;

      /* Clear the struct.  */
      memset (newp, '\0', sizeof (struct known_trans));

      /* Store a copy of the module name.  */
      newp->info.name = cp = (char *) (newp + 1);
      cp = __mempcpy (cp, trans->name, name_len);

      newp->fname = cp;

      /* Search in all the directories.  */
      for (runp = __gconv_path_elem; runp->name != NULL; ++runp)
        {
          cp = __mempcpy (__stpcpy ((char *) newp->fname, runp->name),
                  trans->name, name_len);
          if (need_so)
                //nul byte overflow
        memcpy (cp, ".so", sizeof (".so"));

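cp points into heap memory, so this copy can overwrite the tail of cp with the four bytes 0x6f732e00, that is, ".so" together with its terminating NUL.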
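The size accounting behind the overflow can be checked with a little arithmetic taken from the malloc() expression and the copy loop in the excerpt. This is a sketch under assumptions: `demo_fname_reserved` and `demo_fname_written` are hypothetical helpers, name_len is assumed to include the module name's NUL, and __gconv_max_path_elem_len is assumed to equal the length of the longest directory string.

```c
#include <stddef.h>
#include <string.h>

/* Bytes reserved for the file name, from the malloc() expression in the
 * excerpt (the struct and the separate name copy are left out). */
size_t demo_fname_reserved(size_t max_path_len, size_t name_len)
{
    return max_path_len + name_len + 3;
}

/* Bytes one loop iteration writes: the directory string (its NUL is
 * overwritten by the next copy), name_len bytes of module name, then
 * ".so" including its NUL, which is sizeof(".so") == 4 bytes. */
size_t demo_fname_written(const char *dir, size_t name_len)
{
    return strlen(dir) + name_len + sizeof(".so");
}
```

Under these assumptions the longest directory writes exactly one byte more than was reserved, and that byte is the trailing NUL of ".so".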

This corrupts memory. Proof:

$ CHARSET=//ABCDE pkexec 
*** Error in `pkexec': malloc(): memory corruption: 0x00007f15bc0732d0 ***
*** Error in `pkexec': malloc(): memory corruption: 0x00007f15bc0732d0 ***

Bypassing ASLR?

Reportedly, on 32-bit Fedora this is all it takes:

  rlim.rlim_cur = rlim.rlim_max = RLIM_INFINITY;
  setrlimit(RLIMIT_STACK, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 1;
  setrlimit(RLIMIT_DATA, &rlim);

With ASLR bypassed, the program always loads at fixed base addresses:

40000000-40005000 r-xp 00000000 fd:01 9909 /usr/bin/pkexec

406b9000-407bb000 rw-p 00000000 00:00 0 /* mmap() heap */

bfce5000-bfd06000 rw-p 00000000 00:00 0 [stack]

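What use is copying four fixed bytes past the end?

malloc lays chunks out linearly on the heap, roughly like |m| blah1 |m| blah2 |m| blah3

Copying those four bytes can overwrite the metadata of the following chunk. The metadata is the chunk length, and the lowest bit of its last byte is a flag (PREV_INUSE in glibc): 1 means the chunk in front of it is in use, 0 means that chunk has been freed and can be reclaimed. The last byte of 0x6f732e00 is necessarily a NUL byte, so it clears the flag and a chunk that is still in use gets treated as free.

So if we can overflow blah2 and overwrite the m in front of blah3, then wait for blah3 to be freed, the allocator will look for list pointers at m + &blah3 and unlink them. That gives us one write to an address.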
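The write obtained from unlinking can be illustrated with the textbook form of the operation. This is a minimal sketch, not glibc's code: `demo_chunk` and `demo_unlink` are hypothetical, and real glibc adds integrity checks around unlinking that this sketch deliberately leaves out.

```c
#include <stddef.h>

/* Hypothetical, simplified free-chunk header: only the two list pointers. */
struct demo_chunk {
    struct demo_chunk *fd;   /* next free chunk */
    struct demo_chunk *bk;   /* previous free chunk */
};

/* Textbook unlink without integrity checks: two stores through
 * pointers read from the chunk being removed. */
void demo_unlink(struct demo_chunk *p)
{
    struct demo_chunk *fd = p->fd;
    struct demo_chunk *bk = p->bk;

    fd->bk = bk;
    bk->fd = fd;
}
```

If the allocator is steered to a fake chunk whose fd and bk are attacker-supplied, each store lands at an attacker-chosen location.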

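How do we find a suitable blah3?

The chosen target is pkexec, a binary that runs with elevated privileges. When pkexec finds that the path passed to it does not exist, it prints an error message. The heap chunk holding that message is 508 bytes plus 4 bytes of metadata.

The error message is allocated like this: first request 100 bytes; if that is not enough, request 100*2+100 = 300 bytes; if that is still not enough, request 300*2+100 = 700 bytes.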
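The retry sequence can be written down directly. This is a small sketch with hypothetical helper names; it assumes a buffer counts as large enough when its size is at least the number of bytes needed, which the notes do not spell out.

```c
#include <stddef.h>

/* Next buffer size in the retry sequence: n -> 2n + 100. */
size_t demo_next_size(size_t current)
{
    return current * 2 + 100;
}

/* Smallest size in the sequence 100, 300, 700, ... that can hold
 * `needed` bytes. */
size_t demo_final_size(size_t needed)
{
    size_t size = 100;

    while (size < needed)
        size = demo_next_size(size);
    return size;
}
```

For a 508-byte message the sequence stops at 700, which matches the malloc(100), malloc(300), malloc(700) requests in the trace, with the buffer later shrunk to 508.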

In this case the allocation sequence is:

malloc(100), malloc(300), free(100), malloc(700), free(300), realloc(508)

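The memory layout is as follows:

Now set CHARSET=//AAAAA… to 236 bytes of A, and the value derived from it lands exactly in the free space left by the 300-byte buffer: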

| blah |m| blah |m| charset derived value: 236 bytes |m: 0x00000201| error message: 508 bytes |

m = 0x201 describes a 512-byte buffer with the in-use flag bit set. This is the value that gets rewritten later in the exploit.
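How one NUL byte changes the meaning of m = 0x201 can be sketched as follows. The helper names are hypothetical, and the sketch assumes a little-endian 32-bit size word whose low bit is the flag.

```c
#include <stdint.h>

#define DEMO_PREV_INUSE 0x1u   /* low bit of the size word */

/* Effect of a single NUL byte landing on the least significant byte of a
 * little-endian size word. */
uint32_t demo_poison_low_byte(uint32_t size_word)
{
    return size_word & ~0xffu;
}

/* Nonzero when the flag bit is set. */
int demo_flag_set(uint32_t size_word)
{
    return (size_word & DEMO_PREV_INUSE) != 0;
}
```

Applied to 0x201 this yields 0x200: the length is still 512 bytes, but the flag bit is now clear.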

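How to exploit it from here? Heap spray

After the overwrite, the address computed from m becomes 0x406xxxxx + 0x6f732e00. After the addition this value is already in kernel space, so it cannot be used.

If a heap spray can push the heap up to 0x7xxxxxxx, then the result after adding 0x6f732e00 is an address of the form 0x5xxxxxxx. The contents at that address come from the spray and are therefore controllable.

pkexec happens to have an argument whose memory is never freed, which can be used for the heap spray. The -u option can also be passed multiple times; in fact they passed more than 15 million --user options...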

     else if (strcmp (argv[n], "--user") == 0 || strcmp (argv[n], "-u") == 0)
        {
          n++;
          if (n >= (guint) argc)
            {
              usage (argc, argv);
              goto out;
            }

          opt_user = g_strdup (argv[n]);
        }
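Why repeated options work as a spray can be estimated with simple bookkeeping: every repeat overwrites opt_user with a fresh copy and never releases the old one. The function below is a hypothetical model of that bookkeeping in plain C; it does not allocate anything and ignores allocator overhead per chunk.

```c
#include <stddef.h>
#include <string.h>

/* Count bytes left unreachable if every "--user"/"-u" value is duplicated
 * into the same pointer and the previous copy is never freed.
 * Only the last copy stays reachable. */
size_t demo_leaked_bytes(int argc, char **argv)
{
    size_t leaked = 0;
    size_t last = 0;   /* size of the copy currently held */

    for (int n = 1; n < argc; n++) {
        if (strcmp(argv[n], "--user") == 0 || strcmp(argv[n], "-u") == 0) {
            if (++n >= argc)
                break;            /* missing value: the real code prints usage */
            leaked += last;       /* previous copy is overwritten, never freed */
            last = strlen(argv[n]) + 1;
        }
    }
    return leaked;
}
```

Since the attacker chooses both the number of repeats and the length of each value, the amount and content of the leaked heap memory are under their control.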

Finally:

Use the unlink operation to write one address: write a value into tls_dtor_list, __exit_funcs to take control of the code execution flow.

[1] http://googleprojectzero.blogspot.tw/2014/08/the-poisoned-nul-byte-2014-edition.html

[2] https://github.com/lattera/glibc/blob/a2f34833b1042d5d8eeb263b4cf4caaea138c4ad/iconv/gconv_trans.c

[3] https://code.google.com/p/google-security-research/issues/detail?id=96

Retme's Future Gadget Lab

CVE-2014-3153: Analysis and Exploitation

Author: retme  Published: September 19, 2014  Category: AndroidSec

Project Zero's Exploitation of CVE-2014-5119

Author: retme  Published: September 9, 2014  Category: Notes
