Part 4 — Memory map leads us to our destination
Andrey Zagrebin, Moshe Kol, Shlomi Oberman
This post is the fourth and final part of a four-part blog series documenting the different structures and stages of the firmware update.
- Part 1 – Just Print Me
- Part 2 – S-Records parsing S-Records
- Part 3 – From NAND to RAM through sliding windows
- Part 4 – Memory map leads us to our destination
In the previous post we detailed the flash layout and the sliding window compression used to store memory sections on-disk.
We now have a raw flash image on our hands.
The application loader
At this point, we have preloaded the last few code sections, decompressing some of them, followed by general decompressing. We’re done and ready to look at the application code, right?
Don’t get excited yet. The path to enlightenment is almost as long as the path to HP firmware unpacking.
Following the newly loaded code, we reach yet another indirect call. From the debug strings around the call opcode, it looks like this is the entry point of the printer application code:
This entry point ultimately comes from decoding a structure already loaded into memory:
For reasons apparent later, we refer to this structure as the application header or apphdr, while referring to the code using it as the application loader (or app loader).
The address 0x4fffc004 is the start of this structure, and at 0x4fffc038 we find the entry point, 0x4145a9b4+1. This address is once again in a not-yet-initialized part of RAM. Reverse-engineering the function that parses the application header, we learn valuable implementation details, presented in the following paragraphs.
One of the first operations in the app loader is displaying the bootsplash bitmap picture. This picture is identical to the one found in the Flash image before loading the firmware to RAM.
Next, the application loader again performs memset, memcpy, and decompression operations on chunks of memory. Curiously, both the pre-loader and application loader have their own copies of these functions rather than sharing one set. This duplication suggests a possible organizational barrier between the pre-loader and app loader software development teams.
This time though, instead of using hardcoded arguments, the app loader invokes these functions in a loop, reading sets of parameters from memory pointed to indirectly by the app header.
Here’s an example of decompilation of that part of the app loader that invokes memcpy in bulk:
Notice the verify_address function. This function checks whether the address range about to be written overlaps any of the so-called “protected ranges”. A protected range is a range of memory addresses that will not be overwritten, even if a section is marked for loading at an address that overlaps with that range. If there is an overlap, the loader does not invoke the relevant memcpy, memset or uncompress for that section. To check whether an address range is protected, the loader compares the range against 0x1a pairs of starting and ending addresses of protected memory ranges. The array of pairs of addresses is also pointed to by the apphdr. We’ll discuss why these ranges are so special when we discuss the different memory sections.
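The protected-range test described above can be sketched in a few lines of Python. This is only an illustration, not the loader's actual code: the function name, the sample ranges, and the choice to treat each end address as exclusive are all assumptions.

```python
from typing import Iterable, Tuple


def overlaps_protected(start: int, size: int,
                       protected: Iterable[Tuple[int, int]]) -> bool:
    """Return True if [start, start + size) intersects any protected range.

    Assumption: each pair is (start_address, end_address) with an exclusive
    end; the real loader may treat the end address as inclusive.
    """
    end = start + size
    return any(start < p_end and p_start < end for p_start, p_end in protected)


# Hypothetical ranges, for illustration only.
protected_ranges = [(0x1000, 0x2000), (0x8000, 0x9000)]

# A block inside the first range is skipped by the loader.
print(overlaps_protected(0x1800, 0x100, protected_ranges))
# A block that starts right after the first range is allowed.
print(overlaps_protected(0x2000, 0x100, protected_ranges))
```

A loader script that mimics the app loader would call such a check before every memset, memcpy or uncompress and skip the operation when it returns True.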
The memcpy parameters are stored as an array of triplets of the form:
Offset | Length | Type | Description |
---|---|---|---|
0 | 4 | void* | dest – the start address of the block to initialize |
4 | 4 | void* | src – the start address of the block to read from |
8 | 4 | size_t | num – number of bytes to copy (size of the block) |
And similarly for memset:
Offset | Length | Type | Description |
---|---|---|---|
0 | 4 | void* | addr – the start address of the block to initialize |
4 | 4 | int | value – the byte value to set |
8 | 4 | size_t | num – number of bytes to set (size of block to initialize) |
Note: Although the second argument represents a byte value, memset expects an int, which is cast internally to a byte, consistent with the libc version of memset.
And uncompress:
Offset | Length | Type | Description |
---|---|---|---|
0 | 4 | void* | dest – the start address of the block to initialize |
4 | 4 | void* | src – the start address of the block of compressed data |
8 | 4 | size_t | compressed_size – size of the compressed block |
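Since all three parameter lists share the same 12-byte record shape (three 32-bit words), a single helper can decode any of them. The following is a minimal sketch; the helper name and the sample records are made up, and little-endian byte order is an assumption that should be checked against the actual image.

```python
import struct
from typing import Iterator, Tuple


def iter_triples(blob: bytes, byte_order: str = "<") -> Iterator[Tuple[int, int, int]]:
    """Yield one (word0, word1, word2) tuple per 12-byte parameter record.

    byte_order is "<" for little-endian (assumed here) or ">" for big-endian.
    """
    yield from struct.iter_unpack(byte_order + "III", blob)


# Two made-up memset records: (addr, value, num).
raw = (struct.pack("<III", 0x40000000, 0x00, 0x100)
       + struct.pack("<III", 0x40001000, 0xFF, 0x20))

for addr, value, num in iter_triples(raw):
    print(f"memset(0x{addr:08x}, 0x{value:02x}, 0x{num:x})")
```

The same generator works for the memcpy and uncompress lists; only the meaning of the three words changes.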
There is quite a lot more code in the application loader, but we only need to focus on the code that loads these sections into memory. That is enough for our goal, which is to reverse engineer the firmware and find security vulnerabilities.
In the end, execution is passed to the app_entry function, pointed to by the apphdr.
Application header structure
All the parameters related to the application loader reside in the application header and the memory it points to. Let’s go through the important members of the application header structure:
(“Offset” means the decimal offset from structure start. We omit irrelevant and unknown fields.)
Offset | Value | Name | Description |
---|---|---|---|
0 | 0x3ca55a3c | magic | Checked before the structure is used |
4 | 0x6c | size | Total size of the struct in bytes |
8 | 0x0461090d | more_magic_1 | See notes |
12 | 0xfb9ef6f2 | more_magic_2 | See notes |
20 | 0x4e0b0000 | bootsplash_bmp | Pointer to the bootsplash bitmap image (BMP file format). This appears to be the same picture as the one found on the flash image before the code that is loaded to RAM |
52 | 0x4145a9b5 | entry_point | Pointer to the application entry point |
56 | 0x4fffc000 | protected_count | Pointer to a 32-bit integer counting the number of protected memory ranges |
60 | 0x4fffc070 | protected_addresses | Pointer to pairs of (start, end) protected memory ranges |
64 | 0x4e10fcc0 | section_linked_list | Pointer to a linked list of memory section descriptors |
72 | 0x4e10fa68 | memset_list_start | Start of the list of memset parameter triples |
76 | 0x4e10fad4 | memset_list_end | End of the list of memset parameter triples |
80 | 0x4e10fad4 | copy_list_start | Start of the list of memcpy parameter triples |
84 | 0x4e10fbdc | copy_list_end | End of the list of memcpy parameter triples |
88 | 0x4e10fbb8 | copy_list_barrier | See notes |
92 | 0x4e10fbdc | uncompress_list_start | Start of the list of uncompress parameter triples |
96 | 0x4e10fcc0 | uncompress_list_end | End of the list of uncompress parameter triples |
100 | 0x4e10fca8 | uncompress_list_barrier | See notes |
Notes:
- All fields are 32 bits (4 bytes) long.
- The purpose of the two more_magic fields is not clear; we conjecture they might be a version id or some kind of bitmask. Interestingly, their two values are bitwise complements of one another. Both values, except for the most significant nibble, are checked before reading from the apphdr. Technically, each value is masked with 0x0fffffff and tested against 0x0461090d and 0x0b9ef6f2.
- The copy_list_barrier field points to the middle of the memcpy parameter list, and is not used in this implementation of the loader. It may indicate that the values before this point have a different purpose than those following. uncompress_list_barrier points to the middle of the uncompress parameter list in much the same way.
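To make the table and notes concrete, here is a rough Python sketch that decodes the listed fields, applies the magic checks described above, and derives how many 12-byte triples each list holds. The function names are hypothetical, little-endian byte order is an assumption, and the sample header is rebuilt from the table values with every unlisted field left as zero.

```python
import struct

# Decimal offsets of the 32-bit fields listed in the table above.
FIELD_OFFSETS = {
    "magic": 0, "size": 4, "more_magic_1": 8, "more_magic_2": 12,
    "entry_point": 52, "protected_count": 56, "protected_addresses": 60,
    "section_linked_list": 64,
    "memset_list_start": 72, "memset_list_end": 76,
    "copy_list_start": 80, "copy_list_end": 84,
    "uncompress_list_start": 92, "uncompress_list_end": 96,
}
MAGIC = 0x3CA55A3C
MORE_MAGIC_MASK = 0x0FFFFFFF


def parse_apphdr(hdr: bytes, byte_order: str = "<") -> dict:
    """Decode the known apphdr fields and apply the magic checks."""
    fields = {
        name: struct.unpack_from(byte_order + "I", hdr, offset)[0]
        for name, offset in FIELD_OFFSETS.items()
    }
    if fields["magic"] != MAGIC:
        raise ValueError("apphdr magic mismatch")
    if (fields["more_magic_1"] & MORE_MAGIC_MASK) != 0x0461090D:
        raise ValueError("more_magic_1 mismatch")
    if (fields["more_magic_2"] & MORE_MAGIC_MASK) != 0x0B9EF6F2:
        raise ValueError("more_magic_2 mismatch")
    return fields


def triple_count(start: int, end: int) -> int:
    """Number of 12-byte parameter triples between two list pointers."""
    return (end - start) // 12


# Rebuild a header from the table values (unlisted fields stay zero).
values = {
    "magic": 0x3CA55A3C, "size": 0x6C,
    "more_magic_1": 0x0461090D, "more_magic_2": 0xFB9EF6F2,
    "entry_point": 0x4145A9B5,
    "memset_list_start": 0x4E10FA68, "memset_list_end": 0x4E10FAD4,
    "copy_list_start": 0x4E10FAD4, "copy_list_end": 0x4E10FBDC,
    "uncompress_list_start": 0x4E10FBDC, "uncompress_list_end": 0x4E10FCC0,
}
raw = bytearray(0x6C)
for name, value in values.items():
    struct.pack_into("<I", raw, FIELD_OFFSETS[name], value)

hdr = parse_apphdr(bytes(raw))
print(triple_count(hdr["memset_list_start"], hdr["memset_list_end"]))          # 9
print(triple_count(hdr["copy_list_start"], hdr["copy_list_end"]))              # 22
print(triple_count(hdr["uncompress_list_start"], hdr["uncompress_list_end"]))  # 19
print(hex(~0x0461090D & 0xFFFFFFFF))  # 0xfb9ef6f2, the complement of more_magic_1
```

With the values from the table, the three lists are laid out back to back: the end pointer of one list equals the start pointer of the next.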
Memory sections and their descriptors
As briefly mentioned above, the apphdr has a field (section_linked_list) pointing to a linked list of memory section descriptors. The app loader code does not seem to use it. However, it contains information about the structure of the printer’s memory, including section names, which may aid us in loading and reverse-engineering the firmware.
The field section_linked_list points to the first element of this list and each element consists of the following members:
Offset | Type | Name |
---|---|---|
0 | memory_section* | next |
4 | char* | section_name |
8 | void* | start_addr |
12 | size_t | size |
16 | uint? | unknown |
20 | memory_section* | dest_section |
All members are 32-bit (4 bytes) long.
Following is a description of the element members:
- next: Pointer to the next element of the linked list.
- section_name: Pointer to a null-terminated string containing the section name.
- start_addr: The starting address of the section.
- size: The size of the section in bytes.
- unknown: The purpose of this field was not researched. It could contain information about the section type or various flags (e.g., rwx (“read-write-execute”) permissions). Values observed were: 1, 2, 4, 0xa, 0xc.
- dest_section: If this section is used to initialize another section (e.g. it is the source of a memcpy or uncompress operation), this field holds a pointer to the destination section descriptor. Otherwise, it is NULL. This field points to the descriptor (i.e., linked-list element) and not to the start of the section in memory.
Example of two entries:
[0x4e110504] .cromtext:
next: 0x4e110528 (.crommodule section descriptor)
section_name: 0x4e11051c (".cromtext")
start_addr: 0x4f522fd0
size: 0xa25a6c
unknown: 0x1
dest_section: 0x4e110b78 (.text section descriptor)
[0x4e110b78] .text:
next: 0x4e110b98 (.module section descriptor)
section_name: 0x4e110b90 (".text")
start_addr: 0x4036800c
size: 0x116655c
unknown: 0x1
dest_section: 0x0
In this example, .cromtext has a non-zero dest_section (0x4e110b78), which is the address of the .text section descriptor. As expected, the app loader decompresses the .cromtext section and loads it into the .text section (start address 0x4036800c).
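Walking the descriptor list is straightforward once the 24-byte element layout is known. The sketch below is illustrative only: the helper names and the tiny synthetic memory image are made up, little-endian byte order is assumed, and so is the idea that the list ends with a NULL next pointer.

```python
import struct
from typing import Dict, Iterator


def read_cstr(mem: bytes, base: int, addr: int) -> str:
    """Read a null-terminated ASCII string located at virtual address addr."""
    off = addr - base
    return mem[off:mem.index(b"\x00", off)].decode("ascii", errors="replace")


def walk_sections(mem: bytes, base: int, head: int,
                  byte_order: str = "<") -> Iterator[Dict[str, object]]:
    """Yield one dict per section descriptor, following the next pointers.

    Assumption: the list is terminated by a NULL next pointer. The seen set
    only guards against looping forever on corrupt data.
    """
    addr, seen = head, set()
    while addr != 0 and addr not in seen:
        seen.add(addr)
        nxt, name_ptr, start, size, unknown, dest = struct.unpack_from(
            byte_order + "6I", mem, addr - base)
        yield {"descriptor": addr, "name": read_cstr(mem, base, name_ptr),
               "start_addr": start, "size": size,
               "unknown": unknown, "dest_section": dest}
        addr = nxt


# Synthetic two-element list at a made-up base address; the section values
# mirror the .cromtext/.text example, the descriptor addresses do not.
BASE = 0x1000
mem = (struct.pack("<6I", 0x1024, 0x1018, 0x4F522FD0, 0xA25A6C, 1, 0x1024)
       + b".cromtext\x00\x00\x00"
       + struct.pack("<6I", 0, 0x103C, 0x4036800C, 0x116655C, 1, 0)
       + b".text\x00")

for sec in walk_sections(mem, BASE, BASE):
    print(f'{sec["name"]:<10} start=0x{sec["start_addr"]:08x} '
          f'size=0x{sec["size"]:x} dest=0x{sec["dest_section"]:x}')
```

Because dest_section holds a descriptor address, resolving the destination of a copy or decompression takes a second lookup into the same list.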
Some examples of the contents of memory sections include:
- The .load_apphdr section, which is constructed as follows:
Size | Description |
---|---|
4 bytes | Protected memory entries count (0x1a) |
0x6c bytes | The apphdr struct itself |
0xd0 bytes | Protected memory entries (pairs of 32-bit addresses; 0x1a*8=0xd0 bytes) |
- The .secinfo section contains the parameter triples for the memset, memcpy, and uncompress functions, elements of the section descriptor linked list, and the section names as null-terminated strings.
Insights
Now that we can associate memory address ranges with sections, we can reach some interesting conclusions:
- The memory sections do not overlap.
- The protected areas of memory include the following named sections:
.load_text
.load_rodata
.boot_ncdram_hole (empty section)
.load_ncdata (empty section)
.load_data
.load_ncbss
.load_cgdbuf
.load_bss
.nosi_text
.nosi_rodata (empty section)
.nosd_data (empty section)
.nosd_bss
.startup_text
.startup_rodata
.startup_data (empty section)
.startup_bss
.stack
.erom_support_2
.secinfo
These are mostly sections that are critical for running the app loader, and include sections that were initialized by the pre-loader.
- The memcpy/memset/uncompress parameters correspond to entire memory sections and do not overlap.
- All the sections that need initialization have corresponding parameters in one of the memcpy/memset/uncompress lists, even those that have been initialized by the pre-loader and are part of the protected ranges.
The last conclusion is encouraging. If we know the address of the apphdr structure, we can have a loader script parse it and initialize the uninitialized memory automatically. The initialization includes those sections with hardcoded addresses in the pre-loader code. The magic 4-byte number at the start of the apphdr structure is unique, and can be used to find the structure.
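As a starting point for such a loader script, the magic value can be located with a simple byte search. This is a minimal sketch, not a complete loader: the function name is hypothetical, and both the little-endian byte order and the 4-byte alignment filter are assumptions.

```python
import struct
from typing import List

APPHDR_MAGIC = 0x3CA55A3C


def find_apphdr_offsets(image: bytes, byte_order: str = "<") -> List[int]:
    """Return every 4-byte-aligned offset where the apphdr magic occurs.

    Assumptions: the image is little-endian unless byte_order says otherwise,
    and the structure is aligned to 4 bytes within the image.
    """
    needle = struct.pack(byte_order + "I", APPHDR_MAGIC)
    offsets = []
    pos = image.find(needle)
    while pos != -1:
        if pos % 4 == 0:
            offsets.append(pos)
        pos = image.find(needle, pos + 1)
    return offsets


# Made-up image: 16 bytes of padding, then the magic and the size field.
image = bytes(16) + struct.pack("<II", APPHDR_MAGIC, 0x6C) + bytes(8)
print(find_apphdr_offsets(image))  # [16]
```

Each candidate offset should still be validated, for example by checking the size and more_magic fields, before the lists it points to are replayed.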
So, are we done yet? Every path has an end, and we have finally reached ours. Once here, we discover that “Accomplishments will prove to be a journey, not a destination” (Dwight D. Eisenhower).
Let’s take a moment to remember all the stages of the firmware we had to unpack and decode to reach this point:
-
In post no. 1, we unpacked the PCL format, including a proprietary extension and extracted data encoded as a raster graphics image.
-
In post no. 2, we encountered S-records, and all they wanted to do was parse some S-records. We dealt with a proprietary S-record binary variation along the way.
-
In post no. 3, we started looking at the code and discovered that it is self-modifying and staged, with each part loading the next into memory. We also got a crash course in sliding-window compression.
-
In post no. 4, we uncovered the app header structure, saw how the different sections of code and data are loaded into memory, and started to see the light at the end of the tunnel. We also finished this blog post series.
What should we do next week?
Be on the lookout for our upcoming announcement on June 16th, when we will publish the security findings that motivated all of this initial firmware-unpacking research.
Thank you
Moshe Rubin and Daniel Goldberg for proofreading