Unpacking HP Firmware Updates

Part 4 — Memory map leads us to our destination

Andrey Zagrebin, Moshe Kol, Shlomi Oberman

This post is the fourth and final part of a four-part blog series documenting the different structures and stages of the firmware update.

Part 1 – Just Print Me
Part 2 – S-Records parsing S-Records
Part 3 – From NAND to RAM through sliding windows
Part 4 – Memory map leads us to our destination

In the previous post we detailed the flash layout and the sliding window compression used to store memory sections on-disk.
We now have a raw flash image on our hands.

The application loader

At this point, we have preloaded the last few code sections, decompressing some of them, followed by general decompressing. We’re done and ready to look at the application code, right?

Don’t get excited yet. The path to enlightenment is almost as long as the path to HP firmware unpacking.

Following the newly loaded code, we reach yet another indirect call. From the debug strings around the call opcode, it looks like this is the entry point of the printer application code:

This entry point ultimately comes from decoding a structure already loaded into memory:

For reasons apparent later, we refer to this structure as the application header or apphdr, while referring to the code using it as the application loader (or app loader).

The address 0x4fffc004 is the start of this structure, and at 0x4fffc038 we find the entry point, 0x4145a9b4+1. This address is once again in a not-yet-initialized part of RAM. Reverse-engineering the function that parses the application header, we learn valuable implementation details, presented in the following paragraphs.

One of the first operations in the app loader is displaying the bootsplash bitmap picture. This picture is identical to the one found in the Flash image before loading the firmware to RAM.

Next, the application loader again performs memset, memcpy, and decompression operations on chunks of memory. Curiously, both the pre-loader and application loader have their own copies of these functions rather than sharing one set. This duplication suggests a possible organizational barrier between the pre-loader and app loader software development teams.

This time though, instead of using hardcoded arguments, the app loader invokes these functions in a loop, reading sets of parameters from memory pointed to indirectly by the app header.

Here’s an example of decompilation of that part of the app loader that invokes memcpy in bulk:

Notice the verify_address function. This function checks whether the address range about to be written overlaps any of the so-called “protected ranges”. A protected range is a range of memory addresses that will not be overwritten, even if a section is marked for loading at an address that overlaps with that range. If there is an overlap, the loader does not invoke the relevant memcpy, memset or uncompress for that section. To check whether an address range is protected, the loader compares the range against 0x1a pairs of starting and ending addresses of protected memory ranges. The array of pairs of addresses is also pointed to by the apphdr. We’ll discuss why these ranges are so special when we discuss the different memory sections.
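The protected-range check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the loader's actual code: the names `overlaps`, `is_protected` and `run_copy_list` are ours, and we assume each protected pair is a half-open range (start inclusive, end exclusive), which the firmware may or may not do.

```python
def overlaps(start, size, lo, hi):
    """True if [start, start+size) intersects the range [lo, hi).

    Assumption: the end address of a protected pair is exclusive.
    """
    return size > 0 and start < hi and lo < start + size


def is_protected(start, size, protected_ranges):
    """True if the block touches any of the (start, end) protected pairs."""
    return any(overlaps(start, size, lo, hi) for lo, hi in protected_ranges)


def run_copy_list(triples, protected_ranges, do_copy):
    """Invoke do_copy(dest, src, num) for each triple, skipping protected ones.

    Returns the list of triples that were skipped.
    """
    skipped = []
    for dest, src, num in triples:
        if is_protected(dest, num, protected_ranges):
            skipped.append((dest, src, num))
            continue
        do_copy(dest, src, num)
    return skipped
```

The same loop shape would apply to the memset and uncompress lists, with a different callback in place of `do_copy`.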

The memcpy parameters are stored as an array of triplets of the form:

Offset	Length	Type	Description
0	4	void*	dest – the start address of the block to initialize
4	4	void*	src – the start address of the block to read from
8	4	size_t	num – number of bytes to copy (size of the block)

And similarly for memset:

Offset	Length	Type	Description
0	4	void*	addr – the start address of the block to initialize
4	4	int	value – the byte value to set
8	4	size_t	num – number of bytes to set (size of block to initialize)

Note: Although the second argument represents a byte value, memset expects an int, which is recast internally to a byte, consistent with the libc version of memset.

And uncompress:

Offset	Length	Type	Description
0	4	void*	dest – the start address of the block to initialize
4	4	void*	src – the start address of the block of compressed data
8	4	size_t	compressed_size – size of the compressed block

There is quite a lot more code in the application loader, but we need only focus on the code that loads sections into memory, since that is what we need for our goal: reverse engineering the firmware and finding security vulnerabilities.

In the end, execution is passed to the app_entry function, pointed to by the apphdr.

Application header structure

All the parameters related to the application loader reside in the application header and the memory it points to. Let’s go through the important members of the application header structure:

(“Offset” means the decimal offset from structure start. We omit irrelevant and unknown fields.)

Offset	Value	Name	Description
0	0x3ca55a3c	magic	Checked before the structure is used
4	0x6c	size	Total size of the struct in bytes
8	0x0461090d	more_magic_1	See notes
12	0xfb9ef6f2	more_magic_2	See notes
20	0x4e0b0000	bootsplash_bmp	Pointer to the bootsplash bitmap image (BMP file format). This appears to be the same picture as the one found on the flash image before the code that is loaded to RAM
52	0x4145a9b5	entry_point	Pointer to the application entry point
56	0x4fffc000	protected_count	Pointer to a 32-bit integer counting the number of protected memory ranges
60	0x4fffc070	protected_addresses	Pointer to pairs of (start, end) protected memory ranges
64	0x4e10fcc0	section_linked_list	Pointer to a linked list of memory section descriptors
72	0x4e10fa68	memset_list_start	Start of the list of `memset` parameter triples
76	0x4e10fad4	memset_list_end	End of the list of `memset` parameter triples
80	0x4e10fad4	copy_list_start	Start of the list of `memcpy` parameter triples
84	0x4e10fbdc	copy_list_end	End of the list of `memcpy` parameter triples
88	0x4e10fbb8	copy_list_barrier	See notes
92	0x4e10fbdc	uncompress_list_start	Start of the list of `uncompress` parameter triples
96	0x4e10fcc0	uncompress_list_end	End of the list of `uncompress` parameter triples
100	0x4e10fca8	uncompress_list_barrier	See notes

Notes:

All fields are 32 bits (4 bytes) long
The purpose of the two more_magic fields is not clear; we conjecture they might be a version id or some kind of bitmask. Interestingly, their two values are bitwise complements of one another. Both values, except for the most significant nibble, are checked before reading from the apphdr. Technically, each value is masked with 0x0fffffff and tested against 0x0461090d and 0x0b9ef6f2.
The copy_list_barrier field points to the middle of the memcpy parameter list, and is not used in this implementation of the loader. It may indicate that the values before this point have a different purpose than those following. uncompress_list_barrier points to the middle of the uncompress parameter list in much the same way.
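To make the table concrete, here is a minimal Python sketch that decodes the known fields and applies the magic checks described in the notes. The function name `parse_apphdr` and the `endian` parameter are our own, and little-endian byte order is an assumption rather than something established in this post.

```python
import struct

APPHDR_MAGIC = 0x3CA55A3C

# Decimal offset -> field name, as listed in the table above.
FIELDS = {
    0: "magic", 4: "size", 8: "more_magic_1", 12: "more_magic_2",
    20: "bootsplash_bmp", 52: "entry_point",
    56: "protected_count", 60: "protected_addresses",
    64: "section_linked_list",
    72: "memset_list_start", 76: "memset_list_end",
    80: "copy_list_start", 84: "copy_list_end", 88: "copy_list_barrier",
    92: "uncompress_list_start", 96: "uncompress_list_end",
    100: "uncompress_list_barrier",
}


def parse_apphdr(data, offset=0, endian="<"):
    """Decode the known apphdr fields from `data` starting at `offset`.

    Assumption: `endian` defaults to little-endian ("<").
    """
    hdr = {
        name: struct.unpack_from(endian + "I", data, offset + off)[0]
        for off, name in FIELDS.items()
    }
    if hdr["magic"] != APPHDR_MAGIC:
        raise ValueError("bad apphdr magic")
    # The most significant nibble of each more_magic value is ignored.
    if (hdr["more_magic_1"] & 0x0FFFFFFF) != 0x0461090D:
        raise ValueError("bad more_magic_1")
    if (hdr["more_magic_2"] & 0x0FFFFFFF) != 0x0B9EF6F2:
        raise ValueError("bad more_magic_2")
    return hdr
```

Since each list entry is 12 bytes, the start and end pointers in the table imply 9 memset triples (0x6c bytes), 22 memcpy triples (0x108 bytes) and 19 uncompress triples (0xe4 bytes). By the same arithmetic, copy_list_barrier falls after the 19th memcpy triple and uncompress_list_barrier after the 17th uncompress triple.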

Memory sections and their descriptors

As briefly mentioned above, the apphdr has a field (section_linked_list) pointing to a linked list of memory section descriptors. The app loader code does not seem to use it. However, it contains information about the structure of the printer’s memory, including section names, which may aid us in loading and reverse-engineering the firmware.

The field section_linked_list points to the first element of this list and each element consists of the following members:

Offset	Type	Name
0	memory_section*	next
4	char*	section_name
8	void*	start_addr
12	size_t	size
16	uint?	unknown
20	memory_section *	dest_section

All members are 32-bit (4 bytes) long.

Following is a description of the element members:

next: Pointer to the next element of the linked list.
section_name: Pointer to a null-terminated string containing the section name
start_addr: The starting address of the section
size: The size of the section in bytes
unknown: The purpose of this field was not researched. It could contain information about the section type or various flags (e.g., rwx (“read-write-execute”) permissions). Values observed were: 1, 2, 4, 0xa, 0xc
dest_section: If this section is used to initialize another section (e.g. it is the source of a memcpy or uncompress operation), this field holds a pointer to the destination section descriptor. Otherwise, it is NULL.

This field points to the descriptor (i.e., linked-list element) and not to the start of the section in memory.

Example of two entries:

[0x4e110504] .cromtext:
    next: 0x4e110528 (.crommodule section descriptor)
    section_name: 0x4e11051c (".cromtext")
    start_addr: 0x4f522fd0
    size: 0xa25a6c
    unknown: 0x1
    dest_section: 0x4e110b78 (.text section descriptor)

[0x4e110b78] .text:
    next: 0x4e110b98 (.module section descriptor)
    section_name: 0x4e110b90 (".text")
    start_addr: 0x4036800c
    size: 0x116655c
    unknown: 0x1
    dest_section: 0x0

In this example, .cromtext has a non-zero dest_section (0x4e110b78), which is the address of the .text section descriptor. As expected, the app loader decompresses the .cromtext section and loads it into the .text section (which starts at 0x4036800c).
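Walking the descriptor list is straightforward once the six-field layout is known. The sketch below is illustrative only: it assumes little-endian fields, a dump that covers both the descriptors and their name strings, and a list that ends with a NULL next pointer (we did not verify how the real list terminates). `walk_sections` and `read_cstring` are hypothetical helper names.

```python
import struct


def read_cstring(image, base, addr):
    """Read a null-terminated ASCII string located at address `addr`."""
    start = addr - base
    end = image.index(b"\0", start)
    return image[start:end].decode("ascii", errors="replace")


def walk_sections(image, base, head, endian="<"):
    """Yield one dict per section descriptor, following `next` pointers.

    Assumptions: the list ends with next == 0; a `seen` set guards
    against accidental cycles in a corrupted dump.
    """
    seen = set()
    addr = head
    while addr and addr not in seen:
        seen.add(addr)
        nxt, name_ptr, start, size, unknown, dest = struct.unpack_from(
            endian + "6I", image, addr - base
        )
        yield {
            "descriptor": addr,
            "name": read_cstring(image, base, name_ptr),
            "start_addr": start,
            "size": size,
            "unknown": unknown,
            "dest_section": dest,
        }
        addr = nxt
```

Because dest_section holds a descriptor address rather than a memory address, resolving the destination of a section such as .cromtext means looking up the yielded entry whose "descriptor" value matches.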

Some examples of the contents of memory sections include:

The .load_apphdr section is constructed as follows:

Size	Description
`4` bytes	Protected memory entries count (0x1a)
`0x6c` bytes	The apphdr struct itself
`0xd0` bytes	Protected memory entries (Pairs of 32-bit addresses. 0x1a*8=0xd0 bytes)

The .secinfo section contains the parameter triples for the memset, memcpy, and decompress functions, elements of the section descriptor linked list, and the section names as null-terminated strings.

Insights

Now that we can associate memory address ranges with sections, we can reach some interesting conclusions:

The memory sections do not overlap.
The protected areas of memory include the following named sections:

.load_text
.load_rodata
.boot_ncdram_hole (empty section)
.load_ncdata (empty section)
.load_data
.load_ncbss
.load_cgdbuf
.load_bss
.nosi_text
.nosi_rodata (empty section)
.nosd_data (empty section)
.nosd_bss
.startup_text
.startup_rodata
.startup_data (empty section)
.startup_bss
.stack
.erom_support_2
.secinfo

These are mostly sections that are critical for running the app loader, and include sections that were initialized by the pre-loader.

The memcpy/memset/uncompress parameters correspond to entire memory sections and do not overlap.
All the sections that need initialization have corresponding parameters in one of the memcpy/memset/uncompress lists — even those that have been initialized by the pre-loader and are part of the protected ranges.

The last conclusion is encouraging. If we know the address of the apphdr structure, we can have a loader script parse it and initialize the uninitialized memory automatically. The initialization includes those sections with hardcoded addresses in the pre-loader code. The magic 4-byte number at the start of the apphdr structure is unique, and can be used to find the structure.
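As a starting point for such a loader script, the magic-number search could look like the following Python sketch. It assumes little-endian byte order and adds one sanity check of our own (the size field right after the magic should read 0x6c, as in the header we examined); `find_apphdr_candidates` is a hypothetical name.

```python
import struct


def find_apphdr_candidates(image, endian="<"):
    """Yield offsets in `image` where an apphdr plausibly starts.

    A candidate is the 4-byte magic 0x3ca55a3c followed by a size
    field equal to 0x6c. Byte order is an assumption.
    """
    magic = struct.pack(endian + "I", 0x3CA55A3C)
    pos = image.find(magic)
    while pos != -1:
        if pos + 8 <= len(image):
            (size,) = struct.unpack_from(endian + "I", image, pos + 4)
            if size == 0x6C:
                yield pos
        pos = image.find(magic, pos + 1)
```

Each candidate offset can then be decoded as an apphdr, and the memset, memcpy and uncompress lists it points to can be replayed against the memory image.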

So, are we done yet? Every path has an end, and we have finally reached ours. Once here, we discover that “Accomplishments will prove to be a journey, not a destination” (Dwight D. Eisenhower).

Let’s take a moment to remember all the stages of the firmware we had to unpack and decode to reach this point:

In post no. 1, we unpacked the PCL format, including a proprietary extension and extracted data encoded as a raster graphics image.
In post no. 2, we encountered S-records, and all they wanted to do was parse some S-records. We dealt with a proprietary S-record binary variation along the way.
In post no. 3, we started looking at the code and discovered that it is self-modifying and staged, with each part loading the next into memory. We also got a crash course in sliding-window compression.
In post no. 4, we uncovered the app header structure, saw how the different sections of code and data are loaded into memory, and started to see the light at the end of the tunnel. We also finished this blog post series.

What should we do next week?

Be on the lookout for our upcoming announcement on June 16th, when we announce the security findings for which we performed all of this initial research of firmware unpacking.

Thank you

Moshe Rubin and Daniel Goldberg for proofreading

JSOF Team
