One year ago, we received a contract as a PDF file. It was digitally signed. We looked at the document, ignoring the "certificate is not trusted" warning shown by the viewer, and asked ourselves:
"How do PDF signatures exactly work?"
We are quite familiar with the security of message formats like XML and JSON. But none of us knew how PDFs really work. So we started our research journey.
Today, we are happy to announce our results. In this blog post, we give an overview of how PDF signatures work and, on top of that, reveal three novel attack classes for spoofing a digitally signed PDF document. We present our evaluation of 22 different PDF viewers and show 21 of them to be vulnerable. We additionally evaluated 8 online validation services and found 6 to be vulnerable.
In cooperation with the BSI-CERT, we contacted all vendors, provided proof-of-concept exploits, and helped them to fix the issues. Three generic CVEs, one for each attack class, were issued:
CVE-2018-16042,
CVE-2018-18688,
CVE-2018-18689.
Full results are available in the master's thesis of Karsten Meyer zu Selhausen, in our security report, and on our website.
Digitally Signed PDFs? Who the Hell uses this?
Maybe you are asking yourself whether signed PDFs are important and who uses them.
In fact, you may have already used them.
Have you ever opened an invoice from companies such as Amazon, Sixt, or Decathlon?
These PDFs are digitally signed and protected against modifications.
In fact, PDF signatures are widely deployed in our world. In 2000, President Bill Clinton enacted a federal law facilitating the use of electronic and digital signatures in interstate and foreign commerce by ensuring the validity and legal effect of contracts.
He approved the eSign Act by digitally signing it.
Since 2014, organizations delivering public digital services in an EU member state are required to support digitally signed documents, which are even admissible as evidence in legal proceedings.
In Austria, every governmental authority digitally signs any official document [§19]. In addition, any new law is legally valid after its announcement within a digitally signed PDF.
Several countries like Brazil, Canada, the Russian Federation, and Japan also use and accept digitally signed documents.
According to Adobe Sign, the company processed 8 billion electronic and digital signatures in 2017 alone.
Crash Course: PDF and PDF Signatures
To understand how to spoof PDF signatures, we unfortunately need to explain the basics first. So here is a brief overview.
PDF files are largely ASCII-based. You can use a common text editor to open them and read the source code, although embedded streams may contain binary data.
PDF header. The header is the first line within a PDF and defines the interpreter version to be used. The provided example uses version PDF 1.7.
PDF body. The body defines the content of the PDF and contains text blocks, fonts, images, and metadata regarding the file itself. The main building blocks within the body are objects. Each object starts with an object number followed by a generation number. The generation number should be incremented if additional changes are made to the object.
In the given example, the body contains four objects: Catalog, Pages, Page, and stream. The Catalog object is the root object of the PDF file. It defines the document structure and can additionally declare access permissions. The Catalog refers to a Pages object, which defines the number of pages and holds a reference to each Page object (e.g., text columns). The Page object contains information on how to build a single page. In the given example, it only contains a single string object "Hello World!".
Xref table. The Xref table contains information about the position (byte offset) of all PDF objects within the file.
Trailer. After a PDF file is read into memory, it is processed from the end to the beginning. This means that the Trailer is the first content of a PDF file to be processed. It contains references to the Catalog and the Xref table.
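To make the four parts more concrete, here is a minimal Python sketch that assembles a tiny one-page file and then reads it back from the end, roughly the way a viewer would. The helper names build_minimal_pdf and read_structure are hypothetical and chosen only for illustration, and the page omits font resources, so treat the result as a structural illustration rather than a file every viewer will render.

```python
import re


def build_minimal_pdf() -> bytes:
    """Assemble header, body (four objects), Xref table, and Trailer."""
    stream = b"BT /F1 12 Tf 20 100 Td (Hello World!) Tj ET"
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] /Contents 4 0 R >>",
        b"<< /Length %d >>\nstream\n%s\nendstream" % (len(stream), stream),
    ]
    out = bytearray(b"%PDF-1.7\n")  # PDF header
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset that goes into the Xref table
        out += b"%d 0 obj\n%s\nendobj\n" % (number, body)
    xref_offset = len(out)
    out += b"xref\n0 %d\n0000000000 65535 f \n" % (len(objects) + 1)
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)


def read_structure(pdf: bytes) -> dict:
    """Locate the parts starting from the end of the file."""
    header = pdf[: pdf.index(b"\n")].decode("ascii")
    trailer_pos = pdf.rindex(b"trailer")
    tail = pdf[trailer_pos:]
    root = re.search(rb"/Root\s+(\d+)\s+(\d+)\s+R", tail)
    startxref = re.search(rb"startxref\s+(\d+)", tail)
    xref_offset = int(startxref.group(1))
    # In-use Xref entries: 10-digit offset, 5-digit generation number, "n".
    entries = re.findall(rb"(\d{10}) (\d{5}) n", pdf[xref_offset:trailer_pos])
    return {
        "header": header,
        "root_object": int(root.group(1)),
        "xref_offset": xref_offset,
        "object_offsets": [int(offset) for offset, _generation in entries],
    }


pdf = build_minimal_pdf()
print(read_structure(pdf))
```

The sketch uses regular expressions instead of a real PDF parser, which is enough to show where each part lives but is not how a robust viewer should read a file.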
How do PDF Signatures work?
PDF signatures rely on a feature of the PDF specification called incremental saving (also known as incremental update), which allows a PDF file to be modified without changing the previous content.
As you can see in the figure on the left side, the original document is the same document as the one described above. By signing the document, an incremental saving is applied and the following content is added: a new Catalog, a Signature object, a new Xref table referencing the new object(s), and a new Trailer. The new Catalog extends the old one by adding a reference to the Signature object. The Signature object (5 0 obj) contains information regarding the applied cryptographic algorithms for hashing and signing the document. It additionally includes a Contents parameter containing a hex-encoded PKCS#7 blob, which holds the certificates as well as the signature value created with the private key corresponding to the public key stored in the certificate. The ByteRange parameter defines which bytes of the PDF file are used as the hash input for the signature calculation and defines 2 integer tuples:
a, b: Beginning at byte offset a, the following b bytes are used as the first input for the hash calculation. Typically, a = 0 is used to indicate that the beginning of the file is used, while a+b is the byte offset where the PKCS#7 blob begins.
c, d: Typically, byte offset c is the end of the PKCS#7 blob, while c+d points to the end of the last byte range of the PDF file. The d bytes starting at c are used as the second input to the hash calculation.
According to the specification, it is recommended to sign the whole file except for the PKCS#7 blob (located in the range between a+b and c).
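The ByteRange arithmetic can be sketched in a few lines of Python. The toy file built by make_toy_signed_file is not a real signed PDF: it only places a placeholder blob between two byte ranges so that the offsets can be checked. All helper names are hypothetical, and SHA-256 is an assumption for the demonstration, since the actual digest algorithm is named inside the signature itself.

```python
import hashlib
import re
from typing import Tuple

BYTE_RANGE = re.compile(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]")


def make_toy_signed_file(contents_hex: bytes = b"00" * 16) -> bytes:
    """Build a toy file whose /ByteRange is self-consistent (no real signature)."""
    fmt = ("%PDF-1.7\n5 0 obj\n<< /Type /Sig "
           "/ByteRange [{:010d} {:010d} {:010d} {:010d}] /Contents ")
    blob = b"<" + contents_hex + b">"
    tail = b" >>\nendobj\n%%EOF\n"
    b = len(fmt.format(0, 0, 0, 0))  # fixed-width fields keep this length stable
    c = b + len(blob)                # first byte after the blob
    head = fmt.format(0, b, c, len(tail)).encode("ascii")
    return head + blob + tail


def parse_byte_range(pdf: bytes) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d) from the first /ByteRange entry in the file."""
    match = BYTE_RANGE.search(pdf)
    if match is None:
        raise ValueError("no /ByteRange found")
    a, b, c, d = (int(group) for group in match.groups())
    return a, b, c, d


def signed_bytes(pdf: bytes, byte_range: Tuple[int, int, int, int]) -> bytes:
    """Concatenate the two ranges that serve as hash input."""
    a, b, c, d = byte_range
    return pdf[a:a + b] + pdf[c:c + d]


pdf = make_toy_signed_file()
a, b, c, d = parse_byte_range(pdf)
print("blob sits in bytes", a + b, "to", c)
print("sha256 of signed ranges:", hashlib.sha256(signed_bytes(pdf, (a, b, c, d))).hexdigest())
```

Because the blob lies outside both ranges, replacing it with another blob of the same length leaves the hash input unchanged, which is exactly why the signature value can be written into the file after hashing.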
Attacks
During our research, we discovered three novel attack classes on PDF signatures:
- Universal Signature Forgery (USF)
- Incremental Saving Attack (ISA)
- Signature Wrapping Attack (SWA)
In this blog post, we give an overview of the attacks without going into technical details. If you are interested in more, just take a look at the sources we summarized for you here.
Universal Signature Forgery (USF)
The main idea of Universal Signature Forgery (USF) is to manipulate the meta information in the signature in such a way that the targeted viewer application opens the PDF file, finds the signature, but is unable to find all necessary data for its validation.
Instead of treating the missing information as an error, the viewer shows that the contained signature is valid. For example, the attacker can manipulate the Contents or ByteRange values within the Signature object. Manipulating these entries works because we remove either the signature value or the information stating which content is signed.
The attack seems trivial, but even very good implementations like Adobe Reader DC, which prevented all other attacks, were susceptible to USF.
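From the validator's point of view, the lesson is to fail closed. The following Python sketch assumes that the Signature object has already been parsed into a plain dict with the keys "Contents" (bytes) and "ByteRange" (list of integers). That representation and the function names are hypothetical, and the checks are only a plausible minimum, not the fix any particular vendor shipped.

```python
from typing import List


def signature_field_problems(sig: dict, file_length: int) -> List[str]:
    """Return the reasons why a signature must be rejected before any crypto runs."""
    problems = []

    contents = sig.get("Contents")
    if not isinstance(contents, (bytes, bytearray)) or len(contents) == 0:
        problems.append("Contents is missing, null, or empty")
    elif not any(contents):
        problems.append("Contents holds only zero bytes")

    byte_range = sig.get("ByteRange")
    well_formed = (
        isinstance(byte_range, (list, tuple))
        and len(byte_range) == 4
        and all(isinstance(value, int) and value >= 0 for value in byte_range)
    )
    if not well_formed:
        problems.append("ByteRange is missing, null, or malformed")
    else:
        a, b, c, d = byte_range
        if a + b > c or c + d > file_length:
            problems.append("ByteRange does not fit the file")

    return problems


def verdict(sig: dict, file_length: int) -> str:
    """Missing data is an error, never a silent success."""
    if signature_field_problems(sig, file_length):
        return "invalid"
    return "proceed to cryptographic verification"


print(verdict({"Contents": b"\x30\x82", "ByteRange": [0, 100, 200, 50]}, 250))
print(verdict({"ByteRange": [0, 100, 200, 50]}, 250))
```

Passing these checks says nothing about whether the signature is cryptographically valid. It only ensures that the validator has something to verify.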
Incremental Saving Attack (ISA)
The Incremental Saving Attack (ISA) abuses a legitimate feature of the PDF specification, which allows a PDF file to be updated by appending the changes. The feature is used, for example, to store PDF annotations or to add new pages while editing the file.
The main idea of the ISA is to use the same technique to change elements such as text or whole pages included in the signed PDF file to whatever the attacker desires.
In other words, an attacker can redefine the document's structure and content using the Body Updates part. The digital signature within the PDF file protects precisely the part of the file defined in the ByteRange. Since the incremental saving appends the Body Updates to the end of the file, they are not part of the defined ByteRange and thus not covered by the signature's integrity protection. In summary, the signature remains valid, while the Body Updates change the displayed content.
This is not forbidden by the PDF specification, but the signature validation should indicate that the document has been altered after signing.
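A validator can at least notice such appended content by comparing the end of the signed range with the end of the file. The Python sketch below uses a toy byte string instead of a real PDF, and the function name unsigned_trailing_bytes is hypothetical. It only flags that something follows the signed range; deciding whether that update is harmless (for example, an additional signature) is left out.

```python
from typing import Tuple


def unsigned_trailing_bytes(pdf: bytes, byte_range: Tuple[int, int, int, int]) -> int:
    """Number of bytes that follow the last signed byte (c + d)."""
    _a, _b, c, d = byte_range
    return len(pdf) - (c + d)


# Toy layout: two signed parts around a placeholder blob.
head = b"%PDF-1.7\n...first signed part... /Contents "
blob = b"<00ff>"
tail = b" ...second signed part...\n%%EOF\n"
signed = head + blob + tail
byte_range = (0, len(head), len(head) + len(blob), len(tail))

# The same file after something was appended by an incremental saving.
updated = signed + b"6 0 obj\n(appended after signing)\nendobj\n%%EOF\n"

for name, data in (("signed", signed), ("updated", updated)):
    extra = unsigned_trailing_bytes(data, byte_range)
    status = "covered to end of file" if extra == 0 else "altered after signing"
    print(name, extra, status)
```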
Signature Wrapping Attack (SWA)
Independent of PDFs, the main idea behind Signature Wrapping Attacks is to force the verification logic to process different data than the application logic.
In PDF files, SWA targets the signature validation logic by relocating the originally signed content to a different position within the document and inserting new content at the allocated position. The starting point for the attack is the manipulation of the ByteRange value, which allows the signed content to be shifted to different locations within the file.
On a very technical level, the attacker uses a validly signed document (shown on the left side) and proceeds as follows:
- Step 1 (optional): The attacker deletes the padded zero bytes within the Contents parameter to increase the available space for injecting manipulated objects.
- Step 2: The attacker defines a new /ByteRange [a b c* d] by manipulating the c value, which now points to the second signed part placed on a different position within the document.
- Step 3: The attacker creates a new Xref table pointing to the new objects. It is essential that the newly inserted Xref table has the same byte offset as the previous Xref table. The position is not changeable since it is referenced by the signed Trailer. For this purpose, the attacker can add a padding block (e.g., using whitespaces) before the new Xref table to fill the unused space.
- Step 4: The attacker injects malicious objects which are not protected by the signature. There are different injection points for these objects. They can be placed before or after the malicious Xref table. If Step 1 is not executed, it is only possible to place them after the malicious Xref table.
- Step 5 (optional): Some PDF viewers need a Trailer after the manipulated Xref table. Otherwise, they either cannot open the PDF file or they detect the manipulation and display a warning message. Copying the last Trailer is sufficient to bypass this limitation.
- Step 6: The attacker moves the signed content defined by c and d to byte offset c*. Optionally, the moved content can be encapsulated within a stream object. Note that the manipulated PDF file does not end with %%EOF after the endstream. Some validators warn that the file was manipulated after signing when they find an %%EOF after the signed one. To bypass this check, the PDF file is not correctly closed. However, it will still be processed by any viewer.
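Seen from the defender's side, the steps above all depend on the gap between a+b and c holding more than the signature blob. One plausible check, sketched here in Python on toy byte strings, is to require that this gap is exactly one hex string. The function name gap_holds_only_contents and both toy layouts are hypothetical, and this is an illustration of the idea rather than the countermeasure any specific viewer implements.

```python
import re
from typing import Tuple

HEX_STRING = re.compile(rb"<[0-9A-Fa-f\s]*>")


def gap_holds_only_contents(pdf: bytes, byte_range: Tuple[int, int, int, int]) -> bool:
    """True if the unsigned gap between a+b and c is a single hex string."""
    a, b, c, d = byte_range
    if a != 0 or a + b > c or c + d > len(pdf):
        return False
    return HEX_STRING.fullmatch(pdf[a + b:c]) is not None


head = b"%PDF-1.7\n...first signed part... /Contents "
tail = b" >>\nendobj\n...second signed part...\n%%EOF\n"

# Honest layout: the gap is the zero-padded signature blob and nothing else.
blob = b"<3082" + b"00" * 20 + b">"
honest = head + blob + tail
honest_range = (0, len(head), len(head) + len(blob), len(tail))

# Wrapped layout: padding trimmed, extra bytes placed in the gap, c moved to c*.
short_blob = b"<3082>"
injected = b"\n6 0 obj\n(unsigned object)\nendobj\n"
wrapped = head + short_blob + injected + tail
wrapped_range = (0, len(head), len(head) + len(short_blob) + len(injected), len(tail))

print("honest:", gap_holds_only_contents(honest, honest_range))
print("wrapped:", gap_holds_only_contents(wrapped, wrapped_range))
```

In both toy layouts the two signed ranges are byte-for-byte identical, which mirrors why a hash comparison alone does not reveal the relocation.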
Evaluation
In our evaluation, we searched for desktop applications validating digitally signed PDF files and found 22 applications that fulfill this requirement. We analyzed the security of their signature validation process against our three attack classes. We evaluated the latest versions of the applications on all supported platforms (Windows, macOS, and Linux).
Authors of this Post
Vladislav Mladenov
Christian Mainka
Karsten Meyer zu Selhausen
Martin Grothe
Jörg Schwenk
Acknowledgements
Many thanks to the CERT-Bund team for the great support during the responsible disclosure.
We also want to acknowledge the teams which reacted to our report and fixed the vulnerable implementations.