Insecure deserialization

Insecure deserialization is when user-controllable data is deserialized by a website. This potentially enables an attacker to manipulate serialized objects in order to pass harmful data into the application code.

Sources for these notes

Tools

Java: ysoserial
PHP: phpggc.
Burpsuite Extensions: Java Deserialization Scanner, PHP Object Injection Slinger, PHP Object Injection Check
Exploits: Ruby 2.X generic deserialization to RCE gadget chain

What is deserialization

Serialization is the process of converting complex data structures, such as objects and their fields, into a "flatter" format that can be sent and received as a sequential stream of bytes.
Deserialization is the process of restoring this byte stream to a fully functional replica of the original object, in the exact state as when it was serialized.

Exactly how objects are serialized depends on the language. Some languages serialize objects into binary formats, whereas others use different string formats, with varying degrees of human readability.

Identifying

PHP

PHP uses a mostly human-readable string format, with letters representing the data type and numbers representing the length of each entry.

For example, consider a User object with the attributes:

$user->name = "carlos"; $user->isLoggedIn = true;

When serialized, this object may look something like this:

O:4:"User":2:{s:4:"name":s:6:"carlos"; s:10:"isLoggedIn":b:1;}

- `O:4:"User"` - An object with the 4-character class name `"User"`
- `2` - the object has 2 attributes
- `s:4:"name"` - The key of the first attribute is the 4-character string `"name"`
- `s:6:"carlos"` - The value of the first attribute is the 6-character string `"carlos"`
- `s:10:"isLoggedIn"` - The key of the second attribute is the 10-character string `"isLoggedIn"`
- `b:1` - The value of the second attribute is the boolean value `true`

The native methods for PHP serialization are serialize() and unserialize(). If you have source code access, you should start by looking for unserialize() anywhere in the code and investigating further.

In PHP, specific magic methods are utilized during the serialization and deserialization processes:

__sleep: Invoked when an object is being serialized. This method should return an array of the names of all properties of the object that should be serialized. It's commonly used to commit pending data or perform similar cleanup tasks.
__wakeup: Called when an object is being deserialized. It's used to reestablish any database connections that may have been lost during serialization and perform other reinitialization tasks.
__unserialize: This method is called instead of __wakeup (if it exists) when an object is being deserialized. It gives more control over the deserialization process compared to __wakeup.
__destruct: This method is called when an object is about to be destroyed or when the script ends. It's typically used for cleanup tasks, like closing file handles or database connections.
__toString: This method allows an object to be treated as a string. It can be used for reading a file or other tasks based on the function calls within it, effectively providing a textual representation of the object.

Java

Some languages, such as Java, use binary serialization formats.

To distinguish it: serialized Java objects always begin with the same bytes, which are encoded as ac ed in hexadecimal and rO0 in Base64.

Any class that implements the interface java.io.Serializable can be serialized and deserialized.

If you have source code access, take note of any code that uses the readObject() method, which is used to read and deserialize data from an InputStream.

Attacks

1. Manipulate serialized objects

If you have source code access, take note of any code that uses the readObject() method, which is used to read and deserialize data from an InputStream.

As a simple example, consider a website that uses a serialized User object to store data about a user's session in a cookie. If an attacker spotted this serialized object in an HTTP request, they might decode it to find the following byte stream:

O:4:"User":2:{s:8:"username";s:6:"carlos";s:7:"isAdmin";b:0;}

The isAdmin attribute is an obvious point of interest. An attacker could simply change the boolean value of the attribute to 1 (true), re-encode the object, and overwrite their current cookie with this modified value. In isolation, this has no effect. However, let's say the website uses this cookie to check whether the current user has access to certain administrative functionality:

1
2
3

$user = unserialize($_COOKIE); 
if ($user->isAdmin === true)
{ // allow access to admin interface }

This vulnerable code would instantiate a User object based on the data from the cookie, including the attacker-modified isAdmin attribute. At no point is the authenticity of the serialized object checked. This data is then passed into the conditional statement and, in this case, would allow for an easy privilege escalation.

Burpsuite lab

Burpsuite Lab: Modifying serialized objects

2. Modify data types

PHP-based logic is particularly vulnerable to this kind of manipulation due to the behavior of its loose comparison operator (==) when comparing different data types.

Reference: PHP type juggling

Additionally, if we spot a session cookie base64-encoded with an object, such as this PHP one, we can try to modify it.

1	`Cookie: session=Tzo0OiJVc2VyIjoyOntzOjg6InVzZXJuYW1lIjtzOjY6IndpZW5lciI7czoxMjoiYWNjZXNzX3Rva2VuIjtzOjMyOiJieno5ZmJ2OHV6YXM3MTRlcnJuaGExcTVwcGJ6eWY1aCI7fQ%3d%3d`

Decoded from base64

1	`O:4:"User":2:{s:8:"username";s:6:"wiener";s:12:"access_token";s:32:"bzz9fbv8uzas714errnha1q5ppbzyf5h";}`

s refers to string, and i to integer. We could modify the input and the data type modifying the object to:

1	`O:4:"User":2:{s:8:"username";s:13:"administrator";s:12:"access_token";i:0:"";}`

Afterwards, base64 encode the object to insert it as the cookie session:

1	`Cookie: session=Tzo0OiJVc2VyIjoyOntzOjg6InVzZXJuYW1lIjtzOjEzOiJhZG1pbmlzdHJhdG9yIjtzOjEyOiJhY2Nlc3NfdG9rZW4iO2k6MDoiIjt9`

Explanation: Let's say an attacker modified the password attribute so that it contained the integer 0 instead of the expected string. As long as the stored password does not start with a number, the condition would always return true, enabling an authentication bypass. Note that this is only possible because deserialization preserves the data type. If the code fetched the password from the request directly, the 0 would be converted to a string and the condition would evaluate to false.

Burpsuite lab

Burpsuite Lab: Modifying serialized data types

3. Abuse application functionality

A website's functionality might also perform dangerous operations on data from a deserialized object. In this case, you can use insecure deserialization to pass in unexpected data and leverage the related functionality to do damage.

For example, as part of a website's "Delete user" functionality, the user's profile picture is deleted by accessing the file path in the $user->image_location attribute. If this $user was created from a serialized object, an attacker could exploit this by passing in a modified object with the image_location set to an arbitrary file path.

Burpsuite lab

Burpsuite Lab: Using application functionality to exploit insecure deserialization

4. Magic methods

Magic methods are a special subset of methods that are invoked automatically whenever a particular event or scenario occurs. One of the most common examples in PHP is __construct(), which is invoked whenever an object of the class is instantiated, similar to Python's __init__. Typically, constructor magic methods like this contain code to initialize the attributes of the instance. However, magic methods can be customized by developers to execute any code they want.

PHP -> Most importantly in this context, some languages have magic methods that are invoked automatically during the deserialization process. For example, PHP's unserialize() method looks for and invokes an object's __wakeup() magic method.

JAVA -> In Java deserialization, the same applies to the ObjectInputStream.readObject() method, which is used to read data from the initial byte stream and essentially acts like a constructor for "re-initializing" a serialized object. However, Serializable classes can also declare their own readObject() method as follows:

private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException 
{ 
    // implementation 
}

A readObject() method declared in exactly this way acts as a magic method that is invoked during deserialization. This allows the class to control the deserialization of its own fields more closely.

You should pay close attention to any classes that contain these types of magic methods. They allow you to pass data from a serialized object into the website's code before the object is fully deserialized. This is the starting point for creating more advanced exploits.

5. Inject arbitrary objects

The methods available to an object are determined by its class. Deserialization methods do not typically check what they are deserializing. This means that you can pass in objects of any serializable class that is available to the website, and the object will be deserialized.

The fact that this object is not of the expected class does not matter. The unexpected object type might cause an exception in the application logic, but the malicious object will already be instantiated by then.

Burpsuite lab

Burpsuite Lab: Arbitrary object injection in PHP

6. Gadget chains

Classes containing these deserialization magic methods can also be used to initiate more complex attacks involving a long series of method invocations, known as a "gadget chain".

A "gadget" is a snippet of code that exists in the application that can help an attacker to achieve a particular goal. An individual gadget may not directly do anything harmful with user input. However, the attacker's goal might simply be to invoke a method that will pass their input into another gadget. By chaining multiple gadgets together in this way, an attacker can potentially pass their input into a dangerous "sink gadget", where it can cause maximum damage.

It is important to note that the vulnerability is the deserialization of user-controllable data, not the mere presence of a gadget chain in the website's code or any of its libraries.

Prebuilt gadget chains

Java: ysoserial

PHP: phpggc

About ysoserial: Not all of the gadget chains in ysoserial enable you to run arbitrary code. Instead, they may be useful for other purposes. For example, you can use the following ones to help you quickly detect insecure deserialization on virtually any server: - The URLDNS chain triggers a DNS lookup for a supplied URL. Most importantly, it does not rely on the target application using a specific vulnerable library and works in any known Java version. This makes it the most universal gadget chain for detection purposes. If you spot a serialized object in the traffic, you can try using this gadget chain to generate an object that triggers a DNS interaction with the Burp Collaborator server. If it does, you can be sure that deserialization occurred on your target. - JRMPClient is another universal chain that you can use for initial detection. It causes the server to try establishing a TCP connection to the supplied IP address. Note that you need to provide a raw IP address rather than a hostname. This chain may be useful in environments where all outbound traffic is firewalled, including DNS lookups. You can try generating payloads with two different IP addresses: a local one and a firewalled, external one. If the application responds immediately for a payload with a local address, but hangs for a payload with an external address, causing a delay in the response, this indicates that the gadget chain worked because the server tried to connect to the firewalled address. In this case, the subtle time difference in responses can help you to detect whether deserialization occurs on the server, even in blind cases.

Burpsuite lab

Burpsuite Lab: Exploiting Java deserialization with Apache Commons

Documented gadget chains

There may not always be a dedicated tool available for exploiting known gadget chains in the framework used by the target application. In this case, it's always worth looking online to see if there are any documented exploits that you can adapt manually.

Burpsuite lab

Burpsuite Lab: Exploiting Ruby deserialization using a documented gadget chain

7. Your own exploit

8. PHAR deserialization

PHP provides several URL-style wrappers that you can use for handling different protocols when accessing file paths. One of these is the phar:// wrapper, which provides a stream interface for accessing PHP Archive (.phar) files.

The PHP documentation reveals that PHAR manifest files contain serialized metadata. Crucially, if you perform any filesystem operations on a phar:// stream, this metadata is implicitly deserialized.

This means that a phar:// stream can potentially be a vector for exploiting insecure deserialization.

The explanation: This technique requires you to upload the PHAR to the server somehow. One approach is to use an image upload functionality, for example. If you are able to create a polyglot file, with a PHAR masquerading as a simple JPG, you can sometimes bypass the website's validation checks. If you can then force the website to load this polyglot "JPG" from a phar:// stream, any harmful data you inject via the PHAR metadata will be deserialized. As the file extension is not checked when PHP reads a stream, it does not matter that the file uses an image extension. As long as the class of the object is supported by the website, both the __wakeup() and __destruct() magic methods can be invoked in this way, allowing you to potentially kick off a gadget chain using this technique.

9. Memory corruption

Even without the use of gadget chains, it is still possible to exploit insecure deserialization. If all else fails, there are often publicly documented memory corruption vulnerabilities that can be exploited via insecure deserialization. TDeserialization methods, such as PHP's unserialize() are rarely hardened against these kinds of attacks, and expose a huge amount of attack surface.

Last update: 2024-06-06
Created: May 1, 2024 18:53:29