Python Subprocess Example
In this article, using Python 3.4, we’ll learn about Python’s subprocess module, which provides a high-level interface for running and interacting with external commands.
This module is intended to replace os.system and similar older modules and functions. It provides all of the functionality of the modules and functions it replaces. The API is consistent for all uses, and operations such as closing files and pipes are built in instead of being handled separately by the application code.
One thing to notice is that the API is roughly the same, but the underlying implementation is slightly different between Unix and Windows. The examples provided here were tested on a Unix based system. Behavior on a non-Unix OS will vary.
The subprocess module defines a class called Popen and some wrapper functions which take almost the same arguments as its constructor does. For more advanced use cases, the Popen interface can be used directly.
We’ll first take a look at the wrapper functions and see what they do and how to call them. Then we’ll see how to use Popen directly.
1. Convenience Functions
1.1. subprocess.call
The first function we’ll look at is subprocess.call. The most commonly used arguments it takes are (args, *, stdin=None, stdout=None, stderr=None, shell=False, timeout=None). Although these are the most commonly used arguments, the function accepts almost as many arguments as Popen does, and it passes almost all of them directly through to that interface. The only exception is timeout, which is passed to Popen.wait(). If the timeout expires, the child process is killed and then waited for again, and the TimeoutExpired exception is re-raised after the child process has terminated.
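As a minimal sketch of the timeout behaviour described above, catching TimeoutExpired might look like the following. The helper name run_with_timeout and the use of sys.executable to launch a child Python process that only sleeps are assumptions made for illustration; they are not part of the article’s own examples.

```python
import subprocess
import sys


def run_with_timeout(seconds_to_sleep, timeout):
    """Run a child that sleeps, giving up after `timeout` seconds.

    Returns the child's exit code, or None if the timeout expired.
    """
    # Hypothetical child command: a Python one-liner that just sleeps.
    cmd = [sys.executable, "-c",
           "import time; time.sleep({})".format(seconds_to_sleep)]
    try:
        return subprocess.call(cmd, timeout=timeout)
    except subprocess.TimeoutExpired:
        # call() has already killed and waited for the child at this point.
        return None
```

Calling run_with_timeout(0, timeout=30) should return the exit code 0, while run_with_timeout(5, timeout=0.5) should return None because the child is still sleeping when the timeout expires.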
This function runs the command described by args, waits for the command to complete and returns its exit code. Let’s see a trivial example:
    $ python3
    >>> import subprocess
    >>> subprocess.call(["ls", "-l"])
    0
As you see, it returns 0, which is the exit code of the ls -l command. You will probably see the output of the command in the console, as it’s sent to standard output, but it won’t be available within your code when the command is run this way.
Another thing to notice is that the command to run is passed as a list. Each element is handed to the program as a separate argument without going through a shell, which saves us the trouble of escaping quote marks and other special characters. You can avoid this notation by setting the shell flag to True, which will spawn an intermediate shell process and tell it to run the command, like this:
    $ python3
    >>> import subprocess
    >>> subprocess.call("ls -l", shell=True)
    0
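To make the difference with the list form concrete, here is a small sketch that counts how many arguments a child process actually receives when no shell is involved. The child one-liner and the helper name count_received_args are hypothetical and exist only for this illustration.

```python
import subprocess
import sys

# Hypothetical child program: exits with the number of arguments it received.
COUNT_ARGS = "import sys; sys.exit(len(sys.argv) - 1)"


def count_received_args(args):
    """Return how many separate arguments the child process saw."""
    return subprocess.call([sys.executable, "-c", COUNT_ARGS] + list(args))
```

With this helper, count_received_args(["a b; c"]) returns 1: the spaces and the semicolon arrive as part of a single argument, because no shell ever gets the chance to interpret them.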
Although the shell=True option can be useful for exploiting the most powerful tools the shell provides, it can be a security risk, and you should keep this in mind. To use it more safely, you can use shlex.quote() to properly escape whitespace and shell metacharacters in strings that are going to be used to construct shell commands. Let’s see an example of a security issue this can cause:
security-issues.py
    import subprocess
    import shlex


    def unsafe(directory):
        # The directory is interpolated into the shell command as-is.
        command = "du -sh {}".format(directory)
        subprocess.call(command, shell=True)


    def safe(directory):
        # shlex.quote() escapes whitespace and shell metacharacters first.
        sanitized_directory = shlex.quote(directory)
        command = "du -sh {}".format(sanitized_directory)
        subprocess.call(command, shell=True)


    if __name__ == "__main__":
        directory = "/dev/null; ls -l /tmp"
        unsafe(directory)
        safe(directory)
Here we have two functions. The first one, unsafe, runs du -sh over the directory provided, without validating or cleaning it. The second one, safe, applies the same command to a sanitized version of the provided directory. By providing a directory like "/dev/null; ls -l /tmp", we are trying to exploit the vulnerability to see what the contents of /tmp are. Of course, it could get a lot uglier if an rm were involved here.
The unsafe function will write to stdout a line like 0 /dev/null as the output of du -sh, followed by the contents of /tmp. In the safe function, du will fail with an error, as no file or directory called "/dev/null; ls -l /tmp" exists in the system.
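To see why the sanitized version is harmless, it can help to look at the command string that shlex.quote() produces. The following sketch only builds the string and does not run it; the helper name build_du_command is hypothetical.

```python
import shlex


def build_du_command(directory):
    """Build the shell command string used by safe(), without running it."""
    return "du -sh {}".format(shlex.quote(directory))
```

For the malicious input, build_du_command("/dev/null; ls -l /tmp") gives du -sh '/dev/null; ls -l /tmp', so the shell treats the whole value as a single quoted argument. A harmless value such as "/tmp" is left unquoted.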
Now, as we said before, this function returns the exit code of the executed command, which gives us the responsibility of interpreting it as success or failure. This is where the check_call function comes in.
1.2. subprocess.check_call
This wrapper function takes the same arguments as subprocess.call and behaves almost the same. The only difference is that it interprets the exit code for us.
It runs the described command and waits for it to complete. If the exit code is zero, it returns; otherwise it raises CalledProcessError. Let’s see:
    $ python3
    >>> import subprocess
    >>> subprocess.check_call("exit 1", shell=True)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python3.4/subprocess.py", line 561, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1
The command exit asks the shell to finish with the given exit code. As it is not zero, CalledProcessError is raised.
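In application code you would usually catch the exception instead of letting it propagate. A minimal sketch follows, assuming a hypothetical child (a Python one-liner started through sys.executable) that exits with a chosen code; the helper name exit_status_or_zero is also made up for this example.

```python
import subprocess
import sys


def exit_status_or_zero(exit_code):
    """Run a child that exits with `exit_code`; report what check_call saw."""
    # Hypothetical child command: a Python one-liner exiting with a given code.
    cmd = [sys.executable, "-c", "import sys; sys.exit({})".format(exit_code)]
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as error:
        # The exception carries the failing command (error.cmd)
        # and its exit status (error.returncode).
        return error.returncode
    return 0
```

Here exit_status_or_zero(3) returns 3, taken from the returncode attribute of the exception, and exit_status_or_zero(0) returns 0 without raising.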
One way of gaining access to the output of the executed command would be to pass PIPE as the stdout or stderr argument. But call and check_call never read from those pipes, so the child process will block once it has written enough output to fill the OS pipe buffer. Since that approach is not safe, let’s skip to the safe and easy way.
1.3. subprocess.check_output
This function takes almost the same arguments as the previous ones. It does not accept stdout, which it uses internally to capture the output, and its signature adds input and universal_newlines. The input argument is passed to Popen.communicate() and thus to the command’s stdin. When it is used, it must be a byte sequence, or a string if universal_newlines=True.
Another change is in its behavior, as it returns the output of the command. It also checks the exit code of the command for you, raising CalledProcessError if it’s not 0. Let’s see an example:
    $ python3
    >>> import subprocess
    >>> subprocess.check_output(["echo", "Hello world!"])
    b'Hello world!\n'
    >>> subprocess.check_output(["echo", "Hello World!"], universal_newlines=True)
    'Hello World!\n'
    >>> subprocess.check_output(["sed", "-e", "s/foo/bar/"], input=b"when in the course of fooman events\n")
    b'when in the course of barman events\n'
    >>> subprocess.check_output(["sed", "-e", "s/foo/bar/"], input="when in the course of fooman events\n", universal_newlines=True)
    'when in the course of barman events\n'
Here you can see the variations of the two new arguments. The input argument is passed to the sed command in the third and fourth statements, and universal_newlines controls whether the output is returned as a decoded string (True) or as a byte sequence (False, the default).
You can also capture the standard error output by setting stderr=subprocess.STDOUT, but keep in mind that this merges stdout and stderr into a single stream, so you need to know how to parse the combined output into something you can use.
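A short sketch of the merged capture, assuming a hypothetical child program (a Python one-liner) that writes one line to each stream; the helper name merged_output is likewise an assumption for illustration.

```python
import subprocess
import sys

# Hypothetical child program: writes one line to each stream.
CHILD = "import sys; print('to stdout'); sys.stderr.write('to stderr\\n')"


def merged_output():
    """Return stdout and stderr of the child as one string."""
    return subprocess.check_output(
        [sys.executable, "-c", CHILD],
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
```

Both lines end up in the returned string. Their relative order depends on how the child buffers each stream, so it should not be relied upon.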
2. Popen
Now, let’s see how we can use Popen directly. With the explanations so far, you should already have an idea of how it works. Let’s see a couple of trivial use cases, starting by rewriting the previous examples using Popen directly.
    $ python3
    >>> from subprocess import Popen
    >>> proc = Popen(["ls", "-l"])
    >>> proc.wait()
    0
Here, as you see, Popen is instantiated passing only the command to run as a list of strings (just as we did before). Popen’s constructor also accepts the shell argument, but it’s still unsafe. The constructor starts executing the command the moment it is instantiated and does not wait for it; calling Popen.wait() waits for the process to finish and returns the exit code.
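The fact that the command is already running before wait() is called can be observed with Popen.poll(), a method not used elsewhere in this article, which returns None while the child is still alive. The sketch below assumes a hypothetical child (a Python one-liner that sleeps for a second) and a made-up helper name, start_and_wait.

```python
import sys
from subprocess import Popen


def start_and_wait():
    """Start a short-lived child and show it runs before wait() is called."""
    # Hypothetical child command: a Python one-liner that sleeps for a second.
    proc = Popen([sys.executable, "-c", "import time; time.sleep(1)"])
    still_running = proc.poll() is None  # poll() returns None while it runs
    exit_code = proc.wait()              # blocks until the child exits
    return still_running, exit_code
```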
Now, let’s see an example of shell=True to confirm that it works just the same:
    $ python3
    >>> from subprocess import Popen
    >>> proc = Popen("ls -l", shell=True)
    >>> proc.wait()
    0
See? It behaves the same as the wrapper functions we saw before. It doesn’t seem so complicated now, does it? Let’s rewrite the security-issues.py example with it:
security-issues-popen.py
    from subprocess import Popen
    import shlex


    def unsafe(directory):
        command = "du -sh {}".format(directory)
        proc = Popen(command, shell=True)
        proc.wait()


    def safe(directory):
        sanitized_directory = shlex.quote(directory)
        command = "du -sh {}".format(sanitized_directory)
        proc = Popen(command, shell=True)
        proc.wait()


    if __name__ == "__main__":
        directory = "/dev/null; ls -l /tmp"
        unsafe(directory)
        safe(directory)
Now, if we execute this example, we’ll see it behaves the same as the one written with subprocess.call.
Note that Popen’s constructor will not raise any errors if the command’s exit code is not zero. Its wait function returns the exit code and, again, it’s your responsibility to check it and perform any fallback.
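If you want check_call-style behaviour while still working with Popen directly, one possible approach is to raise the exception yourself. This is only a sketch under that assumption; run_checked is a hypothetical helper, not something the subprocess module provides.

```python
import sys
from subprocess import Popen, CalledProcessError


def run_checked(args):
    """Run `args` with Popen and raise if the exit code is not zero."""
    proc = Popen(args)
    exit_code = proc.wait()
    if exit_code != 0:
        # Same exception type that check_call raises.
        raise CalledProcessError(exit_code, args)
    return exit_code
```

For instance, run_checked([sys.executable, "-c", "import sys; sys.exit(2)"]) raises CalledProcessError with returncode set to 2.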
So, what if we want to gain access to the output? Popen accepts the stdout and stderr arguments, and passing PIPE to them is safe as long as we read from the pipes while the process runs, which is what Popen.communicate() does. Let’s write a little example which lets us see the size of a file or directory.
size.py
    from subprocess import Popen, PIPE


    def command(directory):
        return ["du", "-sh", directory]


    def check_size(directory):
        proc = Popen(command(directory), stdout=PIPE, stderr=PIPE, universal_newlines=True)
        # communicate() reads both pipes until EOF and then waits for the
        # process, so the child cannot block on a full pipe buffer.
        stdout, stderr = proc.communicate()
        if proc.returncode != 0:
            return stderr
        return stdout


    def main():
        arg = input("directory to check: ")
        result = check_size(directory=arg)
        print(result)


    if __name__ == "__main__":
        main()
Here we instantiate Popen with stdout and stderr set to subprocess.PIPE so we can read the output of our command, and we set universal_newlines to True so we can work with strings. Then we call Popen.communicate(), which reads both pipes until the process finishes (like Popen.wait(), it accepts a timeout), and we return the standard output of our command if it ran successfully, or the standard error output otherwise. Calling Popen.wait() before reading from the pipes would risk a deadlock if the command produced enough output to fill the OS pipe buffer.
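As a sketch of how the timeout mentioned above could be combined with output capture, the following hypothetical helper, run_and_capture, uses Popen.communicate(timeout=...) and kills the child if the timeout expires. The default of 30 seconds is an arbitrary assumption.

```python
import sys
from subprocess import Popen, PIPE, TimeoutExpired


def run_and_capture(args, timeout=30):
    """Run `args`, returning (exit_code, stdout, stderr) as strings."""
    proc = Popen(args, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    try:
        stdout, stderr = proc.communicate(timeout=timeout)
    except TimeoutExpired:
        # communicate() does not kill the child on timeout; do it ourselves,
        # then collect whatever output remains.
        proc.kill()
        stdout, stderr = proc.communicate()
    return proc.returncode, stdout, stderr
```

Because communicate() keeps draining the pipes while the child runs, this also copes with output far larger than a typical OS pipe buffer, for example run_and_capture([sys.executable, "-c", "print('x' * 200000)"]).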
The last scenario we are missing is sending data to the standard input of our command. Let’s write a script that interacts with our size.py, sending a hardcoded directory to its stdin:
size-of-tmp.py
    from subprocess import Popen, PIPE


    def size_of_tmp():
        proc = Popen(["python3", "size.py"], stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines=True)
        # communicate() writes "/tmp" to the child's stdin and reads its output.
        (stdout, stderr) = proc.communicate("/tmp")
        exit_code = proc.wait()
        if exit_code != 0:
            return stderr
        else:
            return stdout


    if __name__ == "__main__":
        print(size_of_tmp())
This last script executes the first one and, as it asks for a directory to check, the new script sends "/tmp" to its standard input.
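The same stdin mechanism can be tried without depending on size.py or du. The sketch below assumes a hypothetical child program (a Python one-liner that echoes its input in upper case) and a made-up helper name, shout.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child program: reads all of stdin and echoes it in upper case.
CHILD = "import sys; print(sys.stdin.read().strip().upper())"


def shout(text):
    """Send `text` to the child's stdin and return what it prints."""
    proc = Popen([sys.executable, "-c", CHILD],
                 stdin=PIPE, stdout=PIPE, universal_newlines=True)
    # communicate() writes the input, closes stdin and reads stdout to EOF.
    stdout, _ = proc.communicate(text)
    return stdout
```

Calling shout("/tmp") returns the child’s output, the text /TMP followed by a newline.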
This was a basic example on Python subprocesses.