Python Subprocess Example

In this article, using Python 3.4, we’ll learn about Python’s subprocess module, which provides a high-level interface to interact with external commands.

This module is intended to replace os.system and similar older modules and functions. It provides all of the functionality of the modules and functions it replaces. The API is consistent for all uses, and operations such as closing files and pipes are built in instead of being handled separately by the application code.

One thing to notice is that, although the API is roughly the same, the underlying implementation differs slightly between Unix and Windows. The examples provided here were tested on a Unix-based system; behavior on a non-Unix OS may vary.

The subprocess module defines a class called Popen and some wrapper functions which take almost the same arguments as its constructor does. The wrapper functions cover the common cases; for more advanced use cases, the Popen interface can be used directly.

We’ll first take a look at all the wrapper functions and see what they do and how to call them. Then we’ll see how to use Popen directly to ease our thirst for knowledge.

1. Convenience Functions

1.1. subprocess.call

The first function we’ll look at is subprocess.call. Its most commonly used arguments are (args, *, stdin=None, stdout=None, stderr=None, shell=False, timeout=None). Beyond these, the function takes roughly as many arguments as Popen does, and it passes almost all of them directly through to that interface. The only exception is timeout, which is passed to Popen.wait(). If the timeout expires, the child process is killed and then waited for again, and the TimeoutExpired exception is re-raised once the child process has terminated.

This function runs the command described by args, waits for the command to complete and returns its exit code (the returncode attribute). Let’s see a trivial example:

$ python3
>>> import subprocess
>>> subprocess.call(["ls", "-l"])
0

As you can see, it returns 0, which is the return code of the ls -l command. You will probably see the output of the command in the console, since it’s sent to standard output, but that output won’t be available within your code when the command is run this way.
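
The timeout behaviour described earlier can be illustrated with a minimal sketch. The helper name call_with_timeout and the use of sys.executable to launch a child Python interpreter are assumptions made for this example; they are not part of the subprocess API.

```python
import subprocess
import sys


def call_with_timeout(cmd, seconds):
    """Run cmd via subprocess.call; return its exit code, or None on timeout.

    Hypothetical helper, for illustration only.
    """
    try:
        return subprocess.call(cmd, timeout=seconds)
    except subprocess.TimeoutExpired:
        # By the time the exception reaches us, call() has already
        # killed the child process and waited for it.
        return None


if __name__ == "__main__":
    # The child sleeps far longer than the timeout we allow.
    sleeper = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(call_with_timeout(sleeper, 0.5))
```

Returning None on timeout is just one possible design; letting TimeoutExpired propagate is equally reasonable when the caller needs to tell a timeout apart from a normal exit.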

Another thing to notice is that the command to run is being passed as a list. Each element reaches the program as a separate argument without being interpreted by a shell, which saves us the trouble of escaping quote marks and other special characters. You can avoid this notation by setting the shell flag to True, which will spawn an intermediate shell process and tell it to run the command, like this:

$ python3
>>> import subprocess
>>> subprocess.call("ls -l", shell=True)
0
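
To check that each list element reaches the program as a single argument, here is a small sketch. It assumes a child Python interpreter started through sys.executable; the child script and the helper count_received_args are made up for this illustration. The child simply exits with the number of arguments it received.

```python
import subprocess
import sys

# The child script exits with the number of arguments it received.
COUNT_ARGS = "import sys; sys.exit(len(sys.argv) - 1)"


def count_received_args(*args):
    """Return how many separate arguments the child process saw."""
    return subprocess.call([sys.executable, "-c", COUNT_ARGS] + list(args))


if __name__ == "__main__":
    # One list element containing spaces and a semicolon is one argument.
    print(count_received_args("a b; c"))
    # Three list elements are three arguments.
    print(count_received_args("a", "b;", "c"))
```

No quoting or escaping was needed for the spaces or the semicolon, because no shell ever parses the list.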

Although the shell=True option might be useful to better exploit the most powerful tools the shell provides us, it can be a security risk, and you should keep this in mind. To use it more safely, you can call shlex.quote() to properly escape whitespace and shell metacharacters in strings that are going to be used to construct shell commands. Let’s see an example of a security issue this can cause:

security-issues.py

import subprocess
import shlex


def unsafe(directory):
    command = "du -sh {}".format(directory)
    subprocess.call(command, shell=True)


def safe(directory):
    sanitized_directory = shlex.quote(directory)
    command = "du -sh {}".format(sanitized_directory)
    subprocess.call(command, shell=True)

if __name__ == "__main__":
    directory = "/dev/null; ls -l /tmp"
    unsafe(directory)
    safe(directory)

Here we have two functions. The first one, unsafe, will run du -sh over the provided directory, without validating or cleaning it. The second one, safe, will apply the same command to a sanitized version of the provided directory. By providing a directory like "/dev/null; ls -l /tmp", we are trying to exploit the vulnerability to see what the contents of /tmp are. Of course, it could get a lot uglier if the injected command were an rm.

The unsafe function will output to stdout a line like 0 /dev/null as the output of the du -sh, and then the contents of /tmp. The safe function will fail, as no directory called "/dev/null; ls -l /tmp" exists on the system.

Now, as we said before, this function returns the exit code of the executed command, which gives us the responsibility of interpreting it as a success or an error code. Here comes the check_call function.
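
To see what shlex.quote() actually does to the malicious input, here is a minimal sketch that only builds the command string and never runs it. The helper name build_du_command is an assumption made for this example.

```python
import shlex


def build_du_command(directory):
    """Build the shell command string the way safe() does, quoting the argument."""
    return "du -sh {}".format(shlex.quote(directory))


if __name__ == "__main__":
    command = build_du_command("/dev/null; ls -l /tmp")
    # The whole input is wrapped in single quotes, so the shell
    # treats it as one word instead of two commands.
    print(command)
    # Splitting with shell-like rules gives back exactly three words.
    print(shlex.split(command))
```

Where the features of the shell are not needed, passing the arguments as a list with the default shell=False avoids the problem altogether.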

1.2. subprocess.check_call

This wrapper function takes the same arguments as subprocess.call, and behaves almost the same. The only difference is that it interprets the exit code for us.

It runs the described command and waits for it to complete. If the return code is zero, it returns; otherwise it raises CalledProcessError. Let’s see:

$ python3
>>> import subprocess
>>> subprocess.check_call("exit 1", shell=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.4/subprocess.py", line 561, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1

The command exit will ask the shell to finish with the given exit code. As it is not zero, CalledProcessError is raised.
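
In application code you would normally catch the exception rather than let it reach the console. The sketch below shows one way to do that; the helper name exit_status and the child command started through sys.executable are assumptions made for this example.

```python
import subprocess
import sys


def exit_status(cmd):
    """Return 0 if cmd succeeds, or the failing exit code reported by check_call."""
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as exc:
        # The exception carries the exit code (exc.returncode)
        # and the command that failed (exc.cmd).
        return exc.returncode
    return 0


if __name__ == "__main__":
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(exit_status(failing))
```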

One way of gaining access to the output of the executed command would be to pass PIPE as the stdout or stderr argument. However, call and check_call never read from those pipes, so the child process will block if it generates enough output to fill up the OS pipe buffer. Since this method is not safe, let’s skip to the safe and easy way.

1.3. subprocess.check_output

This function receives almost the same arguments as the previous ones, but it adds input, and universal_newlines becomes more relevant because the output is returned to us. The input argument is passed to Popen.communicate() and thus to the command’s stdin. When it is used, it should be a byte sequence, or a string if universal_newlines=True.

Another change is in its behaviour, as it returns the output of the command. It also checks the exit code of the command for you, raising CalledProcessError if it’s not 0. Let’s see an example:

$ python3
>>> import subprocess
>>> subprocess.check_output(["echo", "Hello world!"])
b'Hello world!\n'
>>> subprocess.check_output(["echo", "Hello World!"], universal_newlines=True)
'Hello World!\n'
>>> subprocess.check_output(["sed", "-e", "s/foo/bar/"], input=b"when in the course of fooman events\n")
b'when in the course of barman events\n'
>>> subprocess.check_output(["sed", "-e", "s/foo/bar/"], input="when in the course of fooman events\n", universal_newlines=True)
'when in the course of barman events\n'

Here you can see the variations of the two new arguments. The input argument is being passed to the sed command in the third and fourth statements, and universal_newlines determines whether the byte sequences are decoded into strings.

Also, you can capture the standard error output by setting stderr=subprocess.STDOUT, but keep in mind that this will merge stdout and stderr, so you really need to know how to parse all that information into something you can use.
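
As a rough illustration of the merged streams, the following sketch runs a child that writes one line to each stream and captures both in a single string. The child script, started through sys.executable, and the helper name merged_output are assumptions made for this example.

```python
import subprocess
import sys

# A child script that writes one line to stdout and one line to stderr.
CHILD = (
    "import sys\n"
    "print('written to stdout')\n"
    "sys.stdout.flush()\n"
    "print('written to stderr', file=sys.stderr)\n"
)


def merged_output():
    """Return stdout and stderr of the child combined in one string."""
    return subprocess.check_output(
        [sys.executable, "-c", CHILD],
        stderr=subprocess.STDOUT,  # send stderr into the same pipe as stdout
        universal_newlines=True,
    )


if __name__ == "__main__":
    print(merged_output())
```

Nothing in the returned string marks which line came from which stream, which is why parsing the merged output takes some care.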

2. Popen

Now, let’s see how we can use Popen directly. With the explanations above, you should already have an idea of how it works. Let’s see a couple of trivial use cases. We’ll start by rewriting the previous examples using Popen directly.

$ python3
>>> from subprocess import Popen
>>> proc = Popen(["ls", "-l"])
>>> proc.wait()
0

Here, as you can see, Popen is being instantiated passing only the command to run as a list of strings (just as we did before). Popen’s constructor also accepts the shell argument, but it’s still unsafe. The constructor starts executing the command the moment it is instantiated, and calling Popen.wait() waits for the process to finish and returns the exit code.
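
Because the command starts as soon as the object is created, the parent is free to do other work before waiting. The sketch below hints at this using Popen.poll(), a method not covered in this article that returns None while the child is still running. The child command started through sys.executable and the helper name run_in_background are assumptions made for this example.

```python
import sys
from subprocess import Popen


def run_in_background():
    """Start a child, check on it without blocking, then wait for it."""
    # The child starts running as soon as Popen() returns.
    proc = Popen([sys.executable, "-c", "import time; time.sleep(1)"])
    # poll() does not block; it returns None while the child is still running.
    running = proc.poll() is None
    # ... the parent could do other work here ...
    exit_code = proc.wait()
    return running, exit_code


if __name__ == "__main__":
    print(run_in_background())
```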

Now, let’s see an example of shell=True just to see that it’s just the same:

$ python3
>>> from subprocess import Popen
>>> proc = Popen("ls -l", shell=True)
>>> proc.wait()
0

See? It behaves the same as the wrapper functions we saw before. It doesn’t seem so complicated now, does it? Let’s rewrite the security-issues.py example with it:

security-issues-popen.py

from subprocess import Popen
import shlex


def unsafe(directory):
    command = "du -sh {}".format(directory)
    proc = Popen(command, shell=True)
    proc.wait()


def safe(directory):
    sanitized_directory = shlex.quote(directory)
    command = "du -sh {}".format(sanitized_directory)
    proc = Popen(command, shell=True)
    proc.wait()

if __name__ == "__main__":
    directory = "/dev/null; ls -l /tmp"
    unsafe(directory)
    safe(directory)

Now, if we execute this example, we’ll see it behaves the same as the one written with subprocess.call.

Note that Popen’s constructor will not raise any errors if the command’s exit code is not zero. Its wait function will return the exit code and, again, it’s your responsibility to check it and perform any fallback.

So, what if we want to gain access to the output? Well, Popen accepts the stdout and stderr arguments, which are now safe to use, as we are going to read from them. Let’s write a little example which lets us see the size of a file or directory.
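
One way to take on that responsibility is to mimic check_call by hand. The sketch below is only an illustration: the helper name run_or_raise and the child command started through sys.executable are assumptions made for this example.

```python
import sys
from subprocess import CalledProcessError, Popen


def run_or_raise(cmd):
    """Run cmd with Popen and raise CalledProcessError on a non-zero exit code."""
    proc = Popen(cmd)
    exit_code = proc.wait()
    if exit_code != 0:
        # Popen will not do this for us, so we raise the error ourselves.
        raise CalledProcessError(exit_code, cmd)
    return exit_code


if __name__ == "__main__":
    try:
        run_or_raise([sys.executable, "-c", "import sys; sys.exit(2)"])
    except CalledProcessError as exc:
        print("command failed with exit code", exc.returncode)
```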

size.py

from subprocess import Popen, PIPE


def command(directory):
    return ["du", "-sh", directory]


def check_size(directory):
    proc = Popen(command(directory), stdout=PIPE, stderr=PIPE, universal_newlines=True)
    # communicate() reads both pipes to the end and waits for the process to
    # exit, so the child cannot block on a full pipe buffer.
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        return stderr
    return stdout


def main():
    arg = input("directory to check: ")
    result = check_size(directory=arg)
    print(result)

if __name__ == "__main__":
    main()

Here we are instantiating a Popen and setting stdout and stderr to subprocess.PIPE to be able to read the output of our command. We are also setting universal_newlines to True so we can work with strings. Then we call Popen.communicate(), which reads both pipes to the end and waits for the process to finish (notice that you can pass a timeout to it, just as you can to Popen.wait()). Calling Popen.wait() before reading the pipes would risk the blocking problem described earlier if the command produced a lot of output. Finally, we return the standard output of our command if it ran successfully, or the standard error output otherwise, and main prints it.
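
Both Popen.wait() and Popen.communicate() accept a timeout argument. Unlike the wrapper functions, they do not kill the child when the timeout expires, so that is left to us. A minimal sketch of the usual pattern follows; the helper name capture_with_timeout and the child commands started through sys.executable are assumptions made for this example.

```python
import sys
from subprocess import PIPE, Popen, TimeoutExpired


def capture_with_timeout(cmd, seconds):
    """Return (exit_code, stdout, stderr), killing the child on timeout."""
    proc = Popen(cmd, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    try:
        stdout, stderr = proc.communicate(timeout=seconds)
    except TimeoutExpired:
        # The child is not killed automatically here, so we do it ourselves
        # and then collect whatever output it produced.
        proc.kill()
        stdout, stderr = proc.communicate()
    return proc.returncode, stdout, stderr


if __name__ == "__main__":
    print(capture_with_timeout([sys.executable, "-c", "print('hi')"], 60))
```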

Now, the last scenario we are missing is sending data to the standard input of our command. Let’s write a script that interacts with our size.py, sending a hardcoded directory to its stdin:

size-of-tmp.py

from subprocess import Popen, PIPE


def size_of_tmp():
    proc = Popen(["python3", "size.py"], stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    (stdout, stderr) = proc.communicate("/tmp")
    exit_code = proc.wait()
    if exit_code != 0:
        return stderr
    else:
        return stdout

if __name__ == "__main__":
    print(size_of_tmp())

Now, this last script will execute the first one, and as it asks for a directory to check, the new script will send "/tmp" to its standard input.
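
The same pattern works without any helper script on disk. The following sketch assumes a child Python interpreter started through sys.executable; the one-line child script and the helper name shout are made up for this example. The child reads its standard input and writes it back in upper case.

```python
import sys
from subprocess import PIPE, Popen

# The child reads all of stdin and writes it back in upper case.
UPPERCASE = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def shout(text):
    """Send text to the child's stdin and return what it writes to stdout."""
    proc = Popen(
        [sys.executable, "-c", UPPERCASE],
        stdin=PIPE,
        stdout=PIPE,
        universal_newlines=True,
    )
    # communicate() writes the input, closes stdin, and reads stdout to the end.
    stdout, _ = proc.communicate(text)
    return stdout


if __name__ == "__main__":
    print(shout("hello"))
```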

3. Download the Code Project

This was a basic example of Python subprocesses.

Download
You can download the full source code of this example here: python-subprocess

Sebastian Vinci

Sebastian is a full stack programmer who has strong experience in Java and Scala enterprise web applications. He is currently studying Computer Science at UBA (University of Buenos Aires) and working a full-time job at a .com company as a Semi-Senior developer, involving architectural design, implementation and monitoring. He has also worked on automating processes (such as database backups, building, deploying and monitoring applications).