Python - Argparse

Author(s)	Helena Rasche
Editor(s)	Bazante Sanders
Tester(s)	Donny Vrins

Overview
Questions:

How do I make a proper command line script?

How do I use argparse?

What problems does it solve?

Objectives:

Learn how sys.argv works

Write a simple command line program that sums some numbers

Use argparse to make it nicer.

Requirements:

Time estimation: 30 minutes

Level: Intermediate

Supporting Materials:

Available on these Galaxies

Possibly Working

UseGalaxy.eu

UseGalaxy.org

UseGalaxy.org.au

UseGalaxy.fr

Published: Apr 25, 2022

Last modification: Jan 24, 2023

License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT

PURL: https://gxy.io/GTN:T00082

Revision: 5

argparse is an argument parsing library for Python that’s part of the stdlib. It lets you make command line tools significantly nicer to work with.

Agenda

In this tutorial, we will cover:

sys.argv

Simple tasks

Argparse

Using argparse

Why Argparse?

Unlike previous modules, this lesson won’t use a Jupyter/CoCalc notebook, and that’s because we’ll be parsing command lines! You’ll need to open a code editor on your platform of choice (nano, vim, emacs, VSCode are all options) and use the following blocks of code to construct your command line tool.

`sys.argv`

Whenever you run a Python script on the command line, it has access to a special variable named argv, found in the sys module. This is a list of all of the arguments passed on the command line when the script was run.

Hands On: Print out argv

Create / open the file run.py in your text editor of choice

There we’ll create a simple Python script that:

imports sys, the system module needed to access argv.

Prints out sys.argv

import sys

print(sys.argv)

Run this with different command line arguments:

python run.py
python run.py 1 2 3 4
python run.py --help

Question

What did you notice about the output? There are two main points.

The name of the script (run.py) is included as the first value every time.

All of the arguments are passed as strings, even the ones that look like numbers.
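To see the second point concretely, here is a small sketch that reports the type of each argument. The helper name describe_args is hypothetical, and it takes a plain list so that it can be tried without a real command line.

```python
import sys


def describe_args(argv):
    """Return (value, type name) pairs for every argument after the script name."""
    # argv[0] is the script name, so skip it
    return [(arg, type(arg).__name__) for arg in argv[1:]]


# Simulates `python run.py 1 2 3 4`; pass sys.argv instead to inspect a real run
for value, kind in describe_args(["run.py", "1", "2", "3", "4"]):
    print(f"{value!r} is a {kind}")
```

Every value is reported as a str, which is why the next exercise has to convert each argument before adding it up.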

Simple tasks

Let’s sum up all of the numbers passed on the command line. We’ll do this by hand, and then we’ll replace it with argparse to see how much effort that saves us.

Hands On

Update your script to sum up every number passed to it on the command line.

It should handle:

1 or more numbers

nothing (and maybe print out a message?)

invalid values (print out an error message that the value couldn’t be processed.)

Hints:

Skip the program name

Use try and except to try converting the string to a number.

Question

How does your updated script look?
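import sys

result = 0

if len(sys.argv) == 1:
    print("no arguments were supplied")
else:
    # Skip sys.argv[0], which is the name of the script
    for arg in sys.argv[1:]:
        try:
            result += float(arg)
        except ValueError:
            # float() raises ValueError for strings that are not numbers
            print(f"Could not parse {arg}")

    print(result)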

Argparse

Argparse saves us a lot of work, because it can handle a number of things for us!

Ensuring that the correct number of arguments is provided (and printing a nice error message otherwise)
Ensuring that the correct types of arguments are provided (no strings for a number field)
Providing a help message describing your program

Argparse is used as follows. First we need to import it:

import argparse

And then we can define a ‘parser’ which will parse our command line. Additionally we can provide a description field which tells people what our tool does:

parser = argparse.ArgumentParser(description='Process some integers.')

And finally we can define some arguments that are available. Just like we have arguments to functions, we have arguments to command lines. These come in two flavours:

positional arguments, which are required by default (no -- prefix)
optional “flags” (prefixed with --)

Here we have an argument named ‘integer’, which takes a single value and validates that it is of the type int, and an argument named ‘many_integers’, which does the same for a list of values. nargs is the number of arguments, + means ‘1 or more’. And we have some help text as well:

parser.add_argument('integer', type=int, help='an integer parameter')
parser.add_argument('many_integers', type=int, nargs='+', help='one or more integer parameters')

We can also define an optional flag, here it’s called --sum. We use store_true, which sets it to True if the flag is used and to False otherwise.

parser.add_argument('--sum', action='store_true', help='Should we sum up the integers?')

Finally we parse the arguments, which reads sys.argv and processes it according to the above rules. The output is stored in args.

args = parser.parse_args()

We have three variables we can use now:

args.integer # A single integer
args.many_integers # A list of ints
args.sum # A boolean, True or False.
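Putting the pieces above together, a minimal sketch of the whole parser might look like this. The build_parser helper is a hypothetical name, and an explicit list is passed to parse_args() only so the example can be run without a command line; called with no arguments, parse_args() reads sys.argv as described above.

```python
import argparse


def build_parser():
    """Assemble the parser described step by step above."""
    parser = argparse.ArgumentParser(description='Process some integers.')
    parser.add_argument('integer', type=int, help='an integer parameter')
    parser.add_argument('many_integers', type=int, nargs='+',
                        help='one or more integer parameters')
    parser.add_argument('--sum', action='store_true',
                        help='Should we sum up the integers?')
    return parser


# Equivalent to running: python run.py 1 2 3 4 --sum
args = build_parser().parse_args(['1', '2', '3', '4', '--sum'])

print(args.integer)        # 1
print(args.many_integers)  # [2, 3, 4]
print(args.sum)            # True
```

The first value fills the single 'integer' argument and the remaining values are collected into the 'many_integers' list.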

Using argparse

Let’s go back to our script, and replace sys with argparse.

Hands On: Replacing argv.

Given the following script, replace the use of argv with argparse.

import sys

result = 0

if len(sys.argv) == 1:
    print("no arguments were supplied")
else:
    for arg in sys.argv[1:]:
        try:
            result += float(arg)
        except ValueError:
            print(f"Could not parse {arg}")

    print(result)

You should have one argument that accepts one or more numbers (type=float, nargs='+'); the solution below names it integers.

And print out the sum of those numbers.

Question

How does your final script look?

import argparse

parser = argparse.ArgumentParser(description='Sum some numbers')
parser.add_argument('integers', type=float, nargs='+',
                    help='a number to sum up.')
args = parser.parse_args()

print(sum(args.integers))

Try running the script with various values

python run.py
python run.py 1 3 5
python run.py 2 4 O
python run.py --help

Wow, that’s a lot simpler! We have to learn how argparse is invoked, but it handles a lot of cases for us:

No arguments provided
Responding to --help
Raising an error for invalid values
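These three cases can also be observed from within Python. argparse reports a usage error by printing a message to stderr and exiting with status 2, while --help prints the help text and exits with status 0; both surface as a SystemExit exception. In the sketch below, exit_code is a hypothetical helper that captures that status instead of letting the script end.

```python
import argparse


def exit_code(argv):
    """Parse argv and return the status argparse would exit with (0 on success)."""
    parser = argparse.ArgumentParser(description='Sum some numbers')
    parser.add_argument('integers', type=float, nargs='+',
                        help='a number to sum up.')
    try:
        parser.parse_args(argv)
    except SystemExit as exc:
        # argparse calls sys.exit() for both --help and usage errors
        return exc.code
    return 0


print(exit_code(['1', '3', '5']))  # valid input
print(exit_code([]))               # no arguments provided
print(exit_code(['2', '4', 'O']))  # 'O' is a letter, not a number
print(exit_code(['--help']))       # help requested
```

Run as written, this prints 0, 2, 2 and 0, interleaved with the usage and help messages that argparse emits itself.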

--help is even written for us, without us writing any special code to handle that case! This is why you need to use argparse:

It handles a lot of cases and input validation for you
It produces a nice --help text that can help you if you’ve forgotten what your tool does
It’s nice for users of your scripts! They don’t have to read the code to know how it behaves if you document it well.
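To look at the generated help without leaving Python, the parser's format_help() method returns the same text that --help prints. This is a small sketch; prog='run.py' is set explicitly as an assumption about the script's file name, so that the usage line does not depend on how the example is executed.

```python
import argparse

parser = argparse.ArgumentParser(prog='run.py', description='Sum some numbers')
parser.add_argument('integers', type=float, nargs='+',
                    help='a number to sum up.')

# format_help() returns the text that `python run.py --help` would print
help_text = parser.format_help()
print(help_text)
```

The description and the per-argument help strings are the only documentation we wrote; the usage line and the -h/--help entry are added by argparse.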

There is a lot of documentation for the argparse module, covering all sorts of use cases!

Why Argparse?

Using argparse can be a big change to your tool but there are some benefits to using it!

Standardised interface to your tool that’s familiar to everyone who uses command line tools
Automatic Help page
Automatic Galaxy Tools?

Generating Automatic Galaxy Tools (Optional)

With the argparse2tool project, and eventually pyGalGen which will be merged into planemo, you can generate Galaxy tools automatically from argparse-based Python scripts.

Hands On: Generate a Galaxy tool wrapper from your script

Write out the Python script to a file named main.py

import argparse

parser = argparse.ArgumentParser(description='Sum some numbers')
parser.add_argument('integers', type=float, nargs='+',
                    help='a number to sum up.')
args = parser.parse_args()

print(sum(args.integers))

Create a virtual environment, just in case:
```
python -m venv .venv
. .venv/bin/activate
```
Install argparse2tool via pip:
```
pip install argparse2tool
```

Generate the tool interface:

Code In: Command

PYTHONPATH=$(argparse2tool) python main.py --generate_galaxy_xml

Code Out: Galaxy XML

<tool name="main.py" id="main.py" version="1.0">
  <description>Sum some numbers</description>
  <stdio>
    <exit_code range="1:" level="fatal"/>
  </stdio>
  <version_command><![CDATA[python main.py --version]]></version_command>
  <command><![CDATA[python main.py
#set repeat_var_1 = '" "'.join([ str($var.integers) for $var in $repeat_1 ])
"$repeat_var_1"

> $default]]></command>
  <inputs>
    <repeat title="repeat_title" min="1" name="repeat_1">
      <param label="a number to sum up." value="0" type="float" name="integers"/>
    </repeat>
  </inputs>
  <outputs>
    <data name="default" format="txt" hidden="false"/>
  </outputs>
  <help><![CDATA[TODO: Write help]]></help>
</tool>

You've Finished the Tutorial

Key points

If you are writing a command line script, no matter how small, use argparse.

--help is even written for us, without us writing any special code to handle that case

It handles a lot of cases and input validation for you

It produces a nice --help text that can help you if you’ve forgotten what your tool does

It’s nice for users of your scripts! They don’t have to read the code to know how it behaves if you document it well.


Citing this Tutorial

Helena Rasche, Python - Argparse (Galaxy Training Materials). https://training.galaxyproject.org/training-material/topics/data-science/tutorials/python-argparse/tutorial.html Online; accessed TODAY
Hiltemann, Saskia, Rasche, Helena et al., 2023 Galaxy Training: A Powerful Framework for Teaching! PLOS Computational Biology 10.1371/journal.pcbi.1010752
Batut et al., 2018 Community-Driven Data Analysis Training for Biology Cell Systems 10.1016/j.cels.2018.05.012


Funding

These individuals or organisations provided funding support for the development of this resource

Avans

Congratulations on successfully completing this tutorial!
