{ "metadata": { }, "nbformat": 4, "nbformat_minor": 5, "cells": [ { "id": "metadata", "cell_type": "markdown", "source": "
In this lesson, we will use Python 3 with some of its most popular scientific libraries. We will work in JupyterNotebook, an interactive environment for running Python code that comes with everything we need for the lesson.
\n\n\nComment\nThis tutorial is largely based on the Carpentries lessons Programming with Python and Plotting and Programming in Python, which are licensed CC-BY 4.0.
\nAdaptations have been made to make this work better in a GTN/Galaxy environment.
\n
\n\nAgenda\nIn this tutorial, we will cover:
\n\n
\n- Overview
\n- Python Fundamentals
\n\n
\n- Variables
\n
Python was created by Guido van Rossum, who started implementing it in 1989. Its syntax is simple, so even if you are new to programming you can start learning Python without much difficulty.
\nFeatures of Python language:
\nReadable: Python is a very readable language.
\nEasy to Learn: Python is an expressive, high-level programming language, which makes it easy to understand and therefore easy to learn.
\nCross platform: Python is available and can run on various operating systems such as macOS, Windows, Linux, and Unix. This makes it a cross-platform and portable language.
\nOpen Source: Python is an open source programming language.
\nLarge standard library: Python comes with a large standard library of modules and functions that we can use while writing code in Python.
\nFree: Python is free to download and use. This means you can download it for free and use it in your application.
\nSupports exception handling: Python supports exception handling, which lets us write less error-prone code and deal with situations that would otherwise stop the program.
\nAutomatic memory management: Python supports automatic memory management, which means memory is cleared and freed automatically. You do not have to free memory yourself.
\n\n\n\nJupyterLab is a user interface that includes notebooks. A user can open several notebooks or files as tabs in the same window, as in an IDE. JupyterNotebook is a web-based interactive computational environment for creating Jupyter notebook documents. It supports several languages, such as Python (IPython), Julia, and R, and is largely used for data analysis, data visualization, and other interactive, exploratory computing.
\nJupyterNotebook has several advantages:
\n\n
\n- You can easily type, edit, and copy and paste blocks of code.
\n- Tab completion allows you to easily access the names of things you are using and learn more about them.
\n- It allows you to annotate your code with links, different sized text, bullets, etc. to make it more accessible to you and your collaborators.
\n- It allows you to display figures next to the code that produces them to tell a complete story of the analysis.
\nEach notebook contains one or more cells that contain code, text, or images. Each notebook can be exported (File, Export as, Executable script) as a Python script that can be run from the command line.
\nWe will be using JupyterNotebook in Galaxy, so you need to save the notebook in the workspace frequently. This is good practice and also protects you if you accidentally close the browser: your environment will keep running and will contain the last saved version of your notebook. Furthermore, you need to download a notebook before you delete or close it in your history, or you will lose it.
\n
Any Python interpreter can be used as a calculator:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-1", "source": [ "3 + 5 * 4" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">This is great but not very interesting. To do anything useful with data, we need to assign its value to a variable. In Python, we can assign a value to a variable, using the equals sign =
. For example, we can track the weight of a patient who weighs 60 kilograms by assigning the value 60 to a variable weight_kg
:
From now on, whenever we use weight_kg
, Python will substitute the value we assigned to it.
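A minimal sketch of the assignment described above, using the variable name weight_kg and the value 60 from the text:

```python
# Assign the value 60 to the variable weight_kg
weight_kg = 60

# Python substitutes the stored value wherever the name is used
print(weight_kg)
```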
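In Python, variable names:
\n- can include letters, digits, and underscores
\n- cannot start with a digit
\n- are case sensitive
\nThis means that, for example:
\n- weight0 is a valid variable name, whereas 0weight is not
\n- weight and Weight are different variables
\nPython knows various types of data. Three common ones are:
\n- integer numbers
\n- floating point numbers
\n- strings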
\nIn the example above, variable weight_kg
has an integer value of 60
. If we want to more precisely track the weight of our patient, we can use a floating point value by executing:
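As a small sketch, the assignment could look like the following. The value 60.3 is an assumed example chosen for illustration, not a figure taken from the text:

```python
# Replace the integer value with a floating point value
# (60.3 is an illustrative number chosen for this sketch)
weight_kg = 60.3

# type() reports the data type of the value the variable holds
print(weight_kg, type(weight_kg))
```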
To create a string, we add single or double quotes around some text. To identify and track a patient throughout our study, we can assign each person a unique identifier by storing it in a string:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-7", "source": [ "patient_id = '001'" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Once we have data stored with variable names, we can make use of it in calculations. We may want to store our patient’s weight in pounds as well as kilograms:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-9", "source": [ "weight_lb = 2.2 * weight_kg" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">We might decide to add a prefix to our patient identifier:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-11", "source": [ "patient_id = 'inflam_' + patient_id" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Be aware that it is the order of execution of cells that is important in a Jupyter notebook, not the order in which they appear. Python will remember all the code that was run previously, including any variables you have defined, irrespective of the order in the notebook. Therefore if you define variables lower down the notebook and then (re)run cells further up, those defined further down will still be present. As an example, create two cells with the following content, in this order:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-13", "source": [ "print(myval)\n", "myval = 1" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">If you execute this in order, the first cell will give an error. However, if you run the first cell after the second cell, it will print out 1. To prevent confusion, it can be helpful to use the Kernel -> Restart & Run All option which clears the interpreter and runs everything from a clean slate going top to bottom.
\n\n\nQuestion: Variables and values\nWhat values do the variables mass and age have after each of the following statements? Test your answer by executing the lines.\n\nmass = 47.5\nage = 122\nmass = mass * 2.0\nage = age - 20\n
\n👁 View solution
\n\n\n
- mass holds a value of 47.5, age does not exist
\n- mass still holds a value of 47.5, age holds a value of 122
\n- mass now has a value of 95.0, age’s value is still 122
\n- mass still has a value of 95.0, age now holds 102
\n\n\nQuestion: Variables\nPython allows you to assign multiple values to multiple variables in one line by separating the variables and values with commas. What does the following program print out?
\n\nfirst, second = 'Grace', 'Hopper'\nthird, fourth = second, first\nprint(third, fourth)\n
👁 View solution
\n\nHopper Grace
\n
\n\n\nQuestion: Variables and data types\nWhat are the data types of the following variables?
\n\nplanet = 'Earth'\napples = 5\ndistance = 10.5\n
👁 View solution
\n\n\ntype(planet)\ntype(apples)\ntype(distance)\n
Lists are built into the language so we do not have to load a library to use them. We create a list by putting values inside square brackets and separating the values with commas:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-15", "source": [ "odds = [1, 3, 5, 7]\n", "print('odds are:', odds)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">We can access elements of a list using indices – numbered positions of elements in the list. These positions are numbered starting at 0, so the first element has an index of 0.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-17", "source": [ "print('first element:', odds[0])\n", "print('last element:', odds[3])\n", "print('\"-1\" element:', odds[-1])" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Yes, we can use negative numbers as indices in Python. When we do so, the index -1
gives us the last element in the list, -2
the second to last, and so on. Because of this, odds[3]
and odds[-1]
point to the same element here.
There is one important difference between lists and strings: we can change the values in a list, but we cannot change individual characters in a string. For example:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-19", "source": [ "names = ['Curie', 'Darwing', 'Turing'] # typo in Darwin's name\n", "print('names is originally:', names)\n", "names[1] = 'Darwin' # correct the name\n", "print('final value of names:', names)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">works, but the following does not:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-21", "source": [ "name = 'Darwin'\n", "name[0] = 'd'" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Lists in Python can contain elements of different types. Example:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-23", "source": [ "sample_ages = [10, 12.5, 'Unknown']" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">There are many ways to change the contents of lists besides assigning new values to individual elements:
\nWe can append new elements to a list
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-25", "source": [ "odds.append(11)\n", "print('odds after adding a value:', odds)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">We can remove an element from a list with pop; pop(0) removes the first element, while pop() with no argument removes the last one
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-27", "source": [ "removed_element = odds.pop(0)\n", "print('odds after removing the first element:', odds)\n", "print('removed_element:', removed_element)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Or we can reverse the list
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-29", "source": [ "odds.reverse()\n", "print('odds after reversing:', odds)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Data which can be modified in place is called mutable, while data which cannot be modified is called immutable. Strings and numbers are immutable. This does not mean that variables with string or number values are constants, but when we want to change the value of a string or number variable, we can only replace the old value with a completely new value.
\nLists and arrays, on the other hand, are mutable: we can modify them after they have been created. We can change individual elements, append new elements, or reorder the whole list. For some operations, like sorting, we can choose whether to use a function that modifies the data in-place or a function that returns a modified copy and leaves the original unchanged.
\nBe careful when modifying data in-place. If two variables refer to the same list, and you modify the list value, it will change for both variables!
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-31", "source": [ "salsa = ['peppers', 'onions', 'cilantro', 'tomatoes']\n", "my_salsa = salsa # <-- my_salsa and salsa point to the *same* list data in memory\n", "salsa[0] = 'hot peppers'\n", "print('Ingredients in my salsa:', my_salsa)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">If you want variables with mutable values to be independent, you must make a copy of the value when you assign it.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-33", "source": [ "salsa = ['peppers', 'onions', 'cilantro', 'tomatoes']\n", "my_salsa = salsa.copy() # <-- makes a *copy* of the list\n", "salsa[0] = 'hot peppers'\n", "print('Ingredients in my salsa:', my_salsa)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Because of pitfalls like this, code which modifies data in place can be more difficult to understand. However, it is often far more efficient to modify a large data structure in place than to create a modified copy for every small change. You should consider both of these aspects when writing your code.
\nSince a list can contain any Python values, it can even contain other lists.
\nFor example, we could represent the products on the shelves of a small grocery shop:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-35", "source": [ "x = [['pepper', 'zucchini', 'onion'],\n", " ['cabbage', 'lettuce', 'garlic'],\n", " ['apple', 'pear', 'banana']]" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Here is an example of how indexing a list of lists x
works:
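A short sketch of indexing at increasing depth. It redefines the grocery list x from above so that the block runs on its own; the names first_shelf, first_item, and wrapped_shelf are introduced only for this illustration:

```python
x = [['pepper', 'zucchini', 'onion'],
     ['cabbage', 'lettuce', 'garlic'],
     ['apple', 'pear', 'banana']]

# One index selects a whole inner list (one shelf)
first_shelf = x[0]
print(first_shelf)

# Two indices select a single item: the shelf first, then the position on it
first_item = x[0][0]
print(first_item)

# Wrapping x[0] in brackets builds a new one-element list containing the shelf
wrapped_shelf = [x[0]]
print(wrapped_shelf)
```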
Subsets of lists can be accessed by specifying ranges of values in brackets. This is commonly referred to as “slicing” the list.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-39", "source": [ "chromosomes = ['X', 'Y', '2', '3', '4']\n", "autosomes = chromosomes[2:5]\n", "print('autosomes:', autosomes)\n", "last = chromosomes[-1]\n", "print('last:', last)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">\n\nQuestion: Negative indices\nUse slicing to access only the last four characters of a string or entries of a list.
\n\nstring_for_slicing = 'Observation date: 02-Feb-2013'\nlist_for_slicing = [['fluorine', 'F'], ['chlorine', 'Cl'], ['bromine', 'Br'], ['iodine', 'I'], ['astatine', 'At']]\n
Your output should be:\n‘2013’\n[[‘chlorine’, ‘Cl’], [‘bromine’, ‘Br’], [‘iodine’, ‘I’], [‘astatine’, ‘At’]]
\nWould your solution work regardless of whether you knew beforehand the length of the string or list (e.g. if you wanted to apply the solution to a set of lists of different lengths)? If not, try to change your approach to make it more robust.\nHint: Remember that indices can be negative as well as positive.
\n\n👁 View solution
\n\nUse negative indices to count elements from the end of a container (such as list or string):
\n\nstring_for_slicing[-4:]\nlist_for_slicing[-4:]\n
\n\n\nQuestion: Slicing\nSo far we’ve seen how to use slicing to take single blocks of successive entries from a sequence. But what if we want to take a subset of entries that aren’t next to each other in the sequence?
\nYou can achieve this by providing a third argument to the range within the brackets, called the step size. The example below shows how you can take every third entry in a list:
\n\nprimes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]\nsubset = primes[0:12:3]\nprint('subset', subset)\n
Notice that the slice taken begins with the first entry in the range, followed by entries taken at equally-spaced intervals (the steps) thereafter. What if you wanted to begin the subset with the third entry? Use the previous example to write your solution that gives the following output.\n\nsubset [5, 13, 23, 37]\n
\n👁 View solution
\n\nYou would need to specify that as the starting point of the sliced range:
\n\nsubset = primes[2:12:3]\nprint('subset', subset)\n
The characters (individual letters, numbers, and so on) in a string are ordered. For example, the string ‘AB’ is not the same as ‘BA’. Because of this ordering, we can treat the string as a list of characters. Each position in the string (first, second, etc.) is given a number. This number is called an index or sometimes a subscript. Indices are numbered from 0. You can use the position’s index in square brackets to get the character at that position.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-41", "source": [ "atom_name = 'helium'\n", "print(atom_name[0])" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">A part of a string is called a substring. A substring can be as short as a single character. An item in a list is called an element. Whenever we treat a string as if it were a list, the string’s elements are its individual characters. A slice is a part of a string (or, more generally, any list-like thing). We take a slice by using [start:stop], where start is replaced with the index of the first element we want and stop is replaced with the index of the element just after the last element we want. Mathematically, you might say that a slice selects [start:stop). The difference between stop and start is the slice’s length. Taking a slice does not change the contents of the original string. Instead, the slice is a copy of part of the original string.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-43", "source": [ "atom_name = 'sodium'\n", "print(atom_name[0:3])" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">You can use the built-in function len
to find the length of a string.
Nested functions are evaluated from the inside out, like in mathematics.
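A brief sketch of both points, assuming the string 'sodium' from the earlier slicing example; the name name_length is introduced only for this illustration:

```python
atom_name = 'sodium'

# len returns the number of characters in the string
name_length = len(atom_name)
print(name_length)

# Nested calls run from the inside out: len(atom_name) is evaluated first,
# then str() converts the result, and finally print() displays it
print(str(len(atom_name)) + ' characters')
```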
\n\n\nQuestion: Slicing\nUse what you have learnt in this tutorial to answer the following questions:\na. What does the following program print?
\n\natom_name = 'carbon'\nprint('atom_name[1:3] is:', atom_name[1:3])\n
b. What does thing[low:high] do?\nc. What does thing[low:] (without a value after the colon) do?\nd. What does thing[:high] (without a value before the colon) do?\ne. What does thing[:] (just a colon) do?\nf. What does thing[number:some-negative-number] do?\ng. What happens when you choose a high value which is out of range? (i.e., try atom_name[0:15])\n👁 View solution
\n\na. atom_name[1:3] is: ar\nb. thing[low:high] returns a slice from low to the value before high\nc. thing[low:] returns a slice from low all the way to the end of thing\nd. thing[:high] returns a slice from the beginning of thing to the value before high\ne. thing[:] returns all of thing\nf. thing[number:some-negative-number] returns a slice from number to some-negative-number values from the end of thing\ng. If a part of the slice is out of range, the operation does not fail. atom_name[0:15] gives the same result as atom_name[0:].
“Adding” character strings concatenates them.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-47", "source": [ "full_name = 'Ahmed' + ' ' + 'Walsh'\n", "print(full_name)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Multiplying a character string by an integer “N” creates a new string that consists of that character string repeated N times.
\nThis follows from multiplication being repeated addition.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-49", "source": [ "separator = '=' * 10\n", "print(separator)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">The same rules apply for lists. Consider the following example:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-51", "source": [ "counts = [2, 4, 6, 8, 10]\n", "repeats = counts * 2\n", "print(repeats)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">It’s equivalent to:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-53", "source": [ "counts + counts" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Strings have a length (but numbers don’t).The built-in function len
counts the number of characters in a string.
But numbers don’t have a length (not even zero). For example, the following command returns an error message.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-57", "source": [ "print(len(52))" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Python automatically converts integers to floats when needed, but you must convert numbers to strings or vice versa when operating on them.\nAdding a number and a string is not allowed. For example print(1 + '2')
is ambiguous: should 1 + '2'
be 3
or '12'
?
Some types can be converted to other types by using the type name as a function.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-59", "source": [ "print(1 + int('2'))\n", "print(str(1) + '2')" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">\n\nQuestion: Data types\nWhat type of value is 3.25 + 4?
\n\n👁 View solution
\n\nIt is a float: integers are automatically converted to floats as necessary.
\n
\n\n\nQuestion: Division\nIn Python 3, the // operator performs integer (whole-number) floor division, the / operator performs floating-point division, and the % (or modulo) operator calculates and returns the remainder from integer division:\n\nInput: Python\n\nprint('5 // 3:', 5 // 3)\nprint('5 / 3:', 5 / 3)\nprint('5 % 3:', 5 % 3)\n
\n\nOutput\n\n5 // 3: 1\n5 / 3: 1.6666666666666667\n5 % 3: 2\n
If num_subjects is the number of subjects taking part in a study, and num_per_survey is the number that can take part in a single survey, write an expression that calculates the number of surveys needed to reach everyone once.\n👁 View solution
\n\nWe want the minimum number of surveys that reaches everyone once, which is the rounded up value of num_subjects / num_per_survey. This is equivalent to performing a floor division with // and adding 1. Before the division we need to subtract 1 from the number of subjects to deal with the case where num_subjects is evenly divisible by num_per_survey.\n\nnum_subjects = 600\nnum_per_survey = 42\nnum_surveys = (num_subjects - 1) // num_per_survey + 1\n\nprint(num_subjects, 'subjects,', num_per_survey, 'per survey:', num_surveys)\n
\n\n\nQuestion: Typecasting\nWhere reasonable, float() will convert a string to a floating point number, and int() will convert a floating point number to an integer:\n\nInput: Python\n\nprint(\"string to float:\", float(\"3.4\"))\nprint(\"float to int:\", int(3.4))\n
\n\nOutput\n\nstring to float: 3.4\nfloat to int: 3\n
If the conversion doesn’t make sense, however, an error message will occur.\nGiven this information, what do you expect the following program to do? What does it actually do? Why do you think it does that?
\n\nprint(\"fractional string to int:\", int(\"3.4\"))\n
👁 View solution
\n\nPython 3 raises a ValueError, because int() will not convert a string that contains a decimal point in a single step. If you need two consecutive typecasts, you must write both conversions explicitly in code.
\n\nint(float(\"3.4\"))\n
\n\n\nQuestion: Typecasting\nWhich of the following will return the floating point number 2.0? Note: there may be more than one right answer.\n\nfirst = 1.0\nsecond = \"1\"\nthird = \"1.1\"\n
\n
\n1. first + float(second)
\n2. float(second) + float(third)
\n3. first + int(third)
\n4. first + int(float(third))
\n5. int(first) + int(float(third))
\n6. 2.0 * second
\n👁 View solution
\n\nAnswer: 1 and 4
\n
\n\n\nQuestion: Imaginary numbers\nPython provides complex numbers, which are written as 1.0+2.0j. If val is a complex number, its real and imaginary parts can be accessed using dot notation as val.real and val.imag.\n\ncomplex = 6 + 2j\nprint(complex.real)\nprint(complex.imag)\n
Output:\n6.0\n2.0
\n\n
\n- Why do you think Python uses j instead of i for the imaginary part?
\n- What do you expect 1+2j + 3 to produce?
\n- What do you expect 4j to be? What about 4 j or 4 + j?
\n👁 View solution
\n\n\n
\n- Standard mathematics treatments typically use i to denote an imaginary number. However, j is a long-standing convention from electrical engineering, and changing it now would be technically expensive.
\n- (4+2j)
\n- 4j, and SyntaxError: invalid syntax for 4 j. In the last case, 4 + j, j is treated as a variable, so the result depends on whether j is defined and, if so, on its assigned value.
To carry out common tasks with data and variables in Python, the language provides us with several built-in functions. To display information to the screen, we use the print
function:
When we want to make use of a function, referred to as calling the function, we follow its name by parentheses. The parentheses are important: if you leave them off, the function doesn’t actually run! Sometimes you will include values or variables inside the parentheses for the function to use. In the case of print
, we use the parentheses to tell the function what value we want to display. print
automatically puts a single space between outputs to separate them and wraps around to a new line at the end.
We can display multiple things at once using only one print
call:
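A minimal sketch of such a call. The values of patient_id and weight_kg are assumed here so that the block runs on its own:

```python
# Values assumed for this sketch, in the spirit of the earlier examples
patient_id = 'inflam_001'
weight_kg = 60.3

# print separates its arguments with a single space
print(patient_id, 'weight in kilograms:', weight_kg)
```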
We can also call a function inside of another function call. For example, Python has a built-in function called type
that tells you a value’s data type:
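For instance, a sketch with assumed example values, where type is called inside print:

```python
# Example values assumed for this sketch
weight_kg = 60.3
patient_id = 'inflam_001'

# The inner call to type() runs first; print() then displays its result
print(type(weight_kg))
print(type(patient_id))
```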
Moreover, we can do arithmetic with variables right inside the print
function:
The above command, however, did not change the value of weight_kg
:
To change the value of the weight_kg
variable, we have to assign weight_kg
a new value using the equals =
sign:
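The three steps just described (computing inside print, checking that the variable is unchanged, then reassigning it) can be sketched together. The values 60.3 and 65.0 and the name unchanged_value are assumptions made for illustration:

```python
weight_kg = 60.3  # assumed starting value for this sketch

# Arithmetic inside print computes a result without storing it
print('weight in pounds:', 2.2 * weight_kg)

# weight_kg itself is unchanged by the calculation above
print(weight_kg)
unchanged_value = weight_kg

# Only an assignment with = gives the variable a new value
weight_kg = 65.0
print('weight in kilograms is now:', weight_kg)
```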
A function may take zero or more arguments. An argument is a value passed into a function. You must always use parentheses, even if they’re empty, so that Python knows a function is being called.
\nEvery function call produces some result. If the function doesn’t have a useful result to return, it usually returns the special value None
. None
is a Python object that stands in anytime there is no value.
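A small sketch of this behaviour, using print as a function that has no useful result to return:

```python
# print displays its argument but has no useful result to hand back
result = print('example')

# The value stored in result is therefore None
print('result of print is', result)
```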
Commonly-used built-in functions include max
, min
, and round
. max
and min
work on character strings as well as numbers. To decide which character is “larger” or “smaller”, they compare characters in the order 0-9, A-Z, a-z.
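As a sketch of both uses, with example arguments chosen for illustration:

```python
# With numbers, max and min pick the largest and smallest value
largest = max(1, 2, 3)
smallest = min(1, 2, 3)
print(largest, smallest)

# With a string, the characters are compared: digits sort before
# uppercase letters, which sort before lowercase letters
largest_char = max('a0A')
smallest_char = min('a0A')
print(largest_char, smallest_char)
```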
Functions may have default values for some arguments. round
will round off a floating-point number. By default, it rounds to zero decimal places.
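A minimal sketch of the default behaviour, using the same number as the example that follows:

```python
# With a single argument, round uses its default of zero decimal places
rounded = round(3.712)
print(rounded)
```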
We can specify the number of decimal places we want.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-79", "source": [ "round(3.712, 1)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Methods have parentheses like functions, but come after the variable. Some methods are used for internal Python operations, and are marked with double underscores.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-81", "source": [ "my_string = 'Hello world!' # creation of a string object\n", "\n", "print(len(my_string)) # the len function takes a string as an argument and returns the length of the string\n", "\n", "print(my_string.swapcase()) # calling the swapcase method on the my_string object\n", "\n", "print(my_string.__len__()) # calling the internal __len__ method on the my_string object, used by len(my_string)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">You might even see them chained together. They operate left to right.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-83", "source": [ "print(my_string.isupper()) # Not all the letters are uppercase\n", "print(my_string.upper()) # This capitalizes all the letters\n", "\n", "print(my_string.upper().isupper()) # Now all the letters are uppercase" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">help
to get help for a function. Every built-in function has online documentation.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-85", "source": [ "help(round)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">An if statement (more properly called a conditional statement) controls whether some block of code is executed or not.\nThe first line opens with if and ends with a colon, and the block of code to be executed is indented. An example is shown below.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-87", "source": [ "num = 37\n", "if num > 100:\n", " print('greater')\n", "else:\n", " print('not greater')\n", "print('done')" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">If the expression that follows the if statement is true, the body of the if
(i.e., the set of lines indented underneath it) is executed, and “greater”
is printed. If it is false, the body of the else
is executed instead, and “not greater”
is printed. Only one or the other is ever executed before continuing on with program execution to print “done”
:
Conditional statements don’t have to include an else
. If there isn’t one, Python simply does nothing if the expression is false:
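A sketch of a conditional without an else branch. The value 53 and the flag body_ran are assumptions added to make the skipped body visible:

```python
num = 53
body_ran = False

print('before conditional...')
if num > 100:
    # This body is skipped because 53 is not greater than 100
    print(num, 'is greater than 100')
    body_ran = True
print('...after conditional')
```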
We can also chain several expressions together using elif
, which is short for “else if”. The following Python code uses elif
to print the sign of a number.
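One way this could look, with num set to an assumed example value and a helper variable sign introduced for illustration:

```python
num = -3  # example value assumed for this sketch

if num > 0:
    sign = 'positive'
elif num == 0:
    sign = 'zero'
else:
    sign = 'negative'

# Exactly one of the three branches above has run
print(num, 'is', sign)
```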
The operators used for comparing values in conditionals are the following:
\n- > : greater than
\n- < : less than
\n- == : equal to
\n- != : does not equal
\n- >= : greater than or equal to
\n- <= : less than or equal to
\nWe can also combine expressions using and
and or
. and
is only true if both parts are true:
while or
is true if at least one part is true:
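A short sketch contrasting the two operators. The comparisons and the names both_true and either_true are chosen only for illustration:

```python
# and: both comparisons must be true
both_true = (1 > 0) and (-1 >= 0)
if both_true:
    print('both parts are true')
else:
    print('at least one part is false')

# or: one true comparison is enough
either_true = (1 < 0) or (1 >= 0)
if either_true:
    print('at least one test is true')
```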
True
and False
are special words in Python called booleans, which represent truth values. A statement such as 1 < 0
returns the value False
, while -1 < 0
returns the value True
.
\n\nQuestion: Conditionals\nWhat does this program print?
\n\npressure = 71.9\nif pressure > 50.0:\n pressure = 25.0\nelif pressure <= 50.0:\n pressure = 0.0\nprint(pressure)\n
\n👁 View solution
\n\n25.0
\n
\n\n\nQuestion: Conditionals\nWrite some conditions that print True if the variable a is within 10% of the variable b and False otherwise. Compare your implementation with your partner’s: do you get the same answer for all possible pairs of numbers?\nHint: There is a built-in function abs() that returns the absolute value of a number.\n👁 View solution
\n\n\na = 5\nb = 5.1\nif abs(a - b) <= 0.1 * abs(b):\n print('True')\nelse:\n print('False')\n
Doing calculations on the values in a list one by one is very time consuming.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-97", "source": [ "odds = [1, 3, 5, 7, 9, 11]\n", "print(odds[0])\n", "print(odds[1])\n", "print(odds[2])\n", "print(odds[3])\n", "print(odds[4])\n", "print(odds[5])\n", "" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">A for loop tells Python to execute some statements once for each value in a list, a character string, or some other collection.\n“for each thing in this group, do these operations”. The for loop equivalent to the previous code is:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-99", "source": [ "for num in odds:\n", " print(num)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">The improved version uses a for loop to repeat an operation — in this case, printing — once for each thing in a sequence. The general form of a loop is:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-101", "source": [ "for variable in collection:\n", " # do things using variable, such as print" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">Using the odds example above, the loop might look like this:\n
\nwhere each number (num
) in the variable odds is looped through and printed one number after another. The other numbers in the diagram denote which loop cycle the number was printed in (1 being the first loop cycle, and 6 being the final loop cycle).
We can call the loop variable anything we like, but there must be a colon at the end of the line starting the loop, and we must indent anything we want to run inside the loop. Unlike many other languages, there is no command to signify the end of the loop body (e.g. end for); what is indented after the for statement belongs to the loop. Python uses indentation to show nesting. Any consistent indentation is legal, but almost everyone uses four spaces.
\nWhen looping through a list, the position index and corresponding value can be retrieved at the same time using the enumerate()
function.
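As a small illustration, the sketch below reuses the odds list from earlier. The loop variable names index and num are arbitrary choices.

```python
odds = [1, 3, 5, 7, 9, 11]

# enumerate() yields (position, value) pairs, counting from zero.
for index, num in enumerate(odds):
    print('position', index, 'holds', num)
```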
To loop over two or more lists at the same time, the entries can be paired with the zip()
function.
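A possible example is sketched here. The years list is added purely for illustration and holds the birth year of each scientist in names.

```python
names = ['Curie', 'Darwin', 'Turing']
years = [1867, 1809, 1912]  # illustrative second list: birth years

# zip() pairs the first entries, then the second entries, and so on.
for name, year in zip(names, years):
    print(name, 'was born in', year)
```

If the lists have different lengths, zip() stops at the end of the shortest one.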
We can choose any name we want for variables. It is a good idea to choose variable names that are meaningful, otherwise it would be more difficult to understand what the loop is doing.
\nHere’s another loop that repeatedly updates a variable:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-107", "source": [ "length = 0\n", "names = ['Curie', 'Darwin', 'Turing']\n", "for value in names:\n", " length = length + 1\n", "print('There are', length, 'names in the list.')\n", "print('After the loop, value is', value)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">It is worth tracing the execution of this little program step by step. Since there are three names in names
, the statement on line 4 will be executed three times. The first time around, length
is 0
(the value assigned to it on line 1) and value
is Curie
. The statement adds 1 to the old value of length
, producing 1, and updates length
to refer to that new value. The next time around, value
is Darwin
and length
is 1
, so length
is updated to be 2
. After one more update, length
is 3
; since there is nothing left in names
for Python to process, the loop finishes and the print function on line 5 tells us our final answer.
Note that a loop variable is just a variable that records progress through a loop. It still exists after the loop is over and keeps the last value assigned to it. We can also re-use a previously defined variable as a loop variable.
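The following sketch shows this behaviour. The initial value 'Rosalind' is only an illustrative placeholder that the loop overwrites.

```python
name = 'Rosalind'  # defined before the loop

# The loop re-uses 'name' and replaces its value on every cycle.
for name in ['Curie', 'Darwin', 'Turing']:
    print(name)

# 'name' still exists here and holds the last value from the loop.
print('after the loop, name is', name)
```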
\n\n\nQuestion: range of numbers\nPython has a built-in function called range() that generates a sequence of numbers. range can accept 1, 2, or 3 parameters.
\nIf one parameter is given, range generates a sequence of that length, starting at zero and incrementing by 1. For example, range(3) produces the numbers 0, 1, 2.
\nIf two parameters are given, range starts at the first and ends just before the second, incrementing by one. For example, range(2, 5) produces 2, 3, 4.
\nIf range is given 3 parameters, it starts at the first one, ends just before the second one, and increments by the third one. For example, range(3, 10, 2) produces 3, 5, 7, 9.
\nUsing range, write a loop that prints the first 3 natural numbers:\n1\n2\n3\n
\n👁 View solution
\n\n\nfor i in range(1, 4):\n print(i)\n
\n\n\nQuestion: Number of iterations\nGiven the following loop:
\n\nword = 'oxygen'\nfor char in word:\n print(char)\n
How many times is the for loop executed?
\n👁 View solution
\n\nThe body of the loop is executed 6 times.
\n
\n\n\nQuestion: Exponentiation\nExponentiation is built into Python:
\n\nprint(5 ** 3)\n
Output:\n125
\nWrite a loop that calculates the same result as 5 ** 3 using multiplication (and without exponentiation).
\n👁 View solution
\n\n\nresult = 1\nfor number in range(0, 3):\n result = result * 5\nprint(result)\n
\n\n\nQuestion: Iterations over a list\nWrite a loop that calculates the sum of the elements in a list by adding each element to a running total and printing the final value, so [124, 402, 36] prints 562.
\n👁 View solution
\n\n\nnumbers = [124, 402, 36]\nsummed = 0\nfor num in numbers:\n summed = summed + num\nprint(summed)\n
\n\n\nQuestion: Polynomial\nSuppose you have encoded a polynomial as a list of coefficients in the following way: the first element is the constant term, the second element is the coefficient of the linear term, the third is the coefficient of the quadratic term, etc.
\n\nx = 5\ncoefs = [2, 4, 3]\ny = coefs[0] * x**0 + coefs[1] * x**1 + coefs[2] * x**2\nprint(y)\n
Output:\n97
\nWrite a loop using enumerate(coefs) which computes the value y of any polynomial, given x and coefs.
\n👁 View solution
\n\n\ny = 0\nfor idx, coef in enumerate(coefs):\n y = y + coef * x**idx\n
\n\n\nQuestion: For loops and conditionals\nFill in the blanks so that this program creates a new list containing zeroes where the original list’s values were negative and ones where the original list’s values were zero or positive.
\n\noriginal = [-1.5, 0.2, 0.4, 0.0, -1.3, 0.4]\nresult = ____\nfor value in original:\n    if ____:\n        result.append(0)\n    else:\n        ____\nprint(result)\n
Output:\n[0, 1, 1, 1, 0, 1]
\n👁 View solution
\n\n\noriginal = [-1.5, 0.2, 0.4, 0.0, -1.3, 0.4]\nresult = []\nfor value in original:\n    if value < 0:\n        result.append(0)\n    else:\n        result.append(1)\nprint(result)\n
With the while loop we can execute a set of statements as long as an expression is true. The following example prints i
as long as i
is less than 6:
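A minimal sketch of such a loop, assuming i starts at 0:

```python
i = 0  # the indexing variable must exist before the loop starts

# The condition is checked before every cycle.
while i < 6:
    print(i)
    i = i + 1  # without this line the condition would never become false
```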
Remember to increment i
, or else the loop will continue forever. The while loop requires its variables to be defined beforehand; in this example we need to define an indexing variable, i
, which we set to 0.
With the break
statement we can stop the loop even if the while condition is true:
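For instance, the sketch below leaves the loop as soon as i reaches 3, even though i is still less than 6. The cut-off value 3 is an arbitrary choice for illustration.

```python
i = 0

while i < 6:
    if i == 3:
        break  # leave the loop immediately
    print(i)
    i = i + 1

print('the loop stopped with i =', i)
```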
With the continue
statement we can stop the current iteration, and continue with the next:
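In the sketch below, the value 3 is skipped while all other values are still handled. The list shown is a hypothetical helper that records which values were printed.

```python
i = 0
shown = []  # hypothetical helper list, used only to record what was printed

while i < 6:
    i = i + 1
    if i == 3:
        continue  # skip the rest of this cycle and test the condition again
    shown.append(i)
    print(i)
```

Note that i is incremented before continue is reached. If the increment came after it, the loop would get stuck at 3.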
With the else
statement we can run a block of code once when the condition no longer is true:
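A short sketch, again assuming i starts at 0. The flag finished is an illustrative name.

```python
i = 0
finished = False  # illustrative flag, set once the condition becomes false

while i < 6:
    print(i)
    i = i + 1
else:
    # Runs once, after the condition i < 6 has become false.
    finished = True
    print('i is no longer less than 6')
```

The else block is skipped if the loop is left through break.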
Human beings can only keep a few items in working memory at a time. Breaking larger or more complicated pieces of code down into functions makes the code easier to understand and use. A function can also be re-used: write it once, use it many times.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-117", "source": [ "def fahr_to_celsius(temp):\n", " return ((temp - 32) * (5/9))" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">The function definition opens with the keyword def
followed by the name of the function fahr_to_celsius
and a parenthesized list of parameter names temp
. The body of the function — the statements that are executed when it runs — is indented below the definition line. The body concludes with a return
keyword followed by the return value.
When we call the function, the values we pass to it are assigned to those variables so that we can use them inside the function. Inside the function, we use a return statement to send a result back to whoever asked for it.
\nLet’s try running our function.
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-119", "source": [ "fahr_to_celsius(32)\n", "print('freezing point of water:', fahr_to_celsius(32), 'C')\n", "print('boiling point of water:', fahr_to_celsius(212), 'C')" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">We’ve successfully called the function that we defined, and we have access to the value that we returned.
\nNow that we’ve seen how to turn Fahrenheit into Celsius, we can also write the function to turn Celsius into Kelvin:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-121", "source": [ "def celsius_to_kelvin(temp_c):\n", " return temp_c + 273.15\n", "\n", "print('freezing point of water in Kelvin:', celsius_to_kelvin(0.))" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">What about converting Fahrenheit to Kelvin? We could write out the formula, but we don’t need to. Instead, we can compose the two functions we have already created:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-123", "source": [ "def fahr_to_kelvin(temp_f):\n", " temp_c = fahr_to_celsius(temp_f)\n", " temp_k = celsius_to_kelvin(temp_c)\n", " return temp_k\n", "\n", "print('boiling point of water in Kelvin:', fahr_to_kelvin(212.0))" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">This is our first taste of how larger programs are built: we define basic operations, then combine them in ever-larger chunks to get the effect we want. Real-life functions will usually be larger than the ones shown here — typically half a dozen to a few dozen lines — but they shouldn’t ever be much longer than that, or the next person who reads it won’t be able to understand what’s going on.
\nIn composing our temperature conversion functions, we created variables inside of those functions, temp
, temp_c
, temp_f
, and temp_k
. We refer to these variables as local variables because they no longer exist once the function is done executing. If we try to access their values outside of the function, we will encounter an error:
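The sketch below demonstrates the error. It uses a self-contained variant of fahr_to_kelvin so that it runs on its own, and it wraps the failing line in try/except only so that the message can be printed; without that wrapper, Python stops with a NameError.

```python
def fahr_to_kelvin(temp_f):
    # temp_c and temp_k exist only while the function is running.
    temp_c = (temp_f - 32) * (5 / 9)
    temp_k = temp_c + 273.15
    return temp_k

print('boiling point of water in Kelvin:', fahr_to_kelvin(212.0))

caught = False  # illustrative flag recording that the error occurred
try:
    print('temp_k is', temp_k)
except NameError as error:
    caught = True
    print('NameError:', error)
```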
If you want to reuse the temperature in Kelvin after you have calculated it with fahr_to_kelvin, you can store the result of the function call in a variable:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-127", "source": [ "temp_kelvin = fahr_to_kelvin(212.0)\n", "print('temperature in Kelvin was:', temp_kelvin)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">If we usually want a function to work one way, but occasionally need it to do something else, we can allow people to pass a parameter when they need to but provide a default to make the normal case easier. The example below shows how Python matches values to parameters:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-129", "source": [ "def display(a=1, b=2, c=3):\n", " print('a:', a, 'b:', b, 'c:', c)\n", "\n", "print('no parameters:')\n", "display()\n", "print('one parameter:')\n", "display(55)\n", "print('two parameters:')\n", "display(55, 66)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">As this example shows, parameters are matched up from left to right, and any that haven’t been given a value explicitly get their default value. We can override this behavior by naming the value as we pass it in:
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-131", "source": [ "print('only setting the value of c')\n", "display(c=77)" ], "cell_type": "code", "execution_count": null, "outputs": [ ], "metadata": { "attributes": { "classes": [ ">\n\nQuestion: Variable scope\nWhat does the following piece of code display when run — and why?
\n\nf = 0\nk = 0\n\ndef f2k(f):\n k = ((f - 32) * (5.0 / 9.0)) + 273.15\n return k\n\nprint(f2k(8))\nprint(f2k(41))\nprint(f2k(32))\nprint(k)\n
\n👁 View solution
\n\nOutput:\n259.81666666666666\n278.15\n273.15\n0
\n\n
k is 0 because the k inside the function f2k doesn’t know about the k defined outside the function. When the f2k function is called, it creates a local variable k. The function returns the value of its local k but does not alter the k defined outside the function. Therefore the original value of k remains unchanged.
A library is a collection of files (called modules) that contains functions for use by other programs. It may also contain data values (e.g., numerical constants) and other things. A library’s contents are supposed to be related, but there’s no way to enforce that. The Python standard library is an extensive suite of modules that comes with Python itself. Many additional libraries are available from PyPI (the Python Package Index).
\nA library is a collection of modules, but the terms are often used interchangeably, especially since many libraries only consist of a single module, so don’t worry if you mix them.
\nYou can use import
to load a library module into a program’s memory, then refer to things from the module as module_name.thing_name
. Python uses .
to mean “part of”. For example, using math
, one of the modules in the standard library:
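A small example, using only names that the math module is known to provide (pi and cos):

```python
import math

# Items inside the module are reached with the dot notation.
print('pi is', math.pi)
print('cos(pi) is', math.cos(math.pi))
```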
You can use help
to learn about the contents of a library module. It works just like help for a function.
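As an illustration, the sketch below asks for help on the math module and on one of its functions. The printed documentation is long, so it is not reproduced here.

```python
import math

# Show the documentation for the whole module...
help(math)

# ...or for a single function inside it.
help(math.sqrt)
```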
You can import specific items from a library module to shorten programs. You can use from ... import ...
to load only specific items from a library module. Then refer to them directly, without the library name as a prefix.
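For example, the following sketch imports just cos and pi from math:

```python
from math import cos, pi

# No 'math.' prefix is needed for the imported names.
print('cos(pi) is', cos(pi))
```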
You can create an alias for a library module when importing it to shorten programs. Use import ... as ...
to give a library a short alias while importing it. Then refer to items in the library using that shortened name.
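A brief sketch follows, where the alias m is an arbitrary choice made for illustration:

```python
import math as m

# The alias replaces the module name in front of the dot.
print('cos(pi) is', m.cos(m.pi))
```

Aliases are most useful for long or frequently used library names, but an alias that is too terse can make a program harder to read.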
A consistent coding style helps others (including our future selves) read and understand code more easily. Code is read much more often than it is written, and as the Zen of Python states, “Readability counts”. Python proposed a standard style through one of its first Python Enhancement Proposals (PEPs), PEP 8.
\nSome points worth highlighting:
\nPython supports a large and diverse community across academia and industry.
\n