7.5. Exercises
Each exercise carries a difficulty level indication:
[*] Easy
[**] Moderate
[***] Advanced
The highest difficulty level is meant for students who wish to challenge themselves and/or have previous experience in programming.
([*] Basics of NumPy)
Note: this exercise contains an introduction, which is absent from the downloadable exercise (.py) file.
Introduction
NumPy and arrays
NumPy is a general-purpose library for numerical computing.
It is mostly useful for arrays (called ndarrays) and array handling.
ndarray vs. lists
Underneath, a Python list is actually a C array of pointers to values stored elsewhere in memory (at least in CPython, the standard interpreter).
An ndarray stores its values directly in a single block of memory.
The ndarray object itself is not that raw block, however: it also records where the block starts, the distance in bytes between consecutive elements along each dimension (the strides), and some other metadata needed to work with the array, such as its shape and data type.
Typically, it will be faster to do operations on NumPy arrays, especially computational ones.
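The metadata described above can be inspected directly. The following is a minimal sketch, assuming a 64-bit integer dtype chosen only so that the byte counts are predictable; the variable names are illustrative.

```python
import numpy as np

# A 3x4 array of 64-bit integers (8 bytes per element)
a = np.arange(12, dtype=np.int64).reshape(3, 4)

print(a.shape)    # (3, 4)
print(a.dtype)    # int64
print(a.strides)  # (32, 8): 32 bytes to the next row, 8 bytes to the next column

# Transposing only swaps the metadata; no values are copied
t = a.T
print(t.strides)               # (8, 32)
print(np.shares_memory(a, t))  # True
```

Because the transpose reuses the same memory block with different strides, it costs almost nothing regardless of the array size.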
Defining a NumPy array
Several options exist:
x = np.zeros() - defines an array of zeros of the desired dimensions
x = np.array() - define your array, manually, with the elements of your choice
x = np.linspace() - make an evenly spaced array over a user-defined range
x = np.arange() - make an array over a given range with a specific step size
x = np.asarray() - given input that looks like an array, converts it into a NumPy array
Many more can be found in the documentation, but these are some of the basic ones.
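As a quick illustration of the options listed above, here is a minimal sketch with arbitrary example arguments (the sizes, ranges, and variable names are invented for this example):

```python
import numpy as np

zeros = np.zeros((2, 3))            # 2x3 array filled with 0.0
manual = np.array([1, 4, 9])        # array built from a Python list
spaced = np.linspace(0, 1, 5)       # 5 evenly spaced values, both endpoints included
stepped = np.arange(0, 10, 2)       # start at 0, step by 2, stop before 10
converted = np.asarray((1.5, 2.5))  # tuple converted into an ndarray

print(zeros.shape)       # (2, 3)
print(spaced.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]
print(stepped.tolist())  # [0, 2, 4, 6, 8]
```

Note the difference between the last two range-based functions: np.linspace takes the number of points and includes the end value by default, while np.arange takes the step size and excludes the end value.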
Working with NumPy arrays
Computation
You can use NumPy arrays with the normal mathematical operators.
NumPy will typically apply the operation element-wise to the entire ndarray.
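A minimal sketch of element-wise behaviour, using two small arrays invented for illustration, and a plain list for contrast:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

print((a + b).tolist())      # [11.0, 22.0, 33.0]
print((a * b).tolist())      # [10.0, 40.0, 90.0]
print((a ** 2).tolist())     # [1.0, 4.0, 9.0]
print((2 * a - 1).tolist())  # [1.0, 3.0, 5.0]

# A plain Python list behaves differently: * repeats the list
print([1, 2, 3] * 2)         # [1, 2, 3, 1, 2, 3]
```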
Functions
NumPy is great in that many functions are built into it (and they are fast because they are implemented in compiled code, mostly C, rather than in pure Python).
Say, for example, you want to take the mean (average) of an array.
Instead of having to write a whole for loop, you can call np.mean, feed it an array (or an array-like variable), and it will attempt to calculate the mean.
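To make the comparison concrete, here is a small sketch that computes the same mean both ways; the input values are invented for illustration.

```python
import numpy as np

values = [2.0, 4.0, 6.0, 8.0]

# Manual version with a for loop
total = 0.0
for v in values:
    total += v
loop_mean = total / len(values)

# NumPy version; np.mean accepts the list directly because it is array-like
numpy_mean = np.mean(values)

print(loop_mean)   # 5.0
print(numpy_mean)  # 5.0
```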
Indexing
The same construction is used for indexing NumPy arrays as for lists or other sequences in Python. Slice indexing can also be used.
# Some valid constructions
test_ar = np.linspace(1, 50)
val = test_ar[10]
vals = test_ar[30:50]
more_vals = test_ar[::30]
(You can also feed Boolean NumPy arrays as indices.)
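On a smaller array the results of these indexing forms are easier to check by eye. A minimal sketch, with an array and a Boolean mask invented for illustration:

```python
import numpy as np

arr = np.arange(10) * 10  # 0, 10, 20, ..., 90

print(arr[3])              # 30
print(arr[-1])             # 90 (negative indices count from the end)
print(arr[2:5].tolist())   # [20, 30, 40]
print(arr[::4].tolist())   # [0, 40, 80]

# A Boolean array of the same length selects the positions marked True
keep = np.array([True, False] * 5)
print(arr[keep].tolist())  # [0, 20, 40, 60, 80]
```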
One thing to note is how NumPy handles copying data, and the unintended side effects this can have.
x = np.linspace(1, 10, 10) # Making a numpy array
y = x # "Copying" it to a new variable to do something with it
y[1] = 100 # Want to change a value directly
print(x)
print(y)
Try running this in VS Code.
The issue here is that both x and y refer to the same block of data, so any direct change to a value will be seen by both.
The assignment y = x does not define a new set of values for y to look at; to get an independent copy, use y = x.copy().
This can cause debugging headaches.
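A minimal sketch of the three situations you are likely to meet (a second name, a slice, and an explicit copy); the variable names are illustrative:

```python
import numpy as np

x = np.linspace(1, 10, 10)

alias = x               # second name for the same array
view = x[:3]            # basic slices are views onto the same memory
independent = x.copy()  # independent copy with its own memory

independent[1] = 100    # does not touch x
print(x[1])             # 2.0

view[0] = -1            # changes x as well
print(x[0])             # -1.0

print(alias is x)                        # True
print(np.shares_memory(x, view))         # True
print(np.shares_memory(x, independent))  # False
```

When in doubt, np.shares_memory is a convenient way to check whether two arrays can affect each other.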
Other resources
If interested, here are some links with additional information about NumPy:
This exercise contains a number of simple things to try to get familiar with NumPy.
Hint: Functions (probably) exist for most of these.
Use the NumPy documentation at numpy.org - try not to be too intimidated, it’s easier than it seems.
Google is also your friend.
import numpy as np
Exercise A
Define x_ar so that it is an array of 124 elements ranging from 2.3 to 1776.343, evenly spaced on a logarithmic scale.
Hint: There is a function to do this.
x_ar = # Your code here
Exercise B
Reshape x_ar to a two-dimensional array (pick the lengths you’d like).
# Your code here
Exercise C
Add 10 to every value in the array.
Hint: you don’t need any loops to do so.
# Your code here
Exercise D
Reshape x_ar to a one-dimensional array again.
Define a new array y_ar as a linear space of 10 values from 23.4 to 1006.2 and add it to the end of x_ar.
# Your code here
Exercise E
Print every tenth value from the new combined x_ar. You should be able to do this without loops.
# Your code here
Exercise F
Test this code. Try to work out what is going on before running it, then describe what you see.
test_ar = np.linspace(1, 50)
mask_ar = test_ar >= 25
new_ar = test_ar[mask_ar]
print(new_ar)
Exercise G
Now find the average of all values above 100 in the combined x_ar.
You should be able to do this without loops.
# Your code here
([*] List with the highest mean)
Find the list which has the highest mean and the value of that mean. Inform the user about your findings using print statements.
Start with writing pseudocode.
Hints:
Which NumPy function could come in handy here?
Use a for loop and track the highest mean.
list1 = [3, 5, 7, 9, 2, 4, 6, 8, 10, 1, 3, 5]
list2 = [7, 9, 2, 4, 7, 5, 3, 1, 8, 6]
list3 = [2, 4, 6, 9, 10, 3, 5, 6, 9]
list4 = [1, 3, 5, 7, 9, 1, 3, 5, 7, 2, 4, 6, 8, 10]
list5 = [4, 6, 8, 10, 2, 4, 6, 8, 10, 1, 3, 5, 7, 9]
# Your code here
([*] How many ways to the same array?)
In each of the exercises below, you are given an array. There is typically more than one way to define an array in NumPy. Therefore, for each of the arrays below, try to think of at least two (and if possible more) ways to define them using NumPy’s functions and write the corresponding Python code. If you feel like it, make it a competition with your fellow students, or a competition between student groups! Feel free to use the NumPy documentation.
Exercise A:
Exercise B:
Exercise C:
Exercise D:
Exercise E:
Exercise F:
Exercise G:
Exercise H:
([**] Kinetic energy)
Given a NumPy array of masses, calculate the kinetic energy of objects using
\(E_k = \frac{1}{2} m v^2\)
where \(E_k\) is kinetic energy, \(m\) is mass and \(v\) = 10 nm/s is velocity.
Use these masses (given in micrograms): [1.5, 2.0, 2.5, 3.0, 3.5, 50, 100].
# Your code here
([**] Protein concentration)
A biologist has measured the concentration of their favourite protein in cell lysates over time, recorded in 10-minute intervals over a 24-hour period. The concentrations are stored in a 1D NumPy array.
import numpy as np
np.random.seed(0) # Setting seed for random number generation for reproducibility
time_points = np.arange(0, 1440, 10) # Time in minutes
protein_concentration = np.random.uniform(0.1, 1.0, len(time_points)) # Concentrations in nmol/L
Exercise A: Extract the concentrations between the sixth and tenth hours and print them.
# Your code here
Exercise B: Get concentrations measured at every full hour and print them.
# Your code here
Exercise C: Extract the concentrations from the last three hours and print them.
# Your code here
([**] Bacterial growth)
A bacterial population grows exponentially according to this equation:
\(N(t) = N_0 \cdot e^{rt}\)
where \(N(t)\) is the number of bacteria at time \(t\), \(N_0\) is the initial number of bacteria, \(r\) is the growth rate, and \(t\) is time.
Exercise A: Population size
Given \(N_0\) = 1,000 and \(r\) = 0.2 per hour, calculate the bacterial population size every hour for time ranging from 0 to 48 hours. Use NumPy arrays.
# Your code here
Exercise B: Induction of protein expression
In the lab, you’re growing a culture of 100 mL of Escherichia coli that you transformed with a plasmid containing your favourite protein-coding gene. The initial number of cells is \(2 \cdot 10^6\). You want to grow E. coli until the culture reaches an optical density (\(OD_{600}\)) of 0.6. This value indicates an exponential growth phase and is the optimal time for induction of protein expression.
Knowing that for E. coli an \(OD_{600}\) of 1 corresponds to \(5 \cdot 10^8\) cells/mL and that it grows at a rate of 0.3 per hour, calculate how long you need to grow E. coli before induction.
# Your code here
([**] Heat distribution in a metal plate)
A 9x9 grid represents a metal plate where each cell indicates the temperature at that point. The temperature is stored in a 2D NumPy array.
import numpy as np
# Heat distribution across a 9x9 metal plate
np.random.seed(0) # Setting seed for random number generation for reproducibility
heat_distribution = np.random.uniform(20, 100, (9, 9)) # Degrees Celsius
Exercise A: Extract the temperatures from the top-left 3x3 corner of the plate.
# Your code here
Exercise B: Extract the middle row and the middle column and print them.
# Your code here
Exercise C: Get the temperatures along the diagonal of the plate and print them.
# Your code here
Exercise D: You can imagine your 9x9 plate consisting of nine 3x3 subplates (similar to when you’re solving Sudoku puzzles). Calculate which of the nine 3x3 subplates has the highest average temperature and print the result.
# Your code here
([**] Social network for epidemic modelling)
As you might imagine from the experience in recent years, modelling an epidemic spread is rather useful. One of the possible models is the so-called “probabilistic network model”, in which we model pathogen spread in the group via a network of social contacts. As it turns out, a social network can conveniently be represented by a matrix \(W\). Let’s look at an example:
You can imagine each row and each column representing a person. Our example matrix \(W\) therefore outlines a social network of three people. When two individuals know each other, the corresponding element in the matrix is 1. In matrix terminology, we can say that element \(w_{ij}\) = 1 when persons \(i\) and \(j\) know each other. The matrix is also symmetrical, i.e., \(w_{ij}\) = \(w_{ji}\). If \(w_{ii}\) = 1, then \(i\) abides by Socrates’ advice: “know thyself”.
Remember: in \(w_{ij}\), \(i\) denotes the row and \(j\) the column.
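As a small illustration of this layout, a hypothetical three-person network could be written as a NumPy array as follows. The values and the name W_example are invented for this sketch and are not the matrix used in the exercises.

```python
import numpy as np

# Hypothetical three-person network (values invented for illustration)
W_example = np.array([
    [1, 1, 0],  # first person knows themselves and the second person
    [1, 1, 1],  # second person knows everyone
    [0, 1, 1],  # third person knows the second person and themselves
])

print(W_example[0, 1])  # 1: first row, second column
print(W_example[0, 2])  # 0: first row, third column

# A symmetric matrix is equal to its own transpose
print(np.array_equal(W_example, W_example.T))  # True
```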
Below we give you a social network matrix \(W\) to use in the exercises. We also offer some general hints to keep in mind:
The matrix is symmetric, so make sure not to count the same contact twice.
When looking at contacts between different individuals, you can exclude elements from the matrix diagonal (i.e., the “know thyself” elements).
Remember good coding practices and avoid hardcoding. Your code should work for any social network matrix.
import numpy as np
# Define social network matrix
num_people = 20 # Matrix of 20 people
np.random.seed(0) # Setting seed for random number generation for reproducibility
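W = np.random.choice([0, 1], size=(num_people, num_people), p=[0.6, 0.4])
W = np.triu(W, 1) # Keep only the contacts above the diagonal
W = W + W.T # Mirror them so that W is symmetric, as described in the text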
np.fill_diagonal(W, 1) # Set diagonal elements to 1
Exercise A: Person number 5
How many contacts does person number 5 have? Use Python to find the answer and print it.
Hint: How does counting in Python work?
# Your code here
Exercise B: The influencer
Find out which person in your social network has the most contacts (i.e., who is the influencer). Make a print statement about it.
# Your code here
Exercise C: Interconnectedness
How many social contacts in total are represented in \(W\)? Calculate and make a print statement.
# Your code here
Exercise D: A new friendship
Two people have met at a party, so now they also know each other. Make a new social network matrix \(W_{new}\) that also contains this contact. In practice, identify two people in your matrix \(W\) that don’t know each other (\(w_{ij}\) = 0) and turn the corresponding matrix element(s) to 1.
Is the influencer still the same person? You can reuse your code from above to find out.
# Your code here