11. Array (matrix) operations in NumPy#
11.1. Arithmetic operators#
As with scalars, we can use basic arithmetic operators to manipulate a vector or a matrix in NumPy. The following operators can be used: +, -, /, *, **, %. It’s very important to note that these operators are applied element-wise to an array.
Function | Description |
---|---|
np.add() | addition |
np.subtract() | subtraction |
np.multiply() | multiplication |
np.divide() | division |
np.power() | power |
np.remainder() | remainder |
np.square() | square of each element in the array |
np.sqrt() | square root of each element in the array |
np.sum() | sum of elements in the array |
import numpy as np
a = np.array([1,3,5]) #1D array
b = np.array([5,3,1]) #1D array
c = np.array([[2,6,10],[1,3,5]]) #2D array
d = np.array([[1,1,1],[2,2,2]]) #2D array
addition = a + b # or: np.add(a, b)
print(addition)
[6 6 6]
subtraction = c - d # or: np.subtract(c, d)
print(subtraction)
[[ 1 5 9]
[-1 1 3]]
multiplication = c * 6 # or: np.multiply(c, 6)
print(multiplication)
[[12 36 60]
[ 6 18 30]]
division = a / 3 # or: np.divide(a, 3)
print(division)
[0.33333333 1. 1.66666667]
power = d ** 2 # or: np.power(d, 2)
print(power)
[[1 1 1]
[4 4 4]]
remainder = a % 2 # or: np.remainder(a, 2) # or: np.mod(a, 2)
print(remainder)
[1 1 1]
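The examples above cover the arithmetic operators, but not the square, square root, and sum entries of the table. The short sketch below shows one way to use np.square(), np.sqrt(), and np.sum() on the same kind of array; the variable names are only illustrative.

```python
import numpy as np

a = np.array([1, 3, 5])  # 1D array

squares = np.square(a)  # element-wise square, same result as a ** 2
roots = np.sqrt(a)      # element-wise square root (floating-point result)
total = np.sum(a)       # sum of all elements in the array

print(squares)  # [ 1  9 25]
print(roots)
print(total)    # 9
```

Note that np.sum() reduces the whole array to a single number, while np.square() and np.sqrt() keep the shape of the input.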
11.2. Broadcasting#
The element-wise operations introduced above require arrays of compatible shapes; in the simplest case the shapes are identical. When the shapes differ, NumPy can sometimes still combine the arrays by broadcasting: it stretches, or “copies”, each dimension of size 1 (and adds missing leading dimensions) so that the shapes match. If interested, you can read more on this topic in the NumPy documentation on broadcasting. Below are some examples.
import numpy as np
a = np.array([1,3,5]) #1D array
c = np.array([[2,6,10],[1,3,5]]) #2D array
d = np.array([[1,1,1],[2,2,2]]) #2D array
addition = a + d
print(addition)
[[2 4 6]
[3 5 7]]
division = c / a
print(division)
[[2. 2. 2.]
[1. 1. 1.]]
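To see which shapes can be combined, it can help to print the shape of the result and to try a pair of shapes that does not work. The following is a minimal sketch; the arrays row, col, and grid are made up for illustration.

```python
import numpy as np

row = np.array([1, 3, 5])                 # shape (3,)
col = np.array([[10], [20]])              # shape (2, 1)
grid = np.array([[2, 6, 10], [1, 3, 5]])  # shape (2, 3)

print((grid + row).shape)  # (2, 3): row is stretched over both rows
print((grid + col).shape)  # (2, 3): col is stretched over all columns
print(row + col)           # (3,) and (2, 1) broadcast to (2, 3)

# Shapes (2, 3) and (2,) do not match in the last dimension,
# and neither of the two sizes is 1, so broadcasting fails.
try:
    grid + np.array([1, 2])
    compatible = True
except ValueError as err:
    compatible = False
    print("Cannot broadcast:", err)
```

NumPy compares shapes from the last dimension backwards; two dimensions are compatible when they are equal or when one of them is 1.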
11.3. Logical expressions#
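Similarly, we can use logical expressions to filter and manipulate data in a NumPy array. NumPy gives us several ways to express AND, OR, and NOT on boolean arrays: we can call the built-in functions np.logical_and(), np.logical_or(), and np.logical_not(), or use the operators & or * for AND, | or + for OR, and ~ for NOT. Python has no ! operator for NOT, and the unary - is not supported on boolean arrays.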
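Operator | Example | Arithmetic |
---|---|---|
AND | np.logical_and(x, y) or x & y | x * y |
OR | np.logical_or(x, y) or x \| y | x + y |
NOT | np.logical_not(x) or ~x | |
Here x and y stand for boolean arrays.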
As an example, consider the two arrays a = [-1,2,-3,4,5,6,7] and b = [7,6,5,4,-3,2,-1].
The following table shows some operations which can be done using logical expressions:
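Expression | Resulting array | Explanation |
---|---|---|
a > 0 | [False True False True True True True] | True if element greater than 0 |
(a > 0) * (a < 10) | [False True False True True True True] | Multiplication with booleans works as AND |
(a > 0) + (a >= 10) | [False True False True True True True] | Summing booleans works as OR |
a[a > 0] | [2 4 5 6 7] | Select the elements of a where a > 0 is True |
b[a > 0] | [6 4 -3 2 -1] | Select the elements of b where a > 0 is True |
np.where(a > 0)[0] | [1 3 4 5 6] | Return a list of indices where a > 0 is True |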
a = np.array([-1,2,-3,4,5,6,7]) #1D array
b = np.array([7,6,5,4,-3,2,-1]) #1D array
# True if element greater than 0
print(a > 0)
# Multiplication with booleans works as AND
print((a>0) * (a<10))
# Summing booleans works as OR
print((a>0) + (a>=10))
# Select the elements of a where a > 0 is 'True'
print(a[a>0])
# Select the elements of b where a > 0 is 'True'
print(b[a>0])
# Return a list of indices where (a > 0) was 'True'
print(np.where(a>0)[0])
[False True False True True True True]
[False True False True True True True]
[False True False True True True True]
[2 4 5 6 7]
[ 6 4 -3 2 -1]
[1 3 4 5 6]
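The example above uses the arithmetic forms (* and +). As a complement, here is a small sketch of the same kind of filtering with NumPy's logical functions and the &, | and ~ operators; the threshold of 5 is chosen only for illustration.

```python
import numpy as np

a = np.array([-1, 2, -3, 4, 5, 6, 7])

positive = a > 0  # boolean mask
small = a < 5     # boolean mask

both = np.logical_and(positive, small)   # same as positive & small
either = np.logical_or(positive, small)  # same as positive | small
not_positive = np.logical_not(positive)  # same as ~positive

print(a[both])          # [2 4]
print(a[not_positive])  # [-1 -3]
print(either)
```

When writing the comparison inline, keep the parentheses, for example (a > 0) & (a < 5), because & binds more tightly than the comparison operators.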
11.4. Some useful statistical operations#
Below is a list of some other operations that can be very useful when working with matrices.
Operator | Output |
---|---|
np.mean() | the arithmetic mean along the specified axis |
np.std() | the standard deviation along the specified axis |
np.min() | the minimum value along a specified axis |
np.max() | the maximum value along a specified axis |
np.sum() | the sum of array elements over a given axis |
np.argmin() | the indices of the minimum values along an axis |
np.argmax() | the indices of the maximum values along an axis |
In NumPy, an axis is a dimension along which an operation is performed. For a two-dimensional array, there are two axes:
Axis 0 = Rows: This axis runs vertically downwards. It represents the direction along which you move from one row to the next. When you perform operations along axis 0, you are operating across rows. For example, calculating the mean along axis 0 means computing the mean of each column.
Axis 1 = Columns: This axis runs horizontally across the matrix. It represents the direction along which you move from one column to the next. When you perform operations along axis 1, you are operating across columns. For example, calculating the mean along axis 1 means computing the mean of each row.
Check the examples below to gain a better understanding of the concept. You can also test and compare different functions for different axes by yourself.
matrix = np.array([[1, 2, 3, 4, 5],
[6, 7, 8, 9, 10]])
# The arithmetic mean of all elements
print("The arithmetic mean of all elements: ")
print(np.mean(matrix))
# Mean along axis 0 (mean for each column)
print("Mean along axis 0: ")
print(np.mean(matrix, axis=0))
# Mean along axis 1 (mean for each row)
print("Mean along axis 1: ")
print(np.mean(matrix, axis=1))
# The standard deviation
print("The standard deviation: ")
print(np.std(matrix))
# The minimum value
print("The minimum value: ")
print(np.min(matrix))
# The minimum value along axis 1
print("The minimum value along axis 1: ")
print(np.min(matrix, axis=1))
# The maximum value
print("The maximum value: ")
print(np.max(matrix))
# The sum of array elements
print("The sum of array elements: ")
print(np.sum(matrix))
# The indices of the minimum values along axis 1
print("The indices of the minimum values along an axis 1: ")
print(np.argmin(matrix, axis=1))
# The indices of the maximum values along axis 0
print("The indices of the maximum values along axis 0: ")
print(np.argmax(matrix, axis=0))
The arithmetic mean of all elements:
5.5
Mean along axis 0:
[3.5 4.5 5.5 6.5 7.5]
Mean along axis 1:
[3. 8.]
The standard deviation:
2.8722813232690143
The minimum value:
1
The minimum value along axis 1:
[1 6]
The maximum value:
10
The sum of array elements:
55
The indices of the minimum values along an axis 1:
[0 0]
The indices of the maximum values along axis 0:
[1 1 1 1 1]
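A quick way to remember what an axis does is to look at the shape of the result: the axis you reduce over disappears. The sketch below illustrates this with np.sum() on the same matrix; the names col_sums and row_sums are only illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4, 5],
                   [6, 7, 8, 9, 10]])  # shape (2, 5)

col_sums = np.sum(matrix, axis=0)  # collapses the 2 rows: one value per column
row_sums = np.sum(matrix, axis=1)  # collapses the 5 columns: one value per row

print(col_sums, col_sums.shape)  # [ 7  9 11 13 15] (5,)
print(row_sums, row_sums.shape)  # [15 40] (2,)
```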
11.5. Matrix operations (linear algebra):#
A large part of the mathematical knowledge required for different fields of electrical engineering comes from linear algebra. Matrix operations play a significant role in linear algebra. Below is a list of common matrix operations you can perform in NumPy. Note that matrix A is of size (m,k) and B is of size (k,n).
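Operation | Mathematical expression | NumPy expression | Output shape |
---|---|---|---|
Dot product (vectors) | \(a \cdot b\) | np.dot(a, b) | scalar |
Matrix multiplication | \(A \times B\) | A @ B or A.dot(B) | (m,n) |
Transpose | \(A^T\) | A.T | (k,m) |
Inverse | \(A^{-1}\) | np.linalg.inv(A) | same as A (square matrices only) |
Pseudo-inverse | \(A^{+}\) | np.linalg.pinv(A) | (k,m) |
Determinant | \(\lvert A \rvert\) | np.linalg.det(A) | scalar (square matrices only) |
Rank | \(rank(A)\) | np.linalg.matrix_rank(A) | scalar |
Trace | \(tr(A)\) | A.trace() or np.trace(A) | scalar |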
Note that the sub-module numpy.linalg implements basic linear algebra, such as solving linear systems, singular value decomposition, etc. However, it is not guaranteed to be compiled against efficient routines, so using scipy.linalg is recommended.
import numpy as np
# A = 3x2 and B = 2x2 matrix
A=np.array([[3, 1],
[2, 5],
[4, 3]])
B=np.array([[1,2],
[4,7]]) # B is a square matrix
# Matrix multiplication
mat_mul=A@B # or: A.dot(B)
print("A @ B = ",mat_mul)
# Inverse (only defined for square matrices, so we use B)
inverse = np.linalg.inv(B)
print("B^-1 = ",inverse)
# Pseudo-inverse (also works for the non-square matrix A)
pseudo_inverse = np.linalg.pinv(A)
print("A^+ = ",pseudo_inverse)
# Determinant (rounded to hide floating-point noise)
determinant = np.linalg.det(B)
print("|B| = ",round(determinant, 6))
transpose = A.T
print("A^T = ",transpose)
rank = np.linalg.matrix_rank(A)
print("rank (A) = ",rank)
trace = A.trace()
print('tr(A ):',trace)
A @ B = [[ 7 13]
[22 39]
[16 29]]
B^-1 = [[-7. 2.]
[ 4. -1.]]
A^+ = [[ 0.20512821 -0.14102564 0.16666667]
[-0.11794872 0.24358974 -0.03333333]]
|B| = -1.0
A^T = [[3 2 4]
[1 5 3]]
rank (A) = 2
tr(A ): 8
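One way to convince yourself that an inverse or pseudo-inverse is correct is to multiply it back with the original matrix. The sketch below does this for the same A and B, using np.allclose() to compare with the identity matrix up to floating-point rounding; it is a sanity check, not part of the original example.

```python
import numpy as np

A = np.array([[3, 1],
              [2, 5],
              [4, 3]])  # 3x2 matrix
B = np.array([[1, 2],
              [4, 7]])  # 2x2 square, invertible matrix

B_inv = np.linalg.inv(B)
A_pinv = np.linalg.pinv(A)

# B times its inverse should be (numerically) the identity matrix
print(np.allclose(B @ B_inv, np.eye(2)))   # True

# A has full column rank, so its pseudo-inverse acts as a left inverse
print(np.allclose(A_pinv @ A, np.eye(2)))  # True

# The pseudo-inverse of a (3, 2) matrix has shape (2, 3)
print(A_pinv.shape)                        # (2, 3)
```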
11.6. Exercises#
11.6.1. Exercise: Euclidean distance#
Write a Python program to find the Euclidean distance between two given one-dimensional arrays. For two points in the plane, the Euclidean distance is given by the formula:
\(\sqrt{((x_2-x_1)^2 + (y_2-y_1)^2)}\)
For one-dimensional arrays with three elements, you can think of the arrays as [a1, b1, c1] and [a2, b2, c2], and the Euclidean distance is:
\(\sqrt{((a_2-a_1)^2 +(b_2-b_1)^2 + (c_2-c_1)^2)}\)
💡Hint: you can use np.square(), np.sum() and np.sqrt().
💡Don’t forget to print and check your progress in the intermediate steps to ensure you are on the right track.
import numpy as np
# Example arrays
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
# Find the squared difference: first subtract the arrays and then square the difference
squared_diff = # your code here
print(squared_diff)
# Sum of the squared distances:
sum_squared_diff = # your code here
print(sum_squared_diff)
# Square root of the sum:
euclidean_distance = # your code here
print(euclidean_distance)
print(f"Euclidean Distance between array1 and array2: {euclidean_distance:.2f}")
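Once your step-by-step version works, one possible way to cross-check it is np.linalg.norm(), which returns the length of a vector. The sketch below uses two made-up points (not the exercise arrays) whose distance is easy to verify by hand.

```python
import numpy as np

# Hypothetical points chosen so the distance is a whole number:
# sqrt(1^2 + 2^2 + 2^2) = sqrt(9) = 3
p = np.array([0, 0, 0])
q = np.array([1, 2, 2])

# The Euclidean distance is the norm (length) of the difference vector
distance = np.linalg.norm(q - p)
print(distance)  # 3.0
```

You can apply the same call to array1 and array2 and compare it with your own euclidean_distance.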
11.6.2. Exercise: Solving a linear system of equations#
You are tasked with solving the following system of linear equations for the variables x, y, and z:
Represent this system of linear equations as a matrix multiplication, i.e., A * X = B, where:
A is the coefficient matrix,
X is the column vector of variables (x, y, z), and
B is the column vector of constants on the right-hand side.
Complete the definition of the A and B arrays in the provided code below.
Find the inverse of matrix A using NumPy’s np.linalg.inv(). Then, use it to find the correct values for the variables x, y, and z.
import numpy as np
# Define the coefficient matrix A and the constant vector B
A = # your code here
B = # your code here
# Task 1: Calculate the inverse of matrix A
A_inv = # your code here
# Task 2: Solve for vector X using the matrix product X = A_inv @ B (note: * would multiply element-wise)
X = # your code here
# Print the results
print("Inverse of matrix A:")
print(A_inv)
print("\nSolution vector X (x, y, z):")
print(X)
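To illustrate the approach on a smaller problem, the sketch below solves a hypothetical two-variable system (2x + y = 5 and x + 3y = 10, which is not the system from this exercise) with the inverse, and compares the result with np.linalg.solve(), which solves the system directly without forming the inverse.

```python
import numpy as np

# Hypothetical system, used only for illustration:
#   2x +  y = 5
#    x + 3y = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([5.0, 10.0])

# Solution via the inverse: X = A_inv @ B (matrix product, not *)
X_inv = np.linalg.inv(A) @ B

# Solution via NumPy's dedicated solver
X_solve = np.linalg.solve(A, B)

print(X_inv)
print(np.allclose(X_inv, X_solve))  # True
print(np.allclose(A @ X_solve, B))  # True: the solution satisfies A X = B
```

Substituting the result back into A @ X, as in the last line, is a useful check for the exercise as well.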
11.6.3. Exercise: Underwater research - analyzing marine animal data#
You are part of a marine biologist team on an underwater research expedition. Your team has collected data on various marine animals. Each marine animal is represented by a row in a matrix, and their measurements are recorded in columns.
Your task is to analyze this data to gain insights into these fascinating creatures.
A 4x4 matrix representing data on four marine animals is given.
The columns represent the following attributes:
Size (in meters)
Swimming speed (in km/h)
Diet variety (number of different food types)
Average depth (in meters)
🐋Tasks:
Calculate and print the maximum size of any marine animal.
Calculate and print the average swimming speed across all marine animals.
Identify and print the name of the marine animal with the highest diet variety.
Identify and print the names of marine animals that have an average depth below 200 meters.
import numpy as np
# Will be given
# A 4x4 matrix for 4 marine animals and 4 attributes
# Rows represent marine animals, columns represent [size, speed, diet variety, depth]
data_matrix = np.array([
[2.5, 40, 3, 150], # Animal 1
[1.2, 25, 1, 300], # Animal 2
[3.0, 55, 5, 100], # Animal 3
[0.8, 20, 2, 250] # Animal 4
])
# Names of the marine animals
animal_names = ["Shark", "Dolphin", "Whale", "Seal"]
print("Marine Animal Data Matrix:")
print(data_matrix)
# Task 1: Calculate and print the maximum size of any marine animal.
max_size = # your code here
print(f"Maximum Size of any marine animal: {max_size} meters")
# Task 2: Calculate and print the average swimming speed across all marine animals.
avg_speed = # your code here
print(f"Average Swimming Speed of all marine animals: {avg_speed} km/h")
# Task 3: Identify and print the name of the marine animal with the highest diet variety.
# The index of the max diet variety
max_diet_index = # your code here
# Identify the name of the animal using the found index
max_diet_animal = # your code here
print("Marine animal with the highest diet variety:", max_diet_animal)
# Task 4: Identify and print the names of marine animals that have an average depth below 200 meters.
shallow_animals = # your code here
print("Marine Animals with Average Depth Below 200 meters:")
# Print the names of the animals from animal_names using for loop
for i in shallow_animals:
    print(animal_names[i])
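The tasks above combine column selection, np.argmax(), and np.where(). The following sketch shows these building blocks on a small made-up table (the data and the labels are hypothetical and unrelated to the marine animals), so you can adapt the pattern yourself.

```python
import numpy as np

# Hypothetical data: rows are items, columns are [length, weight]
table = np.array([[1.5, 10.0],
                  [0.5, 30.0],
                  [2.0, 20.0]])
labels = ["item A", "item B", "item C"]

weights = table[:, 1]  # select the second column for all rows

# Index of the largest weight, then look up the matching label
heaviest = labels[np.argmax(weights)]
print(heaviest)  # item B

# Row indices where the condition holds
light_idx = np.where(weights < 25)[0]
print([labels[i] for i in light_idx])  # ['item A', 'item C']
```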
11.6.4. Exercise: Student performance analysis#
You have data on different students’ performance in three courses: Intro2EE, Linear Circuit, and Digital Systems, during the first quarter. Each student is represented by a row, and their grades are recorded in columns. Your goal is to analyze this data by calculating and printing various statistics as indicated below:
Calculate the average grade of each student in the first quarter and print the results.
Calculate the average grade in each course and print the results.
First, find the indices (row, column) of the maximum grade in the student_data array, and then print the name of the student who got that grade.
Identify and print the names of the students who failed the Linear Circuit course.
Identify and print the names of students with grades below the average in Intro2EE.
Note that we are deliberately using a small array here so that you can check your results and validate them.
Let’s practice first.
import numpy as np
# a (5,4) matrix representing student grades
student_data = np.array([
[ "Sara", 8, 7, 6],
[ "Ahmed", 9, 5, 5],
[ "Maria", 7, 6, 8],
[ "Finn", 6, 4, 7],
[ "Aisha", 10, 9, 8]
])
11.6.4.1. Before we start#
Since we have a mix of names and grades, NumPy stores all elements of the array as strings. However, we need them as numbers to be able to calculate the statistics.
Therefore, to make things easier, it is good practice to separate the grades, save them in a new array, and use that array in our calculations.
# create an array only with names
names = # your code here
print ( names )
# Remove the names from the matrix
grades = # Your code here
print(grades)
As you can see, they are all strings, so we have to convert them to ints or floats.
grades = # your code here
print(grades)
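The idea of splitting a mixed array and converting the numeric part can be sketched on a tiny made-up array (the variable names mixed, labels, and values are hypothetical). One possible tool for the conversion is the astype() method.

```python
import numpy as np

# Hypothetical mixed array: NumPy stores every element as a string
mixed = np.array([["x", 1, 2.5],
                  ["y", 3, 4]])
print(mixed.dtype)  # a Unicode string dtype

labels = mixed[:, 0]   # first column: the text labels
values = mixed[:, 1:]  # remaining columns: numbers stored as strings

# Convert the strings to floating-point numbers
values = values.astype(float)
print(values)
print(values.mean())  # arithmetic now works on the converted array
```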
Now we’ll work on the tasks as described above.
11.6.4.2. Task 1#
To find the average you can use np.mean(). However, it’s important here to note that we want the average over all courses for each student. That means we have to average along the columns (axis = 1). It’s good practice to print the shape to also check that the number of averages matches the number of students.
# Calculate the average grade for each student
average_student_grades = # your code here
print(average_student_grades)
print(average_student_grades.shape)
11.6.4.3. Task 2#
Now calculate the average grade in each course and print the results, i.e., the average of all students.
# Calculate the average grade for courses
average_course_grades = # your code here
print(average_course_grades)
11.6.4.4. Task 3#
First, find the indices (row, column) of the maximum grade in the student_data array, and then print the name of the student who got that grade.
Tip
When you find the index of the maximum value with np.argmax(), convert the flat index of the maximum value to a 2D index using np.unravel_index(array_index, (shape)).
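To see how the flat index and the 2D index relate, here is a minimal sketch on a small made-up matrix m (not the grades array).

```python
import numpy as np

m = np.array([[4, 9, 2],
              [7, 1, 8]])  # shape (2, 3)

# Without an axis argument, np.argmax() returns the position
# in the flattened array [4, 9, 2, 7, 1, 8]
flat_index = np.argmax(m)
print(flat_index)  # 1

# Convert the flat position back to (row, column) for this shape
row, col = np.unravel_index(flat_index, m.shape)
print(row, col)     # 0 1
print(m[row, col])  # 9
```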
# Task 3: Identify and print the student who achieved the highest grade
# Find the index of the maximum value in the grades array
max_grade_index = # your code here
# Convert the flat index of the maximum value to a 2D index
row, col = np.unravel_index(# your code here)
# Print the index of the maximum value
print(row, col)
To retrieve the student’s name, we can use the first column of the initial student_data array together with the row index of the maximum value found above.
# Retrieve the student name from student_data using the found index
max_grade_student = # your code here #Same row of the max grade, and the first column
print (max_grade_student)
11.6.4.5. Task 4#
Identify and print the names of students who did not pass the Linear Circuit course. Note that LC grades are stored in the second column of the grades array, so we can select that column and check whether each grade is smaller than 6.
lc_fail = # your code here
print(lc_fail)
Note that as a result we get an array with “True” values for students (rows) with a failing grade. We can now use this to select those rows in other matrices.
print(names[lc_fail])
11.6.4.6. Task 5#
Identify and print the names of students with grades below the average in Intro2EE.
# Task 5: Identify and print the names of students with grades below the average in Intro2EE
# Find all of the Intro2EE grades
I2EE_grades = # your code here
print(I2EE_grades)
# Find the average
average_course_grade = # your code here
print(average_course_grade)
# Find all of the students with grades below average
I2EE_below_average = # your code here
print(I2EE_below_average)
# Only the rows corresponding to "True" values in the boolean array are selected.
11.6.4.7. Final remark#
Note that as you get better at coding, you can also combine all the single commands above into a single line. This makes your code more compact and lets you use fewer variables, but decreases the readability of your code. Here is an example where we use one single line on only the initial array to complete task 5.
# Task 5
I2EE_below_average = # your code here
print(I2EE_below_average)