Merged
38 changes: 13 additions & 25 deletions episodes/02-numpy.md
Original file line number Diff line number Diff line change
@@ -340,15 +340,15 @@ We'll also use multiple assignment,
a convenient Python feature that will enable us to do this all in one line.

```python
maxval, minval, stdval = numpy.amax(data), numpy.amin(data), numpy.std(data)
maxval, minval, stdval = numpy.max(data), numpy.min(data), numpy.std(data)

print('maximum inflammation:', maxval)
print('minimum inflammation:', minval)
print('standard deviation:', stdval)
```

Here we've assigned the return value from `numpy.amax(data)` to the variable `maxval`, the value
from `numpy.amin(data)` to `minval`, and so on.
Here we've assigned the return value from `numpy.max(data)` to the variable `maxval`, the value
from `numpy.min(data)` to `minval`, and so on.

```output
maximum inflammation: 20.0
@@ -374,18 +374,6 @@ and press the <kbd>Tab</kbd> key twice for a listing of what is available. You c
for example: `help(numpy.cumprod)`.


::::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::::::: callout

## Confusing Function Names

One might wonder why the functions are called `amax` and `amin` and not `max` and `min` or why the other is called `mean` and not `amean`.
The package `numpy` does provide functions `max` and `min` that are fully equivalent to `amax` and `amin`, but they share a name with standard library functions `max` and `min` that come with Python itself.
Referring to the functions like we did above, that is `numpy.max` for example, does not cause problems, but there are other ways to refer to them that could.
In addition, text editors might highlight (color) these functions like standard library function, even though they belong to NumPy, which can be confusing and lead to errors.
Since there is no function called `mean` in the standard library, there is no function called `amean`.

::::::::::::::::::::::::::::::::::::::::::::::::::
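
The name clash described in this callout can be sketched with a small made-up array (`readings` is illustrative and not part of the lesson data). The assumption here is that the "other ways to refer to them" end up calling Python's built-in `max`, for example after a star import is undone or a name is mistyped. The built-in iterates over the rows of a 2D array and fails, while `numpy.max` reduces over every element.

```python
import numpy

# Illustrative data, not taken from the inflammation files.
readings = numpy.array([[1.0, 5.0], [4.0, 2.0]])

# numpy.max reduces over every element of the array.
largest = numpy.max(readings)
print('numpy.max:', largest)

# The built-in max compares whole rows with each other. NumPy cannot
# collapse that element-wise comparison to a single True/False,
# so the call raises ValueError.
try:
    max(readings)
    builtin_failed = False
except ValueError as error:
    builtin_failed = True
    print('built-in max failed:', error)
```

Writing the module prefix explicitly, as in `numpy.max`, keeps the two functions apart.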

When analyzing data, though,
@@ -397,7 +385,7 @@ then ask it to do the calculation:

```python
patient_0 = data[0, :] # 0 on the first axis (rows), everything on the second (columns)
print('maximum inflammation for patient 0:', numpy.amax(patient_0))
print('maximum inflammation for patient 0:', numpy.max(patient_0))
```

```output
@@ -408,7 +396,7 @@ We don't actually need to store the row in a variable of its own.
Instead, we can combine the selection and the function call:

```python
print('maximum inflammation for patient 2:', numpy.amax(data[2, :]))
print('maximum inflammation for patient 2:', numpy.max(data[2, :]))
```

```output
@@ -420,11 +408,11 @@ next diagram on the left) or the average for each day (as in the
diagram on the right)? As the diagram below shows, we want to perform the
operation across an axis:

![](fig/python-operations-across-axes.png){alt="Per-patient maximum inflammation is computed row-wise across all columns using numpy.amax(data, axis=1). Per-day average inflammation is computed column-wise across all rows using numpy.mean(data, axis=0)."}
![](fig/python-operations-across-axes.svg){alt="Per-patient maximum inflammation is computed row-wise across all columns using numpy.max(data, axis=1). Per-day average inflammation is computed column-wise across all rows using numpy.mean(data, axis=0)."}

To find the **maximum inflammation reported for each patient**, you would apply the `max` function moving across the columns (axis 1). To find the **daily average inflammation reported across patients**, you would apply the `mean` function moving down the rows (axis 0).
To find the **maximum inflammation reported for each patient**, you would apply the `max` function moving across the columns (axis 1). To find the **daily average inflammation reported across patients**, you would apply the `mean` function moving down the rows (axis 0).
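
A minimal sketch of the two directions, using a made-up 2 x 3 array (`toy` and the result names are illustrative, not part of the lesson): `axis=1` collapses the columns and leaves one value per row, while `axis=0` collapses the rows and leaves one value per column.

```python
import numpy

# Illustrative 2 x 3 array: 2 "patients" (rows), 3 "days" (columns).
toy = numpy.array([[1.0, 5.0, 3.0],
                   [4.0, 2.0, 6.0]])

per_patient_max = numpy.max(toy, axis=1)   # one value per row
per_day_mean = numpy.mean(toy, axis=0)     # one value per column

print(per_patient_max)  # [5. 6.]
print(per_day_mean)     # [2.5 3.5 4.5]
```

The shape of each result matches the axis that was kept: 2 values for the 2 rows, 3 values for the 3 columns.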

To support this functionality, most array functions allow us to specify the axis we want to work on. If we ask for the max across axis 1 (columns in our 2D example), we get:
To support this functionality, most array functions allow us to specify the axis we want to work on. If we ask for the maximum across axis 1 (columns in our 2D example), we get:

```python
print(numpy.max(data, axis=1))
@@ -437,7 +425,7 @@ print(numpy.max(data, axis=1))
17. 16. 17. 19. 18. 18.]
```

As a quick check, we can ask this array what its shape is. We expect 60 patient maximums:
As a quick check, we can ask this array what its shape is. We expect 60 patient maxima:

```python
print(numpy.max(data, axis=1).shape)
@@ -779,11 +767,11 @@ it matter if the change in inflammation is an increase or a decrease?

## Solution

By using the `numpy.amax()` function after you apply the `numpy.diff()`
By using the `numpy.max()` function after you apply the `numpy.diff()`
function, you will get the largest difference between days.

```python
numpy.amax(numpy.diff(data, axis=1), axis=1)
numpy.max(numpy.diff(data, axis=1), axis=1)
```

```python
@@ -804,7 +792,7 @@ Notice the difference if you get the largest *absolute* difference
between readings.

```python
numpy.amax(numpy.absolute(numpy.diff(data, axis=1)), axis=1)
numpy.max(numpy.absolute(numpy.diff(data, axis=1)), axis=1)
```

```python
@@ -831,7 +819,7 @@ array([ 12., 14., 11., 13., 11., 13., 10., 12., 10., 10., 10.,
- Array indices start at 0, not 1.
- Use `low:high` to specify a `slice` that includes the indices from `low` to `high-1`.
- Use `# some kind of explanation` to add comments to programs.
- Use `numpy.mean(array)`, `numpy.amax(array)`, and `numpy.amin(array)` to calculate simple statistics.
- Use `numpy.mean(array)`, `numpy.max(array)`, and `numpy.min(array)` to calculate simple statistics.
- Use `numpy.mean(array, axis=0)` or `numpy.mean(array, axis=1)` to calculate statistics across the specified axis.

::::::::::::::::::::::::::::::::::::::::::::::::::
22 changes: 11 additions & 11 deletions episodes/03-matplotlib.md
@@ -79,14 +79,14 @@ the medication takes 3 weeks to take effect. But a good data scientist doesn't
average of a dataset, so let's have a look at two other statistics:

```python
max_plot = matplotlib.pyplot.plot(numpy.amax(data, axis=0))
max_plot = matplotlib.pyplot.plot(numpy.max(data, axis=0))
matplotlib.pyplot.show()
```

![](fig/inflammation-01-maximum.svg){alt='A line graph showing the maximum inflammation across all patients over a 40-day period.'}

```python
min_plot = matplotlib.pyplot.plot(numpy.amin(data, axis=0))
min_plot = matplotlib.pyplot.plot(numpy.min(data, axis=0))
matplotlib.pyplot.show()
```

@@ -127,10 +127,10 @@ axes1.set_ylabel('average')
axes1.plot(numpy.mean(data, axis=0))

axes2.set_ylabel('max')
axes2.plot(numpy.amax(data, axis=0))
axes2.plot(numpy.max(data, axis=0))

axes3.set_ylabel('min')
axes3.plot(numpy.amin(data, axis=0))
axes3.plot(numpy.min(data, axis=0))

fig.tight_layout()

@@ -215,7 +215,7 @@ Update your plotting code to automatically set a more appropriate scale.
```python
# One method
axes3.set_ylabel('min')
axes3.plot(numpy.amin(data, axis=0))
axes3.plot(numpy.min(data, axis=0))
axes3.set_ylim(0, 6)
```

@@ -227,10 +227,10 @@ axes3.set_ylim(0, 6)

```python
# A more automated approach
min_data = numpy.amin(data, axis=0)
min_data = numpy.min(data, axis=0)
axes3.set_ylabel('min')
axes3.plot(min_data)
axes3.set_ylim(numpy.amin(min_data), numpy.amax(min_data) * 1.1)
axes3.set_ylim(numpy.min(min_data), numpy.max(min_data) * 1.1)
```

:::::::::::::::::::::::::
@@ -269,10 +269,10 @@ axes1.set_ylabel('average')
axes1.plot(numpy.mean(data, axis=0), drawstyle='steps-mid')

axes2.set_ylabel('max')
axes2.plot(numpy.amax(data, axis=0), drawstyle='steps-mid')
axes2.plot(numpy.max(data, axis=0), drawstyle='steps-mid')

axes3.set_ylabel('min')
axes3.plot(numpy.amin(data, axis=0), drawstyle='steps-mid')
axes3.plot(numpy.min(data, axis=0), drawstyle='steps-mid')

fig.tight_layout()

@@ -336,10 +336,10 @@ axes1.set_ylabel('average')
axes1.plot(numpy.mean(data, axis=0))

axes2.set_ylabel('max')
axes2.plot(numpy.amax(data, axis=0))
axes2.plot(numpy.max(data, axis=0))

axes3.set_ylabel('min')
axes3.plot(numpy.amin(data, axis=0))
axes3.plot(numpy.min(data, axis=0))

fig.tight_layout()

8 changes: 4 additions & 4 deletions episodes/06-files.md
@@ -74,10 +74,10 @@ for filename in filenames:
    axes1.plot(numpy.mean(data, axis=0))

    axes2.set_ylabel('max')
    axes2.plot(numpy.amax(data, axis=0))
    axes2.plot(numpy.max(data, axis=0))

    axes3.set_ylabel('min')
    axes3.plot(numpy.amin(data, axis=0))
    axes3.plot(numpy.min(data, axis=0))

    fig.tight_layout()
    matplotlib.pyplot.show()
@@ -199,10 +199,10 @@ axes1.set_ylabel('average')
axes1.plot(numpy.mean(composite_data, axis=0))

axes2.set_ylabel('max')
axes2.plot(numpy.amax(composite_data, axis=0))
axes2.plot(numpy.max(composite_data, axis=0))

axes3.set_ylabel('min')
axes3.plot(numpy.amin(composite_data, axis=0))
axes3.plot(numpy.min(composite_data, axis=0))

fig.tight_layout()

18 changes: 9 additions & 9 deletions episodes/07-cond.md
@@ -174,8 +174,8 @@ if maximum inflammation in the beginning (day 0) and in the middle (day 20) of
the study are equal to the corresponding day numbers.

```python
max_inflammation_0 = numpy.amax(data, axis=0)[0]
max_inflammation_20 = numpy.amax(data, axis=0)[20]
max_inflammation_0 = numpy.max(data, axis=0)[0]
max_inflammation_20 = numpy.max(data, axis=0)[20]

if max_inflammation_0 == 0 and max_inflammation_20 == 20:
    print('Suspicious looking maxima!')
@@ -186,7 +186,7 @@ the minima per day were all zero (looks like a healthy person snuck into our stu
We can also check for this with an `elif` condition:

```python
elif numpy.sum(numpy.amin(data, axis=0)) == 0:
elif numpy.sum(numpy.min(data, axis=0)) == 0:
    print('Minima add up to zero!')
```
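
As a sketch of how the whole `if`/`elif`/`else` chain behaves, the checks can be wrapped in a function and run on synthetic arrays. The function name `classify` and the arrays `ramp`, `flat`, and `noisy` are assumptions made for illustration; they stand in for the CSV files and are not part of the lesson.

```python
import numpy


def classify(data):
    """Return the message the if/elif/else chain would print."""
    max_per_day = numpy.max(data, axis=0)
    if max_per_day[0] == 0 and max_per_day[20] == 20:
        return 'Suspicious looking maxima!'
    elif numpy.sum(numpy.min(data, axis=0)) == 0:
        return 'Minima add up to zero!'
    else:
        return 'Seems OK!'


# Synthetic 3 x 40 arrays standing in for the CSV files (assumption).
ramp = numpy.tile(numpy.arange(40.0), (3, 1))  # day d always has value d
flat = numpy.ones((3, 40))
flat[0, :] = 0.0                               # one all-zero "patient"
noisy = numpy.full((3, 40), 2.0)               # no zeros, constant maxima

print(classify(ramp))
print(classify(flat))
print(classify(noisy))
```

Only the first branch whose condition is true runs, so `ramp` never reaches the `elif` test even though its day-0 minimum is also zero.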

@@ -202,12 +202,12 @@ Let's test that out:
```python
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')

max_inflammation_0 = numpy.amax(data, axis=0)[0]
max_inflammation_20 = numpy.amax(data, axis=0)[20]
max_inflammation_0 = numpy.max(data, axis=0)[0]
max_inflammation_20 = numpy.max(data, axis=0)[20]

if max_inflammation_0 == 0 and max_inflammation_20 == 20:
    print('Suspicious looking maxima!')
elif numpy.sum(numpy.amin(data, axis=0)) == 0:
elif numpy.sum(numpy.min(data, axis=0)) == 0:
    print('Minima add up to zero!')
else:
    print('Seems OK!')
@@ -220,12 +220,12 @@ Suspicious looking maxima!
```python
data = numpy.loadtxt(fname='inflammation-03.csv', delimiter=',')

max_inflammation_0 = numpy.amax(data, axis=0)[0]
max_inflammation_20 = numpy.amax(data, axis=0)[20]
max_inflammation_0 = numpy.max(data, axis=0)[0]
max_inflammation_20 = numpy.max(data, axis=0)[20]

if max_inflammation_0 == 0 and max_inflammation_20 == 20:
    print('Suspicious looking maxima!')
elif numpy.sum(numpy.amin(data, axis=0)) == 0:
elif numpy.sum(numpy.min(data, axis=0)) == 0:
    print('Minima add up to zero!')
else:
    print('Seems OK!')
22 changes: 11 additions & 11 deletions episodes/08-func.md
@@ -217,10 +217,10 @@ def visualize(filename):
    axes1.plot(numpy.mean(data, axis=0))

    axes2.set_ylabel('max')
    axes2.plot(numpy.amax(data, axis=0))
    axes2.plot(numpy.max(data, axis=0))

    axes3.set_ylabel('min')
    axes3.plot(numpy.amin(data, axis=0))
    axes3.plot(numpy.min(data, axis=0))

    fig.tight_layout()
    matplotlib.pyplot.show()
@@ -234,9 +234,9 @@ def detect_problems(filename):

    data = numpy.loadtxt(fname=filename, delimiter=',')

    if numpy.amax(data, axis=0)[0] == 0 and numpy.amax(data, axis=0)[20] == 20:
    if numpy.max(data, axis=0)[0] == 0 and numpy.max(data, axis=0)[20] == 20:
        print('Suspicious looking maxima!')
    elif numpy.sum(numpy.amin(data, axis=0)) == 0:
    elif numpy.sum(numpy.min(data, axis=0)) == 0:
        print('Minima add up to zero!')
    else:
        print('Seems OK!')
@@ -317,12 +317,12 @@ It's hard to tell from the default output whether the result is correct,
but there are a few tests that we can run to reassure us:

```python
print('original min, mean, and max are:', numpy.amin(data), numpy.mean(data), numpy.amax(data))
print('original min, mean, and max are:', numpy.min(data), numpy.mean(data), numpy.max(data))
offset_data = offset_mean(data, 0)
print('min, mean, and max of offset data are:',
      numpy.amin(offset_data),
      numpy.min(offset_data),
      numpy.mean(offset_data),
      numpy.amax(offset_data))
      numpy.max(offset_data))
```

```output
@@ -779,8 +779,8 @@ then the replacement for a value `v` should be `(v-L) / (H-L)`.)

```python
def rescale(input_array):
    L = numpy.amin(input_array)
    H = numpy.amax(input_array)
    L = numpy.min(input_array)
    H = numpy.max(input_array)
    output_array = (input_array - L) / (H - L)
    return output_array
```
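
A short usage sketch of the post-change version of `rescale`, applied to a made-up three-element array (`sample` and `scaled` are illustrative names, not part of the lesson):

```python
import numpy


def rescale(input_array):
    """Rescale values linearly so they lie between 0.0 and 1.0."""
    L = numpy.min(input_array)
    H = numpy.max(input_array)
    output_array = (input_array - L) / (H - L)
    return output_array


sample = numpy.array([2.0, 4.0, 6.0])  # illustrative input
scaled = rescale(sample)
print(scaled)  # [0.  0.5 1. ]
```

The smallest value maps to 0.0 and the largest to 1.0. An array whose values are all equal makes `H - L` zero, so this version would divide by zero for such input.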
@@ -836,8 +836,8 @@ do the two functions always behave the same way?
```python
def rescale(input_array, low_val=0.0, high_val=1.0):
    """rescales input array values to lie between low_val and high_val"""
    L = numpy.amin(input_array)
    H = numpy.amax(input_array)
    L = numpy.min(input_array)
    H = numpy.max(input_array)
    intermed_array = (input_array - L) / (H - L)
    output_array = intermed_array * (high_val - low_val) + low_val
    return output_array
2 changes: 1 addition & 1 deletion episodes/10-defensive.md
@@ -527,7 +527,7 @@ can you think of a function that will pass your tests but not his/hers or vice v
# a possible pre-condition:
assert len(input_array) > 0, 'Array length must be non-zero'
# a possible post-condition:
assert numpy.amin(input_array) <= average <= numpy.amax(input_array), \
assert numpy.min(input_array) <= average <= numpy.max(input_array), \
    'Average should be between min and max of input values (inclusive)'
```
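
To see the two assertions in context, here is a minimal sketch that places them inside a function. The name `checked_mean` is hypothetical and the use of `numpy.mean` to compute the average is an assumption; the lesson's own function may differ.

```python
import numpy


def checked_mean(input_array):
    """Mean with a pre- and a post-condition (hypothetical helper)."""
    # Pre-condition: an empty array has no meaningful average.
    assert len(input_array) > 0, 'Array length must be non-zero'
    average = numpy.mean(input_array)
    # Post-condition: the parentheses let the message sit on its own line.
    assert numpy.min(input_array) <= average <= numpy.max(input_array), (
        'Average should be between min and max of input values (inclusive)'
    )
    return average


result = checked_mean(numpy.array([1.0, 2.0, 6.0]))
print(result)  # 3.0

# An empty array trips the pre-condition before any arithmetic happens.
try:
    checked_mean(numpy.array([]))
    empty_rejected = False
except AssertionError as error:
    empty_rejected = True
    print('rejected:', error)
```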
