Q3

While loops, conditionals, and loop control statements.

A

Write a function sum_of_list which computes the sum of all the numbers in a list.

  • takes 1 argument: a list of numbers
  • returns 1 number: the sum of all the numbers

You cannot use any built-in functions.
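One possible shape for this function is sketched below, purely as an illustration. It assumes "no built-in functions" rules out calls such as sum() and len() but still permits a plain for loop and the += operator.

```python
def sum_of_list(numbers):
    """Return the sum of every value in `numbers` without calling built-ins."""
    total = 0
    for value in numbers:
        total += value  # accumulate the running total one element at a time
    return total
```

Under this sketch an empty list sums to 0, which is an assumption; the assignment does not say what should happen in that case.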


In [ ]:


In [ ]:
import numpy as np
np.random.seed(65537)

i1 = np.random.random(100)
np.testing.assert_allclose(i1.sum(), sum_of_list(i1.tolist()))

i2 = np.random.random(10)
np.testing.assert_allclose(i2.sum(), sum_of_list(i2.tolist()))

B

Write a function unordered_percentile which finds the first element of an input list at which the running sum of the list reaches a given percentile of the list's total sum.

  • takes 2 arguments: a list of numbers, and a floating-point percentile value (between 0 and 100)
  • returns an integer, corresponding to the index in the input list of the first number at which the running sum reaches the specified percentile

A percentile generalizes the idea of a median: the median is the 50th percentile, meaning 50% of the numbers are below it and 50% are above it. Likewise, if some data point is in the 95th percentile, this means 95% of the data is less than the specific point in question.

As you can infer, this normally presumes the data are sorted. Here, we're not sorting the list beforehand, hence unordered. Instead, we compute the percentile from the data as observed, moving left to right through the list: keep a running sum of the numbers, and as soon as that sum reaches or exceeds the given percentage of the list's total sum, return the index in the list where that happened.

For example, if you have an input list [5, 4, 3, 2, 1] and a percentile value of 90, the total is 15, so the threshold is 90% of 15, or 13.5. The running sums are 5, 9, 12, 14, 15, so the function would return 3, as index 3 (with the number 2) is the first to cross the 90th percentile threshold. If, however, the list were reversed ([1, 2, 3, 4, 5]) with the same percentile value (90), the running sums would be 1, 3, 6, 10, 15 and the function would return 4 (corresponding to the number 5), since the running sum doesn't cross the 90th percentile until the very last value.

Use your function sum_of_list from Part A to help.
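The left-to-right procedure described above can be sketched as follows. This is one reading of the specification, not a definitive solution. It assumes the threshold is the given percentage of the list's total sum, that landing exactly on the threshold counts as crossing it, and that the numbers are non-negative. The -1 fallback for an empty list is a hypothetical choice the assignment does not specify.

```python
def sum_of_list(numbers):
    """Part A helper: sum the list without calling built-ins."""
    total = 0
    for value in numbers:
        total += value
    return total


def unordered_percentile(numbers, percentile):
    """Index of the first element whose running sum reaches the threshold."""
    # Assumption: the threshold is `percentile` percent of the total sum.
    threshold = sum_of_list(numbers) * percentile / 100
    running = 0
    index = 0  # manual counter, since enumerate() and range() are built-ins
    for value in numbers:
        running += value
        if running >= threshold:  # assumption: reaching the threshold counts
            return index
        index += 1
    # Only reached for an empty list (returns -1) under the stated assumptions.
    return index - 1
```

For [5, 4, 3, 2, 1] and a percentile of 90, the threshold is 13.5 and the running sums are 5, 9, 12, 14, so the sketch stops at index 3.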


In [ ]:


In [ ]:
i1 = [5, 4, 3, 2, 1]
p1 = 90
a1 = 3
assert a1 == unordered_percentile(i1, p1)

i2 = [1, 2, 3, 4, 5]
p2 = 90
a2 = 4
assert a2 == unordered_percentile(i2, p2)

i3 = [12, 46, 50, 54, 27,  8, 45, 39, 25, 37]
p3 = 50
a3 = 4
assert a3 == unordered_percentile(i3, p3)

C

Write a function ordered_percentile, which does the same thing as the function you wrote in Part B, except it first sorts the list in ascending order, then computes the percentile index.

  • takes 2 arguments: a list of numbers, and a floating-point percentile value (between 0 and 100)
  • returns an integer, corresponding to the index in the original, unsorted input list of the number that crosses the specified percentile in the sorted list

Percentile is computed exactly as before, but you'll need to sort the list beforehand. Hint: list objects in Python have an in-place sort() method you can use.

The important point here is that you need to return the index in the original, unsorted input list of the number that crosses the percentile threshold. For example, with an input list [5, 4, 3, 2, 1] and percentile value of 90, I'd first need to sort the list ([1, 2, 3, 4, 5]), then find the value which crossed the 90th percentile (5, since the running sums 1, 3, 6, 10 all fall short of the threshold of 13.5), then find the index of that number in the original, unsorted list and return that. In this case, that would be index 0.

Other than sort(), you cannot use any built-in functions.

Hint: reusing your function from Part B here vastly simplifies the problem.
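The sort-then-map-back idea can be sketched as below, as one possible approach rather than the required one. It assumes a non-empty list and that a list comprehension is an acceptable way to copy the input, so the in-place sort() does not destroy the original order. The helpers repeat the Part A and Part B sketches, with the same assumptions, so the block runs on its own.

```python
def sum_of_list(numbers):
    """Part A helper: sum the list without calling built-ins."""
    total = 0
    for value in numbers:
        total += value
    return total


def unordered_percentile(numbers, percentile):
    """Part B helper: index where the running sum reaches the threshold."""
    threshold = sum_of_list(numbers) * percentile / 100
    running = 0
    index = 0
    for value in numbers:
        running += value
        if running >= threshold:  # assumption: reaching the threshold counts
            return index
        index += 1
    return index - 1


def ordered_percentile(numbers, percentile):
    """Index, in the original list, of the value that crosses in sorted order."""
    # Copy first: sort() works in place and would otherwise lose the
    # original positions we need to report.
    ordered = [value for value in numbers]
    ordered.sort()
    target = ordered[unordered_percentile(ordered, percentile)]
    # Walk the original list to find where the crossing value lives.
    index = 0
    for value in numbers:
        if value == target:
            return index  # first match wins if the value is duplicated
        index += 1
```

If the crossing value appears more than once, this sketch returns its first position in the original list; the assignment does not say which occurrence to prefer.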


In [ ]:


In [ ]:
i1 = [5, 4, 3, 2, 1]
p1 = 90
a1 = 0
assert a1 == ordered_percentile(i1, p1)

i2 = [1, 2, 3, 4, 5]
p2 = 90
a2 = 4
assert a2 == ordered_percentile(i2, p2)

i3 = [12, 46, 50, 54, 27,  8, 45, 39, 25, 37]
p3 = 50
a3 = 6
assert a3 == ordered_percentile(i3, p3)

D

Write a function invariant_percentile which determines if the percentile of a given list of numbers is the same, regardless of whether the list is ordered or unordered.

  • takes 2 arguments: a list of numbers, and a floating-point value between 0 and 100 to indicate the percentile threshold
  • returns True if the percentile value is the same whether the list is ordered or unordered, False otherwise

In this function, you'll need to compute both the unordered and ordered percentiles of the input list (feel free to use your solutions from Part B and Part C!) and see if the resulting value that crosses the percentile threshold is the same. If so, return True; if not, return False.

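For example, the list [5, 4, 3, 2, 1] with a percentile of 60 would be True: the threshold is 60% of 15, or 9. In the unordered (original) input list, the running sum reaches 9 at the 4 (5 + 4 = 9), so the 4 would cross the 60th percentile, while in the ordered list ([1, 2, 3, 4, 5]), the running sum first reaches 9 at the 4 as well (1 + 2 + 3 + 4 = 10), so the 4 would still be the value that crosses the 60th percentile.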
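A minimal sketch of the comparison follows, again as an illustration rather than the expected answer. It reuses the Part A through Part C sketches, inheriting their assumptions: a non-empty list of non-negative numbers, and a running sum that lands exactly on the threshold counts as crossing it. It compares the values found at the two indices, not the indices themselves.

```python
def sum_of_list(numbers):
    """Part A helper: sum the list without calling built-ins."""
    total = 0
    for value in numbers:
        total += value
    return total


def unordered_percentile(numbers, percentile):
    """Part B helper: index where the running sum reaches the threshold."""
    threshold = sum_of_list(numbers) * percentile / 100
    running = 0
    index = 0
    for value in numbers:
        running += value
        if running >= threshold:  # assumption: reaching the threshold counts
            return index
        index += 1
    return index - 1


def ordered_percentile(numbers, percentile):
    """Part C helper: original index of the value that crosses when sorted."""
    ordered = [value for value in numbers]  # copy so sort() leaves input alone
    ordered.sort()
    target = ordered[unordered_percentile(ordered, percentile)]
    index = 0
    for value in numbers:
        if value == target:
            return index
        index += 1


def invariant_percentile(numbers, percentile):
    """True if the same value crosses the threshold in both orderings."""
    # Both helpers return indices into the original list, so look up the
    # values there and compare those.
    unordered_value = numbers[unordered_percentile(numbers, percentile)]
    ordered_value = numbers[ordered_percentile(numbers, percentile)]
    return unordered_value == ordered_value
```

With [5, 4, 3, 2, 1] and a percentile of 60, both orderings land on the value 4, so the sketch returns True; with a percentile of 90 the unordered pass stops at 2 and the ordered pass at 5, so it returns False.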


In [ ]:


In [ ]:
i1 = [5, 4, 3, 2, 1]
p1 = 60
assert invariant_percentile(i1, p1)

i2 = [5, 4, 3, 2, 1]
p2 = 90
assert not invariant_percentile(i2, p2)

i3 = [12, 46, 50, 54, 27,  8, 45, 39, 25, 37]
p3 = 3
assert invariant_percentile(i3, p3)