Monday, December 25, 2017

Python NumPy

You are aware that the Python list is pretty powerful: A list can hold any type and can hold different types at the same time. You can also change, add and remove elements.

This is wonderful, but one feature is missing.

When analyzing data, you'll often want to carry out operations over entire collections of values, and you want to do this fast. With lists, this is a problem.

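Suppose we have two lists, `height` and `weight`, holding the heights and weights of the members of a family.

If you now want to calculate the Body Mass Index for each family member, you'd hope that a call like `weight / height ** 2` would work, making the calculations element-wise.

Unfortunately, Python throws an error: lists don't support arithmetic operators such as `**` and `/`, so Python has no idea how to do these calculations.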
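
To make the failure concrete, here is a minimal sketch. The numbers in `height` (metres) and `weight` (kilograms) are made-up illustration values, not data from this post.

```python
# Illustrative values only (assumed): heights in metres, weights in kilograms.
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

try:
    # Lists do not implement ** or /, so this line raises a TypeError.
    bmi = weight / height ** 2
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```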

You could solve this by going through each list element one after the other, and calculating the BMI for each person separately, but this is terribly inefficient and tiresome to write.
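
For comparison, the element-by-element approach could look roughly like the sketch below. The `height` and `weight` values are again made-up illustration data.

```python
# Illustrative values only (assumed): heights in metres, weights in kilograms.
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# Walk through both lists in parallel and compute one BMI at a time.
bmi = []
for h, w in zip(height, weight):
    bmi.append(w / h ** 2)

print([round(value, 2) for value in bmi])
```

It works, but every calculation needs its own explicit loop.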

A far more elegant solution is to use NumPy, short for Numerical Python. It's a Python package that, among other things, provides an alternative to the regular Python list: the NumPy array.

The NumPy array is pretty similar to a regular Python list, but has one additional feature: you can perform calculations over entire arrays.

It's really easy, and super-fast as well.

Let's start with _creating_ a NumPy array. You do this with NumPy's `array()` function: the input is a regular Python list. Calling `array()` twice, once per list, gives NumPy versions of the `height` and `weight` lists.

With those arrays, let's try to calculate everybody's BMI with a single call again.
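
A minimal sketch of both steps follows. The names `np_height` and `np_weight` and the numbers themselves are assumptions chosen for illustration; only `array()` and the `height` and `weight` lists come from the text.

```python
import numpy as np

# Illustrative values only (assumed): heights in metres, weights in kilograms.
height = [1.73, 1.68, 1.71, 1.89, 1.79]
weight = [65.4, 59.2, 63.6, 88.4, 68.7]

# Convert the regular lists to NumPy arrays.
np_height = np.array(height)
np_weight = np.array(weight)

# The same expression that failed on lists now works element-wise.
bmi = np_weight / np_height ** 2
print(bmi)
```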
First, we tried to do calculations with regular lists, but this gave us an error, because Python doesn't know how to do calculations with lists the way we want it to.

Next, these regular lists were converted to NumPy arrays. The same operations now work without any problem: NumPy knows how to work with arrays as if they were single values.

First of all, NumPy can do all of this so easily because it assumes that your NumPy array can only contain values of a single type. It's either an array of floats, or an array of booleans, and so on.

If you do try to create an array with different types, for example a float, a string and a boolean, the resulting NumPy array will contain a single type, string in this case. The boolean and the float are both converted to strings.
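
A small sketch of this conversion, using made-up values for the three elements:

```python
import numpy as np

# A float, a string and a boolean in one list (illustrative values).
mixed = np.array([1.0, "is", True])

# Every element has been converted to a string.
print(mixed)
print(mixed.dtype)  # a Unicode string dtype
```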

Second, you should know that a NumPy array is simply a new kind of Python type, like the float, string and list types from before. That means operators such as `+` can behave differently on it than on a list.

Take a Python list and a NumPy array that each hold the same three numbers, for example. If you do `python_list + python_list`, the list elements are pasted together, generating a list with 6 elements. If you do the same with the NumPy arrays, on the other hand, Python will do an element-wise sum: each element is added to its counterpart, giving an array of 3 elements.
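
The difference can be sketched as follows. The name `numpy_array` and the values 1, 2 and 3 are assumptions picked for illustration.

```python
import numpy as np

python_list = [1, 2, 3]
numpy_array = np.array([1, 2, 3])

# For lists, + concatenates: the result has 6 elements.
list_sum = python_list + python_list
print(list_sum)   # [1, 2, 3, 1, 2, 3]

# For NumPy arrays, + adds element by element: the result has 3 elements.
array_sum = numpy_array + numpy_array
print(array_sum)  # [2 4 6]
```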
Specifically for NumPy, there's also another way to do subsetting: using an array of booleans. Say you want to get all BMI values in the `bmi` array that are over 23. A first step is using the greater-than sign: `bmi > 23`.

The result is a NumPy array containing booleans: True if the corresponding BMI is above 23, False otherwise.

Next, you can use this boolean array inside square brackets to do subsetting: only the elements for which the boolean array is True are selected.
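
Both steps can be sketched like this. The `bmi` values are made-up illustration data, and `over_23` is a name introduced only for this example.

```python
import numpy as np

# Illustrative BMI values (assumed).
bmi = np.array([21.852, 20.975, 21.750, 24.747, 21.441])

# Step 1: the comparison yields a boolean array, one entry per BMI value.
over_23 = bmi > 23
print(over_23)       # [False False False  True False]

# Step 2: indexing with that boolean array keeps only the True positions.
print(bmi[over_23])  # [24.747]
```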
  • The NumPy package provides the array, a data type that can be used to do element-wise calculations.
  • Because NumPy arrays can only hold elements of a single type, calculations on NumPy arrays can be carried out much faster than on regular Python lists.
