The array is a data structure found in most programming languages. It is used to represent ordered collections of data. Having said that, I suppose it might also occur to you that lists seem to be such a structure too. They really are. A list may be used as an array in Python. Besides, there is also a special array type in Python. Then why does NumPy still need a data structure like ndarray? We'll get to that shortly.

Lists and tuples can be used as arrays, but they still differ slightly from arrays. In a list, for example, the elements may be of any type, so what the list actually stores are pointers to objects. For example, to save a simple list containing 4 elements, like [1, 2, 3, 4], it needs 4 pointers and 4 integer objects. That wastes memory and computing time. The "array" module, for its part, creates arrays through the array() function. It also provides insert(), append() and similar functions. But it does not support N-dimensional arrays and its functions are not very rich, so it is not frequently used. By contrast, "ndarray" is the basic data structure in NumPy, also known as "array". As its name suggests, it is an N-dimensional array. All elements in the array are of the same type. ndarray comes with abundant functions, and most of them perform very well.

Well, let's learn about "ndarray" first. Here is a 5*5 array. Let's analyze it. We may regard this 5*5 array as one composed of five one-dimensional arrays, each consisting of 5 elements. Here, this is the first dimension, known as axis 0. This is the second dimension, known as axis 1. The number of axes is the "rank". There are 2 axes in this array, so its rank is 2.

Then, let's look at "axis". Many functions accept an argument called axis, like axis = 0. What does it mean? It means operating along axis 0, which in practice means operating on each column. How about axis = 1?
It obviously means operating on each row.

"ndarray" has some basic attributes. For example, "ndim" gives its rank, "shape" gives its dimensions, and "size" gives the total number of elements. For example, look at this array here. Its rank is 2. Its shape is (5, 5). The total number of elements is 25.

Well, next, let's look at how to create an "array". NumPy provides many functions for creating ndarray. The most basic one is array(). We can create a one-dimensional array or a two-dimensional array with it. In addition, functions such as arange(), random() and linspace() are often used to create arrays. arange(): this one should be familiar to you. It's similar to our range(), but it can handle floating-point numbers. random() returns random numbers. The linspace() function is also very common. It creates a linearly spaced array between a starting point and an ending point with a specified number of values. Sure, the ending point may be excluded: set endpoint to False. It is True by default. Besides, we also often use ones(), zeros() and similar functions to create such basic arrays.

Moreover, there's a very special function: fromfunction(). It creates an array of a given shape from a function. The function here is like this. In fact, we may write this function so that it returns (i+1)*(j+1). Have a look. Does it actually generate a 9*9 multiplication table? Its effect is like this, from "one times one equals one" to "nine times nine equals eighty-one". Funny, right?

Let's briefly demonstrate them. Create an array with the array() function. Suppose we're creating a 2*3 array. Let's look at the attributes of this array. This is its rank, its shape and the number of its elements. Let's create another array. For example, we use the arange() function. y is successfully created.

ndarray provides many methods of operation and calculation that enable us to get our desired result. For example, let's have a look.
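To make the axis and attribute discussion above concrete, here is a minimal sketch. The 5*5 array filled with 0 to 24 is an assumption made only for illustration, since the values on the slide are not in the transcript.

```python
import numpy as np

# Assumed example data: a 5x5 array holding 0..24.
a = np.arange(25).reshape(5, 5)

rank = a.ndim      # number of axes
dims = a.shape     # length along each axis
count = a.size     # total number of elements

# axis=0 runs down the rows, so we get one result per column.
col_sums = a.sum(axis=0)
# axis=1 runs across the columns, so we get one result per row.
row_sums = a.sum(axis=1)

print(rank, dims, count)
print(col_sums)
print(row_sums)
```

Summing is used here only because it makes the direction of each axis easy to see; other functions that take an axis argument follow the same rule.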
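The creation functions mentioned above can be sketched roughly as follows. All the concrete values, and the helper name times, are assumptions chosen for illustration and are not the ones shown in the lecture.

```python
import numpy as np

one_d = np.array([1, 2, 3])                 # one-dimensional array
two_d = np.array([[1, 2, 3], [4, 5, 6]])    # two-dimensional array, 2x3

steps = np.arange(0, 1, 0.25)               # like range(), but floats are allowed
rand = np.random.random((2, 3))             # random samples in [0, 1)

closed = np.linspace(0, 1, 5)                      # ending point included (default)
half_open = np.linspace(0, 1, 5, endpoint=False)   # ending point excluded

ones = np.ones((2, 3))
zeros = np.zeros((2, 3))

def times(i, j):
    # Hypothetical helper: i and j are index grids starting at 0, hence the +1.
    return (i + 1) * (j + 1)

# A 9x9 multiplication table, from 1*1 up to 9*9.
table = np.fromfunction(times, (9, 9), dtype=int)
print(table)
```

Note that fromfunction() calls the function once with whole index arrays rather than once per element, which is why the body is written as an array expression.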
This symbol must be quite familiar: slicing. Then what does it mean if we choose 0 to 2? It means we choose Row 0 and Row 1. If we write nothing for the first dimension, it means we want all rows. Then let's look at the 0 and 1 at the back. This is the column part, meaning Column 0 and Column 1. So the result is 1, 2, 4, 5. We may also give both dimensions in the index. Here we choose Row 1, so, as we see, the result for Column 0 and Column 1 of Row 1 is 4 and 5. We may also traverse an array with such a method.

Let's do actual operations. First, define an array of 3 rows and 3 columns. To select the data of the 1st row, or the data of the 1st and 3rd columns, what should we do? We may use x[0] to select the 1st row and x[:, [0,2]] to select the 1st and 3rd columns. Furthermore, if we want to select the 1st row and the 3rd row, does it work to just add a step length, written like this? Obviously, the 1st and 3rd columns can be selected in this way as well. Sure, both ways of writing here actually select the odd-numbered rows and columns of the array. If written like this, what will the result be? Compare them. As we see, this way of writing exchanges the rows, similar to the way we write it for list processing. What if we want to exchange columns? Will this way alone suffice?

We may also operate on ndarray with some functions. For example, as we see, the shape of this array is 2 * 3. When printing, we may change its shape with the reshape() function. We may change it into a 3 * 2 array. Look at the result.
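The slicing and indexing expressions described above might look like this in code. The 3*3 array holding 1 to 9 is an assumption, and so are the step-based expressions, since the transcript does not show exactly what was typed.

```python
import numpy as np

# Assumed example data: 3 rows and 3 columns holding 1..9.
x = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

first_row = x[0]             # the 1st row
cols_0_2 = x[:, [0, 2]]      # all rows, 1st and 3rd columns
block = x[0:2, 0:2]          # Rows 0-1 and Columns 0-1
pair = x[1, [0, 1]]          # Column 0 and Column 1 of Row 1

odd_rows = x[::2]            # step of 2 over rows: 1st and 3rd rows
odd_cols = x[:, ::2]         # step of 2 over columns: 1st and 3rd columns

rows_swapped = x[::-1]       # negative step reverses the row order
cols_swapped = x[:, ::-1]    # same idea applied to the columns

# Traversing the array row by row.
for row in x:
    print(row)
```

Reversing the columns needs the leading colon for the row dimension; writing x[::-1] alone only affects the rows.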
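A minimal sketch of the reshape() step just described, together with the in-place resize() and the -1 shorthand that come up next in the talk. The variable names and values are assumptions for illustration; a separate array is used for resize() so the two behaviours can be compared side by side.

```python
import numpy as np

a_array = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)

# reshape() returns the data in a new shape; a_array itself is untouched.
b = a_array.reshape(3, 2)
print(b)
print(a_array.shape)

# resize() changes the array it is called on.
c = np.array([[1, 2, 3], [4, 5, 6]])
c.resize(3, 2)
print(c.shape)

# 4 rows and 4 columns with values 1..16: arange() combined with reshape().
x = np.arange(1, 17).reshape(4, 4)

# -1 asks NumPy to work out the remaining dimension from the element count.
wide = x.reshape(2, -1)      # 16 elements in 2 rows
column = x.reshape(-1, 1)    # a single column
print(wide.shape, column.shape)
```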
Has it changed? Interesting, isn't it? With this method, however, as we see, the original shape of aArray remains the same. If we really want to change its shape, we may use the resize() function. As we see, the result has changed.

Let's demonstrate it. Suppose we're to create an array of 4 rows and 4 columns whose values range from 1 to 16. How can we do that? Combining the np.arange() function with reshape() does it, right? Very convenient, right? Apart from designating the rows and columns for the transformation, the value -1 is often used in scientific computing as well, which is very useful. For instance, x.reshape(2, -1) means transforming the array x into 2 rows and n columns. Here, based on the array and the dimension already given, it can be computed that n is 8. Since it's the only possible value, it doesn't need to be written explicitly and is just expressed as -1. One more example. For machine learning tasks, we often need expressions like x.reshape(-1, 1) to change the shape of an array to facilitate some vector computation.

There's another pair of frequently used functions, vstack() and hstack(): one stacks vertically and the other stacks horizontally.

For ndarray, there are also some basic operators, like multiplication and addition. Besides, here we see two very special arrays. They can even be added. Why can the operation on these two arrays succeed? That's due to a highly important feature of NumPy arrays: broadcasting. When the shapes of the two arrays being operated on are different, the broadcasting mechanism is triggered and NumPy compares the dimensions of the two arrays one by one. Only when they are equal, or the length of one of them is 1, can the operation work. For example, arrays a and b both have 3 columns, and one dimension of a is 1. During processing, the operation runs along this dimension and the first group of values in this dimension is reused. So a is expanded to two rows of [1,2,3], and thus the computation is like this. An array can also
be operated on with a scalar value, like 2. If 2 is added to array b, it is expanded to an array with the same shape as b and all values equal to 2. Then it is added to each element of b. The operation is like this.

Besides, we may use some methods to perform various operations on arrays, like sum(), to get the sum. As we mentioned before, we may apply the axis argument to add numbers by column or by row. This returns the minimum value. This returns the index of the maximum value. In this instance here, the value and the index happen to be the same. And you must know this one, getting the average value. var and std mean variance and standard deviation, respectively.

There are also some dedicated applications for ndarray, like the very common ones in linear algebra. We may resort to some functions in NumPy for various calculations. For example, here, we may use the dot() function to calculate the inner product of matrices. In addition, in NumPy, there's a "linalg" module, sharing the same name as the one in the SciPy library. It can be roughly regarded as a subset of the corresponding module in SciPy. We can also apply many of its functions for various operations. The det() function, for example, can be used to calculate the determinant of a matrix, and inv() to calculate the inverse matrix.

To wrap up ndarray, let's talk about the ufunc, i.e., the universal function. A universal function in NumPy is a kind of function that operates on each element of an array, like these. These are not all of them. We can view all the function names on the official website. Universal functions have some methods of their own, like reduce() and accumulate(). These functions are implemented at the C language level, so the calculating speed is very high. With large amounts of data in particular, they are faster than the corresponding functions in "math". Let's look at an example like this. It calculates the square of sin(t) for a group of numbers. We use a function in "time" to
calculate the running time. Compare the two. Let's try this program. Well, let's execute it. As we see, the execution result is like this. Using the pow() and sin() functions in the "math" library, the running time is like this, while the running time with the universal function is like this. As we see, the difference is quite big. For this reason, when dealing with large amounts of data, we should prefer the universal functions in NumPy to perform the task.
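The timing comparison described above could be sketched like this. The transcript does not say which function in "time" was used or how many numbers were processed, so time.perf_counter() and the input size are assumptions. The last lines also show the reduce() and accumulate() methods mentioned earlier, using np.add as one example of a universal function.

```python
import math
import time

import numpy as np

n = 100_000                                  # assumed input size
t_list = [i * 0.001 for i in range(n)]
t_arr = np.array(t_list)

# Element by element, with pow() and sin() from the math library.
start = time.perf_counter()
slow = [math.pow(math.sin(t), 2) for t in t_list]
math_seconds = time.perf_counter() - start

# The same calculation with NumPy universal functions on the whole array.
start = time.perf_counter()
fast = np.sin(t_arr) ** 2
numpy_seconds = time.perf_counter() - start

print("math :", math_seconds)
print("numpy:", numpy_seconds)

# Methods that universal functions carry themselves.
total = np.add.reduce(np.array([1, 2, 3, 4]))        # sum of all elements
running = np.add.accumulate(np.array([1, 2, 3, 4]))  # running sums
print(total, running)
```

The actual timings depend on the machine, so none are quoted here; run it to see the gap on your own setup.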
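Going back to the stacking and broadcasting part of this talk, here is a small sketch. The values of b are assumed for illustration; a is given shape (1, 3) to match the description that one of its dimensions is 1. np.broadcast_to() is used only to make the expansion visible and is not something the lecture mentions.

```python
import numpy as np

a = np.array([[1, 2, 3]])                # shape (1, 3): one dimension is 1
b = np.array([[1, 2, 3], [4, 5, 6]])     # shape (2, 3), assumed values

# Shapes (1, 3) and (2, 3): the 3s match and the 1 is stretched to 2.
summed = a + b

# What broadcasting does conceptually: a becomes two rows of [1, 2, 3].
expanded = np.broadcast_to(a, b.shape)
explicit = expanded + b

# A scalar behaves like an array of b's shape filled with that value.
shifted = b + 2

# Stacking vertically and horizontally.
v = np.vstack((b, a))    # rows of a are placed under b
h = np.hstack((b, b))    # columns are placed side by side

print(summed)
print(shifted)
print(v.shape, h.shape)
```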
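For the statistical methods and the linear algebra functions mentioned earlier in this part, a short sketch follows. The arrays are assumed values picked so the results are easy to check by hand.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6]])     # assumed values

total = b.sum()               # sum of all elements
by_column = b.sum(axis=0)     # add numbers column by column
by_row = b.sum(axis=1)        # add numbers row by row
smallest = b.min()            # minimum value
where_max = b.argmax()        # index of the maximum in the flattened array
average = b.mean()
variance = b.var()
std_dev = b.std()             # square root of the variance

m = np.array([[1.0, 2.0], [3.0, 4.0]])   # assumed 2x2 matrix
product = np.dot(m, m)                   # matrix product
determinant = np.linalg.det(m)
inverse = np.linalg.inv(m)

print(total, by_column, by_row)
print(smallest, where_max, average, variance, std_dev)
print(product)
print(determinant)
print(inverse)
```

Multiplying m by its inverse with dot() should give the identity matrix, up to floating-point rounding, which is a quick way to check inv().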