Getting started with data science can be a bit overwhelming, especially if you don't know where to begin. Believe me, I have been there. The best thing you can do is just start. There's no better recipe in life than that. Today I'll take a few steps and share one of the most crucial things you'll need to know as you begin your journey into data science.
Introduction to NumPy and Pandas
NumPy and Pandas are Python libraries that are widely used in data analysis and data science. Pandas is used to create Series and DataFrames, which hold datasets in tabular form. NumPy creates arrays and is mainly used to work with numerical data. These helpful libraries make our work as data analysts easier.
We are going to look at how to install and use these libraries.
Installation
First, we are going to install Python locally on our computer. Once Python is installed, we can move on to installing a Python editor. There are many tools available, such as PyCharm, Visual Studio and Anaconda.
Anaconda is popular among people who work with data because it ships with Jupyter Notebook, the working interface we will use to write Python code for data analysis. There is also no need to install Python separately, as it comes bundled with Anaconda along with several other popular Python libraries. For that reason, Anaconda is the recommended download. After installing Anaconda, open it and launch Jupyter Notebook, which is included in the package.
Let’s Code
After launching Jupyter Notebook, we can start coding. But first, we create a new notebook by selecting Python 3.
Next, let's rename our notebook to Book1: open File in the top left corner, choose Rename, and type Book1. Our notebook will now be saved as Book1.ipynb. With that done, we can start coding.
The first thing we have to do is to import the NumPy and Pandas libraries.
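As a minimal sketch, the imports usually look like the cell below. The aliases np and pd are a widely used convention rather than a requirement, and the cell assumes both libraries are already installed (they come with Anaconda).

```python
# Import both libraries under their conventional short aliases
import numpy as np
import pandas as pd

# Printing the versions is a quick way to confirm the imports worked
print(np.__version__)
print(pd.__version__)
```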
NumPy
After importing the libraries, let's create our first NumPy array. Passing a single list of numbers to the array function and storing the result in a variable, 'a', gives us a one-dimensional array. To create a two-dimensional array, we pass a list of lists instead, one inner list per row.
Next, let's find out the shape of our array, then check its data type.
Now that we have the basics, let's create an array of all zeros, and then a 2D array of ones.
Now let's do some math with our arrays. We can add, subtract, multiply and divide them. Try them all!
Now that we know our way around, it's time to explore statistics. We can find the minimum, maximum and mean values of an array.
We can also use the 'arange' function to create an array of values within a given range. Note that in Python, counting starts from zero (0).
To replace values in an array, you can simply assign new values by position.
There is so much more we can do with NumPy. You can also check out the official NumPy documentation for more details.
We'll look at an introduction to Pandas in our next article. Do comment and let me know your thoughts, and follow me on Twitter @Thulie_Vannie
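The NumPy steps in this section can be sketched in a single cell, roughly as follows. The variable names and sample values here are illustrative assumptions, not the ones from the original notebook.

```python
import numpy as np

# One-dimensional array from a plain Python list
a = np.array([1, 2, 3, 4])

# Two-dimensional array from a list of lists (one inner list per row)
b = np.array([[1, 2, 3], [4, 5, 6]])

# shape gives the size along each dimension; dtype gives the element type
print(a.shape)  # (4,)
print(b.shape)  # (2, 3)
print(b.dtype)  # an integer type; the exact width depends on the platform

# Arrays pre-filled with zeros or ones (floats by default)
zeros = np.zeros(5)
ones = np.ones((2, 3))

# Arithmetic between arrays of the same shape works element by element
x = np.array([10, 20, 30])
y = np.array([1, 2, 3])
total = x + y
difference = x - y
product = x * y
quotient = x / y

# Basic statistics
print(x.min(), x.max(), x.mean())

# arange starts at zero by default and excludes the stop value
r = np.arange(5)             # 0, 1, 2, 3, 4
evens = np.arange(2, 10, 2)  # 2, 4, 6, 8

# Replace values by position (indexing also starts at zero)
a[0] = 100     # first element of the 1D array
b[1, 2] = 60   # second row, third column of the 2D array
```

Running each statement in its own notebook cell, rather than all at once, makes it easier to inspect the result of every step.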