Starting with Python

Python is one of the most popular programming languages. In data science, it competes with MATLAB and R for first place on the podium.

In our everyday work we often use Python to pre-process and analyze data. In this tutorial we explain our preferred way of installing Python and managing its libraries. There are several ways to install Python; this is the one we recommend for its simplicity and flexibility.

Installing Python

Anaconda

Anaconda is an open-source distribution of Python. It is used for data science, machine learning, deep learning, etc. It makes it simple to set up and organize your Python environment.

To install Anaconda, download the installer from the Anaconda website. Choose the version suitable for your operating system and click on download. Once the download is complete, open the installer and follow its instructions.

After the installation is complete, launch the Anaconda Navigator. The Anaconda Navigator is a desktop GUI that comes with the Anaconda distribution. It allows you to launch common Python programs and easily manage conda packages, environments, and channels without using the command line. Navigator can search for packages on Anaconda Cloud or in a local Anaconda Repository.

Miniconda

While Anaconda is one of the simplest ways to install Python, it is quite heavy. It installs multiple programs and libraries that are unnecessary most of the time. For this reason we prefer to use Miniconda.

Miniconda is a minimal installer for conda. It is a small, bootstrap version of Anaconda that includes only conda, Python, the packages they depend on, and a small number of other useful packages. This means that fewer packages are installed and that we have more control over what ends up on our PC. To use Miniconda, download the installer from the Miniconda website.

The installation is exactly the same as the one for Anaconda. Once it has finished, you will find the Anaconda Prompt among your programs. We will use the Anaconda Prompt to install the packages and the software that we need.

Installing Spyder IDE

Spyder is a powerful scientific IDE written in Python. Spyder is designed by and for scientists, engineers and data analysts. It integrates very useful functionalities (variable explorer, plotting, etc.), making it our favorite way to write and run Python code.

When you install the full version of Anaconda, it installs Spyder for you. Miniconda doesn’t come with Spyder, but it does make it easy to install. In the Anaconda Prompt, type the following command and press the “enter” key:

conda install -c conda-forge spyder

After working for a moment, the Anaconda Prompt will prepare a list of packages to download and ask if you’d like to proceed. Type “y” and press “enter” to proceed.

The installation may take a few minutes. Once it has finished, you can launch Spyder by clicking on the shortcut that should have appeared in your Start menu or by typing “spyder” in the Anaconda Prompt.

Here is Spyder in all its glory:

Installing Python packages

To pre-process and analyze the data we will need some specific packages. It is easy to install and manage packages using conda. Open the Anaconda Prompt and type conda install …, where … is the name of the package. For example, to install pandas (a very useful package for handling data frames) we would write:

conda install pandas

After a few seconds, the prompt will download the package and its dependencies. Occasionally, conda may not be able to find the package you need. In such cases pip, another widely used Python package manager, can come to the rescue. You might already know pip for its versatility in installing and managing Python packages.

Let’s consider numpy, an exceptional package for array manipulation. With pip, installation is a breeze. Simply use the following command:

pip install numpy
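Once a package has been installed with conda or pip, it can be useful to confirm that Python actually sees it. The following is a minimal sketch that uses only the standard library; the helper name is_installed is our own choice for illustration and is not part of conda or pip.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found in the current environment."""
    # find_spec returns None when no module with this name can be located.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # The two packages installed in this tutorial.
    for name in ("pandas", "numpy"):
        status = "installed" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

Run the script from the Anaconda Prompt or from Spyder. If a package is reported as missing, check that you installed it in the environment that is currently active.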

Managing environments

Python has different versions. Sometimes a script needs to run on a specific version of Python, because some of our packages rely on an older version or have dependencies that conflict with those of other packages. Virtual environments keep these dependencies in separate “sandboxes”, so you can switch between projects easily and keep each of them running.

One of the best features of Anaconda and Miniconda is the ease with which environments can be created and managed. Conda allows you to create, export, list, remove and update environments in which different versions of Python and/or packages are installed. Let’s create an environment called “test” with an older version of Python. To do so, open the Anaconda Prompt and type:

conda create --name test python=3.7

Press “y” when asked, and after a few seconds we will have our new “test” environment. Now we need to tell the Anaconda Prompt that we want to work in this new environment. To do so, type:

conda activate test
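To check that the activation worked, we can ask Python which interpreter is running. This is a minimal sketch using only the standard library; the function name describe_environment is a hypothetical helper, and it relies on the CONDA_DEFAULT_ENV variable that conda sets when an environment is activated.

```python
import os
import sys


def describe_environment() -> dict:
    """Collect a few facts about the interpreter that is currently running."""
    return {
        # Version of the running interpreter as "major.minor.micro".
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # Folder of the environment this interpreter belongs to.
        "prefix": sys.prefix,
        # Name of the active conda environment; None if the variable is not set.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If the script is run from the Anaconda Prompt after activating “test”, the reported version should match the one requested when the environment was created.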

Now we will probably want to install some packages and Spyder in this new environment. To do so, follow the steps in the sections above while the “test” environment is active.

Creating an environment from an environment.yml file

Another useful feature of conda is that we can save an environment we have created to an environment file (for example with conda env export > environment.yml). We can later use this file to recreate the same environment with all its packages. This is very convenient when several people need the same environment on several PCs. We will provide some environment files in our tutorials. These environments have been tested for their specific use and give all users a standard setup.

To create an environment from an environment file, type:

conda env create -f environment.yml

This will again take some time, and it will create the new environment with all the packages specified in the file.
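For readers who have never seen one, an environment file is a short text file that lists the environment name, the channels to search, and the packages to install. The sketch below builds the text of a minimal file; the function render_environment_yml, the conda-forge channel and the package list are illustrative assumptions, not the contents of the files provided with our tutorials.

```python
def render_environment_yml(name, python_version, packages):
    """Build the text of a minimal conda environment file."""
    lines = [
        f"name: {name}",
        "channels:",
        "  - conda-forge",  # assumed channel, chosen only for illustration
        "dependencies:",
        f"  - python={python_version}",
    ]
    # One list entry per requested package.
    lines.extend(f"  - {package}" for package in packages)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; adapt the name, version and packages to your project.
    print(render_environment_yml("test", "3.7", ["pandas", "numpy", "spyder"]))
```

Saving the printed text as environment.yml gives a file that can be passed to the conda env create command shown above.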
