jDataLab Jager

8 minute read

You are just getting started with Data Science, Machine Learning or Artificial Intelligence, and Python is one of the languages you have chosen. Right now you are setting up a Python environment on your Mac or PC. If all of the above is true, this post is a handy reference for setting up Anaconda, the most popular Python data science platform, on your local computer. Anaconda offers a free Individual Edition, which is currently the easiest way to start learning from data with Python.

S1. The Associated Video with video chapters

You can navigate the video by its chapters, which include:

  1. An introduction to the relevant terms in this particular setting

  2. Step-by-step demos

You will learn how to set up a Python environment for machine learning and data science in an interactive notebook, which allows other people to reproduce your work. Specifically, you will create virtual environments with Conda, manage project dependencies with virtual environments, add a virtual environment as a Jupyter kernel, and connect a notebook to a kernel in JupyterLab.

  • Step-by-step Demo For Windows

  • Step-by-step Demo For Mac


S2. Why Anaconda?

The benefits of using Anaconda come from its powerful components, which include:

  • The classic Jupyter Notebook and the more modern JupyterLab interface, both of which support interactive development and reproducible work.
  • A bundled Python 3 distribution. You can still install other Python versions separately from Anaconda.
  • Conda, a package and environment management tool that helps you create, load and switch between environments and makes it easy to find and install over 7500 packages.

Before reading on, check the following two notes:

  1. I assume you have previously installed a standalone Python on your local computer. If not, refer to another post and install Python first: Fully Remove Python and Install a Fresh Python in MacOS and Windows.

  2. We will install Visual Studio Code and Anaconda. If you have already installed either or both of them and they are not working as expected, you may remove them completely, along with their configurations and libraries, and then follow this guide to set up a brand new environment.

Therefore the guide first shows, in the Uninstall section, how to remove Visual Studio Code and Anaconda from MacOS and Windows, respectively. If you have neither one installed, go directly to the section Set up the Python environment for Data Science.


S3. Uninstall

1. Remove Visual Studio Code and its Extensions

MacOS

The last video chapter shows how to remove VS Code from Mac.

Windows

  1. Run the uninstall program unins000.exe in your VS Code installation directory. The location depends on the installer type, System or User. The default location of the System install is C:\Program Files\Microsoft VS Code.
  2. Remove settings and configurations: Delete the directory C:\Users\username\AppData\Roaming\Code
  3. Remove all the extensions: Delete the directory C:\Users\username\.vscode

2. Remove Anaconda and its footprints

You may follow the official guide to performing a deep clean of Anaconda. As the guide indicates, a deep clean requires the operations in both Option B and Option A.

"If you also want to remove all traces of the configuration files and directories from Anaconda and its programs, you can download and use the Anaconda-Clean program first, then do a simple remove. See Option B."

Also, the guide points out that anaconda-clean creates a backup folder .anaconda_backup.

"Anaconda-Clean creates a backup of all files and directories that might be removed in a folder named .anaconda_backup in your home directory. Also note that Anaconda-Clean leaves your data files in the AnacondaProjects directory untouched."

The commands for Mac and the operations for Windows are listed below.

MacOS

  1. In the Terminal, run the two commands consecutively:

    • conda install anaconda-clean
    • anaconda-clean --yes
  2. Remove the entire Anaconda directory, which has a name such as anaconda2, anaconda3, or ~/opt:

    • Find the name of Anaconda directory.
    • sudo rm -rf ~/opt (replace ~/opt with the directory name)
  3. Remove conda directories:

    • rm -rf ~/.condarc ~/.conda ~/.continuum
  4. Remove Anaconda path from .bash_profile:

    • Run the command in Terminal: open ~/.bash_profile
    • Remove the line that resembles the following and has anaconda in the path, then save the change.

    export PATH="/Users/yourname/anaconda3/bin:$PATH"
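If you prefer to see which lines would be removed before editing the file by hand, the filtering step can be sketched in Python. This is only an illustration under assumptions: the function name strip_anaconda_path_lines and the regular expression are hypothetical, the sketch works on text rather than touching ~/.bash_profile itself, and it does not handle a block delimited by "# >>> conda initialize >>>" that newer installers may write instead.

```python
import re

# Matches lines such as: export PATH="/Users/yourname/anaconda3/bin:$PATH"
ANACONDA_PATH_LINE = re.compile(r"^\s*export\s+PATH=.*anaconda", re.IGNORECASE)


def strip_anaconda_path_lines(profile_text: str) -> str:
    """Return profile_text without the export PATH lines that mention anaconda."""
    kept = [
        line
        for line in profile_text.splitlines(keepends=True)
        if not ANACONDA_PATH_LINE.match(line)
    ]
    return "".join(kept)


if __name__ == "__main__":
    # Sample profile content; the paths are placeholders.
    sample = (
        "# my settings\n"
        'export PATH="/Users/yourname/anaconda3/bin:$PATH"\n'
        'export PATH="/usr/local/bin:$PATH"\n'
    )
    print(strip_anaconda_path_lines(sample), end="")
```

Only the line containing anaconda is dropped; comments and other PATH entries are kept as they are.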

Windows

  • In the Anaconda Navigator, launch the CMD.exe prompt terminal.
  • In the terminal, run the following commands consecutively:

    conda install anaconda-clean
    anaconda-clean --yes
  • Locate the root directory of Anaconda, which is similar to C:\Users\Jager\anaconda3, and run Uninstall-Anaconda3.exe.
  • In the Windows Explorer, manually remove the Anaconda directory.
  • In the Control Panel, choose Uninstall a program (appwiz.cpl). Search for python and uninstall the Python record prefixed with Anaconda.
  • In the Windows Explorer, remove the shortcuts in the following folder: C:\Users\Jager\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Anaconda3 (64-bit)
  • Empty the Recycle Bin
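After either procedure, a quick check can confirm that nothing was missed. The sketch below looks in a home directory for the file and directory names mentioned in the removal steps. The function name find_anaconda_leftovers and the CANDIDATE_NAMES list are hypothetical, and the list assumes a default install location, so adjust it if you installed Anaconda elsewhere.

```python
from pathlib import Path

# Names taken from the removal steps above; adjust them to your own install.
CANDIDATE_NAMES = [".condarc", ".conda", ".continuum", "anaconda2", "anaconda3"]


def find_anaconda_leftovers(home: Path) -> list:
    """Return the candidate paths that still exist under home."""
    return [home / name for name in CANDIDATE_NAMES if (home / name).exists()]


if __name__ == "__main__":
    leftovers = find_anaconda_leftovers(Path.home())
    if leftovers:
        for path in leftovers:
            print("still present:", path)
    else:
        print("no Anaconda leftovers found in", Path.home())
```

The script only reports what it finds and deletes nothing, so it is safe to run before and after the uninstall.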

S3. Set up the Python environment for Data Science

This section assumes you have neither VS Code nor Anaconda on your computer. The following shows how to make a fresh setup with both.

1. Install Visual Studio Code

To install Visual Studio Code, Microsoft's free and cross-platform code editor, use the following procedures for MacOS and Windows, respectively.

MacOS

Refer to the Mac chapter in the video at the top of the post.


Windows

  • On the VS Code site, there are two download options: User Installer for an individual user and System Installer for all users. Choose one and install it.
  • Run Python in VS Code:

    • You must have a standalone Python install. If not, install one by following the post:

      Fully Remove Python and Install a Fresh Python in MacOS and Windows

    • In VS Code:

      • Create a new file and save it with the extension py.
      • Follow the recommendation from VS Code and install the Microsoft Python extension.
      • Open the Command Palette and search for >Python: Select Interpreter. All the Python interpreters found on your machine should appear in the list. You can select one for the current folder or file.
      • In the Terminal inside VS Code, you can run pip to install Python packages into the currently selected Python.
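To confirm which Python the editor actually runs after you select an interpreter, you can save a short script in the py file created above and run it. This is a minimal sketch that uses only the standard library; the function name describe_interpreter is hypothetical.

```python
import sys


def describe_interpreter() -> dict:
    """Collect the path, version and install prefix of the running Python."""
    return {
        "executable": sys.executable,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "prefix": sys.prefix,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

If the printed executable is not the one you picked in the interpreter list, the file is probably being run by a different Python than you expect, and pip in that terminal may install packages somewhere else.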

2. Install Anaconda

MacOS

I recommend you follow the official instructions.


Windows

I recommend you follow the official instructions.

Glance over the following notes before launching the installer:

Do not install as Administrator unless admin privileges are required.

Install Anaconda to a directory path that does not contain spaces.

"We recommend not adding Anaconda to the PATH environment variable, since this can interfere with other software. Instead, use Anaconda software by opening Anaconda Navigator or the Anaconda Prompt from the Start Menu."

The default Anaconda folder is C:\Users\yourusername\anaconda3

Download Windows installer


3. How to run conda command

You will need to run conda commands in the last three sections. To run conda commands:

Windows: In the Anaconda Navigator, open CMD.exe Prompt

MacOS: Open the Terminal


S4. Working with Jupyter Notebook in Anaconda

Jupyter Notebook

  • Open Anaconda Navigator.app (see the official guide).
  • In the navigator, launch Jupyter Notebook.
  • The default environment is base (root), which uses the Python 3.8.5 bundled with Anaconda as of the date this post was written.
  • Create a new Jupyter notebook, untitled.ipynb.
  • In the notebook, enter import pandas as pd in a cell and run the cell to test it.
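If the import fails, it helps to know whether the package is missing from the environment the notebook is running in. The cell below is a small sketch of such a check. The function name check_packages is hypothetical, and the CONDA_DEFAULT_ENV variable is set by conda only when an environment has been activated, so it may be absent depending on how the notebook was launched.

```python
import importlib.util
import os


def check_packages(names):
    """Map each top-level module name to True if it can be imported here."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # CONDA_DEFAULT_ENV is set by conda when an environment is active.
    print("active conda environment:", os.environ.get("CONDA_DEFAULT_ENV", "(not set)"))
    for name, found in check_packages(["pandas", "numpy"]).items():
        print(f"{name}: {'available' if found else 'missing'}")
```

The check does not import the packages themselves, so it runs quickly and works even when a package is missing.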

Package Management

You can run conda commands in the terminal to install packages. Refer to the official guide: Installing conda packages

I list some commands here for a quick reference.

conda install package-name
conda install package-name=x.y.z
conda install package-name=x.y.z -n environ-name
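The three forms differ only in whether a version and an environment name are given. As a rough illustration, the sketch below builds the matching command line from those optional parts. The function name conda_install_command is hypothetical, the values pandas, 1.1.3 and ds-env are examples only, and the script just prints the commands without running conda.

```python
def conda_install_command(package, version=None, env=None):
    """Build the argument list for one of the conda install forms listed above."""
    # Pin the version with package=x.y.z when one is given.
    spec = package if version is None else f"{package}={version}"
    command = ["conda", "install", spec]
    # -n targets a named environment instead of the active one.
    if env is not None:
        command += ["-n", env]
    return command


if __name__ == "__main__":
    # pandas, 1.1.3 and ds-env are example values only.
    for args in [("pandas",), ("pandas", "1.1.3"), ("pandas", "1.1.3", "ds-env")]:
        print(" ".join(conda_install_command(*args)))
```

Without -n, conda installs into whichever environment is currently active, which is base (root) unless you have switched to another one.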

S5. Virtual Environments

Finally, it is time to learn how to manage versions and packages in Anaconda. Anaconda includes Conda as a tool to manage applications, environments and packages.

Refer to the video chapters.
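For readers who want a written outline next to the video, the sketch below prints one possible pair of commands for creating an environment and adding it as a Jupyter kernel. It is an assumption about a typical workflow, not a transcript of the video: the function name kernel_setup_commands, the environment name ds-env and the Python version 3.8 are all hypothetical, and the video may use conda activate rather than conda run.

```python
def kernel_setup_commands(env_name, python_version):
    """Return commands that create a conda environment and register it as a Jupyter kernel."""
    return [
        # Create the environment with its own Python and the ipykernel package.
        ["conda", "create", "-n", env_name, f"python={python_version}", "ipykernel"],
        # Register that environment's Python as a kernel for the current user.
        ["conda", "run", "-n", env_name, "python", "-m", "ipykernel",
         "install", "--user", "--name", env_name],
    ]


if __name__ == "__main__":
    # ds-env and 3.8 are example values only.
    for command in kernel_setup_commands("ds-env", "3.8"):
        print(" ".join(command))
```

Once the kernel is registered, it should appear under the chosen name in the kernel list of JupyterLab, where a notebook can be connected to it.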