Python

PYTHON LOCAL REPOSITORY UPDATED 15 DECEMBER 2020

If you are already using the Python local repositories to obtain your Python packages, please generate and run the installer script again to update the package (version 0.4).

Python is available through the Install DRE Applications installer on your virtual machine (VM). A workspace owner can install it on any VM in the workspace.

Tip: Make sure you choose 'install for all users' when installing new software on a VM.

To work with Python, you often need additional packages. Because the virtual machines have no internet connection, pip install commands cannot download packages. Instead, we recommend the following solutions:

  • Anaconda

  • Local Python repository

  • Upload packages through DRE portal

Anaconda

Anaconda is an open-source platform that offers a wide variety of Python libraries, including essentials such as numpy, pandas, and matplotlib. It was built 'by Data Scientists, for Data Scientists', so it also offers many tools and libraries for data processing and visualisation.

Because Anaconda uses fixed IP addresses, you can open ports in the External Access tab of the DRE portal to reach these libraries. Anaconda works best in combination with JupyterLab, so we recommend following these steps to configure both programs together. Anaconda itself can be installed through the Install DRE Applications installer. Configure the following port rules for each virtual machine that runs Anaconda:

Rule          IP            Port  Reason
AnacondaRepo  104.16.130.3  443   Accessing repo.anaconda.com
Condaorg      104.17.93.24  443   Accessing conda.anaconda.org
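
Once the port rules are in place, it can be useful to confirm from the VM that both addresses are reachable. The following is a minimal sketch, not an official DRE tool: the helper names can_connect and check_rules are made up for this example, and the addresses are simply copied from the table above.

```python
import socket

# Port rules copied from the table above: rule name -> (IP address, port).
PORT_RULES = {
    "AnacondaRepo": ("104.16.130.3", 443),
    "Condaorg": ("104.17.93.24", 443),
}


def can_connect(ip, port, timeout=5.0):
    """Return True if a TCP connection to ip:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False


def check_rules(rules, timeout=5.0):
    """Return a dict mapping each rule name to True (reachable) or False."""
    return {name: can_connect(ip, port, timeout) for name, (ip, port) in rules.items()}
```

Calling check_rules(PORT_RULES) on the VM returns one True or False per rule. A False value only shows that the TCP connection failed; it does not say whether the port rule, the address, or something else is the cause.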

Local Python repository

In the DRE, we have set up a local mirror that contains many of the packages that are available via pip. To use this solution, install the Python local repository through the Install DRE Applications program. This produces a Python script, which you run once per VM in a Python environment on that virtual machine. After that, you can use the following commands in any Python script to install packages from the mirror:

from local_package_installer.local_package_installer import install_local

# Examples
install_local('numpy')
install_local('pandas')
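
If a script is run many times, you may prefer to call the installer only for packages that are not importable yet. The sketch below shows one way to do that. The helper install_missing is hypothetical and not part of local_package_installer, and it assumes only that install_local accepts a package name as shown above; its return value and error behaviour are not documented here.

```python
import importlib.util


def install_missing(packages, installer):
    """Call installer(name) for every package whose module cannot be found.

    packages maps the name given to the installer to the module name used in
    import statements. The two can differ (for example scikit-learn is
    imported as sklearn). Returns the package names passed to the installer.
    """
    requested = []
    for package_name, module_name in packages.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module_name) is None:
            installer(package_name)
            requested.append(package_name)
    return requested


# Usage on a VM where the local repository script has been run (assumption):
# from local_package_installer.local_package_installer import install_local
# install_missing({"numpy": "numpy", "pandas": "pandas"}, install_local)
```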

Upload packages through DRE portal

If your packages are not available through the solutions given above, you can also download packages on your local machine, transfer them to the data drive of the workspace and finally install them from there on your VMs.

If your package has no dependencies, or only dependencies that are available through the local mirror, we recommend using GitHub. If your package has dependencies that are not available through the mirror, we recommend using pip commands.

GitHub

On your computer

  1. Go to the GitHub page of the package, press the green download button and choose Download ZIP. Save the zipped folder on your computer.

  2. Upload the zipped folder to your workspace's data drive through the DRE portal.

On the VM

  1. For large packages, first move the files from the data drive to the C: drive. This drive is much faster.

  2. Unzip the folder.

  3. Open the command line as administrator and navigate to the directory where the unzipped folder is located.

  4. Run python setup.py install.

The package will then be installed.
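
The unzip step on the VM can also be done from Python. The sketch below is only an illustration: extract_and_find_setup is a made-up helper name, and it assumes the archive contains a setup.py, which is what the python setup.py install step relies on. ZIP files downloaded from GitHub usually wrap everything in a single top-level folder, so the helper returns the folder to navigate to.

```python
import zipfile
from pathlib import Path


def extract_and_find_setup(zip_path, target_dir):
    """Unzip zip_path into target_dir and return the folder that holds setup.py.

    Returns None when the archive contains no setup.py at all.
    """
    target = Path(target_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Prefer the setup.py closest to the top of the extracted tree.
    matches = sorted(target.rglob("setup.py"), key=lambda path: len(path.parts))
    return matches[0].parent if matches else None
```

The returned folder is the one in which to run python setup.py install from an administrator command line.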

Pip commands

On your own computer

  1. In the command line, run pip download [package] (note: download, not install).
    This command saves WHL-files (wheels) for the package and all of its dependencies (such as numpy) in the current directory.

  2. Upload these WHL-files to your workspace's data drive through the DRE portal.
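
The download step can be scripted as well. The following sketch only builds the command line; build_download_command is a hypothetical helper. The pip options --dest, --platform, --python-version and --only-binary exist in pip, but whether you need the platform options depends on your situation: wheels can be platform-specific, so if your own computer differs from the VM you may have to request wheels for the VM's platform. The value win_amd64 mentioned in the comments is an assumption about the VM, not something stated in this guide.

```python
import sys


def build_download_command(package, dest=".", platform=None, python_version=None):
    """Return the argument list for a 'pip download' call.

    platform and python_version are optional, for example "win_amd64" and
    "3.9" (assumed values; check what your VM actually runs).
    """
    command = [sys.executable, "-m", "pip", "download", package, "--dest", dest]
    if platform is not None or python_version is not None:
        # pip requires --only-binary=:all: (or --no-deps) with these options.
        command.append("--only-binary=:all:")
    if platform is not None:
        command += ["--platform", platform]
    if python_version is not None:
        command += ["--python-version", python_version]
    return command
```

The resulting list can be passed to subprocess.run(command, check=True) on your own computer, where an internet connection is available.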

On the VM

  1. For large packages, first move the files from the data drive to the C: drive. This drive is much faster.

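  2. Open the command line as administrator, navigate to the folder that contains the WHL-files, and run

pip install [WHL-filename of top package] --no-index --find-links .
(include the dot: it tells pip to look for wheels in the current folder)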

The packages will then be installed. The --no-index flag stops pip from contacting an online package index, and --find-links . makes it take the dependent packages from the local wheels in the current folder.
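
With many wheels in one folder, finding the WHL-filename of the top package by hand can be tedious. The sketch below shows one possible approach, using hypothetical helper names. It relies on the wheel naming convention, in which the filename starts with the distribution name followed by a hyphen, and hyphens or dots in the package name are written as underscores.

```python
import re
import sys
from pathlib import Path


def normalize(name):
    """Lower-case a name and collapse runs of '-', '_' and '.' into '_'."""
    return re.sub(r"[-_.]+", "_", name).lower()


def find_top_wheel(wheel_dir, package):
    """Return the first wheel in wheel_dir that belongs to package, or None."""
    wanted = normalize(package)
    for wheel in sorted(Path(wheel_dir).glob("*.whl")):
        # The part before the first hyphen is the distribution name.
        distribution = wheel.name.split("-")[0]
        if normalize(distribution) == wanted:
            return wheel
    return None


def build_install_command(wheel_path):
    """Return the offline 'pip install' argument list for one wheel.

    --find-links points at the folder of the wheel, which plays the role of
    the dot in the command above.
    """
    wheel = Path(wheel_path)
    return [
        sys.executable, "-m", "pip", "install", str(wheel),
        "--no-index", "--find-links", str(wheel.parent),
    ]
```

On the VM, the list returned by build_install_command can be passed to subprocess.run(command, check=True) from a session started as administrator.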