Getting the Stuff to Get Started

Warning

Installing Python can take a while. Much of that time is spent letting the package manager search for available package versions that will not conflict with other installed packages. Be patient.

1 Installing Miniconda

I am recommending that you use Miniconda for this. Miniconda is a "slimmed down" version of the larger Anaconda package many people recommend. My experience is that the overhead for Anaconda makes for lots of problems later. The instructions to get Miniconda installed on Windows or MacOS are here:

2 Adding and Locking-down Conda-Forge as your designated channel

Also, I am recommending that you work mostly through the conda-forge community. Again, this provides more consistency in many of the packages you'll be working with. There are scores of repository areas (or "channels" in the conda community). Consequently, it is easy to accidentally get a package that was written by Frank that relies on a package written by Susan, which (naturally) is reliant on one developed by Pat. The result can be a web of dependencies that may eventually conflict once you install "that one package." Therefore we will stick with conda-forge as the "go-to" channel to get your packages. From my experience, it's provided the least amount of misery.

To enable conda-forge, open a terminal window. In Unix or on a Mac, this is just a regular terminal window. In Windows, look for the "Anaconda Powershell Prompt" in the Start menu; it will launch you into a conda-friendly shell environment.

Once you have an open working terminal, enter the following commands one line at a time.

conda config --add channels conda-forge
conda config --set channel_priority strict
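If you want to confirm that the two settings took, the following is a minimal sketch rather than an official conda tool. It assumes that `conda config --show channels channel_priority` prints simple YAML with one channel per line; the function names `parse_conda_config` and `conda_forge_is_locked` are hypothetical helpers made up for this illustration.

```python
import shutil
import subprocess


def parse_conda_config(text):
    """Parse the simple YAML printed by `conda config --show`.

    Returns a tuple (channels, channel_priority).
    """
    channels = []
    priority = None
    in_channels = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("channels:"):
            in_channels = True
        elif in_channels and line.startswith("- "):
            channels.append(line[2:].strip())
        else:
            in_channels = False
            if line.startswith("channel_priority:"):
                priority = line.split(":", 1)[1].strip()
    return channels, priority


def conda_forge_is_locked(text):
    """True if conda-forge is the first channel and priority is strict."""
    channels, priority = parse_conda_config(text)
    return bool(channels) and channels[0] == "conda-forge" and priority == "strict"


if __name__ == "__main__":
    conda = shutil.which("conda")
    if conda is None:
        print("conda is not on PATH; open a conda-aware terminal first.")
    else:
        result = subprocess.run(
            [conda, "config", "--show", "channels", "channel_priority"],
            capture_output=True, text=True, check=False,
        )
        print(result.stdout)
        print("conda-forge locked in:", conda_forge_is_locked(result.stdout))
```

The check looks at the first channel because `conda config --add` places a new channel at the top of the list, which is where strict priority looks first.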

Then run an update to set this channel's packages as your packages.

Warning

The late August 2023 installation of Miniconda gave me some rather odd errors when installing. I have revised the script below to reflect that. Also, this semester’s installation takes much longer on my Windows machine than in the past. Times on a Mac will vary.

conda update -c conda-forge --all
conda update -c conda-forge conda

3 Loading some Libraries.

3.1 Commonly-Used Libraries

I recommend that, at this point, you load several libraries and packages.

First, you will need the Jupyter Resource packages to run the Jupyter notebooks.

Then you can install (or reinstall) the following basic "must-have" packages for working in data science.

  • NumPy The go-to package for basic math, array, and matrix handling with Python
  • Matplotlib The go-to package for graphics and plotting in Python
  • SciPy A library used for scientific computing and technical computing
  • Pandas A package commonly used for working with tabular data as well as time series data
  • SymPy A symbolic algebraic solver engine for Python
  • Scikit-Learn An open-source set of libraries for regressions, cluster analyses, and similar "machine-learning" activities.
  • seaborn An extension to Matplotlib for graph customization and fancier graphics (often tied to statistical-type graphs).
  • openpyxl A Python library to read/write Excel 2010 xlsx/xlsm files
  • pyreadr A Package to read and write R RData and Rds files into/from pandas dataframes
  • version_information A very handy tool by Robert Johansson for Jupyter Notebooks to record what packages and their versions were used to make your notebook. Good for troubleshooting and documentation/replicability.

In your terminal window, enter the following one line at a time:

conda install -c conda-forge jupyterlab  
conda install -c conda-forge numpy matplotlib scipy sympy pandas xarray
conda install -c conda-forge scikit-learn seaborn openpyxl pyreadr version_information

Warning

When you install with conda, it first tries to reconcile all the packages so that no package has components that may “break” other packages. Thus, you may get a warning saying that “solving environment” has failed and that it’s trying another set of requirements.

You may also experience a few iterations of “Solving Environment,” as shown below.

Conda Struggling to Reconcile Packages
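Once the installs finish, you may want a quick check of what actually landed. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8 or later) to report installed versions. The `PACKAGES` list simply mirrors the install commands above, and it assumes each distribution is registered under the same name used with conda; `installed_versions` is a hypothetical helper name, not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version

# Names taken from the conda install commands in this section (assumed to
# match the distribution names recorded in each package's metadata).
PACKAGES = [
    "jupyterlab", "numpy", "matplotlib", "scipy", "sympy", "pandas", "xarray",
    "scikit-learn", "seaborn", "openpyxl", "pyreadr", "version_information",
]


def installed_versions(names):
    """Map each distribution name to its version, or None if it is missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name:22s} {ver or 'NOT INSTALLED'}")
```

Anything reported as missing can be installed again with the matching `conda install -c conda-forge` line.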

3.2 More specialized libraries (not needed for CEE 284 students, but students wanting to work in Atmospheric Sciences, Hydrology, or Climate Science will want some of these)

If you work in any hydrology, weather & climate groups, you will also want to install the following.

  • Get the following mapping libraries
    • Cartopy A basic geospatial processing library that will be essential for mapping
    • Shapely A support package to help manipulate geometric objects
    • OWSLib Access the Open Geospatial Consortium (OGC) resources and services. Lots of good mapping goodies to have.
    • pyProj An interface for working with map projections
    • geopandas Geospatial access that leverages Pandas's Dataframe style resources
  • Get the following complex data and meteo data resources
    • xarray Data manipulation and archiving of multi-dimensional datasets
    • pint and pint-xarray units support
    • metpy UCAR-Unidata tools for reading, viewing, and analyzing meteo data
    • netCDF4 UCAR-Unidata tools for NetCDF4 Support
    • Siphon UCAR-Unidata support for accessing remote meteo data
    • pyGrib NOAA-ESRL tools for WMO Gridded Binary (GRIB 1/2) support
    • cfGrib ECMWF tools for WMO Gridded Binary (GRIB 1/2) support
    • cftime UCAR-Unidata support for time data support (including 360-day, leap-year-free 365-day, and other quirky calendars that only a meteorologist would love because the rest of the world are monsters)
    • cf-python an Earth science data analysis library that is built on a complete implementation of the CF data model
    • cf-plot is a set of Python routines for making the common contour, vector, and line plots that climate researchers use
    • cf-units Provision of a wrapper class to support Unidata/UCAR UDUNITS-2, and the cftime calendar functionality
    • cf_xarray cf_xarray mainly provides an accessor that allows you to interpret Climate and Forecast metadata convention attributes present on xarray objects
    • timezonefinder useful for converting from civilized UTC time to more vulgar local times (including "daylight-savings" time).
    • pytz further support for time zones
    • haversine Calculate the distance (in various units) between two points on Earth using their latitude and longitude.
    • wrf-python A collection of diagnostic and interpolation routines for use with output from the Weather Research and Forecasting (WRF-ARW) Model.
    • iris A powerful, format-agnostic, community-driven Python package for analyzing and visualizing Earth science data
    • geocat-comp GeoCAT-comp provides implementations of computational functions for operating on geosciences data
    • geocat-viz The GeoCAT-viz repo contains tools to help plot data, including convenience and plotting functions that are used to facilitate plotting geosciences data with Matplotlib, Cartopy, and possibly other Python ecosystem plotting packages
    • uxarray UXarray provides Xarray-styled functionality for working with unstructured grids built around the UGRID conventions

conda install -c conda-forge shapely cartopy OWSLib pyproj geopandas
conda install -c conda-forge pint pint-xarray pint-pandas metpy netCDF4 
conda install -c conda-forge siphon cfgrib pygrib timezonefinder 
conda install -c conda-forge cftime cfdm wrapt setuptools cython
conda install -c conda-forge pytz haversine cf-units cf-xarray uxarray
conda install -c conda-forge iris satpy
conda install -c conda-forge geocat-comp geocat-viz
conda install -c conda-forge basemap-data basemap-data-hires

Warning

WRF-Python and Basemap for Apple Silicon machines have been temporarily removed from conda-forge, so proceed cautiously.

conda install -c conda-forge wrf-python 
conda install -c conda-forge basemap 
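To give a feel for what one of the smaller packages in the list does, here is a standard-library sketch of the great-circle calculation that the haversine entry describes. It is not the haversine package's own code; the function name `great_circle_km` and the mean Earth radius constant are assumptions chosen for this illustration.

```python
import math

# Mean Earth radius in kilometers (an assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0088


def great_circle_km(lat1, lon1, lat2, lon2):
    """Distance in km between two points given in decimal degrees,
    using the haversine formula on a spherical Earth."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding pushing the argument just above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

For real work, the installed package also handles unit conversion for you, so treat this only as a picture of the underlying arithmetic.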

4 Get Git (If you don't already have it).

Git is a revision-control program and environment that accesses code repositories made public to the larger user community. If you have a Mac, you already have it in the MacOS operating system. If you are in Windows, you can get it from your terminal window by typing the following command.

conda install -c conda-forge git
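If you are not sure whether Git (or conda, or Jupyter) is already available in your current shell, a small standard-library check like the following may help. `locate_tools` is a hypothetical helper name; it only reports what `shutil.which` finds on your PATH.

```python
import shutil


def locate_tools(names):
    """Map each command name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools(["git", "conda", "jupyter"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```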

5 "Let's Light This Candle!"

Now we are ready to go. To launch Jupyter, I recommend that you again open the Anaconda Powershell Prompt from the start menu if it's not open already. (You can also "Pin" it to your "Start" for easy access along with Excel and Mathcad.)

At the shell prompt, enter:

jupyter lab

And then the fun starts…

5.1 Firing Up the Jupyter Service

… the first thing you will see is a flurry of activity in your shell window. That's ok. What is happening is that your laptop is creating a virtual web service.

"My God! It's full of text!"

You will see a web browser tab opening (it may ask you for a specific browser, such as Edge, Chrome, Firefox, Mosaic…). You'll have a web page open up at an address like "localhost:8888" (the number may differ if that port is already in use). This is the pretend web service housing your Jupyter workspace for coding in Python.
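If the browser tab does not appear, one way to see whether the local service is actually up is to test the port directly. This is a rough sketch using only the standard library. The port number 8888 is an assumption (Jupyter's usual default), so substitute whatever number your shell window reports; `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections at host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    port = 8888  # assumed default; check the shell output for the real one
    state = "is" if port_is_open("localhost", port) else "is not"
    print(f"Something {state} listening on localhost:{port}")
```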

"The Jupyter Interface"

To the left, you will see what looks like a File Manager. (That's your Jupyter File Manager.) To the right will be a workspace. It will most likely have a "launcher" pane with apps to push or the last Jupyter Notebook you had open. You'll also see a menu at the top and other toys around the webpage's perimeter.

5.2 A Place For Your ~Crap~ Stuff

Jupyter's framework expects your work area to hang off of your Windows home directory. Therefore, I recommend creating a good working directory for your Python development wherever you keep your class materials: in your Documents, Dropbox, One-Drive-SDSMT, Google Drive, or another drive you can reach from your home directory. You can get there by mouse-clicking through the Jupyter File Manager sub-window, starting from your home directory.
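You can make that working directory by hand in any file manager, or with a few lines of Python. The sketch below is one possible way to do it. The folder name `python_work` and the helper name `make_workspace` are placeholders invented for this example, so pick names that match how you organize your class materials.

```python
from pathlib import Path


def make_workspace(parent, name="python_work"):
    """Create (if needed) and return a working folder inside `parent`."""
    workspace = Path(parent).expanduser() / name
    # parents=True builds any missing folders; exist_ok=True makes reruns safe.
    workspace.mkdir(parents=True, exist_ok=True)
    return workspace
```

For example, `make_workspace(Path.home() / "Documents")` would create a `python_work` folder under Documents in your home directory, which should then show up in the Jupyter File Manager.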

And with that, congratulations! You have a flexible work environment that allows you to code in Python (and other languages) and create a document that includes traditional word-processing text, pictures, tables, active code, output, graphs, and other resources. This will allow you to create a truly replicable piece of work that you can share with colleagues or clients.

6 Ways Forward

From here, I have a fast "spinup" course that introduces you to the basic Markdown language for documenting Python and some basic skills. Much of it is for students who join my workgroup and have not yet learned Python or need some practice to refresh their skills. There is also a "Stupid Python Tricks" page for more advanced use.

Play Hard. Have Fun.