PyNimbus - A Python Package to Organize NWS Data
For quite some time, I’ve wanted to release some kind of Python package to make a contribution towards the community. After a lot of planning, pondering, determining if it’s worth putting time and effort into, and noodling through my spaghetti code, I have finally decided it was time to pull the trigger and go for it.
PyNimbus, in short, is a Python package that pulls NWS data, sorts it, and puts it into a format that is “ready to go” for data visualization and/or analysis (matplotlib/pandas).
Everyone’s definition of “ready to go” output varies from application to application. One aspect of PyNimbus that I am working towards is to be able to accommodate as many of these applications as possible.
Let’s take an example: plotting storm reports. Without PyNimbus, the code might look something like this (note: I wrote this code ~3 years ago, so don’t judge too hard if you’re going to read through it):
import csv
import requests
import pandas as pd
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
def make_dataframes(data, start_index, end_index):
    # if there's more than 8 columns, delete the ones that have no data.
    for i in range(len(data)):
        if len(data[i]) > 8:
            data[i][7:] = [''.join(data[i][7:])]
    # construct pandas dataframes and return them.
    if end_index == start_index + 1:
        df = pd.DataFrame([0] * len(data[start_index]), data[start_index]).T
        return df
    else:
        lists = [data[i] for i in range(start_index + 1, end_index)]
        df = pd.DataFrame(lists, columns = data[start_index])
        return df

# figure out if the length of the dataframe is 0 or not.
def findLen(dataframe):
    if dataframe['County'][0] == 0: return 0
    else: return len(dataframe)

# plot points
def plotPoints(dfLon, dfLat, color, hazType):
    dfLonToList = dfLon.tolist()
    dfLatToList = dfLat.tolist()
    dfLonFloat = [float(dfLonToList[x]) for x in range(len(dfLonToList))]
    dfLatFloat = [float(dfLatToList[x]) for x in range(len(dfLatToList))]
    x, y = m(dfLonFloat, dfLatFloat)
    if hazType == "Tornado":
        zorder = 3
    elif hazType == "Wind":
        zorder = 2
    else:
        zorder = 1
    m.scatter(x, y, marker = 'o', color = color, s = 8, linewidth = 0.3,
              edgecolor = 'k', zorder = zorder)

# read the csv straight from SPC instead of downloading it to a file
CSV_URL = 'http://www.spc.noaa.gov/climo/reports/yesterday_filtered.csv'
with requests.Session() as s:
    download = s.get(CSV_URL)
    try:
        decoded_content = download.content.decode('utf-8')
    except UnicodeDecodeError:
        decoded_content = download.content.decode('latin-1')
    cr = csv.reader(decoded_content.splitlines(), delimiter=',')
    my_list = list(cr)

plt.close('all')

# find the indexes where the first element is 'Time' -- each one marks the
# start of a new hazard section, and thus a new dataframe.
index = [i for i in range(len(my_list)) if my_list[i][0] == 'Time']
index.append(len(my_list))

# make the dataframes for each of the hazards in the csv file.
tor_df = make_dataframes(my_list, index[0], index[1])
win_df = make_dataframes(my_list, index[1], index[2])
hail_df = make_dataframes(my_list, index[2], index[3])

m = Basemap(projection = 'merc', llcrnrlat = 24, urcrnrlat = 51, llcrnrlon = -127,
            urcrnrlon = -65, lat_ts = 40, resolution = 'l')
m.shadedrelief()
m.drawcoastlines(linewidth=0.5)
m.drawcountries(linewidth=0.5)
m.drawstates(linewidth=0.3)

plotPoints(tor_df['Lon'], tor_df['Lat'], 'r', "Tornado")
plotPoints(win_df['Lon'], win_df['Lat'], 'b', "Wind")
plotPoints(hail_df['Lon'], hail_df['Lat'], 'g', "Hail")
plt.show()
Sure, there are places where the code is ambiguous, and parts of it could be reduced by simply calling pd.read_csv() (honestly, I don’t feel like fixing it, which is why I am pointing this out). With PyNimbus, however, the code is reduced considerably and is much easier to read:
import pynimbus as pyn
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
from dataclasses import dataclass
@dataclass
class Tornado:
    zorder: int = 3
    color: str = 'red'

@dataclass
class Wind:
    zorder: int = 2
    color: str = 'blue'

@dataclass
class Hail:
    zorder: int = 1
    color: str = 'green'
link = "https://www.spc.noaa.gov/climo/reports/yesterday_filtered.csv"
y_reports = pyn.get_spc_storm_reports_df(link)
m = Basemap(projection = 'merc', llcrnrlat= 24, urcrnrlat= 51, llcrnrlon= -127, urcrnrlon= -65, lat_ts=40,resolution='l')
m.shadedrelief()
m.drawcoastlines(linewidth=0.5)
m.drawcountries(linewidth=0.5)
m.drawstates(linewidth=0.3)
hazards = y_reports['Hazard'].tolist()
dfLon = y_reports['Lon'].tolist()
dfLat = y_reports['Lat'].tolist()
# plot each report, colored and layered by its hazard type
for lon, lat, hazard in zip(dfLon, dfLat, hazards):
    if hazard == 'Tornado': type_haz = Tornado
    elif hazard == 'Wind': type_haz = Wind
    else: type_haz = Hail
    x, y = m(float(lon), float(lat))
    m.scatter(x, y, marker = 'o', color = type_haz.color, s = 8, linewidth = 0.3,
              edgecolor = 'k', zorder = type_haz.zorder)
plt.show()
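Because the reports come back as an ordinary pandas DataFrame, the per-report loop above is only one option; the same plot can be built with plain boolean indexing. Here is a minimal sketch, assuming the 'Hazard' column holds the strings 'Tornado', 'Wind', and 'Hail', and reusing the Basemap instance m and the style classes from above:

# sketch only: one scatter call per hazard, filtering the DataFrame directly
for style in (Tornado, Wind, Hail):
    subset = y_reports[y_reports['Hazard'] == style.__name__]
    x, y = m(subset['Lon'].astype(float).tolist(),
             subset['Lat'].astype(float).tolist())
    m.scatter(x, y, marker = 'o', color = style.color, s = 8,
              linewidth = 0.3, edgecolor = 'k', zorder = style.zorder)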
The next question you may pose is “who can use this package?” Well, if you’re reading this, you can use it. One rule of thumb I want to stick to is that you shouldn’t have to know where the data is coming from - all you need to know is what kind of data you need (hurricane shapefiles/KML, tornado reports, etc.).
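To illustrate that rule of thumb, switching from yesterday’s reports to today’s should only mean swapping the URL, while the rest of the plotting code stays the same. This snippet is hypothetical; the today_filtered.csv URL is simply assumed to follow the same SPC naming pattern as the link above:

import pynimbus as pyn

# hypothetical: same helper, different SPC report file (URL assumed, not taken from the example above)
today_reports = pyn.get_spc_storm_reports_df(
    "https://www.spc.noaa.gov/climo/reports/today_filtered.csv")
print(today_reports.head())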
Finally, YES it is open source, and I want you to contribute! Below are some important links regarding the package: