PyNimbus - A Python Package to Organize NWS Data
For quite some time, I’ve wanted to release some kind of Python package to make a contribution towards the community. After a lot of planning, pondering, determining if it’s worth putting time and effort into, and noodling through my spaghetti code, I have finally decided it was time to pull the trigger and go for it.
PyNimbus, in short, is a Python package that pulls NWS data, sorts it, and puts it into a format that is “ready to go” for data visualization and/or analysis (matplotlib/pandas).
Everyone’s definition of “ready to go” output varies from application to application. One aspect of PyNimbus that I am working towards is to be able to accommodate as many of these applications as possible.
Let’s take an example: plotting storm reports. Without PyNimbus, the code might look something like this (note: I wrote this code ~3 years ago, so don’t judge too hard if you’re going to read through it):
import csv
import requests
import pandas as pd
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
def make_dataframes(data, start_index, end_index):
    # if there's more than 8 columns, delete the ones that have no data.
    for i in range(len(data)):
        if len(data[i]) > 8:
            data[i][7:] = [''.join(data[i][7:])]
    # construct pandas dataframes and return them.
    if end_index == start_index + 1:
        df = pd.DataFrame([0] * len(data[start_index]), data[start_index]).T
        return df
    else:
        lists = [data[i] for i in range(start_index + 1, end_index)]
        df = pd.DataFrame(lists, columns = data[start_index])
        return df

# figure out if the length of the dataframe is 0 or not.
def findLen(dataframe):
    if dataframe['County'][0] == 0: return 0
    else: return len(dataframe)

# plot points
def plotPoints(dfLon, dfLat, color, hazType):
    dfLonToList = dfLon.tolist()
    dfLatToList = dfLat.tolist()
    dfLonFloat = [float(dfLonToList[x]) for x in range(len(dfLonToList))]
    dfLatFloat = [float(dfLatToList[x]) for x in range(len(dfLatToList))]
    x, y = m(dfLonFloat, dfLatFloat)
    if hazType == "Tornado":
        zorder = 3
    elif hazType == "Wind":
        zorder = 2
    else:
        zorder = 1
    m.scatter(x, y, marker = 'o', color = color, s = 8, linewidth = 0.3,
              edgecolor = 'k', zorder = zorder)

# read the csv straight from SPC instead of downloading it to a file
CSV_URL = 'http://www.spc.noaa.gov/climo/reports/yesterday_filtered.csv'
with requests.Session() as s:
    download = s.get(CSV_URL)
    try:
        decoded_content = download.content.decode('utf-8')
    except UnicodeDecodeError:
        decoded_content = download.content.decode('latin-1')
    cr = csv.reader(decoded_content.splitlines(), delimiter=',')
    my_list = list(cr)

plt.close('all')

# find the indexes where the first element is 'Time' -- each one marks the
# start of a new hazard section, and thus a new dataframe.
index = [i for i in range(len(my_list)) if my_list[i][0] == 'Time']
index.append(len(my_list))

# make the dataframes for each of the hazards in the csv file.
tor_df = make_dataframes(my_list, index[0], index[1])
win_df = make_dataframes(my_list, index[1], index[2])
hail_df = make_dataframes(my_list, index[2], index[3])

m = Basemap(projection = 'merc', llcrnrlat = 24, urcrnrlat = 51, llcrnrlon = -127,
            urcrnrlon = -65, lat_ts = 40, resolution = 'l')
m.shadedrelief()
m.drawcoastlines(linewidth=0.5)
m.drawcountries(linewidth=0.5)
m.drawstates(linewidth=0.3)

plotPoints(tor_df['Lon'], tor_df['Lat'], 'r', "Tornado")
plotPoints(win_df['Lon'], win_df['Lat'], 'b', "Wind")
plotPoints(hail_df['Lon'], hail_df['Lat'], 'g', "Hail")
plt.show()
Sure, there are places where the code is ambiguous, and parts of it could be reduced by simply calling pd.read_csv() (honestly, I don’t feel like fixing it, which is why I am pointing this out). With PyNimbus, however, the code is reduced considerably and is much easier to read:
import pynimbus as pyn
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
from dataclasses import dataclass
@dataclass
class Tornado:
    zorder: int = 3
    color: str = 'red'

@dataclass
class Wind:
    zorder: int = 2
    color: str = 'blue'

@dataclass
class Hail:
    zorder: int = 1
    color: str = 'green'
link = "https://www.spc.noaa.gov/climo/reports/yesterday_filtered.csv"
y_reports = pyn.get_spc_storm_reports_df(link)
m = Basemap(projection = 'merc', llcrnrlat= 24, urcrnrlat= 51, llcrnrlon= -127, urcrnrlon= -65, lat_ts=40,resolution='l')
m.shadedrelief()
m.drawcoastlines(linewidth=0.5)
m.drawcountries(linewidth=0.5)
m.drawstates(linewidth=0.3)
hazards = y_reports['Hazard'].tolist()
dfLon = y_reports['Lon'].tolist()
dfLat = y_reports['Lat'].tolist()
# plot each report, colored and layered by its hazard type
for lon, lat, hazard in zip(dfLon, dfLat, hazards):
    if hazard == 'Tornado': type_haz = Tornado
    elif hazard == 'Wind': type_haz = Wind
    else: type_haz = Hail
    x, y = m(float(lon), float(lat))
    m.scatter(x, y, marker = 'o', color = type_haz.color, s = 8, linewidth = 0.3,
              edgecolor = 'k', zorder = type_haz.zorder)
plt.show()
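Because the reports come back as an ordinary pandas DataFrame, the per-report loop above is only one option; the same plot can be built with plain boolean indexing. Here is a minimal sketch, assuming the 'Hazard' column holds the strings 'Tornado', 'Wind', and 'Hail', and reusing the Basemap instance m and the style classes from above:

# sketch only: one scatter call per hazard, filtering the DataFrame directly
for style in (Tornado, Wind, Hail):
    subset = y_reports[y_reports['Hazard'] == style.__name__]
    x, y = m(subset['Lon'].astype(float).tolist(),
             subset['Lat'].astype(float).tolist())
    m.scatter(x, y, marker = 'o', color = style.color, s = 8,
              linewidth = 0.3, edgecolor = 'k', zorder = style.zorder)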
The next question you may pose is “who can use this package?” Well, if you’re reading this, you can use it. One rule of thumb I want to stick to is that you shouldn’t have to know where the data is coming from - all you need to know is what kind of data you need (hurricane shapefiles/KML, tornado reports, etc.).
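To illustrate that rule of thumb, switching from yesterday’s reports to today’s should only mean swapping the URL, while the rest of the plotting code stays the same. This snippet is hypothetical; the today_filtered.csv URL is simply assumed to follow the same SPC naming pattern as the link above:

import pynimbus as pyn

# hypothetical: same helper, different SPC report file (URL assumed, not taken from the example above)
today_reports = pyn.get_spc_storm_reports_df(
    "https://www.spc.noaa.gov/climo/reports/today_filtered.csv")
print(today_reports.head())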
Finally, YES it is open source, and I want you to contribute! Below are some important links regarding the package: