Demo of USGS-MAP-gwmodels¶
This page demonstrates some of the core features of USGS-MAP-gwmodel obs, applied to the Mississippi Delta inset model area. Specifically:
setup of model layering and property zones by combining voxel-based hydrogeologic facies derived from an airborne electromagnetic (AEM) survey (Burton and others, 2020) with traditional raster surfaces denoting hydrogeologic contacts (Hart and others, 2008)
culling of NetCDF-based infiltration data (Westenbroek and others, 2020) to the model area, and aggregation to monthly values
preprocessing of streamflow timeseries for use as specified inflows to the SFR package, or streamflow observations
preprocessing of head observations
preprocessing of water use data, including:
estimates of agricultural water use (Wilson, 2020)
non-agricultulral water use data from the USGS
[1]:
from pathlib import Path
import numpy as np
import pandas as pd
import xarray as xr
import matplotlib.pyplot as plt
import matplotlib as mpl
from mfsetup.grid import MFsetupGrid
Define some global inputs¶
[2]:
test_data_path = Path('../mapgwm/tests/data')
output_folder = Path('output')
output_folder.mkdir(exist_ok=True) # make the output folder if it doesn't exist
# ms delta geographic extent
delta_extent = test_data_path / 'extents/ms_delta.shp'
# simulation start date
start_date='1998-04-01'
# model grid for the MS Delta inset model
ncol=270
nrow=605
dxy=500.
delr = np.ones(ncol) * dxy
delc = np.ones(nrow) * dxy
delta_inset_model_grid = MFsetupGrid(delc, delr, top=None, botm=None,
lenuni=2, epsg=5070,
xoff=434955, yoff=1040285, angrot=0.0)
# shapefile with polygons to group observations by geographic area
geographic_groups = [test_data_path / 'extents/CompositeHydrographArea.shp',
test_data_path / 'extents/MAP_generalized_regions.shp'
]
Set up the model layering and property zones¶
from AEM-based voxel data and raster surfaces of hydrogeologic contacts
[3]:
# mean dem values for each 1 km gridcell, elevation units of feet
dem_means_raster = test_data_path / 'rasters/dem_mean_elevs_1000.tif'
# AEM electrical resistivity-based facies classes from tempest and resolve surveys
# (All facies classes version)
facies_classes_netcdf = test_data_path / 'netcdf/RSTM_res_fac_depth_15m.nc'
# Original MERAS framework (1 mi resolution), elevation units of feet
framework_rasters = [
test_data_path / 'rasters/vkbg_surf.tif', # Vicksburg-Jackson Group (top)
test_data_path / 'rasters/ucaq_surf.tif', # Upper Claiborne aquifer (top)
test_data_path / 'rasters/mccu_surf.tif', # Middle Claiborne confining unit (t
test_data_path / 'rasters/mcaq_surf.tif', # Middle Claiborne aquifer (top)
test_data_path / 'rasters/lccu_surf.tif', # Lower Claiborne confining unit (to
test_data_path / 'rasters/lcaq_surf.tif', # Lower Claiborne aquifer (top)
test_data_path / 'rasters/mwaq_surf.tif', # Middle Wilcox aquifer (top)
test_data_path / 'rasters/lwaq_surf.tif', # Lower Wilcox aquifer (top)
test_data_path / 'rasters/mdwy_surf.tif', # Midway confining unit (top)
]
framework_unit_names = [
'Undifferentiated sediments\nabove the Vicksburg',
'Vicksburg-Jackson Group',
'Upper Claiborne aquifer',
'Middle Claiborne confining unit',
'Middle Claiborne aquifer',
'Lower Claiborne confining unit',
'Lower Claiborne aquifer',
'Middle Wilcox aquifer',
'Lower Wilcox aquifer'
]
[4]:
from mapgwm.framework import setup_model_layers, plot_slice
layers, zone_array = setup_model_layers(dem_means_raster,
facies_classes_netcdf,
framework_rasters,
delta_inset_model_grid,
facies_class_variable='fac_a', # variable in facies_classes_netcdf with facies zones
dem_elevation_units='feet',
framework_raster_elevation_units='feet',
model_length_units='meters', output_folder=output_folder,
framework_unit_names=framework_unit_names)
reading data from ../mapgwm/tests/data/rasters/dem_mean_elevs_1000.tif...
finished in 0.06s
reading data from ../mapgwm/tests/data/rasters/vkbg_surf.tif...
finished in 0.03s
reading data from ../mapgwm/tests/data/rasters/ucaq_surf.tif...
finished in 0.03s
reading data from ../mapgwm/tests/data/rasters/mccu_surf.tif...
finished in 0.03s
reading data from ../mapgwm/tests/data/rasters/mcaq_surf.tif...
finished in 0.03s
reading data from ../mapgwm/tests/data/rasters/lccu_surf.tif...
finished in 0.03s
reading data from ../mapgwm/tests/data/rasters/lcaq_surf.tif...
finished in 0.03s
reading data from ../mapgwm/tests/data/rasters/mwaq_surf.tif...
finished in 0.03s
reading data from ../mapgwm/tests/data/rasters/lwaq_surf.tif...
finished in 0.02s
reading data from ../mapgwm/tests/data/rasters/mdwy_surf.tif...
finished in 0.03s
computing cell thicknesses...
finished in 6.46s
computing cell thicknesses...
finished in 14.16s
wrote output/botm_array/model_top.tif
wrote output/botm_array/botm0.tif
wrote output/botm_array/botm1.tif
wrote output/botm_array/botm2.tif
wrote output/botm_array/botm3.tif
wrote output/botm_array/botm4.tif
wrote output/botm_array/botm5.tif
wrote output/botm_array/botm6.tif
wrote output/botm_array/botm7.tif
wrote output/botm_array/botm8.tif
wrote output/botm_array/botm9.tif
wrote output/botm_array/botm10.tif
wrote output/botm_array/botm11.tif
wrote output/botm_array/botm12.tif
wrote output/botm_array/botm13.tif
wrote output/botm_array/botm14.tif
wrote output/botm_array/botm15.tif
wrote output/botm_array/botm16.tif
wrote output/botm_array/botm17.tif
wrote output/botm_array/botm18.tif
wrote output/botm_array/botm19.tif
wrote output/botm_array/botm20.tif
wrote output/botm_array/botm21.tif
wrote output/botm_array/botm22.tif
wrote output/botm_array/botm23.tif
wrote output/botm_array/botm24.tif
wrote output/botm_array/botm25.tif
wrote output/botm_array/botm26.tif
wrote output/botm_array/botm27.tif
wrote output/botm_array/botm28.tif
wrote output/botm_array/botm29.tif
wrote output/botm_array/botm30.tif
wrote output/zones/rasters/res_fac0.tif
wrote output/zones/rasters/res_fac1.tif
wrote output/zones/rasters/res_fac2.tif
wrote output/zones/rasters/res_fac3.tif
wrote output/zones/rasters/res_fac4.tif
wrote output/zones/rasters/res_fac5.tif
wrote output/zones/rasters/res_fac6.tif
wrote output/zones/rasters/res_fac7.tif
wrote output/zones/rasters/res_fac8.tif
wrote output/zones/rasters/res_fac9.tif
wrote output/zones/rasters/res_fac10.tif
wrote output/zones/rasters/res_fac11.tif
wrote output/zones/rasters/res_fac12.tif
wrote output/zones/rasters/res_fac13.tif
wrote output/zones/rasters/res_fac14.tif
wrote output/zones/rasters/res_fac15.tif
wrote output/zones/rasters/res_fac16.tif
wrote output/zones/rasters/res_fac17.tif
wrote output/zones/rasters/res_fac18.tif
wrote output/zones/rasters/res_fac19.tif
wrote output/zones/rasters/res_fac20.tif
wrote output/zones/rasters/res_fac21.tif
wrote output/zones/rasters/res_fac22.tif
wrote output/zones/rasters/res_fac23.tif
Plot a cross section slice¶
[5]:
framework_unit_labels = dict(zip(range(13, 32), framework_unit_names))
[6]:
plot_slice(layers, property_data=zone_array,
row=100, column=slice(None),
voxel_start_layer=0, voxel_zones=np.arange(1, 21), cmap='copper',
voxel_cmap='viridis', unit_labels=framework_unit_labels)
[6]:
<AxesSubplot:title={'center':'Row 100'}, xlabel='Column in model', ylabel='Elevation, in meters'>
Preprocess recharge¶
Chop the SWB source data down to the model area and aggregate daily values to monthly means
[7]:
from mapgwm.swb import get_monthly_means
data_file = test_data_path / 'swb/delta_swb_summer2018.nc'
# output
outputfile = output_folder / 'swb_monthly_means.nc'
# get the monthly means
get_monthly_means(data_file, outfile=outputfile,
filter=delta_inset_model_grid.bounds, check_results=True)
using dask...
opening ../mapgwm/tests/data/swb/delta_swb_summer2018.nc...
culling to bounding box: (434955.0, 1040285.0, 569955.0, 1342785.0)
Aggregating to monthly values...
wrote output/swb_monthly_means.nc
took 0.18s
checking compressed file
finished in 0.24s
[8]:
with xr.open_dataset(outputfile) as ds:
fig, ax = plt.subplots(figsize=(11, 8.5))
ds['net_infiltration'].sel(time='2018-08').plot(ax=ax, vmax=ds['net_infiltration'].quantile(0.95))
ax.set_aspect(1)
Preprocess streamflow data for specified inflows to the SFR package¶
[9]:
from mapgwm.swflows import preprocess_flows
streamflow_data_file = test_data_path / 'swflows/13Feb2020_rf_output_with_site_numbers.csv'
streamflow_data = pd.read_csv(streamflow_data_file)
# only include flows that are inflows (exclude observations)
include_line_ids = set(streamflow_data.loc[streamflow_data.group == 'meras-inflows', 'comid'])
outfile = output_folder / 'processed_inflows.csv'
[10]:
streamflow_data.head()
[10]:
site_no | comid | datetime | name | group | X | Y | predicted_bf | predicted_total_flow | category | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 07277700 | 15251676 | 2007-02-01 | HICKAHALA CREEK NR SENATOBIA, MS | meras-observations | -89.7839 | 34.6394 | 82.1099 | 107.1910 | estimated |
1 | 07277700 | 15251676 | 2007-03-01 | HICKAHALA CREEK NR SENATOBIA, MS | meras-observations | -89.7839 | 34.6394 | 75.3402 | 85.2905 | estimated |
2 | 07277700 | 15251676 | 2007-04-01 | HICKAHALA CREEK NR SENATOBIA, MS | meras-observations | -89.7839 | 34.6394 | 46.6768 | 80.6870 | estimated |
3 | 07277700 | 15251676 | 2007-05-01 | HICKAHALA CREEK NR SENATOBIA, MS | meras-observations | -89.7839 | 34.6394 | 41.8452 | 49.2666 | estimated |
4 | 07277700 | 15251676 | 2007-06-01 | HICKAHALA CREEK NR SENATOBIA, MS | meras-observations | -89.7839 | 34.6394 | 37.1356 | 47.5411 | estimated |
[11]:
data, metadata = preprocess_flows(streamflow_data, flow_data_columns=['predicted_total_flow', 'predicted_bf'],
include_line_ids=include_line_ids,
start_date=start_date,
datetime_col='datetime',
site_no_col='site_no',
line_id_col='comid',
x_coord_col='X',
y_coord_col='Y',
outfile=outfile
)
writing output/processed_inflows_info.shp... Done
writing output/processed_inflows_info.csv
writing output/processed_inflows.csv
[12]:
data.head()
[12]:
site_no | line_id | datetime | predicted_total_flow | predicted_bf | category | |
---|---|---|---|---|---|---|
datetime | ||||||
2007-02-01 | none | 19274790 | 2007-02-01 | 5.175779e+06 | 4.458664e+06 | measured |
2007-03-01 | none | 19274790 | 2007-03-01 | 2.497954e+06 | 1.873441e+06 | measured |
2007-04-01 | none | 19274790 | 2007-04-01 | 3.820059e+06 | 2.017639e+06 | measured |
2007-05-01 | none | 19274790 | 2007-05-01 | 2.154971e+06 | 1.052037e+06 | measured |
2007-06-01 | none | 19274790 | 2007-06-01 | 1.437397e+06 | 5.956531e+05 | measured |
[13]:
metadata.head()
[13]:
start_dt | site_no | end_dt | n | x | y | line_id | name | geometry | obsprefix | group | |
---|---|---|---|---|---|---|---|---|---|---|---|
site_no | |||||||||||
none | 2007-02-01 | none | 2018-12-01 | 143 | 438600.380812 | 1.105540e+06 | 19274790 | Tensas River | POINT (438600.3808115814 1105539.758365622) | none | fluxes |
Preprocess streamflow data for SFR package observations¶
Unlike the inflows, which are referenced by NHDPlus COMID, these all have site numbers
Additionally, we want to fill gaps in measured flow data with statistical estimates
The
data
DataFrame outputs have the time series;The
metadata
DataFrame outputs have site information that can be input to ``modflow-setup`` to set up the observation input for MODFLOW-6.
[14]:
# only include flows that are inflows (exclude observations)
include_sites = set(streamflow_data.loc[streamflow_data.group == 'meras-observations', 'site_no'])
outfile = output_folder / 'processed_flow_obs.csv'
# rename these columns, if present
column_renames = {'predicted_total_flow': 'qtotal_m3d',
'predicted_bf': 'qbase_m3d',
'qbase_cfs': 'qbase_m3d',
'q_cfs': 'qtotal_m3d',
'station_nm': 'name'}
preprocess the statistical estimates¶
[15]:
est_data, est_metadata = preprocess_flows(streamflow_data, flow_data_columns=['predicted_total_flow', 'predicted_bf'],
include_sites=include_sites,
start_date=start_date,
active_area=delta_extent,
datetime_col='datetime',
site_no_col='site_no',
line_id_col='comid',
x_coord_col='X',
y_coord_col='Y',
geographic_groups=geographic_groups,
geographic_groups_col='obsgroup',
flow_qualifier_column='category',
source_crs=4269,
dest_crs=5070,
source_volume_units='ft3',
source_time_units='s',
dest_volume_units='m3',
dest_time_units='d',
column_renames=column_renames,
outfile=None
)
reading ../mapgwm/tests/data/extents/ms_delta.shp...
--> building dataframe... (may take a while for large shapefiles)
Culling 7 sites outside of the model area defined by ['../mapgwm/tests/data/extents/ms_delta.shp'].
reading ../mapgwm/tests/data/extents/MAP_generalized_regions.shp...
--> building dataframe... (may take a while for large shapefiles)
reading ../mapgwm/tests/data/extents/CompositeHydrographArea.shp...
--> building dataframe... (may take a while for large shapefiles)
[16]:
est_data.head()
[16]:
site_no | line_id | datetime | qtotal_m3d | qbase_m3d | category | |
---|---|---|---|---|---|---|
datetime | ||||||
2007-02-01 | 07288847 | 17929890 | 2007-02-01 | 659171.062935 | 564867.808532 | estimated |
2007-03-01 | 07288847 | 17929890 | 2007-03-01 | 316339.771464 | 278767.710811 | estimated |
2007-04-01 | 07288847 | 17929890 | 2007-04-01 | 328301.079306 | 179244.687481 | estimated |
2007-05-01 | 07288847 | 17929890 | 2007-05-01 | 632789.638827 | 216849.532247 | estimated |
2007-06-01 | 07288847 | 17929890 | 2007-06-01 | 250184.368712 | 130957.604569 | estimated |
[17]:
est_metadata.head()
[17]:
start_dt | site_no | end_dt | n | x | y | line_id | name | geometry | obsprefix | group | geo_group | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
site_no | ||||||||||||
07288000 | 2007-02-01 | 07288000 | 2018-12-01 | 143 | 496047.363537 | 1.262247e+06 | 17951955 | Big Sunflower River | POINT (496047.3635373298 1262247.29440906) | 07288000 | fluxes | delta |
07288280 | 2007-02-01 | 07288280 | 2018-12-01 | 143 | 489208.729363 | 1.239410e+06 | 17958105 | BIG SUNFLOWER RIVER NR MERIGOLD, MS | POINT (489208.7293628419 1239410.03140523) | 07288280 | fluxes | delta |
07288555 | 2007-02-01 | 07288555 | 2018-12-01 | 143 | 509254.529536 | 1.233598e+06 | 17958153 | QUIVER RIVER SOUTHEAST RULEVILLE, MS | POINT (509254.5295362755 1233597.934250077) | 07288555 | fluxes | delta |
07288650 | 2007-02-01 | 07288650 | 2018-12-01 | 143 | 472484.774612 | 1.196944e+06 | 17963095 | BOGUE PHALIA NR LELAND, MS | POINT (472484.7746117859 1196943.873882911) | 07288650 | fluxes | delta |
07288700 | 2007-02-01 | 07288700 | 2018-12-01 | 143 | 492594.661169 | 1.195806e+06 | 17970237 | Big Sunflower River | POINT (492594.6611692713 1195806.16802324) | 07288700 | fluxes | delta |
preprocess the measured values¶
[18]:
nwis_streamflow_data = test_data_path / 'swflows/nwis_dvs.csv'
nwis_streamflow_metadata = test_data_path / 'swflows/nwis_dv_sites.csv'
meas_data, meas_metadata = preprocess_flows(nwis_streamflow_data, nwis_streamflow_metadata,
flow_data_columns=['q_cfs', 'qbase_cfs'],
include_sites=include_sites,
start_date=start_date,
active_area=delta_extent,
datetime_col='datetime',
site_no_col='site_no',
line_id_col='comid',
x_coord_col='dec_long_va',
y_coord_col='dec_lat_va',
geographic_groups=geographic_groups,
geographic_groups_col='obsgroup',
source_crs=4269,
dest_crs=5070,
source_volume_units='ft3',
source_time_units='s',
dest_volume_units='m3',
dest_time_units='d',
column_renames=column_renames,
outfile=None
)
reading ../mapgwm/tests/data/extents/ms_delta.shp...
--> building dataframe... (may take a while for large shapefiles)
Culling 40 sites outside of the model area defined by ['../mapgwm/tests/data/extents/ms_delta.shp'].
reading ../mapgwm/tests/data/extents/MAP_generalized_regions.shp...
--> building dataframe... (may take a while for large shapefiles)
reading ../mapgwm/tests/data/extents/CompositeHydrographArea.shp...
--> building dataframe... (may take a while for large shapefiles)
[19]:
meas_data.head()
[19]:
site_no | datetime | qtotal_m3d | qbase_m3d | category | |
---|---|---|---|---|---|
datetime | |||||
2008-01-20 | 07280400 | 2008-01-20 | 93459.185840 | 93459.185840 | measured |
2008-01-21 | 07280400 | 2008-01-21 | 90278.637631 | 90278.637631 | measured |
2008-01-22 | 07280400 | 2008-01-22 | 153155.629151 | 92176.690939 | measured |
2008-01-23 | 07280400 | 2008-01-23 | 214809.332899 | 94114.868086 | measured |
2008-01-24 | 07280400 | 2008-01-24 | 144347.957187 | 96093.658387 | measured |
[20]:
meas_metadata.head()
[20]:
site_no | name | site_tp_cd | y | x | coord_meth_cd | coord_acy_cd | coord_datum_cd | dec_coord_datum_cd | district_cd | ... | sv_begin_date | sv_end_date | sv_count_nu | geometry | start_dt | end_dt | n | obsprefix | group | geo_group | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
site_no | |||||||||||||||||||||
07280400 | 07280400 | TILLATOBA CREEK AT CHARLESTON, MS | ST | 1.231964e+06 | 543849.348583 | G | 5 | NAD83 | NAD83 | 28 | ... | 1976-08-18 | 2015-09-08 | 132.0 | POINT (543849.3485826917 1231964.199194193) | 2008-01-20 | 2009-09-22 | 612.0 | 07280400 | fluxes | delta |
07288000 | 07288000 | BIG SUNFLOWER RIVER AT CLARKSDALE, MS | ST | 1.251190e+06 | 495901.606421 | M | F | NAD27 | NAD83 | 28 | ... | 1995-05-18 | 2020-08-11 | 29.0 | POINT (495901.606420672 1251190.481277433) | 2017-10-02 | 2020-09-20 | 884.0 | 07288000 | fluxes | delta |
07288280 | 07288280 | BIG SUNFLOWER RIVER NR MERIGOLD, MS | ST | 1.209990e+06 | 489521.298812 | G | 5 | NAD83 | NAD83 | 28 | ... | 1992-10-06 | 2020-06-30 | 266.0 | POINT (489521.2988124544 1209990.152655879) | 2008-01-08 | 2020-09-18 | 4633.0 | 07288280 | fluxes | delta |
07288650 | 07288650 | BOGUE PHALIA NR LELAND, MS | ST | 1.160478e+06 | 475848.051335 | G | 5 | NAD83 | NAD83 | 28 | ... | 1953-11-10 | 2020-06-30 | 233.0 | POINT (475848.0513348849 1160478.108369947) | 2008-01-08 | 2020-09-22 | 4592.0 | 07288650 | fluxes | delta |
07288700 | 07288700 | BIG SUNFLOWER RIVER NR ANGUILLA, MS | ST | 1.113472e+06 | 484902.570251 | M | M | NAD27 | NAD83 | 28 | ... | 1980-07-23 | 2011-02-15 | 35.0 | POINT (484902.5702511699 1113472.33857191) | NaN | NaN | NaN | 07288700 | fluxes | delta |
5 rows × 47 columns
Combine the measured and estimated observations into one dataset¶
Resample both datasets to a continuous monthly frequency (labeled with the start date of each month)
Prioritize measured data when available; fill gaps with statistical estimates
Once the model has been run, the observation output from the model can be fed to ``modflow-obs`` along with the
combined_obs
timeseries (below), to set up ``PEST`` observations, or create observation output to be read by ``PEST``.
[21]:
from mapgwm.swflows import combine_measured_estimated_values
combined_obs = combine_measured_estimated_values(meas_data, est_data,
measured_values_data_col='qbase_m3d',
estimated_values_data_col='qbase_m3d',
resample_freq='MS', how='mean')
[22]:
combined_obs.head()
[22]:
site_no | line_id | datetime | category | est_qtotal_m3d | est_qbase_m3d | meas_qtotal_m3d | meas_qbase_m3d | obsval | |
---|---|---|---|---|---|---|---|---|---|
0 | 07280400 | NaN | 2008-01-01 | measured | NaN | NaN | 2.219044e+05 | 99816.041527 | 99816.041527 |
1 | 07280400 | NaN | 2008-02-01 | measured | NaN | NaN | 3.343203e+05 | 130662.640359 | 130662.640359 |
2 | 07280400 | NaN | 2008-03-01 | measured | NaN | NaN | 5.228016e+05 | 112159.341365 | 112159.341365 |
3 | 07280400 | NaN | 2008-04-01 | measured | NaN | NaN | 1.080310e+06 | 197271.496485 | 197271.496485 |
4 | 07280400 | NaN | 2008-05-01 | measured | NaN | NaN | 6.351547e+05 | 165388.483203 | 165388.483203 |
plot data for a site of interest¶
[23]:
site_no = '07288280'
site_data = combined_obs.loc[combined_obs.site_no == site_no].copy()
site_data.index = site_data.datetime
fig, ax = plt.subplots(figsize=(11, 8.5))
ax = site_data[['meas_qtotal_m3d','meas_qbase_m3d']].plot(ax=ax)
ax = site_data[['est_qtotal_m3d','est_qbase_m3d']].plot(alpha=0.3, ax=ax, color=['C0', 'C1'])
site_data.obsval.plot(ax=ax, c='k')
ax.set_title(meas_metadata.loc[site_no, 'name'])
ax.legend()
ax.set_ylabel('Flow, in m$^3$/day')
[23]:
Text(0, 0.5, 'Flow, in m$^3$/day')
Preprocess head observations¶
Preprocess head observation data, from the visGWDB program * reproject to the model CRS * cull data to a start_date
and optionally, a polygon or set of polygons defining the model area * length units are converted to those of the groundwater model. Open intervals for the wells are converted from depths to elevations * missing open intervals are filled based on well bottom depths (if availabile) and the median open interval length for the dataset. *
Wells are categorized based on the quality of the open interval information (see the documentation for :func:mapgwm.headobs.fill_well_open_intervals
). * Prefixes for observation names (with an optional length limit) that identify the location are generated * Similar to with flux observations, assign preliminary observation groups based on geographic areas defined by the * The data
DataFrame outputs have the time series; * The metadata
DataFrame outputs have site information that
can be input to ``modflow-setup`` to set up the observation input for MODFLOW-6. * Once the model has been run, the observation output from the model can be fed to ``modflow-obs`` along with the data
timeseries (below), to set up ``PEST`` observations, or create observation output to be read by ``PEST``.
[24]:
from mapgwm.headobs import preprocess_headobs, get_data
data_file = test_data_path / 'headobs/GW_monthly_stats1990-01-01_2019-12-31.txt'
metadata_file = test_data_path / 'headobs/GW_monthly_meta1990-01-01_2019-12-31.txt'
# output
outputfile = output_folder / 'preprocessed_monthly_headobs.csv'
[25]:
# read the data
data_orig, metadata_orig = get_data(data_file, metadata_file)
# do the preprocessing
data, metadata = preprocess_headobs(data_orig, metadata_orig,
head_data_columns=['head', 'last_head'],
data_length_units='feet',
active_area=delta_extent,
source_crs=4269, dest_crs=5070,
start_date=start_date,
geographic_groups=geographic_groups,
geographic_groups_col='obsgroup',
outfile=outputfile)
starting with 10 unique wells
culling 13 wells with only data prior to start date of 1998-04-01 00:00:00
reading ../mapgwm/tests/data/extents/ms_delta.shp...
--> building dataframe... (may take a while for large shapefiles)
Culling 2 sites outside of the model area defined by ['../mapgwm/tests/data/extents/ms_delta.shp'].
50% of wells (n=4) have incomplete open interval information. Filling screen bottoms with well bottoms where available, and estimating screen tops from median open interval length.
12% of wells (n=1) still missing open interval information; assigned to category 4.
reading ../mapgwm/tests/data/extents/MAP_generalized_regions.shp...
--> building dataframe... (may take a while for large shapefiles)
reading ../mapgwm/tests/data/extents/CompositeHydrographArea.shp...
--> building dataframe... (may take a while for large shapefiles)
writing output/preprocessed_monthly_headobs_info.shp... Done
writing output/preprocessed_monthly_headobs_info.csv
writing output/preprocessed_monthly_headobs.csv
[26]:
data.head()
[26]:
site_no | datetime | head | last_head | head_std | n | obsprefix | |
---|---|---|---|---|---|---|---|
4 | USGS:332915091025901 | 2003-07-01 | 37.185600 | 37.185600 | NaN | 1 | USGS:332915091025901 |
5 | USGS:333411090482201 | 2008-03-01 | 28.096464 | 28.096464 | NaN | 1 | USGS:333411090482201 |
6 | USGS:333411090482201 | 2010-03-01 | 27.736800 | 27.736800 | NaN | 1 | USGS:333411090482201 |
7 | USGS:333411090482201 | 2015-03-01 | 26.560272 | 26.560272 | NaN | 1 | USGS:333411090482201 |
8 | USGS:333411090482201 | 2019-03-01 | 26.484072 | 26.484072 | NaN | 1 | USGS:333411090482201 |
[27]:
metadata.head()
[27]:
head | head_std | last_head | n | aqfr_cd | nat_aqfr_cd | screen_botm | screen_top | well_depth | well_el | ... | x | y | geometry | well_botm | category | orig_scbot | orig_sctop | obsprefix | group | geo_group | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
site_no | |||||||||||||||||||||
USGS:323340091085201 | 19.50720 | NaN | 19.50720 | 1 | 112MRVA | N100MSRVVL | -88.177600 | -75.985600 | 115.0 | 26.822400 | ... | 452922.427646 | 1.065894e+06 | POINT (452922.4276462687 1065893.772968342) | -88.177600 | 2 | NaN | NaN | USGS:323340091085201 | heads | boeuf |
USGS:325121091093801 | 29.87040 | NaN | 29.87040 | 1 | 112MRVA | N100MSRVVL | -58.605600 | -46.413600 | 90.0 | 31.394400 | ... | 450055.327781 | 1.098664e+06 | POINT (450055.3277808698 1098664.367787851) | -58.605600 | 2 | NaN | NaN | USGS:325121091093801 | heads | boeuf |
USGS:332915091025901 | 37.18560 | NaN | 37.18560 | 1 | 112MRVA | N100MSRVVL | 6.705600 | 15.240000 | 108.0 | 39.624000 | ... | 456688.018145 | 1.169617e+06 | POINT (456688.0181453624 1169616.855672613) | -68.376000 | 3 | 6.7056 | 15.24 | USGS:332915091025901 | heads | delta |
USGS:333411090482201 | 27.60726 | NaN | 27.60726 | 40 | 112MRVA | N100MSRVVL | -81.262968 | -69.070968 | 120.0 | 38.737032 | ... | 478637.876037 | 1.179983e+06 | POINT (478637.8760365412 1179983.321431023) | -81.262968 | 2 | NaN | NaN | USGS:333411090482201 | heads | delta |
USGS:333633090230701 | 23.77440 | NaN | 23.77440 | 1 | 112MRVA | N100MSRVVL | 3.048000 | 15.240000 | 115.0 | 38.100000 | ... | 517102.401121 | 1.186603e+06 | POINT (517102.4011205511 1186603.416371753) | -76.900000 | 3 | 3.0480 | 15.24 | USGS:333633090230701 | heads | delta_cha |
5 rows × 25 columns
Preprocess ag. water use from the IWUM¶
read agricultural water use estimates from the Irrigation Water Use Model (IWUM; Wilson, 2020)
cull data to the model area
convert rates from cumulative volumes per time period to cubic meters per day
assign open intervals based on raster surfaces of the estimated production zone for agricultural groundwater use
[28]:
from mapgwm.iwum import preprocess_iwum_pumping
iwum_output = test_data_path / 'iwum/irr_wu_1km_1999_to_2017.nc'
production_zone_top_raster = test_data_path / 'iwum/est_prod_zone_top.tif'
production_zone_botm_raster = test_data_path / 'iwum/est_prod_zone_botm.tif'
outfile = output_folder / 'iwum_m3.csv'
[29]:
ag_wu = preprocess_iwum_pumping(iwum_output,
start_date=None,
end_date=None,
active_area=delta_extent,
estimated_production_zone_top=production_zone_top_raster,
estimated_production_zone_botm=production_zone_botm_raster,
flux_variable='value',
nc_crs=5070,
nc_length_units='meters',
estimated_production_surface_units='meters',
model_length_units='meters',
outfile=outfile)
reading data from ../mapgwm/tests/data/iwum/est_prod_zone_top.tif...
finished in 0.40s
reading data from ../mapgwm/tests/data/iwum/est_prod_zone_botm.tif...
finished in 0.40s
Reset screen top and bottom to mean elevation at 9 locations where screen top was < screen bottom
reading ../mapgwm/tests/data/extents/ms_delta.shp...
--> building dataframe... (may take a while for large shapefiles)
Culling 14086 sites outside of the model area defined by ['../mapgwm/tests/data/extents/ms_delta.shp'].
wrote output/iwum_m3.csv
[30]:
ag_wu.head()
[30]:
site_no | x | y | screen_top | screen_botm | start_datetime | end_datetime | q | |
---|---|---|---|---|---|---|---|---|
179 | iwum_179 | 477955.0 | 1041285.0 | 14.093952 | 6.803136 | 2017-06-01 | 2017-06-30 | -2043.222778 |
314 | iwum_314 | 476955.0 | 1042285.0 | 13.795248 | 6.400800 | 2017-06-01 | 2017-06-30 | -2093.868652 |
315 | iwum_315 | 477955.0 | 1042285.0 | 13.722096 | 6.352032 | 2017-06-01 | 2017-06-30 | -1526.927734 |
443 | iwum_443 | 469955.0 | 1043285.0 | 11.457432 | 3.328416 | 2017-06-01 | 2017-06-30 | -1283.890991 |
444 | iwum_444 | 470955.0 | 1043285.0 | 11.884152 | 3.947160 | 2017-06-01 | 2017-06-30 | -1245.565308 |
Plot average pumping at each location¶
[31]:
site_means = ag_wu.groupby('site_no').mean()
ax = site_means.plot(kind='scatter', x='x', y='y', c='q', s=1,
cmap='viridis_r', vmin=site_means.q.quantile(0.1))
ax.set_aspect(1)
Preprocess non-ag. water use¶
Including data from * the USGS Site-Specific Water Use Database (SWUDS) * USGS data on water use for thermo-electric power generation
SWUDs data¶
SWUDs data are assumed to come in an Excel spreadsheet, that may have incomplete information on well screen intervals
similar to other preprocessing methods, the data are reprojected and optionally culled to an area of interest
if input data do not have information on the well screen intervals, screen tops and bottoms are sampled from raster surfaces bounding estimated production zones; well depth is used to discriminate between zones
data are resampled to continous monthly values between a specified start and end date- typically bracketing the time range that the pumping should be simulated in the model
[32]:
from mapgwm.swuds import preprocess_swuds
swuds_input = test_data_path / 'swuds/withdrawals_test_dataset.xlsx'
worksheet = 'LMG-withdrawals-2000-2018'
outfile = output_folder / 'preprocessed_swuds_data.csv'
dem = test_data_path / 'rasters/dem_mean_elevs.tif'
# raster defining the tops and bottoms of non-agricultural public supply production intervals
# from Knierem
production_zones = {'mrva': (test_data_path / 'swuds/rasters/pz_MRVA_top.tif',
test_data_path / 'swuds/rasters/pz_MRVA_bot.tif',
),
'mcaq': (test_data_path / 'swuds/rasters/pz_MCAQP_top.tif',
test_data_path / 'swuds/rasters/pz_MCAQP_bot.tif',
'feet'),
'lcaq': (test_data_path / 'swuds/rasters/pz_LCAQP_top.tif',
test_data_path / 'swuds/rasters/pz_LCAQP_bot.tif')
}
[33]:
swuds_results = preprocess_swuds(swuds_input, worksheet,
dem=dem, dem_units='feet',
start_date=None,
end_date=None,
active_area=test_data_path / 'extents/ms_delta.shp',
production_zones=production_zones,
source_crs=4269,
dest_crs=5070,
data_length_units='feet',
data_volume_units='mgal',
model_length_units='meters',
outfile=outfile)
reading ../mapgwm/tests/data/extents/ms_delta.shp...
--> building dataframe... (may take a while for large shapefiles)
Culling 831 sites outside of the model area defined by ['../mapgwm/tests/data/extents/ms_delta.shp'].
reading data from ../mapgwm/tests/data/rasters/dem_mean_elevs.tif...
finished in 0.26s
reading data from ../mapgwm/tests/data/swuds/rasters/pz_MRVA_top.tif...
finished in 0.01s
reading data from ../mapgwm/tests/data/swuds/rasters/pz_MRVA_bot.tif...
finished in 0.01s
reading data from ../mapgwm/tests/data/swuds/rasters/pz_MCAQP_top.tif...
finished in 0.01s
reading data from ../mapgwm/tests/data/swuds/rasters/pz_MCAQP_bot.tif...
finished in 0.01s
reading data from ../mapgwm/tests/data/swuds/rasters/pz_LCAQP_top.tif...
finished in 0.01s
reading data from ../mapgwm/tests/data/swuds/rasters/pz_LCAQP_bot.tif...
finished in 0.01s
processed SWUDS data written to output/preprocessed_swuds_data.csv and in dataframe attribute
writing output/preprocessed_swuds_data.shp... Done
A mapgwm.swuds.Swuds
object is returned, which has a dataframe attribute:
[34]:
swuds_results.df.head()
[34]:
site_no | q | q_monthly | month | well_elev | depth | screen_bot | screen_top | x | y | start_datetime | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2010-01-01 | 322431090530201 | 20895.471597 | 5.52 | JAN_VAL | 85.0 | 37.1856 | 47.8144 | 67.8144 | 478430.626345 | 1.050261e+06 | 2010-01-01 | POINT (478430.6263445358 1050260.766381889) |
2010-02-01 | 322431090530201 | 20895.471597 | 5.52 | FEB_VAL | 85.0 | 37.1856 | 47.8144 | 67.8144 | 478430.626345 | 1.050261e+06 | 2010-02-01 | POINT (478430.6263445358 1050260.766381889) |
2010-03-01 | 322431090530201 | 20895.471597 | 5.52 | MAR_VAL | 85.0 | 37.1856 | 47.8144 | 67.8144 | 478430.626345 | 1.050261e+06 | 2010-03-01 | POINT (478430.6263445358 1050260.766381889) |
2010-04-01 | 322431090530201 | 20895.471597 | 5.52 | APR_VAL | 85.0 | 37.1856 | 47.8144 | 67.8144 | 478430.626345 | 1.050261e+06 | 2010-04-01 | POINT (478430.6263445358 1050260.766381889) |
2010-05-01 | 322431090530201 | 20895.471597 | 5.52 | MAY_VAL | 85.0 | 37.1856 | 47.8144 | 67.8144 | 478430.626345 | 1.050261e+06 | 2010-05-01 | POINT (478430.6263445358 1050260.766381889) |
Water use for thermoelectric power generation¶
similar to the SWUDs data, data for water use related to TE power generation are assumed to come in a spreadsheet
however, the column names might different between different datasets.
therefore, the first step is to read in the data using a function that harmonizes the column names, and then combine the data using
DataFrame.append()
:
[35]:
from mapgwm.te_wateruse import read_te_water_use_spreadsheet, preprocess_te_wateruse
df2010 = read_te_water_use_spreadsheet(test_data_path / 'swuds/2010_Thermo_Model_Estimates.xlsx',
date='2010',
x_coord_col='Longitude, decimal degrees',
y_coord_col='Latitude, decimal degrees',
q_col='USGS-Estimated Annual Withdrawal (Mgal/d)',
site_no_col='PLANT CODE',
site_name_col='PLANT NAME',
source_name_col='NAME OF WATER SOURCE',
source_code_col='USGS WATER SOURCE CODE',
sheet_name='Report_table_UPDATED', skiprows=2)
df2015 = read_te_water_use_spreadsheet(test_data_path / 'swuds/2015_TE_Model_Estimates_lat.long_COMIDs.xlsx',
date='2015',
x_coord_col='LONGITUDE',
y_coord_col='LATITUDE',
q_col='WITHDRAWAL',
site_no_col='EIA_PLANT_ID',
site_name_col='PLANT_NAME',
source_name_col='NAME_OF_WATER_SOURCE',
source_code_col='WATER_SOURCE_CODE',
sheet_name='2015_ANNUAL_WD_CU', skiprows=0)
df = df2010.append(df2015)
We can then run the data through a similar function to that used for SWUDs:
[36]:
outfile = output_folder / 'preprocessed_te_data.csv'
estimated_production_zone_top = test_data_path / 'swuds/rasters/pz_MCAQP_top.tif'
estimated_production_zone_botm = test_data_path / 'swuds/rasters/pz_MCAQP_bot.tif'
te_results = preprocess_te_wateruse(df, #dem=dem, dem_units='feet',
start_date='2008-01-01',
end_date='2017-12-31',
active_area=test_data_path /'extents/ms_delta.shp',
estimated_production_zone_top=estimated_production_zone_top,
estimated_production_zone_botm=estimated_production_zone_botm,
estimated_production_surface_units='feet',
source_crs=4269,
dest_crs=5070,
data_volume_units='mgal',
model_length_units='meters',
outfile=outfile)
reading ../mapgwm/tests/data/extents/ms_delta.shp...
--> building dataframe... (may take a while for large shapefiles)
Culling 405 sites outside of the model area defined by ['../mapgwm/tests/data/extents/ms_delta.shp'].
reading data from ../mapgwm/tests/data/swuds/rasters/pz_MCAQP_top.tif...
finished in 0.17s
reading data from ../mapgwm/tests/data/swuds/rasters/pz_MCAQP_bot.tif...
finished in 0.16s
wrote output/preprocessed_te_data.csv
writing output/preprocessed_te_data.shp... Done
[37]:
te_results.head()
[37]:
site_no | start_datetime | x | y | screen_top | screen_botm | q | geometry | source_name | site_name | |
---|---|---|---|---|---|---|---|---|---|---|
2008-01-01 | 2059 | 2008-01-01 | 497139.150871 | 1.249816e+06 | -147.409514 | -162.649514 | 0.0 | POINT (497139.1508711056 1249816.468462982) | Wells | L L Wilkins |
2008-02-01 | 2059 | 2008-02-01 | 497139.150871 | 1.249816e+06 | -147.409514 | -162.649514 | 0.0 | POINT (497139.1508711056 1249816.468462982) | Wells | L L Wilkins |
2008-03-01 | 2059 | 2008-03-01 | 497139.150871 | 1.249816e+06 | -147.409514 | -162.649514 | 0.0 | POINT (497139.1508711056 1249816.468462982) | Wells | L L Wilkins |
2008-04-01 | 2059 | 2008-04-01 | 497139.150871 | 1.249816e+06 | -147.409514 | -162.649514 | 0.0 | POINT (497139.1508711056 1249816.468462982) | Wells | L L Wilkins |
2008-05-01 | 2059 | 2008-05-01 | 497139.150871 | 1.249816e+06 | -147.409514 | -162.649514 | 0.0 | POINT (497139.1508711056 1249816.468462982) | Wells | L L Wilkins |
Plot the locations of SWUDs and TE water use¶
[38]:
fig, ax = plt.subplots(figsize=(11, 8.5))
ax = swuds_results.df.plot.scatter(x='x', y='y', c='q_monthly',
cmap='viridis',
vmax=swuds_results.df.q.quantile(0.9),
norm=mpl.colors.LogNorm(), ax=ax)
te_results.plot.scatter(x='x', y='y', c='r', ax=ax, label='TE sites')
ax.legend()
ax.set_aspect(1)
cax = fig.get_axes()[1]
#and we can modify it, i.e.:
cax.set_ylabel('Pumpage, in m$^3$/day')
[38]:
Text(0, 0.5, 'Pumpage, in m$^3$/day')