mapgwm.swuds module

class mapgwm.swuds.Swuds(xlsx=None, sheet=None, csvfile=None, site_no_col='SITE_NO', x_coord_col='FROM_DEC_LONG_VA', y_coord_col='FROM_DEC_LAT_VA', start_date=None, end_date=None, source_crs=4269, dest_crs=5070, data_length_units='feet', data_volume_units='mgal', model_length_units='meters', default_screen_len=20, cols='default')

Bases: object

Code for preprocessing non-agricultural water use information into clean CSV input to MODFLOW setup. The class excludes AQ, IR, and TE water use from a SWUDS Excel dataset. It includes logic to fill missing data, because data at many sites are limited to survey years (e.g. 2010 and 2015).

apply_footprint(active_area, active_area_id_column=None, active_area_feature_id=None)

Keep sites in the df pandas dataframe that fall into the passed bounding shapefile polygon. Requires that df dataframe has a Point geometry column as assigned in the reproject method.

Parameters

active_area: str: path to shapefile with footprint for current analysis
active_area_id_column: str, optional: Column in active_area with feature ids. By default, None, in which case all features are used.
active_area_feature_id: str, optional: ID of feature to use for active area. By default, None, in which case all features are used.
outshp: str: optional path to output shapefile with points within the footprint

assign_missing_elevs(top_raster, dem_units='meters')

Use the top of model raster, or land-surface raster, to assign the elevation for points where elevation is missing.

Parameters

top_raster: str: path to raster data set with land surface or model top elevation, used to assign missing values to water-use points
elev_field: str: field in df with elevation data, default is ‘FROM_ALT_VA’

assign_monthly_production(outfile='processed_swuds.csv')

Assign production wells for water use to production zones, skipping IR (irrigation) and TE (thermal electric) use. If production zones are not assigned or if the well bottom doesn’t fall into a production zone, then the screen_top and screen_bot are assigned using well_depth and the default screen length.

Production is given in cubic m per day. todo: add unit conversion parameter so other units can be used?

Parameters

outfile: str: path to final processed monthly water-use file with production zone information
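
The fallback screen assignment and the output units can be sketched as follows. This is a sketch under stated assumptions rather than the method's actual code: it assumes the screen sits at the bottom of the well (bottom elevation = wellhead elevation minus well_depth, top = bottom plus the default screen length) and that input rates are in million US gallons per day. The function names are hypothetical.

```python
M3_PER_MGAL = 3785.411784  # 1e6 US gallons x 0.003785411784 m3 per gallon


def default_screen(wellhead_elev, well_depth, default_screen_len=20.0):
    """Return (screen_top, screen_bot) elevations in the input length units.

    Assumption: the screen occupies the lowest default_screen_len of the well.
    """
    screen_bot = wellhead_elev - well_depth
    screen_top = screen_bot + default_screen_len
    return screen_top, screen_bot


def mgd_to_m3_per_day(rate_mgd):
    """Convert a rate in million gallons per day to cubic meters per day."""
    return rate_mgd * M3_PER_MGAL


screen_top, screen_bot = default_screen(100.0, 60.0)
rate_m3d = mgd_to_m3_per_day(2.0)
```

Any length-unit conversion (for example feet to meters) would be applied to the screen elevations separately; it is omitted here to keep the two ideas apart.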

static fix_path(data_path)

Convert simple path string with forward slashes to a path used by python using os.path.join(). This function allows the user to specify a path in the yaml file simply, for example: d:/home/MAP/source_data/wateruse.csv

The string is split on ‘/’ and resulting list is passed to os.path.join(). If the first entry has a colon, then os.path.join(entry[0], os.path.sep) is used to specify a windows drive properly.

Parameters

data_path: str: path read from yaml file

Returns

data_path: str: path built using os.path.join()
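
A minimal sketch of the logic described above, written as a standalone function (the name fix_path_sketch is hypothetical, and the actual method may differ in details such as whitespace handling):

```python
import os


def fix_path_sketch(data_path):
    """Rebuild a forward-slash path string with os.path.join()."""
    parts = data_path.split("/")
    if ":" in parts[0]:
        # Windows drive, e.g. 'd:' -> 'd:\\' so the joined path is absolute
        parts[0] = os.path.join(parts[0], os.path.sep)
    return os.path.join(*parts)


relative = fix_path_sketch("source_data/wateruse.csv")
with_drive = fix_path_sketch("d:/home/MAP/source_data/wateruse.csv")
```

On Windows the second call yields a path rooted at the d: drive. On other platforms os.path.join() restarts at the absolute separator, so the drive prefix is dropped and the result is rooted at the filesystem root.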

classmethod from_yaml(yamlfile)

Read input and output files from yaml file and run all the processing steps in the class to produce the processed csv file.

Parameters

yamlfile: str: path to a yaml file containing input and output information

Returns

wu: Swuds object: returns a Swuds object and also generates processed csv file specified in the yaml file.

make_production_zones(production_zones, default_elevation_units='feet')

Make dictionary attributes for production zones. These are used to assign individual wells to production zones. The defaultdict is keyed by zone_name and then SITE_NO.

Parameters

zonelist: list of lists: List of production zone information, each zone requires a list with [zone_name, zone_top, zone_bot]
zone_name: str: name assigned to production zone
zone_top: str: path to raster with top of zone
zone_bot: str: path to raster with bottom of zone
key: str: key (column name) to use in the resulting parameter zone dictionaries. Defaults to SITE_NO

reproject(x_coord_col='FROM_DEC_LONG_VA', y_coord_col='FROM_DEC_LAT_VA', key='SITE_NO')

Reproject from self.source_crs to self.dest_crs using gisutils

Parameters

x_coord_col: str, optional: Column name in data with x-coordinates, by default ‘FROM_DEC_LONG_VA’
y_coord_col: str, optional: Column name in data with y-coordinates, by default ‘FROM_DEC_LAT_VA’
key: str: key for the dictionary made, defaults to SITE_NO

sort_sites(primarysort='SITE_NO', secondarysort=None)

Sort the dataframe by site number and quantity, or passed parameter

Parameters

primarysort: str: column used to sort the site groups, defaults to SITE_NO
secondarysort: str: variable in dataframe to sort SITE_NO groups, default is None

mapgwm.swuds.preprocess_swuds(swuds_input, worksheet, csv_input=None, dem=None, dem_units='meters', start_date=None, end_date=None, active_area=None, active_area_id_column=None, active_area_feature_id=None, site_no_col='SITE_NO', x_coord_col='FROM_DEC_LONG_VA', y_coord_col='FROM_DEC_LAT_VA', production_zones=None, estimated_production_surface_units='feet', source_crs=4269, dest_crs=5070, data_length_units='feet', data_volume_units='mgal', model_length_units='meters', outfile=None)

Preprocess water use data from the USGS Site-Specific Water Use Database (SWUDS).

reproject data to a destination CRS (dest_crs)
cull data to an area of interest (active_area)
assign any missing wellhead elevations from a DEM
if input data do not have information on the well screen intervals, sample screen tops and bottoms from raster surfaces bounding an estimated production zone (e.g. estimated_production_zone_top). Well bottom information is used to discriminate between multiple production zones.
reindex the data to continuous monthly values extending from start_date to end_date. Typically, these would bracket the time period for which the pumping should be simulated in a model. For example, the earliest data may be from 2010, but if the model starts in 2008, it may be appropriate to begin using the 2010 rates then (start_date='2008'). If no start or end date is given, the first and last years of pumping in data are used.
fill empty months using 2010 data (the most complete survey year) if available, otherwise use the average value for the site.
backfill any remaining empty months going back to the start_date in the same way
write processed data to a CSV file and shapefile of the same name
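
The reindexing and fill rules in the list above can be sketched for a single site in plain Python. This is an illustration only, not the function's implementation (which presumably operates on a pandas dataframe): the names month_range and fill_monthly, the dict keyed by (year, month), and the choice to match the 2010 value by calendar month are all assumptions.

```python
from statistics import mean


def month_range(start, end):
    """Yield (year, month) tuples from start to end, inclusive."""
    year, month = start
    while (year, month) <= end:
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def fill_monthly(observed, start, end, reference_year=2010):
    """Return a gap-free {(year, month): rate} dict for one site.

    Missing months take the reference-year value for the same calendar
    month if it exists, otherwise the mean of all observed values.
    """
    site_mean = mean(observed.values())
    filled = {}
    for year, month in month_range(start, end):
        if (year, month) in observed:
            filled[(year, month)] = observed[(year, month)]
        elif (reference_year, month) in observed:
            filled[(year, month)] = observed[(reference_year, month)]
        else:
            filled[(year, month)] = site_mean
    return filled


observed = {(2010, 1): 10.0, (2010, 2): 20.0, (2015, 1): 30.0}
filled = fill_monthly(observed, start=(2009, 12), end=(2011, 1))
```

Because the same rule is applied to every month between start and end, months before the first observation are backfilled in the same way as gaps between observations.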

Parameters

swuds_input: str

Excel spreadsheet of SWUDs data, in a format readable by pandas.read_excel(). If xlsx file is passed, then a selected worksheet from it will be converted to a csv file unless the csvfile parameter is specified as None.

worksheet: str

Worksheet in swuds_input to read.

csvfile: str, optional

Path to csv file with data (if xlsx is None) or to a csvfile that is created from the selected worksheet. If xlsx is None, then csvfile must be provided.

sheet: str

Name of worksheet in xlsx to be read, ignored if xlsx is None.

dem: str, optional

DEM raster of the land surface. Used for estimating missing wellhead elevations. Any reprojection to dest_crs is handled automatically, assuming the DEM raster has CRS information embedded (arc-ascii grids do not!) By default, None.

dem_units: str, {‘feet’, ‘meters’, ..}

Units of DEM elevations, by default, ‘meters’

start_date: str

Start date for pumping rates. If earlier than the dates in data, pumping rates will be backfilled to this date.

end_date: str

End date for pumping rates. If later than the dates in data, pumping rates will be forward filled to this date.

active_area: str

Shapefile with polygon to cull observations to. Automatically reprojected to dest_crs if the shapefile includes a .prj file. By default, None.

active_area_id_column: str, optional

Column in active_area with feature ids. By default, None, in which case all features are used.

active_area_feature_id: str, optional

ID of feature to use for active area. By default, None, in which case all features are used.

site_no_col: str, optional

Column name in data with site identifiers, by default ‘SITE_NO’

x_coord_col: str, optional

Column name in data with x-coordinates, by default ‘FROM_DEC_LONG_VA’

y_coord_col: str, optional

Column name in data with y-coordinates, by default ‘FROM_DEC_LAT_VA’

production_zones: dict

Dictionary of production zone tops and bottoms, and optionally, production zone surface elevation units, keyed by an abbreviated name.

Example:

production_zones={'mrva': (test_data_path / 'swuds/rasters/pz_MCAQP_top.tif',
                           test_data_path / 'swuds/rasters/pz_MCAQP_bot.tif',
                        ),
                  'mcaq': (test_data_path / 'swuds/rasters/pz_MCAQP_top.tif',
                           test_data_path / 'swuds/rasters/pz_MCAQP_bot.tif',
                           'feet')
                  }

Where zone ‘mrva’ has a top and bottom raster, but no units assigned, and zone ‘mcaq’ has a top and bottom raster and units. For zones with no units, the estimated_production_surface_units will be used.
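
One way to read the optional third element is sketched below. The helper name normalize_zones and the placeholder file names are hypothetical; the sketch only shows how a zone entry without units could fall back to estimated_production_surface_units, and does not open any rasters.

```python
def normalize_zones(production_zones, default_units="feet"):
    """Return {zone_name: (top, bottom, units)} with units always filled in."""
    normalized = {}
    for name, spec in production_zones.items():
        top, bottom = spec[0], spec[1]
        # Third element is optional; fall back to the default units
        units = spec[2] if len(spec) > 2 else default_units
        normalized[name] = (top, bottom, units)
    return normalized


zones = normalize_zones(
    {
        "zone_a": ("zone_a_top.tif", "zone_a_bot.tif"),
        "zone_b": ("zone_b_top.tif", "zone_b_bot.tif", "meters"),
    },
    default_units="feet",
)
```

Here zone_a picks up the default units, while zone_b keeps the units it was given.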

estimated_production_surface_units: str, {‘meters’, ‘ft’, etc.}

Length units of elevations in estimated production surface rasters. By default, ‘feet’

source_crs: obj

Coordinate reference system of the water-use site locations. A Python int, dict, str, or pyproj.crs.CRS instance passed to pyproj.crs.CRS.from_user_input()

Can be any of:

PROJ string
Dictionary of PROJ parameters
PROJ keyword arguments for parameters
JSON string with PROJ parameters
CRS WKT string
An authority string [i.e. ‘epsg:4326’]
An EPSG integer code [i.e. 4326]
A tuple of (“auth_name”: “auth_code”) [i.e (‘epsg’, ‘4326’)]
An object with a to_wkt method.
A pyproj.crs.CRS class

By default, epsg:4269

dest_crs: obj

Coordinate reference system of the model. Same input types as source_crs. By default, epsg:5070

data_length_units: str; ‘meters’, ‘feet’, etc.

Units of lengths in data (elevations, etc.). By default, ‘feet’

data_volume_units: str; ‘mgd’, ‘ft3’, etc

Volumetric unit of pumping rates. By default, ‘mgal’ (million gallons; rates are per day, see Notes)

model_length_units: str; ‘meters’, ‘feet’, etc.

Length units of model. By default, ‘meters’

outfile: str

Path for output file. A shapefile of the same name is also written. If None, no output file is written. By default, None

Returns

wu: mapgwm.swuds.Swuds instance

Notes

Time units for SWUDS data and the model are assumed to be days.