mapgwm.te_wateruse module

mapgwm.te_wateruse.preprocess_te_wateruse(data, start_date=None, end_date=None, active_area=None, active_area_id_column=None, active_area_feature_id=None, estimated_production_zone_top=None, estimated_production_zone_botm=None, estimated_production_surface_units='feet', source_crs=4269, dest_crs=5070, interp_method='linear', data_volume_units='mgal', model_length_units='meters', outfile=None)

Preprocess water use data from thermoelectric power plants:

  • reproject data to a destination CRS (dest_crs)

  • cull data to an area of interest (active_area)

  • if input data do not have information on the well screen intervals, sample screen tops and bottoms from raster surfaces bounding an estimated production zone (e.g. estimated_production_zone_top and estimated_production_zone_botm)

  • reindex the data to continuous monthly values extending from start_date to end_date. Typically, these would bracket the time period for which the pumping should be simulated in a model. For example, the earliest data may be from 2010, but if the model starts in 2008, it may be appropriate to begin using the 2010 rates then (start_date='2008'). If no start or end date is given, the first and last years of pumping in data are used.

  • fill empty months by interpolation via a specified interp_method

  • backfill any remaining empty months going back to the start_date

  • write processed data to a CSV file and shapefile of the same name
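The reindexing and filling steps above can be illustrated with plain pandas. This is only a minimal sketch of the idea, not the function's actual implementation; the rates, dates, and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical annual pumping rates for a single plant (illustrative values)
rates = pd.Series(
    [10.0, 22.0],
    index=pd.to_datetime(["2010-01-01", "2011-01-01"]),
    name="q",
)

# Continuous monthly index bracketing the simulation period
start_date, end_date = "2009-10-01", "2011-03-01"
monthly_index = pd.date_range(start_date, end_date, freq="MS")

monthly = (
    rates.reindex(monthly_index)                          # insert empty months as NaN
    .interpolate(method="linear", limit_area="inside")    # fill gaps between known rates
    .bfill()                                              # extend the first rate back to start_date
    .ffill()                                              # extend the last rate forward to end_date
)
```

Here the months between January 2010 and January 2011 are interpolated, while the months before 2010 and after January 2011 take the nearest known rate.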

Parameters
data : DataFrame

Thermoelectric water use data in the following format (similar to that output by mapgwm.te_wateruse.read_te_water_use_spreadsheet()):

site_no

power plant identifier (plant code)

start_datetime

pandas datetime representative of flux (e.g. ‘2010’)

x

x-coordinate of withdrawal, in source_crs

y

y-coordinate of withdrawal, in source_crs

q

withdrawal flux, in data_volume_units per day

start_date : str

Start date for pumping rates. If earlier than the dates in data, pumping rates will be backfilled to this date.

end_date : str

End date for pumping rates. If later than the dates in data, pumping rates will be forward filled to this date.

active_area : str

Shapefile with polygon to cull observations to. Automatically reprojected to dest_crs if the shapefile includes a .prj file. By default, None.

active_area_id_column : str, optional

Column in active_area with feature ids. By default, None, in which case all features are used.

active_area_feature_id : str, optional

ID of feature to use for active area. By default, None, in which case all features are used.

estimated_production_zone_top : file path

Raster surface for assigning screen tops

estimated_production_zone_botm : file path

Raster surface for assigning screen bottoms

estimated_production_surface_units : str, {‘meters’, ‘ft’, etc.}

Length units of elevations in estimated production surface rasters.

source_crs : obj

Coordinate reference system of the withdrawal locations (x and y in data). A Python int, dict, str, or pyproj.crs.CRS instance passed to pyproj.crs.CRS.from_user_input().

Can be any of:
  • PROJ string

  • Dictionary of PROJ parameters

  • PROJ keyword arguments for parameters

  • JSON string with PROJ parameters

  • CRS WKT string

  • An authority string [e.g. ‘epsg:4326’]

  • An EPSG integer code [e.g. 4326]

  • A tuple of (“auth_name”, “auth_code”) [e.g. (‘epsg’, ‘4326’)]

  • An object with a to_wkt method.

  • A pyproj.crs.CRS instance

By default, epsg:4269

dest_crs : obj

Coordinate reference system of the model. Same input types as source_crs. By default, epsg:5070

interp_method : str

Interpolation method to use for filling pumping rates to monthly values. By default, ‘linear’

data_volume_units : str; e.g. ‘mgal’, ‘m3’, ‘cubic feet’, etc.

Volume units of pumping data. All time units are assumed to be in days.

model_length_units : str; e.g. ‘feet’, ‘m’, ‘meters’, etc.

Length units of model.

outfile : str

Path for output file. A shapefile of the same name is also written. If None, no output file is written. By default, None

Returns
df_monthly : DataFrame

Notes

  • time units for TE data and model are assumed to be days
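Because both the data and the model use days as the time unit, converting q comes down to a volume conversion. The following is a minimal sketch of that arithmetic, assuming that ‘mgal’ denotes million US gallons and that the target is cubic model length units per day; the helper name is hypothetical and is not part of mapgwm.

```python
# Exact by definition: 1 US gallon = 3.785411784 liters
M3_PER_US_GALLON = 3.785411784e-3
# Exact by definition: 1 foot = 0.3048 meters
M3_PER_CUBIC_FOOT = 0.3048 ** 3


def mgal_per_day_to_model_units(q_mgal_d, model_length_units="meters"):
    """Convert million US gallons per day to cubic model length units per day.

    Hypothetical helper for illustration only.
    """
    q_m3_d = q_mgal_d * 1e6 * M3_PER_US_GALLON
    if model_length_units in ("meters", "m"):
        return q_m3_d
    if model_length_units in ("feet", "ft"):
        return q_m3_d / M3_PER_CUBIC_FOOT
    raise ValueError(f"unsupported length units: {model_length_units!r}")
```

Under these assumptions, a rate of 1 mgal per day corresponds to about 3,785 cubic meters per day.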

mapgwm.te_wateruse.read_te_water_use_spreadsheet(xlsx_file, date='2010', site_no_col='site_no', site_name_col='site_name', x_coord_col='x', y_coord_col='y', q_col='q', source_name_col='source_name', source_code_col='WATER_SOURCE_CODE', source_code_filter=['GW'], **kwargs)

Read water use data for thermoelectric power generation from a spreadsheet; filter by a source code, and rename columns to uniform names so that multiple datasets from spreadsheets with different formatting can be easily combined.
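The filter-and-rename idea can be sketched with plain pandas. This is an illustration under assumptions, not the function's actual implementation: the raw column names (other than WATER_SOURCE_CODE) and all values are hypothetical, and the small DataFrame stands in for the result of pandas.read_excel().

```python
import pandas as pd

# Hypothetical raw table standing in for the output of pandas.read_excel()
raw = pd.DataFrame({
    "PLANT_CODE": [1001, 1002, 1003],
    "PLANT_NAME": ["Plant A", "Plant B", "Plant C"],
    "LONGITUDE": [-90.5, -91.2, -89.9],
    "LATITUDE": [34.1, 33.7, 35.0],
    "WITHDRAWAL": [1.5, 0.8, 12.0],
    "WATER_SOURCE_CODE": ["GW", "SW", "GW"],
})

# Keep only rows whose source code is in the filter (groundwater here)
source_code_filter = ["GW"]
keep = raw["WATER_SOURCE_CODE"].isin(source_code_filter)

# Map the spreadsheet-specific column names to the uniform names
renames = {
    "PLANT_CODE": "site_no",
    "PLANT_NAME": "site_name",
    "LONGITUDE": "x",
    "LATITUDE": "y",
    "WITHDRAWAL": "q",
}
df = raw.loc[keep].rename(columns=renames)

# Stamp every record with the date the spreadsheet represents
df["date"] = pd.Timestamp("2010")
df = df[["site_no", "date", "x", "y", "q", "site_name"]].reset_index(drop=True)
```

Tables produced this way from differently formatted spreadsheets share the same columns, so they can be combined with pandas.concat().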

Parameters
xlsx_file : Excel spreadsheet

Thermoelectric power water use data, in a format readable by pandas.read_excel().

date : str

Date string in a format readable by pandas.Timestamp(), indicating the time of the water use data (most likely a year).

site_no_col : str

Column in xlsx_file with identifiers for water users (power plants).

site_name_col : str

Column in xlsx_file with names of water users (power plants).

x_coord_col : str

Column in xlsx_file with power plant location x-coordinates.

y_coord_col : str

Column in xlsx_file with power plant location y-coordinates.

q_col : str

Column in xlsx_file with power plant pumping rates.

source_name_col : str

Column in xlsx_file with names of water sources.

source_code_col : str

Column in xlsx_file with codes for water supply sources.

source_code_filter : sequence of strings

Source code(s) in source_code_col to filter on; e.g. ‘GW’ for groundwater

kwargs : keyword arguments to pandas.read_excel()

Arguments to read_excel that might be needed to read xlsx_file; for example sheet_name or skiprows.

Returns
df : DataFrame

TE data with unified column names:

site_no

power plant identifier (plant code)

date

pandas datetime representative of flux (e.g. ‘2010’)

x

x-coordinate of withdrawal, in source_crs

y

y-coordinate of withdrawal, in source_crs

q

withdrawal flux, in data_volume_units per day

site_name

name of power plant, if provided
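The two functions in this module can be chained: read a spreadsheet, then preprocess the result. The sketch below uses only the arguments listed in the signatures above; the wrapper function, its parameter names, and the column rename are assumptions made for illustration. In particular, the rename is defensive: the reader is documented to return a date column while preprocess_te_wateruse documents start_datetime, and which name the installed version actually uses should be checked.

```python
def build_te_pumping(xlsx_file, active_area, zone_top, zone_botm, outfile=None):
    """Hypothetical helper chaining the two documented functions."""
    # Imported inside the function so that mapgwm is only required when it is called
    from mapgwm.te_wateruse import (
        preprocess_te_wateruse,
        read_te_water_use_spreadsheet,
    )

    # Read 2010 data and keep only groundwater sources
    df = read_te_water_use_spreadsheet(
        xlsx_file, date="2010", source_code_filter=["GW"]
    )

    # Assumption: rename 'date' to 'start_datetime' only if the latter is missing
    if "start_datetime" not in df.columns and "date" in df.columns:
        df = df.rename(columns={"date": "start_datetime"})

    # Backfill the 2010 rates to a model start in 2008
    return preprocess_te_wateruse(
        df,
        start_date="2008",
        active_area=active_area,
        estimated_production_zone_top=zone_top,
        estimated_production_zone_botm=zone_botm,
        estimated_production_surface_units="feet",
        source_crs=4269,
        dest_crs=5070,
        interp_method="linear",
        data_volume_units="mgal",
        model_length_units="meters",
        outfile=outfile,
    )
```

Passing a path as outfile would also write the CSV file and the shapefile of the same name described above.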