Files
weewx/bin/wee_import
gjr80 f2957cfc22 Various minor changes
- changed Cumulus source parameter from folder to directory
- moved startup message so that it still shows first
2016-09-03 09:24:16 +10:00

674 lines
29 KiB
Python

#!/usr/bin/env python
#
# Copyright (c) 2009-2016 Tom Keffer <tkeffer@gmail.com> and
# Gary Roderick
#
# See the file LICENSE.txt for your rights.
#
"""Import weewx observation data from an external source.
Compatibility:
wee_import can import from a Comma Separated Values (CSV) format file,
directly from the historical records of a Weather Underground Personal
Weather Station or from one or more Cumulus monthly log files. CSV format
files must have a comma separated list of field names on the first line.
Design
wee_import utilises a config file and a number of command line options to
control the import. The config file defines the type of input to be
performed and the import data source as well as more advanced options such
as field maps etc. Details of the supported command line parameters/options
can be viewed by entering wee_import --help at the command line. Details of
the wee_import config file settings can be found in sample import conf files
distributed in the weewx/util/import directory.
wee_import utilises an abstract base class that defines the majority of the
wee_import functionality. The abstract base class and other supporting
structures are in bin/weeimport/weeimport.py. Child classes are created from
the base class for each different import type supported by wee_import. The
child classes set a number of import type specific properties as well as
defining a getData() method that reads the raw data to be imported and a
period_generator() method that generates a sequence of objects to be imported
(eg monthly log files). This way wee_import can be extended to support other
sources by defining a new child class, its specific properties as well as
getData() and period_generator() methods. The child class for a given import
type are definined in the bin/weeimport/xxximport.py files.
As with other weewx utilities, wee_import advises the user of basic
configuration, action taken and results via stdout. However, since
wee_import can make substantial changes to the weewx archive, wee_import also
logs to a file by default. This functionality is controlled via a command
line option.
Prerequisites
wee_import uses a number of weewx API calls and therefore must have a
functional weewx installation. wee_import requires weewx 3.6.0 or later.
Configuration
A number of parameters can be defined in the import config file as follows:
# EXAMPLE WEE_IMPORT CONFIGURATION FILE
#
# Copyright (c) 2009-2016 Tom Keffer <tkeffer@gmail.com>
# See the file LICENSE.txt for your rights.
##############################################################################
# Specify the source. Available options are:
# CSV - import obs from a single CSV format file
# Wunder - import obs from a Weather Underground PWS history
# Cumuls - import obs from a one or more Cumulus monthly log files
# Format is:
# source = (CSV | Wunder | Cumulus)
source = CSV
##############################################################################
# Path and name of our CSV source file. Format is:
# file = full path and filename
file = /var/tmp/data.csv
# If there is no mapped interval field how will the interval field be
# determined for the imported records. Available options are:
# derive - Derive the interval field from the timestamp of successive
# records. This setting is best used when the imported records
# are equally spaced in time and there are no missing records.
# conf - Use the interval setting from weewx.conf. This setting is
# best used if the records to be imported have been produced by
# weewx using the same archive interval as set in weewx.conf on
# this machine.
# x - Use a fixed interval of x minutes for every record. This
# setting is best used if the records to be imported are
# equally based in time but there are some missing records.
#
# Note: If there is a mapped interval field then this setting will be
# ignored.
# Format is:
# interval = (derive | conf | x)
interval = derive
# Should any missing derived observations be calculated from the imported
# data if possible. Available options are:
# True - Any missing derived observations are calculated.
# False - Any missing derived observations are not calculated.
# Format is:
# calc_missing = (True | False)
calc_missing = True
# Imported records are written to archive in transactions of tranche
# records at a time. Increase for faster throughput, decrease to reduce
# memory requirements. Format is:
# tranche = x
# where x is an integer
tranche = 250
# Specify whether a UV sensor was used to produce any UV observations.
# Available options are:
# True - UV sensor was used and UV data will be imported.
# False - UV sensor was not used and any UV data will not be imported.
# UV fields will be set to None/NULL.
# For a CSV import UV_sensor should be set to False if a UV sensor was
# NOT present when the import data was created. Otherwise it may be set to
# True or omitted. Format is:
# UV_sensor = (True | False)
UV_sensor = True
# Specify whether a solar radiation sensor was used to produce any solar
# radiation observations. Available options are:
# True - Solar radiation sensor was used and solar radiation data will
# be imported.
# False - Solar radiation sensor was not used and any solar radiation
# data will not be imported. radiation fields will be set to
# None/NULL.
# For a CSV import solar_sensor should be set to False if a solar radiation
# sensor was NOT present when the import data was created. Otherwise it may
# be set to True or omitted. Format is:
# solar_sensor = (True | False)
solar_sensor = True
# Date-time format of CSV field from which the weewx archive record
# dateTime field is to be extracted. wee_import first attempts to interpret
# date/time info in this format, if this fails it then attempts to
# interpret it as a timestamp and if this fails it then raises an error.
# Uses Python strptime() format codes.
# raw_datetime_format = Python strptime() format string
raw_datetime_format = %Y-%m-%d %H:%M:%S
# Does the imported rain field represent the total rainfall since the last
# record or a cumulative value. Available options are:
# discrete - rain field represents total rainfall since last record
# cumulative - rain field represents a cumulative rainfall reset at
# midnight
# rain = (discrete | cumulative)
rain = cumulative
# Lower and upper bounds for imported wind direction. It is possible,
# particularly for a calculated direction, to have a value (eg -45) outside
# of the weewx limits (0 to 360 inclusive). Format is:
#
# wind_direction = lower,upper
#
# where :
# lower is the lower limit of acceptable wind direction in degrees
# (may be negative)
# upper is the upper limit of acceptable wind direction in degrees
#
# Imported values from lower to upper will be normalised to the range 0 to
# 360. Values outside of the parameter range will be stored as None. Default
# is -360,360.
wind_direction = -360,360
# Map CSV record fields to weewx archive fields. Format is:
#
# weewx_archive_field_name = csv_field_name, weewx_unit_name
#
# where:
# weewx_archive_field_name - An observation name in the weewx database
# schema.
# csv_field_name - The name of a field from the CSV file.
# weewx_unit_name - The name of the units, as defined in weewx,
# used by csv_field_name. This value represents
# the units used for this field in the CSV
# file, wee_import will do the necessary
# conversions to the unit system used by the
# weewx archive.
# For example,
# dewpoint = dew, degree_C
# would map the CSV field dew, in degrees C, to the archive field dewpoint.
#
# Archive fields that do not exist in the CSV data may be omitted. Any
# omitted fields that are derived (eg dewpoint) may be calculated during
# import using the equivalent of the weewx StdWXCalculate service through
# setting the calc-missing parameter above.
[[Map]]
dateTime = timestamp, unix_epoch
usUnits =
interval =
barometer = barometer, inHg
pressure =
altimeter =
inTemp =
outTemp = Temp, degree_F
inHumidity =
outHumidity = humidity, percent
windSpeed = windspeed, mile_per_hour
windDir = wind, degree_compass
windGust = gust, mile_per_hour
windGustDir = gustDir, degree_compass
rainRate = rate, inch_per_hour
rain = dayrain, inch
dewpoint =
windchill =
heatindex =
ET =
radiation =
UV =
##############################################################################
[Wunder]
# Parameters used when importing from a WU PWS
# WU PWS Station ID to be used for import.
station_id = XXXXXXXX123
#
# When importing WU data the following weewx database fields will be
# populated directly by the imported data (provided the corresponding data
# exists on WU):
# barometer
# dateTime
# dewpoint
# outHumidity
# outTemp
# radiation
# rain
# windDir
# windGust
# windSpeed
#
# The following weewx database fields will be populated from other
# settings/config files:
# interval
# usUnits
#
# The following weewx database fields will be populated with values derived
# from the imported data provided the --calc-missing command line option is
# used during import:
# altimeter
# ET
# heatindex
# pressure
# rainRate
# windchill
#
# The following weewx fields will be populated with derived values from the
# imported data provided the --calc-missing command line option is used
# during import. These fields will only be saved to the weewx database if
# the weewx schema has been modified to accept them. Note that the pyephem
# module is required in order to calculate maxSolarRad - refer weewx Users
# Guide.
# appTemp
# cloudbase
# humidex
# maxSolarRad
# windrun
# How will the interval field be determined for the imported records.
# Available options are:
# derive - Derive the interval field from the timestamp of successive
# records. This setting is best used when the imported records
# are equally spaced in time and there are no missing records.
# conf - Use the interval setting from weewx.conf. This setting is
# best used if the records to be imported have been produced by
# weewx using the same archive interval as set in weewx.conf on
# this machine.
# x - Use a fixed interval of x minutes for every record. This
# setting is best used if the records to be imported are
# equally based in time but there are some missing records.
# This setting is recommended for WU imports.
# Due to WU frequently missing uploaded records, use of 'derive' may give
# incorrect or inconsistent interval values. Better results may be
# achieved by using the 'conf' setting (if weewx has been doing the WU
# uploading and the weewx archive_interval matches the WU observation
# spacing in time) or setting the interval to a fixed value (eg 5). The
# most appropriate setting will depend on the completeness and (time)
# accuracy of the WU data being imported.
# Format is:
# interval = (derive | conf | x)
interval = x
# Should any missing derived observations be calculated from the imported
# data if possible. Available options are:
# True - Any missing derived observations are calculated.
# False - Any missing derived observations are not calculated.
# Format is:
# calc_missing = (True | False)
calc_missing = True
# Imported records are written to archive in transactions of tranche
# records at a time. Increase for faster throughput, decrease to reduce
# memory requirements. Format is:
# tranche = x
# where x is an integer
tranche = 250
# Lower and upper bounds for imported wind direction. It is possible,
# particularly for a calculated direction, to have a value (eg -45) outside
# of the weewx limits (0 to 360 inclusive). Format is:
#
# wind_direction = lower,upper
#
# where :
# lower is the lower limit of acceptable wind direction in degrees
# (may be negative)
# upper is the upper limit of acceptable wind direction in degrees
#
# WU has at times been known to store large values (eg -9999) for wind
# direction, often no wind direction was uploaded to WU. The wind_direction
# parameter sets a lower and upper bound for valid wind direction values.
# Values inside these bounds are normalised to the range 0 to 360. Values
# outside of the bounds will be stored as None. Default is 0,360
wind_direction = 0,360
##############################################################################
[Cumulus]
# Parameters used when importing Cumulus monthly log files
#
# Directory containing Cumulus monthly log files to be imported. Format is:
# directory = full path without trailing /
directory = /var/tmp/cumulus
# When importing Cumulus monthly log file data the following weewx database
# fields will be populated directly by the imported data:
# barometer
# dateTime
# dewpoint
# heatindex
# inHumidity
# inTemp
# outHumidity
# outTemp
# radiation (if Cumulus data available)
# rain (requires Cumulus 1.9.4 or later)
# rainRate
# UV (if Cumulus data available)
# windDir
# windGust
# windSpeed
# windchill
#
# The following weewx database fields will be populated from other
# settings/config files:
# interval
# usUnits
#
# The following weewx database fields will be populated with values derived
# from the imported data provided the --calc-missing command line option is
# used during import:
# altimeter
# ET
# pressure
#
# The following weewx fields will be populated with derived values from the
# imported data provided the --calc-missing command line option is used
# during import. These fields will only be saved to the weewx database if
# the weewx schema has been modified to accept them. Note that the pyephem
# module is required in order to calculate maxSolarRad - refer weewx Users
# Guide.
# appTemp
# cloudbase
# humidex
# maxSolarRad
# windrun
# How will the interval field be determined for the imported records.
# Available options are:
# derive - Derive the interval field from the timestamp of successive
# records. This setting is best used when the imported records
# are equally spaced in time and there are no missing records.
# conf - Use the interval setting from weewx.conf. This setting is
# best used if the records to be imported have been produced by
# weewx using the same archive interval as set in weewx.conf on
# this machine.
# x - Use a fixed interval of x minutes for every record. This
# setting is best used if the records to be imported are
# equally based in time but there are some missing records.
# This setting is recommended for WU imports.
# To import Cumulus records it is recommended that the interval setting
# be set to the value used in Cumulus as the 'data log interval'.
# Format is:
# interval = (derive | conf | x)
interval = x
# Should any missing derived observations be calculated from the imported
# data if possible. Available options are:
# True - Any missing derived observations are calculated.
# False - Any missing derived observations are not calculated.
# Format is:
# calc_missing = (True | False)
calc_missing = True
# Specify the character used as the field delimiter as Cumulus monthly log
# files may not always use a comma to delimit fields in the monthly log
# files. The character must be enclosed in quotes. Must not be the same
# as the decimal setting below. Format is:
# delimiter = ','
delimiter = ','
# Specify the character used as the decimal point. Cumulus monthly log
# files may not always use a fullstop character as the decimal point. The
# character must be enclosed in quotes. Must not be the same as the
# delimiter setting. Format is:
# decimal = '.'
decimal = '.'
# Imported records are written to archive in transactions of tranche
# records at a time. Increase for faster throughput, decrease to reduce
# memory requirements. Format is:
# tranche = x
# where x is an integer
tranche = 250
# Specify whether a UV sensor was used to produce any UV observations.
# Available options are:
# True - UV sensor was used and UV data will be imported.
# False - UV sensor was not used and any UV data will not be imported.
# UV fields will be set to None/NULL.
# For a Cumulus monthly log file import UV_sensor should be set to False if
# a UV sensor was NOT present when the import data was created. Otherwise
# it may be set to True or omitted. Format is:
# UV_sensor = (True | False)
UV_sensor = True
# Specify whether a solar radiation sensor was used to produce any solar
# radiation observations. Available options are:
# True - Solar radiation sensor was used and solar radiation data will
# be imported.
# False - Solar radiation sensor was not used and any solar radiation
# data will not be imported. radiation fields will be set to
# None/NULL.
# For a Cumulus monthly log file import solar_sensor should be set to False
# if a solar radiation sensor was NOT present when the import data was
# created. Otherwise it may be set to True or omitted. Format is:
# solar_sensor = (True | False)
solar_sensor = True
# For correct import of the monthly logs wee_import needs to know what
# units are used in the imported data. The units used for temperature,
# pressure, rain and windspeed related observations in the Cumulus monthly
# logs are set at the Cumulus Station Configuration Screen. The
# [[Units]] settings below should be set to the weewx equivalent of the
# units of measure used by Cumulus (eg if Cumulus used 'C' for temperature,
# temperature should be set to 'degree_C'). Note that Cumulus does not
# support all units used by weewx (eg 'mmHg') so not all weewx unit are
# available options.
[[Units]]
temperature = degree_C # options are 'degree_F' or 'degree_C'
pressure = hPa # options are 'inHg', 'mbar' or 'hPa'
rain = mm # options are 'inch' or 'mm'
speed = km_per_hour # options are 'mile_per_hour',
# 'km_per_hour', 'knot' or
# 'meter_per_second'
Adding a New Import Source
To add a new import source:
- Create a new file bin/weeimport/xxximport.py that defines a new class
for the xxx source that is a child of class Source. The new class must
meet the following minimum requirements:
- __init__() must define:
- self.raw_datetime_format: Format of date time data field from
which observation timestamp is to be
derived. String comprising Python
strptime() format codes.
- self.rain: Whether imported rainfall field contains the
rainfall since the last record or a cumulative value.
String 'discrete' or 'cumulative'
- self.wind_dir: The range of values in degrees that will be
accepted as a valid wind direction. Two way
tuple of the format (lower, upper) where lower
is the lower inclusive limit and upper is the
upper inclusive limit.
- self.UV_sensor: Was a UV sensor present when the data to be
imported was created. If set True then any UV
data will be imported, if set False then any
imported UV data will ignored and the weewx UV
field set to None. Boolean.
- self.solar_sensor: Was a solar radiation sensor present when
the data to be imported was created. If set
True then any solar radiation data will be
imported, if set False then any imported
solar radiation data will ignored and the
weewx radiation field set to None. Boolean.
- Define a period_generator() method that:
- Accepts no parameters and generates (yields) a sequence of
objects (eg file names, dates for a http request etc) that are
passed to the getRawData() method to obtain a sequence of raw
data records.
- Define a getRawdata() method that:
- Accepts a single parameter 'period' that is provided by the
period_generator() method.
- Returns an iterable of raw source data records.
- Creates the source data field-to-weewx archive field map and
saves the map to the class object map property (the map may be
created using the Source.parseMap() method). Refer to
getRawData() methods in csvimport.py and wunderimport.py.
- Modify bin/weeimport/weeimport.py as follows:
- Add a new entry to the list of supported services defined in
SUPPORTED_SERVICES.
- Create a wee_import import config file for importing from the new
source. The import config file must:
- Add a new source name to the source parameter. The new source name
must be the same (case sensitive) as the entry added to
weeimport.py SUPPORTED_SERVICES.
- Define a stanza for the new import type. The stanza must be the
same (case sensitive) as the entry added to weeimport.py
SUPPORTED_SERVICES.
"""
# Python imports
import logging
import optparse
from distutils.version import StrictVersion
# weewx imports
import weewx
import weeimport
import weeimport.weeimport
# wee_import version number
WEE_IMPORT_VERSION = '0.1'
# minimum weewx version required for this version of wee_import
REQUIRED_WEEWX = "3.5.0"
description = """Import observation data into a weewx archive."""
usage = """wee_import --help
wee_import --version
wee_import --import-config=IMPORT_CONFIG_FILE
[--config=CONFIG_FILE]
[--date YYYY/MM/DD|YYYY/MM/DD hh:mm|YYYY/MM/DD (hh:mm)-YYYY/MM/DD (hh:mm)]
[--dry-run]
[--verbose]
[--log=-|LOG_FILE]
"""
epilog = """wee_import will import data from an external source into a weewx
archive. Daily summaries are updated as each archive record is
imported so there should be no need to separately drop and rebuild
the daily summaries using the wee_database utility."""
def main():
"""The main routine that kicks everything off."""
# Create a command line parser:
parser = optparse.OptionParser(description=description,
usage=usage,
epilog=epilog)
# Add the various options:
parser.add_option("--config", dest="config_path", type=str,
metavar="CONFIG_FILE", default="weewx.conf",
help="Use weewx configuration file CONFIG_FILE.")
parser.add_option("--import-config", dest="import_config_path", type=str,
metavar="IMPORT_CONFIG_FILE", default="wee_import.conf",
help="Use wee_import configuration file IMPORT_CONFIG_FILE.")
parser.add_option("--dry-run", dest="dry_run", action="store_true",
help="Print what would happen but do not do it.")
parser.add_option("--date", dest="date", type=str, metavar="YYYY-MM-DD",
help="Date or time to import as a string of form "
"'YYYY-MM-DD' or 'YYYY-MM-DD hh:mm'. A date range or "
"date-time range may be specified by separating two "
"date strings or two date-time strings with a hyphen. "
"Argument must be enclosed in quotation marks.")
parser.add_option("--log", dest="logging", type=str, metavar="LOG_FILE",
help="Control logging of most output. Log output is sent "
"to LOG_FILE if specified. Log output is turned off "
"by using '--log=-'. Default is to log output to file "
"'wee_import_YYYMMDDhhmmss.log' located in the same "
"directory as IMPORT_CONFIG_FILE where YYYMMDDhhmmss "
"represents the date and time wee_import was run.")
parser.add_option("--verbose", action="store_true", dest="verbose",
help="Print useful extra output.")
parser.add_option("--version", dest="version", action="store_true",
help="Display wee_import version number.")
# Now we are ready to parse the command line:
(options, args) = parser.parse_args()
# check weewx version number for compatibility
if StrictVersion(weewx.__version__) < StrictVersion(REQUIRED_WEEWX):
print "weewx %s or greater is required, found %s. Nothing done, exiting." % (REQUIRED_WEEWX,
weewx.__version__)
exit(1)
# display wee_import version info
if options.version:
print "wee_import version: %s" % WEE_IMPORT_VERSION
exit(0)
# If we made it this far we want to do something so advise the user we are
# starting up
print "Starting wee_import..."
# Set up logging
wlog = weeimport.weeimport.WeeImportLog(options.logging,
options.verbose,
options.dry_run,
options.import_config_path)
# If we got this far we must want to import something so get a Source
# object from our factory and try to import. Be prepared to catch any
# errors though.
try:
source_obj = weeimport.weeimport.Source.sourceFactory(options,
args,
wlog)
source_obj.run()
except weeimport.weeimport.WeeImportIOError, e:
wlog.printlog(logging.INFO, "**** Unable to load source file.")
wlog.printlog(logging.INFO, "**** %s" % e)
print "**** Nothing done, exiting."
wlog.logonly(logging.INFO, "**** Nothing done.")
exit(1)
except weeimport.weeimport.WeeImportFieldError, e:
wlog.printlog(logging.INFO, "**** Unable to map source data.")
wlog.printlog(logging.INFO, "**** %s" % e)
print "**** Nothing done, exiting."
wlog.logonly(logging.INFO, "**** Nothing done.")
exit(1)
except weeimport.weeimport.WeeImportMapError, e:
wlog.printlog(logging.INFO,
"**** Unable to parse source-to-weewx field map.")
wlog.printlog(logging.INFO, "**** %s" % e)
print "**** Nothing done, exiting."
wlog.logonly(logging.INFO, "**** Nothing done.")
exit(1)
except (weewx.ViolatedPrecondition, weewx.UnsupportedFeature), e:
wlog.printlog(logging.INFO, "**** %s" % e)
print "**** Nothing done, exiting."
wlog.logonly(logging.INFO, "**** Nothing done.")
print
parser.print_help()
exit(1)
except SystemExit, e:
print e
exit(0)
except (ValueError, weewx.UnitError), e:
wlog.printlog(logging.INFO, "**** %s" % e)
print "**** Nothing done, exiting."
wlog.logonly(logging.INFO, "**** Nothing done.")
exit(1)
except IOError, e:
wlog.printlog(logging.INFO, "**** Unable to load config file.")
wlog.printlog(logging.INFO, "**** %s" % e)
print "**** Nothing done, exiting."
wlog.logonly(logging.INFO, "**** Nothing done.")
exit(1)
# execute our main code
if __name__ == "__main__":
main()