This module deals with meteorological loggers. A logger is a specialized computer to which sensors are connected and which records their measurements. The logged measurements are eventually transferred to a computer (e.g. through a GSM modem, over the internet, or by visiting the site with a laptop, connecting it to the logger, and copying the data) and written to a file. The Datafile class provides functionality for inserting such data into the database.
The Datafile class is an abstract base class that is meant to be subclassed. It provides functionality that is common to all kinds of data files. However, each make of logger, or even each type of software used to unload data from a logger, provides a different kind of data file. Therefore, Datafile subclasses are specialized, each to a specific kind of file.
A Datafile instance should never be constructed directly; construct a subclass instead. The constructor call takes the same arguments for every subclass: db is a psycopg2 object; datafiledict is a dictionary containing values for filename and datafile_fields, and optionally for subset_identifiers, delimiter, decimal_separator and date_format; and logger, which has nothing to do with meteorological loggers, is a Logger object to which error, progress, and debugging information is written.
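As a rough illustration of the constructor arguments, the following sketch builds a datafiledict with the keys named above and obtains a standard-library Logger. The helper check_datafiledict, the placeholder SomeDatafileSubclass, the argument order in the commented-out call, and all the example values (including the format of datafile_fields and date_format) are assumptions made for illustration; they are not taken from the module itself.

```python
import logging

# Key names come from the text above; everything else here is illustrative.
REQUIRED_KEYS = {"filename", "datafile_fields"}
OPTIONAL_KEYS = {"subset_identifiers", "delimiter", "decimal_separator",
                 "date_format"}


def check_datafiledict(datafiledict):
    """Return (missing, unknown) sorted key lists for a candidate dictionary.

    Hypothetical helper, not part of the module.
    """
    keys = set(datafiledict)
    missing = sorted(REQUIRED_KEYS - keys)
    unknown = sorted(keys - REQUIRED_KEYS - OPTIONAL_KEYS)
    return missing, unknown


datafiledict = {
    "filename": "station1.txt",         # hypothetical file name
    "datafile_fields": "0,1234,1235",   # hypothetical value; format assumed
    "delimiter": ",",
    "decimal_separator": ".",
    "date_format": "%Y-%m-%d %H:%M",    # assumed to be strftime-style
}

# A Python logging.Logger, unrelated to the meteorological logger.
logger = logging.getLogger("datafile-example")

missing, unknown = check_datafiledict(datafiledict)

# db would be the psycopg2 object; SomeDatafileSubclass is a placeholder name
# and the argument order is an assumption:
# datafile = SomeDatafileSubclass(db, datafiledict, logger)
```

Checking the dictionary before constructing the subclass is only a convenience in this sketch; the real classes may validate their input differently.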
Datafile has the following attributes and methods:
The subset_identifiers attribute is used only on some Datafile subclasses. Some file formats mix two or more sets of measurements in the same file; for example, there may be ten-minute and hourly measurements in the same file, and for every 6 lines with ten-minute measurements there may be an additional line with hourly measurements (not necessarily the same variables). Such files have one or more additional fields in each line, which help to distinguish which set the line belongs to. We call these fields, which depend on the specific data file format, the subset identifiers.
Datafile (in fact its subclass) processes only one set of lines at a time, and subset_identifiers specifies which one. It is a comma-separated list of identifiers; lines with different subset identifiers are ignored.
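The filtering described above could look roughly like the sketch below. It assumes, purely for illustration, that the subset identifier fields are the leading fields of each line; where they actually sit depends on the specific file format. The function select_subset and the sample lines are hypothetical.

```python
def select_subset(lines, subset_identifiers, delimiter=","):
    """Keep only the lines whose leading fields equal the given identifiers.

    subset_identifiers is a comma-separated string, as in the text above.
    Assumption: the identifier fields come first in each line.
    """
    wanted = [s.strip() for s in subset_identifiers.split(",")]
    selected = []
    for line in lines:
        fields = [f.strip() for f in line.split(delimiter)]
        if fields[:len(wanted)] == wanted:
            selected.append(line)
    return selected


# Made-up sample: "10" marks ten-minute lines, "60" marks hourly lines.
lines = [
    "10,2024-01-01 00:10,5.1",
    "10,2024-01-01 00:20,5.3",
    "60,2024-01-01 01:00,5.2,1013.2",
]
ten_minute = select_subset(lines, "10")
hourly = select_subset(lines, "60")
```

To load both sets from such a file, one would process it twice, once per value of subset_identifiers.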
For each time series specified in datafile_fields, retrieve the end date of the time series from the database, scan the data file, determine the first record of the time series that is not already stored in the database, and append that record and all subsequent records to the database.
The changes are not committed; the caller must commit them.
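The core of this step, finding the first record that is newer than what the database already holds, can be sketched as follows. This is a simplified in-memory illustration, not the module's implementation: records_to_append is a hypothetical function, records are assumed to be (timestamp, value) pairs in ascending date order, and end_date is None when the time series is still empty in the database.

```python
from datetime import datetime


def records_to_append(records, end_date):
    """Return the records dated strictly after end_date.

    Assumes records is a list of (timestamp, value) pairs sorted by
    ascending timestamp. end_date is None if nothing is stored yet.
    """
    if end_date is None:
        return list(records)
    for i, (timestamp, _value) in enumerate(records):
        if timestamp > end_date:
            # First record not yet stored; it and all later ones are new.
            return list(records[i:])
    return []


# Made-up records as they might be parsed from a data file.
records = [
    (datetime(2024, 1, 1, 0, 10), 5.1),
    (datetime(2024, 1, 1, 0, 20), 5.3),
    (datetime(2024, 1, 1, 0, 30), 5.6),
]
new_records = records_to_append(records, datetime(2024, 1, 1, 0, 10))
```

After the new records have been appended, the caller is responsible for committing; if db is a psycopg2 connection, that would be a call to db.commit().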
Datafile subclasses need to define the following methods: