openeew.data package

Submodules

openeew.data.aws module

class openeew.data.aws.AwsDataClient(country_code, s3_client=None)

Bases: object

A client for downloading OpenEEW data stored as an AWS Public Dataset.

Initialize AwsDataClient with the following parameters:

Parameters:
  • country_code (str) – The ISO 3166 two-letter country code for which data is required. The value is case-insensitive because it is converted to lower case.
  • s3_client (boto3.client.s3) – The S3 client to use to access OpenEEW data on AWS. If no value is given, an anonymous S3 client will be used.
country_code
Returns: ISO 3166 two-letter country code of the client (in lower case). Any data returned by the client will be for this country.
Return type: str
get_current_devices()

Gets currently-valid device metadata. Fields are the same as for get_devices_full_history().

Returns: A list of device metadata.
Return type: list[dict]
get_devices_as_of_date(date_utc)

Gets device metadata as of a chosen UTC date. Fields are the same as for get_devices_full_history().

Parameters: date_utc (str) – The UTC date with format %Y-%m-%d %H:%M:%S. E.g. ‘2018-02-16 23:39:38’.
Returns: A list of device metadata.
Return type: list[dict]
get_devices_full_history()

Gets full history of device metadata.

Returns: A list of device metadata. See Device metadata for information about the fields.
Return type: list[dict]
get_filtered_records(start_date_utc, end_date_utc, device_ids=None)

Returns accelerometer records filtered by date and device.

Parameters:
  • start_date_utc (str) – The UTC start date with format %Y-%m-%d %H:%M:%S. E.g. ‘2018-02-16 23:39:38’. Only records with a _RECORD_T equal to or greater than start_date_utc will be returned.
  • end_date_utc (str) – The UTC end date with the same format as start_date_utc. Only records with a _RECORD_T equal to or less than end_date_utc will be returned.
  • device_ids (Union[str, list[str]]) – The device ID, or list of device IDs, for which records should be returned.
Returns:

A list of records, where each record is a dict obtained from the stored JSON value. See Data fields for details about the JSON records and how they are stored.

Return type:

list[dict]
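The date parameters of get_devices_as_of_date() and get_filtered_records() all use the %Y-%m-%d %H:%M:%S format in UTC. The following is a minimal sketch of producing such strings from timezone-aware datetimes; to_utc_string and DATE_FORMAT are hypothetical names introduced for illustration and are not part of openeew.

```python
from datetime import datetime, timedelta, timezone

# Format documented for date_utc, start_date_utc and end_date_utc.
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_utc_string(dt):
    """Hypothetical helper: convert an aware datetime to a UTC string."""
    if dt.tzinfo is None:
        # A naive datetime carries no offset, so refuse to guess one.
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).strftime(DATE_FORMAT)


# A datetime already expressed in UTC.
start_date_utc = to_utc_string(
    datetime(2018, 2, 16, 23, 39, 38, tzinfo=timezone.utc)
)

# A datetime with a UTC-06:00 offset is shifted to UTC before formatting.
utc_minus_6 = timezone(timedelta(hours=-6))
end_date_utc = to_utc_string(
    datetime(2018, 2, 16, 17, 42, 0, tzinfo=utc_minus_6)
)

print(start_date_utc)
print(end_date_utc)
```

Strings produced this way could then be passed as start_date_utc and end_date_utc to get_filtered_records(), or as date_utc to get_devices_as_of_date().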

class openeew.data.aws.DateTimeKeyBuilder(year, month=None, day=None, hour=None, minute=None)

Bases: object

A class for building the datetime part of keys that are organized by a hierarchy that can include year, month, day, hour and minute. The year part of the key must contain the century, i.e. YYYY, and other parts are zero-padded decimals, e.g. 01, 02 etc. An example of such a key template would be “year={}/month={}/day={}/hour={}/{}”.

Initialize DateTimeKeyBuilder with the following parameters, which are concatenated in order of increasing granularity to form the key template:

Parameters:
  • year (str) – Year part of key template, e.g. “year={}/”. Must contain {} for year replacement, where the year will be added as YYYY, e.g. 2020.
  • month (str) – Optional month part of key template, e.g. “month={}/”. Must contain {} for month replacement, where the month will be added as a zero-padded number, i.e. 01, 02, …, 12.
  • day (str) – Optional day part of key template, e.g. “day={}/”. Must contain {} for day replacement, where the day will be added as a zero-padded number, i.e. 01, 02, …, 31.
  • hour (str) – Optional hour part of key template, e.g. “hour={}/”. Must contain {} for hour replacement, where the hour will be added as a zero-padded number, i.e. 00, 01, …, 23.
  • minute (str) – Optional minute part of key template, e.g. “{}”. Must contain {} for minute replacement, where the minute will be added as a zero-padded number, i.e. 00, 01, …, 59.
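To illustrate how template parts of this kind combine into a key, here is a minimal standalone sketch. It does not use DateTimeKeyBuilder and is not its implementation; build_key is a hypothetical helper, and it assumes each part contains exactly one {} placeholder.

```python
from datetime import datetime


def build_key(template_parts, dt):
    """Hypothetical helper: fill concatenated template parts from dt.

    template_parts is ordered year, month, day, hour, minute and may
    stop early, e.g. contain only the year and month parts.
    """
    values = [
        f"{dt.year:04d}",    # YYYY, including the century
        f"{dt.month:02d}",   # 01..12
        f"{dt.day:02d}",     # 01..31
        f"{dt.hour:02d}",    # 00..23
        f"{dt.minute:02d}",  # 00..59
    ]
    template = "".join(template_parts)
    # Use only as many values as there are template parts.
    return template.format(*values[: len(template_parts)])


parts = ["year={}/", "month={}/", "day={}/", "hour={}/", "{}"]
full_key = build_key(parts, datetime(2020, 1, 2, 3, 4))
month_key = build_key(parts[:2], datetime(2020, 1, 2, 3, 4))
print(full_key)
print(month_key)
```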
get_key_prefixes_within_range(start_dt, end_dt)

Returns a list of key search prefixes for the given date range, where each prefix corresponds to one day.

Parameters:
  • start_dt (datetime.datetime) – The start of the datetime range.
  • end_dt (datetime.datetime) – The end of the datetime range.
Returns:

List of key search prefixes.

Return type:

list[str]
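The idea of one search prefix per day can be sketched as follows. This is an illustration under assumptions, not the library's implementation: daily_prefixes is a hypothetical function, and the default template simply reuses the year, month and day parts from the example key template above.

```python
from datetime import datetime, timedelta


def daily_prefixes(start_dt, end_dt, template="year={}/month={}/day={}/"):
    """Hypothetical helper: one key prefix per calendar day in the range."""
    if end_dt < start_dt:
        raise ValueError("end_dt must not be earlier than start_dt")
    prefixes = []
    day = start_dt.date()
    last_day = end_dt.date()
    while day <= last_day:
        prefixes.append(
            template.format(
                f"{day.year:04d}", f"{day.month:02d}", f"{day.day:02d}"
            )
        )
        day += timedelta(days=1)
    return prefixes


# A range that crosses a leap day and a month boundary.
prefixes = daily_prefixes(
    datetime(2020, 2, 28, 23, 0), datetime(2020, 3, 1, 1, 0)
)
for prefix in prefixes:
    print(prefix)
```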

get_max_key(dt)

Returns the maximum possible key value for the given datetime.

Parameters: dt (datetime.datetime) – The datetime for which to build the key.
Returns: Key part corresponding to the datetime.
Return type: str
get_min_key(dt)

Returns the minimum possible key value for the given datetime.

Parameters: dt (datetime.datetime) – The datetime for which to build the key.
Returns: Key part corresponding to the datetime.
Return type: str
template_parts
Returns: A list of strings containing the defined template parts, in the following order: year, month, day, hour and minute. The length of the list depends on which parts have been specified.
Return type: list[str]

openeew.data.df module

openeew.data.df.get_df_from_records(records, ref_t_name='cloud_t', ref_axis='x')

Returns a pandas DataFrame from a list of records.

Parameters:
  • records (list[dict]) – The list of records from which to create a pandas DataFrame.
  • ref_t_name (str) – The name of the time field to use as a reference when calculating sample times. This should be either cloud_t or device_t.
  • ref_axis (str) – The axis to use when determining the number of sample points in each record.
Returns:

A pandas DataFrame whose columns are the keys of each record, plus an additional sample_t column that gives an individual timestamp to each element of the x, y and z arrays.

Return type:

pandas.DataFrame
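The per-sample layout described above can be pictured without pandas as a list of row dicts, one per sample point. The sketch below is only an illustration of that shape, not how get_df_from_records() is implemented: record_to_rows is a hypothetical helper, the record values are made up, the device_id field name is an assumption, and the record is assumed to already carry a sample_t list.

```python
def record_to_rows(record, axes=("x", "y", "z")):
    """Hypothetical helper: expand one record into one row per sample."""
    per_sample_keys = set(axes) | {"sample_t"}
    # Fields that are not per-sample arrays are copied into every row.
    shared = {k: v for k, v in record.items() if k not in per_sample_keys}
    rows = []
    for i, sample_t in enumerate(record["sample_t"]):
        row = dict(shared)
        row["sample_t"] = sample_t
        for axis in axes:
            row[axis] = record[axis][i]
        rows.append(row)
    return rows


# Illustrative record with two sample points (values are made up).
record = {
    "device_id": "001",
    "cloud_t": 10.0,
    "sr": 2.0,
    "x": [0.1, 0.2],
    "y": [0.0, 0.0],
    "z": [9.8, 9.7],
    "sample_t": [9.5, 10.0],
}

rows = record_to_rows(record)
for row in rows:
    print(row)
```

A list of row dicts like this has the same shape as a table with one row per sample point, which is the layout the returned DataFrame is described as having.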

openeew.data.record module

openeew.data.record.add_sample_t(record, ref_t_name, ref_axis)

Adds to a record a list of sample times, one for each sample point in the record.

Parameters:
  • record (dict) – The record to which to add sample times.
  • ref_t_name (str) – The name of the time field to use as a reference when calculating sample times. This should be either cloud_t or device_t.
  • ref_axis (str) – The axis to use when determining the number of sample points in the record.
Returns:

A record with an additional sample_t field containing a list of sample times.

Return type:

dict

openeew.data.record.add_sample_t_to_records(records, ref_t_name, ref_axis)

Adds a sample_t field to each record in a list of records.

Parameters:
  • records (list[dict]) – The list of records to which to add sample times.
  • ref_t_name (str) – The name of the time field to use as a reference when calculating sample times. This should be either cloud_t or device_t.
  • ref_axis (str) – The axis to use when determining the number of sample points in each record.
Returns:

A list of records, each with an additional sample_t field containing a list of sample times.

Return type:

list[dict]

openeew.data.record.get_sample_t(ref_t, idx, num_samples, sr)

Calculates the Unix time for an individual sample point within a record. The reference time corresponds to the final sample point in the record, and the time of each earlier sample point is calculated by subtracting a multiple of 1/sr from it, where sr is the sample rate.

Parameters:
  • ref_t (float) – The Unix time to use as a basis for the calculation.
  • idx (int) – The zero-based index of the sample point within the record.
  • num_samples (int) – The number of sample points within the record.
  • sr (float) – The sample rate (per second) of the record.
Returns:

An estimated Unix time corresponding to the sample point.

Return type:

float
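One reading of this description is that sample point idx lies (num_samples - 1 - idx) sample intervals before the reference time. The sketch below follows that reading; sample_time is a hypothetical stand-in written for illustration and is not guaranteed to match the library's code line for line.

```python
def sample_time(ref_t, idx, num_samples, sr):
    """Hypothetical stand-in: estimate the Unix time of one sample point.

    ref_t is taken as the time of the final sample (index
    num_samples - 1); earlier samples are spaced 1/sr seconds apart.
    """
    steps_before_last = num_samples - 1 - idx
    return ref_t - steps_before_last / sr


# Four samples at 2 samples per second, with the last one at t = 100.0.
times = [sample_time(100.0, i, 4, 2.0) for i in range(4)]
print(times)  # [98.5, 99.0, 99.5, 100.0]
```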

Module contents

This module provides a means to work with OpenEEW data.