era5
canari_ml.data.masks.era5
Module to mask out the northern/southern hemisphere
canari_ml.data.masks.era5.MaskDatasetConfig(downloaded_files=None, identifier='masks', variable_name=None, reference_era5_file=None, base_weight=1.0, region_weights=None, weight_smoothing_sigma=10.0, **kwargs)
Bases: DatasetConfig
Configuration class for generating ERA5 mask datasets.
Inherits from download_toolbox.interface.DatasetConfig and extends it to handle hemisphere-specific masks.
Attributes:
| Name | Type | Description |
|---|---|---|
| `variable_name` | | Name of the variable to process. Defaults to None. |
| `reference_era5_file` | | Path to reference ERA5 file for mask generation. Defaults to None. |
Notes:
Based on MaskDatasetConfig class from the IceNet library.
https://github.com/icenet-ai/icenet/blob/6caa234907904bfa76b8724d8c83cd989230494a/icenet/data/masks/osisaf.py
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `downloaded_files` | optional | List of downloaded files. Defaults to None. | `None` |
| `identifier` | optional | Identifier for this dataset configuration. Defaults to "masks". | `'masks'` |
| `variable_name` | `None \| str` | Name of the ERA5 variable to process. Must be specified. Defaults to None. | `None` |
| `reference_era5_file` | `None \| str` | Path to the reference ERA5 file. Must be specified. Defaults to None. | `None` |
| `**kwargs` | Unpack | Additional keyword arguments passed to super class. | `{}` |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If either variable_name or reference_era5_file is None. |
| `NotImplementedError` | If location is neither north nor south. |
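The validation described under Raises can be sketched as below. This is a hypothetical stand-in, not the canari_ml implementation: the class name `SketchMaskConfig`, the plain-string `location` argument, and the example values `"t2m"` and `"reference.nc"` are assumptions made for illustration only.

```python
class SketchMaskConfig:
    """Hypothetical stand-in that mirrors only the documented validation."""

    def __init__(self, variable_name=None, reference_era5_file=None,
                 identifier="masks", location="north"):
        # Both arguments default to None but must be supplied by the caller.
        if variable_name is None or reference_era5_file is None:
            raise ValueError(
                "variable_name and reference_era5_file must both be specified"
            )
        # Assumption: location is a plain string here; the real class is
        # likely to receive it through the base class keyword arguments.
        if location not in ("north", "south"):
            raise NotImplementedError(f"Unsupported location: {location!r}")
        self.variable_name = variable_name
        self.reference_era5_file = reference_era5_file
        self.identifier = identifier
        self.location = location


# Placeholder values, not taken from the canari_ml documentation.
cfg = SketchMaskConfig(variable_name="t2m", reference_era5_file="reference.nc")
```

Checking both required arguments up front means a missing value fails at construction time rather than later during mask generation.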
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.MaskDatasetConfig.variable_name = variable_name
instance-attribute
canari_ml.data.masks.era5.MaskDatasetConfig.reference_era5_file = reference_era5_file
instance-attribute
canari_ml.data.masks.era5.MaskDatasetConfig.config
property
Get the configuration object.
If not already created, initialises a Configuration object with the location name.
Returns:
| Type | Description |
|---|---|
| `dict` | The dataset configuration object. |
canari_ml.data.masks.era5.MaskDatasetConfig.save_data_for_config(rename_var_list=None, source_ds=None, source_files=None, time_dim_values=None, var_filter_list=None, **kwargs)
Save data for the current configuration.
Processes each variable configuration and generates corresponding files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `rename_var_list` | `dict` | Dictionary mapping old to new variable names. Defaults to None. | `None` |
| `source_ds` | `object` | Source dataset. Defaults to None. | `None` |
| `source_files` | `list` | List of source files. Defaults to None. | `None` |
| `time_dim_values` | `list` | Time dimension values. Defaults to None. | `None` |
| `var_filter_list` | `list` | List of variables to filter. Defaults to None. | `None` |
| `**kwargs` | Unpack | Additional keyword arguments. | `{}` |
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.MaskDatasetConfig.get_config(config_funcs=None, strip_keys=None)
Get the configuration object with specified keys removed.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config_funcs` | `dict` | Dictionary of configuration functions. Defaults to None. | `None` |
| `strip_keys` | `list` | List of keys to remove from the configuration. Defaults to None. | `None` |
Returns:
| Type | Description |
|---|---|
| `dict` | The modified configuration object. |
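As a rough illustration of what removing keys from a configuration means, the sketch below filters a plain dictionary. It is not the library's code: `strip_config_keys` and the dictionary keys are hypothetical, and the handling of `config_funcs` is left out because its behaviour is not described here.

```python
def strip_config_keys(config, strip_keys=None):
    """Return a copy of config without the keys listed in strip_keys."""
    # Treat None as "remove nothing", matching the documented default.
    strip_keys = [] if strip_keys is None else strip_keys
    return {key: value for key, value in config.items() if key not in strip_keys}


# Hypothetical configuration contents, for illustration only.
example = {"implementation": "module:Class", "path": "/tmp/masks", "files": []}
trimmed = strip_config_keys(example, strip_keys=["path"])
```

Returning a new dictionary leaves the original configuration untouched, so the same object can be serialised several times with different keys removed.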
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.Masks(dataset_config, *args, absolute_vars=None, identifier=None, base_weight=1.0, region_weights=None, weight_smoothing_sigma=10.0, mask_dataset_config_path=None, mask_config_path=None, **kwargs)
Bases: Processor
A Processor class for generating and applying hemisphere-specific masks.
Inherits from preprocess_toolbox.processor.Processor to handle mask
generation and data processing, particularly for ERA5 datasets.
This class manages the creation of northern or southern hemisphere
masks based on configuration settings.
Attributes:
| Name | Type | Description |
|---|---|---|
| `_dataset_config` | `DatasetConfig` | Configuration object containing dataset parameters, including location, variables, and file paths. |
| `abs_vars` | `list` | List of variables treated as absolute in processing. |
| `_hemi_str` | `str` | 'north' or 'south', indicating which hemisphere is being processed. |
| `_region` | `tuple` | Slice/slices defining the region to apply masking. |
Notes
Based on Masks class from the IceNet library.
https://github.com/icenet-ai/icenet/blob/6caa234907904bfa76b8724d8c83cd989230494a/icenet/data/masks/osisaf.py
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `dataset_config` | `DatasetConfig` | Configuration object for the dataset. | required |
| `*args` | Unpack | Additional positional arguments passed to super class. | `()` |
| `absolute_vars` | optional | Variables treated as absolute. Defaults to None. | `None` |
| `identifier` | optional | Identifier for processing. Defaults to None. | `None` |
| `base_weight` | optional | Base weight for regions. Defaults to 1.0. | `1.0` |
| `region_weights` | optional | Weights for different regions. Defaults to None. | `None` |
| `weight_smoothing_sigma` | optional | Sigma value for smoothing weights. Defaults to 10.0. | `10.0` |
| `mask_dataset_config_path` | optional | Path for dataset config file. Defaults to None. | `None` |
| `**kwargs` | Unpack | Additional keyword arguments passed to super class. | `{}` |
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.Masks.region
property
writable
Get the current mask region.
Returns:
| Type | Description |
|---|---|
| `tuple` | The current region slices used for masking. |
canari_ml.data.masks.era5.Masks.hemisphere_filename
property
¶
Get the filename for the hemisphere mask.
Returns:
| Type | Description |
|---|---|
| `str` | Path to the hemisphere mask file. |
canari_ml.data.masks.era5.Masks.weighted_regions_filename
property
¶
Get the filename for the weighted regions.
Returns:
| Type | Description |
|---|---|
| `str` | Path to the weighted regions file. |
canari_ml.data.masks.era5.Masks.get_config(config_funcs=None, strip_keys=None)
Retrieve the configuration dictionary for the processor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config_funcs` | optional | Dictionary of functions to modify config. Defaults to None. | `None` |
| `strip_keys` | optional | Keys to remove from the config. Defaults to None. | `None` |
Returns:
| Name | Type | Description |
|---|---|---|
| `dict` | `dict` | Configuration dictionary containing module and class implementation, absolute variables, dataset configuration, path, processed files, and source files. |
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.Masks.process()
Generate and save the hemisphere mask based on the configured region.
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.Masks.hemisphere(*args, **kwargs)
Return the hemisphere mask as an xr.DataArray.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*args` | Unpack | | `()` |
| `**kwargs` | Unpack | | `{}` |
Returns:
| Type | Description |
|---|---|
| `DataArray` | The hemisphere mask loaded from the specified file. |
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.Masks.weighted_regions(*args, **kwargs)
Return the weighted regions as an xr.DataArray.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*args` | Unpack | | `()` |
| `**kwargs` | Unpack | | `{}` |
Returns:
| Type | Description |
|---|---|
| `DataArray` | The weighted regions loaded from the specified file. |
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.Masks.get_blank_mask()
Returns an empty boolean mask for the configured region.
Returns:
| Type | Description |
|---|---|
| `array` | A boolean array of shape matching the hemisphere mask, initialised to |
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.Masks.reset_region()
Resets the mask region to cover the entire dataset.
canari_ml.data.masks.era5.RegionWeightAction
Bases: Action
Custom argparse action for handling region weights with their respective boundaries.
This action expects 5 values: lat_min, lat_max, lon_min, lon_max, and
weight.
It supports passing these values as a comma-separated or space-separated string.
If the number of provided values is not 5, an error is raised.
All values must be numeric.
This action accumulates region weights in the namespace object under the
region_weights attribute, allowing multiple regions to be specified by calling
the flag repeatedly.
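The behaviour described above can be approximated with a small custom `argparse.Action`. This is a sketch under stated assumptions, not the canari_ml source: the class name `SketchRegionWeightAction`, the flag name `--region-weight`, and the choice to store each region as a tuple of floats are all hypothetical.

```python
import argparse


class SketchRegionWeightAction(argparse.Action):
    """Hypothetical re-creation of the documented behaviour, not the library class."""

    def __init__(self, option_strings, dest, **kwargs):
        # Accept one comma-separated token or several space-separated tokens.
        kwargs.setdefault("nargs", "+")
        super().__init__(option_strings, dest, **kwargs)

    def __call__(self, parser, namespace, values, option_string=None):
        # Normalise "a,b,c,d,e" and "a b c d e" into one flat token list.
        tokens = []
        for value in values:
            tokens.extend(value.replace(",", " ").split())
        if len(tokens) != 5:
            parser.error(f"{option_string} expects 5 values, got {len(tokens)}")
        try:
            lat_min, lat_max, lon_min, lon_max, weight = (float(t) for t in tokens)
        except ValueError:
            parser.error(f"{option_string} values must all be numeric")
        # Accumulate across repeated flags instead of overwriting.
        regions = list(getattr(namespace, self.dest, None) or [])
        regions.append((lat_min, lat_max, lon_min, lon_max, weight))
        setattr(namespace, self.dest, regions)


parser = argparse.ArgumentParser()
# Assumption: the flag name is illustrative; only the dest matches the text above.
parser.add_argument("--region-weight", dest="region_weights",
                    action=SketchRegionWeightAction, default=None)
args = parser.parse_args([
    "--region-weight", "60,70,-10,20,2.0",
    "--region-weight", "50", "60", "0", "30", "1.5",
])
```

After parsing, `args.region_weights` holds one entry per use of the flag, in the order given on the command line.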
canari_ml.data.masks.era5.WeightsArgParser(*args, **kwargs)
Bases: MetaArgParser
Argument parser for handling region weights.
This class extends `MetaArgParser` and adds arguments related to managing
region weights. It supports specifying a base weight, individual region weights,
and smoothing of the weights using a Gaussian kernel with a given sigma.
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.WeightsArgParser.add_region_weights()
Source code in src/canari_ml/data/masks/era5.py
canari_ml.data.masks.era5.get_channel_info_from_processor(cfg_segment)
Retrieves channel-specific information from a processor based on the given configuration segment.
This function uses `WeightsArgParser` to parse arguments related to channels and region weights.
It then retrieves the appropriate implementation for the processor, dataset configuration, and initializes
the processor with the parsed arguments. The processor is used to process data and obtain channel-specific
information, which is then stored in a configuration file under the specified segment.
Note
Based on code from preprocess-toolbox:
https://github.com/environmental-forecasting/preprocess-toolbox/blob/35f57eecd8017fae0bf1c7a4a4ca80ca77e905d4/preprocess_toolbox/loader/cli.py#L131-L151
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `cfg_segment` | `str` | The configuration segment under which to store the channel-specific information. | required |
Raises:
| Type | Description |
|---|---|
| `RuntimeError` | If the |