dataloader
canari_ml.data.dataloader
canari_ml.data.dataloader.ZarrDataset(root_path, zarr_name, train_split=True)
Bases: Dataset
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| root_path | str | Path to the directory containing 'train.zarr', 'val.zarr', 'test.zarr'. | required |
| zarr_name | str | Name of the Zarr file to load (e.g., 'train.zarr', 'val.zarr', 'test.zarr'). | required |
| train_split | bool | Whether to load the training split. Defaults to True. | True |
Source code in src/canari_ml/data/dataloader.py
canari_ml.data.dataloader.ZarrDataset.root_path = root_path
instance-attribute
canari_ml.data.dataloader.ZarrDataset.train_split = train_split
instance-attribute
canari_ml.data.dataloader.ZarrDataset.store = zarr.open(zarr_path)
instance-attribute
canari_ml.data.dataloader.ZarrDataset.x_array = self.store['x']
instance-attribute
canari_ml.data.dataloader.ZarrDataset.y_array = self.store['y']
instance-attribute
canari_ml.data.dataloader.ZarrDataset.sw_array = self.store.get('sample_weights', None)
instance-attribute
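The attributes above suggest a map-style dataset that reads `x`, `y`, and an optional `sample_weights` array from one store. The following is a minimal sketch of that pattern, not the canari_ml implementation. It makes several assumptions: a plain `dict` stands in for the group returned by `zarr.open(zarr_path)`, the store path is taken to be `root_path` joined with `zarr_name`, and the class name `ZarrLikeDataset` and its `__len__` and `__getitem__` bodies are hypothetical.

```python
import os


class ZarrLikeDataset:
    """Hypothetical stand-in for ZarrDataset; not the canari_ml implementation."""

    def __init__(self, root_path, zarr_name, store, train_split=True):
        self.root_path = root_path
        self.train_split = train_split
        # Assumption: the store path is root_path joined with zarr_name.
        self.zarr_path = os.path.join(root_path, zarr_name)
        # A dict stands in for the group that zarr.open(zarr_path) would return.
        self.store = store
        self.x_array = self.store["x"]
        self.y_array = self.store["y"]
        # Optional array: None when the store has no 'sample_weights' entry.
        self.sw_array = self.store.get("sample_weights", None)

    def __len__(self):
        # Assumption: one sample per entry along the first axis of 'x'.
        return len(self.x_array)

    def __getitem__(self, idx):
        x, y = self.x_array[idx], self.y_array[idx]
        if self.sw_array is None:
            return x, y
        return x, y, self.sw_array[idx]


# Toy in-memory store with two samples and no sample weights.
store = {"x": [[0.1, 0.2], [0.3, 0.4]], "y": [0, 1]}
dataset = ZarrLikeDataset("data", "train.zarr", store)
```

Because `sample_weights` is looked up with `.get(..., None)`, a store without that entry still loads. In this sketch, such a store yields `(x, y)` pairs instead of `(x, y, weight)` triples.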
canari_ml.data.dataloader.CANARIMLDataSetTorch(configuration_path, *args, batch_size=4, path=os.path.join('.', 'network_datasets'), shuffling=False, **kwargs)
Bases: IceNetDataSet
canari_ml.data.dataloader.CANARIMLDataSetTorch.hemi = 'south' if self._config['south'] else 'north' if self._config['north'] else None
instance-attribute
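The `hemi` expression is a chained conditional, so `'south'` is tested first and wins if both flags are set. The sketch below reproduces that expression in a standalone function to show its three outcomes. The function name `resolve_hemi` and the sample dictionaries are hypothetical. Only the `'south'` and `'north'` keys come from the expression above.

```python
def resolve_hemi(config):
    """Return the hemisphere name selected by the config flags, or None."""
    # Chained conditional: 'south' is checked first, then 'north'.
    return "south" if config["south"] else "north" if config["north"] else None


south_only = resolve_hemi({"south": True, "north": False})
north_only = resolve_hemi({"south": False, "north": True})
both_set = resolve_hemi({"south": True, "north": True})
neither = resolve_hemi({"south": False, "north": False})
```

Here `both_set` is `'south'` and `neither` is `None`, so code that consumes `hemi` may need to handle a missing hemisphere.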
canari_ml.data.dataloader.CANARIMLDataSetTorch.get_data_loaders(num_workers=4, ratio=None)
canari_ml.data.dataloader.CANARIMLDataSetTorch.get_data_loader(lead_time=None, generate_workers=None, base_path=os.path.join('.', 'network_datasets'), dummy=False)
Create an instance of the CANARIDataLoader class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lead_time | optional | The number of forecast steps to be used by the data loader. If not provided, defaults to the value specified in the configuration file. | None |
| generate_workers | optional | The number of workers to use for parallel processing with Dask. If not provided, defaults to the value specified in the configuration file. | None |
Returns:
| Type | Description |
|---|---|
| object | An instance of the SerialLoader class configured with the specified parameters. |
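Both documented parameters fall back to the configuration file when left as `None`. The sketch below shows one way such a fallback could be written. The helper name `resolve_loader_options` and the configuration keys and values are hypothetical. It does not show how `get_data_loader` actually reads its configuration.

```python
def resolve_loader_options(config, lead_time=None, generate_workers=None):
    """Fill unset arguments from the configuration (hypothetical helper)."""
    # Test with `is None` rather than `or`, so an explicit 0 is not replaced.
    if lead_time is None:
        lead_time = config["lead_time"]
    if generate_workers is None:
        generate_workers = config["generate_workers"]
    return lead_time, generate_workers


# Hypothetical configuration keys and values, for illustration only.
config = {"lead_time": 7, "generate_workers": 4}
defaults = resolve_loader_options(config)
overridden = resolve_loader_options(config, lead_time=3)
```

With this pattern, an explicit argument always takes precedence, and the configuration value applies only to arguments the caller omitted.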