zyra.utils package
Utilities used across Zyra (dates, files, images, credentials).
Avoid importing optional heavy dependencies at package import time to keep the CLI lightweight when only a subset of functionality is needed (e.g., pipeline runner). Submodules can still be imported directly when required.
- class zyra.utils.CredentialManager(filename: str | None = None, namespace: str | None = None)
Bases:
object
Manage app credentials from a dotenv file.
- Parameters:
filename (str, optional) – Path to a dotenv file containing key=value pairs.
namespace (str, optional) – Optional prefix to apply to all keys when stored/retrieved.
Examples
Namespaced keys:
cm = CredentialManager(".env", namespace="MYAPP_")
cm.read_credentials(expected_keys=["API_KEY"])  # expects MYAPP_API_KEY
- add_credential(key: str, value: str) -> None
Add or update a credential value in memory.
- Parameters:
key (str) – Base key name (namespace is applied automatically).
value (str) – Credential value to store.
- delete_credential(key: str) -> None
Delete a credential by key if present.
- Parameters:
key (str) – Base key name (namespace is applied automatically).
- get_credential(key: str) -> str
Retrieve a credential value by key (with namespace applied).
- Parameters:
key (str) – Base key name (namespace is applied automatically).
- Returns:
Stored value for the namespaced key.
- Return type:
str
- Raises:
KeyError – If the key is not present in memory.
- list_credentials(expected_keys: Iterable[str] | None = None) -> list[str]
List tracked credential keys, checking for expected ones when provided.
- Parameters:
expected_keys (Iterable[str], optional) – Keys to verify are present.
- Returns:
Keys currently stored in memory.
- Return type:
list of str
- Raises:
KeyError – If any expected keys are missing.
- read_credentials(expected_keys: Iterable[str] | None = None) -> None
Read credentials from the dotenv file into memory.
- Parameters:
expected_keys (Iterable[str], optional) – Keys that must be present; raises if any are missing.
- Raises:
FileNotFoundError – If the dotenv path cannot be resolved.
KeyError – If any expected keys are missing after reading.
- property tracked_keys: set[str]
Return the set of keys currently tracked in memory.
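As a rough illustration of the namespacing described above, the following sketch keeps values in a private dictionary and applies the prefix on lookup. `TinyCredentialStore` and its methods are hypothetical stand-ins written for this example; they are not part of Zyra, and the real `CredentialManager` may parse dotenv files differently.

```python
import tempfile
from pathlib import Path


class TinyCredentialStore:
    """Hypothetical stand-in illustrating namespaced, in-memory credentials."""

    def __init__(self, filename: str, namespace: str = "") -> None:
        self.filename = filename
        self.namespace = namespace
        self._values = {}

    def read(self) -> None:
        # Parse key=value lines into a private dict; os.environ is never touched.
        for raw in Path(self.filename).read_text().splitlines():
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            self._values[key.strip()] = value.strip().strip('"')

    def get(self, key: str) -> str:
        # The namespace prefix is applied to the base key on lookup.
        return self._values[self.namespace + key]


with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text("MYAPP_API_KEY=secret-token\n")
    store = TinyCredentialStore(str(env_file), namespace="MYAPP_")
    store.read()
    api_key = store.get("API_KEY")  # looks up MYAPP_API_KEY
```

A missing key raises `KeyError` here, which mirrors the documented behavior of `get_credential`.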
- class zyra.utils.DateManager(date_formats: list[str] | None = None)
Bases:
object
High-level utilities for working with dates and filenames.
- Parameters:
date_formats (list of str, optional) – Preferred strftime-style formats to use when parsing dates from filenames (e.g., ["%Y%m%d"]).
Examples
Use a custom filename format first, then fall back to ISO-like detection:
dm = DateManager(["%Y%m%d%H%M%S"])
when = dm.extract_date_time("frame_20240101093000.png")
- __init__(date_formats: list[str] | None = None) -> None
Optionally store preferred date formats for filename parsing.
- calculate_expected_frames(start_datetime: datetime, end_datetime: datetime, period_seconds: int) -> int
Calculate expected frame count between two datetimes at a cadence.
- Returns:
Number of expected frames (inclusive of endpoints).
- Return type:
int
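The inclusive count can be sketched as follows. This is a minimal illustration, assuming the count is the number of whole periods that fit in the interval plus one for the starting frame; `expected_frames` is a hypothetical name, and the rounding used by the real method is not stated in this reference.

```python
from datetime import datetime


def expected_frames(start: datetime, end: datetime, period_seconds: int) -> int:
    """Count frames at a fixed cadence, counting both endpoints."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    if end < start:
        return 0
    elapsed = (end - start).total_seconds()
    # Whole periods that fit in the interval, plus the frame at the start.
    return int(elapsed // period_seconds) + 1


start = datetime(2024, 1, 1, 0, 0)
end = datetime(2024, 1, 1, 1, 0)
frames = expected_frames(start, end, 600)  # 7 frames: 00:00, 00:10, ..., 01:00
```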
- datetime_format_to_regex(datetime_format: str) -> str
Convert a datetime format string to a regex pattern.
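One way such a conversion might work is to escape the format string and swap each strftime directive for a digit pattern. The sketch below covers only six common directives; `format_to_regex` and the mapping are assumptions made for illustration, not the method's actual implementation.

```python
import re

# Hypothetical mapping from common strftime directives to regex fragments.
_DIRECTIVE_PATTERNS = {
    "%Y": r"\d{4}",
    "%m": r"\d{2}",
    "%d": r"\d{2}",
    "%H": r"\d{2}",
    "%M": r"\d{2}",
    "%S": r"\d{2}",
}


def format_to_regex(datetime_format: str) -> str:
    # Escape literal characters first so separators such as "." stay literal.
    pattern = re.escape(datetime_format)
    for directive, fragment in _DIRECTIVE_PATTERNS.items():
        pattern = pattern.replace(re.escape(directive), fragment)
    return pattern


regex = format_to_regex("%Y%m%d%H%M%S")
match = re.search(regex, "frame_20240101093000.png")
stamp = match.group(0) if match else None
```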
- extract_date_time(string: str) -> str | None
Extract a date string from a filename/text using known formats.
Tries known formats first; falls back to a simple ISO-like pattern.
- extract_dates_from_filenames(directory_path: str, image_extensions: Iterable[str] = ('.jpg', '.jpeg', '.png', '.gif', '.bmp', '.dds')) -> tuple[str | None, str | None]
Extract dates from the first and last image file names in a directory.
- Parameters:
directory_path (str) – Directory to scan for images.
image_extensions (Iterable[str]) – File extensions to include when scanning.
- Returns:
(first_date, last_date) as strings, or (None, None).
- Return type:
tuple
- find_missing_frames(directory, period_seconds, datetime_format, filename_format, filename_mask, start_datetime, end_datetime)
Find missing frames in a local directory with inconsistent period, only for image files.
- find_missing_frames_and_predict_names(timestamps, period_seconds, filename_pattern)
Find gaps and overfrequent frames in timestamps and predict names.
- find_start_end_datetimes(directory: str)
Find earliest and latest datetimes from filenames in a directory.
- get_date_range(period: str) -> tuple[datetime, datetime]
Compute a date range ending at the current minute from a period spec.
- Parameters:
period (str) – Period string such as "1Y", "6M", "7D", or "24H".
- Returns:
Start and end datetimes for the period ending at "now" (rounded to minute).
- Return type:
(datetime, datetime)
- get_date_range_iso(iso_duration: str) -> tuple[datetime, datetime]
Compute a date range ending now from an ISO-8601 duration (e.g., P1Y, P6M, P7D, PT24H).
Supports a subset of ISO-8601: years (Y), months (M), days (D), hours (H) with the "P...T..." structure. Examples: "P1Y", "P6M", "P7D", "PT24H".
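The supported subset can be illustrated with a small parser. This sketch takes `now` as an explicit argument so the result is reproducible, and it handles months and years with calendar arithmetic that clamps the day to the target month's length. Both choices, along with the names `date_range_from_iso` and `subtract_months`, are assumptions made for the example rather than documented behavior.

```python
import calendar
import re
from datetime import datetime, timedelta

_ISO_DURATION = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?)?$"
)


def subtract_months(moment: datetime, months: int) -> datetime:
    """Step back whole calendar months, clamping the day to the month length."""
    index = moment.year * 12 + (moment.month - 1) - months
    year, month = divmod(index, 12)
    month += 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


def date_range_from_iso(iso_duration: str, now: datetime):
    match = _ISO_DURATION.match(iso_duration)
    if not match or iso_duration in ("P", "PT"):
        raise ValueError(f"Unsupported ISO-8601 duration: {iso_duration!r}")
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    start = subtract_months(now, parts["years"] * 12 + parts["months"])
    start -= timedelta(days=parts["days"], hours=parts["hours"])
    return start, now


now = datetime(2024, 3, 31, 12, 0)
month_start, month_end = date_range_from_iso("P1M", now)  # start clamps to Feb 29
day_start, _ = date_range_from_iso("PT24H", now)
```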
- is_date_in_range(filepath: str, start_date: datetime, end_date: datetime) -> bool
Check if a filename contains a date within a range.
- Parameters:
filepath (str) – Path or filename containing a date stamp.
start_date (datetime) – Inclusive start of the permitted range.
end_date (datetime) – Inclusive end of the permitted range.
- Returns:
True if a parsed date falls within the range, else False.
- Return type:
bool
- class zyra.utils.FileUtils
Bases:
object
Namespace for file-related helper routines.
Examples
While most functions are provided at module-level, a class instance can be created if you prefer an object to group related operations.
- class zyra.utils.JSONFileManager(file_path: str)
Bases:
object
Convenience wrapper for manipulating JSON files.
- Parameters:
file_path (str) – Path to a JSON file on disk.
Examples
Read, update, and save:
jm = JSONFileManager("./data.json")
jm.data["foo"] = "bar"
jm.save_file()
- read_file() -> None
Read the JSON file from disk into memory.
- Returns:
Populates self.data or sets it to None on error.
- Return type:
None
- save_file(new_file_path: str | None = None) -> None
Write the in-memory data back to disk.
- Parameters:
new_file_path (str, optional) – If provided, save to this path instead of overwriting the original.
- Returns:
This method returns nothing.
- Return type:
None
- update_dataset_times(target_id: str, directory: str) -> str
Update start/end times for a dataset using directory image dates.
- Parameters:
target_id (str) – Dataset identifier to match in the JSON payload.
directory (str) – Directory to scan for frame timestamps.
- Returns:
Status message describing the outcome of the update.
- Return type:
str
- zyra.utils.remove_all_files_in_directory(directory: str) -> None
Remove all files and subdirectories under a directory.
- Parameters:
directory (str) – Directory to clean.
- Returns:
This function returns nothing.
- Return type:
None
Notes
Errors are reported via logging.error for consistency with the rest of the codebase.
Modules
Credential storage and retrieval utilities.
This module provides CredentialManager, a small helper class to securely load, access, and manage credentials from a .env-style file without leaking them into the global process environment.
Examples
Load and access values:
from zyra.utils.credential_manager import CredentialManager
cm = CredentialManager("./.env")
cm.read_credentials(expected_keys=["API_KEY"])
token = cm.get_credential("API_KEY")
Use as a context manager:
with CredentialManager("./.env") as cm:
    cm.read_credentials()
    do_work(cm.get_credential("ACCESS_TOKEN"))
- class zyra.utils.credential_manager.CredentialManager(filename: str | None = None, namespace: str | None = None)
Bases:
object
Manage app credentials from a dotenv file.
- Parameters:
filename (str, optional) – Path to a dotenv file containing key=value pairs.
namespace (str, optional) – Optional prefix to apply to all keys when stored/retrieved.
Examples
Namespaced keys:
cm = CredentialManager(".env", namespace="MYAPP_")
cm.read_credentials(expected_keys=["API_KEY"])  # expects MYAPP_API_KEY
- add_credential(key: str, value: str) -> None
Add or update a credential value in memory.
- Parameters:
key (str) – Base key name (namespace is applied automatically).
value (str) – Credential value to store.
- delete_credential(key: str) -> None
Delete a credential by key if present.
- Parameters:
key (str) – Base key name (namespace is applied automatically).
- get_credential(key: str) -> str
Retrieve a credential value by key (with namespace applied).
- Parameters:
key (str) – Base key name (namespace is applied automatically).
- Returns:
Stored value for the namespaced key.
- Return type:
str
- Raises:
KeyError – If the key is not present in memory.
- list_credentials(expected_keys: Iterable[str] | None = None) -> list[str]
List tracked credential keys, checking for expected ones when provided.
- Parameters:
expected_keys (Iterable[str], optional) – Keys to verify are present.
- Returns:
Keys currently stored in memory.
- Return type:
list of str
- Raises:
KeyError – If any expected keys are missing.
- read_credentials(expected_keys: Iterable[str] | None = None) -> None
Read credentials from the dotenv file into memory.
- Parameters:
expected_keys (Iterable[str], optional) – Keys that must be present; raises if any are missing.
- Raises:
FileNotFoundError – If the dotenv path cannot be resolved.
KeyError – If any expected keys are missing after reading.
- property tracked_keys: set[str]
Return the set of keys currently tracked in memory.
Date/time utilities for parsing, ranges, and frame calculations.
Provides DateManager for extracting timestamps from filenames, building date ranges from period specs (e.g., 1Y, 6M, 7D, 24H), and validating or interpolating time-based frame sequences.
Examples
Parse dates and compute a range:
from zyra.utils.date_manager import DateManager
dm = DateManager(["%Y%m%d"])
start, end = dm.get_date_range("7D")
ok = dm.is_date_in_range("frame_20240102.png", start, end)
- class zyra.utils.date_manager.DateManager(date_formats: list[str] | None = None)
Bases:
object
High-level utilities for working with dates and filenames.
- Parameters:
date_formats (list of str, optional) – Preferred strftime-style formats to use when parsing dates from filenames (e.g., ["%Y%m%d"]).
Examples
Use a custom filename format first, then fall back to ISO-like detection:
dm = DateManager(["%Y%m%d%H%M%S"])
when = dm.extract_date_time("frame_20240101093000.png")
- __init__(date_formats: list[str] | None = None) -> None
Optionally store preferred date formats for filename parsing.
- calculate_expected_frames(start_datetime: datetime, end_datetime: datetime, period_seconds: int) -> int
Calculate expected frame count between two datetimes at a cadence.
- Returns:
Number of expected frames (inclusive of endpoints).
- Return type:
int
- datetime_format_to_regex(datetime_format: str) -> str
Convert a datetime format string to a regex pattern.
- extract_date_time(string: str) -> str | None
Extract a date string from a filename/text using known formats.
Tries known formats first; falls back to a simple ISO-like pattern.
- extract_dates_from_filenames(directory_path: str, image_extensions: Iterable[str] = ('.jpg', '.jpeg', '.png', '.gif', '.bmp', '.dds')) -> tuple[str | None, str | None]
Extract dates from the first and last image file names in a directory.
- Parameters:
directory_path (str) – Directory to scan for images.
image_extensions (Iterable[str]) – File extensions to include when scanning.
- Returns:
(first_date, last_date) as strings, or (None, None).
- Return type:
tuple
- find_missing_frames(directory, period_seconds, datetime_format, filename_format, filename_mask, start_datetime, end_datetime)
Find missing frames in a local directory with inconsistent period, only for image files.
- find_missing_frames_and_predict_names(timestamps, period_seconds, filename_pattern)
Find gaps and overfrequent frames in timestamps and predict names.
- find_start_end_datetimes(directory: str)
Find earliest and latest datetimes from filenames in a directory.
- get_date_range(period: str) -> tuple[datetime, datetime]
Compute a date range ending at the current minute from a period spec.
- Parameters:
period (str) – Period string such as "1Y", "6M", "7D", or "24H".
- Returns:
Start and end datetimes for the period ending at "now" (rounded to minute).
- Return type:
(datetime, datetime)
- get_date_range_iso(iso_duration: str) -> tuple[datetime, datetime]
Compute a date range ending now from an ISO-8601 duration (e.g., P1Y, P6M, P7D, PT24H).
Supports a subset of ISO-8601: years (Y), months (M), days (D), hours (H) with the "P...T..." structure. Examples: "P1Y", "P6M", "P7D", "PT24H".
- is_date_in_range(filepath: str, start_date: datetime, end_date: datetime) -> bool
Check if a filename contains a date within a range.
- Parameters:
filepath (str) – Path or filename containing a date stamp.
start_date (datetime) – Inclusive start of the permitted range.
end_date (datetime) – Inclusive end of the permitted range.
- Returns:
True if a parsed date falls within the range, else False.
- Return type:
bool
Filesystem helper utilities.
Lightweight helpers for common file and directory operations used across the project.
Examples
Clean a scratch directory:
from zyra.utils.file_utils import remove_all_files_in_directory
remove_all_files_in_directory("./scratch")
- class zyra.utils.file_utils.FileUtils
Bases:
object
Namespace for file-related helper routines.
Examples
While most functions are provided at module-level, a class instance can be created if you prefer an object to group related operations.
- zyra.utils.file_utils.remove_all_files_in_directory(directory: str) -> None
Remove all files and subdirectories under a directory.
- Parameters:
directory (str) – Directory to clean.
- Returns:
This function returns nothing.
- Return type:
None
Notes
Errors are reported via logging.error for consistency with the rest of the codebase.
Read, update, and write JSON files used as simple configs/datasets.
Provides JSONFileManager to persist simple JSON structures and to update dataset start/end times using dates inferred from a directory of frames.
Examples
Update a dataset time window:
from zyra.utils.json_file_manager import JSONFileManager
jm = JSONFileManager("./config.json")
jm.update_dataset_times("my-dataset", "./frames")
jm.save_file()
- class zyra.utils.json_file_manager.JSONFileManager(file_path: str)
Bases:
object
Convenience wrapper for manipulating JSON files.
- Parameters:
file_path (str) – Path to a JSON file on disk.
Examples
Read, update, and save:
jm = JSONFileManager("./data.json")
jm.data["foo"] = "bar"
jm.save_file()
- read_file() -> None
Read the JSON file from disk into memory.
- Returns:
Populates self.data or sets it to None on error.
- Return type:
None
- save_file(new_file_path: str | None = None) -> None
Write the in-memory data back to disk.
- Parameters:
new_file_path (str, optional) – If provided, save to this path instead of overwriting the original.
- Returns:
This method returns nothing.
- Return type:
None
- update_dataset_times(target_id: str, directory: str) -> str
Update start/end times for a dataset using directory image dates.
- Parameters:
target_id (str) – Dataset identifier to match in the JSON payload.
directory (str) – Directory to scan for frame timestamps.
- Returns:
Status message describing the outcome of the update.
- Return type:
str
- zyra.utils.cli_helpers.configure_logging_from_env(default: str = 'info') -> None
Set logging levels based on VERBOSITY env (supports ZYRA_*/DATAVIZHUB_*).
Values: debug|info|quiet. Defaults to "info".
- debug: root=DEBUG
- info: root=INFO
- quiet: root=ERROR (suppress most logs)
Also dials down noisy third-party loggers (matplotlib, cartopy, botocore, requests).
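The level mapping above could be sketched as below. The variable names `ZYRA_VERBOSITY` and `DATAVIZHUB_VERBOSITY` are inferred from the "ZYRA_*/DATAVIZHUB_*" wording and are assumptions, as are the precedence between them and the WARNING level applied to third-party loggers. `level_from_env` and `configure_logging` are hypothetical names.

```python
import logging
import os

_LEVELS = {"debug": logging.DEBUG, "info": logging.INFO, "quiet": logging.ERROR}


def level_from_env(default: str = "info") -> int:
    # Assumed variable names; the reference only says ZYRA_*/DATAVIZHUB_* are supported.
    raw = (
        os.environ.get("ZYRA_VERBOSITY")
        or os.environ.get("DATAVIZHUB_VERBOSITY")
        or default
    )
    # Unrecognized values fall back to INFO in this sketch.
    return _LEVELS.get(raw.strip().lower(), logging.INFO)


def configure_logging(default: str = "info") -> None:
    logging.getLogger().setLevel(level_from_env(default))
    # Quieten chatty third-party loggers (WARNING is an assumed level).
    for name in ("matplotlib", "cartopy", "botocore", "requests"):
        logging.getLogger(name).setLevel(logging.WARNING)
```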
- zyra.utils.cli_helpers.detect_format_bytes(b: bytes) -> str
Detect basic format from magic bytes.
Returns one of: "netcdf", "grib2", or "unknown".
- zyra.utils.cli_helpers.is_grib2_bytes(b: bytes) -> bool
Return True if bytes look like GRIB (b"GRIB").
- zyra.utils.cli_helpers.is_netcdf_bytes(b: bytes) -> bool
Return True if bytes look like NetCDF (classic CDF or HDF5-based).
Recognizes magic headers:
- Classic NetCDF: b"CDF"
- NetCDF4/HDF5: b"\x89HDF"
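Magic-byte detection of this kind can be sketched in a few lines. The sketch assumes the signatures are checked at offset 0 of the buffer; the function names are hypothetical, and the real helpers may inspect the bytes differently.

```python
def looks_like_netcdf(data: bytes) -> bool:
    # Classic NetCDF starts with b"CDF"; NetCDF4 files carry the HDF5 signature.
    return data.startswith(b"CDF") or data.startswith(b"\x89HDF")


def looks_like_grib(data: bytes) -> bool:
    return data.startswith(b"GRIB")


def detect_format(data: bytes) -> str:
    if looks_like_netcdf(data):
        return "netcdf"
    if looks_like_grib(data):
        return "grib2"
    return "unknown"


kinds = [
    detect_format(b"CDF\x01rest-of-file"),
    detect_format(b"\x89HDF\r\n\x1a\nrest-of-file"),
    detect_format(b"GRIB\x00\x00\x00\x02"),
    detect_format(b"plain text"),
]
```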
- zyra.utils.cli_helpers.parse_levels_arg(val) -> int | list[float]
Parse levels from int or comma-separated floats.
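A plausible reading of this behavior is that a bare integer gives a level count and a comma-separated string gives explicit level values. The sketch below follows that reading; `parse_levels` is a hypothetical name and the handling of edge cases such as a single float is an assumption.

```python
from typing import List, Union


def parse_levels(val) -> Union[int, List[float]]:
    if isinstance(val, int):
        return val
    text = str(val).strip()
    try:
        # A bare integer is treated as a number of levels.
        return int(text)
    except ValueError:
        # Anything else is treated as comma-separated explicit level values.
        return [float(part) for part in text.split(",") if part.strip()]


count = parse_levels("10")
explicit = parse_levels("0,0.5,1")
```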
- zyra.utils.cli_helpers.read_all_bytes(path_or_dash: str) -> bytes
Read all bytes from a path or "-" (stdin).
- zyra.utils.cli_helpers.temp_file_from_bytes(data: bytes, *, suffix: str = '') -> Iterator[str]
Write bytes to a NamedTemporaryFile and yield its path; delete on exit.
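The `Iterator[str]` return type suggests a generator-based context manager. A minimal sketch of that pattern follows, assuming the file is closed before its path is yielded so other readers can open it; `temp_path_for` is a hypothetical name, not the helper itself.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temp_path_for(data: bytes, *, suffix: str = "") -> Iterator[str]:
    # delete=False lets the file survive close() so it can be reopened by path.
    handle = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        with handle:
            handle.write(data)
        yield handle.name
    finally:
        # Remove the file when the with-block exits, even on error.
        if os.path.exists(handle.name):
            os.remove(handle.name)


with temp_path_for(b"GRIB-demo", suffix=".grib2") as path:
    with open(path, "rb") as fh:
        round_trip = fh.read()
```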
- zyra.utils.io_utils.open_input(path_or_dash: str) -> BinaryIO
Return a readable binary file-like for path or "-" (stdin).
- zyra.utils.io_utils.open_output(path_or_dash: str) -> BinaryIO
Return a writable binary file-like for path or "-" (stdout).
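The "-" convention can be sketched with the binary buffers behind the standard streams. The function names below are hypothetical, and whether the real helpers close or wrap the standard streams is not stated in this reference.

```python
import sys
from typing import BinaryIO


def open_binary_input(path_or_dash: str) -> BinaryIO:
    # "-" selects the raw byte stream behind stdin.
    return sys.stdin.buffer if path_or_dash == "-" else open(path_or_dash, "rb")


def open_binary_output(path_or_dash: str) -> BinaryIO:
    # "-" selects the raw byte stream behind stdout.
    return sys.stdout.buffer if path_or_dash == "-" else open(path_or_dash, "wb")
```

With this shape a command can read from a pipe and write to a file, or the reverse, without branching at each call site.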
- zyra.utils.geo_utils.detect_crs_from_path(path: str, *, var: str | None = None) -> str | None
- zyra.utils.geo_utils.to_cartopy_crs(crs: str | None)
Return a Cartopy CRS object for an EPSG string, defaulting to PlateCarree.
Returns None if Cartopy is unavailable.
- zyra.utils.geo_utils.warn_if_mismatch(input_crs: str | None, *, target_crs: str = 'EPSG:4326', reproject: bool = False, context: str = '') -> None
GRIB utilities used by connectors and managers.
This module centralizes protocol-agnostic helpers for working with GRIB2 index files (.idx), calculating byte ranges, and performing parallel multi-range downloads.
Notes
The .idx file path is assumed to be the GRIB file path with a .idx suffix appended, unless a path already ending in .idx is provided.
Pattern filtering uses regular expressions via re.search().
- zyra.utils.grib.compute_chunks(total_size: int, chunk_size: int = 524288000) -> list[str]
Compute contiguous byte ranges that partition a file.
The final range uses the file size as the inclusive end byte (matching the behavior used by nodd_fetch.py).
- Parameters:
total_size (int) – Size of the file in bytes.
chunk_size (int, default 500 MiB, i.e. 524288000 bytes) – Upper bound for each chunk.
- Returns:
Range header strings, e.g., ["bytes=0-1048575", ...].
- Return type:
list of str
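The partitioning described above can be sketched as follows. This is an illustration under the stated rule that the final range ends at the file size itself; `chunk_ranges` is a hypothetical name, and returning an empty list for an empty file is an assumption.

```python
def chunk_ranges(total_size: int, chunk_size: int = 500 * 1024 * 1024) -> list:
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    start = 0
    while start < total_size:
        end = start + chunk_size - 1
        if end >= total_size - 1:
            # Final range: use the file size as the inclusive end byte.
            end = total_size
        ranges.append(f"bytes={start}-{end}")
        start += chunk_size
    return ranges


headers = chunk_ranges(2500, chunk_size=1000)
```

An end position past the last byte is generally harmless in an HTTP Range request, because servers treat it as the end of the resource.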
- zyra.utils.grib.ensure_idx_path(path: str) -> str
Return the .idx path for a GRIB file or pass through an explicit idx path.
- Parameters:
path (str) – The GRIB file path or .idx path.
- Returns:
path + '.idx' if path does not already end with .idx, otherwise returns path unchanged.
- Return type:
str
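The rule stated in the return description is small enough to show directly. `idx_path_for` is a hypothetical name, and the example paths are made up.

```python
def idx_path_for(path: str) -> str:
    # Append ".idx" unless the caller already passed an index path.
    return path if path.endswith(".idx") else path + ".idx"


from_grib = idx_path_for("forecast.grib2")
from_idx = idx_path_for("forecast.grib2.idx")
```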
- zyra.utils.grib.idx_to_byteranges(lines: list[str], search_regex: str) -> dict[str, str]
Convert .idx lines plus a variable regex into HTTP Range headers.
- Parameters:
lines (list of str) – Lines from a GRIB .idx file.
search_regex (str) – Regular expression to select desired GRIB lines (e.g., "PRES:surface").
- Returns:
Mapping of {"bytes=start-end": matching_idx_line} suitable for use as Range headers.
- Return type:
dict
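To make the mapping concrete, the sketch below assumes the common wgrib2-style index layout, in which each line looks like `number:start_byte:date:variable:level:forecast:`. A message ends one byte before the next message starts. Treating the last message as an open-ended range (`bytes=start-`) is an assumption, and `idx_lines_to_ranges` plus the sample lines are hypothetical.

```python
import re


def idx_lines_to_ranges(lines: list, search_regex: str) -> dict:
    pattern = re.compile(search_regex)
    entries = [line.strip() for line in lines if line.strip()]
    # The second colon-separated field is the message's starting byte offset.
    starts = [int(line.split(":")[1]) for line in entries]
    ranges = {}
    for i, line in enumerate(entries):
        if not pattern.search(line):
            continue
        # End one byte before the next message; leave the last range open-ended.
        end = str(starts[i + 1] - 1) if i + 1 < len(entries) else ""
        ranges[f"bytes={starts[i]}-{end}"] = line
    return ranges


sample = [
    "1:0:d=2024010100:PRES:surface:anl:",
    "2:1000:d=2024010100:TMP:2 m above ground:anl:",
    "3:2500:d=2024010100:PRES:mean sea level:anl:",
]
surface_only = idx_lines_to_ranges(sample, "PRES:surface")
all_pressure = idx_lines_to_ranges(sample, "PRES")
```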
- zyra.utils.grib.parallel_download_byteranges(download_func: Callable[[str, str], bytes], key_or_url: str, byte_ranges: Iterable[str], *, max_workers: int = 10) -> bytes
Download multiple byte ranges in parallel and concatenate in input order.
- Parameters:
download_func (Callable) – Function accepting (key_or_url, range_header) and returning bytes.
key_or_url (str) – The resource identifier for the remote object.
byte_ranges (Iterable[str]) – Iterable of Range header strings (e.g., "bytes=0-99"). Order matters and is preserved in the output concatenation.
max_workers (int, default=10) – Maximum number of worker threads.
- Returns:
The concatenated payload of all requested ranges in the input order.
- Return type:
bytes
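One way to get order-preserving parallel downloads is `ThreadPoolExecutor.map`, which yields results in submission order regardless of which request finishes first. The sketch below uses that approach with an in-memory stand-in for the downloader; `fetch_ranges`, `fake_download`, and the `mem://payload` identifier are hypothetical, and the real function may schedule its work differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def fetch_ranges(
    download_func: Callable[[str, str], bytes],
    key_or_url: str,
    byte_ranges: Iterable[str],
    *,
    max_workers: int = 10,
) -> bytes:
    ranges = list(byte_ranges)
    if not ranges:
        return b""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so concatenation stays aligned.
        parts = pool.map(lambda header: download_func(key_or_url, header), ranges)
        return b"".join(parts)


PAYLOAD = bytes(range(100))


def fake_download(key_or_url: str, range_header: str) -> bytes:
    # Stand-in for a real HTTP or S3 ranged GET: slice an in-memory payload.
    start, end = range_header.split("=")[1].split("-")
    return PAYLOAD[int(start): int(end) + 1]


combined = fetch_ranges(fake_download, "mem://payload", ["bytes=50-59", "bytes=0-9"])
```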