datavizhub.utils package
Utilities used across DataVizHub (dates, files, images, credentials).
The package avoids importing optional heavy dependencies at package import time, which keeps the CLI lightweight when only a subset of functionality is needed (e.g., the pipeline runner). Submodules can still be imported directly when required.
- class datavizhub.utils.CredentialManager(filename: str | None = None, namespace: str | None = None)
Bases: object
Manage app credentials from a dotenv file.
- Parameters:
filename (str, optional) – Path to a dotenv file containing key=value pairs.
namespace (str, optional) – Optional prefix to apply to all keys when stored/retrieved.
Examples
Namespaced keys:
cm = CredentialManager(".env", namespace="MYAPP_")
cm.read_credentials(expected_keys=["API_KEY"])  # expects MYAPP_API_KEY
- add_credential(key: str, value: str) -> None
Add or update a credential value in memory.
- Parameters:
key (str) – Base key name (namespace is applied automatically).
value (str) – Credential value to store.
- delete_credential(key: str) -> None
Delete a credential by key if present.
- Parameters:
key (str) – Base key name (namespace is applied automatically).
- get_credential(key: str) -> str
Retrieve a credential value by key (with namespace applied).
- Parameters:
key (str) – Base key name (namespace is applied automatically).
- Returns:
Stored value for the namespaced key.
- Return type:
str
- Raises:
KeyError – If the key is not present in memory.
- list_credentials(expected_keys: Iterable[str] | None = None) -> list[str]
List tracked credential keys, checking for expected ones when provided.
- Parameters:
expected_keys (Iterable[str], optional) – Keys to verify are present.
- Returns:
Keys currently stored in memory.
- Return type:
list of str
- Raises:
KeyError – If any expected keys are missing.
- read_credentials(expected_keys: Iterable[str] | None = None) -> None
Read credentials from the dotenv file into memory.
- Parameters:
expected_keys (Iterable[str], optional) – Keys that must be present; raises if any are missing.
- Raises:
FileNotFoundError – If the dotenv path cannot be resolved.
KeyError – If any expected keys are missing after reading.
- property tracked_keys: set[str]
Return the set of keys currently tracked in memory.
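The namespace behaviour described for this class can be pictured with a small in-memory stand-in. This is only a sketch: NamespacedStore is a hypothetical class, not the CredentialManager implementation, and it assumes the namespace is a plain string prefix joined directly to the base key.

```python
from typing import Dict, Iterable, List, Optional


class NamespacedStore:
    """Hypothetical stand-in that prefixes every key with a namespace."""

    def __init__(self, namespace: Optional[str] = None) -> None:
        self._namespace = namespace or ""
        self._values: Dict[str, str] = {}

    def _full_key(self, key: str) -> str:
        # Assumption: the prefix is concatenated as-is, e.g. "MYAPP_" + "API_KEY".
        return f"{self._namespace}{key}"

    def add(self, key: str, value: str) -> None:
        self._values[self._full_key(key)] = value

    def get(self, key: str) -> str:
        # Raises KeyError when the namespaced key is not stored.
        return self._values[self._full_key(key)]

    def delete(self, key: str) -> None:
        # Deleting an absent key is a no-op.
        self._values.pop(self._full_key(key), None)

    def list_keys(self, expected_keys: Optional[Iterable[str]] = None) -> List[str]:
        missing = [
            k for k in (expected_keys or []) if self._full_key(k) not in self._values
        ]
        if missing:
            raise KeyError(f"Missing expected keys: {missing}")
        return sorted(self._values)


store = NamespacedStore(namespace="MYAPP_")
store.add("API_KEY", "secret")
print(store.get("API_KEY"))
print(store.list_keys(expected_keys=["API_KEY"]))
```

Callers pass only the base key; the stored key carries the prefix, which is why the listing shows the namespaced name.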
- class datavizhub.utils.DateManager(date_formats: list[str] | None = None)
Bases: object
High-level utilities for working with dates and filenames.
- Parameters:
date_formats (list of str, optional) – Preferred strftime-style formats to use when parsing dates from filenames (e.g., ["%Y%m%d"]).
Examples
Use a custom filename format first, then fall back to ISO-like detection:
dm = DateManager(["%Y%m%d%H%M%S"])
when = dm.extract_date_time("frame_20240101093000.png")
- __init__(date_formats: list[str] | None = None) -> None
Optionally store preferred date formats for filename parsing.
- calculate_expected_frames(start_datetime: datetime, end_datetime: datetime, period_seconds: int) -> int
Calculate expected frame count between two datetimes at a cadence.
- Returns:
Number of expected frames (inclusive of endpoints).
- Return type:
int
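An inclusive frame count at a fixed cadence can be worked out as follows. This is a minimal sketch under stated assumptions, not the library's code: expected_frames is a hypothetical helper, and flooring a span that is not an exact multiple of the period is an assumption.

```python
from datetime import datetime


def expected_frames(start: datetime, end: datetime, period_seconds: int) -> int:
    """Count frames from start to end, including both endpoints."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    if end < start:
        raise ValueError("end must not precede start")
    span_seconds = (end - start).total_seconds()
    # Number of whole periods in the span, plus one for the frame at `start`.
    return int(span_seconds // period_seconds) + 1


# One hour at a 10-minute cadence: frames at :00, :10, ..., :60.
count = expected_frames(datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 1, 1, 0), 600)
print(count)
```

The "+ 1" is what makes the count inclusive: six 10-minute intervals are bounded by seven frames.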
- datetime_format_to_regex(datetime_format: str) -> str
Convert a datetime format string to a regex pattern.
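One way to turn a strftime-style format into a regex is to map each directive to a digit pattern and escape everything else. The sketch below is illustrative only: format_to_regex and its directive table are hypothetical, and it covers just the six numeric directives shown.

```python
import re

# Assumed subset of directives; each maps to a fixed-width digit pattern.
_DIRECTIVES = {
    "%Y": r"\d{4}",
    "%m": r"\d{2}",
    "%d": r"\d{2}",
    "%H": r"\d{2}",
    "%M": r"\d{2}",
    "%S": r"\d{2}",
}


def format_to_regex(datetime_format: str) -> str:
    """Translate known directives and escape all literal characters."""
    parts = []
    i = 0
    while i < len(datetime_format):
        token = datetime_format[i:i + 2]
        if token in _DIRECTIVES:
            parts.append(_DIRECTIVES[token])
            i += 2
        else:
            parts.append(re.escape(datetime_format[i]))
            i += 1
    return "".join(parts)


pattern = format_to_regex("%Y%m%d")
match = re.search(pattern, "frame_20240102.png")
print(pattern, match.group(0))
```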
- extract_date_time(string: str) -> str | None
Extract a date string from a filename/text using known formats.
Tries known formats first; falls back to a simple ISO-like pattern.
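The "known formats first, ISO-like fallback" order can be sketched as below. The names here (extract_date_string, _ISO_LIKE, _to_regex) are hypothetical, the exact fallback pattern is an assumption, and returning the matched substring unchanged is also an assumption about the return value.

```python
import re
from datetime import datetime
from typing import List, Optional

_TOKENS = {"%Y": r"\d{4}", "%m": r"\d{2}", "%d": r"\d{2}",
           "%H": r"\d{2}", "%M": r"\d{2}", "%S": r"\d{2}"}

# Assumed fallback: YYYY-MM-DD with an optional time part.
_ISO_LIKE = re.compile(r"\d{4}-\d{2}-\d{2}(?:[T_ ]\d{2}:\d{2}(?::\d{2})?)?")


def _to_regex(fmt: str) -> str:
    # Replace known directives with digit patterns; escape any other character.
    return re.sub(
        r"%[YmdHMS]|.",
        lambda m: _TOKENS.get(m.group(0)) or re.escape(m.group(0)),
        fmt,
    )


def extract_date_string(text: str, date_formats: Optional[List[str]] = None) -> Optional[str]:
    for fmt in date_formats or []:
        match = re.search(_to_regex(fmt), text)
        if match is None:
            continue
        try:
            # Reject digit runs that are not real dates (e.g. month 13).
            datetime.strptime(match.group(0), fmt)
        except ValueError:
            continue
        return match.group(0)
    fallback = _ISO_LIKE.search(text)
    return fallback.group(0) if fallback else None


custom = extract_date_string("frame_20240101093000.png", ["%Y%m%d%H%M%S"])
iso = extract_date_string("report_2024-01-02.csv", ["%Y%m%d%H%M%S"])
print(custom, iso)
```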
- extract_dates_from_filenames(directory_path: str, image_extensions: Iterable[str] = ('.jpg', '.jpeg', '.png', '.gif', '.bmp', '.dds')) -> Tuple[str | None, str | None]
Extract dates from the first and last image file names in a directory.
- Parameters:
directory_path (str) – Directory to scan for images.
image_extensions (Iterable[str]) – File extensions to include when scanning.
- Returns:
(first_date, last_date) as strings, or (None, None).
- Return type:
tuple
- find_missing_frames(directory, period_seconds, datetime_format, filename_format, filename_mask, start_datetime, end_datetime)
Find missing frames among the image files in a local directory whose frame period may be inconsistent.
- find_missing_frames_and_predict_names(timestamps, period_seconds, filename_pattern)
Find gaps and overly frequent frames in timestamps and predict names.
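Gap detection over a list of timestamps can be sketched as follows. This is not the library's algorithm: find_gaps_and_names is a hypothetical helper, it assumes filename_pattern is a strftime pattern, and it treats any frame arriving sooner than one period after its predecessor as overly frequent.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def find_gaps_and_names(
    timestamps: List[datetime], period_seconds: int, filename_pattern: str
) -> Tuple[List[str], List[datetime]]:
    """Return (predicted names of missing frames, frames that arrived too soon)."""
    period = timedelta(seconds=period_seconds)
    ordered = sorted(timestamps)
    missing_names: List[str] = []
    too_soon: List[datetime] = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous < period:
            too_soon.append(current)
            continue
        # Walk forward one period at a time until reaching the next real frame.
        expected = previous + period
        while expected < current:
            missing_names.append(expected.strftime(filename_pattern))
            expected += period
    return missing_names, too_soon


stamps = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 0, 10),
    datetime(2024, 1, 1, 0, 40),
    datetime(2024, 1, 1, 0, 45),
]
missing, early = find_gaps_and_names(stamps, 600, "frame_%Y%m%d_%H%M.png")
print(missing, early)
```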
- find_start_end_datetimes(directory: str)
Find earliest and latest datetimes from filenames in a directory.
- get_date_range(period: str) -> Tuple[datetime, datetime]
Compute a date range ending at the current minute from a period spec.
- Parameters:
period (str) – Period string such as "1Y", "6M", "7D", or "24H".
- Returns:
Start and end datetimes for the period ending at “now” (rounded to minute).
- Return type:
(datetime, datetime)
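A period spec of this shape can be parsed with a short regex. The sketch below is an assumption-laden illustration, not the library's implementation: date_range is a hypothetical helper, years and months are approximated as 365 and 30 days, and "rounded to minute" is interpreted as truncating seconds.

```python
import re
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Assumption: calendar units are approximated with fixed day counts.
_DAYS_PER_UNIT = {"Y": 365, "M": 30, "D": 1}


def date_range(period: str, now: Optional[datetime] = None) -> Tuple[datetime, datetime]:
    match = re.fullmatch(r"(\d+)([YMDH])", period.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognised period spec: {period!r}")
    count, unit = int(match.group(1)), match.group(2)
    # Truncate "now" to the minute so repeated calls within a minute agree.
    end = (now or datetime.now()).replace(second=0, microsecond=0)
    if unit == "H":
        delta = timedelta(hours=count)
    else:
        delta = timedelta(days=count * _DAYS_PER_UNIT[unit])
    return end - delta, end


fixed_now = datetime(2024, 1, 8, 12, 30, 45)
start, end = date_range("7D", now=fixed_now)
print(start, end)
```

Passing now explicitly keeps the example deterministic; omitting it uses the current time.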
- is_date_in_range(filepath: str, start_date: datetime, end_date: datetime) -> bool
Check if a filename contains a date within a range.
- Parameters:
filepath (str) – Path or filename containing a date stamp.
start_date (datetime) – Inclusive start of the permitted range.
end_date (datetime) – Inclusive end of the permitted range.
- Returns:
True if a parsed date falls within the range, else False.
- Return type:
bool
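The inclusive range check can be illustrated with a standalone function. This is a sketch only: filename_date_in_range is hypothetical, and it assumes an eight-digit %Y%m%d stamp in the file name rather than the formats configured on a DateManager.

```python
import os
import re
from datetime import datetime


def filename_date_in_range(
    filepath: str,
    start_date: datetime,
    end_date: datetime,
    date_format: str = "%Y%m%d",
    pattern: str = r"\d{8}",
) -> bool:
    """Return True when the stamp in the file name lies within [start, end]."""
    match = re.search(pattern, os.path.basename(filepath))
    if match is None:
        return False
    try:
        stamp = datetime.strptime(match.group(0), date_format)
    except ValueError:
        # Eight digits that do not form a valid date.
        return False
    return start_date <= stamp <= end_date


window_start, window_end = datetime(2024, 1, 1), datetime(2024, 1, 7)
inside = filename_date_in_range("frames/frame_20240102.png", window_start, window_end)
print(inside)
```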
- class datavizhub.utils.FileUtils
Bases: object
Namespace for file-related helper routines.
Examples
While most functions are provided at module-level, a class instance can be created if you prefer an object to group related operations.
- class datavizhub.utils.JSONFileManager(file_path: str)
Bases: object
Convenience wrapper for manipulating JSON files.
- Parameters:
file_path (str) – Path to a JSON file on disk.
Examples
Read, update, and save:
jm = JSONFileManager("./data.json")
jm.data["foo"] = "bar"
jm.save_file()
- read_file() -> None
Read the JSON file from disk into memory.
- Returns:
Populates self.data or sets it to None on error.
- Return type:
None
- save_file(new_file_path: str | None = None) -> None
Write the in-memory data back to disk.
- Parameters:
new_file_path (str, optional) – If provided, save to this path instead of overwriting the original.
- Returns:
This method returns nothing.
- Return type:
None
- update_dataset_times(target_id: str, directory: str) -> str
Update start/end times for a dataset using directory image dates.
- Parameters:
target_id (str) – Dataset identifier to match in the JSON payload.
directory (str) – Directory to scan for frame timestamps.
- Returns:
Status message describing the outcome of the update.
- Return type:
str
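The read and save behaviour documented for this class can be approximated with the standard json module. SimpleJSONFile is a hypothetical stand-in, not the JSONFileManager source; reading the file in the constructor is an assumption inferred from the usage example, and the error handling shown is one possible choice.

```python
import json
import logging
import os
import tempfile
from typing import Any, Optional


class SimpleJSONFile:
    """Hypothetical stand-in: load JSON into .data and write it back."""

    def __init__(self, file_path: str) -> None:
        self.file_path = file_path
        self.data: Any = None
        self.read_file()  # Assumption: data is loaded on construction.

    def read_file(self) -> None:
        try:
            with open(self.file_path, "r", encoding="utf-8") as handle:
                self.data = json.load(handle)
        except (OSError, json.JSONDecodeError) as exc:
            logging.error("Could not read %s: %s", self.file_path, exc)
            self.data = None

    def save_file(self, new_file_path: Optional[str] = None) -> None:
        # Fall back to the original path when no alternative is given.
        target = new_file_path or self.file_path
        with open(target, "w", encoding="utf-8") as handle:
            json.dump(self.data, handle, indent=2)


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "data.json")
    with open(original, "w", encoding="utf-8") as handle:
        json.dump({"foo": "old"}, handle)
    jm = SimpleJSONFile(original)
    jm.data["foo"] = "bar"
    copy_path = os.path.join(tmp, "copy.json")
    jm.save_file(copy_path)  # the original file is left unchanged
    reloaded = SimpleJSONFile(copy_path).data
    untouched = SimpleJSONFile(original).data
    absent = SimpleJSONFile(os.path.join(tmp, "absent.json")).data
```

Saving to a new path leaves the original file as it was, which is useful when the update should be reviewed before it replaces the source file.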
- datavizhub.utils.remove_all_files_in_directory(directory: str) -> None
Remove all files and subdirectories under a directory.
- Parameters:
directory (str) – Directory to clean.
- Returns:
This function returns nothing.
- Return type:
None
Notes
Errors are reported via logging.error for consistency with the rest of the codebase.
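Clearing a directory while logging failures instead of raising might look like the sketch below. clear_directory is a hypothetical name and this is not the library's source; keeping the directory itself in place and treating symlinks as files are assumptions.

```python
import logging
import os
import shutil
import tempfile


def clear_directory(directory: str) -> None:
    """Delete every entry under `directory`, keeping the directory itself."""
    try:
        entries = os.listdir(directory)
    except OSError as exc:
        logging.error("Could not list %s: %s", directory, exc)
        return
    for name in entries:
        path = os.path.join(directory, name)
        try:
            if os.path.isdir(path) and not os.path.islink(path):
                shutil.rmtree(path)  # real subdirectory: remove recursively
            else:
                os.unlink(path)  # regular file or symlink
        except OSError as exc:
            # Log and continue so one failure does not stop the cleanup.
            logging.error("Failed to delete %s: %s", path, exc)


with tempfile.TemporaryDirectory() as scratch:
    open(os.path.join(scratch, "a.txt"), "w").close()
    os.makedirs(os.path.join(scratch, "sub", "nested"))
    open(os.path.join(scratch, "sub", "b.txt"), "w").close()
    clear_directory(scratch)
    remaining = os.listdir(scratch)
    still_there = os.path.isdir(scratch)
```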
Modules
Credential storage and retrieval utilities.
This module provides CredentialManager, a small helper class to securely load, access, and manage credentials from a .env-style file without leaking them into the global process environment.
Examples
Load and access values:
from datavizhub.utils.credential_manager import CredentialManager
cm = CredentialManager("./.env")
cm.read_credentials(expected_keys=["API_KEY"])
token = cm.get_credential("API_KEY")
Use as a context manager:
with CredentialManager("./.env") as cm:
    cm.read_credentials()
    do_work(cm.get_credential("ACCESS_TOKEN"))
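The idea of loading credentials without touching the process environment, and scoping them to a with block, can be sketched using only the standard library. ScopedCredentials and parse_dotenv are hypothetical names; the simplified key=value parsing and the choice to clear values on exit are assumptions, not documented behaviour of CredentialManager.

```python
import os
import tempfile
from typing import Dict


def parse_dotenv(path: str) -> Dict[str, str]:
    """Parse simple key=value lines into a dict; os.environ is never modified."""
    values: Dict[str, str] = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("'\"")
    return values


class ScopedCredentials:
    """Hypothetical holder whose values are dropped when the block exits."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._values: Dict[str, str] = {}

    def __enter__(self) -> "ScopedCredentials":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Assumption: in-memory values are cleared on exit.
        self._values.clear()

    def read(self) -> None:
        self._values = parse_dotenv(self._path)

    def get(self, key: str) -> str:
        return self._values[key]


with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as handle:
        handle.write("# demo file\nDVH_SKETCH_TOKEN='abc123'\n")
    with ScopedCredentials(env_path) as creds:
        creds.read()
        token_inside = creds.get("DVH_SKETCH_TOKEN")
```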
- class datavizhub.utils.credential_manager.CredentialManager(filename: str | None = None, namespace: str | None = None)
Bases: object
Manage app credentials from a dotenv file.
- Parameters:
filename (str, optional) – Path to a dotenv file containing key=value pairs.
namespace (str, optional) – Optional prefix to apply to all keys when stored/retrieved.
Examples
Namespaced keys:
cm = CredentialManager(".env", namespace="MYAPP_")
cm.read_credentials(expected_keys=["API_KEY"])  # expects MYAPP_API_KEY
- add_credential(key: str, value: str) -> None
Add or update a credential value in memory.
- Parameters:
key (str) – Base key name (namespace is applied automatically).
value (str) – Credential value to store.
- delete_credential(key: str) -> None
Delete a credential by key if present.
- Parameters:
key (str) – Base key name (namespace is applied automatically).
- get_credential(key: str) -> str
Retrieve a credential value by key (with namespace applied).
- Parameters:
key (str) – Base key name (namespace is applied automatically).
- Returns:
Stored value for the namespaced key.
- Return type:
str
- Raises:
KeyError – If the key is not present in memory.
- list_credentials(expected_keys: Iterable[str] | None = None) -> list[str]
List tracked credential keys, checking for expected ones when provided.
- Parameters:
expected_keys (Iterable[str], optional) – Keys to verify are present.
- Returns:
Keys currently stored in memory.
- Return type:
list of str
- Raises:
KeyError – If any expected keys are missing.
- read_credentials(expected_keys: Iterable[str] | None = None) -> None
Read credentials from the dotenv file into memory.
- Parameters:
expected_keys (Iterable[str], optional) – Keys that must be present; raises if any are missing.
- Raises:
FileNotFoundError – If the dotenv path cannot be resolved.
KeyError – If any expected keys are missing after reading.
- property tracked_keys: set[str]
Return the set of keys currently tracked in memory.
Date/time utilities for parsing, ranges, and frame calculations.
Provides DateManager for extracting timestamps from filenames, building date ranges from period specs (e.g., 1Y, 6M, 7D, 24H), and validating or interpolating time-based frame sequences.
Examples
Parse dates and compute a range:
from datavizhub.utils.date_manager import DateManager
dm = DateManager(["%Y%m%d"])
start, end = dm.get_date_range("7D")
ok = dm.is_date_in_range("frame_20240102.png", start, end)
- class datavizhub.utils.date_manager.DateManager(date_formats: list[str] | None = None)
Bases: object
High-level utilities for working with dates and filenames.
- Parameters:
date_formats (list of str, optional) – Preferred strftime-style formats to use when parsing dates from filenames (e.g., ["%Y%m%d"]).
Examples
Use a custom filename format first, then fall back to ISO-like detection:
dm = DateManager(["%Y%m%d%H%M%S"])
when = dm.extract_date_time("frame_20240101093000.png")
- __init__(date_formats: list[str] | None = None) -> None
Optionally store preferred date formats for filename parsing.
- calculate_expected_frames(start_datetime: datetime, end_datetime: datetime, period_seconds: int) -> int
Calculate expected frame count between two datetimes at a cadence.
- Returns:
Number of expected frames (inclusive of endpoints).
- Return type:
int
- datetime_format_to_regex(datetime_format: str) -> str
Convert a datetime format string to a regex pattern.
- extract_date_time(string: str) -> str | None
Extract a date string from a filename/text using known formats.
Tries known formats first; falls back to a simple ISO-like pattern.
- extract_dates_from_filenames(directory_path: str, image_extensions: Iterable[str] = ('.jpg', '.jpeg', '.png', '.gif', '.bmp', '.dds')) -> Tuple[str | None, str | None]
Extract dates from the first and last image file names in a directory.
- Parameters:
directory_path (str) – Directory to scan for images.
image_extensions (Iterable[str]) – File extensions to include when scanning.
- Returns:
(first_date, last_date) as strings, or (None, None).
- Return type:
tuple
- find_missing_frames(directory, period_seconds, datetime_format, filename_format, filename_mask, start_datetime, end_datetime)
Find missing frames among the image files in a local directory whose frame period may be inconsistent.
- find_missing_frames_and_predict_names(timestamps, period_seconds, filename_pattern)
Find gaps and overly frequent frames in timestamps and predict names.
- find_start_end_datetimes(directory: str)
Find earliest and latest datetimes from filenames in a directory.
- get_date_range(period: str) -> Tuple[datetime, datetime]
Compute a date range ending at the current minute from a period spec.
- Parameters:
period (str) – Period string such as "1Y", "6M", "7D", or "24H".
- Returns:
Start and end datetimes for the period ending at “now” (rounded to minute).
- Return type:
(datetime, datetime)
- is_date_in_range(filepath: str, start_date: datetime, end_date: datetime) -> bool
Check if a filename contains a date within a range.
- Parameters:
filepath (str) – Path or filename containing a date stamp.
start_date (datetime) – Inclusive start of the permitted range.
end_date (datetime) – Inclusive end of the permitted range.
- Returns:
True if a parsed date falls within the range, else False.
- Return type:
bool
Filesystem helper utilities.
Lightweight helpers for common file and directory operations used across the project.
Examples
Clean a scratch directory:
from datavizhub.utils.file_utils import remove_all_files_in_directory
remove_all_files_in_directory("./scratch")
- class datavizhub.utils.file_utils.FileUtils
Bases: object
Namespace for file-related helper routines.
Examples
While most functions are provided at module-level, a class instance can be created if you prefer an object to group related operations.
- datavizhub.utils.file_utils.remove_all_files_in_directory(directory: str) -> None
Remove all files and subdirectories under a directory.
- Parameters:
directory (str) – Directory to clean.
- Returns:
This function returns nothing.
- Return type:
None
Notes
Errors are reported via logging.error for consistency with the rest of the codebase.
Read, update, and write JSON files used as simple configs/datasets.
Provides JSONFileManager to persist simple JSON structures and to update dataset start/end times using dates inferred from a directory of frames.
Examples
Update a dataset time window:
from datavizhub.utils.json_file_manager import JSONFileManager
jm = JSONFileManager("./config.json")
jm.update_dataset_times("my-dataset", "./frames")
jm.save_file()
- class datavizhub.utils.json_file_manager.JSONFileManager(file_path: str)
Bases: object
Convenience wrapper for manipulating JSON files.
- Parameters:
file_path (str) – Path to a JSON file on disk.
Examples
Read, update, and save:
jm = JSONFileManager("./data.json")
jm.data["foo"] = "bar"
jm.save_file()
- read_file() -> None
Read the JSON file from disk into memory.
- Returns:
Populates self.data or sets it to None on error.
- Return type:
None
- save_file(new_file_path: str | None = None) -> None
Write the in-memory data back to disk.
- Parameters:
new_file_path (str, optional) – If provided, save to this path instead of overwriting the original.
- Returns:
This method returns nothing.
- Return type:
None
- update_dataset_times(target_id: str, directory: str) -> str
Update start/end times for a dataset using directory image dates.
- Parameters:
target_id (str) – Dataset identifier to match in the JSON payload.
directory (str) – Directory to scan for frame timestamps.
- Returns:
Status message describing the outcome of the update.
- Return type:
str