zyra.connectors.backends package

HTTP connector backend.

Provides functional helpers to fetch and list resources over HTTP(S), as well as convenience utilities for GRIB workflows (.idx parsing and byte-range downloads) and Content-Length probing.

Functions are intentionally small and dependency-light so they can be used by the CLI, pipelines, or higher-level wrappers without imposing heavy imports.

zyra.connectors.backends.http.download_byteranges(url: str, byte_ranges: Iterable[str], *, max_workers: int = 10, timeout: int = 60, headers: dict[str, str] | None = None) -> bytes

Download multiple byte ranges and concatenate in the input order.
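The ordering guarantee can be illustrated with a minimal sketch. It is not the library's implementation: `fetch_range` is a hypothetical callable standing in for a single ranged GET, and the `bytes=start-end` range string format is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def download_ranges_in_order(
    fetch_range: Callable[[str], bytes],
    byte_ranges: Iterable[str],
    max_workers: int = 10,
) -> bytes:
    """Fetch each range concurrently, then join the parts in input order."""
    ranges = list(byte_ranges)
    if not ranges:
        return b""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its inputs,
        # regardless of which request finishes first.
        parts = list(pool.map(fetch_range, ranges))
    return b"".join(parts)


# Stand-in for a ranged GET (hypothetical): slices an in-memory payload.
PAYLOAD = b"0123456789abcdef"


def fake_fetch(byte_range: str) -> bytes:
    start, end = byte_range.removeprefix("bytes=").split("-")
    return PAYLOAD[int(start) : int(end) + 1]


result = download_ranges_in_order(fake_fetch, ["bytes=10-15", "bytes=0-3"])
```

Here `result` holds the bytes of the first range followed by the bytes of the second, even though the second range sits earlier in the payload.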

zyra.connectors.backends.http.fetch_bytes(url: str, *, timeout: int = 60, headers: dict[str, str] | None = None) -> bytes

Return the raw response body for a GET request.

Parameters:

  • url: HTTP(S) URL to fetch.

  • timeout: Request timeout in seconds.

zyra.connectors.backends.http.fetch_json(url: str, *, timeout: int = 60, headers: dict[str, str] | None = None)

Return the parsed JSON body for a GET request.

zyra.connectors.backends.http.fetch_text(url: str, *, timeout: int = 60, headers: dict[str, str] | None = None) -> str

Return the response body as text for a GET request.

zyra.connectors.backends.http.get_idx_lines(url: str, *, timeout: int = 60, max_retries: int = 3, headers: dict[str, str] | None = None) -> list[str]

Fetch and parse the GRIB .idx file for a URL.
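As a rough illustration of how .idx lines relate to byte-range downloads, the sketch below turns index lines into range strings. It assumes the common wgrib2-style layout (`message:offset:date:variable:level:forecast:`) and a `bytes=start-end` range format; the helper name `idx_to_byteranges` and the sample lines are invented for this example and are not part of the package.

```python
import re


def idx_to_byteranges(idx_lines: list[str], pattern: str) -> dict[str, str]:
    """Map each .idx line matching `pattern` to a 'bytes=start-end' string."""
    # The second colon-separated field is the message's starting byte offset.
    offsets = [int(line.split(":")[1]) for line in idx_lines]
    ranges: dict[str, str] = {}
    for i, line in enumerate(idx_lines):
        if not re.search(pattern, line):
            continue
        start = offsets[i]
        # A message ends one byte before the next message starts;
        # the last message runs to end of file (open-ended range).
        end = str(offsets[i + 1] - 1) if i + 1 < len(offsets) else ""
        ranges[line] = f"bytes={start}-{end}"
    return ranges


# Invented sample index lines, for illustration only.
sample = [
    "1:0:d=2024010100:PRMSL:mean sea level:anl:",
    "2:1000:d=2024010100:TMP:2 m above ground:anl:",
    "3:2500:d=2024010100:UGRD:10 m above ground:anl:",
]
selected = idx_to_byteranges(sample, r":TMP:|:UGRD:")
```

The resulting range strings could then be passed, in order, to a ranged download helper.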

zyra.connectors.backends.http.get_size(url: str, *, timeout: int = 60, headers: dict[str, str] | None = None) -> int | None

Return Content-Length for a URL via HTTP HEAD when provided.

zyra.connectors.backends.http.list_files(url: str, pattern: str | None = None, *, timeout: int = 60, headers: dict[str, str] | None = None) -> list[str]

Best-effort directory listing by scraping anchor tags on index pages.

Returns absolute URLs; optionally filters them via regex pattern.
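The scraping approach can be sketched with the standard library alone. This is an illustration under assumptions, not the package's code: `links_from_index` is a hypothetical name, and the sketch parses an HTML string that has already been fetched.

```python
from __future__ import annotations

import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class _AnchorCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_from_index(html: str, base_url: str, pattern: str | None = None) -> list[str]:
    parser = _AnchorCollector()
    parser.feed(html)
    # Resolve relative links against the index page URL.
    urls = [urljoin(base_url, href) for href in parser.hrefs]
    if pattern is not None:
        urls = [u for u in urls if re.search(pattern, u)]
    return urls


page = '<html><body><a href="a.grib2">a</a> <a href="b.txt">b</a> <a href="../">up</a></body></html>'
found = links_from_index(page, "https://example.com/data/", r"\.grib2$")
```

Because index pages vary between servers, a listing produced this way is best treated as a hint rather than an authoritative inventory.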

zyra.connectors.backends.http.post_bytes(url: str, data: bytes, *, timeout: int = 60, content_type: str | None = None, headers: dict[str, str] | None = None) -> int

Backward-compatible wrapper for post_data.

zyra.connectors.backends.http.post_data(url: str, data: bytes, *, timeout: int = 60, content_type: str | None = None, headers: dict[str, str] | None = None) -> int

POST raw bytes to a URL and return the HTTP status code.

FTP connector backend.

Thin functional wrappers around the FTPManager to support simple byte fetches and uploads, directory listing with regex/date filtering, sync-to-local flows, and advanced GRIB workflows (.idx handling, ranged downloads).

The URL parser supports anonymous and credentialed forms, e.g.: ftp://host/path, ftp://user@host/path, ftp://user:pass@host/path.
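The three URL forms can be taken apart with `urllib.parse`, as in the sketch below. `split_ftp_url` is a hypothetical stand-in, not the package's parser; in particular, whether the package keeps the leading slash on the remote path is not stated here and is an assumption of the sketch.

```python
from __future__ import annotations

from urllib.parse import urlsplit


def split_ftp_url(url: str) -> tuple[str, str, str | None, str | None]:
    """Return (host, remote_path, username, password) for an ftp:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "ftp" or not parts.hostname:
        raise ValueError(f"not an FTP URL: {url!r}")
    # username/password are None when the URL does not embed them.
    return parts.hostname, parts.path, parts.username, parts.password


anonymous = split_ftp_url("ftp://host/path")
with_user = split_ftp_url("ftp://user@host/path")
with_both = split_ftp_url("ftp://user:pass@host/path")
```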

class zyra.connectors.backends.ftp.FTPManager

Bases: object

Placeholder for tests to patch.

The backend functions will attempt to delegate to this manager if present. Tests patch this attribute with a mock class exposing expected methods.

class zyra.connectors.backends.ftp.SyncOptions(overwrite_existing: bool = False, recheck_existing: bool = False, min_remote_size: int | str | None = None, prefer_remote: bool = False, prefer_remote_if_meta_newer: bool = False, skip_if_local_done: bool = False, recheck_missing_meta: bool = False, frames_meta_path: str | None = None)

Bases: object

Configuration for FTP sync file replacement behavior.

Controls how sync_directory decides whether to download a remote file when a local copy already exists. Options are evaluated in precedence order:

  1. skip_if_local_done - Skip if .done marker file exists

  2. Local file missing - Always download when no local copy is present

  3. Local file is zero bytes - Always replace empty local files

  4. overwrite_existing - Unconditional replacement

  5. prefer_remote - Always prioritize remote versions

  6. prefer_remote_if_meta_newer - Use frames-meta.json timestamps

  7. recheck_missing_meta - Re-download if metadata entry missing

  8. min_remote_size - Replace if remote exceeds size threshold

  9. recheck_existing - Compare sizes when mtime unavailable

  10. Default: Replace if remote mtime (via MDTM) is newer
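The "first matching rule wins" evaluation described by the list above can be sketched as a chain of early returns. This is a deliberately reduced illustration covering only rules 1 to 4 and the default rule 10; `LocalState` and `decide` are hypothetical names, and the reason strings are invented.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class LocalState:
    """Hypothetical snapshot of the local copy of one file."""

    exists: bool
    size: int = 0
    mtime: datetime | None = None
    has_done_marker: bool = False


def decide(
    local: LocalState,
    remote_mtime: datetime | None,
    *,
    skip_if_local_done: bool = False,
    overwrite_existing: bool = False,
) -> tuple[bool, str]:
    # Rules are checked top to bottom; the first match decides.
    if skip_if_local_done and local.has_done_marker:
        return False, "done marker present"
    if not local.exists:
        return True, "local file missing"
    if local.size == 0:
        return True, "local file is zero bytes"
    if overwrite_existing:
        return True, "overwrite_existing set"
    # Rules 5 to 9 are omitted from this sketch.
    if remote_mtime and local.mtime and remote_mtime > local.mtime:
        return True, "remote mtime is newer"
    return False, "local copy is up to date"


older = datetime(2024, 1, 1, tzinfo=timezone.utc)
newer = datetime(2024, 1, 2, tzinfo=timezone.utc)
stale = decide(LocalState(exists=True, size=10, mtime=older), newer)
```

Returning a reason alongside the boolean keeps each decision easy to log and to test in isolation.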

frames_meta_path: str | None = None

Path to frames-meta.json for metadata-aware sync operations.

min_remote_size: int | str | None = None

Replace if remote file exceeds threshold (bytes or percentage like ‘10%’).

overwrite_existing: bool = False

Replace local files unconditionally regardless of timestamps.

prefer_remote: bool = False

Always prioritize remote versions over local copies.

prefer_remote_if_meta_newer: bool = False

Use frames-meta.json timestamps for comparison instead of MDTM.

recheck_existing: bool = False

Compare file sizes when timestamps are unavailable.

recheck_missing_meta: bool = False

Re-download files that lack a companion entry in frames-meta.json.

skip_if_local_done: bool = False

Skip files that have a companion .done marker file.

zyra.connectors.backends.ftp.delete(url_or_path: str, *, username: str | None = None, password: str | None = None) -> bool

Delete a remote FTP path (file).

zyra.connectors.backends.ftp.download_byteranges(url_or_path: str, byte_ranges: Iterable[str], *, max_workers: int = 10, timeout: int = 30, username: str | None = None, password: str | None = None) -> bytes

Download multiple ranges via FTP REST and concatenate in the input order.

zyra.connectors.backends.ftp.exists(url_or_path: str, *, username: str | None = None, password: str | None = None) -> bool

Return True if the remote path exists on the FTP server.

zyra.connectors.backends.ftp.fetch_bytes(url_or_path: str, *, username: str | None = None, password: str | None = None) -> bytes

Fetch a remote file as bytes from an FTP server.

zyra.connectors.backends.ftp.get_chunks(url_or_path: str, chunk_size: int = 524288000, *, username: str | None = None, password: str | None = None) -> list[str]

Compute contiguous chunk ranges for an FTP file.
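The arithmetic behind contiguous chunking is sketched below for a file of known size. `chunk_ranges` is a hypothetical helper, and the `bytes=start-end` output format is an assumption; the default of 524288000 bytes matches the signature above and equals 500 MiB.

```python
def chunk_ranges(total_size: int, chunk_size: int = 524288000) -> list[str]:
    """Split [0, total_size) into contiguous, inclusive byte ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    for start in range(0, total_size, chunk_size):
        # The final chunk may be shorter than chunk_size.
        end = min(start + chunk_size, total_size) - 1
        ranges.append(f"bytes={start}-{end}")
    return ranges


small = chunk_ranges(10, 4)
```

Each range ends exactly one byte before the next begins, so concatenating the downloaded chunks in order reproduces the file.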

zyra.connectors.backends.ftp.get_idx_lines(url_or_path: str, *, write_to: str | None = None, timeout: int = 30, max_retries: int = 3, username: str | None = None, password: str | None = None) -> list[str] | None

Fetch and parse the GRIB .idx for a remote path via FTP.

zyra.connectors.backends.ftp.get_remote_mtime(url_or_path: str, *, username: str | None = None, password: str | None = None) -> datetime | None

Return modification time from FTP MDTM command, or None if unavailable.

The MDTM command returns timestamps in UTC (RFC 3659) in the format YYYYMMDDhhmmss. Returns a UTC-aware datetime. Not all FTP servers support this command; failures return None gracefully.
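Converting an MDTM reply into a UTC-aware datetime can be sketched as follows. `parse_mdtm` is a hypothetical helper that takes the raw reply line (reply code 213 followed by the timestamp, per RFC 3659); how the package obtains that reply is not shown here.

```python
from __future__ import annotations

from datetime import datetime, timezone


def parse_mdtm(response: str) -> datetime | None:
    """Parse a reply such as '213 20240102030405' into a UTC datetime."""
    parts = response.strip().split()
    if len(parts) != 2 or parts[0] != "213":
        # Error replies (e.g. unsupported command) yield None.
        return None
    stamp = parts[1].split(".")[0]  # drop optional fractional seconds
    try:
        naive = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    except ValueError:
        return None
    # MDTM timestamps are defined as UTC, so attach the zone explicitly.
    return naive.replace(tzinfo=timezone.utc)


mtime = parse_mdtm("213 20240102030405")
```

Attaching `timezone.utc` matters when comparing against local file times, since mixing naive and aware datetimes raises `TypeError` in Python.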

zyra.connectors.backends.ftp.get_size(url_or_path: str, *, username: str | None = None, password: str | None = None) -> int | None

Return remote file size in bytes via FTP SIZE.

zyra.connectors.backends.ftp.list_files(url_or_dir: str, pattern: str | None = None, *, since: str | None = None, until: str | None = None, date_format: str | None = None, username: str | None = None, password: str | None = None) -> list[str] | None

List FTP directory contents with optional regex and date filtering.

zyra.connectors.backends.ftp.parse_ftp_path(url_or_path: str, *, username: str | None = None, password: str | None = None) -> tuple[str, str, str | None, str | None]

Return (host, remote_path, username, password) parsed from an FTP path.

zyra.connectors.backends.ftp.should_download(remote_name: str, local_path: Path, remote_size: int | None, remote_mtime: datetime | None, options: SyncOptions, frames_meta: dict | None = None) -> tuple[bool, str]

Determine if a remote file should be downloaded based on sync options.

Args:

  • remote_name: The filename on the remote server.

  • local_path: Path to the local file (may not exist).

  • remote_size: Remote file size in bytes, or None if unknown.

  • remote_mtime: Remote modification time from MDTM, or None if unavailable.

  • options: SyncOptions configuration.

  • frames_meta: Parsed frames-meta.json content, or None.

Returns:

A tuple of (should_download, reason) where reason is a short description suitable for logging.

zyra.connectors.backends.ftp.stat(url_or_path: str, *, username: str | None = None, password: str | None = None)

Return minimal metadata mapping for a remote path (e.g., size).

zyra.connectors.backends.ftp.sync_directory(url_or_dir: str, local_dir: str, *, pattern: str | None = None, since: str | None = None, until: str | None = None, date_format: str | None = None, clean_zero_bytes: bool = False, username: str | None = None, password: str | None = None, sync_options: SyncOptions | None = None) -> None

Sync files from a remote FTP directory to a local directory.

Applies regex/date filters prior to download; optionally removes local zero-byte files before syncing and deletes local files that are no longer present on the server.

Args:

  • url_or_dir: FTP URL or path to the remote directory.

  • local_dir: Local directory path to sync files to.

  • pattern: Optional regex pattern to filter filenames.

  • since: ISO date string for start of date range filter.

  • until: ISO date string for end of date range filter.

  • date_format: Custom date format for parsing dates in filenames.

  • clean_zero_bytes: Remove zero-byte local files before syncing.

  • username: FTP username (overrides URL-embedded credentials).

  • password: FTP password (overrides URL-embedded credentials).

  • sync_options: Configuration for file replacement behavior. If None, a default SyncOptions instance is used, which downloads files that are missing, zero-byte, or have a newer remote modification time than the local copy.

zyra.connectors.backends.ftp.upload_bytes(data: bytes, url_or_path: str, *, username: str | None = None, password: str | None = None) -> bool

Upload bytes to a remote FTP path.

S3 connector backend.

Functional helpers for working with Amazon S3 using the existing S3Manager implementation under the hood. Exposes byte fetching, uploading, listing, and introspection utilities, plus GRIB-centric helpers for .idx and ranged downloads.

zyra.connectors.backends.s3.delete(url_or_bucket: str, key: str | None = None) -> bool

Delete an object by URL or bucket+key.

zyra.connectors.backends.s3.download_byteranges(url_or_bucket: str, key: str | None, byte_ranges: Iterable[str], *, unsigned: bool = False, max_workers: int = 10, timeout: int = 30) -> bytes

Download multiple byte ranges from an S3 object and concatenate in order.

zyra.connectors.backends.s3.exists(url_or_bucket: str, key: str | None = None) -> bool

Return True if an S3 object exists.

zyra.connectors.backends.s3.fetch_bytes(url_or_bucket: str, key: str | None = None, *, unsigned: bool = False) -> bytes

Fetch an object’s full bytes using ranged GET semantics.

Accepts either a single s3://bucket/key URL or a bucket and key pair.

zyra.connectors.backends.s3.get_idx_lines(url_or_bucket: str, key: str | None = None, *, unsigned: bool = False, timeout: int = 30, max_retries: int = 3) -> list[str]

Fetch and parse the GRIB .idx content for an S3 object.

Accepts either a full s3:// URL or (bucket, key).

zyra.connectors.backends.s3.get_size(url_or_bucket: str, key: str | None = None) -> int | None

Return the size in bytes for an S3 object, or None if unknown.

zyra.connectors.backends.s3.list_files(prefix_or_url: str | None = None, *, pattern: str | None = None, since: str | None = None, until: str | None = None, date_format: str | None = None) -> list[str]

List S3 keys with optional regex and date filtering.

Accepts either a full s3://bucket/prefix or bucket only (prefix may be None) and filters using regex pattern and/or filename-based date filtering via since/until with date_format.
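Combined regex and filename-date filtering might look like the sketch below. How the package actually extracts dates from filenames is not documented here, so the sketch makes loud assumptions: it looks for the first eight-digit token in the final path component, parses it with a fixed `%Y%m%d` format, and treats `since`/`until` as inclusive ISO dates. `filter_keys` and the sample keys are invented.

```python
from __future__ import annotations

import re
from datetime import datetime

DATE_FORMAT = "%Y%m%d"  # assumed filename date format for this sketch


def filter_keys(
    keys: list[str],
    pattern: str | None = None,
    since: str | None = None,
    until: str | None = None,
) -> list[str]:
    start = datetime.fromisoformat(since) if since else None
    end = datetime.fromisoformat(until) if until else None
    kept = []
    for key in keys:
        if pattern and not re.search(pattern, key):
            continue
        if start or end:
            # Only the final path component is searched for a date token.
            match = re.search(r"\d{8}", key.rsplit("/", 1)[-1])
            if not match:
                continue
            try:
                stamp = datetime.strptime(match.group(), DATE_FORMAT)
            except ValueError:
                continue
            if start and stamp < start:
                continue
            if end and stamp > end:
                continue
        kept.append(key)
    return kept


keys = [
    "data/frame_20240101.png",
    "data/frame_20240105.png",
    "data/frame_20240110.png",
    "data/readme.txt",
    "data/frame_20240105.json",
]
recent_pngs = filter_keys(keys, pattern=r"\.png$", since="2024-01-02", until="2024-01-10")
```

In this sketch, keys without a parseable date are dropped whenever a date bound is given; a real implementation could reasonably choose to keep them instead.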

zyra.connectors.backends.s3.parse_s3_url(url: str) -> tuple[str, str]

Parse an S3 URL into (bucket, key) with a required key.

Backward compatible with earlier versions that always returned a non-empty key. Raises ValueError if the URL does not include a key (e.g., s3://bucket or s3://bucket/).

zyra.connectors.backends.s3.parse_s3_url_optional_key(url: str) -> tuple[str, str | None]

Parse an S3 URL into (bucket, key_or_none).

  • Returns (bucket, None) when the URL points to the bucket root (e.g., s3://bucket or s3://bucket/). Useful for list/prefix operations.

  • For object operations, prefer parse_s3_url which requires a key.
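The difference between the strict and the optional-key parsers can be sketched with `urllib.parse`. The function names below are hypothetical stand-ins, not the package's implementations, and the error messages are invented.

```python
from __future__ import annotations

from urllib.parse import urlsplit


def split_s3_url_optional_key(url: str) -> tuple[str, str | None]:
    """Return (bucket, key), with key set to None for a bucket-root URL."""
    parts = urlsplit(url)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    key = parts.path.lstrip("/")
    # An empty path ("s3://bucket" or "s3://bucket/") means no key.
    return parts.netloc, key or None


def split_s3_url(url: str) -> tuple[str, str]:
    """Strict variant: a key is required."""
    bucket, key = split_s3_url_optional_key(url)
    if key is None:
        raise ValueError(f"S3 URL has no key: {url!r}")
    return bucket, key


root = split_s3_url_optional_key("s3://bucket/")
obj = split_s3_url("s3://bucket/path/to/file.grib2")
```

One limitation of this sketch: `urlsplit` treats `?` and `#` as query and fragment delimiters, so keys containing those characters would need separate handling.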

zyra.connectors.backends.s3.stat(url_or_bucket: str, key: str | None = None)

Return a basic metadata mapping for an object (size/etag/last_modified).

zyra.connectors.backends.s3.upload_bytes(data: bytes, url_or_bucket: str, key: str | None = None) -> bool

Upload bytes to an S3 object using managed transfer.

  • Calls upload_file for compatibility with existing tests/mocks.

  • Sets ContentType=application/json for .json keys via ExtraArgs.

zyra.connectors.backends.vimeo.fetch_bytes(video_id: str) -> bytes

zyra.connectors.backends.vimeo.update_description(video_uri: str, text: str, *, token: str | None = None, client_id: str | None = None, client_secret: str | None = None) -> str

Update the description metadata for a Vimeo video.

zyra.connectors.backends.vimeo.update_video(video_path: str, video_uri: str, *, token: str | None = None, client_id: str | None = None, client_secret: str | None = None) -> str

Replace an existing Vimeo video file and return the URI.

zyra.connectors.backends.vimeo.upload_path(video_path: str, *, name: str | None = None, description: str | None = None, token: str | None = None, client_id: str | None = None, client_secret: str | None = None) -> str

Upload a local video file to Vimeo using PyVimeo.

Returns the Vimeo video URI on success.