Pipelines: job configuration
Canonical table-job parsing for bulk pipelines.
- class fabrictools.pipelines.config.TableJobConfig[source]
Bases: `TypedDict`

  Canonical keys for one bulk pipeline table job.
- fabrictools.pipelines.config.build_table_jobs_from_config(*, tables_config: list[dict[str, Any]], default_mode: str, default_partition_by: list[str] | None = None, supported_modes: set[str], source_keys: tuple[str, ...] = ('source_relative_path', 'source_path', 'source_table', 'bronze_path'), target_keys: tuple[str, ...] = ('target_relative_path', 'target_path', 'target_table', 'prepared_table', 'silver_table'), require_target: bool = False, require_mode: bool = False, allow_merge_condition: bool = False, cleaned_table_prefix: bool = False) list[TableJobConfig][source]
Normalize heterogeneous `tables_config` dicts into `TableJobConfig` rows.

- Parameters:
  - default_mode (str) – Mode used when an entry omits `mode` (unless `require_mode` is set).
  - default_partition_by (list[str] | None) – Default `partition_by` when omitted per entry.
  - supported_modes (set[str]) – Allowed mode strings (e.g. `overwrite`, `append`, `merge`).
  - source_keys (tuple[str, ...]) – Keys tried in order to read the source path.
  - target_keys (tuple[str, ...]) – Keys tried in order to read the target path.
  - require_target (bool) – If True, each entry must specify a target key.
  - require_mode (bool) – If True, each entry must include `mode`.
  - allow_merge_condition (bool) – If True, parse `merge_condition` and require it for `merge` mode.
  - cleaned_table_prefix (bool) – Passed to target path derivation when the target is inferred.
- Returns:
List of normalized job dicts.
- Return type:
  list[TableJobConfig]
- Raises:
ValueError – On invalid entries, unsupported modes, or missing merge condition.
- fabrictools.pipelines.config.build_table_jobs_from_discovery(*, source_lakehouse_name: str, discover_fn: Callable[[...], list[str]], include_schemas: list[str] | None, exclude_tables: list[str] | None, mode: str, partition_by: list[str] | None = None, cleaned_table_prefix: bool = False) list[TableJobConfig][source]
Build `TableJobConfig` rows by listing tables, then deriving target paths.

- Parameters:
  - source_lakehouse_name (str) – Lakehouse passed to `discover_fn`.
  - discover_fn (collections.abc.Callable) – Callable like `fabrictools.io.discovery.list_lakehouse_tables_for_pipeline()`.
  - include_schemas (list[str] | None) – Forwarded to `discover_fn`.
  - exclude_tables (list[str] | None) – Forwarded to `discover_fn`.
  - mode (str) – Write mode for every generated job.
  - partition_by (list[str] | None) – Optional partition columns for every job.
  - cleaned_table_prefix (bool) – When True, the target leaf uses the `Cleaned_` prefix logic.
- Returns:
One job per discovered relative path.
- Return type:
  list[TableJobConfig]
Shared pipeline contracts and helpers.
Re-exports `fabrictools.pipelines.config.TableJobConfig` and the job builders from
`fabrictools.pipelines.config`.
- class fabrictools.pipelines.TableJobConfig[source]
Bases: `TypedDict`

  Canonical keys for one bulk pipeline table job.
- fabrictools.pipelines.build_table_jobs_from_config(*, tables_config: list[dict[str, Any]], default_mode: str, default_partition_by: list[str] | None = None, supported_modes: set[str], source_keys: tuple[str, ...] = ('source_relative_path', 'source_path', 'source_table', 'bronze_path'), target_keys: tuple[str, ...] = ('target_relative_path', 'target_path', 'target_table', 'prepared_table', 'silver_table'), require_target: bool = False, require_mode: bool = False, allow_merge_condition: bool = False, cleaned_table_prefix: bool = False) list[TableJobConfig][source]
Normalize heterogeneous `tables_config` dicts into `TableJobConfig` rows.

- Parameters:
  - default_mode (str) – Mode used when an entry omits `mode` (unless `require_mode` is set).
  - default_partition_by (list[str] | None) – Default `partition_by` when omitted per entry.
  - supported_modes (set[str]) – Allowed mode strings (e.g. `overwrite`, `append`, `merge`).
  - source_keys (tuple[str, ...]) – Keys tried in order to read the source path.
  - target_keys (tuple[str, ...]) – Keys tried in order to read the target path.
  - require_target (bool) – If True, each entry must specify a target key.
  - require_mode (bool) – If True, each entry must include `mode`.
  - allow_merge_condition (bool) – If True, parse `merge_condition` and require it for `merge` mode.
  - cleaned_table_prefix (bool) – Passed to target path derivation when the target is inferred.
- Returns:
List of normalized job dicts.
- Return type:
  list[TableJobConfig]
- Raises:
ValueError – On invalid entries, unsupported modes, or missing merge condition.
- fabrictools.pipelines.build_table_jobs_from_discovery(*, source_lakehouse_name: str, discover_fn: Callable[[...], list[str]], include_schemas: list[str] | None, exclude_tables: list[str] | None, mode: str, partition_by: list[str] | None = None, cleaned_table_prefix: bool = False) list[TableJobConfig][source]
Build `TableJobConfig` rows by listing tables, then deriving target paths.

- Parameters:
  - source_lakehouse_name (str) – Lakehouse passed to `discover_fn`.
  - discover_fn (collections.abc.Callable) – Callable like `fabrictools.io.discovery.list_lakehouse_tables_for_pipeline()`.
  - include_schemas (list[str] | None) – Forwarded to `discover_fn`.
  - exclude_tables (list[str] | None) – Forwarded to `discover_fn`.
  - mode (str) – Write mode for every generated job.
  - partition_by (list[str] | None) – Optional partition columns for every job.
  - cleaned_table_prefix (bool) – When True, the target leaf uses the `Cleaned_` prefix logic.
- Returns:
One job per discovered relative path.
- Return type:
  list[TableJobConfig]