Question 67
A data engineering team must build an incremental ETL workload that ingests CDC events from a Kafka topic into bronze streaming tables, applies SCD Type 2 logic in silver, and refreshes a set of gold materialized views. The team wants the platform to automatically resolve the execution order of the datasets, retry transient failures starting at the most granular unit, and reduce the amount of hand-written Spark and Structured Streaming orchestration code.

They are deciding between two implementation approaches within Lakeflow:

- Author the logic across several Databricks notebooks and orchestrate them as **notebook tasks** in a Lakeflow Job, wiring up the task dependencies and retry logic manually.
- Author the logic as a **Lakeflow Spark Declarative Pipeline (SDP)** using streaming tables and materialized views.

Which approach best meets the requirements with the least custom orchestration code?
- A. Use notebook tasks in a Lakeflow Job and manually define the task DAG and per-task retry policies.
- B. Use a Lakeflow Spark Declarative Pipeline (SDP) so dataset dependencies and execution order are resolved automatically and transient failures retry at the task → flow → pipeline level.
- C. Use a single notebook task that runs all logic sequentially with a try/except block around each stage.
- D. Use a Python script task that calls the Jobs REST API to chain notebooks together.
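
As a toy illustration of the two requirements in the scenario (deriving execution order from declared dataset dependencies, and retrying one unit of work in isolation), here is a minimal sketch. It uses only the Python standard library and is not the Lakeflow API. The dataset names and the helpers `resolve_order` and `run_with_retries` are hypothetical, chosen only to mirror the bronze, silver, and gold layers described above.

```python
from graphlib import TopologicalSorter

# Hypothetical dataset names; each maps to the set of datasets it reads from.
DATASETS = {
    "bronze_cdc_events": set(),
    "silver_customers_scd2": {"bronze_cdc_events"},
    "gold_active_customers": {"silver_customers_scd2"},
    "gold_change_counts": {"silver_customers_scd2"},
}


def resolve_order(datasets):
    """Return an order in which every dataset comes after its inputs."""
    return list(TopologicalSorter(datasets).static_order())


def run_with_retries(update, max_attempts=3):
    """Retry a single dataset update so one transient failure
    does not force the whole run to start over."""
    for attempt in range(1, max_attempts + 1):
        try:
            return update()
        except RuntimeError:
            # Assumption: RuntimeError stands in for a transient failure.
            if attempt == max_attempts:
                raise


order = resolve_order(DATASETS)
print(order)
```

In this sketch the order falls out of the declared inputs rather than a hand-wired DAG: the bronze dataset always comes first, the silver dataset second, and the two gold datasets last in either order.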