FEFreeExamDumps.in

Implementing Data Engineering Solutions Using Azure Databricks

Topic 1

Question 61

You are a data engineer building a Lakeflow Spark Declarative Pipeline that lands a `silver.transactions` streaming table in Unity Catalog. The business defines three row-level data quality rules and wants the pipeline to **drop** any record that violates **any** rule, while still emitting per-rule pass/fail metrics to the pipeline event log:

- **Nullability:** `transaction_id` must never be null.
- **Cardinality / domain:** `status` must be exactly one of `'PENDING'`, `'SETTLED'`, or `'REVERSED'`.
- **Range:** `amount` must be greater than 0 and at most 50000.

You author the dataset with grouped expectations so a single decorator applies all three checks with one collective action:

```python
from pyspark import pipelines as dp

rules = {
    "valid_id": "transaction_id IS NOT NULL",
    "valid_status": "status IN ('PENDING','SETTLED','REVERSED')",
    "valid_amount": "amount > 0 AND amount <= 50000"
}

@dp.table(name="silver_transactions")
@dp.expect_all_or_drop(rules)
def silver_transactions():
    return spark.readStream.table("bronze.transactions_raw")
```

In the **Data quality** tab, for each rule you must select the SINGLE check category that the rule's SQL condition primarily implements.

```mermaid
flowchart TD
    R1["Rule valid_id:<br/>transaction_id IS NOT NULL"] --> D1{{"Check category?"}}
    R2["Rule valid_status:<br/>status IN ('PENDING','SETTLED','REVERSED')"] --> D2{{"Check category?"}}
    R3["Rule valid_amount:<br/>amount > 0 AND amount <= 50000"] --> D3{{"Check category?"}}
    D1 -.options.-> O1["Nullability / Cardinality / Range"]
    D2 -.options.-> O2["Nullability / Cardinality / Range"]
    D3 -.options.-> O3["Nullability / Cardinality / Range"]
```
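
To make the "drop on any violation, but still count failures per rule" behaviour described in the scenario concrete, here is a minimal plain-Python sketch. It does not call any Databricks or Spark API: the `RULES` predicates, the `apply_all_or_drop` helper, and the sample rows are all hypothetical stand-ins for the three SQL constraints, and treating a missing value as a failed check is a simplifying assumption rather than documented pipeline behaviour.

```python
from collections import Counter

# Hypothetical Python equivalents of the three SQL constraints in the question.
RULES = {
    "valid_id": lambda r: r.get("transaction_id") is not None,
    "valid_status": lambda r: r.get("status") in {"PENDING", "SETTLED", "REVERSED"},
    # Assumption: a missing amount counts as a failure.
    "valid_amount": lambda r: r.get("amount") is not None and 0 < r["amount"] <= 50000,
}


def apply_all_or_drop(rows, rules=RULES):
    """Keep a row only if every rule passes; tally failures per rule."""
    failures = Counter({name: 0 for name in rules})
    kept = []
    for row in rows:
        failed = [name for name, check in rules.items() if not check(row)]
        failures.update(failed)  # one row may fail several rules at once
        if not failed:
            kept.append(row)
    return kept, dict(failures)


# Made-up sample rows, for illustration only.
sample = [
    {"transaction_id": "t1", "status": "SETTLED", "amount": 120.0},
    {"transaction_id": None, "status": "PENDING", "amount": 10.0},
    {"transaction_id": "t3", "status": "CANCELLED", "amount": 0.0},
    {"transaction_id": "t4", "status": "REVERSED", "amount": 50000},
]

kept, failures = apply_all_or_drop(sample)
print([r["transaction_id"] for r in kept])
print(failures)
```

In this sketch the third row fails two rules simultaneously, so it is dropped once but contributes to two separate failure counters, which mirrors the idea of one collective action with per-rule metrics.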