Implementing Data Engineering Solutions Using Azure Databricks
Question 1
You are the lead data engineer for Contoso. Your Unity Catalog-enabled Azure Databricks workspace must serve four distinct workloads, and you want to follow Databricks' compute selection recommendations to minimize cost and management overhead while meeting each workload's needs.

The four workloads are:

- **Workload 1**: A nightly, scheduled notebook task in a Lakeflow Job that transforms Delta tables and has no special cluster requirements.
- **Workload 2**: A Power BI semantic model and ad-hoc SQL analysts who issue bursty, highly concurrent SQL queries against Unity Catalog tables and need sub-10-second start times.
- **Workload 3**: Interactive, collaborative notebook development where several data scientists need to share one always-running cluster to iterate on RDD-based code and R.
- **Workload 4**: A production Spark Submit (JAR) task that requires custom cluster settings not available in serverless.

For each workload, select the recommended compute type.

```mermaid
flowchart TD
    subgraph Hotspot["Select recommended compute per workload"]
        W1["Workload 1: nightly scheduled notebook task"] --> D1{{"Serverless jobs compute / Serverless SQL warehouse / Classic all-purpose compute"}}
        W2["Workload 2: bursty concurrent BI + ad-hoc SQL"] --> D2{{"Serverless jobs compute / Serverless SQL warehouse / Classic jobs compute"}}
        W3["Workload 3: shared interactive RDD/R dev"] --> D3{{"Serverless SQL warehouse / Classic all-purpose compute / Classic jobs compute"}}
        W4["Workload 4: Spark Submit (JAR) with custom settings"] --> D4{{"Serverless jobs compute / Classic jobs compute / Serverless SQL warehouse"}}
    end
```
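One way to reason through the hotspot is to reduce each workload to the few constraints the scenario states and apply a small decision rule. The Python sketch below is illustrative only: the `Workload` class, its field names, and `recommend_compute` are hypothetical helpers rather than any Databricks API, and the rule assumes that serverless compute does not cover RDD-based code, R, or custom cluster settings, so those needs push a workload onto classic compute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical summary of the constraints stated in the scenario."""
    name: str
    sql_only: bool = False        # BI / ad-hoc SQL queries only
    interactive: bool = False     # shared notebook development
    needs_classic: bool = False   # RDD/R code or custom cluster settings


def recommend_compute(w: Workload) -> str:
    """Pick among the four compute types offered as answer options."""
    if w.sql_only:
        # Bursty, concurrent SQL with fast start times.
        return "Serverless SQL warehouse"
    if w.interactive:
        # Assumption: the only interactive option offered in the question
        # is classic all-purpose compute (needed here for RDD code and R).
        return "Classic all-purpose compute"
    if w.needs_classic:
        # Automated job with requirements serverless does not cover.
        return "Classic jobs compute"
    # Automated job with no special requirements.
    return "Serverless jobs compute"


workloads = [
    Workload("Workload 1: nightly scheduled notebook task"),
    Workload("Workload 2: bursty concurrent BI + ad-hoc SQL", sql_only=True),
    Workload("Workload 3: shared interactive RDD/R dev",
             interactive=True, needs_classic=True),
    Workload("Workload 4: Spark Submit (JAR) with custom settings",
             needs_classic=True),
]

for w in workloads:
    print(f"{w.name} -> {recommend_compute(w)}")
```

The rule checks SQL-only workloads first because a SQL warehouse is the only option in the question that serves BI tools; the remaining branches separate interactive from automated work and then classic from serverless.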