DP-750 Certification Practice Question #1

Question

You are the lead data engineer for Contoso. Your Unity Catalog-enabled Azure Databricks workspace must serve four distinct workloads, and you want to follow Databricks' compute selection recommendations to minimize cost and management overhead while meeting each workload's needs.

The four workloads are:

- **Workload 1** — A nightly, scheduled notebook task in a Lakeflow Job that transforms Delta tables and has no special cluster requirements.
- **Workload 2** — A Power BI semantic model and ad-hoc SQL analysts who issue bursty, highly concurrent SQL queries against Unity Catalog tables and need sub-10-second start times.
- **Workload 3** — Interactive, collaborative notebook development where several data scientists need to share one always-running cluster to iterate on RDD-based code and R.
- **Workload 4** — A production Spark Submit (JAR) task that requires custom cluster settings not available in serverless.

For each workload, select the recommended compute type.

```mermaid
flowchart TD
    subgraph Hotspot["Select recommended compute per workload"]
        W1["Workload 1: nightly scheduled notebook task"] --> D1{{"Serverless jobs compute / Serverless SQL warehouse / Classic all-purpose compute"}}
        W2["Workload 2: bursty concurrent BI + ad-hoc SQL"] --> D2{{"Serverless jobs compute / Serverless SQL warehouse / Classic jobs compute"}}
        W3["Workload 3: shared interactive RDD/R dev"] --> D3{{"Serverless SQL warehouse / Classic all-purpose compute / Classic jobs compute"}}
        W4["Workload 4: Spark Submit (JAR) with custom settings"] --> D4{{"Serverless jobs compute / Classic jobs compute / Serverless SQL warehouse"}}
    end
```

Accepted Answer

- **Workload 1**: Serverless jobs compute. Databricks' compute selection guidance recommends it for most automated (scheduled) workloads because it offers faster start-up, automatic scaling, and lower cost, and this task has no special cluster requirements.
- **Workload 2**: Serverless SQL warehouse. BI and ad-hoc SQL with bursty concurrency and sub-10-second start times are its canonical use case; it typically starts in 2 to 6 seconds and uses Intelligent Workload Management to autoscale.
- **Workload 3**: Classic all-purpose compute. It is the only listed option suited to a shared, always-running interactive development cluster that supports RDD APIs and R.
- **Workload 4**: Classic jobs compute. Spark Submit (JAR) tasks are not supported on serverless; classic jobs compute is the supported option, and it also allows the custom cluster settings the task requires.
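The reasoning above can be read as a small decision procedure. The following is a minimal sketch of that procedure in Python, not a Databricks API: the `Workload` trait flags, the `Compute` enum, and the `recommend` function are all hypothetical names introduced here for illustration, and the rules cover only the four scenarios in this question.

```python
from dataclasses import dataclass
from enum import Enum


class Compute(Enum):
    # The four compute types that appear as answer options in this question.
    SERVERLESS_JOBS = "Serverless jobs compute"
    SERVERLESS_SQL = "Serverless SQL warehouse"
    CLASSIC_ALL_PURPOSE = "Classic all-purpose compute"
    CLASSIC_JOBS = "Classic jobs compute"


@dataclass(frozen=True)
class Workload:
    # Hypothetical trait flags, chosen only to mirror the scenarios above.
    name: str
    automated: bool              # runs as a scheduled or triggered job task
    sql_only: bool = False       # BI / ad-hoc SQL queries only
    needs_classic: bool = False  # Spark Submit, RDD APIs, R, or custom cluster settings


def recommend(w: Workload) -> Compute:
    """Map a workload's traits to the compute type argued for in the answer."""
    if w.sql_only:
        # Bursty, concurrent SQL with fast start-up needs.
        return Compute.SERVERLESS_SQL
    if w.automated:
        # Serverless is the default for jobs unless a classic-only feature is required.
        return Compute.CLASSIC_JOBS if w.needs_classic else Compute.SERVERLESS_JOBS
    if w.needs_classic:
        # Shared interactive development that depends on classic-only features.
        return Compute.CLASSIC_ALL_PURPOSE
    # Interactive work without classic-only needs is outside this question's options.
    raise ValueError(f"No rule in this sketch covers workload {w.name!r}")


WORKLOADS = [
    Workload("Workload 1", automated=True),
    Workload("Workload 2", automated=False, sql_only=True),
    Workload("Workload 3", automated=False, needs_classic=True),
    Workload("Workload 4", automated=True, needs_classic=True),
]

if __name__ == "__main__":
    for w in WORKLOADS:
        print(f"{w.name}: {recommend(w).value}")
```

The order of the checks carries the logic: SQL-only workloads are routed first, then automated tasks default to serverless unless they need a classic-only feature, and classic all-purpose compute is reserved for interactive work with such a requirement.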
