DP-750 Practice Questions — Page 3

Question 21

A `transactions` table in Unity Catalog has a `region` column. A regional analyst group `emea_analysts` must see only rows where `region = 'EMEA'`, while members of the `global_admins` group must see all rows. You must enforce this with a **table-level row filter** so the restriction applies automatically to every query against the table, including queries from a SQL warehouse.

Which approach correctly implements row-level security?

```sql
-- Candidate row filter
CREATE FUNCTION region_filter(region STRING)
RETURN is_account_group_member('global_admins')
  OR region = 'EMEA';

ALTER TABLE transactions SET ROW FILTER region_filter ON (region);
```

A. Create a view `transactions_emea` with `WHERE region = 'EMEA'` and revoke `SELECT` on the base table; row filters cannot reference group membership.
B. Create a SQL UDF that returns a `BOOLEAN` (rows where it returns `FALSE` are excluded), using `is_account_group_member('global_admins') OR region = 'EMEA'`, then apply it with `ALTER TABLE transactions SET ROW FILTER region_filter ON (region)`.
C. Create a column mask on `region` that returns `NULL` for non-EMEA rows; setting a value to `NULL` removes the row from results.
D. Add a `WHERE` clause to the cluster's Spark configuration so every query against `transactions` is rewritten to filter `region = 'EMEA'`.

Question 22

Your team stores a JDBC password for an external SQL Server in an existing Azure Key Vault. A notebook running on an Azure Databricks cluster must read this password at run time without ever hard-coding it. Corporate policy requires that the credential continue to be managed and rotated in Azure Key Vault, not duplicated inside Databricks.

You decide to create an Azure Key Vault-backed secret scope and reference it from the notebook. Which sequence correctly retrieves the password?

```python
# Candidate notebook code
password = dbutils.secrets.get(scope="kv-sqlserver", key="jdbc-password")
```

A. Create an **Azure Key Vault-backed** secret scope (supplying the vault's DNS name and resource ID) at `https://<workspace-url>/#secrets/createScope`, then read the value with `dbutils.secrets.get(scope="kv-sqlserver", key="jdbc-password")`.
B. Create a Databricks-backed scope and run `databricks secrets put-secret kv-sqlserver jdbc-password`, copying the password out of Key Vault into Databricks so it is stored in both places.
C. Mount the Key Vault as a DBFS path and read the secret with `spark.read.text("/mnt/keyvault/jdbc-password")`.
D. Set the password as a cluster environment variable in plain text and read it with `os.environ["JDBC_PASSWORD"]`.
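
For illustration, here is a minimal sketch of how a secret read at run time might feed a JDBC connection. The helper name `build_jdbc_options` and the host, database, and user values are hypothetical; only the scope and key names come from the question. The secret getter is passed in as a parameter so the function can be exercised outside a notebook; inside a notebook one would pass `dbutils.secrets.get`.

```python
from typing import Callable, Dict


def build_jdbc_options(
    get_secret: Callable[..., str],
    host: str,
    database: str,
    user: str,
) -> Dict[str, str]:
    """Assemble Spark JDBC options, fetching the password when called."""
    # The password is looked up at run time and never appears in source code.
    password = get_secret(scope="kv-sqlserver", key="jdbc-password")
    return {
        "url": f"jdbc:sqlserver://{host}:1433;databaseName={database}",
        "user": user,
        "password": password,
    }


if __name__ == "__main__":
    # Stand-in for dbutils.secrets.get so the sketch runs anywhere.
    def fake_get(scope: str, key: str) -> str:
        return f"<secret {scope}/{key}>"

    opts = build_jdbc_options(fake_get, "sql.example.com", "sales", "etl_user")
    print(sorted(opts))
```

In a notebook, the resulting dictionary could be passed to `spark.read.format("jdbc").options(**opts)` together with a `dbtable` option.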

Question 23

An automated CI/CD pipeline must connect to Azure Databricks without any interactive user sign-in and run jobs as a non-human identity. You decide to use **OAuth machine-to-machine (M2M)** authentication (the OAuth 2.0 client credentials flow) with an Azure Databricks service principal, then access data the principal is authorized for in Unity Catalog.

Which TWO actions are required to configure OAuth M2M authentication for the service principal? (Choose TWO.)

A. Create an Azure Databricks OAuth secret for the service principal and configure the client with `client_id` = the service principal's application (client) ID and `client_secret` = the OAuth secret value.
B. Grant the service principal the `SELECT` privilege on the required Unity Catalog data objects and `CAN USE` on the SQL warehouse it connects to.
C. Generate a personal access token (PAT) for a human user and embed it in the pipeline so the service principal can reuse it.
D. Enable browser-based OAuth U2M consent and have an administrator approve each pipeline run interactively.
E. Store the service principal's Microsoft Entra ID client secret as the OAuth secret, because Entra ID secrets are interchangeable with Databricks OAuth secrets for M2M.
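
As background on the client credentials flow named in the question, the sketch below builds (but does not send) a token request using only the Python standard library. The function name `build_token_request` and all argument values are hypothetical. The `/oidc/v1/token` path and the `all-apis` scope reflect my understanding of the Databricks workspace token endpoint and should be treated as assumptions to verify against the official documentation.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(
    workspace_host: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build an OAuth 2.0 client-credentials token request (not sent here)."""
    # Form-encoded body defined by the client credentials grant.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    # Client ID and secret travel as HTTP Basic credentials.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        url=f"https://{workspace_host}/oidc/v1/token",  # assumed endpoint path
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_token_request("example.azuredatabricks.net", "my-id", "my-secret")
    print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would, under these assumptions, return a JSON document containing an access token. No interactive sign-in is involved at any step.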

Question 24

You are configuring Unity Catalog to read and write data in an Azure Data Lake Storage Gen2 account that is protected by a storage firewall. Corporate security forbids storing or rotating any long-lived secrets. You must let Unity Catalog access the storage on behalf of users by using an **Access Connector for Azure Databricks** with a **managed identity**, then surface it as a Unity Catalog storage credential.

Which TWO actions are required to grant the managed identity access to the storage account? (Choose TWO.)

A. Create an Access Connector for Azure Databricks (a first-party Azure resource) configured with a system-assigned or user-assigned managed identity, and note its resource ID for use when creating the storage credential.
B. Assign the `Storage Blob Data Contributor` role on the storage account to the access connector's managed identity using Azure RBAC.
C. Generate a client secret for the managed identity in Microsoft Entra ID and store it in an Azure Key Vault-backed secret scope for rotation.
D. Create a Databricks personal access token and embed it in the storage account's connection string.
E. Add the storage account's account key to a Databricks-backed secret scope and reference it from the storage credential.

Question 25

A Unity Catalog catalog `analytics` contains a schema `analytics.gold`. You must assign least-privilege grants to three principals so each can perform exactly its job and no more:

- `data_engineers` (group): must create and load new tables in `analytics.gold`.

- `report_readers` (group): must read all current and future tables in `analytics.gold` but never write or create.

- `All account users`: must be able to discover that objects exist and view their metadata in Catalog Explorer, without access to the underlying data.

Assume each principal already has `USE CATALOG` on `analytics` and (where relevant) `USE SCHEMA` on `analytics.gold`. For each principal, select the single grant that best matches the requirement.

```mermaid
flowchart LR
    P1["data_engineers"] --> S1["analytics.gold dropdown -> ?"]
    P2["report_readers"] --> S2["analytics.gold dropdown -> ?"]
    P3["All account users"] --> S3["analytics (catalog) dropdown -> ?"]
    subgraph Options
        G1["CREATE TABLE"]
        G2["SELECT"]
        G3["BROWSE"]
        G4["ALL PRIVILEGES"]
        G5["MODIFY"]
    end
```

Question 26

A compliance requirement states that the `national_id` column in the table `hr.people.employees` must be masked so that only members of the `hr_privileged` group can see the raw value; everyone else must see `***-**-****`. A junior engineer proposes the following solution:

> "Grant `SELECT` on the schema `hr.people` to the `analysts` group. Because the grant is scoped to the schema, analysts can read the tables but the sensitive `national_id` column will automatically be masked for them by Unity Catalog's privilege inheritance."

The team then runs:

```sql
GRANT USE CATALOG ON CATALOG hr TO `analysts`;
GRANT USE SCHEMA ON SCHEMA hr.people TO `analysts`;
GRANT SELECT ON SCHEMA hr.people TO `analysts`;
```

**Does this solution meet the goal of masking the `national_id` column for non-privileged users?**

A. Yes
B. No

Question 27

You manage a Unity Catalog catalog named `sales_prod` that backs a self-service analytics environment. Business analysts complain that they cannot understand what the `gold.customer_churn` table and its `risk_score` column represent when they browse the lakehouse. You are asked to enrich the table so that descriptions surface in Catalog Explorer search results and in the workspace search bar, while ensuring that analysts who only have the `BROWSE` privilege (and not `USE CATALOG` or `USE SCHEMA`) can still read the descriptions.

You run the following statements on a SQL warehouse:

```sql
COMMENT ON TABLE sales_prod.gold.customer_churn
  IS 'Daily-refreshed churn predictions per customer, sourced from the ML pipeline.';

ALTER TABLE sales_prod.gold.customer_churn
  ALTER COLUMN risk_score
  COMMENT 'Model churn probability between 0.0 and 1.0; higher means greater churn risk.';
```

Which statement correctly describes the outcome and the privilege requirements?

A. The table comment is applied, but column comments must be set with `COMMENT ON COLUMN`; `ALTER TABLE ... ALTER COLUMN ... COMMENT` is invalid syntax, so the second statement fails.
B. Both comments are applied; any user with the `BROWSE` privilege on the catalog can view them in Catalog Explorer and search, even without `USE CATALOG` or `USE SCHEMA`.
C. Both comments are applied, but they are only visible to users who additionally hold `USE CATALOG` and `USE SCHEMA`; `BROWSE` alone does not expose comments.
D. The comments are stored only in the Delta transaction log and are not indexed by workspace search; analysts must query `DESCRIBE EXTENDED` to read them.

Question 28

Your organization is standardizing data protection across hundreds of tables in a catalog named `finance`. Today, every table that contains a Social Security Number has a separate, manually applied column mask, and new tables are frequently created without one. You want a single, centrally managed mechanism that automatically masks any column classified as an SSN — including columns in tables created in the future — without editing each table.

A governance admin defines the following:

```sql
-- A governed tag taxonomy is defined at the account level: pii = { ssn, email, phone_number }
-- A data steward tags SSN columns with the governed tag value pii = ssn

CREATE OR REPLACE FUNCTION finance.security.redact_ssn(ssn STRING)
RETURNS STRING
RETURN '***-**-****';

CREATE POLICY redact_ssn_policy
ON SCHEMA finance.customers
COLUMN MASK finance.security.redact_ssn
TO `account users`
FOR TABLES
MATCH COLUMNS has_tag_value('pii', 'ssn') AS ssn_col
ON COLUMN ssn_col;
```

Which statement best describes how this attribute-based access control (ABAC) approach behaves?

A. The policy applies only to tables that exist at the moment it is created; tables added later must be re-tagged and the policy must be recreated to cover them.
B. The policy is evaluated dynamically: any current or future table in the `finance.customers` schema whose column carries the governed tag value `pii = ssn` is automatically masked by `redact_ssn`, with no per-table configuration.
C. ABAC policies require ungoverned (workspace-scoped) tags; the account-level governed tag will be ignored and the policy will mask nothing.
D. Because the policy is attached at the schema level, it overrides and disables any table-level column masks, but it cannot itself mask columns based on tags.

Question 29

A data engineer must secure a Unity Catalog table `hr.people.employees` using table-level row filters and column masks (UDF-based). The table has columns including `region`, `email`, `ssn`, and `salary`. For each business requirement, you must choose whether to implement it as a **Row filter** function or a **Column mask** function applied to the table.

The two enforcement mechanisms behave as follows:

```mermaid
flowchart LR
    subgraph T["hr.people.employees"]
        direction TB
        R1["row: region=EU ..."]
        R2["row: region=US ..."]
        R3["row: region=APAC ..."]
    end
    RF["Row filter UDF\n(predicate per row → keep/hide row)"] --> T
    CM["Column mask UDF\n(transform a column's value in returned rows)"] --> T
```
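
The two behaviors in the diagram can be modeled in plain Python as a rough analogy. This is not how Unity Catalog evaluates the UDFs; the helper names `apply_row_filter` and `apply_column_mask` and the sample rows are invented for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def apply_row_filter(rows: List[Row], keep: Callable[[Row], bool]) -> List[Row]:
    """A row filter is a predicate: rows for which it is False disappear."""
    return [row for row in rows if keep(row)]


def apply_column_mask(
    rows: List[Row], column: str, mask: Callable[[object], object]
) -> List[Row]:
    """A column mask rewrites one column's value; the row count is unchanged."""
    return [{**row, column: mask(row[column])} for row in rows]


# Hypothetical sample data using column names from the question.
employees: List[Row] = [
    {"region": "EU", "email": "ana@example.com"},
    {"region": "US", "email": "bo@example.com"},
    {"region": "APAC", "email": "chen@example.com"},
]

apac_only = apply_row_filter(employees, lambda row: row["region"] == "APAC")
masked = apply_column_mask(employees, "email", lambda _value: "REDACTED")

if __name__ == "__main__":
    print(len(apac_only), len(masked))
```

The filter changes how many rows come back, while the mask changes what a column contains in the rows that do come back.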

For each requirement, select **Row filter** or **Column mask**:

```mermaid
flowchart TB
    Q1["Req 1: A US-based analyst group must NOT see any rows where region = 'EU'."]
    Q1 --> O1{"Row filter / Column mask"}
    Q2["Req 2: Non-privileged users must see ssn as '***-**-****' but the rows must still be returned."]
    Q2 --> O2{"Row filter / Column mask"}
    Q3["Req 3: Only members of 'payroll' may see the real salary value; everyone else sees NULL, but all employee rows remain visible."]
    Q3 --> O3{"Row filter / Column mask"}
    Q4["Req 4: A regional manager may only retrieve the subset of employee records whose region matches their own assigned region."]
    Q4 --> O4{"Row filter / Column mask"}
```

Question 30

A compliance team requires that deleted customer records be physically removed from object storage within 30 days, but data engineers also rely on Delta Lake time travel to recover from accidental writes during long-running multi-day ingestion jobs. An engineer proposes running the following on the managed Delta table `crm.gold.customers` every night to aggressively reclaim storage:

```sql
VACUUM crm.gold.customers RETAIN 1 HOURS;
```

Which statement correctly describes the behavior and risk of this command?

A. The command runs as written; `VACUUM` removes files older than 1 hour, which safely reclaims storage with no impact on running jobs or time travel.
B. By default Delta Lake's safety check blocks a retention interval below 7 days; the command raises an error unless `spark.databricks.delta.retentionDurationCheck.enabled` is set to `false`, and even then a 1-hour threshold risks deleting files written by uncommitted long-running jobs and destroys time travel beyond 1 hour.
C. `VACUUM` only deletes Delta transaction log (`_delta_log`) files, so the `RETAIN` value affects log retention but never the data files, making the command harmless.
D. `RETAIN 1 HOURS` is ignored on Unity Catalog managed tables; managed tables always enforce a fixed 30-day retention that cannot be changed.
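
To reason about what a retention threshold means, the sketch below is a deliberately simplified model, assuming that a data file becomes a removal candidate once it is no longer referenced by the current table version and is older than the threshold. The `DataFile` class and `vacuum_candidates` function are hypothetical and do not reproduce Delta Lake's implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class DataFile:
    path: str
    modified: datetime
    referenced_by_current_version: bool


def vacuum_candidates(
    files: List[DataFile], now: datetime, retain_hours: float
) -> List[str]:
    """Return paths of unreferenced files older than the retention cutoff."""
    cutoff = now - timedelta(hours=retain_hours)
    return [
        f.path
        for f in files
        if not f.referenced_by_current_version and f.modified < cutoff
    ]


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    files = [
        DataFile("old-unreferenced.parquet", now - timedelta(hours=3), False),
        DataFile("new-unreferenced.parquet", now - timedelta(minutes=10), False),
        DataFile("old-live.parquet", now - timedelta(days=20), True),
    ]
    print(vacuum_candidates(files, now, retain_hours=1))
    print(vacuum_candidates(files, now, retain_hours=168))
```

In this model, shrinking `retain_hours` widens the set of files that can be removed, which is why the chosen threshold matters for both time travel and in-flight writers.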