Question 39
A data engineer is creating a new Delta table named `events` in Unity Catalog. The table is expected to hold about **300 GB** of data. The previous Hive-based design partitioned the data by `event_date` **and** by high-cardinality `user_id`, which produced hundreds of thousands of tiny files and slow queries. Most analytical queries filter on `event_date` and `country`.

What should you do to optimize the data layout and avoid over-partitioning for this new table?

```sql
-- Proposed table definition (choose the correct layout strategy)
CREATE TABLE analytics.events (
  event_id   BIGINT,
  user_id    BIGINT,
  country    STRING,
  event_date DATE,
  payload    STRING
) <LAYOUT_STRATEGY>;
```

- A. Keep Hive-style partitioning but partition by `event_date` and `user_id` to maximize partition pruning.
- B. Partition by `user_id` only, because it has the highest cardinality and therefore creates the most partitions.
- C. Do not partition the table; instead enable liquid clustering with `CLUSTER BY (event_date, country)` so that the layout can evolve and the small-file and over-partitioning problems are avoided.
- D. Partition by `event_date` and additionally Z-order by `country` and `user_id` on the same column set used for partitioning.
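
For illustration, here is a minimal sketch of how the liquid-clustering layout described in option C could fill the `<LAYOUT_STRATEGY>` placeholder in Databricks SQL. The follow-up `ALTER TABLE` and `OPTIMIZE` statements are not part of the question; they are included only as an assumption about how the clustering keys might be changed and applied later.

```sql
-- Sketch only: option C's layout substituted for <LAYOUT_STRATEGY>.
-- No PARTITIONED BY clause; the clustering keys match the common query filters.
CREATE TABLE analytics.events (
  event_id   BIGINT,
  user_id    BIGINT,
  country    STRING,
  event_date DATE,
  payload    STRING
)
CLUSTER BY (event_date, country);

-- Assumed follow-up: clustering keys can be changed later without recreating the table.
ALTER TABLE analytics.events CLUSTER BY (event_date, country, user_id);

-- Assumed follow-up: OPTIMIZE clusters data that has been written since the last run.
OPTIMIZE analytics.events;
```

By contrast, `CLUSTER BY` cannot be combined with `PARTITIONED BY` or `ZORDER BY` on the same table, and Z-ordering cannot be applied to a column that is already a partition column.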