DP-750 Certification Practice Question #59

Question

You maintain a `silver.sales.monthly_revenue` table with one row per `(product, month, revenue)`. Analysts need a **wide** report with one row per `product` and a separate column per month (`Jan`, `Feb`, `Mar`) holding the summed revenue. Later, a downstream ML pipeline needs the wide table reshaped **back to long** format with one row per `(product, month, revenue)`.

```sql
-- Wide report (pivot)
SELECT * FROM silver.sales.monthly_revenue
PIVOT (
  SUM(revenue) FOR month IN ('Jan', 'Feb', 'Mar')
);
```

Which TWO statements about these reshaping transformations are correct? (Choose TWO.)

Accepted Answer

`PIVOT` rotates the listed values of the pivot column (`month`) into new columns and requires an aggregate (here `SUM(revenue)`) to populate the cells, so statement A is correct. `UNPIVOT` (and the PySpark `DataFrame.unpivot` method, with its alias `melt`) goes in the reverse direction, folding the month columns back into a name column and a value column, so statement B is correct. However, unpivot reverses only the *shape*, not the aggregation: rows that `SUM` collapsed into a single cell come back as a single row, so C is false. Because SQL `UNPIVOT` defaults to `EXCLUDE NULLS`, cells that are `NULL` in the wide table produce no row in the long result unless `INCLUDE NULLS` is specified, so D is false. Denormalizing a fact table by joining in dimension attributes (E) is a legitimate widening technique that trades storage for fewer query-time joins.
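The round-trip behavior behind C and D can be illustrated with a small pure-Python model. This is a sketch of the semantics only, not Spark code: the sample rows and the helper names `pivot_sum` and `unpivot` are hypothetical, and `include_nulls` merely stands in for the SQL `INCLUDE NULLS` option.

```python
from collections import defaultdict

# Hypothetical sample data: product A has two separate 'Jan' rows,
# and product B has no 'Mar' row at all.
long_rows = [
    ("A", "Jan", 100),
    ("A", "Jan", 50),
    ("A", "Feb", 70),
    ("A", "Mar", 30),
    ("B", "Jan", 20),
    ("B", "Feb", 40),
]
MONTHS = ["Jan", "Feb", "Mar"]


def pivot_sum(rows, months):
    """Model of PIVOT(SUM(revenue) FOR month IN (...)): one dict per product,
    one key per month, holding the summed revenue or None if no rows matched."""
    wide = defaultdict(lambda: dict.fromkeys(months))
    for product, month, revenue in rows:
        cells = wide[product]  # every product gets a row, even if all cells stay None
        if month in months:
            cells[month] = revenue if cells[month] is None else cells[month] + revenue
    return dict(wide)


def unpivot(wide, months, include_nulls=False):
    """Model of UNPIVOT: fold month columns back into (product, month, revenue).
    None cells are skipped unless include_nulls is True."""
    long = []
    for product, cells in wide.items():
        for month in months:
            value = cells[month]
            if value is None and not include_nulls:
                continue
            long.append((product, month, value))
    return long


wide = pivot_sum(long_rows, MONTHS)
round_trip = unpivot(wide, MONTHS)

# The two 'Jan' rows for product A come back as one summed row,
# and the empty (B, 'Mar') cell yields no row by default.
print(wide)
print(round_trip)
print(unpivot(wide, MONTHS, include_nulls=True))
```

In this model the original six rows return as five: the aggregation is not undone, and the null cell is dropped unless nulls are explicitly included.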
