Aggregate & Window Functions

Functions available in the Group By (Aggregate) and Window Functions transforms.

Aggregate Functions

Used in the Group By transform to summarize groups of rows.

Group by Region, aggregate Sales with sum:

Region  | sum_Sales
--------|---------
North   | 45000
South   | 32000

Used in the Window Functions transform. These add computed columns based on row position without reducing rows.

Function	Description	Example Output
`row_number`	Sequential numbers	1, 2, 3, 4, 5
`rank`	Rank with gaps for ties	1, 2, 2, 4, 5
`dense_rank`	Rank without gaps	1, 2, 2, 3, 4
`percent_rank`	Percentage rank (0 to 1)	0, 0.25, 0.5, 0.75, 1
`ntile`	Distribute into N equal buckets	1, 1, 2, 2, 3

Function	Description
`lag`	Value from N rows before the current row
`lead`	Value from N rows after the current row
`first_value`	First value in the partition
`last_value`	Last value in the partition

Function	Description
`fill_down`	Replace nulls with the most recent non-null value above
`fill_up`	Replace nulls with the next non-null value below

Order by Date, add lag(Sales, 1):

Date       | Sales | lag_Sales
-----------|-------|----------
2024-01-01 | 100   | null
2024-01-02 | 150   | 100
2024-01-03 | 120   | 150

Order By is required for meaningful results — it determines the row sequence
Partition By restarts the window for each group (like GROUP BY but without collapsing rows)
Lag/Lead accept an offset (default 1) and an optional default value for edge rows
N-Tile distributes rows into the specified number of equal-sized buckets