Latest Record Per Category in Power Query (No Helper Columns

Q: How do I get the most recent record per group in Power Query without helper columns?

Use Table.Group with an "All Rows"-style aggregation, then take the first row of each sorted sub-table. The pattern is Table.Group(Source, {"KeyCol"}, {{"Latest", each Table.First(Table.Sort(_, {{"DateCol", Order.Descending}}))}}), followed by Table.ExpandRecordColumn to flatten the result. No helper columns, no MAXIFS, no sort-then-remove-duplicates.

Q: What is the difference between Table.ExpandRecordColumn and Table.ExpandTableColumn?

Table.ExpandRecordColumn flattens a column of records into normal columns, producing one output column per record field and leaving the row count unchanged. Table.ExpandTableColumn flattens a column of tables, producing one output row per row in the nested table. If your per-group aggregation returns a single row via Table.First, use record expansion. If it returns multiple rows via Table.FirstN or similar, use table expansion.

Q: How do I handle ties when two rows have the same latest date?

Add a secondary sort column to the Table.Sort call inside the aggregation. For example, Table.Sort(_, {{"OrderDate", Order.Descending}, {"OrderID", Order.Descending}}) gives deterministic results when multiple rows share the max date, picking the highest OrderID as a tiebreaker. Without a tiebreak, Table.First returns whichever row happens to sort first, which is not guaranteed to be stable across refreshes.

What you're building

A single Power Query function that takes any table, a grouping column, and a date column, and returns the latest row per group with every column intact. You call it like this:

fnLatestPerGroup(Sales, {"CustomerID"}, "OrderDate")

And you get back the most recent sale per customer, full row, no helper columns, no sort-and-dedupe hack. Drop the function into any workbook, point it at any table, done.

You need a sample table to work with. I'll use a Sales table with OrderID, CustomerID, OrderDate, Amount, and Product. The pattern works on anything shaped the same way - a transaction list, a status log, a timestamped audit trail.

Why not MAXIFS or sort-then-remove-duplicates

The two things people reach for first are both wrong.

MAXIFS gives you the latest date per customer, which is not the thing you asked for. You wanted the full row. To get the full row back with MAXIFS you need INDEX/MATCH layered on top, and at that point you are rebuilding the query engine by hand inside a spreadsheet. Any structural change to the data and the formulas shift with it.

Sort descending, then Remove Duplicates on CustomerID feels clean and usually works. The problem is it relies on Remove Duplicates preserving the sort order for the row it keeps. That is broadly true in Power Query, but not something the documentation guarantees, and definitely not true if someone later reorders the steps. It also breaks quietly - no error, you silently get yesterday's row instead of today's.

Table.Group with an "All Rows" aggregation does this properly. Each group is handled explicitly, the sort is scoped per group, and the pick is deliberate. Wrapped as a function, it becomes a one-line call wherever you need it.

Step 1: Build the core query

Start in the workbook with the Sales table already loaded into Power Query. New blank query, open the Advanced Editor, paste this in:

let Source = Sales, Grouped = Table.Group( Source, {"CustomerID"}, {{"Latest", each Table.First(Table.Sort(_, {{"OrderDate", Order.Descending}}))}} ), FieldsToExpand = List.Difference(Table.ColumnNames(Source), {"CustomerID"}), Expanded = Table.ExpandRecordColumn(Grouped, "Latest", FieldsToExpand) in Expanded

Three things are doing the work here:

each Table.First(Table.Sort(_, ...)) - for each CustomerID group, the sub-table is available as _. Sort that sub-table by OrderDate descending, take the first row. What comes back is a record containing every field of the most recent order.
List.Difference(Table.ColumnNames(Source), {"CustomerID"}) - figures out which columns to expand back out. The grouping key is already in the output as a standalone column, so you do not want to duplicate it.
Table.ExpandRecordColumn - flattens the nested record column back into normal table columns. Not Table.ExpandTableColumn. This trips people up; more on that below.

Run the query. One row per customer, full width, latest order.

Step 2: Convert it to a reusable function

Right-click the query, duplicate it. Rename the copy fnLatestPerGroup. Open the Advanced Editor and wrap the logic in a function signature:

(Source as table, KeyColumns as list, DateColumn as text) as table => let Grouped = Table.Group( Source, KeyColumns, {{"Latest", each Table.First(Table.Sort(_, {{DateColumn, Order.Descending}}))}} ), FieldsToExpand = List.Difference(Table.ColumnNames(Source), KeyColumns), Expanded = Table.ExpandRecordColumn(Grouped, "Latest", FieldsToExpand) in Expanded

Three parameters:

Source - the table you are operating on
KeyColumns - a list, because you might want to group by more than one column (e.g. {"Region", "CustomerID"})
DateColumn - a single text value, because it is going into a dynamic sort specification

Once saved, the function shows up in the Queries pane as a callable query, with a little form in the preview pane that takes each parameter. Useful if anyone else on the team needs to test it without reading the M.

Step 3: Call it from any query

In any query where you want the latest row per group:

let Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content], Latest = fnLatestPerGroup(Source, {"CustomerID"}, "OrderDate") in Latest

That is the whole thing. The original Sales table stays untouched; the new query is the deduplicated latest-per-customer view.

To reuse across workbooks, copy the function query from the Queries pane of one workbook into another (right-click > Copy, paste into the target), or keep the M source in a shared .pq file and paste it into a blank query when needed. Power Query has no first-class function library, which is a real product gap, but copy-paste versions cleanly in Git and is what most teams settle on.

Where people trip up

`Table.ExpandRecordColumn` vs `Table.ExpandTableColumn`

The aggregation returns a record (because Table.First returns one record). If you swap Table.First for something that returns a table - say Table.FirstN(_, 3) to keep the top three per group - you have to switch to Table.ExpandTableColumn. Getting this wrong produces "we cannot convert a value of type Record to type Table," which is cryptic the first time you hit it.

Ties on the date column

If two rows share the same max date, Table.First returns whichever one sorted first, which is not deterministic. If tie-breaking matters, add a secondary sort: Table.Sort(_, {{"OrderDate", Order.Descending}, {"OrderID", Order.Descending}}). Now the highest OrderID wins ties.

Performance on very large tables

Table.Group defaults to GroupKind.Global, which scans the full table to identify groups. If the source is already sorted by the key column and the table is in the millions of rows, pass GroupKind.Local as a fourth argument. It is dramatically faster, but silently wrong if the sort assumption does not hold. I would not bake that into the reusable function; keep it for specific tuning cases.

Query folding

This pattern does not fold to SQL. If you are pulling from a warehouse that supports ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC), do the latest-per-group shape in SQL where it belongs and bring the reshaped result into Power Query. The function is for cases where the data is already in Excel, or where folding was never on the table.

The payoff

You stop rewriting this query every time you need it, and start calling it. Because the function declares its parameter types, the next person who opens the workbook can read the signature and know exactly what it does without tracing through M. That is the shift worth making - from ad-hoc transformations that drift into "just one more workbook" territory, to a small library of queries you actually trust.

Frequently Asked Questions

How do I get the most recent record per group in Power Query without helper columns?

Use Table.Group with an "All Rows"-style aggregation, then take the first row of each sorted sub-table. The pattern is Table.Group(Source, {"KeyCol"}, {{"Latest", each Table.First(Table.Sort(_, {{"DateCol", Order.Descending}}))}}), followed by Table.ExpandRecordColumn to flatten the result. No helper columns, no MAXIFS, no sort-then-remove-duplicates.

What is the difference between Table.ExpandRecordColumn and Table.ExpandTableColumn?

Table.ExpandRecordColumn flattens a column of records into normal columns, producing one output column per record field and leaving the row count unchanged. Table.ExpandTableColumn flattens a column of tables, producing one output row per row in the nested table. If your per-group aggregation returns a single row via Table.First, use record expansion. If it returns multiple rows via Table.FirstN or similar, use table expansion.

Can I reuse a Power Query function across multiple workbooks?

Yes, but not through any built-in library. Copy the function query from one workbook's Queries pane into the next, or keep the M source in a .pq file in source control and paste it into a new blank query when needed. Power Query has no first-class function registry, which is a real product gap; copy-paste with version control is the pragmatic answer.

How do I handle ties when two rows have the same latest date?

Add a secondary sort column to the Table.Sort call inside the aggregation. For example, Table.Sort(_, {{"OrderDate", Order.Descending}, {"OrderID", Order.Descending}}) gives deterministic results when multiple rows share the max date, picking the highest OrderID as a tiebreaker. Without a tiebreak, Table.First returns whichever row happens to sort first, which is not guaranteed to be stable across refreshes.

Why doesn't the Group By dialog let me keep the full row?

The UI only surfaces collapsing aggregations - Sum, Count, Max, and so on - plus a hidden "All Rows" option under Advanced that returns each group's sub-table as a nested column. To actually keep the latest row, you pick "All Rows" and then edit the formula bar to sort the sub-table and take its first row. The dialog does not expose that second step, which is why most people never find this pattern through the UI alone.

Will this work on a large dataset?

For tables in the hundreds of thousands of rows, yes, with no tuning. For millions of rows, the default GroupKind.Global can get slow because it scans the full table to identify groups. Passing GroupKind.Local as a fourth argument to Table.Group makes it dramatically faster, but only if the data is already sorted by the grouping key. If the source is a database that supports window functions, do the latest-per-group logic in SQL and bring the pre-shaped result into Power Query instead.

Tags // Power Query Power BI Efficiency Excel