This is quick post on data bucketing in Power BI and how we can use functions to group data into pre-defined buckets or bands. 

Use-cases might include aged debtors/creditors reports or any dataset that contains numerous data-points that need grouping quickly and efficiently

Sample Dataset:

This sample dataset shows an example of some typical data where we have a supplier name, sales, date and age (days). In this example we are defining the age as the difference in days between the “From” column and the date 26/12/2018 (which just happens to be the day I started writing this post)

The objective is to group the Age column into the following buckets:

  • “< 30 Days”
  • “31 – 60 Days”
  • “61 – 90 Days”
  • “91 -120 Days”
  • “121 – 180 Days”
  • “>180 Days”

It’s tempting, especially when you’re from an excel background, to accomplish this by using a nested VLOOKUP formula within a calculated column such as:

Bucket = IF('Sample Data'[Age] <= 30, "< 30 Days", 
IF('Sample Data'[Age]>30 &amp;&amp; 'Sample Data'[Age] <=60, "31-60 Days",
IF('Sample Data'[Age]>60 &amp;&amp; 'Sample Data'[Age] <=90, "61-90 Days",
IF('Sample Data'[Age]>90 &amp;&amp; 'Sample Data'[Age] <=120, "91-120 Days",
IF('Sample Data'[Age]>120 &amp;&amp; 'Sample Data'[Age] <=180, "121-180 Days",
IF('Sample Data'[Age]>180, "> 180 Days"
))))))

…and although this does work, it’s not the most efficient or time effective. 

The Function

I find the best way to bucket data is by creating a function that does all the work for you.  To create a new function in Power Query you can either right-click the blank area under your queries and go to “New Query” > “Blank Query” or if the navigation menu you can go to Home > New Source > Blank Query

This will create a new blank query called “Query1” . You can rename it to something like AgeBucket.  From here right-click on the query and go to Advanced Editor.  Enter the following code into the editor:

   #Parameter. This function takes a parameter called Age
(Age) =>
   
   #Define Function
let AgeBand =

   #Bucket to use
{
{(x) =>x<31, "<30 Days"},
{(x) =>x<61, "30-60 Days"},
{(x) =>x<91, "61-90 Days"},
{(x) =>x<121, "91-120 Days"},
{(x) =>x<181, "121-180 Days"},
{(x) =>true, ">180 Days"}
},

   #Define the output
Result = List.First(List.Select(AgeBand, each _{0}(Age))){1}
    
in
    Result

Once done you can now use the function in your dataset.

Go to the query containing the data and apply the function by selecting Add Column > Invoke Custom Function from the menu.

In the pop-up window it will ask you to name the new column, select which function you want to use and provide the name of the column to be used as the input parameter (in this example we’re using the Age column).

Once you ok this it will run the function against your data and you will see the new column containing your new bucket grouping

We can the dataset into the report and start using the newly created column to group our data. 

In the screenshot above I have used a simple matrix visual to show how we might use the new column.   One thing you may notice though is the ordering of the buckets.   As the new column is Text format, Power BI will naturally sort the column in that way, which won’t necessarily be ideal.  

Custom Bucket Sort

I find the easiest way to add a manually sorting to your buckets is to enhance the function by adding in an index.

Eg

(Age) =>

let AgeBand =

{

{(x) =>x<31, {"<30 Days",1}},
{(x) =>x<61, {"30-60 Days",2}},
{(x) =>x<91, {"61-90 Days",3}},
{(x) =>x<121, {"91-120 Days",4}},
{(x) =>x<181, {"121-180 Days",5}},
{(x) =>true, {">180 Days",6}}


},

Result = List.First(List.Select(AgeBand, each _{0}(Age))){1}
    
in
    Result

Notice here that we’ve added an index to each line so rather than the first condition returning “<30 Days” it now returns “<30 Days”  and 1.   By also adding in curly braces {}, the output now becomes a LIST.   

{(x) =>x<31, “<30 Days”} becomes {(x) =>x<31, {“<30 Days”,1}}

If we save that query and then go back to the dataset you’ll see that the column changes to a list

From here we just click on the arrow icon in the column header and in the drop-down use Extract Values

Then we can select a delimiter (I prefer to use Comma in most cases)

This will create the following output:


To extract the values we can then simply split the column by right-clicking on the column-header and going to “Split Columns > By Delimiter”

This will create two columns.  One for the bucket name and the other for the bucket order.  

If we load the dataset into the report we can now use the newly created Bucket Order column to order our Buckets.

Select the Bucket column from the fields list and then in the top ribbon go to Modelling > Sort By Column and select Bucket Order.

This will now sort the Bucket column by Bucket Order

Hopefully that provides you a good example and quick walk-through on how to add bucketing to your reports.   As mentioned at the start this can come in very useful when you are wanting to create aged reports such as Aged Debtors or Aged Creditors