DAX 1-Column Fusion

One approach to optimise DAX queries is to focus on the number of Storage Engine (SE) calls generated by a query and look for ways to provide the same result using fewer SE calls. If a query takes 30 seconds to run and makes hundreds of SE calls, reducing the number of SE calls almost always bring the query speed down dramatically. This improvement is especially true when it comes to working with Direct Query data sources.

I wrote an article about an optimisation called DAX Fusion that attempts to fuse similar SE calls when it can. This article highlights an elegant DAX-based trick that works for a specific scenario by reducing the number of SE calls that doesn’t rely on DAX fusion. The difference between 1-Column fusion and the other is:

  • DAX fusion in the engine works across multiple columns that have the same effective WHERE clause
  • 1-Column fusion works by fusing multiple measures that all reference a single column, but with different WHERE clauses

This article is designed to help the following scenario:

Imagine you have a large Fact table with a particular column you need to generate multiple measures per distinct value. One example of this might be a table containing events belonging to a marketing campaign. The ‘Emails’ table has a column called “Response Type” with values indicating what happened to the email, such as:

  • Delivered
  • Bounced
  • Clicked
  • Dropped
  • Blocked
  • Deferred

The visuals you want to use in your report require a measure per response type created, to show a count of rows from the email table for a particular measure such as:

Bounced = 
    CALCULATE(
        COUNTROWS(Emails), 
        Emails[Response Code] = "Bounced"
        )
Clicked = 
    CALCULATE(
        COUNTROWS(Emails), 
        Emails[Response Code] = "Clicked"
        )

Imagine you have six (or more) of these measures and want to use them in a single visual such as a multi-card visual. The traditional approach to solving this would be to create six measures using the DAX pattern shown above – however, this guarantees at least six SE calls will get made during the query. If the data-source is large (and DQ), this could be quite slow.

The trick

The 1-Column fusion approach is to start by creating a measure that contains the base measure. Lets call it :

Base Measure = COUNTROWS(Emails)

Creating the [Base Measure] isn’t critical for 1-Column fusion to work, it just makes it easier to update the core measure in a single place, rather than update multiple sub-measures.

Then, create multiple measures as needed using the following pattern:

Bounced = 
    MAXX(
        ALL('Emails'[Response Type]) ,
        [Base Measure] * ([Response Type] = "Bounced")
        )
Clicked = 
    MAXX(
        ALL('Emails'[Response Type]) ,
        [Base Measure] * ([Response Type] = "Clicked")
        )

And that’s it! Now, when you drag these measures into a Power BI Visual, 1-Column fusion should result in fewer SE calls to the datasource.

Example using Wide World Importers

If you want to try this out for yourself and have a [localhost] copy of the WideWorldImporters sample DW, you can download a copy of my PBIX file here.

Or, you can download a 100% import version of a PBIX file that uses this technique.

Start by adding the following five measures to a two-table model:

Base Measure = 
	COUNTROWS('fact Sale')

Package - Each = 
	MAXX(
        ALLSELECTED('fact Sale'[Package]), 
        [Base Measure] * ('fact Sale'[Package]="Each")
    )

Package - Packet = 
	MAXX(
        ALLSELECTED('fact Sale'[Package]), 
        [Base Measure] * ('fact Sale'[Package]="Packet")
    )

Package - Pair = 
	MAXX(
        ALLSELECTED('fact Sale'[Package]), 
        [Base Measure] * ('fact Sale'[Package]="Pair")
    )

Package - Bag = 
	MAXX(
        ALLSELECTED('fact Sale'[Package]), 
        [Base Measure] * ('fact Sale'[Package]="Bag")
    )

Once the measures get added, they can be used in a single Power BI Visual as shown below:

Now lets look at the server timings in DAX Studio for the DAX query that uses

The server timings show just a single SQL scan was issued to the underlying data source to obtain everything that was required to retrieve the data. For those interested, the SQL for that scan was as follows:

SELECT 
TOP (1000001) [t0].[Package],[t1].[Calendar Year],
COUNT_BIG(*)
 AS [a0]
FROM 
((select [$Table].[Sale Key] as [Sale Key],
    [$Table].[City Key] as [City Key],
    [$Table].[Customer Key] as [Customer Key],
    [$Table].[Bill To Customer Key] as [Bill To Customer Key],
    [$Table].[Stock Item Key] as [Stock Item Key],
    [$Table].[Invoice Date Key] as [Invoice Date Key],
    [$Table].[Delivery Date Key] as [Delivery Date Key],
    [$Table].[Salesperson Key] as [Salesperson Key],
    [$Table].[WWI Invoice ID] as [WWI Invoice ID],
    [$Table].[Description] as [Description],
    [$Table].[Package] as [Package],
    [$Table].[Quantity] as [Quantity],
    [$Table].[Unit Price] as [Unit Price],
    [$Table].[Tax Rate] as [Tax Rate],
    [$Table].[Total Excluding Tax] as [Total Excluding Tax],
    [$Table].[Tax Amount] as [Tax Amount],
    [$Table].[Profit] as [Profit],
    [$Table].[Total Including Tax] as [Total Including Tax],
    [$Table].[Total Dry Items] as [Total Dry Items],
    [$Table].[Total Chiller Items] as [Total Chiller Items],
    [$Table].[Lineage Key] as [Lineage Key]
from [Fact].[Sale] as [$Table]) AS [t0]

 inner join 

(select [$Table].[Date] as [Date],
    [$Table].[Day Number] as [Day Number],
    [$Table].[Day] as [Day],
    [$Table].[Month] as [Month],
    [$Table].[Short Month] as [Short Month],
    [$Table].[Calendar Month Number] as [Calendar Month Number],
    [$Table].[Calendar Month Label] as [Calendar Month Label],
    [$Table].[Calendar Year] as [Calendar Year],
    [$Table].[Calendar Year Label] as [Calendar Year Label],
    [$Table].[Fiscal Month Number] as [Fiscal Month Number],
    [$Table].[Fiscal Month Label] as [Fiscal Month Label],
    [$Table].[Fiscal Year] as [Fiscal Year],
    [$Table].[Fiscal Year Label] as [Fiscal Year Label],
    [$Table].[ISO Week Number] as [ISO Week Number],
    [$Table].[MonthID] as [MonthID]
from [Dimension].[Date] as [$Table]) AS [t1] on 
(
[t0].[Invoice Date Key] = [t1].[Date]
)
)

GROUP BY [t0].[Package],[t1].[Calendar Year] 

Alternatively, for comparison, a more traditional approach to building multiple measures like this produces the following server timings.

The more traditional approach returned the same result in 188 ms compared with just 18ms on the same machine (10x improvement!). Fusing the scans certainly helps the overall query time taken.

The same number of SE scans get generated when working on a 100% import model; however, the difference in overall query time gets reduced because all the data resides in memory.

Fusion of SE scans can only happen for measures using this approach in the same DAX visual. If each measure gets allocated to individual visuals, SE scans cannot be fused.

You could also produce a similar result without multiple measures, by dragging the ‘fact sale'[Package] columns to a visual as a GROUP BY column along with the base measure Still, sometimes this isn’t an option, and you need to split the work out to measures. When this is the case, this approach provides an excellent, elegant way of optimising the underlying SE requirements to satisfy the query.

Summary

As always, perform benchmarks and always test to see if this approach is suited to your needs.

Eagle-eyed DAX aficionados may notice the syntax in these measures looks a little unusual. I promise a deep dive explanation in my next blog why this syntax works.

References and assets.

Here is an article on how to monitor/troubleshoot SE activity for a given DAX query.

Download a 100% import version of a PBIX file using this technique.

Download a Direct Query version of a PBIX file using this technique. This version requires a local-host copy of the Wide World Importers DW.

Download a copy of the WideWorldImporters sample DW.

4.3 6 votes
Article Rating

Leave a Reply

9 Comments
Inline Feedbacks
View all comments
trackback

[…] Phil Seamark has a performance tuning tip for DAX: […]

kre8vdragyn
kre8vdragyn
3 years ago

This 1 column fusion approach I will need to try. Thank you

daniel macias
daniel macias
3 years ago

This is some trick DAX.. I love it! Can’t wait to try it out thanks!!!

Shoaib Ahmed
Shoaib Ahmed
3 years ago

Does the column fusion works if the same column is filtered multiple times in a measure thru variables. Does it happens automatically or we have to define it explicitly as mentioned in the blog.
Secondly, do we have to do SUMX of the base measure is sum(Amount) or is it ALWAYS MAXX?
Thanks

Souvik Bagchi
Souvik Bagchi
2 years ago

Great Analogy, can you also please show the same for SUM in the place of COUNTROWS along with multiple filter conditions…Thanks.

Przemyslaw Remin
Przemyslaw Remin
1 year ago

The funny thing about DAX is that you have to obfuscate your code to make performance acceptable. In an ideal world, you should only care about the brevity of your code and leave the optimization to programming language creators. And the optimization algorithm should read your needs better as the code is short and clear, especially in the simple scenario you mention here.

Zdenek Hanzal
11 months ago

Works perfectly! Great way to optimize performance for some scenarios.

We have another scenario of calculate with multiple filters on different columns, would you see any way with DAX to force the system to do the fusion in such scenario as well?

Like:

measure1
CALCULATE (
sum(‘sales'[EUR_COMP]),
‘sales'[PL_CODE] = “1”,
‘sales'[CUSTOMER] in { “1” , “2” , “3” }
)

measure2
CALCULATE (
sum(‘sales'[EUR_COMP]),
‘sales'[PL_CODE] = “2”,
‘sales'[CUSTOMER] in { “4” , “5” , “6” }
)

Thanks

trackback

[…] Phil Seamark – 2 Part: DAX 1-Column Fusion & using Debug Measures! […]

trackback

[…] Phil Seamark – 2 Part: DAX 1-Column Fusion & using Debug Measures! […]