DAX Fusion

Introduction:

Around 2014, an enhancement was made to the DAX engine designed to help improve query speed. The enhancement, called DAX Fusion, looks to reduce the number of storage engine scans needed to satisfy queries, by combining similar scans where possible.

This optimisation feature exists in Power BI, so can be used to help speed up Power BI models and make reports load faster.

When the underlying database receives a query, a DAX query plan is generated to determine the approach required. This plan is effectively a breakdown of all the sub-tasks required to be fulfilled to produce the result set for the query.

Some of these tasks are known as Storage Engine events, while others are Formula Engine events. Storage Engine events typically involve reading data from the underlying database, while Formula Engine events often perform complicated calculation logic that cannot be efficiently completed by the Storage Engine events.

The PBIX file used for this blog can be downloaded here.

The Storage Engine:

Sometimes the query plan generated by the database specifies that multiple Storage Engine scans are required to obtain the necessary numbers to produce the correct result. Once all the scans are complete, the database engine can sew everything back together into a single result.

I’ve seen plenty of examples where an un-optimised query can generate hundreds of S.E. scans, which is often an indicator for poor performance. A common approach when optimising DAX is to study the query plan for a slow query and see how many S.E. events are taking place. If there is a high number, then this can often be an opportunity to speed up the query by re-writing the DAX query to reduce the overall number of scans required to get executed.

Fusion:

DAX Fusion kicks in while the query plan gets generated. The optimisation looks to see if Storage Engine events can combine to a single event. If it detects enough similarities between any of the Storage Engine events, it fuses these to a single event to reduce the number of trips required.

If your DAX query includes multiple calculations, such as a simple SUM, COUNT, MIN/MAX or AVERAGE etc., over a given column, and also has other SUM, COUNT, MIN/MAX or AVERAGE etc. over a different column in the same table, then these are good candidates to be fused together.

Consider the following DAX calculation that performs several calculations over different columns in a single table. The query has a filter on the Calendar year column.

EVALUATE
  SUMMARIZECOLUMNS(
     //Filter year to 2015
    TREATAS({2015}, 'Dimension Date'[Calendar Year]) ,
    
    "SUM of Quantity", SUM('Fact Sale'[Quantity]) ,
    "MAX of Quantity", MAX('Fact Sale'[Quantity]) ,
    "MIN of Quantity", MIN('Fact Sale'[Quantity]) ,
    "Count of Sales" , COUNTROWS('Fact Sale') ,
    "AVG of Sale"    , SUM('Fact Sale'[Total Including Tax])
  )

Before Fusion, this example would probably have generated at least 5 Storage Engine events. One for each line between line 6 and 10. However, because each of these calculations operate over different columns in the same table AND share the same effective WHERE clause, they can be fused to a single event.

The effect of Fusion in this query is seen in the Server Timing window of DAX studio in several places.

Firstly, the number of scans produced and listed at step (2) is only a single scan – and not five.

Secondly, the pseudo SQL generated by the Storage Engine scan event can be seen in the window at (3) once the scan is selected at step (2). The statement for this scan shows multiple activities taking place in the same SELECT statement. Lines 2 through 6 indicate that DAX fusion has taken place.

SELECT
	SUM ( 'Fact Sale'[Total Including Tax] ), 
	SUM ( 'Fact Sale'[Quantity] ), 
	MAX ( 'Fact Sale'[Quantity] ), 
	MIN ( 'Fact Sale'[Quantity] ), 
	COUNT (  )
FROM 'Fact Sale'
	LEFT OUTER JOIN 'Dimension Date' 
		ON 'Fact Sale'[Delivery Date Key]='Dimension Date'[Date]
WHERE
	'Dimension Date'[Calendar Year] = 2015;

Finally, the last place that can be used to see if Fusion has taken place is to study the logical and physical query plans for additional information.

Breaking Fusion:

To further demonstrate DAX Fusion, the same query can be modified to break Fusion by forcing one of the calculations involved to have a WHERE clause that cannot get combined with other calculations in the same query.

In this case, the MIN calculation at line 6, now has additional filtering that means the S.E. event required to satisfy this part of the query has a WHERE clause that no longer matches other S.E. scan requirements. This updated code is still likely to yield the same numeric result.

EVALUATE
  SUMMARIZECOLUMNS(
    //Filter year to 2015
    TREATAS({2015}, 'Dimension Date'[Calendar Year]) ,
    
    "SUM of Quantity", SUM('Fact Sale'[Quantity]) ,
    "MAX of Quantity", MAX('Fact Sale'[Quantity]) ,
    "MIN of Quantity", CALCULATE(MIN('Fact Sale'[Quantity]),'Fact Sale'[Quantity]<10),
    "Count of Sales" , COUNTROWS('Fact Sale') ,
    "AVG of Sale"	 , SUM('Fact Sale'[Total Including Tax])
  )

The updated query now produces two Storage Engine events as shown in the below image.

The first scan only has 4 items in the SELECT statement, where previously this was 5 items. There is no longer a MIN function in this scan event.

SELECT
	SUM ( 'Fact Sale'[Total Including Tax] ), 
	SUM ( 'Fact Sale'[Quantity] ), 
	MAX ( 'Fact Sale'[Quantity] ), 
	COUNT (  )
FROM 'Fact Sale'
	LEFT OUTER JOIN 'Dimension Date' 
		ON 'Fact Sale'[Delivery Date Key]='Dimension Date'[Date]
WHERE
	'Dimension Date'[Calendar Year] = 2015;

The second scan event only has a single item in the SELECT statement, plus an additional line in the WHERE clause to apply the new logic that gets added to the DAX query.

SELECT
	MIN ( 'Fact Sale'[Quantity] )
FROM 'Fact Sale'
	LEFT OUTER JOIN 'Dimension Date' 
		ON 'Fact Sale'[Delivery Date Key]='Dimension Date'[Date]
WHERE
	'Dimension Date'[Calendar Year] = 2015 
	VAND 'Fact Sale'[Quantity]  <  10  ;

Benefits of Fusion:

In general, DAX fusion should make your queries run faster when it can be applied. Reducing the number of Storage Engine scan events is a good thing, especially in Direct Query mode. DAX Fusion applies to DQ events in the same way it applies to in-memory (VertiPaq) data, so would be particularly effective in DQ model where performance is often more of a concern.

Fusion is also highly effective at speeding up models that use row-based time-intelligence tables. I plan to provide more detail on this in an upcoming blog.

This internal optimisation has improved and evolved over time, so depending on the version of Analysis Services you are using, you may observe slightly different thresholds for when it kicks in.

Summary:

DAX Fusion is not a property, setting or hint that you have direct control over. Either it kicks in, or doesn’t, depending on the way you have written your query. You can only indirectly control Fusion by structuring your query in a way that improves the chances it can fire.

If DAX Fusion does kick in, it has the potential to make a big difference to your query speeds. Faster queries = faster reports = happier end users. 🙂

In Power BI, every visual gets treated as a distinctly different query, so Fusion cannot fuse Storage Engine scan events across visuals. Different optimisations can help in this case, but if you want to take advantage of this feature, you need to use visuals that accept multiple measures (Matrix, Table, Multi-card, etc.).

A common pattern I have seen in Power BI, is the use of multiple single measure card visuals, set in a line with each card-visual showing a slightly different metric. Fusion will not kick in with this approach, but combining the same measures into a table, or matrix visual – and then adjusting the style of the matrix to look like a set of cards can open the door for DAX fusion to be applied if it detects similar S.E. events.

5 14 votes
Article Rating

Leave a Reply

9 Comments
Inline Feedbacks
View all comments
Matt Allington
4 years ago

Very Interesting Phil. Can you explain why you used treatas instead of (say) filter in line 4 of your original code?is this more performant?

Please share more stuff like this.

Vikrenth
Vikrenth
4 years ago
Reply to  Matt Allington

Think TREATAS was generated by a Visual in Power BI and Phil has used it to demonstrate the concept.

willrod36
willrod36
4 years ago

This is a great article and helpful when ‘designing DAX code’ – thanks!

trackback

[…] DAX Fusion (@Phil_Seamark) […]

Hymie Ho
Hymie Ho
4 years ago

Regarding the last paragraph in the article: for an example of how to use a table/matrix visualization instead of multiple cards see Miguel Angel’s 3-part tutorial:
1) https://www.youtube.com/watch?v=Eorqmy_XWTE
2) https://www.youtube.com/watch?v=6t-drD9CWMUa
3) https://www.youtube.com/watch?v=05TP3TvzYRU

trackback

[…] called “DAX Fusion”, which my colleague Phil Seamark blogged about in great detail here, and this can make a signficant difference to performance (Marco Russo’s post on why Analyze […]

Miel de Jong
Miel de Jong
3 years ago

Hi Phil thanks for this excellent article. Do you know if and when IF/SWITCH statements break DAX fusion? I have a fact table with “Actual” and “Budget” columns, i.e. perfect candidate for DAX fusion. My Actual and Budget measures branch out via a SWITCH statement due to UoM conversion (qty, kg, value, etc.). The UoM is selected by the user in a disconnected slicer. Due to this, DAX fusion is not happening (i.e. I get separate SE queries for Actual and Budget)… I cannot figure out a way to make it work other than calculating all the different UoM versions… Read more »

trackback

[…] of the report – we’ve just dramatically improved the underlying processes, by leveraging “DAX fusion” enhancement of the engine! In simple words, the engine was able to generate one single DAX […]