Visualise your Power BI Refresh

Have you ever wondered why a Power BI dataset refresh was taking so long? And more specifically, how much time did the refresh spend on various sub-tasks that aren’t that visible to you via the web-portal?

This article shares a technique you can use to capture events fired during a Power BI refresh and use the results in a Power BI report visualise the results. Have to love the idea of using Power BI to optimise and improve Power BI. 🙂

Imported Power BI models get hosted in an Analysis Services database which emits telemetry for all sorts of interesting activities that take place. There are events related to end-user queries, security events, and of course, events related to dataset refreshes.

Once you have telemetry events, you can easily visualise the results in Power BI to help you understand how you might optimise future refreshes.

The high-level plan for this is:

  1. Connect to Analysis Services and run a trace using SSMS Profiler
  2. Kick off a dataset refresh
  3. Save the trace file as a Trace XML file
  4. Import the trace file into a PBIX from step 3 to see the results (download the PBIX here)

This technique works against a local Power BI Desktop file, against datasets hosted in a Power BI Premium workspace, AAS and local SSAS instances. The method also works with datasets hosted in a Premium Per User workspace, but will not work against a dataset hosted in a Power BI Pro workspace.

Below is a video walk-through of the exercise.

Step 1: Run a Trace

If you don’t already have SSMS (SQL Server Management Studio) installed on your machine, download and install it here: https://aka.ms/ssmsfullsetup

SSMS is a free tool built and maintained by Microsoft to help manage traditional SQL Server and Analysis Services database engines. As part of SSMS, you get a tool called SQL Server Profiler. This tool can be configured to run traces against a database engine to capture and extract interesting telemetry. Traces are manually started/stopped by you and can configure to filter/capture only specific events.

Once SSMS is installed, you start SQL Server Profiler via SSMS and select SQL Server Profiler from the Tools menu (shown below).

Or, you can start SQL Server Profiler from your Windows menu system. I start typing “Profiler” from the Windows 10 start menu until the popup menu shows the app.

Once SQL Server Profiler starts, start a new trace from the File menu.

The next step is to connect to the instance of Analysis Services engine you would like to trace. In the Connect to Server dialog, make sure the Server Type gets set to Analysis Services, rather than the default of Database Engine.

The Server name option is the location of your AS dataset. For connecting to Power BI Desktop, this will be something like “localhost:1234” – where you replace 1234 with the current port number used by Power BI Desktop (this changes each time you close/open Power BI).

Once you have SSMS on your machine, you can follow these helpful steps described by Soheil Bakhshi to register SQL Server Profiler as an external tool in the Power BI Ribbon.

Suppose you want to connect to a dataset in a Power BI Premium workspace. In that case, you will need to make sure the Server name option matches the Workspace Connection property from the workspace settings in the Power BI web-service.

Before you click the connect button, you need to specify which dataset you would like to trace in the workspace. To do this, click the Options button from the Connect to Server dialog.

In the Connection Properties tab, select the <Browse server..> item from the Connect to database drop-down. This action will probably initiate authentication popups, and when successful, you should see a dialog listing all the datasets in the specified workspace. Click the Connect button at the bottom of this dialog when you are ready.

The next screen is where we define which events we want to capture. For this blog, we are only interested in dataset refresh related events. First, make sure the Use the template option is set to Blank then click the Events Selection tab at the top.

In the Events Selection tab, expand and click the Job Graph event along with the Progress Report End event. If the Job Graph event doesn’t appear, make sure you are using the latest version of Power BI Desktop (if you are connecting to Power BI Desktop) and make sure you are on the latest version of SSMS.

Clicking the left-most checkbox on each row automatically checks all the checkboxes in that row.

Finally, click the Run button in the lower right-hand corner to start the trace. If you are connecting to Power BI Premium, this currently can take 20 to 30 seconds to start working.

Kick off a dataset refresh

If you are tracing a Power BI Premium dataset, you can refresh your database the usual way through the web-portal. You can also run a refresh using SSMS (will describe in my next blog) so that you can flexibly pick and choose which tables/partitions get included in the refresh.

Otherwise, if you connect to a Power BI Desktop dataset, just hit the refresh button in Power BI Desktop to initiate the refresh.

If successful, SQL Server Profiler will start to show lines of activity after a short pause. This activity is what we want to import into the Power BI Report we will use to analyse/visualise the refresh.

Save the Trace File

Once you know the refresh is complete, you can pause/stop the trace file and save the output into a local file.

Hopefully, you see a series of Job Graph events in the EventClass category/column.

Be sure to save the trace as a Trace XML file, rather than a Trace File. The Power BI Report used to analyse this data is configured to work with the XML version of the trace data. Please note the location you save the file as this will be needed to tell our Power BI File what to import.

Analyse the Trace File

Once you have the trace file, open the Analyse my Refresh.pbix file and import the data. You will need to open the Power Query Editor and set the location of your saved Trace XML file into the parameter highlighted. Note, you need to include the full path and filename extension into the parameter for this to work.

Once you update this parameter, click Close & Apply to read in the results of your trace and start studying your refresh.

Understanding the results

The Analyse my Refresh PBIX file does not prescribe direct answers to any refresh challenges you have. Instead, it is useful in helping you understand how long each item takes to refresh. Hopefully, the report highlights tables/partitions which are fast to complete, along with those that take the most time.

The chart on

Remember, if a table/partition takes a long time to refresh, this could reflect it has a large number of rows/columns and not necessarily a problem.

The top chart shows each table/object on the Y-axis. The bars on the Gantt chart represent how much time was spent on each step for that object. The blue Execute SQL step shows when the query was issued until the first row gets returned from the data source. The yellow bar shows the time spend by Analysis Services receiving the stream of data and performing tasks such as encoding/compressing into columns.

A long blue bar could highlight some opportunities to performance tune the data source, such as increasing compute or a better index strategy. This timing could also highlight Power BI’s query calling a series of Views on Views on Views. You’d be surprised to see how often I come across this. If you suspect this to be an issue in your case, try temporarily creating physical tables in the data source and see what effect this has on your refresh.

A long yellow bar could mean the table has many rows/columns and I have some tips on how you can address this below. This could also mean the data source is struggling to deliver data. If you are using a gateway, perhaps the gateway machine is under CPU/Memory pressure. Or, you are potentially doing a lot of complex Power Query transformations that might be slowing down the system.

Consider the number of rows read per object by how long it takes and if you still want to improve refresh speed per object.

  • Move the data to a different (faster) data source. e.g. use a Dataflow, or SQL DB instead of an Excel file.
  • Optimise the number of transformations taking place in Power Query
  • Review your Gateway (if relevant)
  • Reduce the number of rows read per table
  • Reduce the number of columns per table
  • Convert tables with large numbers of rows into partitions
    • Must be configured in Power BI Premium using XMLA-RW aware tools such as SSMS, Tabular Editor, SSDT etc. These can get automated with TOM.
    • Cannot be configured using Power BI Desktop
    • Partitions are excellent for large fact tables and allow fine-grain partition loading and DIY incremental refresh
    • Partitions can refresh in parallel – which is a great way to reduce overall time, but ensure your data source can keep up
  • If you double the number of columns in a table, you often double the time each table takes to refresh

The matrix beneath the Gantt chart shows the number or rows processed for each table/partition along with a useful ratio that derives the number of rows per second. The ratio column can be useful to sort by to help figure out which objects under/over perform.

In the example from the image, an Excel table took 37 seconds to process only four rows.

The second report page called Using Slot Data, is exactly the same data as the Refresh Times page. However, in the Using Slot Data report, the Y-axis on Gantt chart is organised into Slots. This is especially interesting for datasets with lots of objects hosted in Azure AS and Power BI Premium. Both systems will limit the number of concurrent slots (or containers) available for the refresh. Once a table/partition has finished processing, the next unprocessed object will begin. In Azure AS you can control the number of slots using the maxParallelism property. Increasing this property does not always lead to faster overall refresh times.

Summary and Notes

If you run a trace against a model hosted by Power BI Desktop, you will notice the AS engine refreshes all table objects/partitions at once. Datasets hosted in Power BI Premium will stagger the objects depending on several factors, but you may only see a handful running simultaneously. Several factors determine the number of slots/containers available to a Power BI Refresh. Some of which are under your control.

In essence, this tool is not designed to solve your Power BI Refresh challenges – but hopefully, it helps you get to an “Aha!” moment when you discover the one or two items causing your refresh to take as long as it does.

Some other easy things to try on a slow dataset are creating a clone without the slowest table(s) and then checking to see how fast a new refresh can complete.

Hopefully, this is helpful. Please feel free to let me know how you get on. I’m always interested to hear if these articles are of use.

Feel free to also tweak and improve the report as you need. Let me know if you come up with improvements worthy of sharing with the wider community.

Thanks

A big thanks to Chris Hamill over at Alluring Analytics | A Power BI Creator Blog (alluringbi.com) for the excellent design for the PBIX file.

Alex Barbeau for helping test drive a version of this with some great improvements.

Adam and Patrick at GuyInACube for inviting me on their live stream to talk a little about this over the weekend.

5 5 votes
Article Rating

Leave a Reply

18 Comments
Inline Feedbacks
View all comments
kre8vdragyn
kre8vdragyn
14 days ago

I don’t have permission to view this content. That’s unfortunate

kre8vdragyn
kre8vdragyn
13 days ago
Reply to  Phil Seamark

Amazing article and I’m so glad I got to read it 😄

David Wilson
14 days ago

Awesome. This has helped so much to narrow down what’s been taking the time during refreshes. Thank you. Phil.

Frank Preusker
Frank Preusker
14 days ago

Thanks Phil for this post – this could give some great insides into refresh performance. However, I’m running into an issue: The “Job Graph Events” section is not shown for me (but the “Progress Reports is there). I’m connected to a PBI Premium workspace (v1) and using SSMS 18.8. Does this operation require additional authorizations, which I potentially not have (I’m the owner of the workspace, but don’t have any PBI admin right whatsoever)?

Frank Preusker
Frank Preusker
13 days ago
Reply to  Phil Seamark

Same thing when connected to PBI Desktop – Job Graph Events section is not available. I’ve installed SSMS on my personal laptop and connected to PBI Desktop and here is available. I need to speak to our IT to re-install SSMS on my company PC and check again afterwards.

Dan Szepesi
Dan Szepesi
14 days ago

Great stuff, thanks for sharing all this!

Chris GL
Chris GL
13 days ago

Hey Phil, really nice so far. I got another error than Frank. 1. For me it says “calculation error in measure ‘Progress Report End'[Rows Read]: DAX comparison operations do not support comparing values of type Text with values of type Integer. Consider using the VALUE or FORMAT function to convert one of the values.” 2. Do you know if the SQL Server Profiler is runnable against a gen2 capacity? I got sth. like a timeout (even with setting the filename under options). For a PBI Desktop file it works fine. 3. Maybe it’s connected with my error under 1) but… Read more »

Lance G
Lance G
13 days ago

Hi Phil,
Awesome stuff.

I appear to be having some issues with TextData from JobGraph results being truncated. I have SSMS SQL results in grid set to unlimited for XML. Are there additional settings that can be configured to prevent this?

Last edited 13 days ago by Lance G
Chris Wagner
13 days ago

This is a real game changer. Thank you Phil!!!!

Lance G
Lance G
13 days ago

@Chris GL

I was able to resolve the error below by ensuring that EventSubclass columns were both Whole Numbers instead of text. I hope this helps you as well.

‘Progress Report End'[Rows Read]: DAX comparison operations do not support comparing values of type Text with values of type Integer.

FrankT
FrankT
13 days ago

Unfortunately, the “Job Graph Events” section is not shown for me (SQL Profiler 18.8, PPU, PBI Admin).

Frank Preusker
Frank Preusker
11 days ago
Reply to  FrankT

After re-installing SSMS it finally worked and I’ve been able to select the Job Graph option. Although, when I’ve started a 2nd trace the Job Graph option disappeared again 😮 but luckily a restart of the SQL Profiler solved it again. *Don’t understand what’s going on with that.* The loading of the XML file into the pbis model worked without issues. The timeline visual is using the Attribute:Slot as Enity, but the values are mostly empty for my trace (only some 00..05 entries). I’ve replaced it with ObjectName, which I believe is also what you were showing in the screenshots… Read more »

Daryl Lynch
Daryl Lynch
9 days ago

Thanks Phil. I tried is the on Dataset that pulled excel data from SharePoint Online, and on Dataset that pulled that data via Dataflow. Both SQL Profiler XML failed to include the Job Graph Events, so the visuals did not distinguish between the individual steps. Any ideas way the Job Graph Events are not be created. I was using SSMS 18.8. Is this potentially another background service that I need running on the Machine. I noticed that the Power BI Desktop was taking a long time to perform the evaluation before starting the extract of data. On the Guy-in-a-Cube webcast,… Read more »

Chandra Reddy
Chandra Reddy
7 days ago

Phil, Thanks for the great tip. I was going through this and refreshed my incremental dataset from the App side instead of SSMS and the trace shows all the data as usual. However, the report says it took 1 minute 35 seconds for duration, where as in the app it shows as 8+ minutes. Also the graphs on the top starts at 05:22 for me, does that mean the refresh was sitting idle for 5+ minutes? and finally when I run the refresh from SSMS, I get below error Error -1052311437: We have moved the session with ID ‘2B43C920-D121-44B8-9594-57357025003D’ to… Read more »

Chandra Reddy
Chandra Reddy
7 days ago

Can you also please help me understand how did you get your partition name as “FactInternetSales-2013”, I am only getting an year in the place of “tablename-year”. In SSAS I used to the name of choice but in Power BI Desktop incremental refresh won’t let me. If I change it via SSMS through XMLA, will it not break anything else?