Visualise your Power BI Refresh

Have you ever wondered why a Power BI dataset refresh was taking so long? And more specifically, how much time did the refresh spend on various sub-tasks that aren’t that visible to you via the web-portal?

This article shares a technique you can use to capture events fired during a Power BI refresh and use the results in a Power BI report visualise the results. Have to love the idea of using Power BI to optimise and improve Power BI. 🙂

Imported Power BI models get hosted in an Analysis Services database which emits telemetry for all sorts of interesting activities that take place. There are events related to end-user queries, security events, and of course, events related to dataset refreshes.

Once you have telemetry events, you can easily visualise the results in Power BI to help you understand how you might optimise future refreshes.

The high-level plan for this is:

  1. Connect to Analysis Services and run a trace using SSMS Profiler
  2. Kick off a dataset refresh
  3. Save the trace file as a Trace XML file
  4. Import the trace file into a PBIX from step 3 to see the results (download the PBIX here)

This technique works against a local Power BI Desktop file, against datasets hosted in a Power BI Premium workspace, AAS and local SSAS instances. The method also works with datasets hosted in a Premium Per User workspace, but will not work against a dataset hosted in a Power BI Pro workspace.

Below is a video walk-through of the exercise.

Step 1: Run a Trace

If you don’t already have SSMS (SQL Server Management Studio) installed on your machine, download and install it here: https://aka.ms/ssmsfullsetup

SSMS is a free tool built and maintained by Microsoft to help manage traditional SQL Server and Analysis Services database engines. As part of SSMS, you get a tool called SQL Server Profiler. This tool can be configured to run traces against a database engine to capture and extract interesting telemetry. Traces are manually started/stopped by you and can configure to filter/capture only specific events.

Once SSMS is installed, you start SQL Server Profiler via SSMS and select SQL Server Profiler from the Tools menu (shown below).

Or, you can start SQL Server Profiler from your Windows menu system. I start typing “Profiler” from the Windows 10 start menu until the popup menu shows the app.

Once SQL Server Profiler starts, start a new trace from the File menu.

The next step is to connect to the instance of Analysis Services engine you would like to trace. In the Connect to Server dialog, make sure the Server Type gets set to Analysis Services, rather than the default of Database Engine.

The Server name option is the location of your AS dataset. For connecting to Power BI Desktop, this will be something like “localhost:1234” – where you replace 1234 with the current port number used by Power BI Desktop (this changes each time you close/open Power BI).

Once you have SSMS on your machine, you can follow these helpful steps described by Soheil Bakhshi to register SQL Server Profiler as an external tool in the Power BI Ribbon.

Suppose you want to connect to a dataset in a Power BI Premium workspace. In that case, you will need to make sure the Server name option matches the Workspace Connection property from the workspace settings in the Power BI web-service.

Before you click the connect button, you need to specify which dataset you would like to trace in the workspace. To do this, click the Options button from the Connect to Server dialog.

In the Connection Properties tab, select the <Browse server..> item from the Connect to database drop-down. This action will probably initiate authentication popups, and when successful, you should see a dialog listing all the datasets in the specified workspace. Click the Connect button at the bottom of this dialog when you are ready.

The next screen is where we define which events we want to capture. For this blog, we are only interested in dataset refresh related events. First, make sure the Use the template option is set to Blank then click the Events Selection tab at the top.

In the Events Selection tab, expand and click the Job Graph event along with the Progress Report End event. If the Job Graph event doesn’t appear, make sure you are using the latest version of Power BI Desktop (if you are connecting to Power BI Desktop) and make sure you are on the latest version of SSMS.

Clicking the left-most checkbox on each row automatically checks all the checkboxes in that row.

Finally, click the Run button in the lower right-hand corner to start the trace. If you are connecting to Power BI Premium, this currently can take 20 to 30 seconds to start working.

Kick off a dataset refresh

If you are tracing a Power BI Premium dataset, you can refresh your database the usual way through the web-portal. You can also run a refresh using SSMS (will describe in my next blog) so that you can flexibly pick and choose which tables/partitions get included in the refresh.

Otherwise, if you connect to a Power BI Desktop dataset, just hit the refresh button in Power BI Desktop to initiate the refresh.

If successful, SQL Server Profiler will start to show lines of activity after a short pause. This activity is what we want to import into the Power BI Report we will use to analyse/visualise the refresh.

Save the Trace File

Once you know the refresh is complete, you can pause/stop the trace file and save the output into a local file.

Hopefully, you see a series of Job Graph events in the EventClass category/column.

Be sure to save the trace as a Trace XML file, rather than a Trace File. The Power BI Report used to analyse this data is configured to work with the XML version of the trace data. Please note the location you save the file as this will be needed to tell our Power BI File what to import.

Analyse the Trace File

Once you have the trace file, open the Analyse my Refresh.pbix file and import the data. You will need to open the Power Query Editor and set the location of your saved Trace XML file into the parameter highlighted. Note, you need to include the full path and filename extension into the parameter for this to work.

Once you update this parameter, click Close & Apply to read in the results of your trace and start studying your refresh.

Understanding the results

The Analyse my Refresh PBIX file does not prescribe direct answers to any refresh challenges you have. Instead, it is useful in helping you understand how long each item takes to refresh. Hopefully, the report highlights tables/partitions which are fast to complete, along with those that take the most time.

The chart on

Remember, if a table/partition takes a long time to refresh, this could reflect it has a large number of rows/columns and not necessarily a problem.

The top chart shows each table/object on the Y-axis. The bars on the Gantt chart represent how much time was spent on each step for that object. The blue Execute SQL step shows when the query was issued until the first row gets returned from the data source. The yellow bar shows the time spend by Analysis Services receiving the stream of data and performing tasks such as encoding/compressing into columns.

A long blue bar could highlight some opportunities to performance tune the data source, such as increasing compute or a better index strategy. This timing could also highlight Power BI’s query calling a series of Views on Views on Views. You’d be surprised to see how often I come across this. If you suspect this to be an issue in your case, try temporarily creating physical tables in the data source and see what effect this has on your refresh.

A long yellow bar could mean the table has many rows/columns and I have some tips on how you can address this below. This could also mean the data source is struggling to deliver data. If you are using a gateway, perhaps the gateway machine is under CPU/Memory pressure. Or, you are potentially doing a lot of complex Power Query transformations that might be slowing down the system.

Consider the number of rows read per object by how long it takes and if you still want to improve refresh speed per object.

  • Move the data to a different (faster) data source. e.g. use a Dataflow, or SQL DB instead of an Excel file.
  • Optimise the number of transformations taking place in Power Query
  • Review your Gateway (if relevant)
  • Reduce the number of rows read per table
  • Reduce the number of columns per table
  • Convert tables with large numbers of rows into partitions
    • Must be configured in Power BI Premium using XMLA-RW aware tools such as SSMS, Tabular Editor, SSDT etc. These can get automated with TOM.
    • Cannot be configured using Power BI Desktop
    • Partitions are excellent for large fact tables and allow fine-grain partition loading and DIY incremental refresh
    • Partitions can refresh in parallel – which is a great way to reduce overall time, but ensure your data source can keep up
  • If you double the number of columns in a table, you often double the time each table takes to refresh

The matrix beneath the Gantt chart shows the number or rows processed for each table/partition along with a useful ratio that derives the number of rows per second. The ratio column can be useful to sort by to help figure out which objects under/over perform.

In the example from the image, an Excel table took 37 seconds to process only four rows.

The second report page called Using Slot Data, is exactly the same data as the Refresh Times page. However, in the Using Slot Data report, the Y-axis on Gantt chart is organised into Slots. This is especially interesting for datasets with lots of objects hosted in Azure AS and Power BI Premium. Both systems will limit the number of concurrent slots (or containers) available for the refresh. Once a table/partition has finished processing, the next unprocessed object will begin. In Azure AS you can control the number of slots using the maxParallelism property. Increasing this property does not always lead to faster overall refresh times.

Summary and Notes

If you run a trace against a model hosted by Power BI Desktop, you will notice the AS engine refreshes all table objects/partitions at once. Datasets hosted in Power BI Premium will stagger the objects depending on several factors, but you may only see a handful running simultaneously. Several factors determine the number of slots/containers available to a Power BI Refresh. Some of which are under your control.

In essence, this tool is not designed to solve your Power BI Refresh challenges – but hopefully, it helps you get to an “Aha!” moment when you discover the one or two items causing your refresh to take as long as it does.

Some other easy things to try on a slow dataset are creating a clone without the slowest table(s) and then checking to see how fast a new refresh can complete.

Hopefully, this is helpful. Please feel free to let me know how you get on. I’m always interested to hear if these articles are of use.

Feel free to also tweak and improve the report as you need. Let me know if you come up with improvements worthy of sharing with the wider community.

Thanks

A big thanks to Chris Hamill over at Alluring Analytics | A Power BI Creator Blog (alluringbi.com) for the excellent design for the PBIX file.

Alex Barbeau for helping test drive a version of this with some great improvements.

Adam and Patrick at GuyInACube for inviting me on their live stream to talk a little about this over the weekend.

5 18 votes
Article Rating

Leave a Reply

49 Comments
Inline Feedbacks
View all comments
kre8vdragyn
kre8vdragyn
2 years ago

I don’t have permission to view this content. That’s unfortunate

kre8vdragyn
kre8vdragyn
2 years ago
Reply to  Phil Seamark

Amazing article and I’m so glad I got to read it 😄

David Wilson
2 years ago

Awesome. This has helped so much to narrow down what’s been taking the time during refreshes. Thank you. Phil.

Frank Preusker
Frank Preusker
2 years ago

Thanks Phil for this post – this could give some great insides into refresh performance. However, I’m running into an issue: The “Job Graph Events” section is not shown for me (but the “Progress Reports is there). I’m connected to a PBI Premium workspace (v1) and using SSMS 18.8. Does this operation require additional authorizations, which I potentially not have (I’m the owner of the workspace, but don’t have any PBI admin right whatsoever)?

Frank Preusker
Frank Preusker
2 years ago
Reply to  Phil Seamark

Same thing when connected to PBI Desktop – Job Graph Events section is not available. I’ve installed SSMS on my personal laptop and connected to PBI Desktop and here is available. I need to speak to our IT to re-install SSMS on my company PC and check again afterwards.

Dan Szepesi
Dan Szepesi
2 years ago

Great stuff, thanks for sharing all this!

Chris GL
Chris GL
2 years ago

Hey Phil, really nice so far. I got another error than Frank. 1. For me it says “calculation error in measure ‘Progress Report End'[Rows Read]: DAX comparison operations do not support comparing values of type Text with values of type Integer. Consider using the VALUE or FORMAT function to convert one of the values.” 2. Do you know if the SQL Server Profiler is runnable against a gen2 capacity? I got sth. like a timeout (even with setting the filename under options). For a PBI Desktop file it works fine. 3. Maybe it’s connected with my error under 1) but… Read more »

Lance G
Lance G
2 years ago

Hi Phil,
Awesome stuff.

I appear to be having some issues with TextData from JobGraph results being truncated. I have SSMS SQL results in grid set to unlimited for XML. Are there additional settings that can be configured to prevent this?

Last edited 2 years ago by Lance G
Chris Wagner
2 years ago

This is a real game changer. Thank you Phil!!!!

Lance G
Lance G
2 years ago

@Chris GL

I was able to resolve the error below by ensuring that EventSubclass columns were both Whole Numbers instead of text. I hope this helps you as well.

‘Progress Report End'[Rows Read]: DAX comparison operations do not support comparing values of type Text with values of type Integer.

FrankT
FrankT
2 years ago

Unfortunately, the “Job Graph Events” section is not shown for me (SQL Profiler 18.8, PPU, PBI Admin).

Frank Preusker
Frank Preusker
2 years ago
Reply to  FrankT

After re-installing SSMS it finally worked and I’ve been able to select the Job Graph option. Although, when I’ve started a 2nd trace the Job Graph option disappeared again 😮 but luckily a restart of the SQL Profiler solved it again. *Don’t understand what’s going on with that.* The loading of the XML file into the pbis model worked without issues. The timeline visual is using the Attribute:Slot as Enity, but the values are mostly empty for my trace (only some 00..05 entries). I’ve replaced it with ObjectName, which I believe is also what you were showing in the screenshots… Read more »

Daryl Lynch
Daryl Lynch
2 years ago

Thanks Phil. I tried is the on Dataset that pulled excel data from SharePoint Online, and on Dataset that pulled that data via Dataflow. Both SQL Profiler XML failed to include the Job Graph Events, so the visuals did not distinguish between the individual steps. Any ideas way the Job Graph Events are not be created. I was using SSMS 18.8. Is this potentially another background service that I need running on the Machine. I noticed that the Power BI Desktop was taking a long time to perform the evaluation before starting the extract of data. On the Guy-in-a-Cube webcast,… Read more »

Chandra Reddy
Chandra Reddy
2 years ago

Phil, Thanks for the great tip. I was going through this and refreshed my incremental dataset from the App side instead of SSMS and the trace shows all the data as usual. However, the report says it took 1 minute 35 seconds for duration, where as in the app it shows as 8+ minutes. Also the graphs on the top starts at 05:22 for me, does that mean the refresh was sitting idle for 5+ minutes? and finally when I run the refresh from SSMS, I get below error Error -1052311437: We have moved the session with ID ‘2B43C920-D121-44B8-9594-57357025003D’ to… Read more »

Chandra Reddy
Chandra Reddy
2 years ago

Can you also please help me understand how did you get your partition name as “FactInternetSales-2013”, I am only getting an year in the place of “tablename-year”. In SSAS I used to the name of choice but in Power BI Desktop incremental refresh won’t let me. If I change it via SSMS through XMLA, will it not break anything else?

Dan Szepesi
Dan Szepesi
2 years ago

Hi Phil – this works great for the refresh but the trace doesn’t pick up anything when you are in the power query editor and refreshing the preview. I have a big ugly fact table that takes forever to load the preview and sometimes completely hangs. i would love to see what it’s doing when its hanging. Is there a way to attach a trace to catch the previews loading or is that done in a different way?

Lance
Lance
2 years ago

Hello All, I seem to be having trouble using the template file to parse out the trace xml file. There seems to be an error within TextData at the Parsed XML step on JobGraph. When looking at the step before I get the following: Applied Step: Extracted Text After Delimiter1 = = Table.TransformColumns(#”Grouped Rows”, {{“TextData”, each Text.AfterDelimiter(_, “>”), type text}}) TextData= <Node Id….. If I understand intent of the steps, the TextData should be starting at instead of the metadata at this step. Has anybody else had this issue or know what could be the cause? Perhaps related to this,… Read more »

trackback

[…] recently wrote an article showing how you can visualise a dataset refresh using Power BI. It was a pretty cool way to show […]

Salty_Bronto
Salty_Bronto
2 years ago

Was wondering if anyone knows a replacement for the “AS Timeline” custom visual used in the template. as it no longer seems to be on Appsource. I tried other gantt charts, but not sure how to configure them to give me anything useful.

trackback

[…] connected to a CSV file stored in ADLSgen2 storage. Using the technique described by Phil Seamark here I was able to visualise the amount of parallelism when the dataset is refreshed in a Premium Per […]

trackback

[…] Seamark has recently written a post on how to Visualise your Power BI Refresh and of course we can apply the same technique of his excellent post to a Tabular […]

Dom
Dom
2 years ago

Great article Phil. Thanks a lot. I have an additional question regarding parallism. I have a trace where at “Using Slot Data” I only see one bar. Does it mean that in this case there is no parallism happening?

Nicolas
Nicolas
2 years ago

Hi, everything sounds great but the Job Graph events do not appear in Events. I’m using the latest SSMS but SSAS is version 14. The tabular model is on premise.

trackback

[…] Visualise your Power BI Refresh – Phil Seamark on DAX […]

Ravi
Ravi
2 years ago

You are just awesome! Thank you for sharing.

phil
phil
2 years ago

Hi Phil,

I try to connect to a ppu workspace.
Is it compatible with this kind of workspace?

Connection is ok, but I got the following error message when i launch the trace:
not exists or not access to the trace object “MicrosoftProfiler Tracexxxx”
on server object “autopremiumhostfrancecentral000-003”

It works fine with dax studio.
SSM version: 18.9.1

Anonymous
Anonymous
2 years ago

I found the solution on this site, we need to choose the dataset when connecting. https://www.fourmoo.com/2021/09/15/how-to-connect-to-a-power-bi-premium-per-user-using-sql-profiler/

MK
MK
1 year ago

Hi Phil,
I would like to download the pbix but after clicking on the link it say “Something went wrong”. Could you verify that the link is correct?

trackback

[…] Visualise your Power BI Refresh […]

Rohit Babber
Rohit Babber
1 year ago

Hi Phil,

When I tried to link my xml file with Provided PBIX file here getting below error:
multiple root elements line 1 position xxxxxx…..
Please note above dataset refresh took around 90 minutes.

When I tried with such dataset which took 5 minutes to refresh then PBIX file is able to parse xml file.

trackback

[…] is the link to the original post on Phil Seamark’s blog, where you’ll find a step-by-step tutorial and PBIX file that you can use as a template for […]

trackback

[…] workspace in the Power BI Service and used the technique described in Phil Seamark’s “Visualise your Power BI refresh” blog post to capture what happened during refresh. The refresh took 44 seconds as as you can […]

Konstantin Volke
Konstantin Volke
1 year ago

That’s a cool report. I was wondering whether it’s possible to also connect directly to a Log Analytics Workspace in Azure to leverage the LogAnalyticsIntegration.

Best regards,
Konstantin

trackback

[…] is the link to the original post on Phil Seamark’s blog, where you’ll find a step-by-step tutorial and PBIX file that you can use as a template for your […]

trackback

[…] here is the hyperlink to the unique publish on Phil Seamark’s weblog, the place you’ll discover a step-by-step tutorial and PBIX file that you need to use as a […]

Chris
Chris
11 months ago

Thanks Phil! This is a phenomenal tool. Does wonders to speed up my troubleshooting process for optimizing refreshes. Occasionally, I notice that the Power BI file hits an error when refreshing some Trace XML files. The error always corresponds to issues with multiple XML root elements. Here’s an example error message: “Xml processing failed. Either the input is invalid or it isn’t supported. (Internal error: There are multiple root elements. Line 1, position 26405.).” I’ve traced this error back to the Parsed XML step of the JobGraph query. The TextData column will contain rows with one or more errors that… Read more »

trackback

[…] Visualise your Power BI Refresh […]

Msom
Msom
9 months ago

Hi Unfortunately the PBIX file source is not responding. Is possbile to get it back online please ?

Marcin
Marcin
9 months ago

link to the pbix is not working, is it possible to reupload?

Darryl
Darryl
8 months ago

I can’t seem to find the Analyse My Refresh.pbix file anymore. Has it been moved from it’s location?

paul
paul
7 months ago

I’m getting an error when I try to download the .pbix file, is there a known issue with this file?

esblo
esblo
6 months ago

I get an error on the “Parsed XML” step in power query for JobGraph. Is the pbix file outdated or have I done something wrong?

Anonymous
Anonymous
1 month ago

2 years on this template worked great right away! I just which I hadn’t had to try so hard to find this functionality via google. Great job!

Anonymous
Anonymous
25 days ago

Hi!

Our datasource is a Orcle database and I am trying to figure out why the first partition never is run with parallelism when refreshing a dataset.

If I run 7 partitions in a table the the first partition runs on its own and then the other 6 in parallelism

If I run 5 partitions in a table the the first partition runs on its own and then the other 4 in parallelism.

Does any one know why not all 5 partitions runs in parallelism in my last run?

Mike Honey
Mike Honey
14 days ago

This looks very useful – thanks Phil for sharing.

My scenario is a Power BI Desktop refresh with a complex set of queries that tends to saturate the CPU. The results from the first refresh on default settings were very confusing – the longest-running queries were often seemingly trivial. Then I re-tried with the Data Load / Parallel loading option set to 1 (disable parallel loading). This gave results that made more sense, and are hopefully more useful to evaluate tuning efforts on individual queries.