
TOP 3 Tips for Faster Power BI Reports

Christopher Wagner • December 28, 2019

Want your Power BI Dashboards to run faster? Follow these Tabular Best Practices.

The best practices for fact/dimension relationships within Tabular models are based on the Kimball data warehouse design. This design has been integrated into the VertiPaq compute engine in a way that maximizes performance over large datasets with many dimensions, and it is usable by business users as well as advanced technical resources.


While the flexibility of the VertiPaq engine enables usage of almost any data model, there are specific techniques within Kimball that can ensure the highest performance and ease of implementation within this compute engine.


TIP #1 INTEGER-BASED COLUMN STORE

The VertiPaq engine is built around and highly optimized for integers. This has led to the following best practices within the largest and highest performing models:

  • Fact tables should contain only integer values
  • All joins are integer-based (recent updates may have tuned the engine so that GUID-based joins perform as effectively; I have not seen this in action)
  • Date joins as integers in the format yyyymmdd are highly optimized and tuned for very complex and sophisticated time-series analytics (more on this later)
  • Minimize the size of string values (even in dimensions) as much as possible: 255 characters is good, 127 is better, 31-63 is better still
  • Note: there are new features and techniques that let you work around these string limitations, but they are very new, proper guidance is not widely known, and their usage is not obvious. Dealing with long strings will get easier in the future.
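As a sketch of the yyyymmdd pattern, a fact table's datetime column can be converted into an integer key with a DAX calculated column (the `Sales` table and `OrderDate` column names here are hypothetical, not from the article):

```dax
-- Hypothetical calculated column on a fact table: derive an integer
-- yyyymmdd DateKey from a date column, for an integer-based
-- relationship to the Date dimension.
DateKey =
YEAR ( Sales[OrderDate] ) * 10000
    + MONTH ( Sales[OrderDate] ) * 100
    + DAY ( Sales[OrderDate] )
```

For example, an order date of 2019-12-28 becomes the integer 20191228. In practice this key is often generated upstream in the warehouse rather than in DAX, so it is compressed once at load time.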


TIP #2 DIMENSIONAL RELATIONS

Dimensions are broken out in memory, and calculations are stored in cache based on the joins; the best performance comes from optimizing this cache. The cache uses the concepts of HOT and COLD for where and how data is stored: the HOT cache provides the fastest responses, while COLD takes a little longer. To maximize what can remain in the HOT cache, follow these best practices:

  • Use integer-based joins
  • Minimize the cardinality of dimensions
  • The highest performance comes from keeping a dimension's cardinality under roughly 127k rows (above this, performance will start to degrade)
  • Move Type 2 (slowly changing) attributes off large dimensions, either onto the fact table or into a separate, smaller dimension
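One common way to cut cardinality, as a sketch (the `Sales[OrderDateTime]` column is a hypothetical example, not from the article), is to split a high-cardinality datetime into separate date and time columns before relating them to dimensions:

```dax
-- Hypothetical calculated columns: a raw datetime can have millions of
-- distinct values, but split apart, the date part has at most ~365
-- distinct values per year and the time part at most 86,400 (per second).
OrderDate = INT ( Sales[OrderDateTime] )

OrderTime = Sales[OrderDateTime] - INT ( Sales[OrderDateTime] )
```

Each resulting column compresses far better than the combined datetime, and each can join to its own small dimension.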


TIP #3 DATE DIMENSION

The Date dimension within the VertiPaq engine has been specifically and highly tuned for time-based analytics, to the point where no other engine offers the same level of dynamic time-based analysis to business units and data scientists alike. If you are unfamiliar with what the Tabular model can do with time-based analytics, I highly recommend checking out the many articles Marco Russo has published on time intelligence (https://www.sqlbi.com/articles/time-intelligence-in-power-bi-desktop/).

  • Base the Date dimension on an integer DateKey join in the format yyyymmdd
  • Dynamic calculations and breakouts come from the features and functions built into the Date dimension
  • A standard DateDim makes enabling these features much easier
  • This DateDim can also be built to minimize data refreshes, partition cubes, and reduce overall cloud spend while improving performance
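A minimal DateDim can be sketched as a DAX calculated table (the date range, column names, and the `Sales[Amount]` measure below are assumptions for illustration, not from the article):

```dax
-- A minimal Date dimension sketch: one contiguous row per day,
-- with an integer yyyymmdd DateKey for the join to fact tables.
DateDim =
ADDCOLUMNS (
    CALENDAR ( DATE ( 2015, 1, 1 ), DATE ( 2025, 12, 31 ) ),
    "DateKey", YEAR ( [Date] ) * 10000 + MONTH ( [Date] ) * 100 + DAY ( [Date] ),
    "Year", YEAR ( [Date] ),
    "MonthNumber", MONTH ( [Date] )
)

-- With DateDim marked as a date table, built-in time intelligence
-- functions work over it, e.g. a year-to-date measure:
Sales YTD = TOTALYTD ( SUM ( Sales[Amount] ), DateDim[Date] )
```

In production, the same table is usually materialized in the warehouse instead, so that it never needs to be recomputed on refresh.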



CHRIS WAGNER, MBA MVP

Analytics Architect, Mentor, Leader, and Visionary

Chris has been working in the Data and Analytics space for nearly 20 years. Chris has dedicated his professional career to making data and information accessible to the masses. A significant component in making data available is continually learning new things and teaching others from these experiences. To help people keep up with this ever-changing landscape, Chris frequently posts on LinkedIn and to this blog.