If you’ve landed on this article from a Google search, we hope you’ve already read Part 1 and Part 2 of this Power BI tutorial series.
When your source data changes, you can refresh the extract by selecting the data source from the Data menu in your BI tool. By default, extracts are fully refreshed: all rows are replaced with new data. As discussed in Part 1 of this series, a full refresh is expensive, because the extract can be large and the operation time-consuming.
Power BI offers an alternative: incremental data refresh. Here, the dataset is updated with each day's sales transactions instead of reloading all the data. You append the rows for the new transactions that occur daily and run a full refresh perhaps once a week. The advantages: refreshes are quicker; they are more reliable, since you don't need to maintain long-running connections to source systems; and resource consumption is reduced, since there is less data to refresh.
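As an illustration only (Power BI implements this internally), the difference between the two refresh modes can be sketched in Python, using a hypothetical in-memory table of sales rows:

```python
from datetime import date

# Hypothetical source table of sales transactions (illustrative only).
warehouse = [
    {"order_date": date(2023, 1, 1), "amount": 100},
    {"order_date": date(2023, 1, 2), "amount": 250},
]

def full_refresh(source):
    # Full refresh: discard the cache and reload every row from the source.
    return list(source)

def incremental_refresh(cached, source, last_refresh):
    # Incremental refresh: keep the cached history and append only the
    # rows newer than the last refresh.
    new_rows = [r for r in source if r["order_date"] > last_refresh]
    return cached + new_rows

cached = full_refresh(warehouse)

# One new daily transaction arrives in the source system.
warehouse.append({"order_date": date(2023, 1, 3), "amount": 75})

# Only the new row is imported, not the whole table.
cached = incremental_refresh(cached, warehouse, date(2023, 1, 2))
print(len(cached))  # 3 rows: two cached plus one new transaction
```

The point is simply that the incremental path touches one new row, while a full refresh would reload everything.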
Incremental refresh policies are defined in Power BI Desktop and applied once the dataset is published to the Power BI service.
The first refresh in the Power BI service may take longer, since it must import the full five years of data, but subsequent refreshes are typically much quicker.
The example below defines a refresh policy that stores five years of data in total and incrementally refreshes the last 10 days of data. If the dataset is refreshed daily, the following steps are carried out for each refresh operation.
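To make the rolling-window behaviour concrete, here is a minimal Python sketch of the date arithmetic such a policy implies (the function and parameter names are hypothetical; Power BI manages the partitions itself):

```python
from datetime import date, timedelta

def plan_refresh(today, store_years=5, refresh_days=10):
    """Sketch of the rolling window behind an incremental refresh policy.

    Rows dated before window_start roll out of the dataset; only the days
    from refresh_start onward are re-imported on each refresh.
    """
    window_start = date(today.year - store_years, today.month, today.day)
    refresh_start = today - timedelta(days=refresh_days)
    return window_start, refresh_start

window_start, refresh_start = plan_refresh(date(2024, 6, 15))
print(window_start)   # 2019-06-15: data older than this is dropped
print(refresh_start)  # 2024-06-05: only these 10 days are re-imported
```

So each daily run drops the oldest expired data and re-imports only the trailing 10-day slice, rather than the full five years.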
Defining these ranges may be all you need, in which case you can skip ahead to the publishing step below. The additional dropdowns are for advanced features.
Incremental refresh of 10 days is far more efficient than a full refresh of 5 years in the Power BI service, and you can improve it further by selecting the 'Detect Data Changes' checkbox and choosing a date/time column used to identify and refresh only the days where the data has changed. This assumes such a column exists in the source system, typically for auditing purposes. Be mindful that it should not be the same column used to partition the data with the RangeStart/RangeEnd parameters. The maximum value of this column is evaluated for each of the periods in the incremental range; if it hasn't changed since the previous refresh, that period doesn't need to be refreshed. In the example below, you can see how this can further reduce the days incrementally refreshed from 10 to 2.
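Conceptually, the comparison works like this Python sketch, using a hypothetical per-day snapshot of the maximum audit-column value (this is not Power BI's implementation, only the check it performs):

```python
from datetime import datetime

def periods_to_refresh(previous_max, current_max):
    """previous_max/current_max map each period (a day, here a string key)
    to the maximum value of the audit column seen for that period.
    A period is refreshed only if its maximum moved since the last run."""
    return [p for p, cur in current_max.items()
            if cur > previous_max.get(p, datetime.min)]

# Hypothetical snapshots of a LastUpdated-style audit column.
previous = {"2024-06-13": datetime(2024, 6, 13, 23, 59),
            "2024-06-14": datetime(2024, 6, 14, 23, 50)}
current  = {"2024-06-13": datetime(2024, 6, 13, 23, 59),   # unchanged -> skip
            "2024-06-14": datetime(2024, 6, 14, 23, 58),   # changed  -> refresh
            "2024-06-15": datetime(2024, 6, 15, 12, 0)}    # new day  -> refresh

print(periods_to_refresh(previous, current))  # ['2024-06-14', '2024-06-15']
```

Only the periods whose audit maximum advanced are re-imported, which is how the refreshed window can shrink from 10 days to 2.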
The current design requires that the column used to detect data changes is retained and cached in memory. You may want to consider one of the following techniques to reduce its cardinality and memory consumption.