Create Unique Field Names Across the Entire Dataset
|
Data pane
|
Although the Power BI software permits columns
to exist which are named the same across tables, that is a poor practice to
allow in a data model. Let’s say that there is a Sales Rep table and a Sales
Manager table, and both have a Sales Region column. Both Sales Regions should
be renamed to be Sales Rep Region, and Sales Manager Region for clarity (or,
even better, renamed in an underlying database view where Power BI retrieves
the data). The general rule to ensure report labels are easy to understand:
the name assigned to each column needs to be self-explanatory on its own
rather than relying on the context of its table.
|
Expose a Field Only Once in the Entire
Dataset
|
Data pane
|
Sometimes a column exists in the data model
more than once. The ideal choice is for the redundant column to not be
imported at all to the dataset. If that isn’t a viable choice for whatever
reason, then make sure the “extra” copy of the column is hidden from report
view. For instance, if a Sales Header ID exists in both the Header and the
Lines tables – make sure it’s hidden in the Lines table and only shown once
in the data model. This way, users won’t accidentally use both columns. Using
both could result in possibly needing to use two separate filters which would
provide a very poor user experience.
|
Hide Fields Not Utilized for Reporting
|
Data pane
|
IDs and surrogate keys are needed for
relationships, but are not useful for reporting. Hiding them simplifies the
data model because there’s less fields shown in the field list. The
consumers of your data model will appreciate the lack of what they’d view as
clutter. (Reminder: if a column is not needed for something – a relationship,
basis for a calculation, sorting, something – then don’t import it at all.)
|
Use Friendly Names for Fields
|
Data pane
|
A field such as “Student Name” is nicer to
look at for reporting than something like “Stdt_nm” which may be how it’s
stored in the source. Since field names impact the naming of column titles,
filters, and slicers, assigning friendly names with business-relevance are
well worth a bit of time investment. Just be consistent with respect to
spaces, casing, and abbreviations (such as Desc or Nbr). Try to only use
acronyms if you are certain all users know the meaning.
|
Set Formatting for All Numeric and Date
Fields
|
Modeling > Formatting
|
It’s no fun to add a field onto a report
that needs to be reformatted every single time. Defining units as whole
numbers, or amounts as currency, is a helpful timesaver on the reporting end.
Don’t forget to also set the comma to assist with readability.
|
Specify Sorting for Columns
|
Modeling > Sort
|
Creating a default sort order is a common
need for certain types of columns. For example, you may need to sort the
Month Name column by the Month Number column. Oftentimes the column serving
as the sort value can be hidden from report view.
|
Set Default Summarization for All Numeric
Columns
|
Modeling > Properties > Default
Summarization
|
The summarization default is sum, but some numeric
columns are more suitable as count, min, or max. For example, high
temperature per day would never be summed for a meaningful value; average is
likely a better choice. Setting this properly allows subtotals and totals to
be presented properly. Summarization for column that aren’t really numeric, such
as Customer Number or Year Number, should be set to “don’t summarize.” More
details: Why the Default Summarization Property in Power BI is So
Important.
|
Create Useful Calculated Columns
|
Modeling > Calculations > New Column
|
Creation of calculated columns (aka derived fields) is vital for enrichment of the data
model. A very simple example of this is names – perhaps the underlying data
source keeps First Name and Last Name in separate columns; you may wish to
derive a Full Name field for reporting purposes which concatenates them
together. Note: there are times when it's preferable to derive columns in the
Query Editor instead of DAX: (a) to take advantage of query folding, (b) to
remove intermediary columns from being loaded to the model, (c) to achieve
faster processing time since DAX columns compute sequentially, or (d) to
potentially achieve a better compression rate. Marco Russo discussed the last
two items in this post. For complex calculations in a larger model, test the
difference in performance.
|
Create Useful Calculated Measures
|
Modeling > Calculations > New Measure
|
Calculated measures (aka explicit measures) are extremely useful to augment
reporting and analysis. Year-to-date and year-over-year are common examples
of calculated measures which rely upon current context for calculation at
runtime. Typically a model starts with common calculations, and continues to
evolve over time as business users identify additional relevant calculations.
|
Decide on Location for Calculated Measures
|
Modeling > Properties > Home Table
|
Some Power BI practitioners prefer to locate
all calculated measures in a specific table just for measures. However, this
results in calculated columns and calculated measures residing in different tables
which might seem confusing to users. If you do choose to use a “home table”
for all measures, or certain critical measures, focus on organization to help
users find them easily.
|
Consolidate Intermediary Calculated Columns
|
N/A
|
Frequently it is easier to understand and
test a complicated calculated column in a series of steps that build upon
each other. The tradeoff for this understandability is additional size and
memory usage. Therefore, you may consider consolidating calculated columns
once testing is complete rather than leaving the separate steps permanently in
the model. If they remain, make certain that any intermediary calculated
columns are hidden from report view. Alternatively, you could handle adding
new columns in the Query Editor (instead of DAX), and delete the intermediary
columns before the last step in the query.
|
Decide on Use of Implicit Measures
|
N/A
|
An implicit measure utilizes the default
summarization setting such as “Sum of Sales Amount.” Some Power BI
practitioners prefer to use explicit calculated measures only, in which case
base numeric columns are hidden. Like many other things, consistency here is
what matters.
|
Set the Data Category
|
Modeling > Properties > Data Category
|
Address-related columns, such as
city/state/zip and so forth, need to be specified in the data model to ensure
geocoding can occur properly. Typically, Power BI is very good about detecting
data categories from the data, but the data category for addresses, URLs and barcodes needs to be verified.
|
Create Columns to Support Query String
Parameters
|
Modeling > Calculations > New Column
|
There are some rules for using URL query string parameters, such as no spaces allowed in the name.
Therefore, there may be circumstances when you need to create some specific
columns for use with query string parameters. In this case, you'll likely
want to name them something similar to 'pStateName' and ensure they are
hidden from report view.
|