New Whitepaper on Planning a Power BI Enterprise Deployment

I'm excited to share that a new technical whitepaper I co-authored with Chris Webb has been published. It's called Planning a Power BI Enterprise Deployment, and it was a fun experience to write something a bit more formal than blog posts. My interest in Power BI lies in how to successfully deploy and manage it, and what the end-to-end story is, especially from the perspective of integration with other data assets in an organization. Power BI has grown into a huge, wide-ranging set of features, so we got a little verbose: the whitepaper comes in at just over 100 pages.

A huge thank you to Chris Webb for inviting me to be his co-author. Chris is not only whip-smart, but a total pleasure to work with. 

Another big thank you to Meagan Longoria for being our tech editor. I like to think of myself as detail-oriented, but I've got nothin' compared to her eagle eye.

We worked primarily with Adam Wilson at Microsoft to get information, so he deserves a thank you as well for fielding the questions that Chris and I peppered him with week after week.

I hope you find the whitepaper to be useful.

Displaying Columns of Metadata in the Azure Portal

This is just a quick tip about displaying columns in the Azure portal. If you haven't been using this little feature, these columns can be helpful.

Here's an example of what part of my portal looks like for virtual machines:

[Image: AzurePortal_ColumnsDisplayed.jpg]

For VMs, there is some useful metadata pertaining to IP addresses and disks. If you've started using tags, those can be displayed too -- tags are really helpful for things like billing categories, environment names, project or system names, team or group names, who owns or supports a resource...you get the idea. If you have multiple subscriptions in your directory, the subscription can be shown as well.

Each Azure resource differs in which columns are available.
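
If you prefer to grab the same metadata with a script, here's a minimal PowerShell sketch (assuming the AzureRM module and an active login via Login-AzureRmAccount; the "Environment" tag name is just an example) that lists VMs with a few of these columns:

```powershell
# List VMs with a few metadata columns, including the value of a
# hypothetical "Environment" tag (adjust the tag name to your own).
Get-AzureRmVM |
    Select-Object Name,
                  ResourceGroupName,
                  Location,
                  @{Name = 'Environment'; Expression = { $_.Tags['Environment'] }} |
    Format-Table -AutoSize
```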


Reusing Datasets Imported to the Power BI Service

I'm a big fan of reusing Power BI datasets whenever possible to minimize the redundant datasets floating around an organization. It means less maintenance, less chance of calculation differences, and fewer data refreshes to schedule. It's also a way to separate who handles report creation from who creates the dataset, calculations, and relationships (typically a far smaller group of people).

Info verified as of: May 20, 2017

In this post I'm referring to datasets which are imported into Power BI (thus requiring a data refresh to update the imported data). A dataset in DirectQuery or SSAS Live Connection mode is already being "reused," so those are useful techniques too -- just not applicable to this particular post.

To reuse an imported dataset, there are three options I'm aware of:

  1. Report in the Power BI Service. This refers to using the web interface for creating a new report separate from the original PBIX file.
  2. Analyze In Excel. This refers to creating an Excel report and can currently be used by anyone with read or edit access to the dataset. Hence, very useful for self-service BI purposes.
  3. Power BI Service Live Connection. This refers to creating a Power BI Desktop report. This option can currently only be used by people with edit permissions on the dataset (so not appropriate for broad self-service BI reporting at this time).

Report in the Power BI Service

In the Power BI Service, if you have edit permissions, the option to create a report is on the Actions menu. It will open up a blank report connected to the dataset:

When you save the new report, it will appear as a separate report even though it shares the exact same underlying dataset. This is a great way to divide up reporting needs among different people while everyone uses the same dataset.

That was the simple one. Next we have...


Analyze In Excel

In the Power BI Service, if you have edit permissions, the option to use Analyze In Excel is on the Actions menu:

[Image: PowerBI_AnalyzeInExcel.jpg]

If you're a read-only user, it's not as prominently displayed; however, it works the same. You can find it through the "Related Content" pane:

The first time, you'll be prompted to download and install an Analysis Services OLE DB driver, which handles connectivity back to the Power BI dataset in the Service:


The next thing to know is that the connection will be stored in a separate .ODC (Office Data Connection) file. You'll want to keep all of your .ODC files in a single location, and have only one .ODC file per connection (this makes it easy to change the data connection info later if you need to).

From there, you can create pivot tables, charts, etc., as you normally would in Excel. The data connection properties inside of Excel will look like this:

If things don't work, you might want to check that Analyze In Excel hasn't been turned off in the Power BI Admin portal (though this Admin portal setting applies only to datasets where the underlying data is SSAS in Live Connection mode):


More info about Analyze In Excel is here: https://powerbi.microsoft.com/en-us/documentation/powerbi-service-analyze-in-excel/ -- the documentation has more details about Excel version requirements, the need for a measure in the dataset, etc.

Note that you can't publish an Analyze In Excel workbook back to the Power BI Service (because workbooks are only supported in Power BI if the data is imported into the workbook itself). Maybe we'll get this feature in the future, because having one place to publish would be very nice.

One more option which is similar but not quite...


Power BI Service Live Connection

For this one, we start inside of Power BI Desktop. As of the time I'm writing this, it's still a preview feature which needs to be enabled first in the Options menu:

[Image: PowerBIServiceLiveConnection_Option.jpg]

To get started, visit the Get Data menu. You can locate the "Power BI Service" option under Online Services:


Here's where things differ a LOT from Analyze In Excel. 

If you are a read-only user, you'll see your list of workspaces. However, you won't see any datasets to choose from. That is because currently you are required to have edit privileges on the dataset in order to use this feature.

However, if you do have edit permissions, you can select the dataset. Then it will open a blank report with a connection back to the dataset:


Note that you cannot edit queries, datasets, or relationships -- just the report. Perfect. That means it behaves exactly like the two options discussed earlier.

However, since edit permissions are currently needed in order to use this feature, it can't be widely used by self-service BI users. I hope it ends up being the equivalent of Analyze In Excel, but for Power BI Desktop instead.

More info about Power BI Service Live Connection is here: https://powerbi.microsoft.com/en-us/blog/connecting-to-datasets-in-the-power-bi-service-from-desktop/.

That's it. You keep thinking about ways to reduce the number of duplicate / similar datasets and I'll be a happy girl.


Deploying Solutions from the Cortana Intelligence Gallery

The Cortana Intelligence Gallery is a site where you can search for solutions, tutorials, experiments, and training for learning the data & analytics tools within Azure. The Gallery is located at: https://gallery.cortanaintelligence.com/.

The Gallery is a community site. Many of the contributions are from Microsoft directly. Individual community members can make contributions to the Gallery as well.

The "Solutions" are particularly interesting. Let's say you've searched and found Data Warehousing and Modern BI on Azure:


Deploying a Solution from the Gallery

What makes these solutions pretty appealing is the "Deploy" button. They're packaged up to deploy all (or most) of the components into your Azure environment. I admit I'd like to see some fine-tuning of this deployment process as it progresses through public preview. Here's a quick rundown of what to expect.

1|Create new deployment:

[Image: CISGallery_Deployment_1.jpg]

The most important thing in step 1 above is that your deployment name ends up being your resource group name. The resource group is created as soon as you click the Create button (so if you change your mind about naming, you'll have to go manually delete the RG). Also note that you're only allowed 9 characters, which makes it hard to implement a good naming convention. (Have I ever mentioned how fond I am of naming conventions?!?)

Resource groups are an incredibly important concept in Azure. They are a way to logically organize related resources which (usually) have the same lifecycle and are managed together. All items within a single resource group can be deployed and managed together via an ARM template. Resource groups can serve as a boundary for security/permissions, and can be used to track the cost of a solution. So, it's extremely important to plan out resource group structure in your real environment. In our situation here, having all of these related resources in one group for testing/learning purposes is perfect.
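
As a quick illustration of working with resource groups directly (a hedged PowerShell sketch using the AzureRM module; the names are hypothetical examples of a naming convention), you can create an RG yourself and later export its ARM template:

```powershell
# Create a resource group with a deliberate, convention-based name.
New-AzureRmResourceGroup -Name 'rg-dwdemo-dev' -Location 'East US 2'

# Once it contains resources, export the resource group's ARM template
# to a local folder for reuse or source control.
Export-AzureRmResourceGroup -ResourceGroupName 'rg-dwdemo-dev' -Path 'C:\Temp'
```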

2|Provide configuration parameters:

[Image: CISGallery_Deployment_2.jpg]

In step 2 above, the only thing we need to specify is a user and password. This will be the server admin for both Azure SQL Database and Azure SQL Data Warehouse which are provisioned. It will use SQL authentication.

As soon as you hit the Next button, the solution begins provisioning.

3|Resource provisioning (automated):

In step 3 above we see the progress. Depending on the type of resource, it may take a little while.

4|Done:

[Image: CISGallery_Deployment_4.jpg]

When provisioning is complete, as shown in step 4 above (partial screenshot), you get a list of what was created and instructions for follow-up steps. For instance, in this solution our next steps are to go and create an Azure Service Principal and then create the Azure Analysis Services model (via a PowerShell script saved in an Azure Automation runbook provided by the solution).
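
If you haven't created a service principal before, here's a minimal PowerShell sketch (AzureRM module; the display name, password, and role are hypothetical -- follow the solution's own instructions for the exact values and permissions it needs). Note that depending on your AzureRM version, -Password may need to be a SecureString:

```powershell
# Create an Azure AD service principal (name and credential are examples only).
$sp = New-AzureRmADServicePrincipal -DisplayName 'cis-solution-sp' -Password 'P@ssw0rdDemo1!'

# Optionally grant it a role, such as Contributor on the deployment's resource group:
New-AzureRmRoleAssignment -RoleDefinitionName 'Contributor' `
    -ServicePrincipalName $sp.ApplicationId `
    -ResourceGroupName '<your deployment resource group>'
```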

They also send an e-mail to confirm the deployment:


If we pop over to the Azure portal and review what was provisioned so far, we see the following:

We had no options along the way for selecting names for resources, so we have a lot of auto-generated suffixes in our resource names. This is OK for purely learning scenarios, but not my preference if we're starting a true project with a pre-configured solution. Following an existing naming convention is impossible with solutions (at this point anyway). A wish-list item I have is for the solution deployment UI to display the proposed names for each resource and let us alter them if desired before provisioning begins.

The deployment also doesn't prompt for which subscription to deploy to (if you have multiple subscriptions like I do). The deployment did go to the subscription I wanted; however, it would be really nice to have that as an explicit selection so it's not just luck.
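
In the meantime, you can at least confirm from PowerShell which subscription a deployment landed in (a minimal sketch; the resource group name is a hypothetical 9-character deployment name, and the subscription property names vary slightly across AzureRM versions):

```powershell
# Loop through subscriptions and look for the deployment's resource group.
foreach ($sub in Get-AzureRmSubscription) {
    Set-AzureRmContext -SubscriptionId $sub.SubscriptionId | Out-Null
    if (Get-AzureRmResourceGroup -Name 'dwdemo001' -ErrorAction SilentlyContinue) {
        "Found in subscription: $($sub.SubscriptionName)"
    }
}
```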

We aren't prompted to select scale levels during deployment. From what I can tell, it chooses the lowest possible scale (I noted that the SQL Data Warehouse was provisioned with 100 DWUs, and the SQLDB had 5 DTUs).

To minimize cost, don't forget to pause what you can (such as the SQL Data Warehouse) when you're not using it. The HDInsight piece of this will be the most expensive, and it cannot be paused, so you might want to learn & experiment with that first, then de-provision HDInsight to save on cost. If you're done with the whole solution, you can just delete the resource group (in which case all resources within it will be deleted permanently).
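
Here's a minimal PowerShell sketch of those cost-control steps (AzureRM module; the resource group, server, and database names are hypothetical placeholders for whatever the solution generated):

```powershell
# Check the current scale of the databases (e.g. DW100 for the warehouse):
Get-AzureRmSqlDatabase -ResourceGroupName 'dwdemo001' -ServerName 'myserver' |
    Select-Object DatabaseName, CurrentServiceObjectiveName

# Pause the SQL Data Warehouse while you're not using it, and resume it later:
Suspend-AzureRmSqlDatabase -ResourceGroupName 'dwdemo001' -ServerName 'myserver' -DatabaseName 'mydw'
Resume-AzureRmSqlDatabase  -ResourceGroupName 'dwdemo001' -ServerName 'myserver' -DatabaseName 'mydw'

# Completely done? Deleting the resource group permanently removes everything in it:
Remove-AzureRmResourceGroup -Name 'dwdemo001' -Force
```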

Referring to Documentation for Deployed Solutions

You can find each of your deployed solutions here: https://start.cortanaintelligence.com/Deployments

From this view, you can refer back to the documentation for a solution deployment (which is the same info presented in Step 4 when it was finished provisioning).

You can also 'Clean up Deployments,' which is a nice feature. The clean-up operation first deletes each individual resource, then deletes the resource group.