How to Reference Azure Storage Files from Cloud Shell

Recently I used Azure Cloud Shell for the first time. This is a quick post to show how I referenced the file share in Azure Storage to communicate with Cloud Shell.

What is Cloud Shell?

Cloud Shell is a lightweight way to run scripts using either Bash or PowerShell. You can run scripts in a browser using the Azure portal or shell.azure.com, with the Azure mobile app, or using the VS Code Azure Account extension. If you have seen the "Try it now" links in Azure documentation pages, that will direct you to use Cloud Shell.

CloudShell_AzurePortal.jpg

The rest of this post focuses on using PowerShell with Cloud Shell.

Finding the File Service Info to Use with Cloud Shell

When you create a Cloud Shell account, you are prompted to also create an Azure Storage account. This gives you a mounted file share in the Azure Files service which is available for all of your Cloud Shell sessions.

You can use Get-CloudDrive to find the information related to the drive for your Cloud Shell account:

Get-CloudDrive.jpg
 

Example of Exporting a File to Azure Files Using Cloud Shell

Using the mount point information we got from Get-CloudDrive shown above, we now have what we need for how to reference the Azure file share that is associated to my Cloud Shell account. For an example, I’m going to export a PBIX file from the Power BI Service (discussed in my previous blog post).

First we need to log in to the Power BI Service. In Cloud Shell, authentication to another service like Power BI is a little different than how we see it within a local client tool - it utilizes a login page with a code provided by Cloud Shell:

Connect-PowerBIServiceAccount.jpg

Now that we are authenticated, we can execute our Export-PowerBIReport cmdlet:

Export-PowerBIReport_CloudShell.jpg

Note in the above PowerShell cmdlet, I referenced the mount point for the Azure file share associated with my Cloud Shell account as:

'/home/melissa/clouddrive/PowerBIExportFiles/ExportTest_V2workspace.pbix'
AzureFiles.jpg

Notice in the above example I used a folder called ‘PowerBIExportFiles’ — note that any folder(s) need to already exist before you can export a file to it. It won’t auto-create for you; if the folder doesn’t exist you’ll get an error that says “Could not find a part of the path.” Using a folder is optional though.

You Might Also Like…

Controlling Data Access in Azure for Administrators and Owners

Getting Started with Azure

PowerShell for Assigning and Querying Tags in Azure

How Permissions Work for a Power BI Service Administrator

A Power BI administrator is a role for managing various aspects of the Power BI Service. This role can be assigned in Office 365. Anyone with Office 365 global admin privileges is also a Power BI administrator by default.

Based on the tests I've been doing, I've observed that users with membership to the Power BI administrator role have two sets of permissions apply:

  • Activities which are scoped across the entire organization

  • Activities for which normal user permissions apply

Within the above 2 categories, I’m thinking there are 4 main types of activities:

  1. Manage tenant settings (always scoped to the organization)

  2. Compile inventory and metadata (can be scoped to the organization)

  3. Manage workspace users (can be scoped to the organization)

  4. Export content from a workspace (relies on user permissions)

PowerBIAdministrator_CategoriesOfResponsibility.jpg

My motivation for laying this out is because initially I expected that all of the PowerShell scripts (including the Export-PowerBIReport cmdlet discussed below) would apply organization-wide. This made me think that the Power BI administrator role had become a highly elevated role which could access all content in the Power BI tenant. However, my initial expectation was wrong in that respect — it turns out that a Power BI administrator can access all metadata but not all of the actual data.

Next let’s briefly review each of the 4 types of activities.

Manage Power BI Tenant Settings

The ability to manage tenant settings in the Power BI Admin Portal, has been in place for some time now. It includes managing things such as:

PowerBIAdminPortalTenantSettings.jpg
  • Tenant settings (for the most part this includes enabling/disabling certain features to influence the user experience and/or govern the system)

  • Capacity

  • Embed codes

  • Organizational custom visuals

The Power BI administrator role cannot be delegated to individual subsets of the organization — it applies to the entire tenant.

The role also cannot be granted in a read-only way. This can make it challenging in a very large organization. For example, let's say you're a large worldwide organization with five main divisions. One of the key Power BI people from division A requests access to the Power BI Admin Portal because they want to be able to view what the settings are. An example I've seen of this is someone thinking that the 'push apps to end users' doesn't work, when really the issue is that it's disabled by default in the tenant settings.

Compile Power BI Inventory and Metadata

With the introduction of the Power BI Management Module, we can more easily run scripts to perform certain activities such as accessing metadata. There are several cmdlets available, for instance: Get-PowerBIDashboard, Get-PowerBIReport, and Get-PowerBIWorkspace.

Here’s an example of a script which looks across the entire tenant (i.e., the organization scope) to find all instances of a report named Product Sales Analysis:

Get-PowerBIReport.jpg

The key point above is that the Power BI administrator can retrieve all metadata like this, including My Workspace for other users. This is actually great because it means a Power BI administrator can put together an inventory of the content in the tenant. If you compare this to usage data from the Office 365 audit log, you could determine what isn’t being used.

Manage Workspace Users

There are PowerShell cmdlets such as Add-PowerBIWorkspaceUser and Remove-PowerBIWorkspaceUser to manage the new type of workspaces (i.e., the V2 ‘new workspace experience’ that is in preview as of this writing in late 2018).

Here is an example of my Power BI administrator account providing member permissions to a colleague:

Add-PowerBIWorkspaceUser.jpg

The interesting part of the above example is that my Power BI administrator account does *not* have any direct permissions to the workspace. However, the organization scope allows it to be done.

Export Power BI Content from Workspaces

There is a PowerShell cmdlet called Export-PowerBIReport which, as the name implies, exports a PBIX from the Power BI Service. This includes the report and the underlying data.

Initially I thought this would also be able to be done with an organization-wide scope, but that’s not true. (Though if that were true, that would open up some interesting scenarios like exporting files to make backups, and/or exporting files to minimize risk that critical data is being delivered from a user's individual workspace…but that’s not possible right now. And if it were, I would want that highly privileged role to be separate somehow from the existing Power BI administrator role.)

Here is an example of exporting one file:

Export-PowerBIReport.jpg

Unlike the previous two examples, Export-PowerBIReport does require the Power BI administrator to have rights to the workspace in order to access the content. This is what I was observing, so I was happy to have it confirmed by Chaz Beck (CodeCyclone) from the Power BI product team when he replied to this GitHub issue. This is a legitimate unauthorized (401) message when a Power BI administrator tries to export a PBIX that resides somewhere that the administrator doesn’t have access to - this includes My Workspace for all other users.

Also, this cmdlet appears to only work with the new workspace experience that is currently in public preview. I get an unauthorized (401) message, even if I’m an admin on the workspace itself, if I try this on a V1 workspace. This is confirmed as a bug on this GitHub issue.

Summary

Hopefully this post saves you some time in determining how permissions apply to the different types of activities that a Power BI administrator can do.

 
PowerBIAdministrator.jpg
 

The scripts shown above are super simplified, not really ready for actual production use, but I kept them simple since the focus of this post is on permissions.

Also, keep in mind that you do *not* have to be a designated Power BI administrator to use the Power BI PowerShell cmdlets — any user can run them related to their own content. However, you *do* need to be a Power BI administrator in order to set the scope to organization (for those cmdlets which support it).

To find additional information:

You Might Also Like

Terminology Check - What is a Power BI App?

Lesson Learned - Keep PowerShell Modules Consistent and Up To Date

Checklist for Finalizing a Model in Power BI Desktop

When Should We Load Relational Data to a Data Lake?

This is a question I get fairly regularly these days: Should we extract relational data and load it to a data lake?

Architecture diagrams, such as the one displayed here from Microsoft, frequently depict all types of data sources going thru the lake:

For certain types of data, writing it to the data lake really is frequently the best choice. This is often true for low latency IoT data, semi-structured data like logs, and varying structures such as social media data. However, the handling of structured data which originates from a relational database is much less clear.

Most data lake technologies store data as files (like csv, json, or parquet). This means that when we extract relational data into a file stored in a data lake, we lose valuable metadata from the database such as data types, constraints, foreign keys, etc. I tend to say that we "de-relationalize" data when we write it to a file in the data lake. If we're going to turn right around and load that data to a relational database destination, is it the right call to write it out to a file in the data lake as an intermediary step?

My current thinking is that this is justified primarily by these situations:

(1) Do you need to keep the history? If you are retaining snapshots as of periodic points in time, having those snapshots accessible in a data lake can be useful.

(2) Will multiple downstream teams, systems, or processes access the data? If there are multiple consumers of the data (and presumably we don't want them accessing the original source directly), then it's possible that providing data access from a data lake might be beneficial.

(3) Do you have a strategy of storing *all* of the organizational data in your data lake? If you've standardized on this type of approach, and (1) or (2) above applies, then it makes sense.

(4) Do you need to provide a subset of data for a specialized use case? If you can't provide access to the source database, but a full replica or another relational DB is overkill, then delivery of data via the data lake can sometimes make sense.

DataLakeQuote.jpg
 

The above list isn’t exhaustive - it always depends. My rule of thumb: Avoid writing relational data to a data lake when it doesn't have value being there. This is an opinion and not everyone agrees with this. Even if we are using our data lake as a staging area for a data warehouse, my opinion is that all relational data doesn't necessarily have to make a pit stop in the data lake except when it's justified to do so.

You Might Also Like…

Zones in a Data Lake

Data Lake Use Cases and Planning Considerations

Controlling Data Access in Azure for Administrators and Owners

ResourceGroup.jpg

Recently a customer expressed concern that an owner of an Azure resource group automatically gains access to the data within the services contained in the resource group. In this case, the customer was specifically referring to data in Azure Data Lake Storage Gen 1 but this concept applies to Azure Storage and some of the other data-oriented services in Azure as well. The customer’s comment prompted me to look into available alternatives. This is by no means a detailed security post…rather, I’m trying to share a few nuggets of what I learned.

Key points to take away from this post:

  • The built-in Owner role is, by design, unrestricted for both management operations as well as data operations. *See option 3 below where it looks like this behavior is changing.*

  • You cannot 'break' the inheritance model of resources in Azure.

  • If resource group owners are not allowed to see data, these are the following choices I’m aware of currently:

  1. Remove owner permissions from the resource group level, or

  2. Isolate the resource into its own resource group so it can be managed separately, or

  3. Investigate using the new "DataActions" and "NotDataActions" properties within RBAC role structure to separate management operations from data operations (currently in preview at the time of this writing)

Default Behavior of an Owner in a Resource Group Allows Access to Data

Let's first cover what happens by default. Say I have a resource group with a way cool name of "SecurityTestRG" and I've granted "Wayne Writer" ownership permissions to the resource group. And within the resource group I have a storage account called "strgtest96." As we'd expect, the owner permissions are inherited by the storage account:

Security_ResourceInherited.jpg

And with that, Wayne Writer is able to view, edit, and delete any of the data that has been uploaded to that storage account by virtue of his owner permissions at the resource group level:

Security_OwnerOfResourceGroup.jpg

Allowing owners unrestricted access to both the management plane and the data plane is by design. However, if you have a group of people who should be able to administer (own) everything contained in the resource group, but you don’t want them to automatically see all the data, what are the options?

Cannot Break the RBAC Inheritance Model in Azure

The first thing you might be inclined to try is to just remove the owner permission for the one resource within the resource group that has data, so that you can specify security for it differently. However, that doesn’t work:

Security_RoleAssignmentsError.jpg

When you try to delete a role assignment that’s been inherited, as shown above, you get an error message: “Inherited role assignments cannot be removed. Please open up the associated resource and remove the role assignments from there.

In Azure’s RBAC model, we can add additional permissions at lower levels (i.e., like for a resource itself within the resource group), but we cannot remove an assignment that’s been inherited.

An additional note about subscription level owners & administrators: Any owner you specify at the subscription level will be inherited by all resource groups and resources across the subscription—it’s a very high privilege role. The same inheritance behavior also applies to the Azure service administrator (but not the other co-administrators—they would have to be specified as subscription owners).

Options for Handling Security Differently

Option 1. You could remove the owner permission from the resource group (that is the suggestion presented within the error message shown above). However, let’s say there are 10 resources in this resource group and the other 9 of them can inherit the owner permission without issue. In that case, do we really want to have to manage security for every single resource separately? That might introduce risk and inconsistency. Though if there aren’t very many resources within a resource group, that might work ok.

Option 2. You could isolate the resource into its own ‘sensitive’ resource group so it can be managed separately. This allows the other 9 resources to stay in the main resource group with normal RBAC, and just segregates this one resource. Depending on how you handle automation and deployments, having separate resource groups could add complications (because the general rule for resource groups is to group resources together that are related & have the same lifecycle). Also, it still precludes you from being able to own the resource yet not see data—this segregation would work if contributor or a custom role meets your needs, and if you don’t have subscription-level owners specified which also inherit no matter what.

Option 3. There is new functionality in preview that looks promising which will allow us to handle security for management operations and data operations separately, as shown here:

 

The above scenario indicates that Alice, a high level owner, doesn’t actually see the data. This is still in preview and I haven’t gotten it to work correctly yet, but I’m keeping an eye on this. You can find more info here: https://docs.microsoft.com/en-us/azure/role-based-access-control/role-definitions. I’m guessing this security enhancement is inspired by GDPR and/or customer needs for more granularity of permissions, so I’m excited to see where this is headed.

Finding More Information

Azure Storage Security Guide <—This is a good read re: management plane and data plane

What is Role-Based Access Control (RBAC)?

Understand Role Definitions