Deploying Solutions from the Cortana Intelligence Gallery

The Cortana Intelligence Gallery is a site where you can search for solutions, tutorials, experiments, and training for learning the data & analytics tools within Azure. The Gallery can be located at: https://gallery.cortanaintelligence.com/.

The Gallery is a community site. Many of the contributions are from Microsoft directly. Individual community members can make contributions to the Gallery as well.

The "Solutions" are particularly interesting. Let's say you've searched and found Data Warehousing and Modern BI on Azure:

 

Deploying a Solution from the Gallery

What makes these solutions pretty appealing is the "Deploy" button. They're packaged up to deploy all (or most) of the components into your Azure environment. I admit I'd like to see some fine-tuning of this deployment process as it progresses through the public preview. Here's a quick rundown of what to expect.

1|Create new deployment:

CISGallery_Deployment_1.jpg

The most important thing in step 1 above is that your deployment name ends up being your resource group. The resource group is created as soon as you click the Create button (so if you change your mind on naming, you'll have to go manually delete the RG). Also note that you're only allowed 9 characters, which makes it hard to implement a good naming convention. (Have I ever mentioned how fond I am of naming conventions?!?)
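If you want to keep some kind of naming convention despite the 9-character limit, it can help to check candidate names before typing them into the deployment form. Here is a minimal Python sketch of that idea. The function name and the workload/environment codes are hypothetical, and the only rule it enforces is the length limit mentioned above (the Gallery may apply other rules I'm not aware of).

```python
# Limit reported in this post for the deployment name
# (which also becomes the resource group name).
MAX_DEPLOYMENT_NAME_LENGTH = 9


def build_deployment_name(workload: str, environment: str) -> str:
    """Combine two short codes into a deployment name, enforcing the length limit."""
    # Hypothetical convention: <workload code><environment code>, lower case.
    name = f"{workload}{environment}".lower()
    if len(name) > MAX_DEPLOYMENT_NAME_LENGTH:
        raise ValueError(
            f"'{name}' is {len(name)} characters; the limit is {MAX_DEPLOYMENT_NAME_LENGTH}"
        )
    return name


print(build_deployment_name("dwbi", "dev"))  # prints "dwbidev"
```

Failing early like this is cheaper than deleting a resource group that was created with the wrong name.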

Resource groups are an incredibly important concept in Azure. They are a way to logically organize related resources which (usually) have the same lifecycle and are managed together. All items within a single resource group are included in an ARM template. Resource groups can serve as a boundary for security/permissions at the RG level, and can be used to track the cost of a solution. So, it's extremely important to plan out resource group structure in your real environment. In our situation here, having all of these related resources for testing/learning purposes is perfect.

2|Provide configuration parameters:

CISGallery_Deployment_2.jpg

In step 2 above, the only thing we need to specify is a user and password. This will be the server admin for both Azure SQL Database and Azure SQL Data Warehouse which are provisioned. It will use SQL authentication.

As soon as you hit the Next button, the solution begins provisioning.

3|Resource provisioning (automated):

In step 3 above we see the progress. Depending on the type of resource, it may take a little while.

4|Done:

CISGallery_Deployment_4.jpg

When provisioning is complete, as shown in step 4 above (partial screenshot), you get a list of what was created and instructions for follow-up steps. For instance, in this solution our next steps are to go and create an Azure Service Principal and then create the Azure Analysis Services model (via a PowerShell script saved in an Azure runbook provided by the solution).

They also send an e-mail to confirm the deployment:

 

If we pop over to the Azure portal and review what was provisioned so far, we see the following:

We had no options along the way for selecting names for resources, so we have a lot of auto-generated suffixes for our resource names. This is ok for purely learning scenarios, but not my preference if we're starting a true project with a pre-configured solution. Following an existing naming convention is impossible with solutions (at this point anyway). A wish list item I have is for the solution deployment UI to display the proposed names for each resource and let us alter if desired before the provisioning begins.

The deployment also doesn't prompt for which subscription to deploy to (if you have multiple subscriptions like I do). The deployment did go to the subscription I wanted; however, it would be really nice to have that as a selection to make sure it's not just luck.

We aren't prompted to select scale levels during deployment. From what I can tell, it chooses the lowest possible scale (I noted that the SQL Data Warehouse was provisioned with 100 DWUs, and the SQLDB had 5 DTUs).

To minimize cost, don't forget to pause what you can (such as the SQL Data Warehouse) when you're not using it. The HDInsight piece of this will be the most expensive, and it cannot be paused, so you might want to learn & experiment with that first then de-provision HDInsight in order to save on cost. If you're done with the whole solution, you can just delete the resource group (in which case all resources within it will be deleted permanently).

Referring to Documentation for Deployed Solutions

You can find each of your deployed solutions here: https://start.cortanaintelligence.com/Deployments

From this view, you can refer back to the documentation for a solution deployment (which is the same info presented in Step 4 when it was finished provisioning).

You can also 'Clean up Deployments' which is a nice feature. The clean up operation first deletes each individual resource, then it deletes the resource group:

Why Some Azure VM Sizes are Unavailable When Resizing in the Portal

Just a quick tip about why you might notice some sizes are not available when you are attempting to change the size/scale level of an Azure virtual machine in the portal.

I wanted to change one of my Development VMs to a DS12_v2, but that choice wasn't available:

It didn't immediately dawn on me why it wasn't available, so I thought I'd try PowerShell:

PowerShell returned an error (shown above) that told me the specific problem. I had 9 disks attached to the existing VM, which exceeded the 8 able to be attached to the VM that I wanted. I haven't converted to using managed disks yet. So, once I reconfigured my connected storage accounts a bit, the VM size I wanted became available:


Other common reasons you might see a size unavailable:
- That machine isn't available in your selected region
- Not supported by your hardware or cluster (more info: https://azure.microsoft.com/en-us/blog/resize-virtual-machines/)

As a sidenote, if the new VM configuration adds cores that exceed the number you are allocated, then you will have to request more cores from support. From what I've seen, support responds to these requests very quickly.
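The two numeric checks described in this post (attached disk count and core allocation) can be sketched as a small pre-check. This is only an illustration in Python, not how the portal actually decides: the class, the function, and the quota figures are hypothetical, and the limit of 8 disks for DS12_v2 is simply the number from the error message I received.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VmSizeLimits:
    """Limits of a target VM size (values supplied by the caller)."""
    name: str
    cores: int
    max_attached_disks: int


def resize_blockers(attached_disks: int, current_vm_cores: int,
                    cores_in_use: int, core_quota: int,
                    target: VmSizeLimits) -> List[str]:
    """Return the reasons (if any) why a resize to `target` would be refused."""
    blockers = []
    # Check 1: the VM must not have more disks attached than the target allows.
    if attached_disks > target.max_attached_disks:
        blockers.append(
            f"{attached_disks} disks attached; {target.name} allows {target.max_attached_disks}"
        )
    # Check 2: swapping the VM's current cores for the target's must fit the quota.
    projected_cores = cores_in_use - current_vm_cores + target.cores
    if projected_cores > core_quota:
        blockers.append(f"resize needs {projected_cores} cores; quota is {core_quota}")
    return blockers


# 8 disks is the limit quoted in the post; the quota numbers are made up.
ds12_v2 = VmSizeLimits(name="DS12_v2", cores=4, max_attached_disks=8)
print(resize_blockers(attached_disks=9, current_vm_cores=2,
                      cores_in_use=10, core_quota=20, target=ds12_v2))
```

With 9 disks attached the function reports the disk-count blocker; once the count drops to 8 or fewer, the list comes back empty.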


Querying Documents With Different Structures in Azure DocumentDB

This is a quick post to share how we can use the coalesce operator in Azure DocumentDB (which is a schema-free, NoSQL database) to handle situations when the data structure varies from file to file. Varying data structure is a common issue in big data and analytics projects. A schema-free database like DocumentDB allows us to ingest and store the data with varying structures without a lot of upfront effort. However, accommodating these varying data structures is challenging later when we want to analyze the data. When querying the data (think Schema on Read here), I do need to impose a consistent structure on the data to perform analytics.

Following is a highly simplified example which shows how I have a PrevVal in one document, but not the other:

 

Here are the results if I do a simple select statement in the Azure DocumentDB Query Explorer:

See how with the simple select above I get the PrevVal (aliased to PreviousDataValue) property returned (in results on the right) for Document1 but not Document2? No big surprise there of course. However, what I really want is a "standardized" structure that is consistent across all documents so that I can perform analytics on the data, and potentially store the valuable data in my data warehouse.

Enter coalesce. The coalesce operator (??) checks whether a property exists in a document and lets you supply a fallback value when it doesn't. This makes it easier to deal with properties that don't *always* exist in all documents.

Here's what happens when I add the ?? coalesce operator to our select query:

In the above example, I've told the coalesce operator to return a 0 if PrevVal doesn't exist. And...shazam! We now have a standardized structure output on the right. The trick to making this work, of course, is including every property that could possibly come through in the set of JSON documents.
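To make the behavior concrete without the query screenshots, here is a rough Python analogue of what the coalesce operator does to each document. It is a sketch of the semantics only, not DocumentDB code: the helper name and the id and DataValue fields are hypothetical, while PrevVal and the PreviousDataValue alias come from the example above.

```python
# Sentinel so that a property explicitly set to None still counts as "defined".
_UNDEFINED = object()


def coalesce(document: dict, property_name: str, default):
    """Return the property's value if the property exists, otherwise the default."""
    value = document.get(property_name, _UNDEFINED)
    return default if value is _UNDEFINED else value


# Two documents with different structures: only the first one has PrevVal.
documents = [
    {"id": "Document1", "DataValue": 10, "PrevVal": 7},
    {"id": "Document2", "DataValue": 12},
]

# Project every document into the same shape, defaulting a missing PrevVal to 0.
standardized = [
    {
        "id": doc["id"],
        "DataValue": doc["DataValue"],
        "PreviousDataValue": coalesce(doc, "PrevVal", 0),
    }
    for doc in documents
]

for row in standardized:
    print(row)
```

Both output rows now carry a PreviousDataValue property, which is the "standardized" structure a downstream load into a data warehouse would want.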

If missing data values like PrevVal are an issue, you can find the documents which are missing a specific property by using NOT IS_DEFINED like this:

I've found it very helpful to be able to 'standardize' data from file to file by using the coalesce operator in DocDB. I also like the ability to jump immediately to nested levels of the data without having to do extra work of parsing out levels. 
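The NOT IS_DEFINED check mentioned above has a simple Python analogue as well. Again, this only mimics the semantics on in-memory dictionaries. The helper names and sample documents are hypothetical, and only the PrevVal property comes from the example.

```python
from typing import Dict, List


def is_defined(document: Dict, property_name: str) -> bool:
    """True when the property exists in the document, whatever its value."""
    return property_name in document


def missing_property(documents: List[Dict], property_name: str) -> List[Dict]:
    """Return the documents that do not contain the given property."""
    return [doc for doc in documents if not is_defined(doc, property_name)]


documents = [
    {"id": "Document1", "DataValue": 10, "PrevVal": 7},
    {"id": "Document2", "DataValue": 12},
]

# List the ids of documents that lack PrevVal so they can be investigated.
print([doc["id"] for doc in missing_property(documents, "PrevVal")])
```

A list like this is a quick way to size up how widespread a missing property is before deciding on a default value.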

Finding More Information

SQL Query and SQL Syntax in DocumentDB (note: only a subset of SQL is supported in DocDB)


Setting up Azure Disk Encryption for a Virtual Machine with PowerShell

As I discussed in my previous blog post, I opted to use Azure Disk Encryption for my virtual machines in Azure, rather than Storage Service Encryption. Azure Disk Encryption utilizes BitLocker inside of the VM. Enabling Azure Disk Encryption involves these Azure services:

  • Azure Active Directory for a service principal
  • Azure Key Vault for a KEK (key encryption key) which wraps around the BEK (BitLocker encryption key)
  • Azure Virtual Machine (IaaS)

Following are 4 scripts which configure encryption for an existing VM. I initially had it all as one single script, but I purposely separated them. Now that they are modular, if you already have a Service Principal and/or a Key Vault, you can skip those steps. I have my 'real' version of these scripts stored in an ARM Visual Studio project (same logic, just with actual names for the Azure services). These PowerShell templates go along with other ARM templates to serve as source control for our Azure infrastructure.

As any expert will immediately know by looking at my scripts below, I'm pretty much a PowerShell novice. So, be kind dear reader. My purpose is to document the steps, the flow, add some commentary, and to pull together a couple pieces I found on different documentation pages.


Step 1: Set up Service Principal in AAD

<#
.SYNOPSIS
Creates Service Principal in Azure Active Directory
.DESCRIPTION
This script creates a service principal in Azure Active Directory. 
A service principal is required to enable disk encryption for VM.
.NOTES
File Name: CreateAADSvcPrinForDiskEncryption.ps1
Author : Melissa Coates
Notes: Be sure the variables in the input area are completed, following all standard naming conventions.
The $aadSvcPrinAppPassword needs to be removed before saving this script in source control.
.LINK
Supporting information: 
https://blogs.msdn.microsoft.com/azuresecurity/2015/11/16/explore-azure-disk-encryption-with-azure-powershell/
https://docs.microsoft.com/en-us/azure/security/azure-security-disk-encryption 
#>

#-----------------------------------------

#Input Area
$subscriptionName = 'MyAzureSubscriptionDev'
$aadSvcPrinAppDisplayName = 'VMEncryptionSvcPrinDev'
$aadSvcPrinAppHomePage = 'http://FakeURLBecauseItsNotReallyNeededForThisPurpose'
$aadSvcPrinAppIdentifierUri = 'https://DomainName.com/VMEncryptionSvcPrinDev'
$aadSvcPrinAppPassword = 'SuperStrongPassword'

#-----------------------------------------

#Manual login into Azure
Login-AzureRmAccount -SubscriptionName $subscriptionName

#-----------------------------------------

#Create Service Principal App to Use For Encryption of VMs
$aadSvcPrinApplication = New-AzureRmADApplication -DisplayName $aadSvcPrinAppDisplayName -HomePage $aadSvcPrinAppHomePage -IdentifierUris $aadSvcPrinAppIdentifierUri -Password $aadSvcPrinAppPassword
New-AzureRmADServicePrincipal -ApplicationId $aadSvcPrinApplication.ApplicationId

Step 2: Create Azure Key Vault

<#
.SYNOPSIS
Creates Azure Key Vault.
.DESCRIPTION
This script does the following:
1 - Creates a key vault in Azure.
2 - Allows the Azure Backup Service permission to the key vault.
This is required if Recovery Vault will be used for backups.
A key vault is required to enable disk encryption for VM.
.NOTES
File Name: ProvisionAzureKeyVault.ps1
Author : Melissa Coates
Notes: Be sure the variables in the input area are completed, following all standard naming conventions.
The key vault must reside in the same region as the VM which will be encrypted.
A Premium key vault is being provisioned so that an HSM key can be created for the KEK.
The 262044b1-e2ce-469f-a196-69ab7ada62d3 ID refers to the Azure Backup service (which is why it is not a variable).
.LINK
Supporting information: 
https://blogs.msdn.microsoft.com/azuresecurity/2015/11/16/explore-azure-disk-encryption-with-azure-powershell/
https://docs.microsoft.com/en-us/azure/security/azure-security-disk-encryption 

#>

#-----------------------------------------

#Input Area
$subscriptionName = 'MyAzureSubscriptionDev'
$resourceGroupName = 'MyDevRG'
$keyVaultName = 'KeyVault-Dev'
$keyVaultLocation = 'East US 2'

#-----------------------------------------

#Manual login into Azure
#Login-AzureRmAccount -SubscriptionName $subscriptionName

#-----------------------------------------

#Create Azure Key Vault
New-AzureRmKeyVault -VaultName $keyVaultName -ResourceGroupName $resourceGroupName -Location $keyVaultLocation -Sku 'Premium'

#-----------------------------------------

#Permit the Azure Backup service to access the key vault
Set-AzureRmKeyVaultAccessPolicy -VaultName $keyVaultName -ResourceGroupName $resourceGroupName -PermissionsToKeys backup,get,list -PermissionsToSecrets get,list -ServicePrincipalName 262044b1-e2ce-469f-a196-69ab7ada62d3

Step 3: Connect Service Principal with Key Vault

<#
.SYNOPSIS
Enables the service principal for VM disk encryption to communicate with Key Vault.
.DESCRIPTION
This script does the following:
A - Allows service principal the selective permissions to the key vault so
that disk encryption functionality works.
B - Creates a KEK (Key Encryption Key). For Disk Encryption, a KEK is required 
in addition to the BEK (BitLocker Encryption Key).
Prerequisite 1: Service Principal name (see CreateAADSvcPrinForDiskEncryption.ps1)
Prerequisite 2: Azure Key Vault (see ProvisionAzureKeyVault.ps1)
.NOTES
File Name: EnableSvcPrinWithKeyVaultForDiskEncryption.ps1
Author : Melissa Coates
Notes: Be sure the variables in the input area are completed, following all standard naming conventions.
The key vault must reside in the same region as the VM being encrypted.
The key type can be either HSM or Software (HSM offers additional security but does require a Premium key vault). 
.LINK
Supporting information: 
https://blogs.msdn.microsoft.com/azuresecurity/2015/11/16/explore-azure-disk-encryption-with-azure-powershell/
https://docs.microsoft.com/en-us/azure/security/azure-security-disk-encryption 
#>

#Input Area
$subscriptionName = 'MyAzureSubscriptionDev'
$resourceGroupName = 'MyDevRG'
$aadSvcPrinAppDisplayName = 'VMEncryptionSvcPrinDev'
$keyVaultName = 'KeyVault-Dev'
$keyName = 'VMEncryption-KEK'
$keyType = 'HSM'

#-----------------------------------------

#Manual login into Azure
#Login-AzureRmAccount -SubscriptionName $subscriptionName

#-----------------------------------------

#Allow the Service Principal Permissions to the Key Vault
$aadSvcPrinApplication = Get-AzureRmADApplication -DisplayName $aadSvcPrinAppDisplayName
Set-AzureRmKeyVaultAccessPolicy -VaultName $keyVaultName -ServicePrincipalName $aadSvcPrinApplication.ApplicationId -PermissionsToKeys 'WrapKey' -PermissionsToSecrets 'Set' -ResourceGroupName $resourceGroupName

#-----------------------------------------

#Create KEK in the Key Vault
Add-AzureKeyVaultKey -VaultName $keyVaultName -Name $keyName -Destination $keyType

#-----------------------------------------

#Allow Azure platform access to the KEK
Set-AzureRmKeyVaultAccessPolicy -VaultName $keyVaultName -ResourceGroupName $resourceGroupName -EnabledForDiskEncryption

Step 4: Enable Disk Encryption

<#
.SYNOPSIS
Enables disk encryption for a VM.
.DESCRIPTION
This script enables disk encryption for an Azure virtual machine.
Prerequisite 1: Service Principal name (see CreateAADSvcPrinForDiskEncryption.ps1)
Prerequisite 2: Azure Key Vault (see ProvisionAzureKeyVault.ps1)
Prerequisite 3: Permissions to Key Vault for Service Principal (see EnableSvcPrinWithKeyVaultForDiskEncryption.ps1)
.NOTES
File Name: EnableAzureDiskEncryption.ps1
Author : Melissa Coates
Notes: Be sure the variables in the input area are completed, following all standard naming conventions.
Azure Disk Encryption (ADE) 
The Azure VMs must already exist and be running.
To verify when completed: Get-AzureRmVmDiskEncryptionStatus -ResourceGroupName $resourceGroupName -VMName $vmName
.LINK
Supporting information: 
https://blogs.msdn.microsoft.com/azuresecurity/2015/11/16/explore-azure-disk-encryption-with-azure-powershell/
https://docs.microsoft.com/en-us/azure/security/azure-security-disk-encryption 
#>

#-----------------------------------------

#Input Area
$subscriptionName = 'MyAzureSubscriptionDev'
$resourceGroupName = 'MyDevRG'
$keyVaultName = 'KeyVault-Dev'
$keyName = 'VMEncryption-KEK'
$aadSvcPrinAppDisplayName = 'VMEncryptionSvcPrinDev'
$aadSvcPrinAppPassword = 'SuperStrongPassword'
$vmName = 'VMName-Dev'

#-----------------------------------------

#Manual login into Azure
#Login-AzureRmAccount -SubscriptionName $subscriptionName

#-----------------------------------------

#Enable Encryption on Virtual Machine
#Look up the key vault; its URI and resource ID are both needed by the extension
$keyVault = Get-AzureRmKeyVault -VaultName $keyVaultName -ResourceGroupName $resourceGroupName
$diskEncryptionKeyVaultUrl = $keyVault.VaultUri
$keyVaultResourceId = $keyVault.ResourceId
#Retrieve the KEK; its Id property is passed as the key encryption key URL
$keyEncryptionKeyUri = Get-AzureKeyVaultKey -VaultName $keyVaultName -KeyName $keyName
$aadSvcPrinApplication = Get-AzureRmADApplication -DisplayName $aadSvcPrinAppDisplayName
Set-AzureRmVMDiskEncryptionExtension -ResourceGroupName $resourceGroupName -VMName $vmName -AadClientID $aadSvcPrinApplication.ApplicationId -AadClientSecret $aadSvcPrinAppPassword -DiskEncryptionKeyVaultUrl $diskEncryptionKeyVaultUrl -DiskEncryptionKeyVaultId $keyVaultResourceId -KeyEncryptionKeyUrl $keyEncryptionKeyUri.Id -KeyEncryptionKeyVaultId $keyVaultResourceId

Step 4 takes around 10 minutes to run; it will prompt you with the following dialog box before it executes:

You'll see this message when step 4 has completed:

And in the portal, the disks associated with the VM will also indicate that encryption is now enabled:


Troubleshooting

One error I had issues with was "Azure Backup Service does not have sufficient permissions to Key Vault for Backup of Encrypted Virtual Machines."  The last cmdlet in Step 2 above resolves this issue. So, watch out for that if you are using a key vault that already exists.
