March 2013 Update: The post below was written prior to the release of the Data Explorer Preview which is an Excel Add-In, so some information below will differ somewhat.
Overview: When investigating the Discover > Enrich > Publish capabilities of Data Explorer, my immediate question was: “Hmmm…Sounds awfully similar to PowerPivot. How is Data Explorer different?” This entry shares my initial impressions of Data Explorer.
What is Data Explorer?
Data Explorer is a self-service tool to Discover, Enrich (Transform), and Publish datasets. Data Explorer is currently in a SQL Azure Labs “Preview” (i.e., experimental) mode. You can view the datasheet here.
For a business user, the primary purpose of Data Explorer is to improve or eliminate manual processes to consolidate, standardize & cleanse data before it’s consumed by an end-user tool (such as PowerPivot).
For an IT user, the primary benefit of Data Explorer is likely to be the capability to publish the results as an OData feed for consumption by another application.
A Few Differences Between Data Explorer and PowerPivot
Data Explorer has functionality to “suggest” additional data sources for you (from Azure Marketplace or Bing) based upon the data you already have (i.e., if it detects a phone number or an address in your dataset). While both tools are similar in that they facilitate bringing together disparate data sources, PowerPivot does not attempt to make suggestions. Data Explorer places more emphasis on finding & bringing in additional related data from the web.
Data Explorer can extract data from some interesting data sources not available to PowerPivot, such as Web Content (lists & tables), Office 365 including SharePoint Online, or JSON documents.
For relational data, Data Explorer can currently connect only to SQL Server or SQL Azure. PowerPivot can natively connect to many more relational data sources.
Data Explorer cannot connect to Analysis Services from what I can tell (whereas PowerPivot can).
Data Explorer does have some nice capabilities to allow a non-technical user to perform transforms on the data (nowhere near the capabilities of SSIS though – don’t forget Data Explorer is intended as a self-service tool). Some transformations might be easier to do in the Data Explorer interface than in PowerPivot. It also provides a way to centralize the transformations, if the dataset will be consumed by multiple downstream applications.
Publishing data is where Data Explorer really distinguishes itself – really, the main purpose of Data Explorer is to publish data so it can be consumed downstream by another tool. Conversely, PowerPivot’s main purpose is to support data visualization rather than publish it outside of the Excel (.xlsx) file it resides in.
The ability to publish OData feeds via Data Explorer is, perhaps, its biggest benefit. Publishing data as a feed allows other applications to consume it (including PowerPivot).
Currently only the cloud version has publishing capabilities (the Data Explorer team is working on this capability for the desktop add-in).
Data Explorer has its own Formula Language called “M” (um, yeah, we were hoping for another language to learn…I just had to say it).
Data Explorer has a cloud version, and a desktop version which works as an add-in to Excel (both 32-bit and 64-bit downloads of the Excel add-in are available). It may be easier to access on-premises company data behind a firewall using the desktop client version of Data Explorer. (PowerPivot is currently only available as an Excel add-in; no word on when it might be available as part of Office 365.)
The cloud version of Data Explorer uses Silverlight.
Finding More Information
Chris Webb’s Blog – Self-Service ETL with Data Explorer
MSDN Magazine Article by Julie Lerman – Let Power Users Create Their Own OData Feeds
Jamie Thomson’s Blog – Thoughts on Data Explorer
Jamie Thomson’s Blog – Data Explorer Feedback Part 1
Oakleaf Systems Blog – Mashup Big Data with Microsoft Codename “Data Explorer” – an Illustrated Tutorial