I'm very excited to be presenting an all-day session on Architecting a Data Lake on Friday, October 13, 2017. It is a pre-conference session as part of the annual SQLSaturday event in Charlotte, NC -- which means the price is a bargain. The early bird price is $125 up to Sept 20th; after that the regular price is $165. You can register here: http://bit.ly/ArchitectingADataLake.
The full title of the session is "Architecting a Data Lake to Modernize Your Data Warehouse." The title has a 'qualifier' about the DW because that's the viewpoint we'll be taking -- i.e., the data lake is augmenting an existing DW/BI/Analytics type of system. Having said that, the vast majority of the information I'll be sharing will apply to any type of data lake implementation (supporting Data Science, IoT, Big Data, and so forth).
This full-day session will focus on principles for designing and implementing a data lake. There will be a mix of concepts, lessons learned, and technical implementation details. During the session we will build a data lake from the ground up, populate it, organize it, secure it, integrate it with a data warehouse via logical constructs, and query the data. You will leave this session with an understanding of the benefits and challenges of a multi-platform analytics/DW/BI environment, as well as recommendations for how to get started. You will learn:
- Scenarios and use cases for expanding an analytics/DW/BI environment into a multi-platform environment which includes a data lake
- Strengths and limitations of a logical data architecture which follows a polyglot persistence strategy
- Planning considerations for a data lake which supports streaming data as well as batch data processing
- Methods for organizing a data lake with a focus on optimal data retrieval and data security
- Techniques for speeding up development and refining user requirements via data virtualization and federated query approaches
- Benefits and challenges of schema-on-read vs. schema-on-write approaches for data integration and on-demand querying needs
- Deciding between Azure Blob Storage vs. Azure Data Lake Store vs. a relational platform
Specific technologies discussed and/or demonstrated in this session include Azure Data Lake Store, Azure Data Lake Analytics, Azure SQL Data Warehouse, Azure Blob Storage, SQL Server, PolyBase, and U-SQL.
If you have an Azure account, you will be able to follow along during the demonstrations if you'd like. Demo scripts will be provided with the course materials.
SQLSaturday is a free (with optional $10 lunch) training day run by members of the local community. There are usually 250-350 people at the annual Charlotte SQLSaturday, so it's a lot of fun as well as an excellent chance to get your learn on. The full schedule for Saturday can be found here: http://www.sqlsaturday.com/683/Sessions/Schedule.aspx.
I hope you can join me for the Friday pre-conference session on Architecting a Data Lake. That same day, Leila Etaati is also giving a pre-conference session on Advanced Analytics with R, Microsoft SQL Server, Power BI, and Azure ML. It will also be a great session if that suits your needs a little better.
If you have any questions, you can contact me via the form on my About page (scroll down to find the form).