I'm very excited to be presenting an all-day workshop on Architecting a Data Lake at two events:
This full-day session will focus on principles for designing and implementing a data lake. There will be a mix of concepts, lessons learned, and technical implementation details. This session is approximately 70% demonstrations: there are 19 demos throughout the day. We will create a data lake, populate it, organize it, query it, and integrate it with a relational database via logical constructs. You will leave this session with an understanding of the benefits and challenges of a multi-platform analytics/DW/BI environment, as well as recommendations for how to get started.
You will learn:
- Scenarios and use cases for expanding an analytics/DW/BI environment into a multi-platform environment which includes a data lake
- Strengths and limitations of a logical data architecture which follows a polyglot persistence strategy
- Planning considerations for a data lake which supports streaming data as well as batch data processing
- Methods for organizing a data lake that focus on optimal data retrieval and data security
- Techniques for speeding up development and refining user requirements via data virtualization and federated query approaches
- Benefits and challenges of schema-on-read vs. schema-on-write approaches for data integration and on-demand querying needs
- Criteria for deciding among Azure Blob Storage, Azure Data Lake Store, and a relational platform
Specific technologies discussed and/or demonstrated in this session include Azure Data Lake Store, Azure Data Lake Analytics, Azure SQL Data Warehouse, Azure Blob Storage, SQL Server, PolyBase, and U-SQL.
If you have an Azure account and your own laptop, you are welcome to follow along during the demonstrations. Demo scripts will be provided with the workshop materials.
If you have any questions, you can contact me via the form on my About page (scroll down to find the form).