- Added by: literator
- Date: 6-01-2024, 14:47
Title: Databricks Lakehouse Platform Cookbook: 100+ recipes for building a scalable and secure Databricks Lakehouse
Author: Alan L. Dennis
Publisher: BPB Publications
Year: 2024
Pages: 581
Language: English
Format: epub (true)
Size: 52.2 MB
Analyze, Architect, and Innovate with Databricks Lakehouse. The Databricks Lakehouse is a groundbreaking technology that simplifies data storage, processing, and analysis. This cookbook offers a clear and practical guide to building and optimizing your Lakehouse to make data-driven decisions and drive impactful results. This definitive guide walks you through the entire Lakehouse journey, from setting up your environment and connecting to storage to creating Delta tables, building data models, and ingesting and transforming data. We start off by discussing how to ingest data to Bronze, then refine it to produce Silver. Next, we discuss how to create Gold tables and various data modeling techniques often performed in the Gold layer. You will learn how to leverage Spark SQL and PySpark for efficient data manipulation, apply Delta Live Tables for real-time data processing, and implement Machine Learning and Data Science workflows with MLflow, Feature Store, and AutoML. The book also delves into advanced topics like graph analysis, data governance, and visualization, equipping you with the necessary knowledge to solve complex data challenges. By the end of this cookbook, you will be a confident Lakehouse expert, capable of designing, building, and managing robust data-driven solutions. A good understanding of SQL, Python, Spark, and cloud computing would benefit the reader but is not required.