Title: Effective XGBoost: Optimizing, Tuning, Understanding, and Deploying Classification Models Author: Matt Harrison Publisher: MetaSnake Series: Treading on Python Year: 2023 Pages: 221 Language: English Format: pdf (true) Size: 28.9 MB
"Effective XGBoost" is the ultimate guide to mastering the art of classification. Whether you're a seasoned data scientist or just starting out, this comprehensive book will take you from the basics of XGBoost to advanced techniques for optimizing, tuning, understanding, and deploying your models.
XGBoost is one of the most popular Machine Learning algorithms used in Data Science today. Because it can handle large datasets, cope with missing values, and model non-linear relationships, it has become an essential tool for many data scientists. In this book, you'll learn everything you need to know to become an expert in XGBoost.
XGBoost is both a library and a particular gradient boosted trees (GBT) algorithm. (The XGBoost library also supports other base learners, namely linear ones.) GBTs are a class of algorithms that rely on ensembling: building a very strong ML model by combining many weaker ones. GBTs use decision trees as “base learners”, use “boosting” as the ensembling method, and optimize the ensemble with gradient descent, something they have in common with other Machine Learning methods, such as neural networks.
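The three ingredients named above (weak tree base learners, boosting, and gradient-based optimization) can be illustrated with a toy sketch in plain Python. This is not XGBoost's actual algorithm, which adds regularization and second-order gradient information, and it is not code from the book: it assumes a squared-error regression loss, a single feature, and one-split "stumps" as base learners. All function names here are hypothetical.

```python
from typing import List, Tuple

# A stump is a one-split tree: (threshold, value if x <= threshold, value otherwise).
Stump = Tuple[float, float, float]
# A model is (initial constant prediction, learning rate, list of stumps).
Model = Tuple[float, float, List[Stump]]


def fit_stump(xs: List[float], targets: List[float]) -> Stump:
    """Pick the single split on x that minimises squared error on targets."""
    overall = sum(targets) / len(targets)
    best: Stump = (xs[0], overall, overall)  # fallback when no split helps
    best_err = sum((t - overall) ** 2 for t in targets)
    # Skip the largest x: splitting there would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((t - lv) ** 2 for t in left) + sum((t - rv) ** 2 for t in right)
        if err < best_err:
            best, best_err = (threshold, lv, rv), err
    return best


def predict_stump(stump: Stump, x: float) -> float:
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted_stumps(xs: List[float], ys: List[float],
                       n_rounds: int = 50, learning_rate: float = 0.1) -> Model:
    """Boosting: each new stump is fit to the errors of the ensemble so far."""
    base = sum(ys) / len(ys)  # start from a constant prediction
    preds = [base] * len(xs)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - f)^2, the negative gradient with respect
        # to the prediction f is the residual y - f.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Take a small step in the direction the stump suggests.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model: Model, x: float) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model: Model, xs: List[float], ys: List[float]) -> float:
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Toy data (an assumption for illustration): y = x^2 on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
baseline = fit_boosted_stumps(xs, ys, n_rounds=0)   # constant prediction only
boosted = fit_boosted_stumps(xs, ys, n_rounds=50)
print("baseline MSE:", mse(baseline, xs, ys))
print("boosted MSE: ", mse(boosted, xs, ys))
```

No single stump can fit a curve, but each round corrects part of what the previous rounds got wrong, so the training error of the combined model drops well below that of the constant baseline. The learning rate scales each correction down, which is the same role it plays in gradient descent.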
Starting with the basics, you'll learn how to use XGBoost for classification tasks, including how to prepare your data, select the right features, and train your model. From there, you'll explore advanced techniques for optimizing your models, including hyperparameter tuning, early stopping, and ensemble methods.
But "Effective XGBoost" doesn't stop there. You'll also learn how to interpret your XGBoost models, understand feature importance, and deploy your models in production. With real-world examples and practical advice, this book will give you the skills you need to take your XGBoost models to the next level.
Whether you're working on a Kaggle competition, building a recommendation system, or just want to improve your Data Science skills, "Effective XGBoost" is the book for you. With its clear explanations, step-by-step instructions, and expert advice, it's the ultimate guide to mastering XGBoost and becoming a top-notch data scientist.
XGBoost is the only GBT library with comprehensive Nvidia CUDA GPU support. It works on a single machine or on a large cluster, and since version 1.7 it also supports federated learning. It includes C, Java, Python, and R front ends, as well as many others. If you are a working Data Scientist and need an efficient way to train and deploy a Machine Learning model for a wide variety of problems, chances are that XGBoost is indeed all you need. To be clear, in terms of predictive performance, I have always considered XGBoost to be on par with other well-known GBT libraries: LightGBM, CatBoost, HistGradientBoosting, etc. Each of them has its strengths and weaknesses, but to a first approximation they are fairly interchangeable for the majority of problems that a working Data Scientist comes across.
Machine Learning for tabular data is still a very hands-on, artisanal process. A big part of what makes a great tabular data ML model has to do with proper data preparation and feature engineering. This is where Matt’s background with Pandas really comes in handy: the many Pandas examples throughout the book are exceptionally valuable in their own right. Chapters end with a great selection of useful exercises. All of the code in the book is also available from the accompanying repo, and most of the datasets can be found on Kaggle.