Statistics with R for Machine Learning: Vol. 1-2
- Added by: literator
- Date: Yesterday, 19:34

Author: Mohsen Nady
Publisher: Arcler Press
Year: 2025
Pages: 298+188
Language: English
Format: pdf (true)
Size: 17.0 MB
Data preparation is the foundation of any successful Machine Learning project. Volume 1 provides a comprehensive guide to cleaning, transforming, and splitting data for Machine Learning using R, including handling missing values, feature scaling, and stratified sampling. Practical examples and R code demonstrate how to optimize datasets for predictive modeling. The volume is essential for data scientists and Machine Learning practitioners seeking to build robust models.
Volume 1 is organized into three chapters that outline the initial steps of creating Machine Learning models using R. Chapter 1 introduces key definitions and the main types of Machine Learning models. It uses small, cleaned datasets to explain the two primary types of models: regression and classification. Linear regression is presented as an example of a regression model for numerical outcomes, while decision trees are used as an example of a classification model for categorical outcomes. Data are not always clean and ready for Machine Learning. Therefore, Chapter 2 is fully dedicated to the data cleaning steps necessary to prepare datasets for modeling. It features two survey datasets that are cleaned extensively and later used in Chapter 3. Chapter 3 discusses various techniques for splitting data into training and testing subsets and highlights their importance in evaluating model performance on unseen, future data.
Resampling techniques are key to improving model performance and reliability in Machine Learning. Volume 2 explores advanced resampling methods, including cross-validation, bootstrapping, and hyperparameter tuning, using R. Readers will learn how to apply these techniques to optimize model accuracy and prevent overfitting. Practical examples and case studies illustrate their real-world applications. This volume is an essential resource for data scientists and Machine Learning enthusiasts aiming to master resampling strategies.
Volume 2 is organized into two chapters that outline advanced resampling techniques. It begins with the cleaned datasets created in Chapter 2 of Volume 1. The cleaned datasets are used to explain the two primary types of models: regression and classification. Linear regression is presented as an example of a regression model for numerical outcomes, while decision trees are used as an example of a classification model for categorical outcomes. The random data splitting discussed in Chapter 3 of Volume 1 is not always useful and may give biased model performance metrics. Therefore, in Chapters 1 and 2, we discuss the more useful stratified and time-based resampling methods and compare their results with those of random splitting. By doing so, we highlight the importance of these advanced resampling methods in evaluating model performance on unseen, future data.
Books inside:
Statistics with R for Machine Learning: Volume 1, Data Preparation and Splitting with R for Machine Learning
Statistics with R for Machine Learning: Volume 2, Advanced Resampling Techniques with R for Machine Learning