Machine Learning for Econometrics
- Added by: literator
- Date: 30-09-2025, 18:54
Authors: Christophe Gaillac, Jérémy LʼHour
Publisher: Oxford University Press
Year: 2025
Pages: 353
Language: English
Format: PDF (true)
Size: 14.2 MB
Machine Learning for Econometrics is a book for economists seeking to grasp modern Machine Learning techniques - from their predictive performance to the revolutionary handling of unstructured data - in order to establish causal relationships from data.
The volume covers automatic variable selection in various high-dimensional contexts, estimation of treatment effect heterogeneity, natural language processing (NLP) techniques, as well as synthetic control and macroeconomic forecasting. The foundations of Machine Learning methods are introduced to provide both a thorough theoretical treatment of how they can be used in econometrics and numerous economic applications, and each chapter contains a series of empirical examples, programs, and exercises to facilitate the reader's adoption and implementation of the techniques.
Econometrics and Machine Learning (ML) share many statistical tools, as we will see in Chapter 2. However, the philosophies and goals of these two approaches often differ in subtle ways. The goal of ML is to build a model that achieves the best possible predictive performance for a given problem, often subject to a computational constraint when the model is called, known as runtime or inference-time performance: the model must generate predictions within a defined timeframe. ML researchers often talk about algorithms rather than models, to stress that this process is based on a series of instructions that lead to a prediction, regardless of their nature, rather than on a single statistical model. ML is therefore applied to problems that differ from those of econometrics, such as constructing song or movie recommendation systems, matching job-seekers to firms, translating documents, predicting the next data point in a time series, categorizing products, recognizing patterns in images, and retrieving documents based on their content.
The term Artificial Intelligence (AI) is often used as a synonym for Machine Learning. This term underlines that a machine replaces the human in performing a cognitive task and that its implementation can be carried out on a very large scale at a very small marginal cost, the main fixed costs being those of training the algorithm and then making it available. As an aside, these costs are far from negligible: training a large language model (LLM) from scratch, like the one that powers ChatGPT, can cost well over a few million dollars.
Machine Learning is an area in which Computer Science is ubiquitous, and comparing ML algorithms with traditional algorithms helps to clarify the difference in paradigm. Traditional algorithms consist of fixed rules, established a priori by a human, that the machine simply executes. Training an ML algorithm, by contrast, consists of using datasets that correctly associate inputs with outputs to teach the computer the implicit rules underlying these associations, regardless of the exact nature of these rules, as long as they produce relevant responses for the (human) end-user. In the case of analyzing text data, this is the difference between using a regular expression (Chapter 12) and a modern language model (Chapter 14). It is noteworthy that Machine Learning, and Deep Learning in particular, have achieved their most impressive successes in well-defined tasks characterized by a favorable signal-to-noise ratio.
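As a rough illustration of the contrast between a fixed rule and a learned one, the sketch below flags sentences about rising prices in two ways: with a hand-written regular expression, and with word scores estimated from a handful of labelled sentences. Everything in it is an assumption made for illustration and not taken from the book: the pattern, the toy training sentences, the function names, and the scoring scheme, which is a deliberately naive stand-in for a real language model.

```python
import re
from collections import Counter

# Fixed rule, written a priori by a human (hypothetical pattern).
RULE = re.compile(r"\binflation\b|\bprice (increase|rise)s?\b", re.IGNORECASE)


def rule_based(text: str) -> bool:
    """Return True if the hand-written pattern matches the text."""
    return RULE.search(text) is not None


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples: list[tuple[str, bool]]) -> dict[str, int]:
    """Learn one score per word from (text, label) pairs.

    A word's score is the number of positive sentences containing it
    minus the number of negative sentences containing it.
    """
    pos, neg = Counter(), Counter()
    for text, label in examples:
        (pos if label else neg).update(set(tokenize(text)))
    return {word: pos[word] - neg[word] for word in set(pos) | set(neg)}


def learned(weights: dict[str, int], text: str) -> bool:
    """Return True if the summed word scores are positive; unseen words score 0."""
    return sum(weights.get(word, 0) for word in tokenize(text)) > 0


# Toy labelled data (invented for this sketch): True = about rising prices.
TRAIN = [
    ("inflation rose sharply last quarter", True),
    ("consumer prices are climbing fast", True),
    ("the cost of living keeps climbing", True),
    ("the central bank hired new staff", False),
    ("the minister visited a factory", False),
    ("exports to asia are stable", False),
]

WEIGHTS = train(TRAIN)

if __name__ == "__main__":
    for sentence in ["inflation rose again", "living costs are climbing"]:
        print(sentence, rule_based(sentence), learned(WEIGHTS, sentence))
```

The regular expression only recognizes the phrasings its author anticipated, whereas the learned scores pick up wording that merely co-occurred with the positive label in the examples. With so few sentences the learned rule is fragile, which is one reason real applications rely on far larger datasets and richer models.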
Natural Language Processing (NLP) encompasses the set of concepts and algorithms that allow for the automatic processing of human language. “Natural” marks the contrast between computer languages, such as C++ or Python, which are devoid of any semantic ambiguity, and human languages, in which words are often polysemous and meaning varies greatly depending on the context. This book focuses solely on the written form of language, excluding its oral counterpart (referred to as automatic speech recognition, ASR), which is more complex and currently less used in the economic literature. It is important to note at the outset that NLP extends beyond the realms of statistics and econometrics. Recent breakthroughs, exemplified by the enthusiastic response and concerns arising from ChatGPT, should be integrated into the toolkit of empirical economists.
A GitHub repository is available at the address github.com/jeremylhour/ml4econometrics. It contains scripts in R and Python, which reproduce some of the applications of this work, as well as elements to answer the questions and exercises presented in Part VI. The associated Python notebooks can be found here: github.com/nlp-with-transformers/notebooks.
