LitMy.ru - literature in one click

Generative AI on Kubernetes (Early Release)

  • Added by: literator
  • Date: Yesterday, 20:54
Title: Generative AI on Kubernetes: Operationalizing Large Language Models (Early Release)
Authors: Roland Huß, Daniele Zonca
Publisher: O’Reilly Media, Inc.
Year: 2025-03-12
Pages: 108
Language: English
Format: epub
Size: 10.1 MB

Generative AI is revolutionizing industries, and Kubernetes has fast become the backbone for deploying and managing these resource-intensive workloads. This book serves as a practical, hands-on guide for MLOps engineers, software developers, Kubernetes administrators, and AI professionals ready to unlock AI innovation with the power of cloud native infrastructure. Authors Roland Huß and Daniele Zonca provide a clear road map for training, fine-tuning, deploying, and scaling GenAI models on Kubernetes, addressing challenges like resource optimization, automation, and security along the way.

With actionable insights and real-world examples, readers will learn to tackle the opportunities and complexities of managing GenAI applications in production environments. Whether you're experimenting with large-scale language models or facing the nuances of AI deployment at scale, you'll uncover the expertise you need to operationalize this exciting technology effectively.

Compared to KServe, Ray is a newer project with a broader scope. It is an open-source framework designed to make building and scaling ML applications easy. It is very Pythonic, which makes it approachable for anyone with Python experience, and it lets you configure all activities directly within your Python codebase. Ray is not specific to model serving; instead, it defines a set of fairly generic core concepts: Task, Actor, Object, Placement Group, and Environment Dependency. Together with the Ray Cluster, these core concepts define the execution model used to build and scale all of Ray's other features. Ray's API is very friendly to data scientists and Python developers in general, but when it comes to deploying a Ray Cluster on Kubernetes, you still need some help wiring all the components together with Kubernetes concepts such as Deployment and Ingress.
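To make the Task, Actor, and Object concepts more concrete, here is a minimal sketch, not taken from the book. It assumes Ray is installed locally (`pip install ray`). The names `square`, `Counter`, and `run_with_ray` are hypothetical and chosen only for illustration. Ray is imported lazily inside `run_with_ray`, so the plain function and class remain usable without it.

```python
from typing import List, Tuple


def square(x: int) -> int:
    """Plain function; it becomes a Ray Task once wrapped with ray.remote."""
    return x * x


class Counter:
    """Plain class; it becomes a Ray Actor (a stateful worker) once wrapped."""

    def __init__(self) -> None:
        self.value = 0

    def increment(self) -> int:
        self.value += 1
        return self.value


def run_with_ray() -> Tuple[List[int], int]:
    """Run the function as Tasks and the class as an Actor (assumes Ray is installed)."""
    import ray  # lazy import: only needed when this function is called

    # Starts a local Ray instance if the process is not already connected to one.
    ray.init(ignore_reinit_error=True)
    try:
        remote_square = ray.remote(square)   # Task: stateless remote function
        RemoteCounter = ray.remote(Counter)  # Actor: stateful remote class

        # .remote() returns immediately with object references (the Object concept).
        refs = [remote_square.remote(i) for i in range(4)]
        counter = RemoteCounter.remote()
        counter.increment.remote()
        last = counter.increment.remote()

        # ray.get blocks until the referenced results are available.
        return ray.get(refs), ray.get(last)
    finally:
        ray.shutdown()
```

Calling `run_with_ray()` on a machine with Ray installed returns the squares computed by the Tasks together with the Actor's counter value after two increments. The same Python code could be pointed at a Ray Cluster on Kubernetes, but the cluster itself still has to be deployed separately.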

Learn to run GenAI models on Kubernetes for efficient scalability
Get techniques to train and fine-tune LLMs within Kubernetes environments
See how to deploy production-ready AI systems with automation and resource optimization
Discover how to monitor and scale GenAI applications to handle real-world demand
Uncover the best tools to operationalize your GenAI workloads
Learn how to run agent-based and AI-driven applications
