LLMOps refers to the set of practices and tools used to manage, streamline, and operationalize large language models.
LLMOps is a portmanteau of ‘LLM’ and ‘MLOps’.
LLMOps applies MLOps principles and infrastructure to LLMs; for this reason, it is considered a subset of MLOps.
Large language models, like GPT-3.5, are highly complex and resource-intensive. They require specialized techniques and infrastructure for their development, deployment, and maintenance.
Here are some of the challenges of operationalizing LLMs:
LLMOps aims to address the unique challenges of managing LLMs and to ensure their efficient and effective operation in production environments. LLMOps helps deploy LLM-powered applications securely, efficiently, and at scale.
Some of the key aspects of LLMOps are:
If you’re familiar with MLOps, you can see that these are key aspects of MLOps as well. In LLMOps, they are extended and adjusted to meet the requirements of LLMs.
The LLMOps landscape is constantly evolving, as new tools and platforms are developed to meet the needs of organizations that are using LLMs. Some of the key players in the LLMOps landscape include:
MLOps platforms, like MLRun and Iguazio, can be used for LLMOps. To do so, some of the pipeline steps need to be adapted: the embeddings, tokenization, and data cleansing steps need to be adjusted, to name a few, and validation and testing also require a different approach. However, these platforms enable the key aspects of LLMOps: automating the flow, processing at scale, rolling upgrades, rapid pipeline development and deployment, model monitoring, and more.
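To make the adaptation concrete, here is a minimal sketch of the kind of data cleansing and tokenization step that would be adjusted for an LLM pipeline. The function names and the whitespace tokenizer are illustrative assumptions, not MLRun or Iguazio APIs; in a real pipeline the model's own tokenizer (e.g. a BPE tokenizer) would replace the placeholder.

```python
import re


def clean_text(text: str) -> str:
    # Illustrative cleansing: replace control characters and
    # collapse runs of whitespace into single spaces.
    text = re.sub(r"[\x00-\x1f]+", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    # Placeholder whitespace tokenizer; an LLM pipeline would swap in
    # the model's actual tokenizer here.
    return text.lower().split()


def preprocess(records: list[str]) -> list[list[str]]:
    # A step like this can be wrapped as a function in an MLOps
    # platform and run at scale over batches of documents.
    return [tokenize(clean_text(r)) for r in records]


print(preprocess(["  Hello,\tWorld!\n", "LLMOps  extends MLOps."]))
# → [['hello,', 'world!'], ['llmops', 'extends', 'mlops.']]
```

Because the cleansing and tokenization logic is isolated in its own step, it can be versioned, upgraded, and monitored independently of the rest of the pipeline, which is exactly what these platforms automate.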
Looking to practically apply LLMs? Check out this demo showing MLOps orchestration best practices for Generative AI applications.