How can LLMs be customized with company data?

Organizations that lack an external dataset for training their models, or that want a model trained on their own data, can use prompt engineering or fine-tuning.

Prompt engineering means feeding a model engineered requests (prompts) that include specific content, details, clarifying instructions, and examples. These prompts guide the model toward the expected, most accurate answer. For example, the quality of ChatGPT's responses depends heavily on how the prompts it receives are phrased and structured.
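To make this concrete, here is a minimal sketch of a few-shot engineered prompt: an instruction, a handful of company-specific examples, and the user's question are assembled into a single request. The helper name and the example data are hypothetical.

```python
# Sketch: build a few-shot prompt from an instruction, labeled examples,
# and the user's question. The resulting string is what gets sent to an LLM.

def build_prompt(instruction, examples, question):
    """Assemble an engineered prompt from an instruction, Q/A examples, and a query."""
    parts = [instruction, ""]
    for q, a in examples:
        parts.append(f"Q: {q}")
        parts.append(f"A: {a}")
        parts.append("")
    parts.append(f"Q: {question}")
    parts.append("A:")  # the model completes the answer after this cue
    return "\n".join(parts)

prompt = build_prompt(
    instruction="Answer support questions using only Acme Corp policy.",
    examples=[
        ("What is the refund window?", "Refunds are accepted within 30 days."),
        ("Do you ship internationally?", "Yes, to over 40 countries."),
    ],
    question="Can I exchange an item after 30 days?",
)
print(prompt)
```

The examples anchor the model in the company's tone and policies without any retraining, which is exactly the tradeoff the rest of this section discusses.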

Prompt engineering significantly improves model performance and helps tailor AI outputs, ensuring they align with desired outcomes and ethical guidelines. It also simplifies user interactions by providing clearer instructions and incurs lower computational costs than fine-tuning, since optimized prompts produce more accurate results with fewer resources. While it's certainly possible to do prompt engineering on a third-party model, it's critical to understand the challenges of doing it that way:

  • Inference performance is outside your control 
  • Prompt size is usually limited 
  • Version changes can alter the responses 
  • Company data is exposed to a third party

Another viable option is fine-tuning. Fine-tuning involves taking a pre-trained model, which has already learned from vast amounts of data, and further refining it on a specific task or domain. The model is exposed to labeled data related to the target task, allowing it to adapt and specialize its knowledge to better perform on that particular task.
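The fine-tuning idea can be illustrated with a toy model: start from "pretrained" parameters and run a few gradient-descent steps on a small, task-specific labeled dataset. Real LLM fine-tuning uses dedicated libraries and far larger models; the numbers and the 1-D linear model below are purely illustrative assumptions.

```python
# Toy sketch of fine-tuning: refine pretrained parameters on labeled
# task data with gradient descent on mean squared error.

pretrained_w, pretrained_b = 1.0, 0.0  # "pretrained" weights from generic data

# Small labeled dataset from the target domain (hypothetical): y = 2x + 1
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

def fine_tune(w, b, data, lr=0.05, epochs=500):
    """Adapt pretrained (w, b) to the target task via gradient descent."""
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in data:
            err = (w * x + b) - y      # prediction error on one example
            gw += 2 * err * x / len(data)
            gb += 2 * err / len(data)
        w -= lr * gw
        b -= lr * gb
    return w, b

w, b = fine_tune(pretrained_w, pretrained_b, data)
```

After a few hundred steps the parameters converge near the target relationship (w ≈ 2, b ≈ 1): general knowledge is retained as the starting point, and only a small labeled set is needed to specialize it.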

This process is highly beneficial as it enables the transfer of general knowledge acquired during pre-training to more specific applications. Fine-tuning also helps enhance the model's performance, boost accuracy, and ensure its suitability for specific use cases, while keeping the data safe. However, fine-tuning requires more advanced technological skills and knowledge, and is expensive to develop and serve.

Both options come with their benefits and tradeoffs, so it's worth investigating which one makes sense for the specific use case.

Need help?

Contact our team of experts or ask a question in the community.

Have a question?

Submit your questions on machine learning and data science to get answers from our team of data scientists, ML engineers and IT leaders.