Hugging Face has become a leading model repository, offering user-friendly tools for building, training and deploying ML models and LLMs. In combination with MLRun, an open-source platform that automates data preparation, tuning, validation and optimization of ML models and LLMs over elastic resources, Hugging Face empowers data scientists and engineers to bring their models to production more quickly and efficiently.
This blog post introduces Hugging Face and MLRun, demonstrating the benefits of using them together. It is based on the webinar “How to Easily Deploy Your Hugging Face Models to Production”, which includes a live demo of deploying a Hugging Face model with MLRun. The demo covers data preparation, a real application pipeline, post-processing and model retraining.
You can also watch the webinar, featuring Julien Simon, Chief Evangelist at Hugging Face, Noah Gift, MLOps expert and author, and Yaron Haviv, co-founder and CTO of Iguazio (acquired by McKinsey).
Hugging Face has gained recognition for its open-source library, Transformers, which provides easy access to pre-trained models. These include LLMs such as BERT, GPT-2, GPT-3 and T5, which can be used for various NLP tasks such as text generation, classification, translation, summarization and more.
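As a quick illustration (not part of the webinar demo), here is a minimal sketch of running one of these pre-trained models through the Transformers pipeline API; the checkpoint name is just a commonly used public example:

```python
from transformers import pipeline

# Download a pre-trained text-classification model and its tokenizer
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("Deploying models to production just got easier."))
# -> [{'label': 'POSITIVE', 'score': 0.99...}]
```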
By providing a repository of pre-trained models that users can fine-tune for specific applications, Hugging Face significantly reduces the time and resources required to develop powerful NLP systems. This enables a broader range of organizations to leverage advanced language technologies, thus democratizing access to LLMs.
The impact of Hugging Face’s LLMs spans various industries, including healthcare, finance, education and entertainment. For instance, in healthcare, LLMs can assist in analyzing medical records, extracting relevant information and supporting clinical decision-making. In finance, these models can enhance customer service through chatbots and automate the analysis of financial documents.
Now let’s see how Hugging Face LLMs can be operationalized.
MLRun is an open-source MLOps orchestration framework that enables managing continuous ML and gen AI applications across their lifecycle, quickly and at scale. Its capabilities span data preparation, model training and tuning, validation and optimization, deployment as real-time serving pipelines, and monitoring and retraining, all over elastic resources.
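To give a feel for the framework, here is a minimal sketch of how an MLRun project wraps your code as serverless functions; the script name and handler (`train.py`, `train`) are hypothetical placeholders:

```python
import mlrun

# Create (or load) a project that holds the functions, artifacts and workflows
project = mlrun.get_or_create_project("huggingface-demo", context="./")

# Register local training code as an MLRun job that can run on elastic resources
project.set_function(
    "train.py",            # hypothetical script containing a train() handler
    name="trainer",
    kind="job",
    image="mlrun/mlrun",
    handler="train",
)

# Execute it; MLRun tracks the run, its parameters and artifacts automatically
train_run = project.run_function("trainer", local=True)  # local=False runs on the cluster
```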
Deploying Hugging Face models to production is streamlined with MLRun. Below, we outline the steps to build a serving pipeline around your model, and then to retrain or recalibrate that model with a training flow that processes data, optimizes the model and redeploys it.
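As a rough sketch of the serving side (a simplification for illustration, not the exact webinar code), a Hugging Face pipeline can be wrapped in an MLRun model-server class, for example in a file we'll call serving.py:

```python
# serving.py -- hypothetical file name
from mlrun.serving import V2ModelServer
from transformers import pipeline


class HFSentimentServer(V2ModelServer):
    """Wraps a Hugging Face pipeline as an MLRun real-time model server."""

    def load(self):
        # Called once when the serving function starts: load the pre-trained model
        self.pipe = pipeline(
            "sentiment-analysis",
            model="distilbert-base-uncased-finetuned-sst-2-english",
        )

    def predict(self, body: dict) -> list:
        # The V2 inference protocol delivers request texts under body["inputs"]
        return self.pipe(body["inputs"])
```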
Hugging Face models are integrated into MLRun, so you only need to specify the models you want to use.
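Building on the sketch above, deployment then comes down to pointing a serving function at that class and the model it should load; the function and route names below are illustrative:

```python
import mlrun

# Turn the serving code into a real-time serving function
serving_fn = mlrun.code_to_function(
    "hf-serving", filename="serving.py", kind="serving", image="mlrun/mlrun"
)
serving_fn.add_model(
    "sentiment",                      # route name: /v2/models/sentiment/infer
    class_name="HFSentimentServer",   # the class defined in serving.py
    model_path=".",                   # not used here; the class pulls the model itself
)

# Test the pipeline locally with a mock server before deploying
server = serving_fn.to_mock_server()
print(server.test(
    "/v2/models/sentiment/infer",
    body={"inputs": ["MLRun makes Hugging Face deployment easy"]},
))

# serving_fn.deploy()  # deploys as a real-time serverless (Nuclio) endpoint
```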
Integrating Hugging Face with MLRun significantly shortens the model development, training, testing, deployment and monitoring processes. This helps operationalize gen AI effectively and efficiently.
Learn more about MLRun and Hugging Face for your gen AI workflows.