Skip to main content

OpenVINO Local Pipelines

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. The OpenVINO™ Runtime supports various hardware devices including x86 and ARM CPUs, and Intel GPUs. It can help to boost deep learning performance in Computer Vision, Automatic Speech Recognition, Natural Language Processing and other common tasks.

Hugging Face embedding model can be supported by OpenVINO through OpenVINOEmbeddings class. If you have an Intel GPU, you can specify model_kwargs={"device": "GPU"} to run inference on it.

%pip install --upgrade-strategy eager "optimum[openvino,nncf]" --quiet
Note: you may need to restart the kernel to use updated packages.
from langchain_community.embeddings import OpenVINOEmbeddings
model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {"device": "CPU"}
encode_kwargs = {"mean_pooling": True, "normalize_embeddings": True}

ov_embeddings = OpenVINOEmbeddings(
model_name_or_path=model_name,
model_kwargs=model_kwargs,
encode_kwargs=encode_kwargs,
)
/home/ethan/intel/langchain_test/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
from .autonotebook import tqdm as notebook_tqdm
/home/ethan/intel/langchain_test/lib/python3.10/site-packages/transformers/utils/import_utils.py:519: FutureWarning: `is_torch_tpu_available` is deprecated and will be removed in 4.41.0. Please use the `is_torch_xla_available` instead.
warnings.warn(
Framework not specified. Using pt to export the model.
Using the export variant default. Available variants are:
- default: The default ONNX variant.
Using framework PyTorch: 2.2.1+cu121
/home/ethan/intel/langchain_test/lib/python3.10/site-packages/transformers/modeling_utils.py:4225: FutureWarning: `_is_quantized_training_enabled` is going to be deprecated in transformers 4.39.0. Please use `model.hf_quantizer.is_trainable` instead
warnings.warn(
Compiling the model to CPU ...
INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, onnx, openvino
text = "This is a test document."
query_result = ov_embeddings.embed_query(text)
query_result[:3]
[-0.048951778560876846, -0.03986183926463127, -0.02156277745962143]
doc_result = ov_embeddings.embed_documents([text])

BGE with OpenVINO

We can also access BGE embedding models via the OpenVINOBgeEmbeddings class with OpenVINO.

from langchain_community.embeddings import OpenVINOBgeEmbeddings

model_name = "BAAI/bge-small-en"
model_kwargs = {"device": "CPU"}
encode_kwargs = {"normalize_embeddings": True}
ov_embeddings = OpenVINOBgeEmbeddings(
model_name_or_path=model_name,
model_kwargs=model_kwargs,
encode_kwargs=encode_kwargs,
)
/home/ethan/intel/langchain_test/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
from .autonotebook import tqdm as notebook_tqdm
/home/ethan/intel/langchain_test/lib/python3.10/site-packages/transformers/utils/import_utils.py:519: FutureWarning: `is_torch_tpu_available` is deprecated and will be removed in 4.41.0. Please use the `is_torch_xla_available` instead.
warnings.warn(
Framework not specified. Using pt to export the model.
Using the export variant default. Available variants are:
- default: The default ONNX variant.
Using framework PyTorch: 2.2.1+cu121
Overriding 1 configuration item(s)
- use_cache -> False
/home/ethan/intel/langchain_test/lib/python3.10/site-packages/transformers/modeling_utils.py:4225: FutureWarning: `_is_quantized_training_enabled` is going to be deprecated in transformers 4.39.0. Please use `model.hf_quantizer.is_trainable` instead
warnings.warn(
Compiling the model to CPU ...
INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, onnx, openvino
embedding = ov_embeddings.embed_query("hi this is harrison")
len(embedding)
384

For more information refer to:


Help us out by providing feedback on this documentation page: