Merlin Systems provides tools for combining recommendation models with other elements of production recommender systems like feature stores, nearest neighbor search, and exploration strategies into end-to-end recommendation pipelines that can be served with Triton Inference Server.
Quickstart
Merlin Systems uses the Merlin Operator DAG API, the same API used in NVTabular for feature engineering, to create serving ensembles. To combine a feature engineering workflow and a TensorFlow model into an inference pipeline:
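import tensorflow as tf
from merlin.systems.dag import Ensemble
from merlin.systems.dag.ops import PredictTensorflow, TransformWorkflow
from nvtabular.workflow import Workflow

# Load the saved NVTabular workflow and TensorFlow model.
# nvtabular_workflow_path, tf_model_path, and export_path are placeholders for your own paths.
workflow = Workflow.load(nvtabular_workflow_path)
model = tf.keras.models.load_model(tf_model_path)

# Remove target/label columns from the feature processing workflow.
# Replace <target_columns> with the names of your label columns.
workflow = workflow.remove_inputs([<target_columns>])

# Define the ensemble pipeline
pipeline = (
    workflow.input_schema.column_names >>
    TransformWorkflow(workflow) >>
    PredictTensorflow(model)
)

# Export the ensemble artifacts to disk
ensemble = Ensemble(pipeline, workflow.input_schema)
ensemble.export(export_path)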
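The `>>` chaining in the pipeline above can look unusual if you have not used the Operator DAG API before. The following is a toy sketch of how that style of chaining can be built with Python operator overloading. It is not Merlin's implementation: the `Step` class and its methods are hypothetical names invented for this illustration.

```python
class Step:
    """Toy pipeline step (hypothetical, not a Merlin class)."""

    def __init__(self, name, fn, parent=None):
        self.name = name
        self.fn = fn
        self.parent = parent

    def __rshift__(self, other):
        # step >> other: attach a copy of `other` downstream of this step
        return Step(other.name, other.fn, parent=self)

    def __rrshift__(self, columns):
        # ["x", "y"] >> step: lists do not define >>, so Python falls back
        # to the right-hand operand and we start the chain with a selection
        select = Step("select", lambda row: {c: row[c] for c in columns})
        return select >> self

    def run(self, row):
        # Run upstream steps first, then apply this step's function
        upstream = self.parent.run(row) if self.parent else row
        return self.fn(upstream)

    def describe(self):
        prefix = self.parent.describe() + " >> " if self.parent else ""
        return prefix + self.name


scale = Step("scale", lambda row: {k: v * 2 for k, v in row.items()})
total = Step("total", lambda row: {"total": sum(row.values())})

# A list of column names starts the chain, as in the Quickstart pipeline
pipeline = ["x", "y"] >> scale >> total
result = pipeline.run({"x": 1, "y": 2, "label": 9})
```

Starting the chain from a list of column names works here because Python tries the right operand's `__rrshift__` when the left operand does not support `>>`.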
After you export your ensemble, point Triton Inference Server at the export directory to host the ensemble:
tritonserver --model-repository=/export_path/
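Once the server is running, clients can send requests over HTTP. Triton implements the KServe v2 inference protocol and listens for HTTP on port 8000 by default. The sketch below builds a request body with the Python standard library only. The helper names, the column names, and the `[batch, 1]` input shape are assumptions for illustration; the model name to use depends on the directories written to your export path.

```python
import json
import urllib.request


def build_infer_request(columns):
    """Build a KServe v2 request body from {column_name: (datatype, values)}."""
    inputs = []
    for name, (datatype, values) in columns.items():
        inputs.append({
            "name": name,
            "shape": [len(values), 1],  # assumed: one scalar per row
            "datatype": datatype,       # v2 type names such as "INT32" or "FP32"
            "data": list(values),
        })
    return {"inputs": inputs}


def infer(body, model_name, url="http://localhost:8000"):
    """POST the request body to a running Triton server and decode the reply."""
    request = urllib.request.Request(
        f"{url}/v2/models/{model_name}/infer",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())


# Hypothetical input columns; use the columns from your workflow's input schema
body = build_infer_request({
    "user_id": ("INT32", [7, 8]),
    "item_id": ("INT32", [101, 102]),
})
# response = infer(body, model_name="<your_ensemble_model_name>")
```

The `infer` call is left commented out because it needs a live server. The Triton client libraries offer the same functionality with typed inputs and gRPC support.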
Refer to the Merlin Example Notebooks for notebooks that demonstrate how to train and evaluate a ranking model with Merlin Models and then serve it as an ensemble on Triton Inference Server.
Examples are also available for training models with XGBoost and Implicit and then serving them with Merlin Systems.
Building a Four-Stage Recommender Pipeline
Merlin Systems can also build more complex serving pipelines that integrate multiple models and external tools (like feature stores and nearest neighbor search):
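import feast
import numpy as np
import tensorflow as tf
from merlin.schema import ColumnSchema, Schema
from merlin.systems.dag import Ensemble
# Operator module paths follow the Merlin example notebooks and may differ between releases
from merlin.systems.dag.ops.faiss import QueryFaiss
from merlin.systems.dag.ops.feast import QueryFeast
from merlin.systems.dag.ops.session_filter import FilterCandidates
from merlin.systems.dag.ops.softmax_sampling import SoftmaxSampling
from merlin.systems.dag.ops.tensorflow import PredictTensorflow
from merlin.systems.dag.ops.unroll_features import UnrollFeatures

# Load artifacts for the pipeline (the *_path names are placeholders for your own paths)
retrieval_model = tf.keras.models.load_model(retrieval_model_path)
ranking_model = tf.keras.models.load_model(ranking_model_path)
feature_store = feast.FeatureStore(feast_repo_path)

# Define the fields expected in requests
request_schema = Schema([
    ColumnSchema("user_id", dtype=np.int32),
])

# Fetch user features, compute a user vector with the retrieval model, and find the nearest candidate items
user_features = request_schema.column_names >> QueryFeast.from_feature_view(
    store=feature_store, view="user_features", column="user_id"
)
retrieval = (
    user_features
    >> PredictTensorflow(retrieval_model)
    >> QueryFaiss(faiss_index_path, topk=100)
)

# Filter out items the user has already interacted with and fetch features for the rest
filtering = retrieval["candidate_ids"] >> FilterCandidates(
    filter_out=user_features["movie_ids"]
)
item_features = filtering >> QueryFeast.from_feature_view(
    store=feature_store, view="movie_features", column="filtered_ids",
)

# Join user and item features for the candidates and predict relevance scores
combined_features = item_features >> UnrollFeatures(
    "movie_id", user_features, unrolled_prefix="user"
)
ranking = combined_features >> PredictTensorflow(ranking_model)

# Order candidates by relevance score with some randomized exploration
ordering = combined_features["movie_id"] >> SoftmaxSampling(
    relevance_col=ranking["output"], topk=10, temperature=20.0
)

# Create and export the ensemble
ensemble = Ensemble(ordering, request_schema)
ensemble.export("./ensemble")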
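To make the filtering and ordering stages more concrete, here is a minimal standard-library sketch of the ideas behind them: dropping already-seen candidates, then sampling a top-k list in which higher scores are more likely to appear first. It uses the Gumbel top-k trick as one way to sample from a softmax without replacement. It is not the code that `FilterCandidates` or `SoftmaxSampling` runs, and the function names are hypothetical.

```python
import math
import random


def filter_candidates(candidate_ids, seen_ids):
    """Drop candidates the user has already interacted with, keeping order."""
    seen = set(seen_ids)
    return [c for c in candidate_ids if c not in seen]


def softmax_sample(item_ids, scores, topk, temperature, rng=None):
    """Sample `topk` items without replacement, favouring high scores.

    Adding Gumbel noise to score / temperature and keeping the largest
    values draws an ordered sample from the softmax over the scores.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if len(item_ids) != len(scores):
        raise ValueError("item_ids and scores must have the same length")
    if rng is None:
        rng = random.Random()

    def gumbel():
        u = rng.random()
        while u == 0.0:  # random() can return 0.0, and log(0) is undefined
            u = rng.random()
        return -math.log(-math.log(u))

    perturbed = [(s / temperature + gumbel(), i) for i, s in zip(item_ids, scores)]
    perturbed.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in perturbed[:topk]]


# Hypothetical candidate IDs and relevance scores
candidates = filter_candidates([11, 12, 13, 14], seen_ids=[12])
# A near-zero temperature reduces to a plain top-k sort by score
greedy = softmax_sample(candidates, [0.9, 0.2, 0.5], topk=2, temperature=1e-6)
# A higher temperature flattens the distribution and adds exploration
explored = softmax_sample(candidates, [0.9, 0.2, 0.5], topk=2, temperature=20.0)
```

In this sketch, raising the temperature makes lower-scored items more likely to appear in the result, while lowering it approaches a deterministic ranking.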
For end-to-end notebooks built with Merlin Models and Systems, refer to the building-and-deploying-multi-stage-RecSys notebooks in the Example Notebooks.
Installation
Merlin Systems requires Triton Inference Server and TensorFlow. The simplest setup is to use the Merlin TensorFlow Inference Docker container, which has both pre-installed.
Installing Merlin Systems Using Pip
You can install Merlin Systems with pip:
pip install merlin-systems
Installing Merlin Systems from Source
To install Merlin Systems from source, clone the GitHub repository and run setup.py:
git clone https://github.com/NVIDIA-Merlin/systems.git
cd systems && python setup.py develop
Running Merlin Systems from Docker
Merlin Systems is pre-installed in several Docker containers that are available from the NVIDIA GPU Cloud (NGC) catalog.
The following table lists the containers that include Triton Inference Server for use with Merlin.
If you want to add support for GPU-accelerated workflows, you will first need to install the NVIDIA Container Toolkit to provide GPU support for Docker. You can use the NGC links referenced in the table above to obtain more information about how to launch and run these containers.
Feedback and Support
To report bugs or get help, please open an issue.