Model download

To enable use cases where models are deployed off platform, you can download your trained models in either the .h5 or the SavedModel format. We recommend the SavedModel format since it gives more flexibility when loading the model again and can include operations that are performed by the platform.

This guide describes how to download a model and provides utility functions that make it easier to work with the models outside the Peltarion Platform. You can see how to deploy a SavedModel in a few selected frameworks and platforms on their dedicated pages.

Prerequisites

Create a project, run an experiment, and reach the deployment page. If you’re new to the Peltarion Platform, the tutorials will get you started.

Downloading the model

To download a model, go to the Deployment View and click on Export model.

Export model button

You will be able to choose the Experiment that you want to export, that is, which specific model from your project.
You may also change the Checkpoint, that is, after how many training epochs the model weights were saved. The Checkpoint with the best model performance is indicated with (Best) and is selected by default.

You can also choose the file format in which to download the model.
We recommend the SavedModel format since it is more compatible with platform blocks and it includes the pre- and post-processing steps that the platform would perform if you created a deployment.

SavedModel format

The TensorFlow SavedModel format includes all the pre- and post-processing done by the platform, and is compatible with TensorFlow 2.5.x.
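
As a minimal sketch, assuming the exported archive has been extracted to a directory called saved_model (an assumption; use your own path), you can load the model and inspect its signatures like this:

import tensorflow as tf

# Load the exported SavedModel (requires TensorFlow 2.5.x to match the export).
model = tf.saved_model.load("saved_model")

# List the available signatures, e.g. the serving and metadata signatures.
print(list(model.signatures.keys()))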

h5 format

Currently, a Keras v2.1.6-tf compatible .h5 file is provided for running a forward pass. Make sure to set compile=False when loading the model in Keras. If import keras doesn't work, try from tensorflow import keras instead.

An example of how to load a .h5 model is explained in the Keras documentation.
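
For example, here is a minimal sketch, assuming the downloaded file is named model.h5 (a hypothetical filename):

from tensorflow import keras

# Load the model for inference only; compile=False skips restoring the
# training configuration, which is not needed for a forward pass.
model = keras.models.load_model("model.h5", compile=False)

# Inspect the expected input shape before preparing your data.
print(model.input_shape)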

h5 limitations

This solution currently does not take pre- and post-processing into account, which means that any normalization or categorical preprocessing will not be part of this model. Currently, these operations and the metadata used to apply them are not exposed.

Therefore, if you rely on deploying .h5 files from the platform in your own systems, we recommend doing all preprocessing, such as normalization, one-hot encoding, and reordering the color channels of images, outside the platform.
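
As an illustration, here is a hedged sketch of the kind of preprocessing you may need to replicate yourself. The normalization constants below are hypothetical; use the statistics and encodings of your own dataset as configured on the platform.

import numpy as np

# Hypothetical normalization constants; use your own dataset's statistics.
MEAN, STD = 0.1307, 0.3081

def preprocess_image(image: np.ndarray) -> np.ndarray:
    # Scale pixel values to [0, 1] and standardize.
    return (image.astype(np.float32) / 255.0 - MEAN) / STD

def one_hot(index: int, num_classes: int) -> np.ndarray:
    # One-hot encode a categorical feature.
    encoded = np.zeros(num_classes, dtype=np.float32)
    encoded[index] = 1.0
    return encoded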

Note: If your model uses the Scaling block, make sure to use the additional code provided here.

Utility functions

All models that are trained on the Peltarion Platform perform pre- and post-processing of the input and output data. This processing is not visible to users in the web interface, but provides benefits when deploying the models. When you export a model from the platform, these pre- and post-processing steps are included in the SavedModel file, which means you need to serialize the input data and parse the output data for the results to correspond to what you see on the Peltarion Platform.

To assist you, a set of Python utility functions is provided here; they can be used in a Python client that calls a deployed model outside the platform.

To add the utility functions to your local deployment, create a folder called utils inside your working directory, and create a new file inside this folder called parse_input.py.
Copy the content of parse_input.py into this file. You will then be able to use the utility functions in your local deployment code with:

from utils.parse_input import *

You may also import only the utility functions you need, as shown in the following examples.

Input data serialization

Downloaded models expect to receive input data as a tf.train.Example, which is a common data format for inference. You can find a complete tutorial on how to work with that format in the TensorFlow guide TFRecord and tf.train.Example.
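
For illustration, here is a minimal example of the format, using a hypothetical feature ID; the serializer described below builds these for you with the correct IDs:

import tensorflow as tf

# Build a tf.train.Example with a single float feature, keyed by a
# (hypothetical) feature ID rather than the human-readable feature name.
example = tf.train.Example(features=tf.train.Features(feature={
    "feature_id_123": tf.train.Feature(
        float_list=tf.train.FloatList(value=[42.0])
    ),
}))

# The model consumes the serialized protobuf string.
serialized = example.SerializeToString()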

The Peltarion Platform also maps the human-readable feature names in the dataset to unique feature IDs in the model, which the model expects to find in the input data when running inference. Therefore, those feature IDs have to be encoded into the tf.train.Example. To enable this mapping, the file feature_mapping.json is bundled together with the model files. It contains all the information required to map between feature names and feature IDs, as well as the feature data types, which are required to encode the data correctly.

When deploying a model on a server and sending data to it over HTTP, you will usually need to encode the data in base64 to avoid data corruption in transit. You'll find more details on the specific framework pages.
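
As an illustration, here is a hedged sketch of a JSON payload in the style of TensorFlow Serving's REST API, where binary data is wrapped in a {"b64": ...} object; the exact payload structure depends on your serving framework.

import base64
import json

# `serialized` is the output of example.SerializeToString() shown above.
payload = json.dumps({
    "instances": [{"b64": base64.b64encode(serialized).decode("utf-8")}]
})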

The provided input data serializer can be instantiated like this:

# Modify the import path as needed
from utils.parse_input import TrainExampleSerializer

serializer = TrainExampleSerializer(
    feature_mapping_file_path="<relative path to feature_mapping.json>"
)

This will create an instance of the class TrainExampleSerializer, which parses the feature_mapping.json file. To build the tf.train.Example object, use the serialize_data() method, which expects a dictionary where the keys are the feature names and the values are the feature data.

# Use the utility function to serialize the image into a tf.train.Example
example_tensor = serializer.serialize_data(
    {
        "Image": mnist_bytes
    }
)

Output data parsing

The output from a downloaded model is a dictionary, where the dictionary keys correspond to the names of the output blocks. Once again, these block names use the platform's feature IDs and need to be mapped to the human-readable feature names to match what can be seen in the Peltarion Platform.

Categorical target variables, used in classification models that give a probability score for each class, add additional complexity. To make the prediction interpretable, we need to map the probability scores to the class names. For that purpose, there is one special signature, metadata(), included in the SavedModel, which can be called to access that mapping.
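
Putting it together, here is a hedged sketch of running inference and reading the metadata. The model path and the signature input name ("examples") are assumptions, so inspect model.signatures for your export before relying on them.

import tensorflow as tf

# Hypothetical path; point this at your extracted SavedModel directory.
model = tf.saved_model.load("saved_model")

# Run inference on a serialized tf.train.Example (see the serializer above).
infer = model.signatures["serving_default"]
# Signatures take keyword arguments; inspect infer.structured_input_signature
# to find the input name for your export ("examples" here is an assumption).
prediction = infer(examples=example_tensor)

# Call the metadata() signature to retrieve, among other things,
# the mapping from class indices to class names.
metadata = model.signatures["metadata"]()
print(metadata)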

Content of parse_input.py

from abc import ABC, abstractmethod
import base64
from typing import Any, Dict, Union, Optional
import json
import os
from pathlib import Path

import numpy as np
import tensorflow as tf


class Serializer(ABC):
    """
    Base class for Serializers, defining the interface.
    """

    @abstractmethod
    def serialize_data(self, features: Dict) -> Union[tf.train.Example, Dict]:
        raise NotImplementedError


class TrainExampleSerializer(Serializer):
    """Serializes data into the format required by the predict endpoint.
    Implements the utility functions to serialize input data which follows
    the Peltarion conventions into a tf.train.Example.
    """

    def __init__(self, feature_mapping_file_path: str) -> None:
        """
        Defines the mapping from the encodings we can find in the `feature_mapping.json`
        file to the serialization methods.
        :param feature_mapping_file_path: Relative path to the signature file.
        """
        self._encoders = {
            "image": self._image_encoder,
            "numeric": self._numeric_encoder,
            "categorical": self._categorical_encoder,
            "binary": self._binary_encoder,
            "text": self._text_encoder,
        }

        self.features = Features(feature_mapping_file_path=feature_mapping_file_path)

    def serialize_data(self, features: Dict) -> Union[tf.train.Example, Dict]:
        """
        Serializes a dictionary of features into a tf.train.Example with the correct
        feature labels.
        :param features:
            Dict of the form:
            {
                "feature_label": feature_data
            }
        :return: The serialized tf.train.Example as a string tensor.
        """

        tf_train_example = self._serialize_features(
            features, self.features.features_by_name
        )
        return tf_train_example

    def _image_encoder(self, value):
        return self._bytes_feature(value)

    def _numeric_encoder(self, value):
        if isinstance(value, int):
            return self._int64_feature(value)
        else:
            return self._float_feature(value)

    def _categorical_encoder(self, value):
        if isinstance(value, int):
            return self._int64_feature(value)
        else:
            return self._bytes_feature(value)

    def _binary_encoder(self, value):
        return self._bytes_feature(value)

    def _text_encoder(self, value):
        return self._bytes_feature(value)

    @staticmethod
    def _bytes_feature(value) -> tf.train.Feature:
        """
        Returns a bytes_list from a string / byte, either in Python objects or in an
        TensorFlow EagerTensor.
        :param value:
        :return:
        """
        # BytesList won't unpack a string from an EagerTensor,
        # so transform it to a numpy value first
        if tf.is_tensor(value):
            value = value.numpy()
        # Strings are utf-8 encoded into bytes
        if isinstance(value, str):
            value = value.encode("utf-8")
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

    @staticmethod
    def _float_feature(value) -> tf.train.Feature:
        """
        Returns a float_list Feature from a float / double. It can either be a single
        value or a list thereof.
        :param value:
        :return:
        """
        if isinstance(value, (list, np.ndarray)):
            return tf.train.Feature(float_list=tf.train.FloatList(value=value))
        else:
            return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

    @staticmethod
    def _int64_feature(value) -> tf.train.Feature:
        """
        Returns an int64_list from a bool / enum / int / uint. It can either be a single
        value or a list thereof.
        :param value:
        :return:
        """
        if isinstance(value, (list, np.ndarray)):
            return tf.train.Feature(int64_list=tf.train.Int64List(value=value))
        else:
            return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

    def _serialize_feature(self, data: Any, encoding: str) -> tf.train.Feature:
        """
        Takes the data for a feature and transforms it into a tf.train.Feature.
        :param data: The data to be turned into a feature
        :param encoding: The data type, which determines which feature method to use.
        :return:
        """
        # Get the correct method for the data type and use that to transform the feature.
        serializer = self._encoders[encoding]
        serialized_data = serializer(data)
        return serialized_data

    def _serialize_features(
            self, features: Dict[str, Any], feature_mapping: Dict[str, Dict]
    ) -> tf.Tensor:
        """
        Creates a tf.train.Example message ready to be written to a file.
        :param features:
            Dict of the form {feature_label: data}
        :param feature_mapping:
            The feature_mapping works as a lookup table to get the internal ID
            of the feature and the data type
        :return: An tf.train.Example vector
        """
        # Loop over all features, which has the human-readable labels
        # and the data for the features.
        feature = {
            feature_mapping[feature_label]["feature_id"]: self._serialize_feature(
                feature_data, feature_mapping[feature_label]["encoding"]
            )
            for feature_label, feature_data in features.items()
        }

        # Create a tf.train.Example and serialize it to a string
        example_proto = tf.train.Example(features=tf.train.Features(feature=feature))
        example_str = example_proto.SerializeToString()
        return tf.constant(example_str, dtype=tf.string)


class RegressClassifySerializer(Serializer):
    """Serializes data into the format required by the classify and regress endpoints.
    Implements the utility functions to serialize input data which follows
    the Peltarion conventions into a a list of features with the correct feature
    names and base64 encoding for text/bytes.
    """

    def __init__(self, feature_mapping_file_path: str) -> None:
        """
        Defines the mapping from the encodings we can find in the `feature_mapping.json`
        file to the serialization methods.
        :param feature_mapping_file_path: Relative path to the signature file.
        """

        self.features = Features(feature_mapping_file_path=feature_mapping_file_path)

    def serialize_data(self, features: Dict) -> Union[tf.train.Example, Dict]:
        """
        Serializes a dictionary of features into a dict with the correct
        feature labels and base64-encoded text/bytes values.
        :param features:
            Dict of the form:
            {
                "feature_label": feature_data
            }
        :return: The processed feature dict.
        """

        processed_features = self._serialize_features(
            features, self.features.features_by_name
        )
        return processed_features


    @staticmethod
    def _serialize_feature(data: Any, encoding: str) -> Any:
        """Preprocesses the feature if needed.
        :param data: The data to be turned into a feature
        :param encoding: The data type, which determines whether the data needs base64 encoding.
        :return: The processed feature value.
        """
        # text/bytes features have to be base64 encoded
        if encoding in ["text", "image"]:
            processed_data = {"b64": base64.b64encode(data).decode('utf-8')}
        else:
            processed_data = data

        return processed_data

    def _serialize_features(
            self, features: Dict[str, Any], feature_mapping: Dict[str, Dict]
    ) -> Dict:
        # Loop over all features
        processed_features = {
            feature_mapping[feature_label]["feature_id"]: self._serialize_feature(
                feature_data, feature_mapping[feature_label]["encoding"]
            )
            for feature_label, feature_data in features.items()
        }
        return processed_features


class Features:
    """
    Implements the logic to read and parse the `feature_mapping.json` file and use that
    information to access the feature mapping.
    """

    def __init__(self, feature_mapping_file_path: str) -> None:
        """
        Reads the `feature_mapping.json` file from the feature_mapping_file_path.
        :param feature_mapping_file_path: Relative path to the signature file.
        """

        file_path = (Path(os.getcwd()) / feature_mapping_file_path).as_posix()

        with open(file_path, "r") as f:
            self._raw_feature_mapping = json.load(f)

        self.features_by_name: Dict[str, Dict] = {
            feature["label"]: feature
            for feature in self._raw_feature_mapping["features"]
        }

        self.features_by_id: Dict[str, Dict] = {
            feature["feature_id"]: feature
            for feature in self._raw_feature_mapping["features"]
        }

    def get_feature_by_id(self, feature_id: str) -> Optional[Dict]:
        return self.features_by_id.get(feature_id)

    def get_feature_label(self, feature_id: str) -> Optional[str]:
        feature = self.get_feature_by_id(feature_id)
        if not feature:
            return None
        return feature["label"]

    def get_feature_encoding(self, feature_id: str) -> Optional[str]:
        feature = self.get_feature_by_id(feature_id)
        if not feature:
            return None
        return feature["encoding"]