Import model predictions

Incorporate model predictions into Encord Active

Importing machine learning model predictions lets Encord Active provide visualizations, evaluate models, identify failure modes, detect labeling errors, and prioritize high-value data for re-labeling.

note

If your predictions are in the COCO results format, you can skip ahead to the COCO predictions subsection.

To import your model predictions into Encord Active, follow these three steps:

1. Covering the basics
2. Prepare a .pkl File to be Imported
3. Import the predictions via the CLI

If you are already familiar with data_hashes and featureNodeHashes from the Encord platform, you can safely skip to 2. Prepare a .pkl File to be Imported. Just note that when specifying the class_id for a prediction, Encord Active expects the associated featureNodeHash from the Encord ontology as id.

tip

In the SDK section, you will also find ways to import predictions from KITTI files and from directories containing mask files.

Covering the basics

Before diving into the details, there are a couple of things that need to be covered.

info

All commands used from this point onward assume that the current working directory is the project folder. If it is not, please either navigate to it or utilize the --target option available in each command.

Uniquely identifying data units

At Encord, every data unit has a data_hash which uniquely defines it. To view the mapping between the data_hash values in your Encord project and the corresponding filenames that were uploaded, execute the following CLI command:

encord-active print data-mapping

After you select the project for which you want to generate the mapping, the command prints a JSON object of key-value pairs (data_hash, data_file_name), resembling the following:

{
  "c115344f-6869-4608-a4b8-644241fea10c": "image_1.jpg",
  "5973f5b6-d284-4a71-9e7e-6576aa3e56cb": "image_2.jpg",
  "9f4dae86-cad4-42f8-bb47-b179eb2e4886": "video_1.mp4"
  ...
}

tip

To store the data mapping as data_mapping.json in the current working directory, run

encord-active print --json data-mapping

Please note that in the case of image groups, each individual image within the group has its own unique data_hash, whereas videos have a single data_hash representing the entire video. As a consequence, predictions for videos will also need a frame to uniquely define where the prediction belongs.

caution

When you are preparing predictions for import, you need to have the data_hash and potentially the frame available.
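If your model outputs are keyed by file name rather than by data_hash, the mapping above can be inverted to look the hashes up. The following is a minimal sketch, assuming the mapping was saved as data_mapping.json as described in the tip above. The helper names invert_data_mapping and load_data_mapping are hypothetical and are not part of Encord Active.

```python
import json
from pathlib import Path


def invert_data_mapping(hash_to_name):
    """Turn {data_hash: file_name} into {file_name: data_hash}.

    Raises ValueError if two data units share a file name, because the
    reverse lookup would then be ambiguous.
    """
    name_to_hash = {}
    for data_hash, file_name in hash_to_name.items():
        if file_name in name_to_hash:
            raise ValueError(f"File name used more than once: {file_name}")
        name_to_hash[file_name] = data_hash
    return name_to_hash


def load_data_mapping(path="data_mapping.json"):
    """Read the JSON file saved by `encord-active print --json data-mapping`."""
    return json.loads(Path(path).read_text())
```

For example, invert_data_mapping(load_data_mapping())["image_1.jpg"] would return the data_hash of image_1.jpg. For videos, the lookup gives a single data_hash, and the frame still has to come from your own model output.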

Uniquely identifying predicted classes

The second thing you will need when preparing predictions for import is the class_id for each prediction. The class_id tells Encord Active which class the prediction is associated with.

The class_id values in an Encord project are determined by the featureNodeHash attribute associated with labels in the Encord ontology. You can conveniently print the class names and corresponding class_id values of your project ontology via the CLI:

encord-active print ontology

After you select the project whose ontology you want to print, the command prints a JSON object of key-value pairs (label_name, class_id), resembling the following:

{
  "objects": {
    "cat": "OTK8MrM3",
    "dog": "Nr52O8Ex",
    "horse": "MjkXn2Mx"
  },
  "classifications": {...}
}

As classifications with nested and/or checklist attributes (e.g. has a dog? yes/no -> explain why?) are represented once for each attribute answer, it's necessary to uniquely identify each classification and corresponding answer. This requires utilizing the respective classification, attribute and option hashes from the ontology.

{
  "objects": {...},
  "classifications": {
    "horses": {
      "feature_hash": "55eab8b3",
      "attribute_hash": "d446851e",
      "option_hash": "376b9761"
    },
    "cats": {
      "feature_hash": "55eab8b3",
      "attribute_hash": "d446851e",
      "option_hash": "d8e85460"
    },
    "dogs": {
      "feature_hash": "55eab8b3",
      "attribute_hash": "d446851e",
      "option_hash": "e5264a59"
    }
  }
}

tip

To store the ontology as ontology_output.json in the current working directory, run

encord-active print --json ontology
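Once the ontology is saved, the hashes can be looked up by name in code instead of being copied by hand. The following is a minimal sketch, assuming the saved file has the "objects" and "classifications" structure shown above. The helper names load_ontology, object_class_id, and classification_hashes are hypothetical and are not part of Encord Active.

```python
import json
from pathlib import Path


def load_ontology(path="ontology_output.json"):
    """Read the JSON file saved by `encord-active print --json ontology`."""
    return json.loads(Path(path).read_text())


def object_class_id(ontology, label_name):
    """Return the class_id (featureNodeHash) for an object label name."""
    try:
        return ontology["objects"][label_name]
    except KeyError:
        raise KeyError(f"No object named {label_name!r} in the ontology") from None


def classification_hashes(ontology, label_name):
    """Return the feature, attribute, and option hashes for a classification."""
    entry = ontology["classifications"][label_name]
    return {
        "feature_hash": entry["feature_hash"],
        "attribute_hash": entry["attribute_hash"],
        "option_hash": entry["option_hash"],
    }
```

With the example ontology above, object_class_id(ontology, "cat") gives "OTK8MrM3", and classification_hashes(ontology, "dogs") gives the three hashes needed to identify that classification answer.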

Prepare a .pkl File to be Imported

Now, you can prepare a pickle file (.pkl) to be imported by Encord Active. You can do this by building a list of Prediction objects. A prediction object holds a unique identifier of the data unit (the data_hash and potentially frame), the class_id, a model confidence score, the actual prediction data, and the format of that data.

Creating a Prediction label

Four prediction types are supported. The example below illustrates how to create a Prediction for a classification:

prediction = Prediction(
    data_hash="<data_hash>",
    frame=3,  # optional frame for videos
    confidence=0.8,
    classification=FrameClassification(
        feature_hash="<class_id>",
        attribute_hash="<attribute_hash>",
        option_hash="<option_hash>",
    ),
)

tip

To find the three hashes, we can inspect the ontology by running

encord-active print ontology

Creating the pickle file

Now you're ready to create the pickle file. You can select the appropriate snippet based on your prediction format from above and paste it in the code below.

Pay attention to the with open(...) line, as it specifies the location where the .pkl file will be stored.

import pickle
from encord_active.lib.db.predictions import Prediction, Format

predictions_to_store = []

for prediction in my_predictions:  # Iterate over your predictions
    predictions_to_store.append(
        # PASTE appropriate prediction snippet from above
    )

with open("/path/to/predictions.pkl", "wb") as f:
    pickle.dump(predictions_to_store, f)

In the above code snippet, you will have to fetch data_hash, class_id, etc., for each prediction inside the for loop over my_predictions.
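Before importing, it can be worth confirming that the file loads back as a non-empty list. The following is a minimal sketch using only the standard library. The helper name count_pickled_predictions is hypothetical. Note that unpickling a file of Prediction objects requires encord_active to be importable in the same environment, and that pickle files should only be loaded when you created them yourself.

```python
import pickle


def count_pickled_predictions(path):
    """Load a pickle file and return how many items the stored list holds."""
    with open(path, "rb") as f:
        stored = pickle.load(f)
    # The file is expected to hold a list, as built in the snippet above.
    if not isinstance(stored, list):
        raise TypeError(
            f"Expected a list of predictions, got {type(stored).__name__}"
        )
    return len(stored)
```

Calling count_pickled_predictions("/path/to/predictions.pkl") should return the number of predictions you appended. A count of zero usually means the loop never ran.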

Import the predictions via the CLI

To import the predictions into Encord Active, execute the following command in the CLI:

encord-active import predictions /path/to/predictions.pkl

This will import your predictions into Encord Active and run all the metrics on your predictions.

Easy imports

Encord Active streamlines the import of well-known model prediction formats, which makes it easy to integrate different model types.

The following subsections outline simplified methods to import popular formats, bypassing the previous 3-step process.

COCO predictions

info

Make sure you have installed Encord Active with the coco extras.

note

This command assumes that you have imported your project using the COCO importer and that the current working directory is the project folder.

Importing COCO predictions is currently the easiest way to import predictions into Encord Active.

You need to have a results JSON file following the COCO results format and run the following command on it:

info

Make sure that the annotation coordinates in the COCO result file are not normalized (not scaled into [0-1]).

encord-active import predictions --coco results.json

After the execution is done, you are ready to evaluate your model performance.
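Because the coordinates in the results file must not be normalized, a quick check before running the import command can catch the problem early. The following is a heuristic sketch, assuming a detection results file in the COCO results format, that is, a JSON list of entries with a "bbox" field of [x, y, width, height]. The helper names looks_normalized and check_results_file are hypothetical and are not part of Encord Active.

```python
import json


def looks_normalized(results):
    """Heuristic: True if every bbox value in the results lies within [0, 1].

    Pixel coordinates almost always exceed 1 somewhere, so a file in which
    no value does is probably normalized. Returns False if there are no boxes.
    """
    boxes = [entry["bbox"] for entry in results if "bbox" in entry]
    if not boxes:
        return False
    return all(0.0 <= value <= 1.0 for box in boxes for value in box)


def check_results_file(path):
    """Load a COCO results JSON file and raise if its boxes look normalized."""
    with open(path) as f:
        results = json.load(f)
    if looks_normalized(results):
        raise ValueError(
            f"{path}: all bbox values lie in [0, 1]; expected pixel coordinates"
        )
```

If check_results_file("results.json") raises, the boxes would likely need to be scaled by each image's width and height before importing.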