Person Description Model
-Person Search With Natural Language Description-
DESCRIPTION
import numpy as np

ort_inputs = {
    # Dummy image tensor; the image branch is not used for a text query.
    "images": np.zeros([1, 3, 384, 128], dtype=np.float32),
    # Tokenized search query, shaped [1, 64].
    "txt": np.expand_dims(np.array(tokens["input_ids"], dtype=np.int64), axis=0),
    # All-ones mask so the model attends to the full query.
    "attention_mask": np.ones([1, 64], dtype=np.int64),
}
ort_outs = ort_session.run(None, ort_inputs)
text_emb = ort_outs[1]  # shape (1, 2048)
The example code above shows how to run a text query against the model. ort_session is an ONNX Runtime inference session; a sketch of creating it follows.
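A minimal sketch of creating the session, assuming the exported model is stored in a hypothetical file named person_description.onnx:

import onnxruntime as ort

# Hypothetical model path; point this at your exported ONNX file.
ort_session = ort.InferenceSession("person_description.onnx")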
During inference, the input image is not needed, because the video to be searched must be pre-indexed; images can therefore simply be an all-zeros array. The txt input is the tokenized search query, using the same tokenizer as the WangchanBERTa model (see the sketch after this paragraph). Since the model must be able to attend to the entire query text, attention_mask is all ones.
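A minimal tokenization sketch producing the tokens dictionary used above, assuming the Hugging Face checkpoint airesearch/wangchanberta-base-att-spm-uncased (substitute the tokenizer that matches your deployment) and a hypothetical example query:

from transformers import AutoTokenizer

# Assumed WangchanBERTa checkpoint on the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("airesearch/wangchanberta-base-att-spm-uncased")

# Pad/truncate to 64 tokens to match the [1, 64] inputs above.
tokens = tokenizer(
    "ผู้ชายใส่เสื้อสีแดง",  # example query: "a man wearing a red shirt"
    padding="max_length",
    truncation=True,
    max_length=64,
)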
The output text_emb is the embedding of the description text, a NumPy array of shape (1, 2048). It can be compared against the pre-computed image embedding of each person in the video, for example with approximate nearest-neighbour search.
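A brute-force sketch of that comparison, assuming the per-person image embeddings were saved to a hypothetical person_embeddings.npy file; for large galleries, an approximate nearest-neighbour library such as FAISS can replace this exact search:

import numpy as np

# Hypothetical pre-indexed gallery: one 2048-d embedding per detected person.
person_embs = np.load("person_embeddings.npy")  # shape (num_persons, 2048)

# Cosine similarity between the query embedding and every person embedding.
q = text_emb[0] / np.linalg.norm(text_emb[0])
g = person_embs / np.linalg.norm(person_embs, axis=1, keepdims=True)
scores = g @ q  # shape (num_persons,)

top5 = np.argsort(-scores)[:5]  # indices of the five best-matching persons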