Model Fine-tuning For Automated Augmented Reality Descriptions


Abstract

A second input image is generated by applying a target augmented reality (AR) effect to a first input image. The first input image and the second input image are provided to a first visual-semantic machine learning model to obtain output describing at least one feature of the target AR effect. The first visual-semantic machine learning model is fine-tuned from a second visual-semantic machine learning model by using training samples. Each training sample comprises a first training image, a second training image, and a training description of a given AR effect. The second training image is generated by applying the given AR effect to the first training image. A description of the target AR effect is selected based on the output of the first visual-semantic machine learning model. The description of the target AR effect is stored in association with an identifier of the target AR effect.
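The data flow in the abstract can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in rather than the claimed implementation: images are toy grayscale grids, `brighten` plays the role of an AR effect, and `toy_model` replaces the fine-tuned visual-semantic model with a simple brightness comparison that returns scored candidate descriptions. The fine-tuning step itself is not implemented; `make_training_sample` only shows how one training sample would be assembled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumption: a toy grayscale image represented as rows of 0-255 pixel values.
Image = List[List[int]]
Effect = Callable[[Image], Image]
Candidates = List[Tuple[str, float]]  # (description, score) pairs
Model = Callable[[Image, Image], Candidates]


@dataclass
class TrainingSample:
    first_image: Image
    second_image: Image
    description: str


def make_training_sample(first_image: Image, effect: Effect,
                         description: str) -> TrainingSample:
    """Build one sample: the second image is the effect applied to the first."""
    return TrainingSample(first_image, effect(first_image), description)


def describe_effect(effect_id: str, effect: Effect, first_image: Image,
                    model: Model, store: Dict[str, str]) -> str:
    """Generate the image pair, query the model, select and store a description."""
    second_image = effect(first_image)
    candidates = model(first_image, second_image)
    # Selection rule is an assumption: keep the highest-scoring candidate.
    description = max(candidates, key=lambda c: c[1])[0]
    store[effect_id] = description
    return description


def brighten(image: Image) -> Image:
    """Hypothetical AR effect: add 50 to every pixel, clamped at 255."""
    return [[min(255, p + 50) for p in row] for row in image]


def toy_model(first: Image, second: Image) -> Candidates:
    """Stand-in for the fine-tuned model: compares mean brightness only."""
    def mean(img: Image) -> float:
        pixels = [p for row in img for p in row]
        return sum(pixels) / len(pixels)

    brighter = mean(second) > mean(first)
    return [
        ("brightens the image", 0.9 if brighter else 0.1),
        ("darkens the image", 0.1 if brighter else 0.9),
    ]


if __name__ == "__main__":
    descriptions: Dict[str, str] = {}
    describe_effect("effect-001", brighten, [[10, 20], [30, 40]],
                    toy_model, descriptions)
    print(descriptions)
```

In this sketch the store is a plain dictionary keyed by the effect identifier; a real system would presumably use a database or catalog, which the abstract does not specify.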

Full Text

What is claimed is:

Timeline

Filed: 02/23/2026

Published: 06/25/2026

Granted: Not Available

IPC Codes (7)

G06V 20/70: Labelling scene content, e.g. deriving syntactic or semantic representations

G06F 40/40: Processing or translation of natural language (natural language analysis G06F 40/20; semantic analysis G06F 40/30)

G06N 3/0455: Auto-encoder networks; Encoder-decoder networks