Find Semantic Similarity Between Models?

The video "Explore Natural Language multilingual models" (https://developer.apple.com/videos/play/wwdc2023/10042/) says at 6:24 that there are three models.

I wonder if it is possible to find semantic similarity between models. For example, English and Japanese belong to different models (Latin and CJK). Can we compare the vectors produced by the different models to find out whether two sentences have similar meanings?

Replies

I read through all the documentation on NLContextualEmbedding and still don't understand how to find the similarity between two sentences.

import NaturalLanguage

static func train(_ sentence: String, text: String) async throws {

    // One [Double] vector per token in the sentence.
    var vectors: [[Double]] = []

    let embedding = NLContextualEmbedding(language: .russian)

    // The model has to be loaded before it can produce embeddings.
    try embedding?.load()

    let result = try embedding?.embeddingResult(for: sentence, language: .russian)

    print(result?.sequenceLength ?? 0)

    result?.enumerateTokenVectors(in: sentence.startIndex ..< sentence.endIndex) { (tokenVector, _) in
        vectors.append(tokenVector)
        return true
    }
}

Each sentence (String) I embed gives back an array of [Double] token vectors, and that array has a different length every time.

How can I compare the tokenVectors arrays?

Please advise: is it possible to find similarities?
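
One common way to compare sentences whose token-vector arrays differ in length is to average the token vectors into a single fixed-length vector (mean pooling) and then take the cosine similarity of the two averages. The following is a minimal sketch of that idea, not an approach confirmed by Apple's documentation. The helper names meanPool and cosineSimilarity are hypothetical, and the sketch assumes both sentences yield token vectors of the same dimension. It also does not establish that vectors from different models (for example Latin and CJK) live in a comparable space.

```swift
import Foundation

/// Averages a sentence's token vectors into one fixed-length vector (mean pooling).
/// Returns nil when there are no token vectors or their dimensions differ.
/// Hypothetical helper, not part of the NaturalLanguage framework.
func meanPool(_ tokenVectors: [[Double]]) -> [Double]? {
    guard let dimension = tokenVectors.first?.count, dimension > 0 else { return nil }
    var sum = [Double](repeating: 0, count: dimension)
    for vector in tokenVectors {
        guard vector.count == dimension else { return nil }
        for i in 0..<dimension {
            sum[i] += vector[i]
        }
    }
    let count = Double(tokenVectors.count)
    return sum.map { $0 / count }
}

/// Cosine similarity of two vectors, in the range -1...1.
/// Returns nil when the dimensions differ or either vector is all zeros.
/// Hypothetical helper, not part of the NaturalLanguage framework.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double? {
    guard a.count == b.count, !a.isEmpty else { return nil }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for i in 0..<a.count {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    guard normA > 0, normB > 0 else { return nil }
    return dot / (normA.squareRoot() * normB.squareRoot())
}
```

With the vectors arrays collected by enumerateTokenVectors for two sentences, the call would look like cosineSimilarity(meanPool(vectorsA)!, meanPool(vectorsB)!), where vectorsA and vectorsB are placeholder names. Values closer to 1 suggest more similar sentences. Mean pooling discards word order, so treat the score as a rough signal.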