NLP Framework

Hi Everyone,


I am going to ask five questions, so please bear with me; I believe they are all related.


Question 1. Identifying a place

I tried to tag the location in a sentence with NLTagger, using the following code.

import NaturalLanguage

let input = "Hello, i am looking for Harry Potter in london"
let scheme: NLTagScheme = .nameType
if #available(iOS 12, *) {
    let tagger = NLTagger(tagSchemes: [scheme])
    tagger.string = input

    // Detect the dominant language and apply it to the whole string.
    let recognizer = NLLanguageRecognizer()
    recognizer.processString(input)
    // Can we do any better than this?
    // What if there are two languages?
    if let dominantLang = recognizer.dominantLanguage {
        tagger.setLanguage(
            dominantLang,
            range: input.startIndex..<input.endIndex)
    }

    let tags = tagger.tags(
        in: input.startIndex..<input.endIndex,
        unit: .word,
        scheme: scheme,
        options: [.omitWhitespace, .omitPunctuation])

    // Print each tag together with the word it covers.
    tags.forEach { print("Tag: ", $0.0 as Any, " Value:", input[$0.1]) }
}


This resulted in "london" being classified as "OtherWord". However, if I provide the input as "Hello, i am looking for Harry Potter in London", then "London" is classified as "PlaceName".


What am I doing wrong in the above code snippet? (I tried it in a playground.)

---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Question 2. Why does an imported Core ML model need an extra 'c' at the end of the 'withExtension:' parameter in the following snippet?

import Foundation
import NaturalLanguage
import CoreML

if #available(iOS 12, *) {
    let mURL = Bundle.main.url(
        forResource: "timeTokenTagger",
        withExtension: "mlmodelc") // Extra 'c' at the end of 'mlmodel'. Why?
    if let modelURL = mURL {
        do {
            // NLModel(contentsOf:) throws, so handle a failed load explicitly.
            let model = try NLModel(contentsOf: modelURL)
            // predictedLabel(for:) returns an optional String.
            let label = model.predictedLabel(for: "yesterday, i went to see a movie")
            print(label ?? "No label")
        } catch {
            print("Could not load model:", error)
        }
    } else {
        print("No model")
    }
}

---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Question 3. Can the NLP models built into the SDK identify time tokens in text, or do I have to train and bring my own model?

---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Question 4. If I have to train my own model to identify time tokens, am I setting up the training data correctly?

[
 {
 "tokens": ["I", "went", "home", "yesterday"],
 "labels": ["NONE", "NONE", "NONE", "TIME"]
 },
 {
 "tokens": ["I", "went", "fishing", "yesterday"],
 "labels": ["NONE", "NONE", "NONE", "TIME"]
 },
 {
 "tokens": ["last", "week", "there", "was", "a", "huge", "storm"],
 "labels": ["NONE", "TIME", "NONE", "NONE", "NONE", "NONE", "NONE"]
 }
]
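
One thing that can be checked before training is that every entry has exactly one label per token. Below is a minimal sketch of such a check, assuming the JSON layout shown above. The `TaggedSentence` type, the `mismatchedSentences` helper, and the deliberately broken second sample entry are hypothetical names and data made up for illustration; they are not part of any Apple framework.

```swift
import Foundation

// Hypothetical type mirroring one entry of the training JSON above.
struct TaggedSentence: Decodable {
    let tokens: [String]
    let labels: [String]
}

// Returns the indices of entries whose token and label counts differ.
func mismatchedSentences(in jsonData: Data) throws -> [Int] {
    let sentences = try JSONDecoder().decode([TaggedSentence].self, from: jsonData)
    return sentences.enumerated()
        .filter { $0.element.tokens.count != $0.element.labels.count }
        .map { $0.offset }
}

// Illustrative sample: the second entry is intentionally missing a label.
let sample = """
[
 {"tokens": ["I", "went", "home", "yesterday"],
  "labels": ["NONE", "NONE", "NONE", "TIME"]},
 {"tokens": ["last", "week"],
  "labels": ["NONE"]}
]
"""

do {
    let bad = try mismatchedSentences(in: Data(sample.utf8))
    print("Entries with mismatched counts:", bad)
} catch {
    print("Could not decode training data:", error)
}
```

This only validates the shape of the data; it says nothing about whether the labels themselves are a good choice for the model.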

---------------------------------------------------------------------------------------------------------------------------------------------------------------------

Question 5. Any pointers to a pretrained model, in TensorFlow or anywhere else, would be much appreciated.

---------------------------------------------------------------------------------------------------------------------------------------------------------------------


Thanks for reading the questions.


Cheers

Arun
