I was following the sentence-completion example from the linked appleTechTalk so I could learn Core ML and use the M1's capabilities without giving up on PyTorch. The example NLP code written by the Apple engineer starts at around the 15:00 mark.
I typed the exact same code, and it worked fine until I hit the cell where he performs the Core ML conversion. It raised an error that he did not get during his run. I have checked my transcription many times and cannot spot any difference from the video. Unfortunately, the example code was not posted anywhere online (or at least I failed to find it), so I cannot copy-paste it to rule out a typo.
What could the error be related to? Is it caused by Core ML (coremltools)? How can this be fixed so the Core ML conversion works?
The code I copied by watching the video is below:
import torch
import numpy as np
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import coremltools as ct
### Model
class FinishMySentence(torch.nn.Module):
    def __init__(self, model=None, eos=198):
        super(FinishMySentence, self).__init__()
        self.eos = torch.tensor([eos])
        self.next_token_predictor = model
        self.default_token = torch.tensor([0])  # denotes the beginning of a sentence

    def forward(self, x):
        sentence = x
        token = self.default_token
        while token != self.eos:  # loop/predict until the end-of-sentence token is generated
            predictions, _ = self.next_token_predictor(sentence)  # takes a list of tokens and predicts the next one
            token = torch.argmax(predictions[-1, :], dim=0, keepdim=True)
            sentence = torch.cat((sentence, token), 0)
        return sentence
### Initialize the Token Predictor
token_predictor = GPT2LMHeadModel.from_pretrained("gpt2", torchscript=True).eval()
### Trace the token predictor
random_tokens = torch.randint(10000, (5,))
traced_token_predictor = torch.jit.trace(token_predictor, random_tokens)
### Script the Outer Loop
model = FinishMySentence(model=traced_token_predictor)
scripted_model = torch.jit.script(model)
### Convert to Core ML
# in inputs, give the range for the sequence dimension to be between [1, 64]
mlmodel = ct.convert(
    scripted_model,
    inputs=[ct.TensorType(name="context", shape=(ct.RangeDim(1, 64),), dtype=np.int32)],
)