I was following the sentence-completion example from the linked appleTechTalk so I could learn Core ML and use the M1's capabilities without giving up on PyTorch. The example NLP code written by the Apple engineer starts at around the 15:00 mark.
I typed the exact same code, and it worked fine until I hit the cell where he performs the Core ML conversion. It raised an error that he did not get during his run. I have checked my transcription many times and cannot spot any difference from the video. Unfortunately, the example code was not posted anywhere online (or at least I failed to find it), so I cannot copy-paste it to rule out a typo.
What could the error be related to? Is it caused by Core ML (coremltools)? How can this be fixed so the Core ML conversion works?
The code I copied by watching the video is below:
import torch
import numpy as np
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import coremltools as ct
### Model
class FinishMySentence(torch.nn.Module):
    def __init__(self, model=None, eos=198):
        super(FinishMySentence, self).__init__()
        self.eos = torch.tensor([eos])
        self.next_token_predictor = model
        self.default_token = torch.tensor([0])  # denotes the beginning of a sentence

    def forward(self, x):
        sentence = x
        token = self.default_token
        while token != self.eos:  # loop/predict until the end-of-sentence token is generated
            predictions, _ = self.next_token_predictor(sentence)  # takes a list of tokens and predicts the next one
            token = torch.argmax(predictions[-1, :], dim=0, keepdim=True)
            sentence = torch.cat((sentence, token), 0)
        return sentence
### Initialize the Token Predictor
token_predictor = GPT2LMHeadModel.from_pretrained("gpt2", torchscript=True).eval()
### Trace the token predictor
random_tokens = torch.randint(10000, (5,))
traced_token_predictor = torch.jit.trace(token_predictor, random_tokens)
### Script the Outer Loop
model = FinishMySentence(model=traced_token_predictor)
scripted_model = torch.jit.script(model)
### Convert to Core ML
# in inputs, give the range for the sequence dimension to be between [1, 64]
mlmodel = ct.convert(
    scripted_model,
    inputs=[ct.TensorType(name="context", shape=(ct.RangeDim(1, 64),), dtype=np.int32)],
)