CreateML training API soft-locks at 90%

Question

richardsnipers OP

Created Nov ’23

Hello, I'm trying to train an MLImageClassifier model in Swift using the MLImageClassifier.train function.

The dataset size doesn't seem to matter (I have the same problem with a smaller one): when training reaches a completedUnitCount of 9 out of 10, CPU usage stays high, but the job appears to soft-lock and never finishes, either with a model or with an error.

The dataset consists of JPG images. Training on the same dataset in the Create ML app works without any problem.

Is there a known issue with the CreateML training APIs around step 9 of the process? Is there any information about what this part of the training job does?

Thank you

Answer 1

richardsnipers OP

Nov ’23

Use:

var mlImgClass = try? MLImageClassifier(trainingData: datasource, parameters: parameters)

Instead of:

var trainJob = try MLImageClassifier.train(
    trainingData: datasource,
    parameters: parameters,
    sessionParameters: sessionParameters
)
 
(... handle the job progress ...)

I'm sorry to say that I don't understand why creating a job causes so many complications, but using the MLImageClassifier initializer directly is also faster and more efficient.
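For anyone who wants to try the direct initializer, here is a minimal sketch of how it could be wrapped. The function name trainSynchronously and the outputURL parameter are my own, not part of CreateML; datasource and parameters are the same values as in the snippets above. The initializer blocks the calling thread until training finishes or throws, so there is no job or progress object to manage.

```swift
import CreateML
import Foundation

// Hypothetical helper: trains with the blocking initializer and saves the model.
// `outputURL` is an assumed destination such as ".../Classifier.mlmodel".
func trainSynchronously(
    datasource: MLImageClassifier.DataSource,
    parameters: MLImageClassifier.ModelParameters,
    outputURL: URL
) throws -> MLImageClassifier {
    // Blocks until training completes; any failure is thrown to the caller
    // instead of being swallowed by `try?`.
    let classifier = try MLImageClassifier(
        trainingData: datasource,
        parameters: parameters
    )

    // Report the training error rate (0.0 means every training image was classified correctly).
    print("Training error: \(classifier.trainingMetrics.classificationError)")

    // Save the trained model to disk as a Core ML model file.
    try classifier.write(to: outputURL)
    return classifier
}
```

A datasource for a folder with one subfolder per label can be built with MLImageClassifier.DataSource.labeledDirectories(at:). Using try rather than try? keeps the error visible if training fails.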

It is possible that the train function behaves similarly to makeTrainingSession, where you manage the session through an MLTrainingSession. In that case you get an array of checkpoints (MLCheckpoint). Each checkpoint has a url: you should load the model from that location to make a prediction. The official documentation is not intuitive enough.
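If you still want to use the job-based API, the following is a rough sketch of how the job returned by train could be observed, assuming the usual Combine publishers on MLJob (checkpoints and result). I have not confirmed that this avoids the stall at 9 of 10; it only shows where the finished model or the error would be delivered, which the progress count alone does not tell you. The names startJob, subscriptions and outputURL are my own.

```swift
import Combine
import CreateML
import Foundation

// The subscriptions must stay alive while the job runs; if they are
// released, the callbacks below are never called.
var subscriptions = Set<AnyCancellable>()

// Hypothetical helper: starts the job and listens for checkpoints and the result.
func startJob(
    datasource: MLImageClassifier.DataSource,
    parameters: MLImageClassifier.ModelParameters,
    sessionParameters: MLTrainingSessionParameters,
    outputURL: URL
) throws {
    let job = try MLImageClassifier.train(
        trainingData: datasource,
        parameters: parameters,
        sessionParameters: sessionParameters
    )

    // Each checkpoint carries the url where its intermediate state was saved.
    job.checkpoints
        .sink { checkpoint in
            print("Checkpoint \(checkpoint.iteration): \(checkpoint.url.path)")
        }
        .store(in: &subscriptions)

    // The trained classifier (or the error) arrives here, not through `progress`.
    job.result
        .sink(receiveCompletion: { completion in
            if case .failure(let error) = completion {
                print("Training failed: \(error)")
            }
        }, receiveValue: { classifier in
            do {
                try classifier.write(to: outputURL)
                print("Model written to \(outputURL.path)")
            } catch {
                print("Could not write model: \(error)")
            }
        })
        .store(in: &subscriptions)
}
```

In a command-line tool the process also has to stay alive until the result arrives (for example by running the main run loop), otherwise the program exits before the job completes.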
