CreateML Assertion Failure when training Hand Pose model with 5k+ static images

Hey all, we are currently training a Hand Pose model with the current release of CreateML, and during the feature extraction phase, we get the following error:

Assertion failed: (/AppleInternal/Library/BuildRoots/d9889869-120b-11ee-b796-7a03568b17ac/Library/Caches/com.apple.xbs/Sources/TuriCore/turicreate_oss/src/core/storage/DataTable_data/DataColumn_v2_block_manager.cpp:105): seg->blocks.size()>column_id  [0 > 0]

We have searched for this online and tried to mitigate the issue, but we are getting nowhere. Has anyone else experienced this?

Posted by dougkilby

Replies

Thanks for providing this, we'll look into it. If you could provide the dataset it would be a big help!

I am encountering a similar error. I am using the https://www.kaggle.com/datasets/innominate817/hagrid-classification-512p/ dataset, and subsampling from it.

I also get an error in the pretraining, feature extraction phase: Human Hand Pose Detector was given zero-dimensioned image (0 x 0).

I have tried sampling the first x images from each of the 18 hand pose classes in the dataset. If x = 200 (3600 images) or less, everything runs fine, but if x = 300 (5400 images), I get this error once file extraction/prep reaches the late 4000s.
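For anyone who wants to reproduce the subsampling, here is a minimal sketch of taking the first x images per class. It assumes the dataset is laid out as one folder per class, and that "first" means first by sorted filename; the function name `subsample_first_n` and the folder names are made up for illustration and are not taken from the dataset or from Create ML.

```python
from pathlib import Path
import shutil


def subsample_first_n(src_root, dst_root, n):
    """Copy the first n files (sorted by name) of each class folder.

    Assumes src_root contains one sub-folder per class. Returns a dict
    mapping class name to the number of files copied.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    counts = {}
    for class_dir in sorted(p for p in src_root.iterdir() if p.is_dir()):
        # Sort so the "first n" selection is the same on every run.
        files = sorted(f for f in class_dir.iterdir() if f.is_file())[:n]
        out_dir = dst_root / class_dir.name
        out_dir.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy2(f, out_dir / f.name)
        counts[class_dir.name] = len(files)
    return counts
```

With 18 classes, something like `subsample_first_n("hagrid", "subset_300", 300)` (hypothetical folder names) would copy 18 x 300 = 5400 images, which is the size at which the error shows up for me.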

I have checked a couple things:

  1. Every image in the dataset has been checked with a Python script using PIL to ensure it is not actually a zero-dimensional image.
  2. I get the same error if I use images 1-200 plus 201-300 as when I use images 1-200 plus 301-400, so I do not believe there is an issue with a particular image.
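A check along the lines of item 1 could look roughly like the sketch below. This is not the exact script I ran; the names `pil_size` and `find_bad_images` and the extension list are assumptions for illustration. It only confirms that each file opens and reports a non-zero width and height, and says nothing about what Create ML's hand pose detector does with the image internally.

```python
from pathlib import Path


def pil_size(path):
    """Return (width, height) of an image file using Pillow."""
    from PIL import Image  # imported lazily so Pillow is only needed here

    with Image.open(path) as img:
        return img.size


def find_bad_images(root, size_of=pil_size, exts=(".jpg", ".jpeg", ".png")):
    """Walk root recursively and list images that are unreadable or 0-sized.

    size_of is any callable returning (width, height) for a path; it
    defaults to the Pillow-based reader above.
    """
    bad = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in exts:
            continue
        try:
            width, height = size_of(path)
        except Exception as exc:  # corrupt or unsupported file
            bad.append((path, f"unreadable: {exc}"))
            continue
        if width == 0 or height == 0:
            bad.append((path, f"zero-dimensioned ({width} x {height})"))
    return bad
```

An empty list from `find_bad_images("subset_300")` (hypothetical folder name) means every image opened and had non-zero dimensions, which is the result I got for my subset.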

It seems as though there is a cap on the number of images allowed in a Create ML workspace, though I am not sure why.

Thank you!

Wanted to resurface this thread. I am experiencing the same issue in Create ML with a 26-class dataset where each class has 3000 images. Similar to @amlawson98, Create ML errors out for me once it reaches the early 4000s of images. I also followed the same steps to check for zero-dimensional images. For a dataset with this many classes, being limited to 4000 or so images isn't ideal, since that leaves a pretty sparse amount of data per class for the model to train on.

I am using a dataset on Kaggle, but I can't link it here due to forum rules.

Other specs: 16" MacBook Pro with M1 Pro, Create ML Version 5.0 (121.3), Xcode Version 15.1 (15C65).