Transform Object Detection dataset from COCO to CreateML format

Question

Created Jul ’22

COCO has set a standard for the dataset format used in object detection tasks. Is there a tool for translating a dataset from this format to the CreateML format, so that the data can be used in CreateML for training and evaluation? I've found Roboflow, but I would prefer a Python script to that platform, as it seems too complex for my needs.

Answer 1

yeldarb OP

Jul ’22

You can download the COCO dataset in CreateML JSON format from here: https://universe.roboflow.com/jacob-solawetz/microsoft-coco/dataset/2

There are also over 90,000 other user-shared datasets there you can download for free in the CreateML format: https://roboflow.com/universe

Or use the pre-trained models users have shared directly with the Swift SDK (the model it uses behind the scenes is newer and more accurate than the one used by CreateML): https://blog.roboflow.com/roboflow-ios-sdk

Answer 2

derrkater OP

Jul ’22

I stand by my question, as I already have a dataset that was annotated in COCO format :)

Answer 3

OP

Apple

Aug ’22

Here's a brief Python script that would convert the COCO annotation format to CreateML.

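import json
from collections import defaultdict

# Load the COCO annotations.
with open('annotations.coco.json') as f:
  input_annotations = json.load(f)

# Map image ids to file names and category ids to readable names.
image_dict = {image['id']: image['file_name'] for image in input_annotations['images']}
category_dict = {category['id']: category['name'] for category in input_annotations.get('categories', [])}

# Collect all annotations that belong to the same image.
annotations_by_image = defaultdict(list)
for annotation in input_annotations['annotations']:
  # COCO boxes are [x_min, y_min, width, height]; Create ML expects the box center.
  x_min, y_min, width, height = annotation['bbox']
  coords_dict = {
    'x': x_min + width / 2,
    'y': y_min + height / 2,
    'width': width,
    'height': height,
  }
  # Use the category name as the label; fall back to the id as a string.
  category_id = annotation['category_id']
  label = category_dict.get(category_id, str(category_id))
  annotations_by_image[annotation['image_id']].append({'coordinates': coords_dict, 'label': label})

# One entry per image, using the 'image' and 'annotations' keys from Apple's documented format.
output_annotations = [
  {'image': image_dict[image_id], 'annotations': annotations}
  for image_id, annotations in annotations_by_image.items()
]

with open('annotations.createml.json', 'w') as f:
  json.dump(output_annotations, f)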
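
As a quick sanity check on the coordinate math in the script above, here is a minimal sketch of just the bounding box step. It assumes the usual COCO convention of [x_min, y_min, width, height] with the origin at the top-left of the image, and a center-based box on the CreateML side. The helper name coco_bbox_to_createml and the sample box are made up for illustration.

```python
def coco_bbox_to_createml(bbox):
  """Convert a COCO box [x_min, y_min, width, height] to center-based coordinates."""
  x_min, y_min, width, height = bbox
  return {
    'x': x_min + width / 2,   # horizontal center of the box
    'y': y_min + height / 2,  # vertical center of the box
    'width': width,
    'height': height,
  }

# Hypothetical box: top-left corner at (10, 20), 100 px wide, 50 px tall.
print(coco_bbox_to_createml([10, 20, 100, 50]))
# {'x': 60.0, 'y': 45.0, 'width': 100, 'height': 50}
```

If the boxes look shifted up and to the left after importing into CreateML, the usual cause is that the top-left corner was passed through without this half-width and half-height offset.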
