Transform Object Detection dataset from COCO to CreateML format

COCO has set a de facto standard for object detection dataset formats. Is there a tool for converting a dataset from this format to the CreateML format so that the data can be used in CreateML for training and evaluation? I've found Roboflow, but I would prefer a Python script to this platform, as it seems too complex for my needs.

Replies

You can download the COCO dataset in CreateML JSON format from here: https://universe.roboflow.com/jacob-solawetz/microsoft-coco/dataset/2

There are also over 90,000 other user-shared datasets there you can download for free in the CreateML format: https://roboflow.com/universe

Or use the pre-trained models users have shared directly with the Swift SDK (the model it uses behind the scenes is newer and more accurate than the one used by CreateML): https://blog.roboflow.com/roboflow-ios-sdk

I stand by my question, as I already have a dataset that was annotated in COCO format :)

Here's a brief Python script that converts the COCO annotation format to CreateML. It maps category ids to their names, converts each box from COCO's top-left corner to CreateML's box center, and groups the boxes so that there is one entry per image.

import json
from collections import defaultdict

# Load the COCO annotation file.
with open('annotations.coco.json') as f:
  coco = json.load(f)

# Map COCO ids to file names and category names.
file_names = {image['id']: image['file_name'] for image in coco['images']}
labels = {category['id']: category['name'] for category in coco['categories']}

# Group annotations by image; CreateML expects one entry per image.
annotations_by_image = defaultdict(list)
for annotation in coco['annotations']:
  x, y, width, height = annotation['bbox']  # COCO: top-left corner + size
  annotations_by_image[annotation['image_id']].append({
    'label': labels[annotation['category_id']],
    # CreateML: box center + size
    'coordinates': {'x': x + width / 2, 'y': y + height / 2, 'width': width, 'height': height},
  })

# Images without annotations are kept with an empty list.
output_annotations = [
  {'image': file_name, 'annotations': annotations_by_image[image_id]}
  for image_id, file_name in file_names.items()
]

with open('annotations.createml.json', 'w') as f:
  json.dump(output_annotations, f)
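
To sanity-check the coordinate handling, here is a minimal sketch of the box conversion on its own. It assumes COCO boxes are stored as [x_min, y_min, width, height] and that CreateML expects the box center plus width and height, with each entry keyed by `image` and `annotations`. The helper name `coco_bbox_to_createml`, the file name, and the label are made up for illustration.

```python
import json


def coco_bbox_to_createml(bbox):
    """Convert a COCO [x_min, y_min, width, height] box to center-based coordinates."""
    x_min, y_min, width, height = bbox
    return {
        "x": x_min + width / 2,   # horizontal center of the box
        "y": y_min + height / 2,  # vertical center of the box
        "width": width,
        "height": height,
    }


# Toy values for illustration only (hypothetical file name and label).
entry = {
    "image": "example.jpg",
    "annotations": [
        {"label": "cat", "coordinates": coco_bbox_to_createml([10, 20, 30, 40])}
    ],
}

# Print one entry in the layout assumed above.
print(json.dumps([entry], indent=2))
```

A box with its top-left corner at (10, 20) and a size of 30 x 40 ends up centered at (25, 40). If your boxes appear shifted in the CreateML preview, this is the first thing to check.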