Crop Image by coordinates (Vision + Turi Create)

Question

Created Jun ’20


I need to crop a UIImage using the bounding box of a recognized VNRecognizedObjectObservation (the coordinates come from a Turi Create model).
I've tried many ways of cropping the image, but none of them works well.
Stack Overflow question with images

Code Block
func handleClassifications(request: VNRequest, error: Error?) {
    DispatchQueue.main.async { [weak self] in
        guard let self = self, let predictions = request.results as? [VNRecognizedObjectObservation] else { return }
        let objectBounds = VNImageRectForNormalizedRect(predictions.first!.boundingBox, Int(self.bufferSize.width), Int(self.bufferSize.height))
        let img = self.chosenImage!.cropped(rect: objectBounds)
        self.recognitionImageView.image = img
    }
}


Answer 1

Apple Staff OP

Apple

Jun ’20

Hi there,

Thank you for trying out Turi Create with Vision. Looking at "Deployment using Core ML + Vision" for Turi Create Object Detectors, could you try setting request.imageCropAndScaleOption = .scaleFill? Does anything change?

Happy to help further!

Thanks!


Answer 2

DanielVer OP

Jun ’20

Hi!
Unfortunately, I already set this in my request:

Code Block
lazy var classificationRequest: VNCoreMLRequest = {
    do {
        let model = try VNCoreMLModel(for: Building().model)
        let request = VNCoreMLRequest(model: model, completionHandler: { [weak self] request, error in
            self?.handleClassifications(request: request, error: error)
        })
        request.imageCropAndScaleOption = .scaleFill
        return request
    } catch {
        fatalError("Failed to load Vision ML model: \(error)")
    }
}()

Thank you!


Answer 3

Frameworks Engineer OP

Apple

Jun ’20

Hi there,

Have you inspected the bounding box coordinates to see if and how they correspond to your input image?

It's possible that you're getting back bounding box coordinates in the normalized Vision space (i.e., with coordinates ranging from 0.0 to 1.0 and the origin in the bottom-left corner) but using them to crop an image based on pixel dimensions.

I recommend using an image that you know has a single object located at some very obvious coordinates (for example, the top-left quadrant), and looking at the actual bounding box coordinates you get out of it to make sure they're what you expect.
For example, an object occupying the entire top-left quadrant should give you back a bounding box of roughly (x: 0.0, y: 0.5, width: 0.5, height: 0.5). If you are dealing with a 416x416 pixel image, you would then crop that top-left quadrant with a rectangle of (x: 0, y: 0, width: 208, height: 208).

Notice that the coordinate systems may also be flipped between the space in which you're performing your cropping and the Vision coordinate space. A lot of the image manipulation libraries use a coordinate space where the origin is in the top-left corner because that's typically what is used for laying out pixels in a user interface.
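
To make the conversion concrete, here is a minimal sketch of mapping a normalized, bottom-left-origin rectangle to a pixel rectangle with a top-left origin. The helper name pixelCropRect(forNormalized:imageWidth:imageHeight:) is hypothetical (it is not a Vision API), and the sketch assumes the image you crop has the same pixel dimensions and orientation as the one Vision analyzed. Note that VNImageRectForNormalizedRect only scales the rectangle to pixel units; it does not flip the y-axis.

```swift
import Foundation

/// Hypothetical helper (not part of Vision): converts a normalized rect
/// (0.0...1.0, origin bottom-left) into a pixel rect with a top-left origin.
func pixelCropRect(forNormalized box: CGRect, imageWidth: Int, imageHeight: Int) -> CGRect {
    let w = CGFloat(imageWidth)
    let h = CGFloat(imageHeight)

    // Scale the normalized size and x position to pixels.
    let x = box.origin.x * w
    let width = box.size.width * w
    let height = box.size.height * h

    // Flip the y-axis: the top edge of the box in a top-left-origin space
    // is the distance from the top of the image to the box's upper edge.
    let y = (1 - box.origin.y - box.size.height) * h

    return CGRect(x: x, y: y, width: width, height: height)
}

// The example from above: an object filling the top-left quadrant
// of a 416x416 image.
let quadrant = pixelCropRect(
    forNormalized: CGRect(x: 0.0, y: 0.5, width: 0.5, height: 0.5),
    imageWidth: 416,
    imageHeight: 416
)
// quadrant is (x: 0, y: 0, width: 208, height: 208)
print(quadrant)
```

If the crop still looks wrong after this conversion, it may be worth checking that the width and height you pass in are those of the image you actually crop, in pixels rather than points.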

Hope this helps!

Let us know if you have any more questions about this.

Thanks!
