Thank you for the interesting conversation and the details on possible solutions.
There seem to be two topics: box detection and "measure the area of planes or surfaces, like the length and width of objects, just like the Apple Measure app does". Box detection is more specific and complex, so we’ll skip that.
ARKit world tracking can automatically detect planes and report their dimensions. That’s all the API provides, and on its own it’s insufficient for your needs because it lacks scene understanding.
Re: scene understanding, a plane is a plane is a plane… is it the top of a box? Is it the “top” of some rectangular surface that we need to measure? We have no idea. All we know is that ARKit thought some compelling detected features comprised the surface of a plane.
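To make that concrete, here’s a minimal sketch of plane detection with world tracking (PlaneMeasurer is just an illustrative name, not an ARKit type). ARKit hands you ARPlaneAnchors with extents in meters, but says nothing about what the surface actually is:

```swift
import ARKit

final class PlaneMeasurer: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        let configuration = ARWorldTrackingConfiguration()
        configuration.planeDetection = [.horizontal, .vertical] // detect flat surfaces
        session.delegate = self
        session.run(configuration)
    }

    // Called as detected planes are added or refined over time.
    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let plane as ARPlaneAnchor in anchors {
            // extent is in meters; x and z span the plane's local surface.
            // (Newer SDKs also expose this as planeExtent.width / .height.)
            print("Plane \(plane.identifier): \(plane.extent.x) m x \(plane.extent.z) m")
        }
    }
}
```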
If you add ARKit’s sceneReconstruction, things get more interesting (although it’s not required):
“When you enable scene reconstruction, ARKit provides a polygonal mesh that estimates the shape of the physical environment. …
If you enable plane detection, ARKit applies that information to the mesh. Where the LiDAR scanner may produce a slightly uneven mesh on a real-world surface, ARKit smooths out the mesh where it detects a plane on that surface.”
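Enabling it is a one-line addition to the same configuration, guarded by a hardware check since scene reconstruction needs a LiDAR-capable device (a sketch):

```swift
import ARKit

func makeConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = [.horizontal, .vertical]

    // Scene reconstruction is only available on LiDAR devices.
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.mesh) {
        // ARKit will now deliver ARMeshAnchors describing the environment,
        // and smooths the mesh where it also detects a plane.
        configuration.sceneReconstruction = .mesh
    }
    return configuration
}
```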
Better yet, you can classify (identify) common types of detected planes with ARPlaneAnchor’s classification property.
Assuming you’re not measuring walls, floors, ceilings, tables, seats, doors, or windows, anything with ARPlaneAnchor.Classification.none(_:) is a plane of interest.
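A sketch of that filter (isPlaneOfInterest is just an illustrative helper name, not an ARKit API):

```swift
import ARKit

func isPlaneOfInterest(_ plane: ARPlaneAnchor) -> Bool {
    // Classification is only available on supported devices; without it
    // we can't filter, so treat every plane as a candidate.
    guard ARPlaneAnchor.isClassificationSupported else { return true }

    // .none(_:) means ARKit didn't match the plane to a known category
    // such as .wall, .floor, .ceiling, .table, .seat, .window, or .door.
    if case .none = plane.classification {
        return true
    }
    return false
}
```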
So that’s what’s possible with the ARKit API.
If you need something more rigorous or specific, you’ll want to consider techniques from the CoreML and Vision frameworks, which may require training a machine learning model for bespoke object detection and tracking.
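As one example of what Vision offers without any custom training, its built-in rectangle detector can find box-like regions in an ARFrame’s camera image. This is only a sketch; relating the 2D observations back to ARKit’s 3D planes (e.g. via raycasting) is left out, and the orientation value is an assumption for a portrait UI:

```swift
import ARKit
import Vision

func detectRectangles(in frame: ARFrame) {
    let request = VNDetectRectanglesRequest { request, _ in
        for rectangle in request.results as? [VNRectangleObservation] ?? [] {
            // Coordinates are normalized (0...1) with the origin at the bottom-left.
            print("Rectangle at \(rectangle.boundingBox), confidence \(rectangle.confidence)")
        }
    }
    request.maximumObservations = 5
    request.minimumConfidence = 0.8

    // .right assumes a portrait interface; adjust for your UI orientation.
    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, orientation: .right)
    try? handler.perform([request])
}
```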