RecognizeDocumentsRequest for receipts

Hi,

I'm trying to use the new RecognizeDocumentsRequest from the Vision framework to read a receipt. It looks very promising, since it can read paragraphs and lines and detect data. Unfortunately, so far it seems to treat every line on the receipt as a separate paragraph, and when a line contains a wide gap it splits that line into two paragraphs.

Is there perhaps an Apple engineer who knows whether this is expected behaviour, or should I file a Feedback for this?

Code setup:

import Vision

// `image` is the receipt image, loaded elsewhere.
let request = RecognizeDocumentsRequest()
let observations = try await request.perform(on: image)

guard let document = observations.first?.document else {
    return
}

for paragraph in document.paragraphs {
    // Text of the paragraph as Vision grouped it.
    print(paragraph.transcript)

    // Structured data (addresses, amounts, ...) found inside the paragraph.
    for data in paragraph.detectedData {
        switch data.match.details {
        case .phoneNumber(let phoneNumber):
            print("Phone: \(phoneNumber)")
        case .postalAddress(let address):
            print("Postal: \(address)")
        case .calendarEvent(let event):
            print("Calendar: \(event)")
        case .moneyAmount(let amount):
            print("Money: \(amount)")
        case .measurement(let measurement):
            print("Measurement: \(measurement)")
        default:
            continue
        }
    }
}

See the attached image for an example of a receipt I'd like to parse. The top three lines are the name, the street, and the postal code + city. These all come back as separate paragraphs. Checking detectedData, the street (2nd line) is detected as a postalAddress, but not the complete address. Might that be a location thing, since it's a Dutch address?

Lower on the receipt, it also sees the block with "Pomp 1 95 Ongelood" and the lines below it as separate paragraphs, first picking up the left side and then the right side. So the output is something like this:

*
Pomp 1
Volume
Prijs
€
TOTAAL
*
BTW
Netto
21.00 %
95 Ongelood
41,90 l
1.949/ 1
81.66
€
14.17
67.49

Hi, it seems like this is expected behavior. From the demo, I assume it isn't meant for this type of "document". Imagine a normal document with two columns of text: you wouldn't want to read that line by line across both columns, the way you want to with receipts.

Unfortunately I'm trying to do the same thing, so I have to stay with the simple OCR solution from previous iOS versions and try to connect the results into lines with my own (not very good) algorithm.
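
For what it's worth, here is a minimal sketch of that kind of line joining. It is not anything from Vision itself: TextFragment and groupIntoRows are hypothetical names, and I'm assuming you copy each recognized text and its normalized bounding box (origin at the bottom-left) into that struct first. Fragments whose vertical centers are close together are treated as one receipt row and then ordered left to right.

```swift
// Hypothetical stand-in for one recognized piece of text. In a real app you
// would fill this from the OCR result's string and bounding box.
struct TextFragment {
    let text: String
    let minX: Double    // left edge, normalized 0...1
    let midY: Double    // vertical center, normalized 0...1 (assumed bottom-left origin)
    let height: Double  // box height, normalized 0...1
}

/// Groups fragments into rows. A fragment joins the current row when its
/// vertical center is within `tolerance` times the box height of the row's
/// first fragment. The default tolerance is a guess and needs tuning.
func groupIntoRows(_ fragments: [TextFragment], tolerance: Double = 0.5) -> [String] {
    // Top to bottom: with a bottom-left origin, larger y means higher up.
    let sorted = fragments.sorted { $0.midY > $1.midY }

    var rows: [[TextFragment]] = []
    for fragment in sorted {
        if let anchor = rows.last?.first,
           abs(anchor.midY - fragment.midY) <= tolerance * max(anchor.height, fragment.height) {
            rows[rows.count - 1].append(fragment)
        } else {
            rows.append([fragment])
        }
    }

    // Within each row, read left to right and join with a space.
    return rows.map { row in
        row.sorted { $0.minX < $1.minX }
            .map { $0.text }
            .joined(separator: " ")
    }
}

// Made-up coordinates for part of the receipt block from the question.
let fragments = [
    TextFragment(text: "Pomp 1", minX: 0.05, midY: 0.600, height: 0.03),
    TextFragment(text: "Volume", minX: 0.05, midY: 0.550, height: 0.03),
    TextFragment(text: "95 Ongelood", minX: 0.60, midY: 0.601, height: 0.03),
    TextFragment(text: "41,90 l", minX: 0.60, midY: 0.549, height: 0.03),
]
let rows = groupIntoRows(fragments)
// rows == ["Pomp 1 95 Ongelood", "Volume 41,90 l"]
```

This only works while the receipt is photographed reasonably straight. A skewed or curled receipt would need the boxes deskewed first, or a tolerance that follows the slope of the text.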

Maybe putting the raw OCR result into an Apple Intelligence ML model would be worth a try?
