RecognizeDocumentsRequest for receipts

Question

Created Jun ’25

Hi,

I'm trying to use the new RecognizeDocumentsRequest from the Vision framework to read a receipt. It looks very promising, since it can read paragraphs and lines and detect data. Unfortunately, so far it seems to treat every line on the receipt as a separate paragraph, and when a line contains a wide gap it splits that line into two paragraphs.

Is there perhaps an Apple engineer who knows whether this is expected behaviour, or should I file a Feedback for this?

Code setup:

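import Vision

// `image` is the receipt image being analysed (defined elsewhere).
let request = RecognizeDocumentsRequest()
let observations = try await request.perform(on: image)

// Only the first recognized document is inspected.
guard let document = observations.first?.document else {
    return
}

for paragraph in document.paragraphs {
    // Each receipt line currently comes back as its own paragraph.
    print(paragraph.transcript)

    // Print the structured data detected inside this paragraph.
    for data in paragraph.detectedData {
        switch data.match.details {
        case .phoneNumber(let phoneNumber):
            print("Phone: \(phoneNumber)")
        case .postalAddress(let address):
            print("Postal: \(address)")
        case .calendarEvent(let event):
            print("Calendar: \(event)")
        case .moneyAmount(let amount):
            print("Money: \(amount)")
        case .measurement(let measurement):
            print("Measurement: \(measurement)")
        default:
            continue
        }
    }
}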

See the attached image for an example of a receipt I'd like to parse. The top three lines are the name, the street, and the postal code + city. These all come back as separate paragraphs. Checking detectedData, the street (2nd line) is detected as a postal address, but the complete address is not. Might that be a locale issue, since it's a Dutch address?

Lower on the receipt, the block starting with "Pomp 1 95 Ongelood" and the lines below it are also returned as separate paragraphs: first everything on the left side, then everything on the right side. So the output is something like this:

*
Pomp 1
Volume
Prijs
€
TOTAAL
*
BTW
Netto
21.00 %
95 Ongelood
41,90 l
1.949/ 1
81.66
€
14.17
67.49

Answer 1

MariuszBaranski OP

Sep ’25

Hi, it seems like this is expected behavior. From the demo I assume it's not meant for this type of "document". Imagine a normal document with two columns of text: you wouldn't want to read it line by line across both columns, the way you want to with receipts.

Unfortunately I'm trying to do the same thing, so I have to stay with the simpler OCR solution from previous iOS versions and try to join its results into lines with my own (not very good) algorithm.
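For anyone attempting the same, here is a minimal sketch of one way to join OCR fragments into receipt lines by vertical position. It is not the algorithm referred to above. `TextFragment`, `groupIntoLines`, and the `tolerance` parameter are hypothetical names, and the coordinates in the usage example are made up for illustration. The sketch assumes y grows downward; Vision's normalized coordinates, as far as I know, have their origin in the lower-left corner, so y would need flipping first.

```swift
/// A recognized text fragment with its bounding box (hypothetical type;
/// in practice the values would come from the OCR observations).
struct TextFragment {
    let text: String
    let minX: Double
    let minY: Double
    let height: Double
    var midY: Double { minY + height / 2 }
}

/// Groups fragments into rows. A fragment joins the current row when its
/// vertical centre lies within `tolerance` times the box height of the
/// centre of the row's first fragment; otherwise it starts a new row.
func groupIntoLines(_ fragments: [TextFragment], tolerance: Double = 0.5) -> [String] {
    // Sort top to bottom (assumes y grows downward).
    let sorted = fragments.sorted { $0.midY < $1.midY }
    var rows: [[TextFragment]] = []
    for fragment in sorted {
        if let last = rows.last, let anchor = last.first,
           abs(fragment.midY - anchor.midY) <= tolerance * max(anchor.height, fragment.height) {
            rows[rows.count - 1].append(fragment)
        } else {
            rows.append([fragment])
        }
    }
    // Within each row, order fragments left to right and join them.
    return rows.map { row in
        row.sorted { $0.minX < $1.minX }.map(\.text).joined(separator: " ")
    }
}

// Usage with made-up coordinates for two rows of the receipt.
let fragments = [
    TextFragment(text: "Volume", minX: 0.05, minY: 0.40, height: 0.03),
    TextFragment(text: "41,90 l", minX: 0.60, minY: 0.41, height: 0.03),
    TextFragment(text: "Pomp 1", minX: 0.05, minY: 0.35, height: 0.03),
    TextFragment(text: "95 Ongelood", minX: 0.60, minY: 0.35, height: 0.03),
]
for line in groupIntoLines(fragments) {
    print(line)
}
```

Comparing against the first fragment of the row keeps the sketch simple, but slightly skewed photos may need a running average or a deskew step first.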

Answer 2

MariuszBaranski OP

Sep ’25

Maybe putting the raw OCR result into an Apple Intelligence ML model is worth a try?
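A rough sketch of what that could look like, assuming the on-device model is reached through the Foundation Models framework (`LanguageModelSession` and `respond(to:)`). The helper names `makeReceiptPrompt` and `structureReceipt` and the prompt wording are hypothetical, and nothing here has been shown to produce good results on receipts.

```swift
/// Builds a plain-text prompt from raw OCR lines (hypothetical helper).
func makeReceiptPrompt(ocrLines: [String]) -> String {
    let body = ocrLines.joined(separator: "\n")
    return """
    The following lines are raw OCR output from a fuel receipt. \
    Pair each label with its value and list them as "label: value".

    \(body)
    """
}

#if canImport(FoundationModels)
import FoundationModels

/// Sends the OCR text to the on-device model and returns its reply.
/// Assumes Apple Intelligence is available and enabled on the device.
@available(iOS 26.0, macOS 26.0, *)
func structureReceipt(ocrLines: [String]) async throws -> String {
    let session = LanguageModelSession()
    let response = try await session.respond(to: makeReceiptPrompt(ocrLines: ocrLines))
    return response.content
}
#endif
```

Only text reaches the model this way, so it has to infer the label/value pairing from the order of the OCR lines alone.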

Answer 3

Thuri88 OP

Nov ’25

I’m not sure Apple Intelligence can help here if it can't take in the image itself. I'll give it a try though; let’s see what it comes up with.

Thanks for the replies and the suggestion.
