Why does referencing a class inside a loop cause a memory crash?

I'm looping through all pages in a `PDFDocument` (300+ pages), but the app crashes with:



> Message from debugger: Terminated due to memory issue



The PDF is approximately 4 MB in size, yet each iteration of the loop increases memory use by about 30 MB. That memory is never reclaimed, and before the loop ends I get the crash.



    @objc func scrapePDF() {
        let documentURL = self.documentDisplayWebView!.url!
        let document = PDFDocument(url: documentURL)
        let numberOfPages = document!.pageCount

        DispatchQueue.global().async {
            // PDFDocument.page(at:) is zero-based, so iterate 0..<pageCount
            for pageNumber in 0..<numberOfPages {
                print(document?.page(at: pageNumber)!.string!)
            }
        }
    }





I found that if, instead of referencing the outer `PDFDocument` inside the loop, I create a new instance on each iteration, this strangely solves the memory issue. I don't quite understand why, though. `PDFDocument` is a class, not a struct, so it is passed by reference, meaning it is created only once and then referenced inside my loop. So why would it cause a memory issue?
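To make the reasoning above concrete, here is a minimal sketch in plain Swift. `CachingDocument` is a hypothetical stand-in, not PDFKit's actual implementation; it only assumes that a document object might keep per-page data alive internally. Under that assumption, a single shared instance can keep growing even though it is created once, while a fresh instance per iteration gives its data back when it goes out of scope.

```swift
// Hypothetical stand-in for a document type that caches per-page data.
// This is NOT how PDFKit is known to work, only an illustration.
final class CachingDocument {
    private(set) var cache: [Int: [UInt8]] = [:]
    let pageCount: Int

    init(pageCount: Int) { self.pageCount = pageCount }

    func pageData(at index: Int) -> [UInt8] {
        if let cached = cache[index] { return cached }
        // Pretend this buffer is the expensive decoded page content.
        let data = [UInt8](repeating: 0, count: 1_024)
        cache[index] = data
        return data
    }
}

// One shared instance: created once, but it retains every page it has served.
let shared = CachingDocument(pageCount: 300)
for index in 0..<shared.pageCount {
    _ = shared.pageData(at: index)
}
let sharedCachedPages = shared.cache.count

// A fresh instance per iteration: each one is released at the end of the
// loop body, taking its one-page cache with it.
var peakCachedPages = 0
for index in 0..<300 {
    let fresh = CachingDocument(pageCount: 300)
    _ = fresh.pageData(at: index)
    peakCachedPages = max(peakCachedPages, fresh.cache.count)
}
```

If something like this is happening, reference semantics are not the cause: the single instance is simply holding on to more and more data for as long as it stays alive.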




    @objc func scrapePDF() {
        let documentURL = self.documentDisplayWebView!.url!
        let document = PDFDocument(url: documentURL)
        let numberOfPages = document!.pageCount

        DispatchQueue.global().async {
            // PDFDocument.page(at:) is zero-based, so iterate 0..<pageCount
            for pageNumber in 0..<numberOfPages {
                // Reopen the document on every iteration
                let doc = PDFDocument(url: documentURL)
                print(doc?.page(at: pageNumber)!.string!)
            }
        }
    }



Though the above code clears the memory issue, the problem is that it's too slow. Each iteration takes 0.5 seconds, and with 300+ pages I can't accept that. Any tips on speeding it up? Or on why the memory isn't given back when referencing the `PDFDocument` from outside the loop?



I have tried wrapping the loop body in `autoreleasepool {}`, with no effect.
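One possible middle ground between the two versions above, offered as an untested sketch rather than a confirmed fix: reopen the document once per batch of pages instead of once per page, and drain an autorelease pool after each batch. It relies on the observation above that releasing the `PDFDocument` gives the memory back. `extractText(from:batchSize:)` and `batchSize` are hypothetical names introduced here, not part of PDFKit.

```swift
import Foundation
import PDFKit

/// Extracts text page by page, reopening the document every `batchSize` pages.
/// `batchSize` is a hypothetical tuning knob, not a PDFKit parameter.
func extractText(from url: URL, batchSize: Int = 25) -> [String] {
    guard batchSize > 0, let pageCount = PDFDocument(url: url)?.pageCount else {
        return []
    }
    var pages: [String] = []
    pages.reserveCapacity(pageCount)

    for batchStart in stride(from: 0, to: pageCount, by: batchSize) {
        // The pool drains when the closure returns, and `document` is
        // released with it, so whatever it held for this batch can be freed.
        autoreleasepool {
            guard let document = PDFDocument(url: url) else { return }
            let batchEnd = min(batchStart + batchSize, pageCount)
            // page(at:) is zero-based.
            for index in batchStart..<batchEnd {
                pages.append(document.page(at: index)?.string ?? "")
            }
        }
    }
    return pages
}
```

A larger `batchSize` means fewer reopen calls (faster) but a higher memory peak per batch; the right value would have to be measured against the actual file.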

A sample project (in a public folder on github, dropbox, google docs, etc) would be more helpful than a video, if you'd like to see if someone else can reproduce the behavior. Note that you can include external URLs in a post here if you just drop the initial "https", so that it's not recognized as a link.

I have run into the same problem. It depends on the PDF file: some files work well, but certain others run out of memory. With those files, the "search" function has the same issue, and searching is also very slow. I think it's a bug in PDFKit.

I had a very similar issue searching PDFs, and my current assumption is that by searching a PDF it creates an index, which is kept in memory. This is backed by the fact that the next searches are blazingly fast (as long as the document is not big enough to produce a crash).


Could you create a small Github project reproducing the issue?
