Reduced CPU usage w/ Time Profiler?

Short Version:


My main question is this: what is different about building to profile with Instruments, versus a regular build, that would drop my app's CPU load from over 200% to almost nothing?


When I build and run, Activity Monitor reports well over 200% CPU usage. With everything else the same, building for profiling with the Time Profiler drops the CPU load to under 5%, which is a dramatic (orders-of-magnitude) difference.


Long Version:


As an exercise to learn Cocoa, Swift, and DSP (yes, all three at once), I am writing a simple radio scanner application for OS X using one of the cheap RTL-SDR dongles.


I have written a simple Swift wrapper around librtlsdr, a simple UI for setting the frequency, and a couple of simple DSP routines. My librtlsdr wrapper uses an NSOperationQueue, and my DSP routines use GCD queues, to keep the I/O- and CPU-intensive work off the main thread/queue.
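Roughly, the structure looks like this (a minimal sketch; the Radio class and the readSamples/process names are stand-ins, not my actual code):

```swift
import Foundation

final class Radio {
    // Serial GCD queue for the CPU-heavy DSP work.
    private let dspQueue = DispatchQueue(label: "radio.dsp")
    // Operation queue that keeps the blocking librtlsdr reads off the main thread.
    private let ioQueue = OperationQueue()
    private var isRunning = true

    func start() {
        ioQueue.addOperation { [weak self] in
            guard let self = self else { return }
            while self.isRunning {
                let block = self.readSamples()   // blocking read via the librtlsdr wrapper
                self.dspQueue.async {
                    self.process(block)          // filter + demodulate, off the main queue
                }
            }
        }
    }

    private func readSamples() -> [UInt8] { [] } // stands in for the real wrapper call
    private func process(_ samples: [UInt8]) { } // stands in for the DSP chain
}
```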


Currently, everything is working to the extent that I can successfully demodulate an AM transmission.


I have implemented a simple low-pass FIR filter, and while working on the algorithm I was surprised to find that I couldn't use much more than about 30 coefficients before the filter routine started taking too long and the audio became choppy. Activity Monitor also shows up to 300% CPU usage for my app, which seems crazy high considering the filter is nothing but a nested loop doing multiply-accumulate operations. Anything beyond about 40 coefficients and the UI becomes unresponsive.


For the DSP-minded: it's a decimating filter where the entire sample set (960,000 sps) feeds the filter, but I only compute the output samples needed for the rate reduction (48,000 sps), using precomputed coefficients from a rectangular-windowed sinc function. Not the most efficient algorithm, but on my quad-core i7 MacBook Pro and iMac it should still scream.
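In code, the routine is essentially the following (a simplified sketch, not the exact implementation; decimate, input, and coeffs are illustrative names):

```swift
// Compute one output sample per `factor` input samples
// (960,000 / 48,000 = 20), each via a full multiply-accumulate
// pass over the precomputed windowed-sinc taps.
func decimate(_ input: [Float], coeffs: [Float], factor: Int) -> [Float] {
    var output: [Float] = []
    output.reserveCapacity(input.count / factor)
    var n = coeffs.count - 1                // first index with a full tap history
    while n < input.count {
        var acc: Float = 0
        for k in 0..<coeffs.count {         // the inner multiply-accumulate loop
            acc += coeffs[k] * input[n - k]
        }
        output.append(acc)
        n += factor                         // skip the samples the decimation discards
    }
    return output
}
```

(Accelerate's vDSP_desamp does this sort of filtered decimation in vectorized form, but the hand-rolled version is the point of the exercise.)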


To get some insight into where my program was using up all the CPU cycles, I decided to give Instruments a go. Product->Profile, choosing the Time Profiler, and running my app gave me some interesting information.


1) My filter routine was NOT using the most CPU cycles.
2) Activity Monitor showed that my app wasn't even at 5% CPU usage.


So I decided to find out how far I could push things before the CPU showed any stress, and I was up to a 50,000-tap filter before the audio became noticeably choppy and CPU usage approached 300%. So, to recap: with a normal build and run, I max out at about 35-40 filter taps; with a profile build and run, I max out at about 50,000.


Also worth noting: while profiling with 50,000 filter taps, the UI still responds instantly (I can change frequency and start/stop the radio), even though the audio is choppy. During a normal run with only about 50 taps, the UI starts to freeze as soon as I start the radio, and there is no audio at all.


Again: why the dramatic difference in CPU usage between running while profiling and running a standard build? What's different, aside from the elevated privileges for Instruments, and what do I need to do to make that the normal behavior for my app?


Thanks


JE

Accepted Reply

By default, the profile action uses a release configuration, while the run action uses debug.


If you change your scheme so the run action uses a release configuration (from the Product menu, select Scheme, then Edit Scheme), does it match the performance you see while profiling?


Note that using the release configuration can make debugging more difficult, so you normally want it on debug while you're working on your app.
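For reference, assuming the standard Xcode template defaults, the headline difference between the two configurations is the Swift optimization level; check your target's build settings to confirm:

```
Debug:   SWIFT_OPTIMIZATION_LEVEL = -Onone   (no optimization)
Release: SWIFT_OPTIMIZATION_LEVEL = -O       (full optimization)
```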

Replies


I must have overlooked the fact that the release configuration is used during profiling; that was indeed the difference.


I understand the difference between a debug build and a release build, but it raises the question: with the debug configuration, what would cause a simple nested loop doing nothing but multiply-accumulate operations to show that much of a difference in CPU usage?
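My first guess is the Swift optimizer: at -Onone, as I understand it, every Array subscript stays a non-inlined, bounds-checked access and the loop is never unrolled or vectorized, so a routine like this (the same inner loop as my filter, pulled out for illustration) pays that overhead on every iteration:

```swift
func macPass(_ input: [Float], _ coeffs: [Float], at n: Int) -> Float {
    var acc: Float = 0
    for k in 0..<coeffs.count {
        // At -Onone every subscript here remains a bounds-checked, non-inlined
        // access; at -O the checks are hoisted away and this compiles down to
        // a tight multiply-accumulate, which would explain the enormous gap.
        acc += coeffs[k] * input[n - k]
    }
    return acc
}
```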


Off to go research what's happening.


Thanks