I'm Chad Woolf.
KRIS MARKEL: I'm Kris Markel.
We are performance tools engineers for Apple.
This is session 412.
We'll talking about profiling in depth.
This will talk about the time profiler in Instruments and how
to use it to optimize your applications.
Time profiler is place to go when you're trying to figure
out where your application is spending the bulk of its time.
It is useful when trying
to discover what the application is doing at runtime.
You want to see who is calling what.
Our session breaks down a bit like this.
We'll talk a bit about the motivation on why we wanted
to create a session devoted to the time profiler only.
But the session will revolve around demonstrations
and showing you details on how it works
and how your applications work below the source level.
Then we'll give you final tips on how
to use the time profiler on your own.
Quickly about our motivation, it comes from the creation
of Instruments 7 itself.
With Instrument 7 we wanted to go with a new look and feel
which meant new artwork but it also meant
that for the new feel, we wanted it to be more responsive
and we wanted to feel smoother.
We'll do performance optimizations
for our UI in general.
We also wanted to try new graphing styles.
These graphing styles were things that we wanted to do
in the past but didn't have the performance
to in our existing graphic code.
We knew that we would have to focus on the rendering
which is a pretty difficult piece of application.
Instruments have to deal with hundreds of thousands,
sometimes millions of data points and has to try to crunch
that down into a presentation that's both simple
and easy to understand.
There is definitely algorithm complexities in there.
What that meant for us, we had to rewrite a significant portion
of our application which is the track view which lives
up at the top of the app.
Chris and I over winter took our design for the new track view
and we started building it up out of a series of prototypes.
We didn't do the prototyping in Instruments itself,
we broke it out in a separate application to keep it simple.
This is what one of the last prototypes looked like.
Now while we were prototyping, a thing we did,
we set a performance budget.
As we layered on feature
after feature we were constantly evaluating our performance
of our code against the budget.
When we found we were exceeding the budget we would use the time
profiler to figure out where
in our application things were going wrong.
Sometimes it was an easy fix but sometimes it wasn't.
Since we were prototyping,
even some of the bigger structural changes we had
to make to get the performance back
on track was still fairly easy.
When we integrated it back
into Instruments we use the time profiler again
to find the hot spots in our integration points,
and after a few iterations, we had a version of Instruments 7
which was meeting our performance goals.
We were thrilled with how the time profiler got us
through this winter, that when WWDC 2015 came,
we wanted to create a session that talks about time profiler
and the problems that it's really good at solving.
We wanted to share our experience
when writing the track view.
What we did this year,
we created a demonstration application on iOS
that looks similar to the first track view prototypes
and we set performance targets for ourselves,
100,000 data points is what we wanted to graph,
we wanted a perfect 60 frame per second for panning and zooming.
We wanted to make it work on a iPad mini 1st generation.
We picked the iPad mini 1st gen --
you know what I'm talking about --
we knew it would work well on all
of the other platforms especially the later platforms.
Chris will show you the application,
he'll time profile it, and he's going to show you the things
that we found when putting this together for you.
KRIS MARKEL: Thank you, Chad.
Here I have the prototype application of an Xcode,
and I want to call out a few things.
Our initial implementation we discovered can't handle even
close to 100,000 data points.
Initially we're working with 10,000 data points to get going.
Another important point, you should do your time profiling
on release builds, you want to take advantage
of the compiler optimizations while you're profiling.
I'll go ahead, start profiling the application and to do
that I'll, from the product menu, choose profile.
This will build the application, install it on the iPad and bring
up the Instrument template user.
Here it is, time profiler is selected for me.
I'll click choose.
Here you will see Instruments in our new track view
and we'll get back to that in a minute.
Right now I'll go ahead and start recording,
click the record button, we'll show you the app.
I want to emphasize what you see here, this is not the simulator,
this is QuickTime mirroring that which is already on the app.
Here's our graph, I'll go ahead and scroll, scrolling isn't bad,
it is not bad, I'll zoom out by pinching, it is okay at fist
and then starts to stutter, stutter,
stutter, that's pretty bad.
Now, finally, I'm going to just scroll back and forth
with my finger, I'm actually moving my finger
but the display is not updating, it is very, very laggy.
That's very poor performance.
Let's look at what's going on.
Let's go back to Xcode -- Instruments.
We'll stop the profiler
and quickly show you the new track view.
Here we have the CPU usage.
What this is, this is the average CPU usage
for specific unit of time.
That time depends on your current zoom level.
You see the different parts as I use my app
where it was spending time.
This is scrolling, the zooming out, this is the scrolling back
and forth while zoomed out.
A nice thing about the new track view is that I can go ahead
and use this pinch gesture to zoom in on a piece of data
that I'm interested in.
If you're not using a Trackpad, you can hold down the option key
and scroll up and down to zoom in and out.
I want to focus on this particular piece of data,
the activity right here.
I was scrolling here.
To do that, I'll apply a filter, that's just a click and drag
and it selecting just those specific samples so I can focus
on that particular data.
I'll create more space down here.
Down here this is our detailed view.
What this is showing us is how many --
the percentage of time profiler samples we have inside
of a particular function or method
and then we have the symbol name.
Here is our percentage and here is our symbols.
Now the first thing you usually do
when you time profile is you start to expand these out
and looking for kind of a -- comparing the numbers over here
with specific methods over here, function.
Looking for things that kind of stand out as, you know,
jump out at you as something that's worthy of investigating.
Another option, if we go over to the inspector pane and click
on the extended detail, we see the heaviest stack trace.
This is for the main thread.
This is where focusing here is where I'll get the most bang
for my buck when doing a time profiling trying
to make performance improvements.
Let's see what's going on.
The main is calling application main, run loop,
there is core animation work happening, some --
there is nothing standing out as something
that seems out of the ordinary.
In fact, this is a really common occurrence when profiling.
You look at what the application is doing the most,
well it doesn't look like it is doing anything special.
There is nothing -- no call
to compute the 40th Fibonacci number here
in here or something.
However, you know, looking at this call stack,
this stack trace, there is something I know what my
It is a simple prototype app, what it does,
it builds a path and it draws a path.
I can actually see in here
that there is a call, a CG context path.
It is not called by my code according to the stack trace.
It is here, it is taking up a fair amount of time.
I'll go ahead, click on that, take a look at that.
If we look back at our call tree I do see something
What we have, we have the draw path according
to this call tree is called
by this draw layer method on UI view.
That's also calling our drawRect for the graph view.
That's taking a fair amount of time.
That's one of the things that the app does.
If I look at the time over here, context draw path is taking up,
you know, 55% of the samples but the drawRect, it in very few.
This is interesting, if I double-click
on the drawRect method it will take me to the source code.
I see, if you look to the bottom,
I actually do call draw path from the drawRect method
but it is not showing up in any sample.
Everything in here is an add path.
This is unusual especially
since my expectation is my own drawRect method would take a
while to run.
It is basically half of what my app does.
Looking at this, I actually notice
that while drawRect returns a void
and context draw path is the last method called
and this is probably a case
of what's called Tail Call Elimination.
To tell us more about Tail Call Elimination and how
to verify that, back to Chad.
CHAD WOOLF: Okay.
So to explain the situation that Kris is seeing we have
to understand how the time profiler sees what's calling
what inside of your application.
This is going to get technical,
I'll walk through it step by step.
On the left, we have the code for drawRect and on the right,
we have what you would imagine the stack to look like for
that thread, right before the UIKit calls into the drawRect.
When that call is made
to the drawRect it will do what most functions and methods do,
to establish their own call frame.
That starts off by first pushing the return address
from the link register and the previous value
of the frame Pointer on the stack.
Now drawRect knows how to return to its caller
and restore the frame Pointer.
Now the next thing that happens,
we take the frame Pointer set up to the new base.
Then drawRect will make room for its local variables
and the compiler scratch space and that's --
now we have a frame for drawRect.
Now the code starts to run and we come down to draw path
and draw path does the same thing.
It pushes the own frame on to the stack.
The way time profiler works, it uses a service in the kernel
that will sample what the CPUs are doing
at about 1000x per second.
In this case, if we take a sample,
we see that we're running inside of context draw path.
Then the kernal looks at the frame Pointer register to see
where the base of that function's frame is
and find the return address of who called it.
Now we can see that drawRect called into draw path.
If we want to see who called
into drawRect we can use the frame Pointers that were pushed
on the stack to find the base of drawRect
and then continuously go back through the stack
until we hit the bottom.
This is a backtrace.
If we take enough of these and put them
in the call tree view you can get a pretty good picture
of what's going on inside of your application.
I want to point out, the frame Pointers
on the stack are absolutely required.
If you're compiling code with fomit-frame-pointer turn
that off to try to do the time type of profiling
that we're doing here.
Let's look at the optimize cases.
This is where drawRect was compiled
with compiler optimizations enabled.
Same situation, we have a drawRect frame.
We're about to call into draw path.
You notice when draw path returns drawRect is finished.
Doesn't need to do anything.
It is going to return.
The way it returns, it is it pops the stack frame,
restores the previous value of the frame Pointer
and jumps back to the caller.
The compiler looks at this and says well,
why does draw path need anything from drawRect's stack frame.
Furthermore, why come in to drawRect --
back to drawRect at all when it is going
to return to its caller?
What it does, it rearranges the code like this.
It is going to pop the stack frame, restore the frame Pointer
and make a direct call back into draw path meaning
that we don't need to jump back to the caller.
That's harder to explain than it is to see.
Let's imagine what it would look like when running this code.
We'll pop the stack frame off to get rid of the local variables.
We'll restore the frame Pointer to the original value
and the value of the link register.
Then we'll jump to the beginning of the code for draw path.
Draw path is going to push its own frame back
on to the stack using the value that it found
in the link register and the frame Pointer.
From draw path's perspective it was called directly
from draw layer in context from UIKit.
If we take a time sample at this point,
we'll see the exact same story.
Even though that's not the actual call sequence
that happened, this is what the time profiler saw.
That's what we ended up seeing in our call trees.
This is called Tail Call Elimination, it is common
in highly optimized code and has some benefits.
It saves stack memory.
In the process of saving stack memory, it keeps the caches hot,
that reuses the memory, the caches and data.
It has a profound effect on recursive code,
especially tail call recursive code, where a function
or method calls itself as the last thing and then returns.
Without pushing those frames a Tail Call Elimination inside
of a recursive function can make the performance as good
as its iterative version, so you don't get the stack growth
and you get the high performance as well.
This optimization leave on with highly recursive code.
If you want to turn it off for the sake of profiling
to show a cleaner stack trace you can turn it off by going
in the build settings of the project
and setting the compiler flag,
the CFLAGS to FNO-optimize-sibling-calls,
turning off the optimization,
and unfortunately the performance gains with it,
but you'll get a better result in the time profiler.
If you choose to live with it, and you want to identify
if a tail call is happening, then what you can do,
you can look at the disassembly and call sight of the last call.
If it is a normal call, it is going to use a branch
and link family of instructions, that's the first example there.
What that does, it jumps to the new function
and saves the return value in the link register.
If it is a tail call case and we're going to jump directly
into the new function, it will be a direct branch instruction.
Without the BL.
That could be a call instruction for a branch and link
and the branch would be a jump instruction,
if you see that, it looks similar.
Now it is up to Kris if he wants to disable the optimization
and recompile, or he can carry on, it is your choice.
KRIS MARKEL: I'll look at the disassembly.
In Instruments, upper right-hand corner of the detailed view,
there's a button, view disassembly, if I click that,
I see the disassembly for the method
and we can confirm the call to context add path is a branch
and link, the call to context draw path,
it is just a simple branch.
I'm confident that this is a case of Tail Call Elimination
that 55% that I saw on the call tree that was not attributed
to my drawRect should be attributed to the drawRect.
That is good news.
I know now my drawRect is on my heaviest deck frame,
my heaviest stack trace
and it is consuming 55 to 60% of my time.
This is great.
I know where to optimize.
I optimize drawRect, I'm good to go.
Let's look at this drawRect.
Looking that the drawRect, if I had a table I would flip it.
There is not much to optimize here.
It is hard to think of a simpler drawRect
that actually is functional.
We have 4 function calls, context, you know, CG calls,
this drawRect really does not do much.
It turns out that this is actually a really common
occurrence when doing profiling.
You will take a look at your hot spots and code,
there is not much you can do in your code directly
to improve your performance.
You know, this junction, what do you do?
You know, other than flipping tables, crying into your pillow
at night, what we did was we went through, started to look
at the core graphics documentation and other drawing,
you know, Cocoa drawing documentation.
We came across this particular property here.
Lo and behold, there was a make my code faster button
that was created by an Apple engineer.
This is excellent.
Above that, you see I copied
out of the documentation, pasted it in there.
It says a couple of things that are interesting.
It says, first of all, it may improve performance, it may not.
You should always measure.
You know, okay, dad.
Let's see if this improves in performance.
This time to start the Instruments, I'm just going
to do command-I for Instruments.
It will do the same thing.
It will build the application and install on the device,
bring up the template chooser.
It takes a moment.
Here we go.
Another shortcut I like to use, if you take a look
at the choose button down here.
If I hold down the option button it changes to profile.
What it means, when I click this,
the application is going to start recording.
It saves me a step or two.
I'll do that now.
Now the time profiler will come up.
This is measuring the app.
I'll do some quick scrolling back
and forth, capturing some data.
I think that's enough.
Let's go ahead, stop the recording.
I'm going to filter to specifically the scrolling data.
If we go ahead down here looking at the detailed view.
This is promising.
I'm actually -- you can see, there is multiple threads here.
The threads are doing work.
That's really good.
If we go ahead, if I hold down option,
click the disclosure triangle I can see what the thread is
calling, there is some dispatch calls,
some CG calls, that's good.
That's the drawing code.
We'll go ahead, check the other one to see.
Hold down the option key.
Dispatch, CG calls.
This is good.
This is looking promising.
I'm multithreaded, in theory my app is faster.
However, multithreading does not necessarily mean faster.
We should confirm this is actually doing anything for us.
One way to do that, I happen to know this device is two CPUs,
if the CPUs operate in parallel
at max capacity I should see a 200% CPU usage
in my graph up here.
I'm not seeing anything over 100%, that's a bit
of a warning sign, it doesn't necessarily mean
that they're not both doing work at the same time.
It means that I need to check further.
How do we check further?
Instruments has what we call strategies, it is different ways
of partitioning the data to look at them.
There is three of them.
The first one is the Instrument strategies, the default,
we're looking at it here.
The second one is the CPU strategy,
it shows the data per CPU or CPU relevant data
and the final one is the thread strategy.
It shows you details on what each thread is doing.
Let's look at the CPU strategy.
We can see we have each of the CPUs,
we can see how much work it is doing.
At the bottom, we see the combined usage.
A nice thing to do here, when I zoom in far enough, the details,
what the graph shows me, it will actually change.
Rather than show making the average usage, it will show
if the CPU is active or not,
it will go from average usage, to either on or off.
Now each CPU shows an on state or off state,
whether or not it is working.
What you notice here, the CPUs are never working
at the same time, there is no parallelism here.
You know, this is not good.
If we want to feel worse, we look at the thread strategy.
What this is showing us, each icon,
it represents a sample the time profiler took,
you can click on it.
See the call stack.
Here this is on a background thread
and you see the core graphics calls, here is the main thread,
you see the main -- the work we're doing on the main thread.
As you see, if I zoom in to the right level,
it is probably here, I scroll around,
you see there is not really anywhere
where two threads are working at the same time.
It is jumping from one thread to another.
So the drawsAsynchronously, this has not really done anything
for us, in theory, it may have slowed us down.
Now not only we doing the same drawing work but also managing,
you know, core graphics system managed the threads it is
working on, that sort of thing.
That didn't really help.
I'll turn it off.
I'll flip another table I guess.
It is not clear, the magic button didn't help.
What do we do now?
This is a really common occurrence again
in time profiling.
You try lots of things, most don't work.
We step back.
What does the app do?
It builds a path and draws a path.
We have seen the draw path code.
Let's think about the build path code.
That's right in here.
What we wanted to do, we wanted
to investigate the actual path we're building.
What this code does, it loops the data elements,
creating a path and adds a line to that path
for each data element.
We want to know how many lines we're adding to the path.
This is something that the time profiler can't tell us.
It can't tell you how long,
or how many times a particular method
or function has been called.
It doesn't know the difference
between a slow function that's called a few times
or a fast function that's called a lot.
In this case we resorted to NSLog and we just have a thing,
every time we add a path we increment our counter
and then we log it when we're done with the loop.
Something important to point out, NSLog,
it is not a very fast function.
You don't want it in high performance code.
You probably don't want to use it for anything other
than gathering diagnostic information or debugging.
When you're done, delete it from the code.
In this case we just comment it out so you can see it.
What we have found, we're adding 10,000 lines to the point
in cases where we did not need to do that.
In fact, there is no way
to display 10,000 lines on this device.
Especially when you're zoomed out far enough
that all the data fits within 100 screen points.
There is no reason to draw 10,000 lines in there.
We need to draw 100 lines.
It is a lot less work.
We went ahead, we created an implementation that did that.
If multiple data elements, data points are
within a single screen point it just finds the max
and draws a single line.
If we're using 100 screen points we're creating 100 screen lines.
We'll go ahead and switch to that implementation.
. . We're feeling good about that.
We'll change the element count up to the goal
of 100,000 rather than 10,000.
We'll see if that helped us at all.
I'll use command-I to start up the Instruments.
Since Instruments is already open, it is just going
to bring it to the foreground and start recording immediately.
Here we go.
A new recording.
We'll go ahead and scroll, the scrolling seems fine.
I'll zoom out.
Zooming performance is much, much better.
It takes longer because I have more data to zoom out.
It is actually looking pretty good.
I'm going to do the swiping back and forth.
It is tracking my finger really well now.
It is actually keeping up with it, doing a fantastic job.
Hooray! All done!
Except not quite.
If we look at the actual amount
of CPU we're using while scrolling back and forth,
we can see, you know, sometimes it is down to 60%,
it is usually in the 70s or 80s.
Technically we're meeting our performance goals.
What are we doing -- what's the next thing to do
with this app or prototype?
We'll add additional features.
We know that we need more headroom than what we have here.
How do we make it faster, how do we make the app better
and meet performance goals?
We'll focus on this.
We'll filter to that data.
I'll give myself a little room.
In this case, I'm going to hold down the option key
and click main and expand this out.
I can actually go down here and I can see this method here.
You know, now that the drawing of the paths is fast enough,
it is the building of the path that becomes the bottleneck.
I want to focus on this method.
I will click the focus button.
That just moves aside everything outside of the method
and normalizes our percentages to within this method.
This method is spending 55% of time in the get next element
and 10, 11% of time in objc msgSend.
Something that I know, objc msgSend,
it is a super fast method.
It is super optimized.
But, if I can get that 10% back, I want it.
If we look inside of our code here we can see it is actually
Most of our time is spent on getting the next element.
This percentage here, it is a bit higher than in the tree view
because it includes the objc msgSend time.
If I get rid of that and make this iterator faster,
I hopefully can get the performance boost that I want.
To give us ideas on how much to do that, it is back to Chad.
CHAD WOOLF: Let's talk about objc msgSend a bit.
It implicitly gets inserted
by the compiler whenever you use the square bracket notation
or whenever you use the dot notation
to access a property on an object.
Its purpose is to look up the method implementation
for the selector and invoke that method.
That's a long way of saying that's how we do dynamic
dispatch in Objective-C.
Objc msgSend is extremely fast and does not push a stack frame.
When you look at your time profiles,
you typically won't see its effect.
The times that you do see it would be a perfect example
like we have where we see it in our iterator.
What we're doing, we're iterating over 100,000 points
and calling it the get next method with a small method body.
Just increments a couple values and returns a structure.
What's happening, all of that overhead time
from the Objective-C message send is accumulating
into something that's measurable.
Is there a way to get around that overhead?
Objective-C by design is a dynamic language,
you have to make the objc msgSend call
when accessing methods for objects and classes.
It does this every time because you can switch
out method implementations at runtime.
There is no compile time in Objective-C saying I want
to call in this particular method body.
The only exception here, if you do what's called method caching,
where you look up the method implementation yourself
and you call it through its function Pointer.
In general I don't recommend that you do that.
It is fragile as you can imagine.
In general, in my experience it hasn't given me the kind
of performance wins I'm expecting because you got
to think about why we're here in the first place.
The reason we're here, the get next element method has
that small method body.
Even if you invoke it through the function Pointer you still
have to have to marshal the arguments and push the frames
on to the stack and pop them when you're done.
That's exactly what you saw in the previous set of slides,
that can be a lot of overhead
with an increment and then return.
I want to make sure I point it out that method caching is not
as fast as inlining, what we really want in this case is
that small method body to be inlined.
How do you that in Objective-C?
Well, you have alternatives.
The first one, you could have used C
and you could have used structures instead
of an iterator, you could pass a C ray
into the method for example.
If you want that OO flavor you can use C++.
The way you use C++ in Objective-C,
is you rename the file from a .m to a .mm
and then you can use C++ syntax.
Because Arc is usually on by default,
then you take Objective-C objects and put them
in STL containers, and put them in instance variables
on your classes and structures.
This is handy and you get the performance benefits of C++
and I have used it extensively in Instruments to get
as much speed as I could out of the track view
and other critical areas of Instruments.
I know from firsthand experience also there is a major downside
That is that you have to know ahead of time which parts
of your code are going to benefit from C++
and which codes will benefit from Objective-C?
Sometimes, until you're doing profiling,
you can make a mistake there as we did in the demo case.
We wrote our iterator
in Objective-C not realizing it would show
up in our time profiles.
Is there a better alternative than what I just mentioned here?
There is. You knew it was coming.
Swift is ideal, because unlike Objective-C it is only dynamic
when it notes to be.
If you make sure that the performance critical classes are
internal and you use whole module optimization,
the compiler or the whole tool chain can determine
when there's only one method implementation
and inlines it right into the call site giving you some
significant wins especially on the iterator case.
Because we're prototyping, rewriting the iterator
in the View Controller in Swift is easy.
Kris has done that.
KRIS MARKEL: I have a Swift implementation ready to go,
this is an easy port of the Objective-C implementation
with a couple of the suggestions they made
in this morning's session about improving Swift code,
specifically turning on whole module optimization.
Let's profile this.
It will build.
Install to the device.
It should just start profiling.
Okay. I'm going to bring the application forward
so you can see.
Here is the scrolling.
There you go.
Staying nice and fast.
A lot of data to zoom out of.
Now, if I go ahead, move back and forth, it moves really fast.
KRIS MARKEL: It is very nice.
Thanks. Actually we can go ahead in here and look
at what the CPU usage is.
You know, we've got, actually further improvement
than what we expected, we're expecting 5, 6% improvement
from removing the objc msgSend, this is lower and I can actually
if I turn down this disclosure triangle you see the two runs
next to each other and you see the previous run in the lower --
the current run, it is clearly much lower.
Actually if I go ahead, I go in here,
I look for my build path method I have to search for it now, oh,
that's not how you do a search.
If I hit command-F, it will bring up this dialogue here.
I can type in build path and it shows me my method here.
If we look at this, you see here is my Swift code.
My get next call which is right here,
it isn't showing up in any samples.
You know, no samples landed on this.
Why? It is because Swift was able to inline it,
whip means that there's no function overhead,
no method call overhead whatsoever.
Because the code for the iterator is inline with the rest
of the code it makes further improvements explaining the
better performance than what we hoped
for by skipping the dynamic dispatch.
Chad, anything else to tell these nice people?
CHAD WOOLF: We have 5 minutes!
Of course I do.
Some tips for you to explore Instruments on your own.
The first one to point out, under the recording settings,
it's called record waiting threads.
I mentioned that the service
and kernel we use samples the active CPUs
but if you have threads sitting around, blocked on a lock
or waiting for I/O, you check this checkbox,
and the service will sample the idle threads as well.
If you have code that's contending
over a lock you see the hot pods show
up when you enable record waiting threads.
Another thing that I find interesting,
in the display settings, in the call tree section,
it is invert call tree.
Figuratively what it does, it flips the call tree upside down.
Instead of seeing the leafs at the bottom nodes of the tree,
that's functions that don't call into anything,
they appear at the top.
If a utility function is being called from 5, 6 places,
you invert that call tree to see who is actually calling
into that particular function.
It gives you a different perspective
on the data in the call tree.
When you right click a node in the call tree,
you get a context menu and there is interesting stuff
under there as well.
One thing that I use from time to time is the charge to caller,
so what you can do, you can charge a function
on method to the caller.
You can charge an entire Library of Framework to the caller.
There's an option in there as well to prune a subtree.
You don't want to work on a specific problem at the time,
you can prune that out of the data and focus
on what you want to focus on.
What did we learn?
Through all of this, the first things I want
to remind you about,
is to incorporate performance targets early.
If you're doing a big performance rewrite
like we were, start with a budget and keep monitoring it
because once you start layering a lot of code on top of that,
it is harder to change.
Secondly, always measure.
of our demonstrations we were taking time profiles
with the time profiler, we used that data to find the hot spots
and retuning it, by the end we had a
If you go in blind, depends, you may be lucky, hit it.
I would start with a measurement and go after that first.
In the third one, this is most important to me,
I want to encourage you to keep digging.
Some of these performance problems you look
at at first may have seemed impassable,
you say that's happening in someone else's code
or a side effect of the runtime.
The reason we gave you the details on the time profiler,
the runtimer, what the disassembly view looks like,
is we wanted to show you there is a whole world
out there of more details.
By using that, you can solve the performance problems
that we were solving today.
I encourage you to keep digging, and with creativity,
going out there into sessions like this,
I know you guys will be able to fix
and hit the performance targets you want.
That just about does it for today.
Stefan Lesser is our dev tools evangelist,
contact him if you have any questions.
We related sessions, Energy debugging issues, turns out that
if you have efficient code on the CPU it uses less energy.
There was that session on Wednesday.
Also tomorrow, there is performance on iOS and Watch OS,
good news, the time profiler can target apps on the watch.
That's a big boon.
Tomorrow we'll be in labs from 9:00 to 2:00.
Enjoy the rest of your conference.
Looking for something specific? Enter a topic above and jump straight to the good stuff.
An error occurred when submitting your query. Please check your Internet connection and try again.