I just watched Sam Saffron's "Measuring Ruby" talk at Golden Gate Ruby Conference 2013. I wanted to take a few notes, since I don't know a lot about performance tuning, and thought I'd publish them rather than keep them to myself. These are pretty unpolished; obviously, watch the talk if you want more.
-
The quote about premature optimization being evil is taken out of context. Knuth was just saying that we should measure, not guess. In fact, he wanted all compilers to give us constant feedback on what parts of the code were slow. We still don't have that.
-
Measure your front end first; that's generally where most of the time goes. WebPageTest.org, YSlow, etc. are good tools for that.
-
For Ruby, check out rbtrace.
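I haven't dug into it yet, but the basic shape, going from the rbtrace README as I remember it (double-check rbtrace --help for the exact flags): require it inside the process so a tracer can attach, then point the CLI at the pid.
# inside the app, e.g. an initializer
require 'rbtrace'

# then, from a shell (flags from memory -- verify with rbtrace --help)
rbtrace -p <pid> --firehose    # stream every method call as it happens
rbtrace -p <pid> --slow=250    # only show calls that take longer than 250ms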
-
Apart from any tools, here's a quick and dirty measurement of a method:
def print_duration(name)
  start = Time.now
  result = yield
  total = (Time.now - start) * 1000
  puts "#{name}: #{total.to_i}ms"
  result
end

alias_method :some_method_old, :some_method
def some_method(*args, &blk)
  print_duration("some description") do
    some_method_old(*args, &blk)
  end
end
-
rack-mini-profiler is another good tool. Saffron has it hooked into Discourse so that every time he loads a page, he sees, embedded in the page, the time spent building it on the server, broken down by action, rendering, etc. It also shows client-side times (page painting, DOM loading, etc.) and has a share link for sending a profile to the team.
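A minimal setup sketch, assuming a Rails app (in Rails the gem hooks itself in as middleware; Rack::MiniProfiler.authorize_request is from the gem's README and is how Discourse-style apps opt particular requests into profiling in production — the admin check is a made-up placeholder, and you'd also need the gem's authorization-mode config):
# Gemfile
gem 'rack-mini-profiler'

# e.g. in ApplicationController, to profile in production only for admins
before_action do
  Rack::MiniProfiler.authorize_request if current_user && current_user.admin?
end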
-
It also shows the SQL statements and the time gaps between them. Look for queries that shouldn't run at all or that take too long, for queries that select too much data, and for N+1 problems.
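On the "too much data selected" point, a hedged ActiveRecord sketch (Post and its columns are made up here):
# loads entire rows when you may only need a couple of columns
Post.where(topic_id: topic.id).to_a       # SELECT "posts".* ...

# select or pluck just the columns you actually use
Post.where(topic_id: topic.id).select(:id, :user_id)
Post.where(topic_id: topic.id).pluck(:id)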
-
Hidden features: the pp query parameter is expert mode. Flamegraphs gather the entire stack trace and lay the samples out side by side, color-coded. Moving vertically shows you, for example, what's being done above (before) the SQL runs.
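The exact pp commands vary by version — these are from the rack-mini-profiler docs as I remember them, and ?pp=help will list what your copy supports:
http://localhost:3000/some/page?pp=help         # list the hidden commands
http://localhost:3000/some/page?pp=flamegraph   # sample the request and render a flamegraph
-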
You can exclude certain requests; things that happen a lot and are short should probably be excluded.
-
Try the memory_profiler gem (it currently requires ruby-head). Looking at retained objects by gem shows you what's eating memory. Fewer objects also means the GC has less work to do and can run faster; more objects in memory mean a slower app, heavier hardware requirements, and higher hosting costs.
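A minimal usage sketch (MemoryProfiler.report and pretty_print are the gem's documented API; the workload in the block is a made-up placeholder):
require 'memory_profiler'

report = MemoryProfiler.report do
  # exercise the code you want to audit
  100.times { SomeExpensiveThing.new.run }
end

# prints allocated and retained objects/memory, grouped by gem, file, and location
report.pretty_print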
-
Regarding queries, ask: Do I need this at all? Can I cache it? Can I join or include to reduce the number of queries?
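For the join/include question, a hedged ActiveRecord sketch (Topic and its user association are made up) of turning an N+1 into an eager load:
# N+1: one query for the topics, then one query per topic for its user
Topic.limit(30).each { |t| puts t.user.name }

# includes eager-loads the association up front instead
Topic.includes(:user).limit(30).each { |t| puts t.user.name }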
-
Our intuitions are often wrong. He measured exceptions at 0.3ms each. That's cheap unless you have a lot of them, like if you're using them for control flow in a loop.
You can find them with:
exceptions = []
trace = TracePoint.new(:raise) do |tp|
  exceptions << tp.raised_exception
end
trace.enable
-
ApacheBench (ab) is another tool for an outside view of speed.
ab -n 100 http://mysite.com/page
It breaks down request times by percentile.
-
Sometimes just seeing that a gem eats a lot of memory, putting 'require: false' in your Gemfile, and requiring it in the rarely-run bit of code that needs it is a big help.
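A sketch of that pattern; nokogiri here is just a stand-in for whichever heavy gem you find:
# Gemfile: install the gem, but don't load it at boot
gem 'nokogiri', require: false

# the rarely-run code path that actually needs it
def parse_uploaded_html(html)
  require 'nokogiri'
  Nokogiri::HTML(html)
end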
Wrapup
-
Audit memory usage using memory_profiler
-
Run rack-mini-profiler to integrate performance metrics into dev and production
-
Use flamegraphs to find your slowest code