Tracking initial memory usage by file in Ruby

I’ve found that as a project progresses, the initial memory usage of a Rails application seems to grow more and more.

The more time I spend trying to track down memory leaks (or just pieces of code that use more memory than they should) the more I realize that it’s a fairly imprecise science. I’ve had the best luck using tools to give me a good idea of where to start poking around. From there it’s just a matter of looking at the code and finding what silly things people are doing.

If I wanted to see what was contributing to the large memory footprint of an application on startup, tracking how much memory was allocated during each require would give me a good place to start.

The code to do this is amazingly straightforward:

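# Bytes allocated per required file; unseen keys default to 0.
$require_stats = Hash.new(0)

module RequireTracking
  def require(*args)
    start_size = GC.allocated_size
    super
  ensure
    # Record the allocation even if the require raised.
    $require_stats[args.first] += (GC.allocated_size - start_size)
  end
end

Object.send(:include, RequireTracking)
Kernel.send(:include, RequireTracking)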

The entire implementation is available as a gist on GitHub.

The GC.allocated_size method is included in the RailsBench GC patch which is part of the Ruby Enterprise Edition interpreter.
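If you are not running Ruby Enterprise Edition, a rough equivalent can be sketched on a stock interpreter. This is only an approximation under a few assumptions: it needs Ruby 2.2 or later, it counts allocated objects through GC.stat(:total_allocated_objects) rather than bytes, and it uses prepend instead of include. The two requires at the bottom are just stand-ins for an application's boot process.

```ruby
# Sketch only: counts allocated objects (not bytes) with GC.stat, which is
# what a stock Ruby 2.2+ offers. REE's GC.allocated_size reports bytes.
$require_stats = Hash.new(0)

module RequireTracking
  def require(*args)
    start_count = GC.stat(:total_allocated_objects)
    super
  ensure
    # Runs even when the require raises, so failed loads are counted too.
    $require_stats[args.first] += GC.stat(:total_allocated_objects) - start_count
  end
end

# Prepending puts the tracker ahead of Kernel#require in the lookup chain.
Object.send(:prepend, RequireTracking)

# Stand-ins for whatever the application requires on startup.
require 'time'
require 'json'

# Print the most expensive requires first.
$require_stats.sort_by { |_, count| -count }.each do |name, count|
  puts "#{count}\t#{name}"
end
```

Object counts and byte counts will not rank files identically, so treat the output as a pointer to where to start looking rather than a measurement.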

One thing to understand is that this only tracks how much memory was allocated, not how much was freed. The statistics therefore include memory that was allocated temporarily and is no longer referenced. That is still useful, because even temporarily using lots of memory can negatively impact startup time.

Another aspect to understand is that the numbers we are tracking are what profilers normally call “self + children”. This means that the statistics for a file include all memory allocated by that file as well as anything allocated by the files it requires. The same memory ends up being counted multiple times, but the number is useful for understanding the total memory implications of requiring a file.
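To make the “self + children” distinction concrete, here is a small sketch that records both the inclusive and the “self only” number for nested blocks. NestedTracker and its counter block are hypothetical names made up for this illustration and are not part of the gist; the counter is any block that returns a running allocation total.

```ruby
# Hypothetical helper, not part of the original gist: records the inclusive
# ("self + children") and exclusive ("self") cost for each tracked name.
class NestedTracker
  attr_reader :inclusive, :exclusive

  def initialize(&counter)
    @counter = counter       # block returning an ever-increasing total
    @inclusive = Hash.new(0)
    @exclusive = Hash.new(0)
    @child_totals = [0]      # one running child total per open frame
  end

  def track(name)
    start = @counter.call
    @child_totals.push(0)
    begin
      yield
    ensure
      total = @counter.call - start
      children = @child_totals.pop
      @inclusive[name] += total
      @exclusive[name] += total - children
      @child_totals[-1] += total # report our inclusive cost to the parent
    end
  end
end

# A fake counter stands in for the interpreter's allocation total.
allocated = 0
tracker = NestedTracker.new { allocated }

tracker.track('parent.rb') do
  allocated += 100           # allocated directly by parent.rb
  tracker.track('child.rb') do
    allocated += 40          # allocated by a file that parent.rb requires
  end
end

p tracker.inclusive # => {"child.rb"=>40, "parent.rb"=>140}
p tracker.exclusive # => {"child.rb"=>40, "parent.rb"=>100}
```

The 40 units from child.rb show up in both inclusive totals, which is exactly the double counting described above.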

Running this on one of the projects I was working on turned up this little gem:

class Webster
  DICTIONARY = File.open(File.join(File.dirname(__FILE__), 'words')) do |file|
    file.readlines.collect {|each| each.chomp}
  end

  def random_word
    DICTIONARY[rand(DICTIONARY.size)]
  end
end

You can find the source on GitHub here.

This would be a prime candidate for refactoring if you are worried about your memory usage.
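One possible refactoring, as a sketch rather than what the project actually did, is to defer reading the word list until a word is first requested, so the cost is no longer paid when the file is required. The path argument to initialize is an assumption added here so the class can be pointed at a different word list.

```ruby
class Webster
  # Same location the original code reads from.
  WORDS_PATH = File.join(File.dirname(__FILE__), 'words')

  # The optional path argument is an addition for this sketch.
  def initialize(path = WORDS_PATH)
    @path = path
  end

  def random_word
    dictionary.sample
  end

  private

  # Read the file on first use instead of at require time.
  def dictionary
    @dictionary ||= File.readlines(@path).map { |line| line.chomp }
  end
end
```

This only moves the allocation from startup to first use, and each instance keeps its own copy. If memory is the real concern, the word list itself is what needs to shrink or live outside the process.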

Posted Wednesday, December 30th, at 11:21 AM (∞).

Add NewRelic instrumentation for ThinkingSphinx

NewRelic provides a really great mechanism in their plugin to instrument just about anything.

One of the things I found when analyzing actions in NewRelic was that all of the time spent in the ThinkingSphinx methods was being attributed to the template instead of the model. As we all know, misattribution of time spent can make tracking down trouble spots in your code much more difficult.

It turns out that all that is required to start tracking the time you spend searching in ThinkingSphinx is a couple of calls to add_method_tracer:

  add_method_tracer :search, 'ActiveRecord/#{self.name}/search'
  add_method_tracer :search, 'ActiveRecord/search', :push_scope => false
  add_method_tracer :search, 'ActiveRecord/all', :push_scope => false

You can see the full code here.

Once you’ve required the code, you’ll start to see the #search and #search_count methods show up in your Performance Breakdowns:

[Image: Performance Breakdown]

Isn’t that sweet?

Update: NewRelic has some great documentation for Custom Metric Collection if you want to do more.

Posted Tuesday, December 29th, at 5:11 PM (∞).


by Eric Lindvall

I also appear on the internet on GitHub and Twitter and work hard to make Papertrail awesome.