Tracking initial memory usage by file in Ruby

I’ve found that as a project progresses, the initial memory usage of a Rails application seems to grow more and more.

The more time I spend trying to track down memory leaks (or just pieces of code that use more memory than they should), the more I realize that it’s a fairly imprecise science. I’ve had the best luck using tools to give me a good idea of where to start poking around. From there it’s just a matter of reading the code and finding what silly things people are doing.

If I wanted to see what was contributing to the large memory footprint of an application on startup, tracking how much memory was allocated during each require would give me a good place to start.

The code to do this is amazingly straightforward:

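# Memory allocated per require, keyed by the first argument passed to require.
# Defaults to 0 so that += works the first time a file is seen.
$require_stats = Hash.new(0)

module RequireTracking
  def require(*args)
    start_size = GC.allocated_size
    super
  ensure
    # Runs even when require raises, so failed loads are still recorded.
    $require_stats[args.first] += (GC.allocated_size - start_size)
  end
end

Object.send(:include, RequireTracking)
Kernel.send(:include, RequireTracking)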

The entire implementation is available as a gist on GitHub.

The GC.allocated_size method is included in the RailsBench GC patch which is part of the Ruby Enterprise Edition interpreter.
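On an interpreter without that patch, a rough equivalent can be sketched with GC.stat(:total_allocated_objects), which is available in stock Ruby 2.2 and later. Note the swap: this counts allocated objects rather than allocated memory, so the numbers are not directly comparable to GC.allocated_size. The names REQUIRE_STATS and ObjectCountRequireTracking are made up for this illustration and are not part of the gist.

```ruby
# Hypothetical sketch: counts objects allocated during each require.
# Assumes stock Ruby >= 2.2; object counts stand in for GC.allocated_size.
REQUIRE_STATS = Hash.new(0)

module ObjectCountRequireTracking
  def require(name)
    before = GC.stat(:total_allocated_objects)
    super
  ensure
    # Allocated-object count is monotonic, so the difference is never negative.
    REQUIRE_STATS[name] += GC.stat(:total_allocated_objects) - before
  end
end

# Sits between Object and Kernel in the ancestor chain, so super reaches
# the regular Kernel#require.
Object.include(ObjectCountRequireTracking)

require 'json'

# Print the ten requires that allocated the most objects.
REQUIRE_STATS.sort_by { |_, count| -count }.first(10).each do |name, count|
  puts format('%10d  %s', count, name)
end
```

Only calls that go through require are seen here; require_relative and load bypass the hook.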

One thing to understand is that this tracks only how much memory was allocated, not how much was freed. The statistics therefore include memory that was allocated temporarily and is no longer referenced. That is still useful, because even temporarily using lots of memory can negatively impact startup time.

Another thing to understand is that the numbers we are tracking are what profilers normally call “self + children”: all memory allocated by a file, plus anything allocated by the files it requires, is included in that file’s statistics. The same memory is therefore counted multiple times, but the figure is useful for understanding the total memory implications of requiring a file.
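If the double counting gets in the way, a "self"-only figure could be derived by keeping a stack of in-flight requires and subtracting what nested requires allocated. The following is only a sketch under assumptions: it uses GC.stat(:total_allocated_objects) from stock Ruby (object counts, not memory), it is not thread-safe, and TOTAL_STATS, SELF_STATS, CHILD_STACK and SelfRequireTracking are hypothetical names that do not come from the gist.

```ruby
# Hypothetical sketch: separates "self" from "self + children" per require.
TOTAL_STATS = Hash.new(0)  # self + children
SELF_STATS  = Hash.new(0)  # self only
CHILD_STACK = []           # one accumulator per require currently in progress

module SelfRequireTracking
  def require(name)
    CHILD_STACK.push(0)
    before = GC.stat(:total_allocated_objects)
    super
  ensure
    total    = GC.stat(:total_allocated_objects) - before
    children = CHILD_STACK.pop
    TOTAL_STATS[name] += total
    SELF_STATS[name]  += total - children
    # Report this require's total to the enclosing require, if there is one.
    CHILD_STACK[-1] += total unless CHILD_STACK.empty?
  end
end

Object.include(SelfRequireTracking)
```

A file whose total is large but whose self figure is small is mostly paying for what it pulls in, which points the search at its dependencies rather than at the file itself.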

Running this on one of the projects I was working on turned up this little gem:

class Webster
  DICTIONARY = File.open(File.join(File.dirname(__FILE__), 'words')) do |file|
    file.readlines.collect {|each| each.chomp}
  end
  
  def random_word
    DICTIONARY[rand(DICTIONARY.size)]
  end
end

You can find the source on GitHub here.

This would be a prime candidate for refactoring if you are worried about your memory usage.
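As one possible direction (not the original project's code), the word list does not have to live in memory at all. The sketch below streams the file on every call and picks a line with single-item reservoir sampling, trading CPU and I/O per call for a near-zero resident footprint. LazyWebster and WORDS_PATH are hypothetical names, and the 'words' file next to the source is assumed to hold one word per line as in the original.

```ruby
# Hypothetical sketch: picks a random word without holding the list in memory.
class LazyWebster
  WORDS_PATH = File.join(File.dirname(__FILE__), 'words')

  def initialize(path = WORDS_PATH)
    @path = path
  end

  # Reservoir sampling with a reservoir of one: the Nth line replaces the
  # current choice with probability 1/N, so every line is equally likely.
  # Returns nil when the file is empty.
  def random_word
    chosen = nil
    File.foreach(@path).with_index(1) do |line, count|
      chosen = line.chomp if rand(count).zero?
    end
    chosen
  end
end

# Usage (reads the file on each call):
#   LazyWebster.new.random_word
```

If random_word is called often, loading the list lazily on first use would be the cheaper trade: the memory cost stays the same as the original, but it is only paid by processes that actually ask for a word.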

Posted Wednesday, December 30 2009.

written by Eric Lindvall

I also appear on the internet on GitHub and Twitter as @lindvall and work hard to make Papertrail awesome.