Tracking initial memory usage by file in Ruby
I’ve found that as a project progresses, the initial memory usage of a Rails application seems to grow more and more.
The more time I spend trying to track down memory leaks (or just pieces of code that use more memory than they should) the more I realize that it’s a fairly imprecise science. I’ve had the best luck using tools to give me a good idea of where to start poking around. From there it’s just a matter of looking at the code and finding what silly things people are doing.
If I wanted to see what was contributing to the large memory footprint of an application on startup, tracking how much memory was allocated during each require would give me a good place to start.
The code to do this is amazingly straightforward:
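# Bytes allocated per required file; missing keys default to 0.
$require_stats = Hash.new(0)
module RequireTracking
  def require(*args)
    start_size = GC.allocated_size
    super
  ensure
    # Runs even if the require raises, so failed loads are still counted.
    $require_stats[args.first] += (GC.allocated_size - start_size)
  end
end
Object.send(:include, RequireTracking)
Kernel.send(:include, RequireTracking)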
The entire implementation is available as a gist on GitHub.
The GC.allocated_size method is included in the RailsBench GC patch, which is part of the Ruby Enterprise Edition interpreter.
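If you are not running Ruby Enterprise Edition, a rough approximation is possible on a stock interpreter. The sketch below assumes Ruby 2.2 or later and swaps in GC.stat(:total_allocated_objects), so it counts allocated objects rather than bytes. The names AllocationTracking and REQUIRE_OBJECT_STATS are made up for this example and are not part of the original gist.

```ruby
# Sketch: count objects allocated during each require on a stock Ruby (2.2+).
# AllocationTracking and REQUIRE_OBJECT_STATS are hypothetical names.
REQUIRE_OBJECT_STATS = Hash.new(0)

module AllocationTracking
  private

  def require(name)
    before = GC.stat(:total_allocated_objects)
    super
  ensure
    # The counter only ever grows, so the difference is never negative.
    REQUIRE_OBJECT_STATS[name] += GC.stat(:total_allocated_objects) - before
  end
end

# Prepending to Object puts the wrapper ahead of Kernel#require in method lookup.
Object.send(:prepend, AllocationTracking)

require 'json'

# Print the ten requires that allocated the most objects.
REQUIRE_OBJECT_STATS.sort_by { |_, count| -count }.first(10).each do |name, count|
  puts format('%10d  %s', count, name)
end
```

Object counts are a cruder measure than bytes (one huge string counts the same as one tiny one), but they are usually enough to show which files deserve a closer look.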
One thing to understand is that this only tracks how much memory was allocated, not how much was freed. The statistics therefore include memory that was allocated temporarily and is no longer referenced. That can still be useful, because even temporarily using lots of memory can negatively impact startup time.
Another aspect to understand is that the numbers we are tracking are what profilers normally call “self + children”. This means the statistics for a file include all memory allocated by that file as well as anything allocated by the files it requires. The same memory is therefore counted multiple times, but this is useful for understanding the total memory implications of requiring a file.
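If you also want a “self” figure, one way is to keep a stack of running child totals and subtract them when each require finishes. This is only a sketch under a few assumptions: it counts objects with GC.stat(:total_allocated_objects) rather than bytes with GC.allocated_size, it assumes requires happen on a single thread, and every name in it is invented for the example.

```ruby
# Sketch: separate "self" from "self + children" for nested requires.
# Counts objects, not bytes; all names here are hypothetical.
REQUIRE_TOTALS = Hash.new(0) # self + children
REQUIRE_SELF   = Hash.new(0) # self only
CHILD_STACK    = [0]         # one running child total per active require

module NestedRequireTracking
  private

  def require(name)
    CHILD_STACK.push(0)
    before = GC.stat(:total_allocated_objects)
    super
  ensure
    total    = GC.stat(:total_allocated_objects) - before
    children = CHILD_STACK.pop
    REQUIRE_TOTALS[name] += total
    REQUIRE_SELF[name]   += total - children
    # Report this require's total to whichever require is still in progress.
    CHILD_STACK[-1] += total
  end
end

Object.send(:prepend, NestedRequireTracking)

require 'json'

REQUIRE_SELF.sort_by { |_, count| -count }.first(10).each do |name, count|
  puts format('%10d self  %10d total  %s', count, REQUIRE_TOTALS[name], name)
end
```

The bookkeeping itself allocates a little, so the “self” numbers are slightly inflated; treat them as a ranking aid rather than an exact measurement.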
Running this on one of the projects I was working on turned up this little gem:
class Webster
  DICTIONARY = File.open(File.join(File.dirname(__FILE__), 'words')) do |file|
    file.readlines.collect {|each| each.chomp}
  end
  def random_word
    DICTIONARY[rand(DICTIONARY.size)]
  end
end
You can find the source on GitHub here.
This would be a prime candidate for refactoring if you are worried about your memory usage.
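One possible refactoring, sketched here rather than taken from the project, is to defer reading the word list until a word is actually requested. The path argument to the constructor and the DEFAULT_PATH constant are additions made for this example.

```ruby
# Sketch: load the word list lazily instead of when the class is defined.
class Webster
  # Same location as the original; DEFAULT_PATH and the path argument are
  # hypothetical additions for this example.
  DEFAULT_PATH = File.join(File.dirname(__FILE__), 'words')

  def initialize(path = DEFAULT_PATH)
    @path = path
  end

  def random_word
    dictionary.sample
  end

  private

  # Read the file on first use and keep the result for later calls.
  def dictionary
    @dictionary ||= File.readlines(@path).map(&:chomp)
  end
end
```

This only moves the cost out of startup: the first call to random_word still reads the whole file, and each instance keeps its own copy. If that is still too much, the list could be cached at the class level or replaced with something that does not hold every word in memory.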