Does a line profiler for code require a parse tree and is that sufficient?

Developer https://www.devze.com 2023-04-02 07:42 Source: Internet
I am trying to determine what is necessary to write a line profiler for a language, like those available for Python and Matlab.

A naive way to interpret "line profiler" is to assume that one can insert time logging around every line, but the definition of a line is dependent on how a parser handles whitespace, which is only the first problem. It seems that one needs to use the parse tree and insert timings around individual nodes.

Is this conclusion correct? Does a line profiler require the parse tree, and is that all that is needed (beyond time logging)?


Update 1: Offering a bounty on this because the question is still unresolved.

Update 2: Here is a link for a well-known Python line profiler in case it is helpful for answering this question. I've not yet been able to make heads or tails of its behavior relative to parsing. I'm afraid that the code for the Matlab profiler is not accessible.

Also note that one could say that manually decorating the input code would eliminate a need for a parse tree, but that's not an automatic profiler.

Update 3: Although this question is language agnostic, this arose because I am thinking of creating such a tool for R (unless it exists and I haven't found it).

Update 4: Regarding the use of a line profiler versus a call stack profiler: this post about using a call stack profiler (Rprof() in this case) exemplifies why it can be painful to work with the call stack rather than analyze things directly with a line profiler.


I'd say that yes, you require a parse tree (and the source) - how else would you know what constitutes a "line" and a valid statement?

A practical simplification, though, might be a "statement profiler" instead of a "line profiler". In R, the parse tree is readily available: body(theFunction), so it should be fairly easy to insert measuring code around each statement. With some more work you can insert it around a group of statements that belong to the same line.
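For comparison, here is a minimal sketch of the same statement-wrapping idea in Python, which the question also mentions. It uses the standard ast module. The names TIMINGS, _record, _wrap and instrument are made up for this sketch, and it only wraps the top-level statements of one function, so treat it as an illustration under those assumptions rather than a complete profiler.

```python
import ast
import time
from collections import defaultdict

# Hypothetical names for this sketch: TIMINGS, _record, _wrap, instrument.
TIMINGS = defaultdict(float)  # source line number -> accumulated wall-clock seconds


def _record(lineno, start):
    """Charge the elapsed wall-clock time to the line the statement starts on."""
    TIMINGS[lineno] += time.perf_counter() - start


def _wrap(stmt):
    """Return [timer start, try/finally around stmt] as new AST statements."""
    template = ast.parse(
        "_t0 = _now()\n"
        "try:\n"
        "    pass\n"
        "finally:\n"
        f"    _record({stmt.lineno}, _t0)\n"
    ).body
    assign, try_node = template
    try_node.body = [stmt]  # the finally clause still runs if stmt is a return
    return [assign, try_node]


def instrument(source, func_name):
    """Compile source with timing inserted around each top-level statement of func_name."""
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            wrapped = []
            for stmt in node.body:
                wrapped.extend(_wrap(stmt))
            node.body = wrapped
    ast.fix_missing_locations(tree)
    namespace = {"_now": time.perf_counter, "_record": _record}
    exec(compile(tree, "<instrumented>", "exec"), namespace)
    return namespace[func_name]


SOURCE = """
def f(x, y=3):
    a = 0; a = 1          # two statements on one line
    a = (x + 1) * (       # one statement on two lines
        y + 2)
    return a
"""

f = instrument(SOURCE, "f")
result = f(1)
for lineno in sorted(TIMINGS):
    print(lineno, TIMINGS[lineno])
```

Both statements on line 3 are charged to the same key, so the report is per line even though the instrumentation is per statement.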

In R, the body of a function loaded from a file typically also has an attribute srcref that lists the source for each "line" (actually each statement).

Here's a sample function (put in "example.R"):

f <- function(x, y=3)
{
    a <- 0; a <- 1  # Two statements on one line
    a <- (x + 1) *  # One statement on two lines
        (y + 2)

    a <- "foo       
        bar"        # One string on two lines
}

Then in R:

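source("example.R", keep.source = TRUE)  # keep.source makes sure srcref is recorded, even in non-interactive runs
dput(attr(body(f), "srcref"))            # f is the function defined in example.R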

Which prints this line/column information:

list(structure(c(2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L), srcfile = <environment>, class = "srcref"), 
    structure(c(3L, 2L, 3L, 7L, 9L, 14L, 3L, 3L), srcfile = <environment>, class = "srcref"), 
    structure(c(3L, 10L, 3L, 15L, 17L, 22L, 3L, 3L), srcfile = <environment>, class = "srcref"), 
    structure(c(4L, 2L, 5L, 15L, 9L, 15L, 4L, 5L), srcfile = <environment>, class = "srcref"), 
    structure(c(7L, 2L, 8L, 6L, 9L, 20L, 7L, 8L), srcfile = <environment>, class = "srcref"))

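As you can "see" (the last two numbers in each structure are the begin/end lines), the expressions a <- 0 and a <- 1 both map to line 3, while the two-line expression spans lines 4 to 5 and the two-line string spans lines 7 to 8. The first entry is the opening brace on line 2.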
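Other languages expose similar position information on their parse trees. As a rough Python analogue of the srcref output above (a sketch, assuming Python 3.8 or later for end_lineno and ast.get_source_segment; the helper names statement_spans and group_by_first_line are invented here), each statement node carries its first and last line, which is enough to group statements by line:

```python
import ast
from collections import defaultdict

SOURCE = '''def f(x, y=3):
    a = 0; a = 1      # two statements on one line
    a = (x + 1) * (   # one statement on two lines
        y + 2)
    a = """foo
        bar"""        # one string on two lines
    return a
'''


def statement_spans(source):
    """Return (first_line, last_line, text) for each top-level statement of each function."""
    tree = ast.parse(source)
    spans = []
    for func in tree.body:
        if isinstance(func, ast.FunctionDef):
            for stmt in func.body:
                text = ast.get_source_segment(source, stmt)
                spans.append((stmt.lineno, stmt.end_lineno, text))
    return spans


def group_by_first_line(spans):
    """Map each starting line to the statements that begin on it."""
    groups = defaultdict(list)
    for first, _last, text in spans:
        groups[first].append(text)
    return dict(groups)


spans = statement_spans(SOURCE)
groups = group_by_first_line(spans)
for first, last, text in spans:
    print(first, last, repr(text))
```

Here the two assignments on line 2 end up in the same group, which is the grouping a line profiler would need on top of per-statement timing.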

Good luck!


It sounds like what you mean by line profiler is something that measures the time spent within each line by instrumenting it. I hope what you mean by time is wall-clock time, because in real, good-sized software, if you only look at CPU time you're going to be missing a lot.

Another way to do it is stack-sampling on wall-clock time, as in the Zoom and LTProf profilers. Since every line of a stack sample can be localized to a line of code using only a map or pdb file, in the same way as debuggers do, there is no need to parse or modify the source.

The percent of time taken by a line of code is simply the percent of stack samples containing it. Since you are working at the line level, there is no need to distinguish between exclusive (self) time and inclusive time. This is because the line's percent of time active is what matters, whether or not it is a call to another function, a call to a blind system function, or just a call to microcode.

The advantage of looking at percents, instead of absolute times, is you don't need to worry about the app being slowed down, either by the sampling itself, or by competition with other processes, because those things don't affect the percents very much.

Also you don't have to worry about recursion. If a line of code is in a recursive function and appears more than once on a sample, that's OK. It still counts as only one sample containing the line. The reason that's OK is, if that line of code could somehow be made to take no time (such as by removing it) that sample would not have occurred. Therefore the samples containing that line would be removed from the sample set, and the program's total time would decrease by the same amount as the fraction of samples removed. That's irrespective of recursion.
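The counting rule described above can be sketched in a few lines of Python. The sample data and the function name line_percentages are made up for illustration; a stack is just a list of (file, line) pairs.

```python
from collections import Counter


def line_percentages(samples):
    """Return {location: percent of samples whose stack contains that location}.

    samples is a list of stacks; each stack is a list of hashable locations,
    here (filename, line number) pairs.
    """
    if not samples:
        return {}
    counts = Counter()
    for stack in samples:
        # set() makes a line that appears several times in one stack
        # (recursion) count only once for that sample.
        for location in set(stack):
            counts[location] += 1
    total = len(samples)
    return {location: 100.0 * n / total for location, n in counts.items()}


# Made-up samples: line 20 appears twice in the first stack because of recursion.
samples = [
    [("app.py", 10), ("app.py", 20), ("app.py", 20), ("app.py", 30)],
    [("app.py", 10), ("app.py", 20)],
    [("app.py", 10), ("app.py", 40)],
    [("app.py", 10)],
]
percentages = line_percentages(samples)
for location in sorted(percentages):
    print(location, percentages[location])
```

Line 20 is on two of the four stacks, so it comes out at 50 percent even though it occurs three times in total; inclusive and exclusive time never need to be separated.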

You also don't need to count how many times a line of code is executed, because the number that matters for locating code you should optimize is the percent of time it's active.

Here's more explanation of these issues.
