What is the practical use of the identity function in R?

Developer https://www.devze.com 2023-03-28 20:07 Source: web

Base R defines an identity function, "a trivial identity function returning its argument" (quoting from ?identity).

It is defined as:

identity <- function (x){x}

Why would such a trivial function ever be useful? Why would it be included in base R?


Don't know about R, but in a functional language one often passes functions as arguments to other functions. In such cases, the constant function (which returns the same value for any argument) and the identity function play a similar role as 0 and 1 in multiplication, so to speak.
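To make the analogy concrete in R, here is a minimal sketch. The helpers compose2 and compose_all are made up for illustration and are not part of base R. identity acts as the neutral starting value when folding a list of functions into one, the same way 1 would when folding a list of numbers with multiplication:

```r
# Hypothetical helper: compose two functions, applying f first and then g.
compose2 <- function(f, g) {
  force(f)
  force(g)
  function(x) g(f(x))
}

# Hypothetical helper: fold a list of functions into a single function.
# identity is the initial value, so an empty list yields a do-nothing function.
compose_all <- function(fs) Reduce(compose2, fs, identity)

pipeline <- compose_all(list(sqrt, function(x) x + 1))
pipeline(9)   # sqrt(9) + 1, i.e. 4

empty <- compose_all(list())
empty(9)      # nothing to apply, so 9 comes back unchanged
```

Without identity as the starting value, the empty case would need special handling.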


I use it from time to time with the apply family of functions.

For instance, you could write t() as:

dat <- data.frame(x=runif(10),y=runif(10))
apply(dat,1,identity)

       [,1]      [,2]      [,3]      [,4]      [,5]      [,6]       [,7]
x 0.1048485 0.7213284 0.9033974 0.4699182 0.4416660 0.1052732 0.06000952
y 0.7225307 0.2683224 0.7292261 0.5131646 0.4514837 0.3788556 0.46668331
       [,8]      [,9]      [,10]
x 0.2457748 0.3833299 0.86113771
y 0.9643703 0.3890342 0.01700427
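The same idea can be checked on a small fixed matrix, which avoids the random numbers. This is only a sketch to show that applying identity over rows stacks each row as a column:

```r
m <- matrix(1:6, nrow = 2)            # 2 rows, 3 columns
transposed <- apply(m, 1, identity)   # each row of m becomes a column

dim(transposed)                       # 3 2
all(transposed == t(m))               # TRUE
```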


One use that appears on a simple code base search is as a convenience for the most basic type of error handling function in tryCatch.

tryCatch(...,error = identity)

which is identical (ha!) to

tryCatch(...,error = function(e) e)

So this handler catches the error (a condition object, not just the message text) and simply returns it.
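A small sketch of what comes back in that case, using a deliberately failing expression:

```r
# The error is caught and returned as a condition object instead of aborting.
result <- tryCatch(stop("something went wrong"), error = identity)

inherits(result, "error")   # TRUE
conditionMessage(result)    # "something went wrong"
```

The caller can then inspect the returned object, for example with inherits(), to decide what to do next.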


For whatever it's worth, it is located in funprog.R (the functional programming stuff) in the source of the base package, and it was added as a "convenience function" in 2008: I can imagine (but can't give an immediate example!) that there would be some contexts in the functional programming approach (i.e. using Filter, Reduce, Map etc.) where it would be convenient to have an identity function ...

r45063 | hornik | 2008-04-03 12:40:59 -0400 (Thu, 03 Apr 2008) | 2 lines

Add higher-order functions Find() and Position(), and convenience
function identity().
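Two contexts where such a convenience function could plausibly come in handy are sketched below. Both are illustrations rather than anything taken from the commit itself, and summarise_values is a made-up name:

```r
# 1. As the default for an optional transformation argument.
summarise_values <- function(values, transform = identity) {
  mean(transform(values))
}
summarise_values(c(1, 10, 100))          # 37
summarise_values(c(1, 10, 100), log10)   # 1

# 2. As the predicate for Position() or Filter() when the elements
#    are already logical values.
flags <- c(FALSE, FALSE, TRUE, TRUE)
Position(identity, flags)   # 3, the index of the first TRUE
Filter(identity, flags)     # TRUE TRUE
```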


Stepping away from functional programming, identity is also used in another context in R, namely statistics. Here, it is used to refer to the identity link function in generalized linear models. For more details about this, see ?family or ?glm. Here is an example:

> x <- rnorm(100)
> y <- rpois(100, exp(1+x))
> glm(y ~x, family=quasi(link=identity))

Call:  glm(formula = y ~ x, family = quasi(link = identity))

Coefficients:
(Intercept)            x
      4.835        5.842

Degrees of Freedom: 99 Total (i.e. Null);  98 Residual
Null Deviance:      6713
Residual Deviance: 2993         AIC: NA

However, in this case passing it as a string instead of a function will achieve the same: glm(y ~x, family=quasi(link="identity"))

EDIT: As noted in the comments below, the function base::identity is not what is used by the link constructor, and it is just used for parsing the link name. (Rather than deleting this answer, I'll leave it to help clarify the difference between the two.)
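The point of the EDIT can be checked directly on the family object, without fitting a model. A short sketch:

```r
fam_symbol <- quasi(link = identity)     # bare name, never called as a function
fam_string <- quasi(link = "identity")   # the same link given as a string

fam_symbol$link                                 # "identity"
identical(fam_symbol$link, fam_string$link)     # TRUE

# The link function stored in the family is its own closure,
# not base::identity.
identical(fam_symbol$linkfun, base::identity)   # FALSE
fam_symbol$linkfun(c(0.5, 2))                   # 0.5 2.0
```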


Here is a usage example (this one is Java's Function.identity(), not R):

    Map<Integer, Long> m = Stream.of(1, 1, 2, 2, 3, 3)
            .collect(Collectors.groupingBy(Function.identity(),
                    Collectors.counting()));
    System.out.println(m);
    // prints: {1=2, 2=2, 3=2}

Here we are grouping ints into an int-to-count map. Collectors.groupingBy accepts a Function; in our case we need a function that returns its argument. Note that we could use the lambda e -> e instead.


I just used it like this:

fit_model <- function(lots, of, parameters, error_silently = TRUE) {

  purrr::compose(ifelse(test = error_silently, yes = tryNA, no = identity),
                 fit_model_)(lots, of, parameters)
}

tryNA <- function(expr) {
  suppressWarnings(tryCatch(expr = expr,
                            error = function(e) NA,
                            finally = NA))
}
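For comparison, here is a self-contained sketch of the same pattern without purrr. It is not the code above: fit_model_ here is a made-up stand-in that fails on negative input, and the wrapper is applied directly to the call. Because R evaluates arguments lazily, the call to fit_model_ only runs once the wrapper touches its argument, which for tryNA happens inside tryCatch:

```r
# Hypothetical stand-in for the real fitting routine: fails on negative input.
fit_model_ <- function(value) {
  if (value < 0) stop("cannot fit a negative value")
  sqrt(value)
}

tryNA <- function(expr) {
  # expr is an unevaluated promise until tryCatch forces it here
  suppressWarnings(tryCatch(expr, error = function(e) NA))
}

fit_model <- function(value, error_silently = TRUE) {
  # identity is the "do nothing" choice: errors propagate as usual
  wrapper <- if (error_silently) tryNA else identity
  wrapper(fit_model_(value))
}

fit_model(4)    # 2
fit_model(-1)   # NA
# fit_model(-1, error_silently = FALSE) raises the error instead
```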


As this question has already been viewed 8k times, it may be worth updating even 9 years after it was written.

In a blog post called "Simple tricks for Debugging Pipes (within magrittr, base R or ggplot2)" the author points out how identity() can be very useful at the end of different kinds of pipes. The blog post with examples can be found here: https://rstats-tips.net/2021/06/06/simple-tricks-for-debugging-pipes-within-magrittr-base-r-or-ggplot2/

If pipe chains are written so that each pipe symbol is at the end of a line, you can exclude any line from execution by commenting it out, except for the last line, which has no trailing pipe symbol. If you add identity() as the last line, there will never be a need to comment that one out, so you can temporarily exclude any line that changes the data.
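A minimal sketch of the trick, assuming R 4.1 or later for the base pipe |> (the same layout works with magrittr's %>%). The sum() step is commented out and the chain still parses, because identity() absorbs the trailing pipe:

```r
values <- c(4, 9, 16)

result <- values |>
  sqrt() |>
  # sum() |>
  identity()

result   # 2 3 4, since the sum() step was skipped
```

Uncommenting the sum() line gives 9 without touching any other line.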
