Inverse of which

Developer (开发者) https://www.devze.com 2023-04-10 15:08 Source: the web
Am I missing something obvious here? It appears the inverse function of which is missing from base R (googling and even a search on SO for "R inverse which" returns a myriad of unrelated links)?

Well, not that I can't write one, but just to relieve my frustration with it being missing and as an R-muscle flexing challenge: how would you go about writing one?

What we need is a function like:

invwhich<-function(indices, totlength)

that returns a logical vector of length totlength where each element in indices is TRUE and the rest is FALSE.

There are bound to be many ways of accomplishing this (some of which are really low-hanging fruit), so argue why your solution is 'best'. One-liner, anyone?

If it takes into account some of the other parameters of which (arr.ind??), that's obviously even better...


One-liner solution:

invwhich <- function(indices, totlength) is.element(seq_len(totlength), indices)

invwhich(c(2,5), 10)
[1] FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
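A quick sanity check, as a sketch only: for sorted, duplicate-free indices, which() should undo invwhich(), and vice versa. The test vectors idx and x are made up for illustration.

```r
invwhich <- function(indices, totlength) is.element(seq_len(totlength), indices)

idx <- c(2L, 5L, 9L)            # hypothetical input: sorted, no duplicates
lgl <- invwhich(idx, 10)

# which() recovers the original indices
stopifnot(identical(which(lgl), idx))

# and invwhich() rebuilds a logical vector from its which()
x <- c(TRUE, FALSE, FALSE, TRUE)
stopifnot(identical(invwhich(which(x), length(x)), x))
```

Unsorted or duplicated indices still give a valid logical vector, but which() then returns them sorted and de-duplicated, so the round trip is only exact in one direction.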


My own solution (for now): EDIT as per @Marek's suggestion.

invwhich<-function(indices, outlength, useNames = TRUE)
{
    rv<-logical(outlength)
    #rv<-rep(FALSE, outlength) #see Marek's comment
    if(length(indices) > 0)
    {
        rv[indices]<-TRUE
        if(useNames) names(rv)[indices]<-names(indices)
    }
    return(rv)
}

It performs very well (apparently better than @Andrie's one-liner) and, as far as possible, accounts for useNames. But is it possible to make this into a one-liner?

With regard to performance, I simply use:

someindices<-sample(1000000, 500000, replace=FALSE)
system.time(replicate(100, tmp<-invwhich(someindices, 1000000)))

as a very lo-fi performance measurement.
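To compare two of the approaches from this thread side by side, something like the following could be used. It is only a rough sketch: the function names invwhich_match and invwhich_assign are made up here so both versions can coexist, the vector is smaller than the one above to keep the run short, and the timings will vary by machine.

```r
# Two approaches from this thread under hypothetical names
invwhich_match  <- function(indices, totlength) is.element(seq_len(totlength), indices)
invwhich_assign <- function(indices, totlength) {
    rv <- logical(totlength)
    rv[indices] <- TRUE
    rv
}

set.seed(1)                      # arbitrary seed, only for repeatability
n <- 1e5                         # assumption: smaller than 1e6 to keep it quick
someindices <- sample(n, n / 2, replace = FALSE)

# Both must agree before a timing comparison means anything
stopifnot(identical(invwhich_match(someindices, n),
                    invwhich_assign(someindices, n)))

t_match  <- system.time(for (i in 1:20) invwhich_match(someindices, n))
t_assign <- system.time(for (i in 1:20) invwhich_assign(someindices, n))

# Elapsed seconds for 20 calls of each version
print(c(match = t_match[["elapsed"]], assign = t_assign[["elapsed"]]))
```

Checking that the results are identical first guards against timing a version that is fast only because it is wrong.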


Another variant, to make a oneliner:

lWhich <- function(indices, totlength, vec = vector(length = totlength)){vec[indices] <- TRUE; return(vec)}

I'd prefer different names, for brevity:

lWhich <- function(ix, len, vec = vector(length = len)){vec[ix] <- TRUE; return(vec)}

Or, using the bit package:

lWhichBit <- function(ix, len){return(as.logical(bitwhich(len, x = ix, poslength = length(ix))))}

Surprisingly, that seems slow. It turns out that the code uses rep in some places. :(

This is a job for Rcpp or compile! :)


Overkill version working with all kinds of indices:

#' Logical which
#' 
#' Inverse of \link[base]{which}.
#' Converts an array of any indices to a logical index array.
#' 
#' Either \code{nms} or \code{len} has to be specified.
#' 
#' @param idx       Numeric or character indices.
#' @param nms       Array of names or a sequence.
#'                  Required if \code{idx} is a character array
#' @param len       Length of output array.
#'                  Alternative to \code{nms} if \code{idx} is numeric
#' @param useNames  Use the names of nms or idx
#' 
#' @examples
#' all(lWhich(2, len = 3) == c(F, T, F))
#' all(lWhich(c('a', 'c'), letters[1:3]) == c(T, F, T))
#' 
#' @export
lWhich <- function(idx, nms = seq_len(len), len = length(nms), useNames = TRUE) {
    rv <- logical(len)
    if (is.character(nms)) # we need names here so that rv[idx] works
        names(rv) <- nms

    if (useNames && !is.null(names(idx)))
        names(rv)[idx] <- names(idx)

    rv[idx] <- TRUE

    if (!useNames) # if we don’t want names, we’ll remove them again
        names(rv) <- NULL
    rv
}
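The question also asks about arr.ind, which none of the versions above handle. A minimal sketch of that case, assuming the indices come as the integer matrix that which(x, arr.ind = TRUE) returns; invwhich_arr and its parameter names are hypothetical, not part of any package.

```r
# Hypothetical helper: inverse of which(x, arr.ind = TRUE)
# ind  : integer matrix, one row per TRUE cell, one column per dimension
# dims : dim() of the array to rebuild
invwhich_arr <- function(ind, dims) {
    rv <- array(FALSE, dim = dims)
    rv[ind] <- TRUE     # matrix indexing: each row of ind addresses one cell
    rv
}

m   <- matrix(c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE), nrow = 2)
ind <- which(m, arr.ind = TRUE)

# Round trip: rebuilding from the array indices gives back the original matrix
stopifnot(identical(invwhich_arr(ind, dim(m)), m))
```

This relies on base R's matrix indexing, where a numeric matrix with one column per dimension selects individual cells. Dimnames of the original array are not restored.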


setdiff(1:total_length, indices)


Just use which() but match the condition to FALSE:

# positions in `subset` whose values do not occur in `set`
which(!(subset %in% set))
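Neither of the last two snippets defines its inputs, so here is a small worked example with made-up data. Note that both return the complement as indices rather than as a logical vector, so they answer a slightly different question from invwhich.

```r
# Made-up data for illustration
set    <- c(3, 7, 9)
subset <- c(7, 4, 3, 8)

# Positions in `subset` whose values are absent from `set`
missing_pos <- which(!(subset %in% set))
print(missing_pos)               # 2 4

# Complement of a set of indices within 1..n, two equivalent ways
n       <- 10
indices <- c(2, 5)
stopifnot(identical(which(!(seq_len(n) %in% indices)),
                    setdiff(seq_len(n), indices)))
```

seq_len(n) is used instead of 1:n so that n = 0 yields an empty sequence rather than c(1, 0).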