Let's say I have a data.frame like:
x <- rep(1:10, times = 10)
df <- data.frame(x=x,y=rnorm(100))
and I want to label the values of y that fall above the 80th percentile within each value of x (1:10), sorted in descending order. I can get the quantiles and order the data without issue, like this:
library(plyr)
df <- ddply(df, .(x), subset, y > quantile(y, 0.8))
df <- df[with(df, order(x, -y)), ]
Now, how can I get ddply to add a new column of labels (1, 2, 3, ..., n) to the data.frame for each sorted subset? I can do this now with a for loop by counting nrow(df["x"]), but that seems to lack any sense of elegance.
Note: This question is a build up from and related to: Creating multiple subsets all in one data.frame (possibly with ddply)
df <- ddply(df, "x", transform, id = rank(y))
Or, if the rows are already sorted within each value of x:
df <- ddply(df, "x", transform, id = seq_along(y))
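For comparison, here is a minimal base R sketch of the same idea that does not need plyr. It assumes a small hand-made data frame (the values are hypothetical, not taken from the question) and uses ave() to apply a function within each group of x. Note that rank() numbers from the smallest value upward, so ranking -y is one way to give the largest y the label 1.

```r
# Small hand-made data frame (hypothetical values, for illustration only)
df <- data.frame(x = c(1, 1, 1, 2, 2),
                 y = c(0.5, 2.0, 1.2, 3.1, 0.7))

# Sort by x, then by descending y, as in the question
df <- df[order(df$x, -df$y), ]

# Number the rows within each group of x: 1, 2, ..., n (relies on the sort above)
df$id <- ave(seq_along(df$x), df$x, FUN = seq_along)

# Alternative that does not rely on row order: rank by descending y within each group
df$id_rank <- ave(-df$y, df$x, FUN = rank)

print(df)
```

Both columns agree here because the data frame was sorted first and y has no ties; with ties, rank() returns averaged ranks by default.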
Maybe this function produces what you want:
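subno <- function(df, vars, offset=1) {
    # Build one grouping key per row from the columns named in vars
    id <- do.call("paste", df[,vars, drop=FALSE])
    nr <- seq_along(id)
    # Keep the row number at the first row of each group, 0 elsewhere
    grpnr <- nr
    grpnr[c(FALSE, id[-1] == id[-length(id)])] <- 0
    # cummax() carries each group's starting row forward across the group
    subnr <- nr - cummax(grpnr) + offset
    return(subnr)
}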
df$label <- subno(df, c('x'))
This function expects a data frame that is already sorted, so that the rows of each group are contiguous; vars contains the names of the variables on which to group.
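To see how the function arrives at its labels, it may help to trace the intermediate vectors on a tiny input. The sketch below repeats the same steps outside the function on a hypothetical, already sorted key vector; the names is_repeat, group_start and label are introduced here only for illustration.

```r
# Hypothetical grouping keys, already sorted so that groups are contiguous
id <- c("a", "a", "b", "b", "b")

nr <- seq_along(id)                               # 1 2 3 4 5

# TRUE where a row has the same key as the row before it
is_repeat <- c(FALSE, id[-1] == id[-length(id)])  # FALSE TRUE FALSE TRUE TRUE

# Row number at the first row of each group, 0 elsewhere
grpnr <- nr
grpnr[is_repeat] <- 0                             # 1 0 3 0 0

# Running maximum gives the starting row of the current group
group_start <- cummax(grpnr)                      # 1 1 3 3 3

# Position within the group, starting at 1
label <- nr - group_start + 1                     # 1 2 1 2 3
print(label)
```

If the keys were not contiguous (for example "a", "b", "a"), each run would be numbered separately, which is why the input has to be sorted first.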
 
         
                                         
                                         
                                         