Is there an equivalent of the Unix "comm" command in R?

Developer https://www.devze.com 2023-02-15 19:47 Source: Internet

I have one master file with a list of unique IDs and want to align three files, each containing a subset of those IDs, alongside it, ending up with: Column 1 (id1, id2, id3, id4, etc.), Column 2 (space, id2, space, space), Column 3 (id1, id2, space, space), Column 4 (id1, space, id3, space), etc. I have a unique list in R, and the "comm" command in Unix seems to do this. Is there an equivalent in R?


The structure of your data is not very clear, but suppose you start with the following vectors:

R> master <- paste("id",1:10,sep="")
R> sub1 <- paste("id",c(2,3,5),sep="")
R> sub2 <- paste("id",c(1,4,8,9),sep="")
R> master
[1] "id1"  "id2"  "id3"  "id4"  "id5"  "id6"  "id7"  "id8"  "id9"  "id10"
R> sub1
[1] "id2" "id3" "id5"
R> sub2
[1] "id1" "id4" "id8" "id9"

You can create a data frame from your master list of ids, and use these ids as row names:

R> df <- data.frame(master=master, row.names=master)
R> df
     master
id1     id1
id2     id2
id3     id3
id4     id4
id5     id5
id6     id6
id7     id7
id8     id8
id9     id9
id10   id10
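
Using the ids as row names matters because a character vector used as a row index selects rows by name rather than by position. A minimal sketch of that behaviour, rebuilding the same data frame so it runs on its own (the variable `picked` is only illustrative, and `stringsAsFactors = FALSE` is added so the column is plain character on older R versions too):

```r
# Rebuild the master data frame so this snippet is self-contained
master <- paste("id", 1:10, sep = "")
df <- data.frame(master = master, row.names = master, stringsAsFactors = FALSE)

# A character index selects rows by row name, not by position
picked <- df[c("id2", "id5"), "master"]
print(picked)
```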

Then you can add a new column for each subset as follows, indexing rows by name so that ids absent from a subset are left as NA:

R> df[sub1, "sub1"] <- sub1
R> df[sub2, "sub2"] <- sub2

With the following result:

R> df
     master sub1 sub2
id1     id1 <NA>  id1
id2     id2  id2 <NA>
id3     id3  id3 <NA>
id4     id4 <NA>  id4
id5     id5  id5 <NA>
id6     id6 <NA> <NA>
id7     id7 <NA> <NA>
id8     id8 <NA>  id8
id9     id9 <NA>  id9
id10   id10 <NA> <NA>
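
If blank cells are wanted instead of `NA`, as in the layout described in the question, or if there are more than two subset files, the same alignment can probably be sketched with `%in%` and `ifelse()`. This is an illustrative alternative rather than part of the approach above; the helper name `align_ids` and its `fill` argument are hypothetical.

```r
# Hypothetical helper: align any number of id subsets against a master list.
# Ids missing from a subset are shown as `fill` (an empty string by default).
align_ids <- function(master, subsets, fill = "") {
  # One column per subset: keep the id where it is present, else use `fill`
  cols <- lapply(subsets, function(s) ifelse(master %in% s, master, fill))
  data.frame(master = master, cols, stringsAsFactors = FALSE)
}

master <- paste("id", 1:10, sep = "")
sub1 <- paste("id", c(2, 3, 5), sep = "")
sub2 <- paste("id", c(1, 4, 8, 9), sep = "")

# The names of the list become the column names
aligned <- align_ids(master, list(sub1 = sub1, sub2 = sub2))
print(aligned)
```

For a comparison of just two vectors in the spirit of comm's three columns, base R's `setdiff()` and `intersect()` give the corresponding partitions (only in the first, only in the second, in both).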