
Subset a dataframe based on other vector

Developer https://www.devze.com 2023-04-03 17:06 Source: web
Here is my question: the first data frame is produced inside a function, and I want to use it to subset a larger second data frame.


# dataframe1 
loc <- c(paste('Loc', 1:9, sep = ''))
qit <- c(13, 27, 16, 14, 15, 21, 12, 11, 8)

mydf <- data.frame(loc, qit)
   loc qit
1 Loc1  13
2 Loc2  27
3 Loc3  16
4 Loc4  14
5 Loc5  15
6 Loc6  21
7 Loc7  12
8 Loc8  11
9 Loc9   8

# dataframe 2
loc <- c(paste('Loc', 1:9, sep = ''))
vloc <- c(rep(loc, each = 2))
allele <- c(
  13, 12, 27, 20, 16, 18,
  14, 17, 15, 22, 21, 26,
  12, 14, 11, 18,  8, 24
)
afreq <- c(0.308, 0.4, 0.041, 0.5, 0.125, 0.5,
           0.139, 0.2, 0.219, 0.2, 0.176, 0.33,
           0.358, 0.4, 0.274, 0.5, 0.173, 0.15)
loctab <- data.frame(vloc, allele, afreq)

   vloc allele afreq
1  Loc1     13 0.308
2  Loc1     12 0.400
3  Loc2     27 0.041
4  Loc2     20 0.500
5  Loc3     16 0.125
6  Loc3     18 0.500
7  Loc4     14 0.139
8  Loc4     17 0.200
9  Loc5     15 0.219
10 Loc5     22 0.200
11 Loc6     21 0.176
12 Loc6     26 0.330
13 Loc7     12 0.358
14 Loc7     14 0.400
15 Loc8     11 0.274
16 Loc8     18 0.500
17 Loc9      8 0.173
18 Loc9     24 0.150

I want to make a new data frame like mydf, with the afreq variable from dataframe 2 added. I tried to subset it:

loctab[loctab$allele %in%  mydf$qit, ]

   vloc allele afreq
1  Loc1     13 0.308
2  Loc1     12 0.400
3  Loc2     27 0.041
5  Loc3     16 0.125
7  Loc4     14 0.139
9  Loc5     15 0.219
11 Loc6     21 0.176
13 Loc7     12 0.358
14 Loc7     14 0.400
15 Loc8     11 0.274
17 Loc9      8 0.173 

I did not get what I want. The subset does not take the vloc or loc variable into account: a row is kept whenever its allele matches any value in qit, regardless of locus. Is there any way to subset with reference to loc or vloc?


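Maybe the merge() function is what you're looking for. Merge on both the locus and the allele value, so that a qit value only matches an allele at the same locus:

mydf2 <- merge(mydf, loctab, by.x = c("loc", "qit"), by.y = c("vloc", "allele"))

You end up with 3 columns (loc, qit and afreq), because the key columns take their names from mydf. Merging on qit and allele alone would repeat the problem in the question: 12 and 14 would each match at two loci.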
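
As a minimal sketch of matching on locus and allele together, the snippet below rebuilds the two data frames from the question and joins them on the (loc, qit) / (vloc, allele) pair. The names res, key_mydf and key_loctab are illustrative and not from the original post. The second half shows one possible alternative, a match() lookup on a pasted key, for the case where mydf's row order should be kept.

```r
# Rebuild the two data frames from the question.
loc <- paste0("Loc", 1:9)
qit <- c(13, 27, 16, 14, 15, 21, 12, 11, 8)
mydf <- data.frame(loc, qit)

vloc <- rep(loc, each = 2)
allele <- c(13, 12, 27, 20, 16, 18,
            14, 17, 15, 22, 21, 26,
            12, 14, 11, 18,  8, 24)
afreq <- c(0.308, 0.4, 0.041, 0.5, 0.125, 0.5,
           0.139, 0.2, 0.219, 0.2, 0.176, 0.33,
           0.358, 0.4, 0.274, 0.5, 0.173, 0.15)
loctab <- data.frame(vloc, allele, afreq)

# Join on the (locus, allele) pair, so allele 12 at Loc1 no longer
# matches qit 12 at Loc7. Key columns keep the names used in mydf.
res <- merge(mydf, loctab,
             by.x = c("loc", "qit"),
             by.y = c("vloc", "allele"))

# Alternative (illustrative): build a combined key and look it up with
# match(), which adds afreq to mydf without reordering its rows.
key_mydf   <- paste(mydf$loc, mydf$qit)
key_loctab <- paste(loctab$vloc, loctab$allele)
mydf$afreq <- loctab$afreq[match(key_mydf, key_loctab)]
```

With the data above, both approaches give one row per locus. merge() sorts the result by the key columns by default, while the match() version leaves mydf in its original order and yields NA for any (loc, qit) pair that is missing from loctab.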
