
Remove rows in dataframe with factor ""

Developer · https://www.devze.com · 2023-03-28 11:01 · Source: web

I have a data frame like X, where the column genes is a factor. I want to remove all the rows where genes is empty; in table X below, that means removing row 4. Is there a way to do this for a large data frame?

X 
names   values   genes
1 A  0.2876113  EEF1A1 
2 B  0.6681894   GAPDH
3 C  0.1375420 SLC35E2
4 D -1.9063386        
5 E -0.4949905   RPS28

Desired result:

X 
names   values   genes
1 A  0.2876113  EEF1A1 
2 B  0.6681894   GAPDH
3 C  0.1375420 SLC35E2
5 E -0.4949905   RPS28

Thank you all!


It's not completely obvious from your question what the empty values are, but you should be able to adapt the solution below (here I assume the 'empty' values are empty strings):

toBeRemoved<-which(X$genes=="")
X<-X[-toBeRemoved,]
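Since the question leaves open what "empty" means, here is a minimal sketch that rebuilds the example table and treats empty strings, whitespace-only strings, and NA all as empty. The helper names genes_chr, is_empty, and X_clean are made up for illustration, and the broader definition of "empty" is an assumption, not something the question confirms.

```r
# Rebuild the example table; the values are copied from the question.
X <- data.frame(
  names  = c("A", "B", "C", "D", "E"),
  values = c(0.2876113, 0.6681894, 0.1375420, -1.9063386, -0.4949905),
  genes  = factor(c("EEF1A1", "GAPDH", "SLC35E2", "", "RPS28"))
)

# Assumption: "empty" may also mean whitespace-only or NA, not just "".
genes_chr <- trimws(as.character(X$genes))
is_empty  <- is.na(genes_chr) | genes_chr == ""

# Keep only the rows whose genes value is not empty.
X_clean <- X[!is_empty, ]
print(X_clean)
```

Because this uses a logical mask rather than negative indices, it also behaves sensibly when no row is empty.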


@Nick Sabbe provided a great answer, but it has one caveat:

Using -which(...) is a neat trick to (sometimes) speed up the subsetting operation when there are only a few elements to remove.

...But if there are no elements to remove, it fails!

If X$genes does not contain any empty strings, which() returns an empty integer vector. Negating an empty vector still gives an empty vector, and X[integer(0), ] returns a data.frame with zero rows, so every row is dropped!
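The failure is easy to reproduce on a small table with no empty genes. This is only an illustrative sketch; the two-row X and the name bad are made up for the demonstration.

```r
# A small table in which no genes value is empty (illustrative data).
X <- data.frame(
  names = c("A", "B"),
  genes = factor(c("EEF1A1", "GAPDH"))
)

# Nothing matches, so which() returns integer(0).
toBeRemoved <- which(X$genes == "")

# Negating an empty vector leaves it empty ...
stopifnot(identical(-toBeRemoved, integer(0)))

# ... so this selects zero rows instead of keeping all of them.
bad <- X[-toBeRemoved, ]
print(nrow(X))    # rows before
print(nrow(bad))  # rows after the unguarded removal
```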

toBeRemoved <- which(X$genes == "")
if (length(toBeRemoved) > 0) { # MUST check for 0-length
    X <- X[-toBeRemoved, ]
}

Or, if the speed gain isn't important, simply:

X<-X[X$genes!="",]

Or, as @nullglob pointed out,

subset(X, genes != "")
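The two shorter forms differ when the column contains NA, and none of the approaches drops the now-unused "" factor level. The sketch below illustrates both points; the NA in the sample data is an assumption added for the demonstration (the question's table has none), and the names by_index, by_subset, and by_safe are made up.

```r
# Illustrative data: one empty string and one NA (the NA is an assumption).
X <- data.frame(
  names = c("A", "B", "C", "D"),
  genes = factor(c("EEF1A1", "", NA, "RPS28"))
)

# Logical indexing: the NA comparison produces a row filled with NA.
by_index <- X[X$genes != "", ]

# subset() treats NA in the condition as FALSE, so that row is dropped.
by_subset <- subset(X, genes != "")

# Explicit NA handling gives the same rows as subset().
by_safe <- X[!is.na(X$genes) & X$genes != "", ]

# The "" level is still attached to the factor; droplevels() removes it.
by_safe$genes <- droplevels(by_safe$genes)
print(levels(by_safe$genes))
```

Whether to call droplevels() depends on later use: summaries such as table() otherwise keep reporting the empty level with a count of zero.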