dataframe - R: remove rows from a data frame that contain a duplicate of either combination of 2 columns
I am trying to remove rows from a data frame that contain either combination of 2 columns. For example, the following code:
    vct <- c("a", "b", "c")
    a <- b <- vct
    combo <- expand.grid(a, b)                 # generate all possible combinations
    combo <- combo[!combo[,1] == combo[,2],]   # remove rows where both columns match
generates this data frame:
      Var1 Var2
    2    b    a
    3    c    a
    4    a    b
    6    c    b
    7    a    c
    8    b    c
How can I remove rows that are duplicates of either combination of the 2 columns, i.e. #4 a b is removed because #2 b a is already present? The resulting data frame would look like this:
      Var1 Var2
    2    b    a
    3    c    a
    4    c    b
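We can sort by row using apply with MARGIN=1, transpose (t) the output, use duplicated to get the logical index of duplicate rows, negate (!) to get the rows that are not duplicated, and subset the dataset. Sorting puts both orderings of a pair, such as (a, b) and (b, a), into the same form, so duplicated can detect them.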
    combo[!duplicated(t(apply(combo, 1, sort))),]
    #  Var1 Var2
    #2    b    a
    #3    c    a
    #6    c    b
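To make the one-liner easier to follow, here is a sketch that splits it into the steps described above. The intermediate names (sorted_cols, sorted_rows, is_dup, result) are only illustrative and are not part of the original answer; the data frame is rebuilt from the question's code.

```r
# Rebuild the data frame from the question.
vct <- c("a", "b", "c")
a <- b <- vct
combo <- expand.grid(a, b)
combo <- combo[!combo[, 1] == combo[, 2], ]

# Step 1: sort the values within each row.
# apply() returns the results column-wise: one column per input row.
sorted_cols <- apply(combo, 1, sort)

# Step 2: transpose so that each row is again one (now sorted) pair.
sorted_rows <- t(sorted_cols)

# Step 3: flag every pair that already appeared in an earlier row.
is_dup <- duplicated(sorted_rows)

# Step 4: keep only the first occurrence of each unordered pair.
result <- combo[!is_dup, ]
print(result)
```

Because duplicated marks only the later occurrences, the first row seen for each pair is the one that survives, which is why rows 2, 3 and 6 are kept.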