python - Fast method for removing duplicate columns in pandas.DataFrame


So by using

df_ab = pd.concat([df_a, df_b], axis=1, join='inner') 

I get a dataframe that looks like this:

   a    a    b    b
0  5    5   10   10
1  6    6   19   19

and I want to remove the duplicate columns:

   a    b
0  5   10
1  6   19

Because df_a and df_b are subsets of the same dataframe, I know that all rows have the same values if the column name is the same. I have a working solution:

df_ab = df_ab.T.drop_duplicates().T

but I have a large number of rows, so this one is slow. Does anyone have a faster solution? I would prefer a solution where explicit knowledge of the column names isn't needed.

You may use np.unique to get the indices of the unique columns, and then use .iloc:

>>> df
   a  a   b   b
0  5  5  10  10
1  6  6  19  19
>>> _, i = np.unique(df.columns, return_index=True)
>>> df.iloc[:, i]
   a   b
0  5  10
1  6  19
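For anyone who wants to run this end to end, here is a minimal sketch of the same idea wrapped in a function. The helper name `drop_duplicate_columns` and the column names `a` and `b` are assumptions for illustration only. Note that `np.unique` returns the unique names in sorted order, so the sketch sorts the returned positions to keep the original column order. It also shows `df.columns.duplicated()` as an alternative that should give the same result; neither approach compares the column values, so both rely on same-named columns holding identical data, as stated in the question.

```python
import numpy as np
import pandas as pd


def drop_duplicate_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the first occurrence of each column name (hypothetical helper)."""
    # np.unique gives the sorted unique names and the position of each first occurrence.
    _, first_idx = np.unique(df.columns, return_index=True)
    # Sorting the positions restores the original left-to-right column order.
    return df.iloc[:, np.sort(first_idx)]


# Example frame with duplicated column names (names assumed for illustration).
df_ab = pd.DataFrame(
    [[5, 5, 10, 10], [6, 6, 19, 19]],
    columns=["a", "a", "b", "b"],
)

result = drop_duplicate_columns(df_ab)

# Alternative: a boolean mask that is True for the first occurrence of each name.
alt = df_ab.loc[:, ~df_ab.columns.duplicated()]

print(result)
```

Both variants look only at the column labels, which is why they avoid the cost of transposing the frame and comparing every row.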
