Dropping columns/variables based on count of missing in Stata
I have a large dataset that looks like the one below. I would like to drop variables (not observations/rows) that have fewer than 3 observations in the rows. In this case, variable x1 needs to be dropped.
I apologise if I am asking something obvious; however, at this point I do not have a clue how to proceed with this.
+-----+-----+-----+-----+-----+
| id  | x1  | x2  | x3  | x4  |
+-----+-----+-----+-----+-----+
|  1  |  .  |  1  |  1  |  2  |
|  2  |  .  |  2  |  2  |  3  |
|  3  |  .  |  3  |  1  |  .  |
|  4  |  1  |  .  |  3  |  1  |
|  5  |  .  |  2  |  4  |  3  |
|  6  |  2  |  3  |  .  |  .  |
|total|  2  |  5  |  5  |  4  |
+-----+-----+-----+-----+-----+
My interpretation is that you want to drop variables that have at least 3 missing values.
You can use nmissing, from SSC (ssc install nmissing):
    clear
    set more off

    input ///
    x y z
    . . 5
    . 6 8
    4 . 9
    . . 1
    5 . .
    end

    list

    nmissing, min(3)
    drop `r(varlist)'
If my interpretation is incorrect, check help nmissing, and npresent. The syntax is flexible enough.
Edit

A re-interpretation. You want to drop variables that don't have at least 3 non-missing values:
    clear
    set more off

    input ///
    id x1 x2 x3 x4
    1 . 1 1 2
    2 . 2 2 3
    3 . 3 1 .
    4 1 . 3 1
    5 . 2 4 3
    6 2 3 . .
    end

    list, sep(0)

    npresent, min(3)
    keep `r(varlist)'

    describe
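If you would rather not install anything from SSC, the same re-interpretation can probably be handled with built-in commands only. The following is a minimal sketch, not the nmissing/npresent approach: it assumes the variables to check are x1 through x4 as in the example data, that the threshold of 3 non-missing values is the one you want, and it uses an arbitrary loop macro name v.

```stata
* Sketch: drop every variable in x1-x4 with fewer than 3 non-missing values.
* Assumes the example data with variables id x1 x2 x3 x4 are in memory.
foreach v of varlist x1-x4 {
    * count observations where this variable is not missing
    quietly count if !missing(`v')
    * r(N) holds the count just computed
    if r(N) < 3 {
        drop `v'
    }
}

describe
```

With the example data, x1 has only 2 non-missing values, so it is the variable that gets dropped. id is left out of the varlist on purpose so that it is never a candidate for dropping.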