How to replace a column in R with a modified column that depends on filtered values? (removing outliers in panel data)

Issue

I have a panel dataset that goes like this

year  id  treatment_year  time_to_treatment  outcome
2000  1   2011            -11                2
2002  1   2011            -10                3
2004  2   2015            -9                 22

and so on. I am trying to deal with outliers by winsorizing them. The end goal is to make a scatterplot with time_to_treatment on the X axis and outcome on the Y axis.

For each value of time_to_treatment, I would like to replace the outcomes with their winsorized values, i.e. replace all values below the 5% quantile with the 5% quantile and all values above the 95% quantile with the 95% quantile.
This is what I have tried so far, but it doesn’t work.

for(i in range(dataset$time_to_treatment)){
    dplyr::filter(dataset, time_to_treatment == i)$outcome <-  DescTools::Winsorize(dplyr::filter(dataset,time_to_treatment==i)$outcome)
}

I get the error – Error in filter(dataset, time_to_treatment == i) <- *vtmp* :
could not find function "filter<-"

Would anyone be able to suggest a better way?
Thanks.


My actual data,
where: conflicts = outcome, commission = year of treatment, CD_MUN = id.

The relevant time period indicator is time_to_t.

Groups: year, CD_MUN, type [6]

type   CD_MUN   year  time_to_t  conflicts  commission
chr    dbl      dbl   dbl        int        dbl
manif  1100023  2000  -11        1          2011
manif  1100189  2000  -3         2          2003
manif  1100205  2000  -9         5          2009
manif  1500602  2000  -4         1          2004
manif  3111002  2000  -11        2          2011
manif  3147006  2000  -10        1          2010

Solution

For a start you may use this:

# The data
set.seed(123)
df <- data.frame(
  time_to_treatment = seq(-15, 0, 1),
  outcome = sample(1:30, 16, replace = TRUE)
)

# A solution without Winsorize, based solely on dplyr.
# Note: the quantiles here are computed over the whole outcome column.
library(dplyr)
df %>% 
  mutate(outcome05 = quantile(outcome, probs = 0.05), # 5% quantile
         outcome95 = quantile(outcome, probs = 0.95), # 95% quantile
         outcome = ifelse(outcome <= outcome05, outcome05, outcome), # raise low values to the 5% quantile
         outcome = ifelse(outcome >= outcome95, outcome95, outcome)) %>% # lower high values to the 95% quantile
  select(-c(outcome05, outcome95)) # drop the helper columns
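
Since the question asks for winsorizing within each value of time_to_treatment, one possible adaptation is to compute the quantiles per group rather than over the whole column. The following is only a sketch under assumptions: the example data (dataset, with 50 ids per period and Poisson outcomes) and the helper column names lower and upper are made up for illustration and are not taken from the question.

```r
library(dplyr)

# Hypothetical example data: 50 ids observed at each of three periods
set.seed(123)
dataset <- data.frame(
  id = rep(1:50, times = 3),
  time_to_treatment = rep(c(-2, -1, 0), each = 50),
  outcome = rpois(150, lambda = 5)
)

winsorized <- dataset %>%
  group_by(time_to_treatment) %>%                    # quantiles are computed per period
  mutate(
    lower = quantile(outcome, probs = 0.05, na.rm = TRUE),
    upper = quantile(outcome, probs = 0.95, na.rm = TRUE),
    outcome = pmin(pmax(outcome, lower), upper)      # clamp to [lower, upper]
  ) %>%
  ungroup() %>%
  select(-lower, -upper)                             # drop the helper columns
```

With the real data, time_to_t and conflicts would take the place of time_to_treatment and outcome. The existing grouping by year, CD_MUN and type should not interfere, because group_by() replaces existing groups by default.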

You may adapt this to your exact problem.
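
As a side note on why the loop in the question fails: range() returns only the minimum and maximum of a vector, so the loop would visit just two values, and R has no replacement function `filter<-`, so the result of dplyr::filter() cannot be assigned into. A minimal base R sketch of a working loop follows. The helper winsorize_5_95 is a hypothetical name used here in place of DescTools::Winsorize, and the example data are made up.

```r
# Hypothetical example data
set.seed(123)
dataset <- data.frame(
  time_to_treatment = rep(c(-2, -1, 0), each = 50),
  outcome = rpois(150, lambda = 5)
)

# Hypothetical helper: clamp x to its own 5% and 95% quantiles
winsorize_5_95 <- function(x) {
  bounds <- unname(quantile(x, probs = c(0.05, 0.95), na.rm = TRUE))
  pmin(pmax(x, bounds[1]), bounds[2])
}

# unique() visits every period; range() would give only c(min, max)
for (i in unique(dataset$time_to_treatment)) {
  rows <- dataset$time_to_treatment == i             # logical index for this period
  dataset$outcome[rows] <- winsorize_5_95(dataset$outcome[rows])
}
```

The same result should also be obtainable without an explicit loop via ave(dataset$outcome, dataset$time_to_treatment, FUN = winsorize_5_95).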

Answered By – timm

This Answer collected from stackoverflow, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0
