How to copy a character string in a dataframe row to all subsequent rows sharing the same ID?

Issue

Suppose we start with the data frame df shown below:

  ID Flag
1  1 NULL
2  1 NULL
3  1  FRY
4  1  CRY
5  1 NULL
6  5  CRY
7  5 NULL
8  5 NULL

ID <- c(1, 1, 1, 1, 1, 5, 5, 5)
Flag <- c("NULL", "NULL", "FRY", "CRY", "NULL", "CRY", "NULL", "NULL")
df <- data.frame(ID, Flag)
df

I would like to change the "Flag" column so that, within each ID, the first Flag value that is not "NULL" is copied down to all remaining rows for that same ID, overriding any later values. Rows before that first non-"NULL" value stay "NULL". So we would end up with the following data frame:

  ID Flag  [Explain]
1  1 NULL   
2  1 NULL
3  1  FRY   First row for ID 1 where Flag <> NULL, so apply row 3 FRY to all subsequent rows for ID 1
4  1  FRY   Override original row 4 CRY since FRY came first
5  1  FRY   FRY rules for all remaining ID = 1 rows
6  5  CRY   First row for ID 5 where Flag <> NULL, so apply row 6 CRY to all subsequent rows for ID 5
7  5  CRY   CRY rules for all remaining ID = 5 rows
8  5  CRY

How can this be accomplished using dplyr? I've been fiddling with group_by(), fill(), and coalesce(), but keep stumbling.

Solution

Using tidyr::fill() and some additional data wrangling, you could do:

library(dplyr)
library(tidyr)

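df %>%
  group_by(ID) %>%
  # Non-"NULL" flags become the group's first non-"NULL" flag; "NULL" becomes NA
  mutate(Flag = ifelse(Flag != "NULL", first(Flag[Flag != "NULL"]), NA_character_)) %>%
  # Carry the last non-NA value down within each ID
  fill(Flag) %>%
  # Rows before the first flag are still NA; restore them to "NULL"
  replace_na(list(Flag = "NULL")) %>%
  ungroup()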
#> # A tibble: 8 × 2
#>      ID Flag 
#>   <dbl> <chr>
#> 1     1 NULL 
#> 2     1 NULL 
#> 3     1 FRY  
#> 4     1 FRY  
#> 5     1 FRY  
#> 6     5 CRY  
#> 7     5 CRY  
#> 8     5 CRY
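
For comparison, here is a minimal sketch of a dplyr-only variant that skips the NA round trip. It is not part of the answer above. It assumes, as in the question, that missing flags are stored as the literal string "NULL", and it uses dplyr::cumany() to mark every row at or after the first non-"NULL" flag within each ID. The name result is just an illustrative choice.

```r
library(dplyr)

# Sample data from the question; "NULL" is a literal string, not a real NULL/NA
df <- data.frame(
  ID   = c(1, 1, 1, 1, 1, 5, 5, 5),
  Flag = c("NULL", "NULL", "FRY", "CRY", "NULL", "CRY", "NULL", "NULL"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ID) %>%
  mutate(
    Flag = if_else(
      cumany(Flag != "NULL"),        # TRUE from the first non-"NULL" row onward
      first(Flag[Flag != "NULL"]),   # the group's first non-"NULL" flag
      "NULL"                         # rows before it keep the placeholder
    )
  ) %>%
  ungroup()

result
```

On the sample data this should give the same Flag column as the fill() approach. If an ID has no non-"NULL" flag at all, cumany() is FALSE for every row, so the whole group stays "NULL".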

Answered By – stefan

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
