Identifying rows with uniform sequence while ignoring missing data in R

I'm working with panel data where the same variable is recorded multiple times to create a sequence of states. I only want to use observations that do not have uniform sequences but I am struggling to create a flag that would identify these while also not considering NAs as a different state.

I've created an example dataset to make things simple:

ID <- c(1,2,3,4,5,6,7,8,9,10)

S1 <- c("Education", "Employment", "Education", "Education", "Education", "Education", "Education", "Education", "Education", "Education")

S2 <- c("Education", "Employment", "Education", "Unemployed", "Education", "Education", "Employment", "Education", "Education", "Education")

S3 <- c("Education", "Employment", "NA", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education")

S4 <- c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education")

S5 <- c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education")

df <- data.frame(ID, S1, S2, S3, S4, S5)

df



   ID         S1         S2         S3         S4         S5

1   1  Education  Education  Education  Education  Education

2   2 Employment Employment Employment Employment Employment

3   3  Education  Education         NA  Education  Education

4   4  Education Unemployed Unemployed Unemployed Unemployed

5   5  Education  Education  Education  Education  Education

6   6  Education  Education Employment Employment Employment

7   7  Education Employment Employment Employment Employment

8   8  Education  Education         NA         NA         NA

9   9  Education  Education  Education  Education  Education

10 10  Education  Education  Education  Education  Education

I'd ideally be able to flag or keep only observations ID=c("4", "6", "7").

I tried a couple of approaches:

I tried counting the consecutive states but that doesn't account for the separate IDs

library(data.table)



setDT(df_long)

df_long[, employed := (S=="Employment")

   ][, e.length := with(rle(employed), rep(lengths,lengths))

     ][employed == 0, e.length := 0]



df_long[, education := (S=="Education")

        ][, edu.length := with(rle(education), rep(lengths,lengths))

          ][education == 0, edu.length := 0]

df_long
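The question does not show how `df_long` was built, so the following is only a sketch of how the run-length idea could be applied per ID, working from the wide `df` instead. It assumes the string "NA" stands for a missing value and uses base R's `rle()` on each row after dropping missing entries. The helper name `count_runs` and the five-row subset are illustrative assumptions, not part of the question.

```r
# A subset of the question's data in the same wide layout (IDs 1, 3, 4, 6, 8).
df <- data.frame(
  ID = c(1, 3, 4, 6, 8),
  S1 = c("Education", "Education", "Education", "Education", "Education"),
  S2 = c("Education", "Education", "Unemployed", "Education", "Education"),
  S3 = c("Education", "NA", "Unemployed", "Employment", "NA"),
  S4 = c("Education", "Education", "Unemployed", "Employment", "NA"),
  S5 = c("Education", "Education", "Unemployed", "Employment", "NA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of runs in one person's sequence.
# Missing entries (real NA or the string "NA") are dropped first,
# so a gap does not split a run.
count_runs <- function(x) {
  x <- x[!is.na(x) & x != "NA"]
  length(rle(x)$lengths)
}

# Each row of the matrix is one ID, so runs never spill across IDs.
states <- as.matrix(df[, -1])
df$n_runs <- apply(states, 1, count_runs)

# IDs whose sequence contains more than one run, i.e. is not uniform.
non_uniform_ids <- df$ID[df$n_runs > 1]
non_uniform_ids
```

Because every row is processed on its own, a run can never continue from one ID into the next, which is the problem with calling `rle()` on a whole column of a long table without grouping.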

I've also tried manually creating a flag variable, but that doesn't account for NAs, and with the number of repeated observations in my dataset it is too manual and time-consuming

df$employed[df$S1=="Education" & df$S2=="Education" & df$S3=="Education" & df$S4=="Education" & df$S5=="Education"] <- 1

df$employed

Any help would be greatly appreciated.

asked Nov 8 at 10:59

Maria

Could also vectorize as follows which(rowSums((df[, 2] == df[, -(1:2)]) + (df[, -(1:2)] == "NA")) < 4) (but only if you create your data while specifying , stringsAsFactors = FALSE)
– David Arenburg
Nov 8 at 11:38
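The one-liner in this comment can be unpacked step by step. The sketch below assumes the data is created with `stringsAsFactors = FALSE`, that missing values are still stored as the string "NA", and that S1 itself is never missing, since S1 is used as the reference state. The intermediate names (`later`, `same_as_first`, `is_missing`) are illustrative, and `|` is used in place of the comment's `+` to combine the two conditions.

```r
df <- data.frame(
  ID = 1:10,
  S1 = c("Education", "Employment", "Education", "Education", "Education", "Education", "Education", "Education", "Education", "Education"),
  S2 = c("Education", "Employment", "Education", "Unemployed", "Education", "Education", "Employment", "Education", "Education", "Education"),
  S3 = c("Education", "Employment", "NA", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education"),
  S4 = c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education"),
  S5 = c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education"),
  stringsAsFactors = FALSE
)

# S2..S5 as a character matrix, one row per ID.
later <- as.matrix(df[, -(1:2)])

# TRUE where a later state equals that row's S1
# (the length-10 vector is recycled down each column).
same_as_first <- later == df$S1

# TRUE where the entry is the placeholder string "NA".
is_missing <- later == "NA"

# A row is uniform when every later entry matches S1 or is missing.
non_uniform <- rowSums(same_as_first | is_missing) < ncol(later)
flagged <- unname(which(non_uniform))
flagged
# [1] 4 6 7
```

Comparing against `ncol(later)` rather than a hard-coded 4 keeps the check valid if more state columns are added.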

r count sequence

3 Answers
accepted

It's super easy:

df[df == "NA"] <- NA



df$keep <- lengths(apply(df[,-1],1, table)) > 1

#> which(df$keep)

#[1] 4 6 7

answered Nov 8 at 11:07

Andre Elrico

That's amazing, thank you Andre
– Maria
Nov 8 at 11:12
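Why the accepted answer works: `table()` drops NA values by default, so the length of each row's table is the number of distinct non-missing states. One caveat is that `apply()` simplifies its result to a matrix whenever every row returns a table of the same length, and `lengths()` then no longer counts per row. The sketch below avoids that case by returning a single number per row. The helper name `n_states` is an assumption for illustration.

```r
df <- data.frame(
  ID = 1:10,
  S1 = c("Education", "Employment", "Education", "Education", "Education", "Education", "Education", "Education", "Education", "Education"),
  S2 = c("Education", "Employment", "Education", "Unemployed", "Education", "Education", "Employment", "Education", "Education", "Education"),
  S3 = c("Education", "Employment", "NA", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education"),
  S4 = c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education"),
  S5 = c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education"),
  stringsAsFactors = FALSE
)

# Turn the "NA" strings into real missing values.
df[df == "NA"] <- NA

# Hypothetical helper: number of distinct non-missing states in one row.
n_states <- function(x) length(unique(x[!is.na(x)]))

# Every column except ID holds a state.
state_cols <- setdiff(names(df), "ID")

# One count per row, so apply() always returns a plain vector here.
df$keep <- apply(df[state_cols], 1, n_states) > 1

kept_ids <- df$ID[df$keep]
kept_ids
# [1]  4  6  7
```

Selecting the state columns by name also means that helper columns added later (such as `keep`) are not mistaken for states if the code is re-run.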


I had a similar solution, but without table:

df[df == "NA"] <- NA

df$to.keep <- apply(df[, -1], 1, function(x) {

  !any(is.na(x)) & length(unique(x)) > 1

})



> which(df$to.keep)

[1] 4 6 7

answered Nov 8 at 11:17

Gramposity

please add S6 <- c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "EMP", "Education", "Education") to the data.frame. You will see "your solution" will not work.
– Andre Elrico
Nov 8 at 11:27
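To make this comment concrete: with S6 added, ID 8 reads Education, Education, NA, NA, NA, EMP. If missing values are simply ignored, that row has two distinct observed states, yet the `!any(is.na(x))` condition drops it. The sketch below compares the two rules on a three-row subset (IDs 3, 4 and 8), with the "NA" strings already converted to real NA. The function names `rule_no_na` and `rule_ignore_na` are illustrative assumptions.

```r
df <- data.frame(
  ID = c(3, 4, 8),
  S1 = c("Education", "Education", "Education"),
  S2 = c("Education", "Unemployed", "Education"),
  S3 = c(NA, "Unemployed", NA),
  S4 = c("Education", "Unemployed", NA),
  S5 = c("Education", "Unemployed", NA),
  S6 = c("Education", "Unemployed", "EMP"),
  stringsAsFactors = FALSE
)

# Rule from the answer above: any NA disqualifies the row.
rule_no_na <- function(x) !any(is.na(x)) && length(unique(x)) > 1

# Alternative rule: drop NAs first, then look for more than one state.
rule_ignore_na <- function(x) length(unique(x[!is.na(x)])) > 1

states <- as.matrix(df[, -1])
cmp <- data.frame(
  ID        = df$ID,
  no_na     = unname(apply(states, 1, rule_no_na)),
  ignore_na = unname(apply(states, 1, rule_ignore_na))
)
cmp
```

The two rules agree on IDs 3 and 4 and differ only on ID 8, which the first rule rejects and the second keeps. Which behaviour is wanted depends on whether a row with gaps should still count once it shows a change of state.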


ID <- c(1,2,3,4,5,6,7,8,9,10)

S1 <- c("Education", "Employment", "Education", "Education", "Education", "Education", "Education", "Education", "Education", "Education")

S2 <- c("Education", "Employment", "Education", "Unemployed", "Education", "Education", "Employment", "Education", "Education", "Education")

S3 <- c("Education", "Employment", "NA", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education")

S4 <- c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education")

S5 <- c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "NA", "Education", "Education")

S6 <- c("Education", "Employment", "Education", "Unemployed", "Education", "Employment", "Employment", "EMP", "Education", "Education")

df <- data.frame(ID, S1, S2, S3, S4, S5,S6)

I also added S6 from your comments, where Andre's answer is not able to label it correctly.

library(dplyr)

df[df == "NA"] <- NA



df$Flag_NA = ifelse(apply(df %>% select(-ID),1,function(x) any(is.na(x))),'No','Yes')

df$Flag_Uniform = ifelse(apply(df %>% select(-ID,-Flag_NA), 1, function(x)length(unique(x))) == 1,'No','Yes')

df = df %>% mutate(Flag_keep = ifelse(Flag_NA == "Yes" & Flag_Uniform == "Yes","Yes","No"))



df

   ID         S1         S2         S3         S4         S5         S6 Flag_NA Flag_Uniform Flag_keep

1   1  Education  Education  Education  Education  Education  Education     Yes           No        No

2   2 Employment Employment Employment Employment Employment Employment     Yes           No        No

3   3  Education  Education       <NA>  Education  Education  Education      No          Yes        No

4   4  Education Unemployed Unemployed Unemployed Unemployed Unemployed     Yes          Yes       Yes

5   5  Education  Education  Education  Education  Education  Education     Yes           No        No

6   6  Education  Education Employment Employment Employment Employment     Yes          Yes       Yes

7   7  Education Employment Employment Employment Employment Employment     Yes          Yes       Yes

8   8  Education  Education       <NA>       <NA>       <NA>        EMP      No          Yes        No

9   9  Education  Education  Education  Education  Education  Education     Yes           No        No

10 10  Education  Education  Education  Education  Education  Education     Yes           No        No

answered Nov 8 at 11:59

Sai Prabhanjan Reddy
