Split column into intervals based on row content











up vote
2
down vote

favorite












I am trying to convert a single-column data frame into separate columns — the main descriptor in the data is the "item number" and then includes information on the price, date, color, etc. I would just split the column depending on row number, but since each item has a different amount of information, that doesn't really work.



I've been playing around with this a bit but haven't found anything at all to come close, as I can't use regex to create a separate column (using str_which, for example) since the information differs so much item to item. How can I use regex to create intervals that I can then split the column into (so I need the information between each row containing "item" in a separate column). Sample data is below.



data

item 1
$600
red
item 2
$70
item 3
$430
orange
10/11/2017


Thank you!










share|improve this question






















  • Make a list of data frames (each one is an item and its associated data) and then append them together and assign NA to empty columns.
    – Masoud
    Nov 8 at 19:18










  • @Masoud that's kind of what I'm trying to figure out how to do. I'm trying to split this data (I received it like this) by item, but since each item has a different number of associated rows, with different content, I'm not sure how
    – I.I.
    Nov 8 at 19:21















up vote
2
down vote

favorite












I am trying to convert a single-column data frame into separate columns — the main descriptor in the data is the "item number" and then includes information on the price, date, color, etc. I would just split the column depending on row number, but since each item has a different amount of information, that doesn't really work.



I've been playing around with this a bit but haven't found anything at all to come close, as I can't use regex to create a separate column (using str_which, for example) since the information differs so much item to item. How can I use regex to create intervals that I can then split the column into (so I need the information between each row containing "item" in a separate column). Sample data is below.



data

item 1
$600
red
item 2
$70
item 3
$430
orange
10/11/2017


Thank you!










share|improve this question






















  • Make a list of data frames (each one is an item and its associated data) and then append them together and assign NA to empty columns.
    – Masoud
    Nov 8 at 19:18










  • @Masoud that's kind of what I'm trying to figure out how to do. I'm trying to split this data (I received it like this) by item, but since each item has a different number of associated rows, with different content, I'm not sure how
    – I.I.
    Nov 8 at 19:21













up vote
2
down vote

favorite









up vote
2
down vote

favorite











I am trying to convert a single-column data frame into separate columns — the main descriptor in the data is the "item number" and then includes information on the price, date, color, etc. I would just split the column depending on row number, but since each item has a different amount of information, that doesn't really work.



I've been playing around with this a bit but haven't found anything at all to come close, as I can't use regex to create a separate column (using str_which, for example) since the information differs so much item to item. How can I use regex to create intervals that I can then split the column into (so I need the information between each row containing "item" in a separate column). Sample data is below.



data

item 1
$600
red
item 2
$70
item 3
$430
orange
10/11/2017


Thank you!










share|improve this question













I am trying to convert a single-column data frame into separate columns — the main descriptor in the data is the "item number" and then includes information on the price, date, color, etc. I would just split the column depending on row number, but since each item has a different amount of information, that doesn't really work.



I've been playing around with this a bit but haven't found anything at all to come close, as I can't use regex to create a separate column (using str_which, for example) since the information differs so much item to item. How can I use regex to create intervals that I can then split the column into (so I need the information between each row containing "item" in a separate column). Sample data is below.



data

item 1
$600
red
item 2
$70
item 3
$430
orange
10/11/2017


Thank you!







r






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 8 at 19:15









I.I.

111




111












  • Make a list of data frames (each one is an item and its associated data) and then append them together and assign NA to empty columns.
    – Masoud
    Nov 8 at 19:18










  • @Masoud that's kind of what I'm trying to figure out how to do. I'm trying to split this data (I received it like this) by item, but since each item has a different number of associated rows, with different content, I'm not sure how
    – I.I.
    Nov 8 at 19:21


















  • Make a list of data frames (each one is an item and its associated data) and then append them together and assign NA to empty columns.
    – Masoud
    Nov 8 at 19:18










  • @Masoud that's kind of what I'm trying to figure out how to do. I'm trying to split this data (I received it like this) by item, but since each item has a different number of associated rows, with different content, I'm not sure how
    – I.I.
    Nov 8 at 19:21
















Make a list of data frames (each one is an item and its associated data) and then append them together and assign NA to empty columns.
– Masoud
Nov 8 at 19:18




Make a list of data frames (each one is an item and its associated data) and then append them together and assign NA to empty columns.
– Masoud
Nov 8 at 19:18












@Masoud that's kind of what I'm trying to figure out how to do. I'm trying to split this data (I received it like this) by item, but since each item has a different number of associated rows, with different content, I'm not sure how
– I.I.
Nov 8 at 19:21




@Masoud that's kind of what I'm trying to figure out how to do. I'm trying to split this data (I received it like this) by item, but since each item has a different number of associated rows, with different content, I'm not sure how
– I.I.
Nov 8 at 19:21












1 Answer
1






active

oldest

votes

















up vote
0
down vote













Here is a function to reformat your data depending on how you want the final dataset to look like. For the function, you supply the dataframe DF, the variable var, and a vector of column names in the correct order colnames and byitem to choose the output format (default is TRUE, which outputs a dataframe with one row per item):



library(tidyverse)

df_transform = function(DF, var, colnames, byitem = TRUE){
if(byitem){
ID = sym("rowid")
}else{
ID = sym("id")
}
DF %>%
group_by(id = paste0("item", cumsum(grepl("item", var)))) %>%
mutate(rowid = replace(2:n(), 2:n(), setNames(colnames[1:(n()-1)], 2:n()))) %>%
filter(!grepl("item", var)) %>%
spread(!!ID, var)
}


Output:



> df_transform(df, var, c("price", "color", "date"))

# A tibble: 3 x 4
# Groups: id [3]
id color date price
<chr> <fct> <fct> <fct>
1 item1 red <NA> $600
2 item2 <NA> <NA> $70
3 item3 orange 10/11/2017 $430


> df_transform(df, var, c("price", "color", "date"), byitem = FALSE)

# A tibble: 3 x 4
rowid item1 item2 item3
<chr> <fct> <fct> <fct>
1 color red <NA> orange
2 date <NA> <NA> 10/11/2017
3 price $600 $70 $430


Note that this would not work if you have missing values in the middle, since the column names are assigned by position.



Data:



df <- structure(list(var = structure(c(5L, 2L, 9L, 6L, 3L, 7L, 1L, 
8L, 4L), .Label = c("$430", "$600", "$70", "10/11/2017", "item_1",
"item_2", "item_3", "orange", "red"), class = "factor")), .Names = "var", class = "data.frame", row.names = c(NA,
-9L))





share|improve this answer























  • Thank you so much! This is great. I was wondering, though (sorry for not making it clear in my original question, and for forgetting to provide reproducible data), if there's something that can be done if the "price" and "color" columns, for example, are not always in the same order (as in, not always in row 2). Thank you again!
    – I.I.
    Nov 8 at 19:59










  • @I.I. Well, other than the position of your values, there is no way of knowing whether the literal "$100" is supposed to be a price or something else. Unless there is another column to indicate the column name for each row, or to naively assume that a value with a dollar sign is always "price", I'm afraid your data is too ambiguous to do this kind of transformation.
    – avid_useR
    Nov 8 at 20:03










  • @I.I. If you are willing to impose an assumption that "$" is always price, characters are always color, and dates are always date, then I can modify my answer to accommodate that.
    – avid_useR
    Nov 8 at 20:05










  • thank you! That makes sense... I think I'm going to go ahead and try this code but try a few different regex expressions to get the columns out...I do have just one more question, though - there are a lot of duplicated rows in the set of kind of useless information (like, table 1 will have two rows of orange, just as an example) and dropping duplicate rows removes all instances of orange (which I don't want). I was wondering if there was a way within using mutate + case_when to drop rows when it doesn't match. I hope that makes sense - thank you again!
    – I.I.
    Nov 8 at 22:36










  • @I.I. Do you mean only keeping the first row for each duplicate rows? Perhaps you should update your question and provide expected output to illustrate what you mean.
    – avid_useR
    Nov 9 at 3:02











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53214673%2fsplit-column-into-intervals-based-on-row-content%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
0
down vote













Here is a function to reformat your data depending on how you want the final dataset to look like. For the function, you supply the dataframe DF, the variable var, and a vector of column names in the correct order colnames and byitem to choose the output format (default is TRUE, which outputs a dataframe with one row per item):



library(tidyverse)

df_transform = function(DF, var, colnames, byitem = TRUE){
if(byitem){
ID = sym("rowid")
}else{
ID = sym("id")
}
DF %>%
group_by(id = paste0("item", cumsum(grepl("item", var)))) %>%
mutate(rowid = replace(2:n(), 2:n(), setNames(colnames[1:(n()-1)], 2:n()))) %>%
filter(!grepl("item", var)) %>%
spread(!!ID, var)
}


Output:



> df_transform(df, var, c("price", "color", "date"))

# A tibble: 3 x 4
# Groups: id [3]
id color date price
<chr> <fct> <fct> <fct>
1 item1 red <NA> $600
2 item2 <NA> <NA> $70
3 item3 orange 10/11/2017 $430


> df_transform(df, var, c("price", "color", "date"), byitem = FALSE)

# A tibble: 3 x 4
rowid item1 item2 item3
<chr> <fct> <fct> <fct>
1 color red <NA> orange
2 date <NA> <NA> 10/11/2017
3 price $600 $70 $430


Note that this would not work if you have missing values in the middle, since the column names are assigned by position.



Data:



df <- structure(list(var = structure(c(5L, 2L, 9L, 6L, 3L, 7L, 1L, 
8L, 4L), .Label = c("$430", "$600", "$70", "10/11/2017", "item_1",
"item_2", "item_3", "orange", "red"), class = "factor")), .Names = "var", class = "data.frame", row.names = c(NA,
-9L))





share|improve this answer























  • Thank you so much! This is great. I was wondering, though (sorry for not making it clear in my original question, and for forgetting to provide reproducible data), if there's something that can be done if the "price" and "color" columns, for example, are not always in the same order (as in, not always in row 2). Thank you again!
    – I.I.
    Nov 8 at 19:59










  • @I.I. Well, other than the position of your values, there is no way of knowing whether the literal "$100" is supposed to be a price or something else. Unless there is another column to indicate the column name for each row, or to naively assume that a value with a dollar sign is always "price", I'm afraid your data is too ambiguous to do this kind of transformation.
    – avid_useR
    Nov 8 at 20:03










  • @I.I. If you are willing to impose an assumption that "$" is always price, characters are always color, and dates are always date, then I can modify my answer to accommodate that.
    – avid_useR
    Nov 8 at 20:05










  • thank you! That makes sense... I think I'm going to go ahead and try this code but try a few different regex expressions to get the columns out...I do have just one more question, though - there are a lot of duplicated rows in the set of kind of useless information (like, table 1 will have two rows of orange, just as an example) and dropping duplicate rows removes all instances of orange (which I don't want). I was wondering if there was a way within using mutate + case_when to drop rows when it doesn't match. I hope that makes sense - thank you again!
    – I.I.
    Nov 8 at 22:36










  • @I.I. Do you mean only keeping the first row for each duplicate rows? Perhaps you should update your question and provide expected output to illustrate what you mean.
    – avid_useR
    Nov 9 at 3:02















up vote
0
down vote













Here is a function to reformat your data depending on how you want the final dataset to look like. For the function, you supply the dataframe DF, the variable var, and a vector of column names in the correct order colnames and byitem to choose the output format (default is TRUE, which outputs a dataframe with one row per item):



library(tidyverse)

df_transform = function(DF, var, colnames, byitem = TRUE){
if(byitem){
ID = sym("rowid")
}else{
ID = sym("id")
}
DF %>%
group_by(id = paste0("item", cumsum(grepl("item", var)))) %>%
mutate(rowid = replace(2:n(), 2:n(), setNames(colnames[1:(n()-1)], 2:n()))) %>%
filter(!grepl("item", var)) %>%
spread(!!ID, var)
}


Output:



> df_transform(df, var, c("price", "color", "date"))

# A tibble: 3 x 4
# Groups: id [3]
id color date price
<chr> <fct> <fct> <fct>
1 item1 red <NA> $600
2 item2 <NA> <NA> $70
3 item3 orange 10/11/2017 $430


> df_transform(df, var, c("price", "color", "date"), byitem = FALSE)

# A tibble: 3 x 4
rowid item1 item2 item3
<chr> <fct> <fct> <fct>
1 color red <NA> orange
2 date <NA> <NA> 10/11/2017
3 price $600 $70 $430


Note that this would not work if you have missing values in the middle, since the column names are assigned by position.



Data:



df <- structure(list(var = structure(c(5L, 2L, 9L, 6L, 3L, 7L, 1L, 
8L, 4L), .Label = c("$430", "$600", "$70", "10/11/2017", "item_1",
"item_2", "item_3", "orange", "red"), class = "factor")), .Names = "var", class = "data.frame", row.names = c(NA,
-9L))





share|improve this answer























  • Thank you so much! This is great. I was wondering, though (sorry for not making it clear in my original question, and for forgetting to provide reproducible data), if there's something that can be done if the "price" and "color" columns, for example, are not always in the same order (as in, not always in row 2). Thank you again!
    – I.I.
    Nov 8 at 19:59










  • @I.I. Well, other than the position of your values, there is no way of knowing whether the literal "$100" is supposed to be a price or something else. Unless there is another column to indicate the column name for each row, or to naively assume that a value with a dollar sign is always "price", I'm afraid your data is too ambiguous to do this kind of transformation.
    – avid_useR
    Nov 8 at 20:03










  • @I.I. If you are willing to impose an assumption that "$" is always price, characters are always color, and dates are always date, then I can modify my answer to accommodate that.
    – avid_useR
    Nov 8 at 20:05










  • thank you! That makes sense... I think I'm going to go ahead and try this code but try a few different regex expressions to get the columns out...I do have just one more question, though - there are a lot of duplicated rows in the set of kind of useless information (like, table 1 will have two rows of orange, just as an example) and dropping duplicate rows removes all instances of orange (which I don't want). I was wondering if there was a way within using mutate + case_when to drop rows when it doesn't match. I hope that makes sense - thank you again!
    – I.I.
    Nov 8 at 22:36










  • @I.I. Do you mean only keeping the first row for each duplicate rows? Perhaps you should update your question and provide expected output to illustrate what you mean.
    – avid_useR
    Nov 9 at 3:02













up vote
0
down vote










up vote
0
down vote









Here is a function to reformat your data depending on how you want the final dataset to look like. For the function, you supply the dataframe DF, the variable var, and a vector of column names in the correct order colnames and byitem to choose the output format (default is TRUE, which outputs a dataframe with one row per item):



library(tidyverse)

df_transform = function(DF, var, colnames, byitem = TRUE){
if(byitem){
ID = sym("rowid")
}else{
ID = sym("id")
}
DF %>%
group_by(id = paste0("item", cumsum(grepl("item", var)))) %>%
mutate(rowid = replace(2:n(), 2:n(), setNames(colnames[1:(n()-1)], 2:n()))) %>%
filter(!grepl("item", var)) %>%
spread(!!ID, var)
}


Output:



> df_transform(df, var, c("price", "color", "date"))

# A tibble: 3 x 4
# Groups: id [3]
id color date price
<chr> <fct> <fct> <fct>
1 item1 red <NA> $600
2 item2 <NA> <NA> $70
3 item3 orange 10/11/2017 $430


> df_transform(df, var, c("price", "color", "date"), byitem = FALSE)

# A tibble: 3 x 4
rowid item1 item2 item3
<chr> <fct> <fct> <fct>
1 color red <NA> orange
2 date <NA> <NA> 10/11/2017
3 price $600 $70 $430


Note that this would not work if you have missing values in the middle, since the column names are assigned by position.



Data:



df <- structure(list(var = structure(c(5L, 2L, 9L, 6L, 3L, 7L, 1L, 
8L, 4L), .Label = c("$430", "$600", "$70", "10/11/2017", "item_1",
"item_2", "item_3", "orange", "red"), class = "factor")), .Names = "var", class = "data.frame", row.names = c(NA,
-9L))





share|improve this answer














Here is a function to reformat your data depending on how you want the final dataset to look like. For the function, you supply the dataframe DF, the variable var, and a vector of column names in the correct order colnames and byitem to choose the output format (default is TRUE, which outputs a dataframe with one row per item):



library(tidyverse)

df_transform = function(DF, var, colnames, byitem = TRUE){
if(byitem){
ID = sym("rowid")
}else{
ID = sym("id")
}
DF %>%
group_by(id = paste0("item", cumsum(grepl("item", var)))) %>%
mutate(rowid = replace(2:n(), 2:n(), setNames(colnames[1:(n()-1)], 2:n()))) %>%
filter(!grepl("item", var)) %>%
spread(!!ID, var)
}


Output:



> df_transform(df, var, c("price", "color", "date"))

# A tibble: 3 x 4
# Groups: id [3]
id color date price
<chr> <fct> <fct> <fct>
1 item1 red <NA> $600
2 item2 <NA> <NA> $70
3 item3 orange 10/11/2017 $430


> df_transform(df, var, c("price", "color", "date"), byitem = FALSE)

# A tibble: 3 x 4
rowid item1 item2 item3
<chr> <fct> <fct> <fct>
1 color red <NA> orange
2 date <NA> <NA> 10/11/2017
3 price $600 $70 $430


Note that this would not work if you have missing values in the middle, since the column names are assigned by position.



Data:



df <- structure(list(var = structure(c(5L, 2L, 9L, 6L, 3L, 7L, 1L, 
8L, 4L), .Label = c("$430", "$600", "$70", "10/11/2017", "item_1",
"item_2", "item_3", "orange", "red"), class = "factor")), .Names = "var", class = "data.frame", row.names = c(NA,
-9L))






share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 8 at 19:57

























answered Nov 8 at 19:36









avid_useR

11.6k41830




11.6k41830












  • Thank you so much! This is great. I was wondering, though (sorry for not making it clear in my original question, and for forgetting to provide reproducible data), if there's something that can be done if the "price" and "color" columns, for example, are not always in the same order (as in, not always in row 2). Thank you again!
    – I.I.
    Nov 8 at 19:59










  • @I.I. Well, other than the position of your values, there is no way of knowing whether the literal "$100" is supposed to be a price or something else. Unless there is another column to indicate the column name for each row, or to naively assume that a value with a dollar sign is always "price", I'm afraid your data is too ambiguous to do this kind of transformation.
    – avid_useR
    Nov 8 at 20:03










  • @I.I. If you are willing to impose an assumption that "$" is always price, characters are always color, and dates are always date, then I can modify my answer to accommodate that.
    – avid_useR
    Nov 8 at 20:05










  • thank you! That makes sense... I think I'm going to go ahead and try this code but try a few different regex expressions to get the columns out...I do have just one more question, though - there are a lot of duplicated rows in the set of kind of useless information (like, table 1 will have two rows of orange, just as an example) and dropping duplicate rows removes all instances of orange (which I don't want). I was wondering if there was a way within using mutate + case_when to drop rows when it doesn't match. I hope that makes sense - thank you again!
    – I.I.
    Nov 8 at 22:36










  • @I.I. Do you mean only keeping the first row for each duplicate rows? Perhaps you should update your question and provide expected output to illustrate what you mean.
    – avid_useR
    Nov 9 at 3:02


















  • Thank you so much! This is great. I was wondering, though (sorry for not making it clear in my original question, and for forgetting to provide reproducible data), if there's something that can be done if the "price" and "color" columns, for example, are not always in the same order (as in, not always in row 2). Thank you again!
    – I.I.
    Nov 8 at 19:59










  • @I.I. Well, other than the position of your values, there is no way of knowing whether the literal "$100" is supposed to be a price or something else. Unless there is another column to indicate the column name for each row, or to naively assume that a value with a dollar sign is always "price", I'm afraid your data is too ambiguous to do this kind of transformation.
    – avid_useR
    Nov 8 at 20:03










  • @I.I. If you are willing to impose an assumption that "$" is always price, characters are always color, and dates are always date, then I can modify my answer to accommodate that.
    – avid_useR
    Nov 8 at 20:05










  • thank you! That makes sense... I think I'm going to go ahead and try this code but try a few different regex expressions to get the columns out...I do have just one more question, though - there are a lot of duplicated rows in the set of kind of useless information (like, table 1 will have two rows of orange, just as an example) and dropping duplicate rows removes all instances of orange (which I don't want). I was wondering if there was a way within using mutate + case_when to drop rows when it doesn't match. I hope that makes sense - thank you again!
    – I.I.
    Nov 8 at 22:36










  • @I.I. Do you mean only keeping the first row for each duplicate rows? Perhaps you should update your question and provide expected output to illustrate what you mean.
    – avid_useR
    Nov 9 at 3:02
















Thank you so much! This is great. I was wondering, though (sorry for not making it clear in my original question, and for forgetting to provide reproducible data), if there's something that can be done if the "price" and "color" columns, for example, are not always in the same order (as in, not always in row 2). Thank you again!
– I.I.
Nov 8 at 19:59




Thank you so much! This is great. I was wondering, though (sorry for not making it clear in my original question, and for forgetting to provide reproducible data), if there's something that can be done if the "price" and "color" columns, for example, are not always in the same order (as in, not always in row 2). Thank you again!
– I.I.
Nov 8 at 19:59












@I.I. Well, other than the position of your values, there is no way of knowing whether the literal "$100" is supposed to be a price or something else. Unless there is another column to indicate the column name for each row, or to naively assume that a value with a dollar sign is always "price", I'm afraid your data is too ambiguous to do this kind of transformation.
– avid_useR
Nov 8 at 20:03




@I.I. Well, other than the position of your values, there is no way of knowing whether the literal "$100" is supposed to be a price or something else. Unless there is another column to indicate the column name for each row, or to naively assume that a value with a dollar sign is always "price", I'm afraid your data is too ambiguous to do this kind of transformation.
– avid_useR
Nov 8 at 20:03












@I.I. If you are willing to impose an assumption that "$" is always price, characters are always color, and dates are always date, then I can modify my answer to accommodate that.
– avid_useR
Nov 8 at 20:05




@I.I. If you are willing to impose an assumption that "$" is always price, characters are always color, and dates are always date, then I can modify my answer to accommodate that.
– avid_useR
Nov 8 at 20:05












thank you! That makes sense... I think I'm going to go ahead and try this code but try a few different regex expressions to get the columns out...I do have just one more question, though - there are a lot of duplicated rows in the set of kind of useless information (like, table 1 will have two rows of orange, just as an example) and dropping duplicate rows removes all instances of orange (which I don't want). I was wondering if there was a way within using mutate + case_when to drop rows when it doesn't match. I hope that makes sense - thank you again!
– I.I.
Nov 8 at 22:36




thank you! That makes sense... I think I'm going to go ahead and try this code but try a few different regex expressions to get the columns out...I do have just one more question, though - there are a lot of duplicated rows in the set of kind of useless information (like, table 1 will have two rows of orange, just as an example) and dropping duplicate rows removes all instances of orange (which I don't want). I was wondering if there was a way within using mutate + case_when to drop rows when it doesn't match. I hope that makes sense - thank you again!
– I.I.
Nov 8 at 22:36












@I.I. Do you mean only keeping the first row for each duplicate rows? Perhaps you should update your question and provide expected output to illustrate what you mean.
– avid_useR
Nov 9 at 3:02




@I.I. Do you mean only keeping the first row for each duplicate rows? Perhaps you should update your question and provide expected output to illustrate what you mean.
– avid_useR
Nov 9 at 3:02


















 

draft saved


draft discarded



















































 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53214673%2fsplit-column-into-intervals-based-on-row-content%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Schultheiß

Verwaltungsgliederung Dänemarks

Liste der Kulturdenkmale in Wilsdruff