How can I calculate the sum of comma separated in the 2nd column
up vote
2
down vote
favorite
sm_agg
Group.1 x
1 1001 8
2 1002 16, 8
3 1003 8
4 1004 16
5 1005 5.33333333333333, 8
6 1006 4
7 1007 4
8 1008 4
9 1009 5.33333333333333
10 1010 8, 5.33333333333333
11 1011 8, 4
12 1012 5.33333333333333
13 1013 5.33333333333333, 8
14 1014 8
15 1015 5.33333333333333
16 1016 5.33333333333333
I want to get like this
sm_agg
Group.1 x
1 1001 8
2 1002 24
3 1003 8
4 1004 16
5 1005 13.3
6 1006 4
7 1007 4
8 1008 4
9 1009 5.33333333333333
10 1010 13.3
11 1011 12
12 1012 5.33333333333333
13 1013 13.3
14 1014 8
15 1015 5.33333333333333
16 1016 5.33333333333333
r
add a comment |
up vote
2
down vote
favorite
sm_agg
Group.1 x
1 1001 8
2 1002 16, 8
3 1003 8
4 1004 16
5 1005 5.33333333333333, 8
6 1006 4
7 1007 4
8 1008 4
9 1009 5.33333333333333
10 1010 8, 5.33333333333333
11 1011 8, 4
12 1012 5.33333333333333
13 1013 5.33333333333333, 8
14 1014 8
15 1015 5.33333333333333
16 1016 5.33333333333333
I want to get like this
sm_agg
Group.1 x
1 1001 8
2 1002 24
3 1003 8
4 1004 16
5 1005 13.3
6 1006 4
7 1007 4
8 1008 4
9 1009 5.33333333333333
10 1010 13.3
11 1011 12
12 1012 5.33333333333333
13 1013 13.3
14 1014 8
15 1015 5.33333333333333
16 1016 5.33333333333333
r
add a comment |
up vote
2
down vote
favorite
up vote
2
down vote
favorite
sm_agg
Group.1 x
1 1001 8
2 1002 16, 8
3 1003 8
4 1004 16
5 1005 5.33333333333333, 8
6 1006 4
7 1007 4
8 1008 4
9 1009 5.33333333333333
10 1010 8, 5.33333333333333
11 1011 8, 4
12 1012 5.33333333333333
13 1013 5.33333333333333, 8
14 1014 8
15 1015 5.33333333333333
16 1016 5.33333333333333
I want to get like this
sm_agg
Group.1 x
1 1001 8
2 1002 24
3 1003 8
4 1004 16
5 1005 13.3
6 1006 4
7 1007 4
8 1008 4
9 1009 5.33333333333333
10 1010 13.3
11 1011 12
12 1012 5.33333333333333
13 1013 13.3
14 1014 8
15 1015 5.33333333333333
16 1016 5.33333333333333
r
sm_agg
Group.1 x
1 1001 8
2 1002 16, 8
3 1003 8
4 1004 16
5 1005 5.33333333333333, 8
6 1006 4
7 1007 4
8 1008 4
9 1009 5.33333333333333
10 1010 8, 5.33333333333333
11 1011 8, 4
12 1012 5.33333333333333
13 1013 5.33333333333333, 8
14 1014 8
15 1015 5.33333333333333
16 1016 5.33333333333333
I want to get like this
sm_agg
Group.1 x
1 1001 8
2 1002 24
3 1003 8
4 1004 16
5 1005 13.3
6 1006 4
7 1007 4
8 1008 4
9 1009 5.33333333333333
10 1010 13.3
11 1011 12
12 1012 5.33333333333333
13 1013 13.3
14 1014 8
15 1015 5.33333333333333
16 1016 5.33333333333333
r
r
edited Nov 10 at 5:50
Ronak Shah
30.4k103753
30.4k103753
asked Nov 10 at 5:06
Kiran Pg
228
228
add a comment |
add a comment |
3 Answers
3
active
oldest
votes
up vote
1
down vote
accepted
Try this:
sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i)))
sm_agg
# Group.1 x
# 1 1001 8.000000
# 2 1002 24.000000
# 3 1003 8.000000
# 4 1004 16.000000
# 5 1005 13.333333
# 6 1006 4.000000
# 7 1007 4.000000
# 8 1008 4.000000
# 9 1009 5.333333
# 10 1010 13.333333
# 11 1011 12.000000
# 12 1012 5.333333
# 13 1013 13.333333
# 14 1014 8.000000
# 15 1015 5.333333
# 16 1016 5.333333
Explanation:
For a single entry, we split it by one or more commas/spaces:
strsplit(sm_agg$x[2], "[, ]+")
# [[1]]
# [1] "16" "8"
With that, we want to convert to numbers and add, so
as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]])
# [1] 16 8
sum(as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]))
# [1] 24
We want to do that for every element, so we instead feed the
strsplit
output into ansapply
anon-function.
If your frame has factor
s instead of strings, then instead use
sapply(strsplit(as.character(sm_agg$x), "[ ,]+"), function(i) sum(as.numeric(i)))
Last Edit
I think your data is actually an embedded list
. When data contains a list-column, it presents like that (which I find a little frustrating, but still ...).
I'll generate some fake data to demonstrate what I think you actually have:
sm2 <- data.frame(Group.1 = c("1001", "1002", "1003", "1005"))
sm2$x <- list(c(8L), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x
# 1 1001 8
# 2 1002 16, 8
# 3 1003 8
# 4 1005 5.333333, 8.000000
Okay. When we tried strsplit
and even as.character
, things break and are obviously not number-like:
as.character(sm2$x)
# [1] "8" "c(16, 8)" "8"
# [4] "c(5.33333333333333, 8)"
When in fact, all we have to do is just sum them up, because they're already numbers.
sapply(sm2$x, sum)
# [1] 8.00000 24.00000 8.00000 13.33333
If by chance one of the nested things is actually a character
:
sm2$y <- list(c("8"), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x y
# 1 1001 8 8
# 2 1002 16, 8 16, 8
# 3 1003 8 8
# 4 1005 5.333333, 8.000000 5.333333, 8.000000
which will cause our "simple" solution to fail.
sapply(sm2$y, sum)
# Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
Luckily, we can be a bit over-handed and force strings to numbers, and numbers to numbers:
sapply(sm2$y, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
sapply(sm2$x, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
when i apply this i got error > as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]) Error in strsplit(sm_agg$x[2], "[, ]+") : non-character argument > sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i))) Error in strsplit(sm_agg$x, "[ ,]+") : non-character argument >
– Kiran Pg
Nov 10 at 5:33
1
You havefactor
s in your data. In the future, presenting your data in an unambiguous way such asdput(head(sm_agg))
would have prevented this. See my edit for the workaround.
– r2evans
Nov 10 at 5:38
>sapply(strsplit(as.character(sm_agg$x), "[,]+"), function(i) sum(as.numeric(i))) [1] 8.000000 NA 8.000000 16.000000 NA 4.000000 4.000000 4.000000 5.333333 NA [11] NA 5.333333 NA 8.000000 5.333333 5.333333
– Kiran Pg
Nov 10 at 5:57
You forgot the space in the pattern. That space is not an accident or a "code style preference". If you want to be even clearer, replace"[ ,]+"
(notice the space, total of five chars within the quotes) with"[\s,]+"
(no space here, the\s
includes spaces and tabs).
– r2evans
Nov 10 at 6:05
1
You can accept it by clicking on the check mark next to vote button.
– Ronak Shah
Nov 10 at 11:50
|
show 4 more comments
up vote
1
down vote
We can use separate_rows
to separate comma-separated enteries into different rows and then sum
by group.
library(tidyverse)
df %>%
separate_rows(x, sep = ",") %>%
group_by(Group.1) %>%
summarise(x = sum(as.numeric(x)))
# Group.1 x
# <dbl> <dbl>
# 1 1001 8
# 2 1002 24
# 3 1003 8
# 4 1004 16
# 5 1005 13.3
# 6 1006 4
# 7 1007 4
# 8 1008 4
# 9 1009 5.33
#10 1010 13.3
#11 1011 12
#12 1012 5.33
#13 1013 13.3
#14 1014 8
#15 1015 5.33
#16 1016 5.33
data
df <- structure(list(Group.1 = c(1001, 1002, 1003, 1004, 1005, 1006,
1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016),
x = structure(c(5L, 7L, 5L, 6L, 10L, 2L, 1L, 1L, 9L, 11L,
4L, 8L, 10L, 3L, 8L, 8L), .Label = c(" 4",
" 4", " 8", " 8, 4",
" 8", " 16", " 16, 8", " 5.33333333333333",
" 5.33333333333333", " 5.33333333333333, 8", " 8, 5.33333333333333"
), class = "factor")), .Names = c("Group.1", "x"), class =
"data.frame", row.names = c(NA,
-16L))
add a comment |
up vote
0
down vote
using this method i get asnwer sapply(sm2$y, function(i) sum(as.numeric(i)))
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53236148%2fhow-can-i-calculate-the-sum-of-comma-separated-in-the-2nd-column%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
accepted
Try this:
sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i)))
sm_agg
# Group.1 x
# 1 1001 8.000000
# 2 1002 24.000000
# 3 1003 8.000000
# 4 1004 16.000000
# 5 1005 13.333333
# 6 1006 4.000000
# 7 1007 4.000000
# 8 1008 4.000000
# 9 1009 5.333333
# 10 1010 13.333333
# 11 1011 12.000000
# 12 1012 5.333333
# 13 1013 13.333333
# 14 1014 8.000000
# 15 1015 5.333333
# 16 1016 5.333333
Explanation:
For a single entry, we split it by one or more commas/spaces:
strsplit(sm_agg$x[2], "[, ]+")
# [[1]]
# [1] "16" "8"
With that, we want to convert to numbers and add, so
as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]])
# [1] 16 8
sum(as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]))
# [1] 24
We want to do that for every element, so we instead feed the
strsplit
output into ansapply
anon-function.
If your frame has factor
s instead of strings, then instead use
sapply(strsplit(as.character(sm_agg$x), "[ ,]+"), function(i) sum(as.numeric(i)))
Last Edit
I think your data is actually an embedded list
. When data contains a list-column, it presents like that (which I find a little frustrating, but still ...).
I'll generate some fake data to demonstrate what I think you actually have:
sm2 <- data.frame(Group.1 = c("1001", "1002", "1003", "1005"))
sm2$x <- list(c(8L), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x
# 1 1001 8
# 2 1002 16, 8
# 3 1003 8
# 4 1005 5.333333, 8.000000
Okay. When we tried strsplit
and even as.character
, things break and are obviously not number-like:
as.character(sm2$x)
# [1] "8" "c(16, 8)" "8"
# [4] "c(5.33333333333333, 8)"
When in fact, all we have to do is just sum them up, because they're already numbers.
sapply(sm2$x, sum)
# [1] 8.00000 24.00000 8.00000 13.33333
If by chance one of the nested things is actually a character
:
sm2$y <- list(c("8"), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x y
# 1 1001 8 8
# 2 1002 16, 8 16, 8
# 3 1003 8 8
# 4 1005 5.333333, 8.000000 5.333333, 8.000000
which will cause our "simple" solution to fail.
sapply(sm2$y, sum)
# Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
Luckily, we can be a bit over-handed and force strings to numbers, and numbers to numbers:
sapply(sm2$y, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
sapply(sm2$x, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
when i apply this i got error > as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]) Error in strsplit(sm_agg$x[2], "[, ]+") : non-character argument > sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i))) Error in strsplit(sm_agg$x, "[ ,]+") : non-character argument >
– Kiran Pg
Nov 10 at 5:33
1
You havefactor
s in your data. In the future, presenting your data in an unambiguous way such asdput(head(sm_agg))
would have prevented this. See my edit for the workaround.
– r2evans
Nov 10 at 5:38
>sapply(strsplit(as.character(sm_agg$x), "[,]+"), function(i) sum(as.numeric(i))) [1] 8.000000 NA 8.000000 16.000000 NA 4.000000 4.000000 4.000000 5.333333 NA [11] NA 5.333333 NA 8.000000 5.333333 5.333333
– Kiran Pg
Nov 10 at 5:57
You forgot the space in the pattern. That space is not an accident or a "code style preference". If you want to be even clearer, replace"[ ,]+"
(notice the space, total of five chars within the quotes) with"[\s,]+"
(no space here, the\s
includes spaces and tabs).
– r2evans
Nov 10 at 6:05
1
You can accept it by clicking on the check mark next to vote button.
– Ronak Shah
Nov 10 at 11:50
|
show 4 more comments
up vote
1
down vote
accepted
Try this:
sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i)))
sm_agg
# Group.1 x
# 1 1001 8.000000
# 2 1002 24.000000
# 3 1003 8.000000
# 4 1004 16.000000
# 5 1005 13.333333
# 6 1006 4.000000
# 7 1007 4.000000
# 8 1008 4.000000
# 9 1009 5.333333
# 10 1010 13.333333
# 11 1011 12.000000
# 12 1012 5.333333
# 13 1013 13.333333
# 14 1014 8.000000
# 15 1015 5.333333
# 16 1016 5.333333
Explanation:
For a single entry, we split it by one or more commas/spaces:
strsplit(sm_agg$x[2], "[, ]+")
# [[1]]
# [1] "16" "8"
With that, we want to convert to numbers and add, so
as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]])
# [1] 16 8
sum(as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]))
# [1] 24
We want to do that for every element, so we instead feed the
strsplit
output into ansapply
anon-function.
If your frame has factor
s instead of strings, then instead use
sapply(strsplit(as.character(sm_agg$x), "[ ,]+"), function(i) sum(as.numeric(i)))
Last Edit
I think your data is actually an embedded list
. When data contains a list-column, it presents like that (which I find a little frustrating, but still ...).
I'll generate some fake data to demonstrate what I think you actually have:
sm2 <- data.frame(Group.1 = c("1001", "1002", "1003", "1005"))
sm2$x <- list(c(8L), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x
# 1 1001 8
# 2 1002 16, 8
# 3 1003 8
# 4 1005 5.333333, 8.000000
Okay. When we tried strsplit
and even as.character
, things break and are obviously not number-like:
as.character(sm2$x)
# [1] "8" "c(16, 8)" "8"
# [4] "c(5.33333333333333, 8)"
When in fact, all we have to do is just sum them up, because they're already numbers.
sapply(sm2$x, sum)
# [1] 8.00000 24.00000 8.00000 13.33333
If by chance one of the nested things is actually a character
:
sm2$y <- list(c("8"), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x y
# 1 1001 8 8
# 2 1002 16, 8 16, 8
# 3 1003 8 8
# 4 1005 5.333333, 8.000000 5.333333, 8.000000
which will cause our "simple" solution to fail.
sapply(sm2$y, sum)
# Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
Luckily, we can be a bit over-handed and force strings to numbers, and numbers to numbers:
sapply(sm2$y, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
sapply(sm2$x, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
when i apply this i got error > as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]) Error in strsplit(sm_agg$x[2], "[, ]+") : non-character argument > sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i))) Error in strsplit(sm_agg$x, "[ ,]+") : non-character argument >
– Kiran Pg
Nov 10 at 5:33
1
You havefactor
s in your data. In the future, presenting your data in an unambiguous way such asdput(head(sm_agg))
would have prevented this. See my edit for the workaround.
– r2evans
Nov 10 at 5:38
>sapply(strsplit(as.character(sm_agg$x), "[,]+"), function(i) sum(as.numeric(i))) [1] 8.000000 NA 8.000000 16.000000 NA 4.000000 4.000000 4.000000 5.333333 NA [11] NA 5.333333 NA 8.000000 5.333333 5.333333
– Kiran Pg
Nov 10 at 5:57
You forgot the space in the pattern. That space is not an accident or a "code style preference". If you want to be even clearer, replace"[ ,]+"
(notice the space, total of five chars within the quotes) with"[\s,]+"
(no space here, the\s
includes spaces and tabs).
– r2evans
Nov 10 at 6:05
1
You can accept it by clicking on the check mark next to vote button.
– Ronak Shah
Nov 10 at 11:50
|
show 4 more comments
up vote
1
down vote
accepted
up vote
1
down vote
accepted
Try this:
sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i)))
sm_agg
# Group.1 x
# 1 1001 8.000000
# 2 1002 24.000000
# 3 1003 8.000000
# 4 1004 16.000000
# 5 1005 13.333333
# 6 1006 4.000000
# 7 1007 4.000000
# 8 1008 4.000000
# 9 1009 5.333333
# 10 1010 13.333333
# 11 1011 12.000000
# 12 1012 5.333333
# 13 1013 13.333333
# 14 1014 8.000000
# 15 1015 5.333333
# 16 1016 5.333333
Explanation:
For a single entry, we split it by one or more commas/spaces:
strsplit(sm_agg$x[2], "[, ]+")
# [[1]]
# [1] "16" "8"
With that, we want to convert to numbers and add, so
as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]])
# [1] 16 8
sum(as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]))
# [1] 24
We want to do that for every element, so we instead feed the
strsplit
output into ansapply
anon-function.
If your frame has factor
s instead of strings, then instead use
sapply(strsplit(as.character(sm_agg$x), "[ ,]+"), function(i) sum(as.numeric(i)))
Last Edit
I think your data is actually an embedded list
. When data contains a list-column, it presents like that (which I find a little frustrating, but still ...).
I'll generate some fake data to demonstrate what I think you actually have:
sm2 <- data.frame(Group.1 = c("1001", "1002", "1003", "1005"))
sm2$x <- list(c(8L), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x
# 1 1001 8
# 2 1002 16, 8
# 3 1003 8
# 4 1005 5.333333, 8.000000
Okay. When we tried strsplit
and even as.character
, things break and are obviously not number-like:
as.character(sm2$x)
# [1] "8" "c(16, 8)" "8"
# [4] "c(5.33333333333333, 8)"
When in fact, all we have to do is just sum them up, because they're already numbers.
sapply(sm2$x, sum)
# [1] 8.00000 24.00000 8.00000 13.33333
If by chance one of the nested things is actually a character
:
sm2$y <- list(c("8"), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x y
# 1 1001 8 8
# 2 1002 16, 8 16, 8
# 3 1003 8 8
# 4 1005 5.333333, 8.000000 5.333333, 8.000000
which will cause our "simple" solution to fail.
sapply(sm2$y, sum)
# Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
Luckily, we can be a bit over-handed and force strings to numbers, and numbers to numbers:
sapply(sm2$y, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
sapply(sm2$x, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
Try this:
sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i)))
sm_agg
# Group.1 x
# 1 1001 8.000000
# 2 1002 24.000000
# 3 1003 8.000000
# 4 1004 16.000000
# 5 1005 13.333333
# 6 1006 4.000000
# 7 1007 4.000000
# 8 1008 4.000000
# 9 1009 5.333333
# 10 1010 13.333333
# 11 1011 12.000000
# 12 1012 5.333333
# 13 1013 13.333333
# 14 1014 8.000000
# 15 1015 5.333333
# 16 1016 5.333333
Explanation:
For a single entry, we split it by one or more commas/spaces:
strsplit(sm_agg$x[2], "[, ]+")
# [[1]]
# [1] "16" "8"
With that, we want to convert to numbers and add, so
as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]])
# [1] 16 8
sum(as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]))
# [1] 24
We want to do that for every element, so we instead feed the
strsplit
output into ansapply
anon-function.
If your frame has factor
s instead of strings, then instead use
sapply(strsplit(as.character(sm_agg$x), "[ ,]+"), function(i) sum(as.numeric(i)))
Last Edit
I think your data is actually an embedded list
. When data contains a list-column, it presents like that (which I find a little frustrating, but still ...).
I'll generate some fake data to demonstrate what I think you actually have:
sm2 <- data.frame(Group.1 = c("1001", "1002", "1003", "1005"))
sm2$x <- list(c(8L), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x
# 1 1001 8
# 2 1002 16, 8
# 3 1003 8
# 4 1005 5.333333, 8.000000
Okay. When we tried strsplit
and even as.character
, things break and are obviously not number-like:
as.character(sm2$x)
# [1] "8" "c(16, 8)" "8"
# [4] "c(5.33333333333333, 8)"
When in fact, all we have to do is just sum them up, because they're already numbers.
sapply(sm2$x, sum)
# [1] 8.00000 24.00000 8.00000 13.33333
If by chance one of the nested things is actually a character
:
sm2$y <- list(c("8"), c(16L,8L), c(8L), c(16/3, 8))
sm2
# Group.1 x y
# 1 1001 8 8
# 2 1002 16, 8 16, 8
# 3 1003 8 8
# 4 1005 5.333333, 8.000000 5.333333, 8.000000
which will cause our "simple" solution to fail.
sapply(sm2$y, sum)
# Error in FUN(X[[i]], ...) : invalid 'type' (character) of argument
Luckily, we can be a bit over-handed and force strings to numbers, and numbers to numbers:
sapply(sm2$y, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
sapply(sm2$x, function(i) sum(as.numeric(i)))
# [1] 8.00000 24.00000 8.00000 13.33333
edited Nov 10 at 6:36
answered Nov 10 at 5:13
r2evans
25.3k32856
25.3k32856
when i apply this i got error > as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]) Error in strsplit(sm_agg$x[2], "[, ]+") : non-character argument > sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i))) Error in strsplit(sm_agg$x, "[ ,]+") : non-character argument >
– Kiran Pg
Nov 10 at 5:33
1
You havefactor
s in your data. In the future, presenting your data in an unambiguous way such asdput(head(sm_agg))
would have prevented this. See my edit for the workaround.
– r2evans
Nov 10 at 5:38
>sapply(strsplit(as.character(sm_agg$x), "[,]+"), function(i) sum(as.numeric(i))) [1] 8.000000 NA 8.000000 16.000000 NA 4.000000 4.000000 4.000000 5.333333 NA [11] NA 5.333333 NA 8.000000 5.333333 5.333333
– Kiran Pg
Nov 10 at 5:57
You forgot the space in the pattern. That space is not an accident or a "code style preference". If you want to be even clearer, replace"[ ,]+"
(notice the space, total of five chars within the quotes) with"[\s,]+"
(no space here, the\s
includes spaces and tabs).
– r2evans
Nov 10 at 6:05
1
You can accept it by clicking on the check mark next to vote button.
– Ronak Shah
Nov 10 at 11:50
|
show 4 more comments
when i apply this i got error > as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]) Error in strsplit(sm_agg$x[2], "[, ]+") : non-character argument > sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i))) Error in strsplit(sm_agg$x, "[ ,]+") : non-character argument >
– Kiran Pg
Nov 10 at 5:33
1
You havefactor
s in your data. In the future, presenting your data in an unambiguous way such asdput(head(sm_agg))
would have prevented this. See my edit for the workaround.
– r2evans
Nov 10 at 5:38
>sapply(strsplit(as.character(sm_agg$x), "[,]+"), function(i) sum(as.numeric(i))) [1] 8.000000 NA 8.000000 16.000000 NA 4.000000 4.000000 4.000000 5.333333 NA [11] NA 5.333333 NA 8.000000 5.333333 5.333333
– Kiran Pg
Nov 10 at 5:57
You forgot the space in the pattern. That space is not an accident or a "code style preference". If you want to be even clearer, replace"[ ,]+"
(notice the space, total of five chars within the quotes) with"[\s,]+"
(no space here, the\s
includes spaces and tabs).
– r2evans
Nov 10 at 6:05
1
You can accept it by clicking on the check mark next to vote button.
– Ronak Shah
Nov 10 at 11:50
when i apply this i got error > as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]) Error in strsplit(sm_agg$x[2], "[, ]+") : non-character argument > sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i))) Error in strsplit(sm_agg$x, "[ ,]+") : non-character argument >
– Kiran Pg
Nov 10 at 5:33
when i apply this i got error > as.numeric(strsplit(sm_agg$x[2], "[, ]+")[[1]]) Error in strsplit(sm_agg$x[2], "[, ]+") : non-character argument > sm_agg$x <- sapply(strsplit(sm_agg$x, "[ ,]+"), function(i) sum(as.numeric(i))) Error in strsplit(sm_agg$x, "[ ,]+") : non-character argument >
– Kiran Pg
Nov 10 at 5:33
1
1
You have
factor
s in your data. In the future, presenting your data in an unambiguous way such as dput(head(sm_agg))
would have prevented this. See my edit for the workaround.– r2evans
Nov 10 at 5:38
You have
factor
s in your data. In the future, presenting your data in an unambiguous way such as dput(head(sm_agg))
would have prevented this. See my edit for the workaround.– r2evans
Nov 10 at 5:38
>sapply(strsplit(as.character(sm_agg$x), "[,]+"), function(i) sum(as.numeric(i))) [1] 8.000000 NA 8.000000 16.000000 NA 4.000000 4.000000 4.000000 5.333333 NA [11] NA 5.333333 NA 8.000000 5.333333 5.333333
– Kiran Pg
Nov 10 at 5:57
>sapply(strsplit(as.character(sm_agg$x), "[,]+"), function(i) sum(as.numeric(i))) [1] 8.000000 NA 8.000000 16.000000 NA 4.000000 4.000000 4.000000 5.333333 NA [11] NA 5.333333 NA 8.000000 5.333333 5.333333
– Kiran Pg
Nov 10 at 5:57
You forgot the space in the pattern. That space is not an accident or a "code style preference". If you want to be even clearer, replace
"[ ,]+"
(notice the space, total of five chars within the quotes) with "[\s,]+"
(no space here, the \s
includes spaces and tabs).– r2evans
Nov 10 at 6:05
You forgot the space in the pattern. That space is not an accident or a "code style preference". If you want to be even clearer, replace
"[ ,]+"
(notice the space, total of five chars within the quotes) with "[\s,]+"
(no space here, the \s
includes spaces and tabs).– r2evans
Nov 10 at 6:05
1
1
You can accept it by clicking on the check mark next to vote button.
– Ronak Shah
Nov 10 at 11:50
You can accept it by clicking on the check mark next to vote button.
– Ronak Shah
Nov 10 at 11:50
|
show 4 more comments
up vote
1
down vote
We can use separate_rows
to separate comma-separated enteries into different rows and then sum
by group.
library(tidyverse)
df %>%
separate_rows(x, sep = ",") %>%
group_by(Group.1) %>%
summarise(x = sum(as.numeric(x)))
# Group.1 x
# <dbl> <dbl>
# 1 1001 8
# 2 1002 24
# 3 1003 8
# 4 1004 16
# 5 1005 13.3
# 6 1006 4
# 7 1007 4
# 8 1008 4
# 9 1009 5.33
#10 1010 13.3
#11 1011 12
#12 1012 5.33
#13 1013 13.3
#14 1014 8
#15 1015 5.33
#16 1016 5.33
data
df <- structure(list(Group.1 = c(1001, 1002, 1003, 1004, 1005, 1006,
1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016),
x = structure(c(5L, 7L, 5L, 6L, 10L, 2L, 1L, 1L, 9L, 11L,
4L, 8L, 10L, 3L, 8L, 8L), .Label = c(" 4",
" 4", " 8", " 8, 4",
" 8", " 16", " 16, 8", " 5.33333333333333",
" 5.33333333333333", " 5.33333333333333, 8", " 8, 5.33333333333333"
), class = "factor")), .Names = c("Group.1", "x"), class =
"data.frame", row.names = c(NA,
-16L))
add a comment |
up vote
1
down vote
We can use separate_rows
to separate comma-separated enteries into different rows and then sum
by group.
library(tidyverse)
df %>%
separate_rows(x, sep = ",") %>%
group_by(Group.1) %>%
summarise(x = sum(as.numeric(x)))
# Group.1 x
# <dbl> <dbl>
# 1 1001 8
# 2 1002 24
# 3 1003 8
# 4 1004 16
# 5 1005 13.3
# 6 1006 4
# 7 1007 4
# 8 1008 4
# 9 1009 5.33
#10 1010 13.3
#11 1011 12
#12 1012 5.33
#13 1013 13.3
#14 1014 8
#15 1015 5.33
#16 1016 5.33
data
df <- structure(list(Group.1 = c(1001, 1002, 1003, 1004, 1005, 1006,
1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016),
x = structure(c(5L, 7L, 5L, 6L, 10L, 2L, 1L, 1L, 9L, 11L,
4L, 8L, 10L, 3L, 8L, 8L), .Label = c(" 4",
" 4", " 8", " 8, 4",
" 8", " 16", " 16, 8", " 5.33333333333333",
" 5.33333333333333", " 5.33333333333333, 8", " 8, 5.33333333333333"
), class = "factor")), .Names = c("Group.1", "x"), class =
"data.frame", row.names = c(NA,
-16L))
add a comment |
up vote
1
down vote
up vote
1
down vote
We can use separate_rows
to separate comma-separated enteries into different rows and then sum
by group.
library(tidyverse)
df %>%
separate_rows(x, sep = ",") %>%
group_by(Group.1) %>%
summarise(x = sum(as.numeric(x)))
# Group.1 x
# <dbl> <dbl>
# 1 1001 8
# 2 1002 24
# 3 1003 8
# 4 1004 16
# 5 1005 13.3
# 6 1006 4
# 7 1007 4
# 8 1008 4
# 9 1009 5.33
#10 1010 13.3
#11 1011 12
#12 1012 5.33
#13 1013 13.3
#14 1014 8
#15 1015 5.33
#16 1016 5.33
data
df <- structure(list(Group.1 = c(1001, 1002, 1003, 1004, 1005, 1006,
1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016),
x = structure(c(5L, 7L, 5L, 6L, 10L, 2L, 1L, 1L, 9L, 11L,
4L, 8L, 10L, 3L, 8L, 8L), .Label = c(" 4",
" 4", " 8", " 8, 4",
" 8", " 16", " 16, 8", " 5.33333333333333",
" 5.33333333333333", " 5.33333333333333, 8", " 8, 5.33333333333333"
), class = "factor")), .Names = c("Group.1", "x"), class =
"data.frame", row.names = c(NA,
-16L))
We can use separate_rows
to separate comma-separated enteries into different rows and then sum
by group.
library(tidyverse)
df %>%
separate_rows(x, sep = ",") %>%
group_by(Group.1) %>%
summarise(x = sum(as.numeric(x)))
# Group.1 x
# <dbl> <dbl>
# 1 1001 8
# 2 1002 24
# 3 1003 8
# 4 1004 16
# 5 1005 13.3
# 6 1006 4
# 7 1007 4
# 8 1008 4
# 9 1009 5.33
#10 1010 13.3
#11 1011 12
#12 1012 5.33
#13 1013 13.3
#14 1014 8
#15 1015 5.33
#16 1016 5.33
data
df <- structure(list(Group.1 = c(1001, 1002, 1003, 1004, 1005, 1006,
1007, 1008, 1009, 1010, 1011, 1012, 1013, 1014, 1015, 1016),
x = structure(c(5L, 7L, 5L, 6L, 10L, 2L, 1L, 1L, 9L, 11L,
4L, 8L, 10L, 3L, 8L, 8L), .Label = c(" 4",
" 4", " 8", " 8, 4",
" 8", " 16", " 16, 8", " 5.33333333333333",
" 5.33333333333333", " 5.33333333333333, 8", " 8, 5.33333333333333"
), class = "factor")), .Names = c("Group.1", "x"), class =
"data.frame", row.names = c(NA,
-16L))
answered Nov 10 at 5:46
Ronak Shah
30.4k103753
30.4k103753
add a comment |
add a comment |
up vote
0
down vote
using this method i get asnwer sapply(sm2$y, function(i) sum(as.numeric(i)))
add a comment |
up vote
0
down vote
using this method i get asnwer sapply(sm2$y, function(i) sum(as.numeric(i)))
add a comment |
up vote
0
down vote
up vote
0
down vote
using this method i get asnwer sapply(sm2$y, function(i) sum(as.numeric(i)))
using this method i get asnwer sapply(sm2$y, function(i) sum(as.numeric(i)))
edited Nov 10 at 10:32
answered Nov 10 at 10:04
Kiran Pg
228
228
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53236148%2fhow-can-i-calculate-the-sum-of-comma-separated-in-the-2nd-column%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown