Difference between the two highest numbers in a column in R

I have a data frame like this:

  NUM_TURNO CODIGO_MUNICIPIO SIGLA_PARTIDO     SHARE
1         1            81825           PPB 38.713318
2         1            81825          PMDB 61.286682
3         1            09717          PMDB 48.025900
4         1            09717            PL  1.279217
5         1            09717           PFL 50.694883
6         1            61921          PMDB 51.793868

This is a data.frame of elections in Brazil. Grouping by NUM_TURNO and CODIGO_MUNICIPIO, I want to compare the SHARE of the first and second most-voted politicians in each city and round (1 or 2) and store the result in a new column.

My problem is that I don't know how to calculate the difference for only the two largest SHARE values in each group.

For the first case, for example, I want something that gives me the difference 61.286682 - 38.713318 = 22.573364, and so on.

Something like this (pseudocode):

df %>%
    group_by(NUM_TURNO, CODIGO_MUNICIPIO) %>%
    mutate(Diff = HIGHEST SHARE - 2nd HIGHEST SHARE)

edited Nov 8 at 16:48 by Dave2e
asked Nov 8 at 16:43 by Danilo Imbimbo

Something like -diff(sort(SHARE,decreasing=TRUE)[1:2])
– nicola
Nov 8 at 16:47
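
The one-liner in this comment can be applied per group without any extra packages. The following is a minimal base R sketch, not code from the thread: the helper name top_two_gap is hypothetical, the sample data is copied from the question, and the municipality codes are assumed to be character strings so that the leading zero in "09717" is preserved.

```r
# Sample data copied from the question (codes kept as character: an assumption).
df <- data.frame(
  NUM_TURNO        = c(1, 1, 1, 1, 1, 1),
  CODIGO_MUNICIPIO = c("81825", "81825", "09717", "09717", "09717", "61921"),
  SIGLA_PARTIDO    = c("PPB", "PMDB", "PMDB", "PL", "PFL", "PMDB"),
  SHARE            = c(38.713318, 61.286682, 48.025900,
                       1.279217, 50.694883, 51.793868),
  stringsAsFactors = FALSE
)

# Hypothetical helper: gap between the two largest values of x.
# Indexing with [1:2] pads with NA when x has fewer than two values,
# so a single-row group yields NA instead of a zero-length result.
top_two_gap <- function(x) {
  top <- sort(x, decreasing = TRUE)[1:2]
  -diff(top)
}

# ave() applies the helper within each (round, municipality) group and
# recycles the single result to every row of that group.
df$Diff <- ave(df$SHARE, df$NUM_TURNO, df$CODIGO_MUNICIPIO, FUN = top_two_gap)

print(df)
```

Because ave() returns a vector as long as its input, this gives the new column the question asks for rather than one row per group.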


2 Answers

accepted

You could arrange your data frame by SHARE and then slice the first two rows of each group. Then you could use summarise to get the difference between those two values for every group:

library(dplyr)

df %>%
    group_by(NUM_TURNO, CODIGO_MUNICIPIO) %>%
    arrange(desc(SHARE)) %>%          # largest shares first
    slice(1:2) %>%                    # keep the top two rows per group
    summarise(Diff = -diff(SHARE))    # largest minus second largest
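
Since the question asks for a new column rather than a summary table, one possible follow-up is to join the per-group gap back onto the original rows. This is only a sketch under some assumptions: it needs dplyr 1.0.0 or later (for the .groups argument), the sample data is copied from the question with municipality codes assumed to be character, and the names gaps and with_gap are made up for illustration.

```r
library(dplyr)

# Sample data copied from the question (codes kept as character: an assumption).
df <- data.frame(
  NUM_TURNO        = c(1, 1, 1, 1, 1, 1),
  CODIGO_MUNICIPIO = c("81825", "81825", "09717", "09717", "09717", "61921"),
  SIGLA_PARTIDO    = c("PPB", "PMDB", "PMDB", "PL", "PFL", "PMDB"),
  SHARE            = c(38.713318, 61.286682, 48.025900,
                       1.279217, 50.694883, 51.793868),
  stringsAsFactors = FALSE
)

# One row per (round, municipality): largest share minus second largest.
# Groups with a single row get NA instead of a zero-length result.
gaps <- df %>%
  group_by(NUM_TURNO, CODIGO_MUNICIPIO) %>%
  arrange(desc(SHARE), .by_group = TRUE) %>%
  summarise(
    Diff = if (n() >= 2) SHARE[1] - SHARE[2] else NA_real_,
    .groups = "drop"
  )

# Attach the gap to every original row as a new column.
with_gap <- left_join(df, gaps, by = c("NUM_TURNO", "CODIGO_MUNICIPIO"))

print(with_gap)
```

The explicit if/else is a design choice to keep uncontested municipalities in the result with a missing gap, instead of relying on how summarise treats a zero-length value.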

answered Nov 8 at 16:57 by FloSchmo

You can also use top_n from dplyr with grouping and summarizing. Keep in mind that, with the data you provided, you will get an error in summarize if you use diff on a group with a single value (diff then returns a zero-length vector), hence the use of ifelse.

df %>%
  group_by(NUM_TURNO, CODIGO_MUNICIPIO) %>%
  top_n(2, SHARE) %>%
  # top_n keeps the original row order, so abs() makes the gap positive
  summarize(Diff = ifelse(n() == 1, NA, abs(diff(SHARE))))

# A tibble: 3 x 3
# Groups:   NUM_TURNO [?]
  NUM_TURNO CODIGO_MUNICIPIO  Diff
      <dbl>            <dbl> <dbl>
1         1             9717  2.67
2         1            61921 NA   
3         1            81825 22.6
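
In dplyr 1.0.0 and later, top_n is superseded by slice_max, which also lets you decide explicitly what happens with ties. The variant below is a hedged sketch rather than part of the original answer: it assumes dplyr 1.0.0 or later, uses the sample data from the question with municipality codes as character, and drops tied rows beyond the second via with_ties = FALSE, which may or may not be what you want for real election data.

```r
library(dplyr)

# Sample data copied from the question (codes kept as character: an assumption).
df <- data.frame(
  NUM_TURNO        = c(1, 1, 1, 1, 1, 1),
  CODIGO_MUNICIPIO = c("81825", "81825", "09717", "09717", "09717", "61921"),
  SIGLA_PARTIDO    = c("PPB", "PMDB", "PMDB", "PL", "PFL", "PMDB"),
  SHARE            = c(38.713318, 61.286682, 48.025900,
                       1.279217, 50.694883, 51.793868),
  stringsAsFactors = FALSE
)

gaps <- df %>%
  group_by(NUM_TURNO, CODIGO_MUNICIPIO) %>%
  # Keep at most two rows per group: the two largest shares.
  slice_max(SHARE, n = 2, with_ties = FALSE) %>%
  # max - min does not depend on the row order of the two kept rows.
  summarise(
    Diff = if (n() == 2) max(SHARE) - min(SHARE) else NA_real_,
    .groups = "drop"
  )

print(gaps)
```

Using max(SHARE) - min(SHARE) on the two remaining rows avoids any dependence on how the rows happen to be ordered.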

answered Nov 8 at 17:13 by Jake Kaupp
