Why when I run two instances of Elasticsearch does my index rate nearly double?

When I run one instance of Elasticsearch, I can index at ~6,000 EPS. On the same server, I start another instance of Elasticsearch, join it to the cluster, and my index speed increases to ~10,000. In other words, a single instance of Elasticsearch does NOT utilize all of the CPU or disk IO that the server has available. Even when running two, not all resources are utilized. It seems like there is some sort of throttling somewhere and I'd like to change it. The primary use of this node will be indexing.

Single ES on server:
~6,000 EPS

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          29.45    0.00    3.87    6.26    0.00   60.43

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00   733.13    0.00  800.60     0.00     6.48    16.59     1.75    2.19    0.00    2.19   0.89  71.22

Dual ES on server:
~10,000 EPS

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          52.87    0.00    5.22    5.41    0.00   36.49

Device:         rrqm/s   wrqm/s     r/s     w/s    rMB/s    wMB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.00  1076.40    0.00  989.40     0.00     9.75    20.18     2.15    2.17    0.00    2.17   0.89  88.32
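
To put the two snapshots side by side, here is a small Python sketch that computes how much each figure grew between the single-instance and dual-instance runs. The numbers are copied from the output above; the dictionary keys and the function name `growth` are only illustrative.

```python
# Figures copied from the iostat output and the reported index rates.
single = {
    "eps": 6000,
    "cpu_user_pct": 29.45,
    "writes_per_s": 800.60,
    "write_mb_s": 6.48,
    "disk_util_pct": 71.22,
}
dual = {
    "eps": 10000,
    "cpu_user_pct": 52.87,
    "writes_per_s": 989.40,
    "write_mb_s": 9.75,
    "disk_util_pct": 88.32,
}


def growth(before, after):
    """Return after/before for every metric, rounded to two decimals."""
    return {name: round(after[name] / before[name], 2) for name in before}


ratios = growth(single, dual)
for name, ratio in ratios.items():
    print(f"{name}: x{ratio}")
```

By this arithmetic the index rate grew by about 1.67x and user CPU by about 1.8x, while the number of writes per second grew by only about 1.24x.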

Maybe useful notes:

Both ES instances are stock ES installs with the only change being to increase the JVM size.

I have TBs of logs that look like the below:

{"timestamp":"1541290120","computername":"somenamehere","type":"server","owner":"somenamehere"}

Disks are SSDs in software raid0. A FIO 512B write test shows IOPS=46.4k, BW=22.7MiB/s and for 4k, IOPS=46.1k, BW=180MiB/s.
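
As a quick sanity check on the FIO figures, bandwidth should be roughly IOPS times block size. The sketch below redoes that arithmetic in Python with the numbers quoted above; the function name `bandwidth_mib_s` is just for illustration.

```python
def bandwidth_mib_s(iops, block_bytes):
    """Bandwidth implied by an IOPS figure at a given block size, in MiB/s."""
    return iops * block_bytes / (1024 * 1024)


fio_512 = bandwidth_mib_s(46_400, 512)   # FIO reported 22.7 MiB/s
fio_4k = bandwidth_mib_s(46_100, 4096)   # FIO reported 180 MiB/s

print(f"512B: {fio_512:.1f} MiB/s, 4k: {fio_4k:.1f} MiB/s")
```

Both results line up with the reported bandwidth, so the FIO numbers are internally consistent.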

I use Logstash to read the logs from files and send them to ES. The document ID is generated within Logstash. Logstash is the stock tar.gz install with the default logstash.yml, apart from the X-Pack monitoring settings.

I provide a mapping up front but it isn't static.

System swap is turned off.

Index refresh_interval is 90s.

number_of_replicas is set to 0.

_node stats shows "total_indexing_buffer": 1062404096

The index rate is as reported by Kibana X-Pack monitoring.

Elasticsearch 6.4.2 and Logstash 6.4.2

Is there a limiter somewhere that I need to change?

edited Nov 9 at 15:38

asked Nov 9 at 15:29

helpzmepleasekthxbye


elasticsearch

1 Answer

First, please have a look at my answer from yesterday, where I explain how indexing works: ElasticSearch - How does sharding affect indexing performance?

Have you already seen the guides on tuning for indexing speed (https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html) and tuning for disk usage (https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-disk-usage.html)?

I don't know the shard count for your index, but lowering the number of primary shards can improve your indexing speed.
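
As a rough sketch of what a lower primary shard count looks like in practice, the Python snippet below builds an index-creation body with the replica and refresh settings mentioned in the question. The index name `logs-test` and the helper `index_settings` are hypothetical, and the shard count of 1 is only an example value to test with, not a measured recommendation.

```python
import json


def index_settings(shards, replicas=0, refresh_interval="90s"):
    """Build the JSON body for creating an index with the given settings."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
                "refresh_interval": refresh_interval,
            }
        }
    }


body = json.dumps(index_settings(shards=1), indent=2)
# Send this as the body of: PUT /logs-test  (index name is hypothetical)
print(body)
```

Note that number_of_shards is fixed when an index is created, so trying a different value means creating a new index and indexing into that one.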

answered Nov 9 at 20:13

ibexit


Thank you for the response. Could you please connect the dots for me on this one? Your response tells me to change my sharding to increase indexing speed, and based on your links and your other answers, this all sounds very reasonable. However, I'm really trying to understand why running two ES nodes on the same server nearly doubled my indexing speed. I assume that if I change my shards from 5 to 1, indexing speed will increase, and I wonder whether running two ES nodes on the same server would then still nearly double it, as in my original question.
– helpzmepleasekthxbye
Nov 9 at 22:35

You generate the ID in Logstash, so the distribution of the docs across the shards may not be optimal, which leads to write waits. Fewer shards, better distribution. Consider using ES's built-in ID generation; if you need to keep a special ID, just introduce a legacy_id field in your doc. Have you tried reducing the shard count? What was the outcome? Can you share your ES config? Please provide your cluster settings too.
– ibexit
Nov 10 at 20:31

And what is the CPU setup on this particular host? Do you have any custom plugins enabled in your ES?
– ibexit
Nov 10 at 20:48
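
To illustrate the suggestion about letting ES generate the ID, here is a minimal Python sketch that builds a bulk request body in which the action line carries no _id and the original identifier travels inside the document as legacy_id. The helper `bulk_lines`, the index name `logs-test`, the document type `doc`, and the value `abc123` are all assumptions made for the example; the asker's real pipeline is Logstash, not Python.

```python
import json


def bulk_lines(docs, index, doc_type="doc", id_field=None):
    """Build an NDJSON bulk body.

    If id_field is None, no _id is sent, so Elasticsearch assigns one itself.
    Otherwise the value of that field is used as the document _id.
    """
    lines = []
    for doc in docs:
        action = {"_index": index, "_type": doc_type}
        if id_field is not None:
            action["_id"] = doc[id_field]
        lines.append(json.dumps({"index": action}))
        lines.append(json.dumps(doc))
    # The bulk API expects every line, including the last, to end in a newline.
    return "\n".join(lines) + "\n"


docs = [
    {
        "timestamp": "1541290120",
        "computername": "somenamehere",
        "type": "server",
        "owner": "somenamehere",
        "legacy_id": "abc123",  # hypothetical value, kept as a normal field
    }
]

payload = bulk_lines(docs, index="logs-test")  # no _id: ES generates it
print(payload)
```

Passing `id_field="legacy_id"` instead reproduces the current behaviour of supplying the ID from the client side, which makes it easy to compare the two variants in a test run.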
