Why does my index rate nearly double when I run two instances of Elasticsearch?
When I run one instance of Elasticsearch, I can index at ~6,000 EPS. If I start a second instance on the same server and join it to the cluster, my index rate increases to ~10,000 EPS. In other words, a single instance of Elasticsearch does NOT use all of the CPU or disk I/O the server has available, and even with two instances running, not all resources are utilized. It seems like there is some throttle somewhere, and I'd like to change it. The primary use of this node will be indexing.
Single ES on server:
~6000 EPS
avg-cpu: %user %nice %system %iowait %steal %idle
29.45 0.00 3.87 6.26 0.00 60.43
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 733.13 0.00 800.60 0.00 6.48 16.59 1.75 2.19 0.00 2.19 0.89 71.22
Dual ES on server:
~10,000 EPS
avg-cpu: %user %nice %system %iowait %steal %idle
52.87 0.00 5.22 5.41 0.00 36.49
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 1076.40 0.00 989.40 0.00 9.75 20.18 2.15 2.17 0.00 2.17 0.89 88.32
Possibly useful notes:
- Both ES instances are stock installs; the only change was to increase the JVM heap size.
- I have TBs of logs that look like the below:
{"timestamp":"1541290120","computername":"somenamehere","type":"server","owner":"somenamehere"}
- Disks are SSDs in software RAID 0. A fio write test shows IOPS=46.4k, BW=22.7MiB/s at 512B, and IOPS=46.1k, BW=180MiB/s at 4k.
- I use Logstash to read events from files and send them to ES. The document ID is created within Logstash. Stock tar.gz Logstash YAML config, excluding the X-Pack monitoring config.
- I provide a mapping up front, but it isn't static.
- System swap is turned off.
- The index refresh_interval is 90s and number_of_replicas is 0 (see the settings sketch after this list).
- _nodes/stats shows "total_indexing_buffer": 1062404096.
- The index rate is as reported by Kibana X-Pack monitoring.
- Elasticsearch 6.4.2 and Logstash 6.4.2.
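For reference, those non-default index settings amount to something like this (a sketch; the index name myindex and host localhost:9200 are placeholders):

    curl -XPUT 'localhost:9200/myindex/_settings' -H 'Content-Type: application/json' -d '
    {
      "index": {
        "refresh_interval": "90s",
        "number_of_replicas": 0
      }
    }'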
Is there a limiter somewhere that I need to change?
elasticsearch
asked Nov 9 at 15:29, edited Nov 9 at 15:38 – helpzmepleasekthxbye
1 Answer
First, please have a look at my answer from yesterday, where I explain how indexing works: ElasticSearch - How does sharding affect indexing performance?.
Have you already seen the indexing-speed tuning guide (https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html) and the disk-usage tuning guide (https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-disk-usage.html)?
I don't know the shard count for your index, but lowering the number of primary shards can improve your indexing speed; a sketch follows below.
answered Nov 9 at 20:13 – ibexit
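For illustration, creating an index with a single primary shard would look roughly like this (a sketch; myindex is a placeholder, and note that number_of_shards can only be set at index creation time):

    curl -XPUT 'localhost:9200/myindex' -H 'Content-Type: application/json' -d '
    {
      "settings": {
        "index": {
          "number_of_shards": 1,
          "number_of_replicas": 0
        }
      }
    }'

Fewer primary shards means fewer Lucene writers and indexing buffers competing on the same node, at the cost of less parallelism per index.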
Thank you for the response. Could you connect the dots for me on this one? Your response tells me to change my sharding to increase indexing speed, and based on your links and your other answers, that all sounds very reasonable. However, I'm really trying to understand why running two ES nodes on the same server nearly doubled my indexing speed. I assume that if I change my shards from 5 to 1, indexing speed will increase, and if I then run two ES nodes on the same server, I wonder whether it will again nearly double, as presented in my original question.
– helpzmepleasekthxbye
Nov 9 at 22:35
Since you generate the ID in Logstash, the distribution of the docs across the shards may not be optimal, which leads to write waits: fewer shards, better distribution. Consider using ES's built-in ID generation; if you need a special ID, just introduce a legacy_id field in your doc (a sketch follows below). Have you tried reducing the shard count? What was the outcome? Can you share your ES config? Please provide your cluster settings too.
– ibexit
Nov 10 at 20:31
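A minimal sketch of that suggestion in Logstash config terms (the docid field name, index name, and host are assumptions, not from the thread): rename the generated ID into a regular field and omit document_id so Elasticsearch auto-generates the _id.

    filter {
      # Keep the application ID as a searchable field instead of using it as _id.
      mutate {
        rename => { "docid" => "legacy_id" }
      }
    }
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
        index => "myindex"
        # No document_id option: Elasticsearch generates index-friendly _ids itself.
      }
    }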
And what is the CPU setup on this particular host? Do you have any custom plugins enabled in your ES?
– ibexit
Nov 10 at 20:48
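One way to check for the kind of limiter the question asks about is the write thread pool (so named since Elasticsearch 6.3): a persistently full queue or a growing rejected count means indexing is bottlenecked there. A hedged example, assuming the node listens on localhost:9200:

    curl -s 'localhost:9200/_cat/thread_pool/write?v&h=node_name,name,active,queue,rejected'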