Why when I run two instances of Elasticsearch does my index rate nearly double?











When I run one instance of Elasticsearch, I can index at ~6,000 EPS. On the same server, I start another instance of Elasticsearch, join it to the cluster, and my index speed increases to ~10,000. In other words, a single instance of Elasticsearch does NOT utilize all of the CPU or disk IO that the server has available. Even when running two, not all resources are utilized. It seems like there is some sort of throttling somewhere and I'd like to change it. The primary use of this node will be indexing.



Single ES on server:
~6000 EPS



avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          29.45    0.00    3.87    6.26    0.00   60.43

Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sdb          0.00   733.13     0.00   800.60     0.00     6.48    16.59     1.75     2.19     0.00     2.19     0.89    71.22


Dual ES on server:
~10,000 EPS



avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          52.87    0.00    5.22    5.41    0.00   36.49

Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sdb          0.00  1076.40     0.00   989.40     0.00     9.75    20.18     2.15     2.17     0.00     2.17     0.89    88.32


Maybe useful notes:




  • Both ES instances are stock ES installs with the only change being to increase the JVM size.

  • I have TBs of logs that look like the below:


{"timestamp":"1541290120","computername":"somenamehere","type":"server","owner":"somenamehere"}




  • Disks are SSDs in software raid0. A FIO 512B write test shows IOPS=46.4k, BW=22.7MiB/s and for 4k, IOPS=46.1k, BW=180MiB/s.

  • I use Logstash to read the logs from files and send them to ES. The document ID is created within Logstash. The Logstash config is the stock tar.gz logstash.yml, apart from the X-Pack monitoring settings.

  • I provide a mapping up front but it isn't static.

  • System swap is turned off.

  • Index refresh_interval is 90s.

  • number_of_replicas is set to 0.

  • _node stats shows "total_indexing_buffer": 1062404096

  • Index rate is as reported by Kibana X-Pack monitoring.

  • Elasticsearch 6.4.2 and Logstash 6.4.2


Is there a limiter somewhere that I need to change?
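
For reference, the ratios implied by the two measurements above can be worked out directly. This is a minimal sketch that uses only the figures quoted in the question (the dual-instance rate is taken as ~10,000 EPS); the dictionary keys and function names are made up for illustration.

```python
# Figures copied from the question's iostat samples and Kibana index rates.
# The dual-instance rate is taken as ~10,000 EPS, as stated in the text.
single = {"eps": 6000, "cpu_idle": 60.43, "disk_util": 71.22}
dual = {"eps": 10000, "cpu_idle": 36.49, "disk_util": 88.32}


def cpu_busy(sample: dict) -> float:
    """CPU busy percentage, i.e. 100 minus iostat's %idle."""
    return 100.0 - sample["cpu_idle"]


def ratio(after: float, before: float) -> float:
    """How many times larger `after` is than `before`."""
    return after / before


if __name__ == "__main__":
    print(f"throughput ratio: {ratio(dual['eps'], single['eps']):.2f}x")
    print(f"CPU busy: {cpu_busy(single):.2f}% -> {cpu_busy(dual):.2f}%")
    print(f"disk %util: {single['disk_util']}% -> {dual['disk_util']}%")
```

On these numbers the second instance raises throughput by roughly two thirds (about 1.67x) rather than doubling it, while CPU use and disk utilization both rise as well.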










      elasticsearch






      edited Nov 9 at 15:38

























      asked Nov 9 at 15:29









      helpzmepleasekthxbye

          1 Answer
          First, please have a look at the answer I wrote yesterday, where I explain how indexing works: ElasticSearch - How does sharding affect indexing performance?



          Have you already seen the guides on tuning for indexing speed (https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html) and tuning for disk usage (https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-disk-usage.html)?



          I don't know the shard count for your index, but lowering the number of primary shards can improve your indexing speed.
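
          As a rough illustration of the suggestion above, the sketch below builds a create-index request body with a single primary shard. The settings `number_of_shards`, `number_of_replicas`, and `refresh_interval` are standard index settings; the replica count and refresh interval are copied from the question, while the helper name `build_index_settings` and the index name `logs-test` are hypothetical. Whether one shard is actually faster on this hardware is something to measure, not a given.

```python
import json


def build_index_settings(primary_shards: int = 1) -> dict:
    """Return a create-index request body for a bulk-indexing test.

    Replica count and refresh interval mirror the values in the question.
    """
    if primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": 0,
                "refresh_interval": "90s",
            }
        }
    }


if __name__ == "__main__":
    # Hypothetical index name; the JSON is the body of: PUT /logs-test
    print("PUT /logs-test")
    print(json.dumps(build_index_settings(1), indent=2))
```

          The number of primary shards is fixed when an index is created, so comparing shard counts means creating a fresh index for each test run (or reindexing) rather than updating the existing one.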






          • Thank you for the response. Could you please connect the dots for me on this one? Your response tells me to change my sharding to increase indexing speed, and based on your links and your other answers, this all sounds very reasonable. However, I'm really trying to understand why running two ES nodes on the same server nearly doubled my indexing speed. I assume that if I change my shards from 5 to 1, indexing speed will increase, but I wonder whether running two ES nodes on the same server would then still nearly double it, as in my original question.
            – helpzmepleasekthxbye
            Nov 9 at 22:35












          • You generate the ID in Logstash, so the distribution of the docs across the shards may not be optimal, which leads to write waits. Fewer shards, better distribution. Consider using ES's built-in ID generation; if you need to keep a special ID, just introduce a legacy_id field in your doc. Have you tried reducing the shard count? What was the outcome? Can you share your ES config? Please provide your cluster settings too.
            – ibexit
            Nov 10 at 20:31












          • And what is the CPU setup on this particular host? Do you have any custom plugins enabled in your ES?
            – ibexit
            Nov 10 at 20:48
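
          The distribution concern raised in the comments can be checked against a sample of real IDs. Elasticsearch routes a document with `shard_num = hash(_routing) % num_primary_shards`, where `_routing` defaults to the document `_id`. The sketch below applies that formula to made-up IDs. It uses CRC32 as a stand-in, because Elasticsearch's own Murmur3-based hash is not in the Python standard library, so the per-shard counts are only indicative. The ID format and the function names are assumptions for illustration.

```python
import zlib
from collections import Counter
from typing import Iterable


def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Pick a shard as hash(_routing) % num_primary_shards.

    CRC32 is a stand-in hash here; Elasticsearch itself uses Murmur3.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def shard_histogram(doc_ids: Iterable[str], num_primary_shards: int) -> Counter:
    """Count how many of the given IDs land on each shard."""
    counts = Counter({shard: 0 for shard in range(num_primary_shards)})
    for doc_id in doc_ids:
        counts[shard_for(doc_id, num_primary_shards)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical IDs: computer name plus timestamp, as Logstash might build them.
    sample_ids = [f"somenamehere-{1541290120 + i}" for i in range(10_000)]
    for shards in (5, 1):
        print(shards, "shard(s):", dict(shard_histogram(sample_ids, shards)))
```

          Feeding in a sample of the IDs that Logstash actually generates shows whether any shard receives a clearly larger share; with a single primary shard every document lands on shard 0 by construction.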











          answered Nov 9 at 20:13









          ibexit
