mysql database composite key issues

up vote
1
down vote

favorite

We have a database with a data table in it defined as follows:

CREATE TABLE data

(

msts BIGINT,

variable_id INT,

p_id INT,

value DOUBLE,

PRIMARY KEY(msts,variable_id,p_id)

);



ALTER TABLE data

ADD    FOREIGN KEY (p_id)

REFERENCES p(id);



ALTER TABLE data

ADD    FOREIGN KEY (variable_id)

REFERENCES variables(id);

This table can contain billions of records.

When doing a simple query:

SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (

msts<1535886000000000000 AND msts>1535796060000000000  AND  

variable_id=107  AND p_id=661 );

yeilding:

+-------------+

| COUNT(msts) |

+-------------+

|       89873 |

+-------------+

1 row in set (42.51 sec)

takes 42.51 seconds to count 89873.

Why should it take so long seeing as the primary key should act as a composite index?

This is the explain:

EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where 

( msts<1535886000000000000 AND msts>1535796060000000000  AND 

variable_id=107  AND plant_id=661 );

which gives:

rows = 190996998

filtered=0

ref=NULL

type=range

Any help would be much appreciated!

edited Nov 9 at 16:21

O. Jones

58.7k971106

asked Nov 9 at 15:45

kitt91

Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47

Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12

add a comment |

up vote
1
down vote

favorite

We have a database with a data table in it defined as follows:

CREATE TABLE data

(

msts BIGINT,

variable_id INT,

p_id INT,

value DOUBLE,

PRIMARY KEY(msts,variable_id,p_id)

);



ALTER TABLE data

ADD    FOREIGN KEY (p_id)

REFERENCES p(id);



ALTER TABLE data

ADD    FOREIGN KEY (variable_id)

REFERENCES variables(id);

This table can contain billions of records.

When doing a simple query:

SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (

msts<1535886000000000000 AND msts>1535796060000000000  AND  

variable_id=107  AND p_id=661 );

yeilding:

+-------------+

| COUNT(msts) |

+-------------+

|       89873 |

+-------------+

1 row in set (42.51 sec)

takes 42.51 seconds to count 89873.

Why should it take so long seeing as the primary key should act as a composite index?

This is the explain:

EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where 

( msts<1535886000000000000 AND msts>1535796060000000000  AND 

variable_id=107  AND plant_id=661 );

which gives:

rows = 190996998

filtered=0

ref=NULL

type=range

Any help would be much appreciated!

edited Nov 9 at 16:21

O. Jones

58.7k971106

asked Nov 9 at 15:45

kitt91

Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47

Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12

add a comment |

up vote
1
down vote

favorite

We have a database with a data table in it defined as follows:

CREATE TABLE data

(

msts BIGINT,

variable_id INT,

p_id INT,

value DOUBLE,

PRIMARY KEY(msts,variable_id,p_id)

);



ALTER TABLE data

ADD    FOREIGN KEY (p_id)

REFERENCES p(id);



ALTER TABLE data

ADD    FOREIGN KEY (variable_id)

REFERENCES variables(id);

This table can contain billions of records.

When doing a simple query:

SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (

msts<1535886000000000000 AND msts>1535796060000000000  AND  

variable_id=107  AND p_id=661 );

yeilding:

+-------------+

| COUNT(msts) |

+-------------+

|       89873 |

+-------------+

1 row in set (42.51 sec)

takes 42.51 seconds to count 89873.

Why should it take so long seeing as the primary key should act as a composite index?

This is the explain:

EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where 

( msts<1535886000000000000 AND msts>1535796060000000000  AND 

variable_id=107  AND plant_id=661 );

which gives:

rows = 190996998

filtered=0

ref=NULL

type=range

Any help would be much appreciated!

edited Nov 9 at 16:21

O. Jones

58.7k971106

asked Nov 9 at 15:45

kitt91

We have a database with a data table in it defined as follows:

CREATE TABLE data

(

msts BIGINT,

variable_id INT,

p_id INT,

value DOUBLE,

PRIMARY KEY(msts,variable_id,p_id)

);



ALTER TABLE data

ADD    FOREIGN KEY (p_id)

REFERENCES p(id);



ALTER TABLE data

ADD    FOREIGN KEY (variable_id)

REFERENCES variables(id);

This table can contain billions of records.

When doing a simple query:

SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (

msts<1535886000000000000 AND msts>1535796060000000000  AND  

variable_id=107  AND p_id=661 );

yeilding:

+-------------+

| COUNT(msts) |

+-------------+

|       89873 |

+-------------+

1 row in set (42.51 sec)

takes 42.51 seconds to count 89873.

Why should it take so long seeing as the primary key should act as a composite index?

This is the explain:

EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where 

( msts<1535886000000000000 AND msts>1535796060000000000  AND 

variable_id=107  AND plant_id=661 );

which gives:

rows = 190996998

filtered=0

ref=NULL

type=range

Any help would be much appreciated!

mysql sql indexing database-indexes

edited Nov 9 at 16:21

O. Jones

58.7k971106

asked Nov 9 at 15:45

kitt91

edited Nov 9 at 16:21

O. Jones

58.7k971106

asked Nov 9 at 15:45

kitt91

edited Nov 9 at 16:21

O. Jones

58.7k971106

edited Nov 9 at 16:21

O. Jones

58.7k971106

edited Nov 9 at 16:21

O. Jones

58.7k971106

asked Nov 9 at 15:45

kitt91

asked Nov 9 at 15:45

kitt91

asked Nov 9 at 15:45

kitt91

Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47

Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12

add a comment |

Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47

Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12

Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47

Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12

add a comment |

1 Answer
1

active

oldest

votes

up vote
2
down vote

accepted

Your query, rewritten to change the order of the where clauses, is this.

SELECT COUNT(msts) 

  from data 

 where variable_id=107 

   and p_id=661

   and msts>1535796060000000000

   and msts<1535886000000000000;

It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts) columns in that order to help satisfy your query quickly.

Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.

Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.

Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.

Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.

Pro tip 3: Forcing the use of an index is almost always a bad choice.

Pro tip 4: Read https://use-the-index-luke.com/

Edit: You asked how to do the conversion.

If your table does not yet contain many millions of rows, just change the primary key definition like this.

ALTER TABLE data 

       DROP PRIMARY KEY, 

       ADD PRIMARY KEY (variable_id, p_id, msts);

If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old or something, and rename the new table to data. This may be a complex task involving batches of data; if you can't figure it out ask another question.

edited Nov 9 at 23:11

answered Nov 9 at 16:18

O. Jones

58.7k971106

Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37

@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27

@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16

@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53228949%2fmysql-database-composite-key-issues%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
2
down vote

accepted

Your query, rewritten to change the order of the where clauses, is this.

SELECT COUNT(msts) 

  from data 

 where variable_id=107 

   and p_id=661

   and msts>1535796060000000000

   and msts<1535886000000000000;

Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.

Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.

Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.

Pro tip 3: Forcing the use of an index is almost always a bad choice.

Pro tip 4: Read https://use-the-index-luke.com/

Edit: You asked how to do the conversion.

If your table does not yet contain many millions of rows, just change the primary key definition like this.

ALTER TABLE data 

       DROP PRIMARY KEY, 

       ADD PRIMARY KEY (variable_id, p_id, msts);

edited Nov 9 at 23:11

answered Nov 9 at 16:18

O. Jones

58.7k971106

Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37

@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27

@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16

@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17

add a comment |

up vote
2
down vote

accepted

Your query, rewritten to change the order of the where clauses, is this.

SELECT COUNT(msts) 

  from data 

 where variable_id=107 

   and p_id=661

   and msts>1535796060000000000

   and msts<1535886000000000000;

Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.

Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.

Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.

Pro tip 3: Forcing the use of an index is almost always a bad choice.

Pro tip 4: Read https://use-the-index-luke.com/

Edit: You asked how to do the conversion.

If your table does not yet contain many millions of rows, just change the primary key definition like this.

ALTER TABLE data 

       DROP PRIMARY KEY, 

       ADD PRIMARY KEY (variable_id, p_id, msts);

edited Nov 9 at 23:11

answered Nov 9 at 16:18

O. Jones

58.7k971106

Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37

@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27

@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16

@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17

add a comment |

up vote
2
down vote

accepted

Your query, rewritten to change the order of the where clauses, is this.

SELECT COUNT(msts) 

  from data 

 where variable_id=107 

   and p_id=661

   and msts>1535796060000000000

   and msts<1535886000000000000;

Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.

Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.

Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.

Pro tip 3: Forcing the use of an index is almost always a bad choice.

Pro tip 4: Read https://use-the-index-luke.com/

Edit: You asked how to do the conversion.

If your table does not yet contain many millions of rows, just change the primary key definition like this.

ALTER TABLE data 

       DROP PRIMARY KEY, 

       ADD PRIMARY KEY (variable_id, p_id, msts);

edited Nov 9 at 23:11

answered Nov 9 at 16:18

O. Jones

58.7k971106

Your query, rewritten to change the order of the where clauses, is this.

SELECT COUNT(msts) 

  from data 

 where variable_id=107 

   and p_id=661

   and msts>1535796060000000000

   and msts<1535886000000000000;

Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.

Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.

Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.

Pro tip 3: Forcing the use of an index is almost always a bad choice.

Pro tip 4: Read https://use-the-index-luke.com/

Edit: You asked how to do the conversion.

If your table does not yet contain many millions of rows, just change the primary key definition like this.

ALTER TABLE data 

       DROP PRIMARY KEY, 

       ADD PRIMARY KEY (variable_id, p_id, msts);

edited Nov 9 at 23:11

answered Nov 9 at 16:18

O. Jones

58.7k971106

edited Nov 9 at 23:11

answered Nov 9 at 16:18

O. Jones

58.7k971106

answered Nov 9 at 16:18

O. Jones

58.7k971106

answered Nov 9 at 16:18

O. Jones

58.7k971106

Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37

@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27

@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16

@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17

add a comment |

Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37

@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27

@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16

@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17

Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37

@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27

@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16

@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17

add a comment |

draft saved

draft discarded

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Xtykutl