mysql database composite key issues











up vote
1
down vote

favorite












We have a database with a data table in it defined as follows:



CREATE TABLE data
(
msts BIGINT,
variable_id INT,
p_id INT,
value DOUBLE,
PRIMARY KEY(msts,variable_id,p_id)
);

ALTER TABLE data
ADD FOREIGN KEY (p_id)
REFERENCES p(id);

ALTER TABLE data
ADD FOREIGN KEY (variable_id)
REFERENCES variables(id);


This table can contain billions of records.



When doing a simple query:



SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (
msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND p_id=661 );


yeilding:



+-------------+
| COUNT(msts) |
+-------------+
| 89873 |
+-------------+
1 row in set (42.51 sec)


takes 42.51 seconds to count 89873.



Why should it take so long seeing as the primary key should act as a composite index?



This is the explain:



EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where 
( msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND plant_id=661 );


which gives:



rows = 190996998
filtered=0
ref=NULL
type=range


Any help would be much appreciated!










share|improve this question
























  • Hint: range scan
    – Lukasz Szozda
    Nov 9 at 15:47










  • Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
    – kitt91
    Nov 9 at 16:12















up vote
1
down vote

favorite












We have a database with a data table in it defined as follows:



CREATE TABLE data
(
msts BIGINT,
variable_id INT,
p_id INT,
value DOUBLE,
PRIMARY KEY(msts,variable_id,p_id)
);

ALTER TABLE data
ADD FOREIGN KEY (p_id)
REFERENCES p(id);

ALTER TABLE data
ADD FOREIGN KEY (variable_id)
REFERENCES variables(id);


This table can contain billions of records.



When doing a simple query:



SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (
msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND p_id=661 );


yeilding:



+-------------+
| COUNT(msts) |
+-------------+
| 89873 |
+-------------+
1 row in set (42.51 sec)


takes 42.51 seconds to count 89873.



Why should it take so long seeing as the primary key should act as a composite index?



This is the explain:



EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where 
( msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND plant_id=661 );


which gives:



rows = 190996998
filtered=0
ref=NULL
type=range


Any help would be much appreciated!










share|improve this question
























  • Hint: range scan
    – Lukasz Szozda
    Nov 9 at 15:47










  • Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
    – kitt91
    Nov 9 at 16:12













up vote
1
down vote

favorite









up vote
1
down vote

favorite











We have a database with a data table in it defined as follows:



CREATE TABLE data
(
msts BIGINT,
variable_id INT,
p_id INT,
value DOUBLE,
PRIMARY KEY(msts,variable_id,p_id)
);

ALTER TABLE data
ADD FOREIGN KEY (p_id)
REFERENCES p(id);

ALTER TABLE data
ADD FOREIGN KEY (variable_id)
REFERENCES variables(id);


This table can contain billions of records.



When doing a simple query:



SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (
msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND p_id=661 );


yeilding:



+-------------+
| COUNT(msts) |
+-------------+
| 89873 |
+-------------+
1 row in set (42.51 sec)


takes 42.51 seconds to count 89873.



Why should it take so long seeing as the primary key should act as a composite index?



This is the explain:



EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where 
( msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND plant_id=661 );


which gives:



rows = 190996998
filtered=0
ref=NULL
type=range


Any help would be much appreciated!










share|improve this question















We have a database with a data table in it defined as follows:



CREATE TABLE data
(
msts BIGINT,
variable_id INT,
p_id INT,
value DOUBLE,
PRIMARY KEY(msts,variable_id,p_id)
);

ALTER TABLE data
ADD FOREIGN KEY (p_id)
REFERENCES p(id);

ALTER TABLE data
ADD FOREIGN KEY (variable_id)
REFERENCES variables(id);


This table can contain billions of records.



When doing a simple query:



SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (
msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND p_id=661 );


yeilding:



+-------------+
| COUNT(msts) |
+-------------+
| 89873 |
+-------------+
1 row in set (42.51 sec)


takes 42.51 seconds to count 89873.



Why should it take so long seeing as the primary key should act as a composite index?



This is the explain:



EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where 
( msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND plant_id=661 );


which gives:



rows = 190996998
filtered=0
ref=NULL
type=range


Any help would be much appreciated!







mysql sql indexing database-indexes






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 9 at 16:21









O. Jones

58.7k971106




58.7k971106










asked Nov 9 at 15:45









kitt91

82




82












  • Hint: range scan
    – Lukasz Szozda
    Nov 9 at 15:47










  • Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
    – kitt91
    Nov 9 at 16:12


















  • Hint: range scan
    – Lukasz Szozda
    Nov 9 at 15:47










  • Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
    – kitt91
    Nov 9 at 16:12
















Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47




Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47












Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12




Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12












1 Answer
1






active

oldest

votes

















up vote
2
down vote



accepted










Your query, rewritten to change the order of the where clauses, is this.



SELECT COUNT(msts) 
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;


It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts) columns in that order to help satisfy your query quickly.



Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.



Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.



Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.



Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.



Pro tip 3: Forcing the use of an index is almost always a bad choice.



Pro tip 4: Read https://use-the-index-luke.com/



Edit: You asked how to do the conversion.



If your table does not yet contain many millions of rows, just change the primary key definition like this.



ALTER TABLE data 
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);


If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old or something, and rename the new table to data. This may be a complex task involving batches of data; if you can't figure it out ask another question.






share|improve this answer























  • Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
    – kitt91
    Nov 9 at 16:37










  • @kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
    – Rick James
    Nov 9 at 22:27












  • @O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
    – kitt91
    Nov 13 at 14:16












  • @RickJames thanks for the advice
    – kitt91
    Nov 13 at 14:17











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














 

draft saved


draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53228949%2fmysql-database-composite-key-issues%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
2
down vote



accepted










Your query, rewritten to change the order of the where clauses, is this.



SELECT COUNT(msts) 
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;


It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts) columns in that order to help satisfy your query quickly.



Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.



Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.



Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.



Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.



Pro tip 3: Forcing the use of an index is almost always a bad choice.



Pro tip 4: Read https://use-the-index-luke.com/



Edit: You asked how to do the conversion.



If your table does not yet contain many millions of rows, just change the primary key definition like this.



ALTER TABLE data 
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);


If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old or something, and rename the new table to data. This may be a complex task involving batches of data; if you can't figure it out ask another question.






share|improve this answer























  • Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
    – kitt91
    Nov 9 at 16:37










  • @kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
    – Rick James
    Nov 9 at 22:27












  • @O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
    – kitt91
    Nov 13 at 14:16












  • @RickJames thanks for the advice
    – kitt91
    Nov 13 at 14:17















up vote
2
down vote



accepted










Your query, rewritten to change the order of the where clauses, is this.



SELECT COUNT(msts) 
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;


It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts) columns in that order to help satisfy your query quickly.



Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.



Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.



Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.



Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.



Pro tip 3: Forcing the use of an index is almost always a bad choice.



Pro tip 4: Read https://use-the-index-luke.com/



Edit: You asked how to do the conversion.



If your table does not yet contain many millions of rows, just change the primary key definition like this.



ALTER TABLE data 
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);


If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old or something, and rename the new table to data. This may be a complex task involving batches of data; if you can't figure it out ask another question.






share|improve this answer























  • Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
    – kitt91
    Nov 9 at 16:37










  • @kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
    – Rick James
    Nov 9 at 22:27












  • @O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
    – kitt91
    Nov 13 at 14:16












  • @RickJames thanks for the advice
    – kitt91
    Nov 13 at 14:17













up vote
2
down vote



accepted







up vote
2
down vote



accepted






Your query, rewritten to change the order of the where clauses, is this.



SELECT COUNT(msts) 
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;


It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts) columns in that order to help satisfy your query quickly.



Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.



Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.



Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.



Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.



Pro tip 3: Forcing the use of an index is almost always a bad choice.



Pro tip 4: Read https://use-the-index-luke.com/



Edit: You asked how to do the conversion.



If your table does not yet contain many millions of rows, just change the primary key definition like this.



ALTER TABLE data 
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);


If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old or something, and rename the new table to data. This may be a complex task involving batches of data; if you can't figure it out ask another question.






share|improve this answer














Your query, rewritten to change the order of the where clauses, is this.



SELECT COUNT(msts) 
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;


It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts) columns in that order to help satisfy your query quickly.



Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.



Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.



Pro tip 1: Use COUNT(*) when you can instead of COUNT(column). The second one is slower because it must omit any column values that are NULL. The first one just counts them all.



Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.



Pro tip 3: Forcing the use of an index is almost always a bad choice.



Pro tip 4: Read https://use-the-index-luke.com/



Edit: You asked how to do the conversion.



If your table does not yet contain many millions of rows, just change the primary key definition like this.



ALTER TABLE data 
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);


If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old or something, and rename the new table to data. This may be a complex task involving batches of data; if you can't figure it out ask another question.







share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 9 at 23:11

























answered Nov 9 at 16:18









O. Jones

58.7k971106




58.7k971106












  • Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
    – kitt91
    Nov 9 at 16:37










  • @kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
    – Rick James
    Nov 9 at 22:27












  • @O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
    – kitt91
    Nov 13 at 14:16












  • @RickJames thanks for the advice
    – kitt91
    Nov 13 at 14:17


















  • Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
    – kitt91
    Nov 9 at 16:37










  • @kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
    – Rick James
    Nov 9 at 22:27












  • @O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
    – kitt91
    Nov 13 at 14:16












  • @RickJames thanks for the advice
    – kitt91
    Nov 13 at 14:17
















Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37




Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37












@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27






@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27














@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16






@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16














@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17




@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17


















 

draft saved


draft discarded



















































 


draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53228949%2fmysql-database-composite-key-issues%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

how to define a CAPL function taking a sysvar argument

Schultheiß

Ansible :Unable to parse /etc/ansible/hosts as an inventory source