mysql database composite key issues
up vote
1
down vote
favorite
We have a database with a data table in it defined as follows:
CREATE TABLE data
(
msts BIGINT,
variable_id INT,
p_id INT,
value DOUBLE,
PRIMARY KEY(msts,variable_id,p_id)
);
ALTER TABLE data
ADD FOREIGN KEY (p_id)
REFERENCES p(id);
ALTER TABLE data
ADD FOREIGN KEY (variable_id)
REFERENCES variables(id);
This table can contain billions of records.
When doing a simple query:
SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (
msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND p_id=661 );
yeilding:
+-------------+
| COUNT(msts) |
+-------------+
| 89873 |
+-------------+
1 row in set (42.51 sec)
takes 42.51 seconds to count 89873.
Why should it take so long seeing as the primary key should act as a composite index?
This is the explain:
EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where
( msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND plant_id=661 );
which gives:
rows = 190996998
filtered=0
ref=NULL
type=range
Any help would be much appreciated!
mysql sql indexing database-indexes
add a comment |
up vote
1
down vote
favorite
We have a database with a data table in it defined as follows:
CREATE TABLE data
(
msts BIGINT,
variable_id INT,
p_id INT,
value DOUBLE,
PRIMARY KEY(msts,variable_id,p_id)
);
ALTER TABLE data
ADD FOREIGN KEY (p_id)
REFERENCES p(id);
ALTER TABLE data
ADD FOREIGN KEY (variable_id)
REFERENCES variables(id);
This table can contain billions of records.
When doing a simple query:
SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (
msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND p_id=661 );
yeilding:
+-------------+
| COUNT(msts) |
+-------------+
| 89873 |
+-------------+
1 row in set (42.51 sec)
takes 42.51 seconds to count 89873.
Why should it take so long seeing as the primary key should act as a composite index?
This is the explain:
EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where
( msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND plant_id=661 );
which gives:
rows = 190996998
filtered=0
ref=NULL
type=range
Any help would be much appreciated!
mysql sql indexing database-indexes
Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47
Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12
add a comment |
up vote
1
down vote
favorite
up vote
1
down vote
favorite
We have a database with a data table in it defined as follows:
CREATE TABLE data
(
msts BIGINT,
variable_id INT,
p_id INT,
value DOUBLE,
PRIMARY KEY(msts,variable_id,p_id)
);
ALTER TABLE data
ADD FOREIGN KEY (p_id)
REFERENCES p(id);
ALTER TABLE data
ADD FOREIGN KEY (variable_id)
REFERENCES variables(id);
This table can contain billions of records.
When doing a simple query:
SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (
msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND p_id=661 );
yeilding:
+-------------+
| COUNT(msts) |
+-------------+
| 89873 |
+-------------+
1 row in set (42.51 sec)
takes 42.51 seconds to count 89873.
Why should it take so long seeing as the primary key should act as a composite index?
This is the explain:
EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where
( msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND plant_id=661 );
which gives:
rows = 190996998
filtered=0
ref=NULL
type=range
Any help would be much appreciated!
mysql sql indexing database-indexes
We have a database with a data table in it defined as follows:
CREATE TABLE data
(
msts BIGINT,
variable_id INT,
p_id INT,
value DOUBLE,
PRIMARY KEY(msts,variable_id,p_id)
);
ALTER TABLE data
ADD FOREIGN KEY (p_id)
REFERENCES p(id);
ALTER TABLE data
ADD FOREIGN KEY (variable_id)
REFERENCES variables(id);
This table can contain billions of records.
When doing a simple query:
SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where (
msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND p_id=661 );
yeilding:
+-------------+
| COUNT(msts) |
+-------------+
| 89873 |
+-------------+
1 row in set (42.51 sec)
takes 42.51 seconds to count 89873.
Why should it take so long seeing as the primary key should act as a composite index?
This is the explain:
EXPLAIN SELECT COUNT(msts) from data FORCE INDEX(PRIMARY) where
( msts<1535886000000000000 AND msts>1535796060000000000 AND
variable_id=107 AND plant_id=661 );
which gives:
rows = 190996998
filtered=0
ref=NULL
type=range
Any help would be much appreciated!
mysql sql indexing database-indexes
mysql sql indexing database-indexes
edited Nov 9 at 16:21
O. Jones
58.7k971106
58.7k971106
asked Nov 9 at 15:45
kitt91
82
82
Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47
Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12
add a comment |
Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47
Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12
Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47
Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47
Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12
Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12
add a comment |
1 Answer
1
active
oldest
votes
up vote
2
down vote
accepted
Your query, rewritten to change the order of the where clauses, is this.
SELECT COUNT(msts)
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;
It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts)
columns in that order to help satisfy your query quickly.
Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.
Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.
Pro tip 1: Use COUNT(*)
when you can instead of COUNT(column)
. The second one is slower because it must omit any column values that are NULL. The first one just counts them all.
Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.
Pro tip 3: Forcing the use of an index is almost always a bad choice.
Pro tip 4: Read https://use-the-index-luke.com/
Edit: You asked how to do the conversion.
If your table does not yet contain many millions of rows, just change the primary key definition like this.
ALTER TABLE data
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);
If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old
or something, and rename the new table to data
. This may be a complex task involving batches of data; if you can't figure it out ask another question.
Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37
@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27
@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16
@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17
add a comment |
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
2
down vote
accepted
Your query, rewritten to change the order of the where clauses, is this.
SELECT COUNT(msts)
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;
It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts)
columns in that order to help satisfy your query quickly.
Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.
Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.
Pro tip 1: Use COUNT(*)
when you can instead of COUNT(column)
. The second one is slower because it must omit any column values that are NULL. The first one just counts them all.
Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.
Pro tip 3: Forcing the use of an index is almost always a bad choice.
Pro tip 4: Read https://use-the-index-luke.com/
Edit: You asked how to do the conversion.
If your table does not yet contain many millions of rows, just change the primary key definition like this.
ALTER TABLE data
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);
If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old
or something, and rename the new table to data
. This may be a complex task involving batches of data; if you can't figure it out ask another question.
Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37
@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27
@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16
@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17
add a comment |
up vote
2
down vote
accepted
Your query, rewritten to change the order of the where clauses, is this.
SELECT COUNT(msts)
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;
It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts)
columns in that order to help satisfy your query quickly.
Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.
Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.
Pro tip 1: Use COUNT(*)
when you can instead of COUNT(column)
. The second one is slower because it must omit any column values that are NULL. The first one just counts them all.
Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.
Pro tip 3: Forcing the use of an index is almost always a bad choice.
Pro tip 4: Read https://use-the-index-luke.com/
Edit: You asked how to do the conversion.
If your table does not yet contain many millions of rows, just change the primary key definition like this.
ALTER TABLE data
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);
If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old
or something, and rename the new table to data
. This may be a complex task involving batches of data; if you can't figure it out ask another question.
Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37
@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27
@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16
@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17
add a comment |
up vote
2
down vote
accepted
up vote
2
down vote
accepted
Your query, rewritten to change the order of the where clauses, is this.
SELECT COUNT(msts)
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;
It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts)
columns in that order to help satisfy your query quickly.
Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.
Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.
Pro tip 1: Use COUNT(*)
when you can instead of COUNT(column)
. The second one is slower because it must omit any column values that are NULL. The first one just counts them all.
Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.
Pro tip 3: Forcing the use of an index is almost always a bad choice.
Pro tip 4: Read https://use-the-index-luke.com/
Edit: You asked how to do the conversion.
If your table does not yet contain many millions of rows, just change the primary key definition like this.
ALTER TABLE data
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);
If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old
or something, and rename the new table to data
. This may be a complex task involving batches of data; if you can't figure it out ask another question.
Your query, rewritten to change the order of the where clauses, is this.
SELECT COUNT(msts)
from data
where variable_id=107
and p_id=661
and msts>1535796060000000000
and msts<1535886000000000000;
It contains two equality matches, on variable_id and p_id. It then contains a range filter on msts. Therefore, you need an index on the (variable_id, p_id, msts)
columns in that order to help satisfy your query quickly.
Why? You can think of MySQL indexes as sorted in order. To satisfy your query, MySQL random-accesses the index to the first eligible item. It then scans it sequentially until the last item. That's called an index range scan.
Your pre-existing index lists msts first. That means your index can't be scanned sequentially, because each msts value in the index potentially has lots of values of the other two columns.
Pro tip 1: Use COUNT(*)
when you can instead of COUNT(column)
. The second one is slower because it must omit any column values that are NULL. The first one just counts them all.
Pro tip 2: Extra single column indexes are not useful unless they help speed up a specific query.
Pro tip 3: Forcing the use of an index is almost always a bad choice.
Pro tip 4: Read https://use-the-index-luke.com/
Edit: You asked how to do the conversion.
If your table does not yet contain many millions of rows, just change the primary key definition like this.
ALTER TABLE data
DROP PRIMARY KEY,
ADD PRIMARY KEY (variable_id, p_id, msts);
If it does contain billions of rows already, you probably should create a new table, with the correct definition, into which to copy your existing table. Then copy your data. Then rename the old table to data_old
or something, and rename the new table to data
. This may be a complex task involving batches of data; if you can't figure it out ask another question.
edited Nov 9 at 23:11
answered Nov 9 at 16:18
O. Jones
58.7k971106
58.7k971106
Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37
@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27
@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16
@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17
add a comment |
Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37
@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27
@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16
@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17
Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37
Thanks O. Jones. So would this involve changing the ordering in the primary key to: (variable_id, p_id, msts)? I'll give it a go! Many thanks!
– kitt91
Nov 9 at 16:37
@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27
@kitt91 - Plan A rearrange the PK.. Plan B add a secondary index with those 3 columns. Either is time-consuming on a billion rows. 'B' will consume a bunch of extra disk space. Another index cookbook
– Rick James
Nov 9 at 22:27
@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16
@O.Jones - thanks this worked! Considerable speed improvement to a fraction of a second! Thanks again
– kitt91
Nov 13 at 14:16
@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17
@RickJames thanks for the advice
– kitt91
Nov 13 at 14:17
add a comment |
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53228949%2fmysql-database-composite-key-issues%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Hint: range scan
– Lukasz Szozda
Nov 9 at 15:47
Thanks Lukasz any chance of describing in code how it should be done. Thanks again.
– kitt91
Nov 9 at 16:12