19

I have a query of such like

$query = "SELECT * FROM tbl_comments WHERE id=222 ORDER BY comment_time";

Do I need to add an index on the comment_time field?

Also, if I want to get the data between two dates then how should I build the index?

Rohitashwa Nigam
  • 327
  • 3
  • 14
Munib
  • 2,995
  • 7
  • 24
  • 35
  • 1
    It's odd that a column called 'id' wouldn't be the PK, but that aside whether or not you index comment_time won't affect the outcome - but doing so may improve performance. See http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html – Strawberry May 28 '13 at 12:49
  • Hmm. not sure.. but you might try to prepend EXPLAIN to your query and have a look at what effect an INDEX has. Also, getting the data between 2 values is called a range search, have a look at the mysql docs on that http://dev.mysql.com/doc/refman/5.0/en/range-optimization.html – Johan May 28 '13 at 12:50
  • 5
    Unfortunately there is not a yes/no answer to this -- it really depends on several factors such as size of your table, number of columns in your table, other number of indices, cost benefit for reads/writes, etc. Best bet is to try and look at your execution plans. In regards to the dates, just search for that -- but I'd recommend using >= and <= instead of Between when working with dates. – sgeddes May 28 '13 at 12:51

7 Answers7

16

Yes, index will help you, when using ORDER BY. Because INDEX is a sorted data structure, so the request will be executed faster.

Look at this example: table test2 with 3 rows. I used LIMIT after order by to show the difference in execution.

DROP TABLE IF EXISTS `test2`;
CREATE TABLE `test2` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `value` varchar(10) CHARACTER SET utf8 COLLATE utf8_swedish_ci NOT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_value` (`value`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;

-- ----------------------------
-- Records of test2
-- ----------------------------
INSERT INTO `test2` VALUES ('1', '10');
INSERT INTO `test2` VALUES ('2', '11');
INSERT INTO `test2` VALUES ('2', '9');

-- ----------------------------
-- Without INDEX
-- ----------------------------

mysql> EXPLAIN SELECT * FROM test2 ORDER BY value LIMIT 1\G
*************************** 1. row *************************
           id: 1
  select_type: SIMPLE
        table: test2
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3
        Extra: Using filesort
1 row in set (0.00 sec)

MySQL checked 3 rows to output the result. After CREATE INDEX, we get this:

mysql> CREATE INDEX ix_value ON test2 (value) USING BTREE;
Query OK, 0 rows affected (0.14 sec)

-- ----------------------------
-- With INDEX
-- ----------------------------

mysql> EXPLAIN SELECT * FROM test2 ORDER BY value LIMIT 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: test2
         type: index
possible_keys: NULL
          key: ix_value
      key_len: 32
          ref: NULL
         rows: 1
        Extra: Using index
1 row in set (0.00 sec)

Now MySQL used only 1 row.

Answering the received comments, I tried the same query without LIMIT:

-- ----------------------------
-- Without INDEX
-- ----------------------------

mysql> EXPLAIN SELECT * FROM test2 ORDER BY value\G
*************************** 1. row ******************
           id: 1
  select_type: SIMPLE
        table: test2
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3
        Extra: Using filesort

-- ----------------------------
-- With INDEX
-- ----------------------------

mysql> EXPLAIN SELECT * FROM test2 ORDER BY value\G
*************************** 1. row *****************
           id: 1
  select_type: SIMPLE
        table: test2
         type: index
possible_keys: NULL
          key: ix_value
      key_len: 32
          ref: NULL
         rows: 3
        Extra: Using index

As we see, it uses index, for the 2-nd ORDER BY.

To build an index on your field, use this:

CREATE INDEX ix_comment_time ON tbl_comments (comment_time) USING BTREE;

http://dev.mysql.com/doc/refman/5.0/en/create-index.html

user4035
  • 19,332
  • 7
  • 51
  • 78
  • Does this constitute 'proof'? – Strawberry May 28 '13 at 12:58
  • @Strawberry Explain, what do you mean. – user4035 May 28 '13 at 12:58
  • 1
    Does your explanation prove that 'index will help you, when using ORDER BY'? – Strawberry May 28 '13 at 13:05
  • 3
    @Strawberry Yes, because if you look at explain before index creation, it says: "Extra: Using filesort". And after - "Using index". When MySQL can’t use an index to produce a sorted result, it must sort the rows itself. It can do this in memory or on disk, but it always calls this process a filesort (taken from High Performance MySQL). So, it doesn't sort the records in the second case, and the request must be executed faster. – user4035 May 28 '13 at 13:13
  • @NiklasModess It's correct but for queries that have `LIMIT`. If you want to order a whole table, an index may not be used. MySQL may choose (wisely) to retrieve the whole table and sort. – ypercubeᵀᴹ May 28 '13 at 13:38
  • @user4035 . . . This is all very interesting, but it is not the query in the question. – Gordon Linoff May 28 '13 at 14:00
  • @GordonLinoff I tried without LIMIT, the result was the same: "Extra: Using filesort" and "Extra: Using index" – user4035 May 28 '13 at 15:14
  • @ypercube Yes, I agree. This answer: http://stackoverflow.com/questions/1142483/optimizing-my-mysql-query-to-use-index-for-sorting says, that MySQL doesn't use index, when query is matching a large percentage of the data. But why does it use index in my request, that matches all the data in the table? – user4035 May 28 '13 at 15:22
  • Do you really think it matters (if index is used or not) with 3 rows? Try with 3 thousand or 3 million rows and tell us. – ypercubeᵀᴹ May 28 '13 at 15:35
  • @ypercube Experiment results. Without index: `99901 rows in set (1.72 sec)`. Then logged out of mysql, then logged in, did `RESET QUERY CACHE;`, then created index. 99901 rows in set (0.34 sec). Why? – user4035 May 28 '13 at 16:12
8

An index on the comment_time field might not help at all for a query like this:

SELECT *
FROM tbl_comments
WHERE id=222
ORDER BY comment_time;

The query needs to scan the the table to find the matching id values. It can do this by scanning the index, looking up the rows, and doing the test. If there is one row that matches and it has the highext comment_time, then this requires scanning the index and reading the table.

Without the index, it would scan the table, find the row, and very quickly sort the 1 row. The sequential scan of the table would typically be faster than an index scan followed by a page lookup (and would definitely be faster on a table larger than available memory).

On the other hand, an index on id, comment_time would be very helpful.

Gordon Linoff
  • 1,122,135
  • 50
  • 484
  • 624
1

Technically you don't need indices on every field, as it will work too, however for performance reasons you might need one or more.

EDIT

This problem is known from the beginning of software design. Typically if you increase amount of memory used by the program, you will reduce its speed (assuming the program is well-written). Assigning an index to a field increases data used by the db, but makes searching faster. If you do not want to search anything by this field (you actually do in the question), it would not be necessary.

In modern era the indices are not so big comparing to disk data size and adding one or more should not be a bad idea.

Normally it is very difficult to surely tell "do I need index or not". Some help is provided by EXPLAIN statement (refer to the manual).

Voitcus
  • 4,369
  • 4
  • 22
  • 40
1

Regarding your first question, you don't have to create index on comment_time. If the number of records is very large you'll need indices to speed your retrieval. But for your operation you don't need indices. For your second question using a WHERE Clause like this will help you.

WHERE(comment_time BETWEEN 'startDate' AND 'endDate');
glglgl
  • 81,640
  • 11
  • 130
  • 202
Ananth
  • 41
  • 1
  • 1
  • 10
  • @Strawberry With "inverted commas" you mean the single quotes? Indeed they are not the same as back ticks, but where do you see the need for back ticks? – glglgl May 28 '13 at 13:16
0

You don't have to put the index on comment_time if your where id is distinct.

Manish Jangir
  • 519
  • 3
  • 9
0

To increase the speed of retrieval of data you would need index. This will work with out index also. For your second question you can use WHERE and BETWEEN clause.

Refer: http://www.w3schools.com/sql/sql_between.asp

Sathish D
  • 4,560
  • 26
  • 44
0

The EXPLAIN statement is very useful in situations like that. For your query, you would use it as follows:

EXPLAIN SELECT * FROM tbl_comments WHERE id=222 ORDER BY comment_time

This will output which indexes are being used to execute the query and allows you to perform experiments with different indexes to find the best configuration. In order to speed up sorting, you will want a BTREE index since it stores data in a sorted manner. To speed up finding items with a certain id, a HASH index is the better option since it provides quick lookups for equality predicates. Note that MySQL might not be able to use a combination of both indexes to execute your query and will instead use just one of them.

Further information: http://dev.mysql.com/doc/refman/5.7/en/using-explain.html

For range predicates, like dates in a range of dates, a BTREE index will perform better than a HASH index.

Further information: http://dev.mysql.com/doc/refman/5.7/en/create-index.html

r4C9rAyrd6A1
  • 86
  • 1
  • 4