-1

I pull a range (e.g. limit 72, 24) of games from a database according to which have been voted most popular. I have a separate table for tracking game data, and one for tracking individual votes for a game (rating from 1 to 5, one vote per user per game). A game is considered "most popular" or "more popular" when that game has the highest average rating of all the rating votes for said game. Games with less than 5 votes are not considered. Here is what the tables look like (two tables, "games" and "votes"):

games:
gameid(key)
gamename
thumburl

votes:
userid(key)
gameid(key)
rating

Now, I understand that there is something called an "index" which can speed up my queries by essentially pre-querying my tables and constructing a separate table of indices (I don't really know.. that's just my impression).

I've also read that mysql operates fastest when multiple queries can be condensed into one longer query (containing joins and nested select statements, I presume).

However, I am currently NOT using an index, and I am making multiple queries to get my final result.

What changes should be made to my database (if any -- including constructing index tables, etc.)? And what should my query look like?

Thank you.

dan
  • 61
  • 4

4 Answers4

1

Your query that calculates the average for every game could look like:

SELECT gamename, AVG(rating)
FROM games INNER JOIN votes ON games.gameid = votes.gameid
GROUP BY games.gameid
HAVING COUNT(*)>=5 
ORDER BY avg(rating) DESC
LIMIT 0,25

You must have an index on gameid on both games and votes. (if you have defined gameid as a primary key on table games that is ok)

cristian
  • 8,300
  • 3
  • 35
  • 44
  • Add a carriage return after "like" to force the code to properly indent. I would, but I don't have "edit" privileges. – Aaron Dec 11 '11 at 15:38
  • correct me if i'm wrong, but there doesn't seem to be any ordering in the query you've provided. – dan Dec 11 '11 at 15:41
  • Failed to execute SQL : SQL SELECT gamename, AVG(*) FROM games INNER JOIN votes ON games.gameid = votes.gameid GROUP BY games.gameid HAVING COUNT(*)>=5 limit 0,25 failed : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '*) FROM games INNER JOIN votes ON games.gameid = votes.ga' at line 1 – dan Dec 11 '11 at 15:47
  • I edited the query to add a order by and I fix the average (I hope the result would be what you expect it to be :)) ) – cristian Dec 11 '11 at 15:48
  • I'm glad I could help. But you should read about indexes and about explain to find out how to optimize queries – cristian Dec 11 '11 at 16:08
0

According to the MySQL documentation, an index is created when you designate a primary key at table creation. This is worth mentioning, because not all RDBMS's function this way.

I think you have the right idea here, with your "votes" table acting as a bridge between "games" and "user" to handle the many-to-many relationship. Just make sure that "userid" and "gameid" are indexed on the "votes" table.

Aaron
  • 46,155
  • 11
  • 102
  • 118
0

If you have access to use InnoDB storage for your tables, you can create foreign keys on gameid in the votes table which will use the index created for your primary key in the games table. When you then perform a query which joins these two tables (e.g. ... INNER JOIN votes ON games.gameid = votes.gameid) it will use that index to speed things up.

Your understanding of an index is essentially correct — it basically creates a separate lookup table which it can use behind the scenes when the query is executed.

When using an index it is useful to use the EXPLAIN syntax (simply prepend your SELECT with EXPLAIN to try this out). The output it gives show you the list of possible keys available for the query as well as which key the query is using. This can be very helpful when optimising your query.

cmbuckley
  • 33,879
  • 7
  • 69
  • 86
  • Thank you, your answer was very informative. You seem quite knowledgeable on this topic.. maybe you could suggest a query that might work for me? – dan Dec 11 '11 at 15:51
  • The specific query (and indexes) provided in Octopus-Paul's answer would seem to produce what you're after from your question. If you have a number of queries to consider then the links (and MySQL docs in general) are a good place to start if you want to learn more about indexes. [This article](http://www.databasejournal.com/features/mysql/article.php/1382791/Optimizing-MySQL-Queries-and-Indexes.htm) has a good hands-on example of using indexes and `EXPLAIN`. – cmbuckley Dec 11 '11 at 15:58
0

An index is a PHYSICAL DATA STRUCTURE which is used to help speed up retrieval type queries; it's not simply a table upon a table -> good for a concept though. Another concept is the way indexes work at the back of your text book (the only difference is with your book a search key could point to multiple pages / matches whereas with indexes a search key points to only one page/match). An index is defined by data structures so you could use a B+ tree index and there are even hash indexes. It's Database/Query optimization from the physical/internal level of the Database - I'm assuming that you know that you're working at the higher levels of the DBMS which is easier. An index is rooted within the internal levels and that make DB query optimization much more effective and interesting.

I've noticed from your question that you have not even developed the query as yet. Focus on the query first. Indexing comes after, as a matter of a fact, in any graduate or post graduate Database course, indexing falls under the maintenance of a Database and not necessarily the development.

Also N.B. I have seen quite many people say as a rule to make all primary keys indexes. This is not true. There are many instances where a primary key index would slow up the Database. Infact, if we were to go with only primary indexes then should use hash indexes since they work better than B+ trees!

In summary, it doesn't make sense to ask a question for a query and an index. Ask for help with the query first. Then given your tables (relational schema) and SQL query, then and only then could I advice you on the best index - remember its maintenance. We can't do maintanance if there is 0 development.

Kind Regards,

N.B. most questions concerning indexes at the post graduate level of many computing courses are as follows: we give the students a relational schema (i.e. your tables) and a query and then ask: critically suggest a suitable index for the following query on the tables ----> we can't ask a question like this if they dont have a query

D. Rattansingh
  • 1,241
  • 3
  • 16
  • 26
  • i'm new to the site and wasn't aware that multi-part questions were frowned upon. regardless of the frowning though, i understand your point concerning the need for more information before tackling the question of indices (i was merely hoping that the necessary prerequisite steps would be included in the answer). thank you. you've been helpful and informative. – dan Dec 11 '11 at 23:37
  • oh no I'm not frowning at all. after the query is developed, you then study the query and decide which parts of it you need to boost. It's from there you will then decide if an index is suitable and what type of index to go with. You need to remember an index consumes a cost of the DBMS maintaining it. – D. Rattansingh Dec 12 '11 at 02:00