
I have a problem I've been scratching my head over for the past couple of months. I have tens of thousands of MySQL databases on the same server, all with the same schema, and I want to run a single query across all of them and get one list of results. Currently, I am using a PHP script to loop through every database and execute the query on each one. Sometimes this can take a very long time (10+ minutes), so it's obviously not the best solution.

Here's an example of what I would like to do:

SELECT something FROM db_prefix_*.tablename WHERE something = 1;

... and have it return one set of results, with db_prefix_* matching the name of every database with that prefix.
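
In other words, the manual equivalent over a single connection would be a giant UNION ALL naming every database explicitly (the schema names below are made up):

SELECT something FROM db_prefix_001.tablename WHERE something = 1
UNION ALL
SELECT something FROM db_prefix_002.tablename WHERE something = 1
-- ...and so on for tens of thousands of databases...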

Does anything like this exist or am I dreaming?

This problem may get more complicated once I move these databases onto multiple servers, but I'll cross that bridge when I come to it.

  • AFAIK there is no such thing – what about consolidating the DBs into one big DB? – Yahia Oct 28 '11 at 06:33
  • I've been toying with the idea of making an auxiliary database that is updated and synced with the relevant data I want, so I can then figure out exactly which databases I need to hit individually. That's starting to look like the most attractive option, but I feel like replicating data like that isn't the most graceful (or real-time) solution. – Brian Herbert Oct 28 '11 at 06:43

4 Answers

1

Is this something that you are looking for:

-- Build one UNION ALL query over every matching schema, then run it as a
-- prepared statement (MySQL's rough equivalent of sp_executesql). The default
-- group_concat_max_len (1024) would truncate it, so raise the limit first.
SET SESSION group_concat_max_len = 1024 * 1024;
SELECT GROUP_CONCAT(CONCAT('SELECT something FROM `', SCHEMA_NAME, '`.tablename WHERE something = 1') SEPARATOR ' UNION ALL ')
  INTO @sqlQuery FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME LIKE 'db_prefix_%';
PREPARE stmt FROM @sqlQuery;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

I come from a SQL Server background and have tried my best to translate this to MySQL; please correct me if something here is incorrect.

Baz1nga
1

Why not merge the databases permanently and just add an extra column to indicate which database each row originally came from, as sketched below? I don't really understand why you would have so many separate databases if the data across them is meaningful in aggregate. If a single MySQL database can't handle it, consider a database engine designed to scale that way.
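
A minimal sketch of that merge, assuming the shared table is called tablename and merged is a new consolidated schema (all names hypothetical):

-- Clone the shared table definition once, then tag rows with their origin.
CREATE TABLE merged.tablename LIKE db_prefix_001.tablename;
ALTER TABLE merged.tablename ADD COLUMN source_db VARCHAR(64) NOT NULL;

-- Repeat (from a script) for each of the source databases. If tablename has
-- an AUTO_INCREMENT or unique key, ids from different databases may collide;
-- only (source_db, id) is really unique, so adjust the keys accordingly.
INSERT INTO merged.tablename
SELECT t.*, 'db_prefix_001' FROM db_prefix_001.tablename t;

-- The original cross-database query then becomes a single statement:
SELECT something, source_db FROM merged.tablename WHERE something = 1;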

Sled
0

You could try Parallel::ForkManager, which provides "A simple parallel processing fork manager" for Perl.

You could tune the maximum number of processes up or down depending on how heavy the query is, and also use nice/ionice to avoid putting undue stress on the server.

As long as you have version 0.7.6 or later, you can pass data structures back to the parent process, which would allow you to post-process the results (e.g. show a summary).

0

The short answer is no: all the PHP MySQL drivers (AFAIK) treat connections to databases independently, as you are already doing by iterating through them. Even if there were some driver that allowed the syntax you want, I'm pretty sure it would still be creating a new connection per database at some level, due to MySQL itself.

The long answer is yes: you can build an index across multiple DBs. Having the same schema across multiple DBs is essentially sharding, so this is a known problem (although primary key non-uniqueness might be an issue). If you create an index on something across the multiple databases, you might be able to get real-time performance, especially if you only want the values of the indexed fields; a sketch follows below. Twitter reported success with indexes across MySQL tables early on (http://engineering.twitter.com/2011/05/engineering-behind-twitters-new-search.html), so there's an encouraging precedent. Sharding MySQL has (arguably) been more productive within the Rails community than the PHP community (http://planet.mysql.com/?tag_search=2013), so you might find some pointers there.
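
One way to picture that cross-database index, assuming the searched column is something and each row has a shard-local id (all names hypothetical):

-- Central lookup table, kept in its own schema and keyed on the searched
-- column; (source_db, row_id) sidesteps primary key non-uniqueness.
CREATE TABLE central.something_index (
  source_db VARCHAR(64) NOT NULL,
  row_id    INT         NOT NULL,  -- the row's id in its home database
  something INT         NOT NULL,
  PRIMARY KEY (source_db, row_id),
  KEY (something)
);

-- Populated per database by a sync script or triggers:
INSERT INTO central.something_index
SELECT 'db_prefix_001', id, something FROM db_prefix_001.tablename;

-- Queries then touch only the databases that actually hold matches:
SELECT source_db, row_id FROM central.something_index WHERE something = 1;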

Faced with a similarly scaled problem recently, we simply jumped ship to Mongo.

Rob