
What if I want to search for a single row in a table with decreasing precision, e.g. like this:

SELECT * FROM image WHERE name LIKE 'text' AND group_id = 10 LIMIT 1

When this gives me no result, try this one:

SELECT * FROM image WHERE name LIKE 'text' LIMIT 1

And when this gives me no result, try this one:

SELECT * FROM image WHERE group_id = 10 LIMIT 1

Is it possible to do that with just one expression?

A further problem arises when I have not two but three or more search parameters. Is there a generic solution for that? Of course it would come in handy when the search result is sorted by its relevance.

Erwin Brandstetter
nepa
  • Do you mean `name LIKE '%text%'` or `name = 'text'`? `name LIKE 'text'` is rather pointless. This is important for performance and has bearing on the solution. Also, the second part of your question is unclear. Why sort when you only return *one* row? Please define more clearly what you are after. – Erwin Brandstetter Jan 16 '13 at 18:30
  • It's just a detail but it could be anything :) – nepa Jan 30 '13 at 10:46

4 Answers


LIKE without a wildcard character is effectively equivalent to =, so I assume you actually meant name = 'text'.

Indexes are the key to performance.

Test setup

CREATE TABLE image (
  image_id serial PRIMARY KEY
, group_id int NOT NULL
, name     text NOT NULL
);

Ideally, you create two indexes (in addition to the primary key):

CREATE INDEX image_name_grp_idx ON image (name, group_id);
CREATE INDEX image_grp_idx ON image (group_id);

The second may not be necessary, depending on data distribution and other details. Explanation here:

Query

This should be the fastest possible query for your case:

SELECT * FROM image WHERE name = 'name105' AND group_id = 10
UNION ALL
SELECT * FROM image WHERE name = 'name105'
UNION ALL
SELECT * FROM image WHERE group_id = 10
LIMIT  1;

SQL Fiddle.

The LIMIT clause applies to the whole query. Postgres is smart enough not to execute later legs of the UNION ALL as soon as it has found enough rows to satisfy the LIMIT. Consequently, for a match in the first SELECT of the query, the output of EXPLAIN ANALYZE looks like this (scroll to the right!):

Limit  (cost=0.00..0.86 rows=1 width=40) (actual time=0.045..0.046 rows=1 loops=1)
  Buffers: local hit=4
  ->  Result  (cost=0.00..866.59 rows=1002 width=40) (actual time=0.042..0.042 rows=1 loops=1)
        Buffers: local hit=4
        ->  Append  (cost=0.00..866.59 rows=1002 width=40) (actual time=0.039..0.039 rows=1 loops=1)
              Buffers: local hit=4
              ->  Index Scan using image_name_grp_idx on image  (cost=0.00..3.76 rows=2 width=40) (actual time=0.035..0.035 rows=1 loops=1)
                    Index Cond: ((name = 'name105'::text) AND (group_id = 10))
                    Buffers: local hit=4
              ->  Index Scan using image_name_grp_idx on image  (cost=0.00..406.36 rows=500 width=40) (never executed)
                    Index Cond: (name = 'name105'::text)
              ->  Index Scan using image_grp_idx on image  (cost=0.00..406.36 rows=500 width=40) (never executed)
                    Index Cond: (group_id = 10)
Total runtime: 0.087 ms

Note the "(never executed)" markers on the second and third index scans: those legs of the UNION ALL were skipped.

Do not add an ORDER BY clause; that would void the effect, because Postgres would then have to consider all rows before returning the top row.

Final questions

Is there a generic solution for that?

This is the generic solution. Add as many SELECT statements as you want.
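
As a minimal sketch of how this extends to three search parameters: the column format below is hypothetical (it is not part of the test setup above), and the order of the legs is only one possible ranking of "most precise first". Each leg should be backed by a matching index, as with the two-parameter case.

```sql
-- Assumption: image has an additional, hypothetical column "format text".
-- Legs are listed from most to least precise; the shared LIMIT 1 lets
-- Postgres stop at the first leg that produces a row.
SELECT * FROM image WHERE name = 'name105' AND group_id = 10 AND format = 'png'
UNION ALL
SELECT * FROM image WHERE name = 'name105' AND group_id = 10
UNION ALL
SELECT * FROM image WHERE name = 'name105'
UNION ALL
SELECT * FROM image WHERE group_id = 10
LIMIT  1;
```

Which combinations to include, and in which order, depends on what counts as a better match for the application; not every subset of conditions has to appear.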

Of course it would come in handy when the search result is sorted by its relevance.

With LIMIT 1 there is only one row in the result, which makes sorting moot.

Erwin Brandstetter
  • Very convincing indeed. Today/yesterday I've learnt something new. – dezso Jan 16 '13 at 23:01
  • @Erwin: Your query is awesome, and postgres is just awesome! The question has `LIKE` though, not `=` in the second comparison. This may make a difference in some scenarios, don't you think? – ypercubeᵀᴹ Jan 16 '13 at 23:24
  • @ypercube: Thanks. :) Did you see my first paragraph and the comment under the question concerning `LIKE` vs. `=`? The principle of `UNION ALL` stopping evaluation as soon as it has found enough rows should apply to any `SELECT` statement. If we were talking about *fuzzy* string matching, things would get more complicated ... – Erwin Brandstetter Jan 16 '13 at 23:29
  • Oh, ok, just noticed the first line. I was wondering about the case when the query has `LIKE '%text%'` and there is a row with `group_id = 10` but none that matches the LIKE. In that case, after the first part of the UNION is run (and returns no rows), the 2nd part will be run (which will also return no rows), and then the 3rd part, which will give 1 result. But if we had run the three parts in the order 1st, 3rd, it would be faster (and there would really be no need to run the 2nd part in that specific case). – ypercubeᵀᴹ Jan 16 '13 at 23:35
  • @ypercube: Yeah, if one particular condition is *much* more expensive than others, combining SELECTs may help to further optimize. For non-left-anchored `LIKE` I would start with a [*GIN index* using `gin_trgm_ops`](http://www.postgresql.org/docs/current/interactive/pgtrgm.html) to support that. – Erwin Brandstetter Jan 16 '13 at 23:45
  • @ErwinBrandstetter: Thank you, very clear and complete answer! (from and to Vienna *g*) – nepa Jan 30 '13 at 10:48
  • Wouldn't it also be possible to calculate a rating for each row and sort the resultset by rating? – nepa Jan 30 '13 at 10:51
  • @nepa: Sure. Please start a new question with details for that. You can always refer to this one for context. – Erwin Brandstetter Jan 30 '13 at 16:09

It's late and I don't feel like writing out a full solution, but if I needed this I would probably create a custom function that returned a custom type, record or a table (depending on what your needs are). The advantage to this would be that once you found your record, you could stop.

Making the number of params be dynamic will make it a bit more challenging. Depending on your version of PostgreSQL (and the extension available to you), you might be able to pass in an hstore or json and dynamically build the query.

Maybe not the greatest SO answer, but it's more than a comment and hopefully some food for thought.
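
For illustration only, such a function might look roughly like the sketch below. The function name find_image and its parameter names are hypothetical, and it assumes an image table with name and group_id columns as in the question. It relies on PL/pgSQL setting FOUND after RETURN QUERY, so each fallback query runs only when the previous one returned nothing.

```sql
-- Hypothetical helper: returns at most one row, trying the most
-- precise predicate first and falling back step by step.
CREATE OR REPLACE FUNCTION find_image(_name text, _group_id int)
  RETURNS SETOF image
  LANGUAGE plpgsql STABLE AS
$func$
BEGIN
   -- Most precise: both conditions
   RETURN QUERY
   SELECT * FROM image WHERE name = _name AND group_id = _group_id LIMIT 1;
   IF FOUND THEN RETURN; END IF;

   -- Fallback 1: name only
   RETURN QUERY
   SELECT * FROM image WHERE name = _name LIMIT 1;
   IF FOUND THEN RETURN; END IF;

   -- Fallback 2: group only
   RETURN QUERY
   SELECT * FROM image WHERE group_id = _group_id LIMIT 1;
END
$func$;

-- Example call:
SELECT * FROM find_image('text', 10);
```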

David S
  • If I understood you correctly, your assumption that a UDF will return rows one by one is wrong. Can't find a link to augment this yet. – dezso Jan 16 '13 at 10:20
  • The OP doesn't exactly say what he is trying to do here. Maybe he just needs the record's primary key value back? (obviously a guess here). I was just trying to point out that you could wrap the logic he wanted above in a UDF and stop when the record was found and return something (that something is not clear). I was just trying to highlight an approach to solving the problem that had not been mentioned yet. Also, it seemed like if he wanted to make it more flexible, that a UDF was likely the way to go but would be messy. – David S Jan 16 '13 at 15:36
  • I did not want to argue with your overall point, just nitpicked a bit :) – dezso Jan 16 '13 at 17:14
  • A `UNION ALL` query with a `LIMIT` clause happens to stop evaluation automatically as soon as it has found enough rows. I demonstrate in another answer. Barring that I would have gone for a function, too. – Erwin Brandstetter Jan 16 '13 at 22:55
  • @dezso - nitpicking is always welcome! Thanks for your comments! – David S Jan 17 '13 at 00:40
  • @ErwinBrandstetter - I did not know that. Thanks for your comments and your answer. Definitely learned something. – David S Jan 17 '13 at 00:41

I don't think there is anything wrong with running these queries separately until you find the result you want. While there are ways to combine these into one query, those end up being more complicated and slower, which isn't what you wanted.

You should consider running all of the queries in one transaction, probably best at the repeatable-read isolation level, so you get consistent results and also avoid the overhead of setting up repeated transactions. If in addition you make judicious use of prepared statements, you will have almost the same overhead as running all three queries in one combined statement.
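
A rough sketch of that approach in plain SQL follows. The statement names img_both, img_name and img_group are made up for this example, and the decision whether to run the next statement has to be taken by the client after checking whether the previous one returned a row.

```sql
-- Prepare once per session (statement names are hypothetical).
PREPARE img_both (text, int) AS
   SELECT * FROM image WHERE name = $1 AND group_id = $2 LIMIT 1;
PREPARE img_name (text) AS
   SELECT * FROM image WHERE name = $1 LIMIT 1;
PREPARE img_group (int) AS
   SELECT * FROM image WHERE group_id = $1 LIMIT 1;

-- One transaction, one snapshot for all three lookups.
BEGIN ISOLATION LEVEL REPEATABLE READ;

EXECUTE img_both('text', 10);   -- always run
EXECUTE img_name('text');       -- client runs this only if the previous returned no row
EXECUTE img_group(10);          -- client runs this only if the previous returned no row

COMMIT;
```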

Peter Eisentraut
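SELECT *
FROM  (
   -- Rank each row by how many search conditions it satisfies
   SELECT *
        , CASE WHEN name LIKE 'text' AND group_id = 10 THEN 1
               WHEN name LIKE 'text' THEN 2
               WHEN group_id = 10 THEN 3
               ELSE 4
          END AS image_rank
   FROM   image
   ) sub
WHERE  image_rank <> 4   -- an output alias is not visible in WHERE of the same SELECT, hence the subquery
ORDER  BY image_rank
LIMIT  1;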

This would be a pseudo-solution approach, but I'm not entirely sure whether the syntax would work in your scenario.

Lorcan O'Neill