10

Background

I have a table that contains only one column: Name. There are only four rows in it, say

| Name      |
| test1.com |
| test2.com |
| test3.com |
| test4.com |

Problem

If I query

var email = "a@test2.com";
Table.Where(x => email.EndsWith(x.Name));

I'll get an empty list. but If I query all rows first and calculate Where in memory like this

var email = "a@test2.com";
Table.ToList().Where(x => email.EndsWith(x.Name));

I'll get a list contains only test2.com which is correct.

The generated SQL for the first query is

SELECT "Extent1"."Name" AS "Name"
FROM "USER"."Table" "Extent1"
WHERE (( NVL(INSTR(REVERSE(:p__linq__0), REVERSE("Extent1"."Name")), 0) ) = 1)

I've tried replacing :p__linq__0 with 'a@test2.com' and running the query in the SQLDeveloper, the result is correct.

More Information

If I change EndsWith() to Contains(), the problem will be gone. Here is the generated SQL for Contains()

SELECT "Extent1"."Name" AS "Name"
FROM "USER"."Table" "Extent1"
WHERE (( NVL(INSTR(:p__linq__0, "Extent1"."Name"), 0) ) > 0)

Do you have any idea what's wrong with EndsWith or REVERSE method?

Environment

  • EF5.0
  • .NET4.5
  • Oracle11g
  • ODP.NET11.2 Release 3
Moozz
  • 599
  • 3
  • 17
  • 3
    This is a great example as to why we should always check the SQL EntityFramework Generates when using IQueryables directly with SQL. I cant explain why it generates the SQL it does... but it's become a common practice to check with just about every IQueryable I produce. – Derek Jan 14 '15 at 10:29

1 Answers1

3

This line concerns me and is a common pitfall with people using EF:

Table.ToList().Where(x => email.EndsWith(x.Name));

The part Table.ToList() is the worst part because this will actually materialise the entire table into memory and then perform the EndsWith in C#. This line:

Table.Where(x => email.EndsWith(x.Name));

I would caution this approach just on general principle as it will be horrendously slow when the table grows to reasonable size. You can do the heavy lifting before the query hits the database by splitting out the domain from the the email as you construct the query:

var email = "a@test2.com";

/* You should null check this of course and not just assume a match was found */
var domain = Regex.Match(email , "@(.*)").Groups[1].Value; 

/* Note: ToList() materialisation happens at the end */
var result = Table.Where(x => x.Name == domain).ToList();  

Furthermore, if you need to match on the domain names of a column storing emails, then my preferred approach would be to split the email and store the domain name in a separate column that you index and just match on, this will scale and be a heck of a lot easier to manage. Remember that these days data is cheap... especially compared to non-indexable table scans.

Also remember (for both scenarios) that your database is set to CI (case insensitive)

Paul Carroll
  • 1,484
  • 13
  • 15
  • So, the answer is EndsWith isn't supported in EF? I couldn't see any document describes this. Do you have a link? – Moozz Jan 15 '15 at 03:04
  • No, the answer is what I've said above... you don't need it. Just extract the domain name in c# and pass it to the query. Make sure you re-read my answer as I think you're not quite understanding the boundaries between c# and sql. When you call ToList() on an EF table entity you are materialising (pulling into memory) the entire table which is horrible for performance and will not scale. Does that clear things up? Is there anything else confusing? – Paul Carroll Jan 15 '15 at 03:34
  • I understand that ToList() pulls all rows into memory. I'm just curious why EndsWith doesn't work. The generated SQL seems to be OK but it doesn't work. Anyway, you're right that I don't have to use EndsWith. – Moozz Jan 15 '15 at 03:39
  • Yeah I admit that is a bit strange, but I'd never use it myself due to the huge (on large tables) performance hit. Can you please accept the answer too? – Paul Carroll Jan 15 '15 at 03:42