219

My product model contains some items

 Product.first
 => #<Product id: 10, name: "Blue jeans" >

I'm now importing some product parameters from another dataset, but there are inconsistencies in the spelling of the names. For instance, in the other dataset, Blue jeans could be spelled Blue Jeans.

I wanted to Product.find_or_create_by_name("Blue Jeans"), but this will create a new product, almost identical to the first. What are my options if I want to find and compare the lowercased name.

Performance issues is not really important here: There are only 100-200 products, and I want to run this as a migration that imports the data.

Any ideas?

Jesper Rønn-Jensen
  • 91,561
  • 40
  • 112
  • 147

19 Answers19

383

You'll probably have to be more verbose here

name = "Blue Jeans"
model = Product.where('lower(name) = ?', name.downcase).first 
model ||= Product.create(:name => name)
Andrew Marshall
  • 89,426
  • 20
  • 208
  • 208
alex.zherdev
  • 23,020
  • 8
  • 58
  • 55
  • It also means I can't dry it. Find_or_create_by(search, *args, &block) won't work because I can't assume all my fields are either text, or numbers. Find_or_create_by_price || Find_or_create_by_name – baash05 Jun 01 '12 at 00:40
  • 5
    @botbot's comment does not apply to strings from user input. "#$$" is a little-known shortcut for escaping global variables with Ruby string interpolation. It's equivalent to "#{$$}". But string interpolation doesn't happen to user-input strings. Try these in Irb to see the difference: `"$##"` and `'$##'`. The first is interpolated (double-quotes). The second is not. User input never gets interpolated. – Brian Morearty Jan 23 '13 at 19:47
  • 5
    Just to note that `find(:first)`is deprecated, and the option now is to use `#first`. Thus, `Product.first(conditions: [ "lower(name) = ?", name.downcase ])` – Luís Ramalho Apr 15 '13 at 14:47
  • 2
    You don't need to do all this work. Use [the built-in Arel library or Squeel](http://robb.weblaws.org/2013/12/05/yes-rails-supports-case-insensitive-database-queries/) – Dogweather Dec 06 '13 at 21:06
  • 17
    In Rails 4 you can now do `model = Product.where('lower(name) = ?', name.downcase).first_or_create` – Derek Lucas May 06 '14 at 19:27
  • @DerekLucas I think your comment should be added to the answer – Nickolay Kondratenko Sep 04 '15 at 12:39
  • 1
    @DerekLucas although it's possible to do so in Rails 4, this method might cause an unexpected behavior. Suppose we have `after_create` callback in `Product` model and inside the callback, we have `where` clause, e.g. `products = Product.where(country: 'us')`. In this case, the `where` clauses are chained as callbacks execute within the context of the scope. Just FYI. – elquimista Mar 04 '16 at 08:57
  • You can also use scopes like this below and put them in a concern and include in models you may need them: `scope :ci_find, lambda { |column, value| where("lower(#{column}) = ?", value.downcase).first }` Then use like this: `Model.ci_find('column', 'value')` – theterminalguy Nov 12 '16 at 00:38
  • I'd suggest going DB agnostic and using Are like so `arel_table[:name].lower.eq(value)` – MetaSkills Jun 20 '17 at 12:35
110

This is a complete setup in Rails, for my own reference. I'm happy if it helps you too.

the query:

Product.where("lower(name) = ?", name.downcase).first

the validator:

validates :name, presence: true, uniqueness: {case_sensitive: false}

the index (answer from Case-insensitive unique index in Rails/ActiveRecord?):

execute "CREATE UNIQUE INDEX index_products_on_lower_name ON products USING btree (lower(name));"

I wish there was a more beautiful way to do the first and the last, but then again, Rails and ActiveRecord is open source, we shouldn't complain - we can implement it ourselves and send pull request.

Community
  • 1
  • 1
oma
  • 30,364
  • 11
  • 66
  • 91
  • 7
    Thanks for the credit on creating the case-insensitive index in PostgreSQL. Credit back to you for showing how to use it in Rails! One additional note: if you use a standard finder, e.g. find_by_name, it still does an exact match. You have to write custom finders, similar to your "query" line above, if you want your search to be case-insensitive. – Mark Berry Oct 01 '13 at 02:09
  • Considering that `find(:first, ...)` is now deprecated, I think this is the most proper answer. – user Feb 07 '14 at 06:44
  • is name.downcase needed? It seems to work with ```Product.where("lower(name) = ?", name).first``` – Jordan Dec 17 '14 at 23:23
  • 1
    @Jordan have you tried that with names having capital letters? – oma Dec 21 '14 at 00:15
  • @Jordan I guess this is a case of LIKE vs ILIKE. what DB are you using? name holding capital letters shouldn't have worked – oma Jan 05 '15 at 19:37
  • @oma PostgreSQL -I could be mistaken, but I thought I had this working. Not a big deal either way this was helpful. Thanks – Jordan Jan 06 '15 at 03:54
  • 1
    @Jordan, perhaps not too important, but we should strive for accuracy on SO as we are helping others :) – oma Jan 08 '15 at 10:40
32

If you are using Postegres and Rails 4+, then you have the option of using column type CITEXT, which will allow case insensitive queries without having to write out the query logic.

The migration:

def change
  enable_extension :citext
  change_column :products, :name, :citext
  add_index :products, :name, unique: true # If you want to index the product names
end

And to test it out you should expect the following:

Product.create! name: 'jOgGers'
=> #<Product id: 1, name: "jOgGers">

Product.find_by(name: 'joggers')
=> #<Product id: 1, name: "jOgGers">

Product.find_by(name: 'JOGGERS')
=> #<Product id: 1, name: "jOgGers">
Viet
  • 3,048
  • 3
  • 26
  • 33
21

You might want to use the following:

validates_uniqueness_of :name, :case_sensitive => false

Please note that by default the setting is :case_sensitive => false, so you don't even need to write this option if you haven't changed other ways.

Find more at: http://api.rubyonrails.org/classes/ActiveRecord/Validations/ClassMethods.html#method-i-validates_uniqueness_of

jomo
  • 11,665
  • 4
  • 25
  • 28
Sohan
  • 3,597
  • 2
  • 20
  • 24
  • 5
    In my experience, in contrast to the documentation, case_sensitive is true by default. I've seen that behavior in postgresql and others have reported the same in mysql. – Troy Jan 02 '12 at 22:01
  • 1
    so I'm trying this with postgres, and it does not work. find_by_x is case sensitive regardless... – Louis Sayers Sep 12 '13 at 10:47
  • This validation is only when creating the model. So if you have 'HAML' in your database, and you try to add 'haml', it won't pass validations. – Dudo Oct 10 '13 at 18:17
16

Several comments refer to Arel, without providing an example.

Here is an Arel example of a case-insensitive search:

Product.where(Product.arel_table[:name].matches('Blue Jeans'))

The advantage of this type of solution is that it is database-agnostic - it will use the correct SQL commands for your current adapter (matches will use ILIKE for Postgres, and LIKE for everything else).

Brad Werth
  • 16,308
  • 9
  • 57
  • 85
14

In postgres:

 user = User.find(:first, :conditions => ['username ~* ?', "regedarek"])
tomekfranek
  • 7,110
  • 7
  • 41
  • 74
9

Quoting from the SQLite documentation:

Any other character matches itself or its lower/upper case equivalent (i.e. case-insensitive matching)

...which I didn't know.But it works:

sqlite> create table products (name string);
sqlite> insert into products values ("Blue jeans");
sqlite> select * from products where name = 'Blue Jeans';
sqlite> select * from products where name like 'Blue Jeans';
Blue jeans

So you could do something like this:

name = 'Blue jeans'
if prod = Product.find(:conditions => ['name LIKE ?', name])
    # update product or whatever
else
    prod = Product.create(:name => name)
end

Not #find_or_create, I know, and it may not be very cross-database friendly, but worth looking at?

Mike Woodhouse
  • 48,790
  • 12
  • 86
  • 124
  • 1
    like is case sensitive in mysql but not in postgresql. I'm not sure about Oracle or DB2. The point being, you can't count on it and if you use it and your boss changes your underlying db you'll start having "missing" records without an obvious reason why. @neutrino's lower(name) suggestion is probably the best way to address this. – masukomi Aug 31 '11 at 17:44
7

Another approach that no one has mentioned is to add case insensitive finders into ActiveRecord::Base. Details can be found here. The advantage of this approach is that you don't have to modify every model, and you don't have to add the lower() clause to all your case insensitive queries, you just use a different finder method instead.

Alex Korban
  • 14,220
  • 5
  • 40
  • 54
6

Upper and lower case letters differ only by a single bit. The most efficient way to search them is to ignore this bit, not to convert lower or upper, etc. See keywords COLLATION for MSSQL, see NLS_SORT=BINARY_CI if using Oracle, etc.

Soviut
  • 79,529
  • 41
  • 166
  • 227
Dean Radcliffe
  • 2,154
  • 20
  • 28
5

Find_or_create is now deprecated, you should use an AR Relation instead plus first_or_create, like so:

TombolaEntry.where("lower(name) = ?", self.name.downcase).first_or_create(name: self.name)

This will return the first matched object, or create one for you if none exists.

bhfailor
  • 155
  • 2
  • 6
superluminary
  • 38,944
  • 21
  • 142
  • 143
3

Similar to Andrews which is #1:

Something that worked for me is:

name = "Blue Jeans"
Product.find_by("lower(name) = ?", name.downcase)

This eliminates the need to do a #where and #first in the same query. Hope this helps!

2

Case-insensitive searching comes built-in with Rails. It accounts for differences in database implementations. Use either the built-in Arel library, or a gem like Squeel.

Dogweather
  • 12,603
  • 15
  • 54
  • 73
2

There are lots of great answers here, particularly @oma's. But one other thing you could try is to use custom column serialization. If you don't mind everything being stored lowercase in your db then you could create:

# lib/serializers/downcasing_string_serializer.rb
module Serializers
  class DowncasingStringSerializer
    def self.load(value)
      value
    end

    def self.dump(value)
      value.downcase
    end
  end
end

Then in your model:

# app/models/my_model.rb
serialize :name, Serializers::DowncasingStringSerializer
validates_uniqueness_of :name, :case_sensitive => false

The benefit of this approach is that you can still use all the regular finders (including find_or_create_by) without using custom scopes, functions, or having lower(name) = ? in your queries.

The downside is that you lose casing information in the database.

Nate Murray
  • 3,465
  • 5
  • 29
  • 33
1

You can also use scopes like this below and put them in a concern and include in models you may need them:

scope :ci_find, lambda { |column, value| where("lower(#{column}) = ?", value.downcase).first }

Then use like this: Model.ci_find('column', 'value')

theterminalguy
  • 1,430
  • 11
  • 20
0

Assuming that you use mysql, you could use fields that are not case sensitive: http://dev.mysql.com/doc/refman/5.0/en/case-sensitivity.html

marcgg
  • 60,067
  • 49
  • 172
  • 221
0
user = Product.where(email: /^#{email}$/i).first
shilovk
  • 7,603
  • 15
  • 54
  • 63
0

Some people show using LIKE or ILIKE, but those allow regex searches. Also you don't need to downcase in Ruby. You can let the database do it for you. I think it may be faster. Also first_or_create can be used after where.

# app/models/product.rb
class Product < ActiveRecord::Base

  # case insensitive name
  def self.ci_name(text)
    where("lower(name) = lower(?)", text)
  end
end

# first_or_create can be used after a where clause
Product.ci_name("Blue Jeans").first_or_create
# Product Load (1.2ms)  SELECT  "products".* FROM "products"  WHERE (lower(name) = lower('Blue Jeans'))  ORDER BY "products"."id" ASC LIMIT 1
# => #<Product id: 1, name: "Blue jeans", created_at: "2016-03-27 01:41:45", updated_at: "2016-03-27 01:41:45"> 
6ft Dan
  • 2,125
  • 27
  • 44
0

An alternative can be

c = Product.find_by("LOWER(name)= ?", name.downcase)
-9

So far, I made a solution using Ruby. Place this inside the Product model:

  #return first of matching products (id only to minimize memory consumption)
  def self.custom_find_by_name(product_name)
    @@product_names ||= Product.all(:select=>'id, name')
    @@product_names.select{|p| p.name.downcase == product_name.downcase}.first
  end

  #remember a way to flush finder cache in case you run this from console
  def self.flush_custom_finder_cache!
    @@product_names = nil
  end

This will give me the first product where names match. Or nil.

>> Product.create(:name => "Blue jeans")
=> #<Product id: 303, name: "Blue jeans">

>> Product.custom_find_by_name("Blue Jeans")
=> nil

>> Product.flush_custom_finder_cache!
=> nil

>> Product.custom_find_by_name("Blue Jeans")
=> #<Product id: 303, name: "Blue jeans">
>>
>> #SUCCESS! I found you :)
Jesper Rønn-Jensen
  • 91,561
  • 40
  • 112
  • 147
  • 3
    That's extremely inefficient for a larger data set, since it has to load the entire thing into memory. While not a problem for you with only a few hundred entries, this isn't good practice. – lambshaanxy Dec 07 '11 at 05:14