I copy-pasted a string into a form field and a strange character broke my MySql query.
I could force the error on the console this way (the weird character is in the middle of the two words "Invalid" and "Character", you can also copy-paste it):
> dog.name = "Invalid Character"
> dog.save # -> false
Which returns the following error:
ActiveRecord::StatementInvalid: Mysql2::Error: Incorrect string value: '\xE2\x80\x8BCha...' for column 'name' at row 1: UPDATE `dogs` SET `name` = 'Invalid Character' WHERE `dogs`.`id` = 2227
It replaced the character by '\xE2\x80\x8B' as the error said.
Is there any validation that I could use to remove these kind of weird characters?
Obs: I also saw that
> "Invalid Character".unpack('U*')
Returns
[73, 110, 118, 97, 108, 105, 100, 32, 8203, 67, 104, 97, 114, 97, 99, 116, 101, 114]
The weird character must be the 8230 one.
Obs2: In my application.rb, I have: config.encoding = "utf-8"
EDIT
On my console, I got:
> ActiveRecord::Base.connection.charset # -> "utf8"
> ActiveRecord::Base.connection.collation # -> "utf8_unicode_ci"
I also ran (on the rails db
mySql console):
> SELECT table_collation FROM INFORMATION_SCHEMA.TABLES where table_name = 'dogs';
and got "utf8_unicode_ci"
EDIT2
If I change the table's character set to utf8mb4 I don't get the error. But still, I have to filter those characters.