3

I copy-pasted a string into a form field and a strange character broke my MySql query.

I could force the error on the console this way (the weird character is in the middle of the two words "Invalid" and "Character", you can also copy-paste it):

> dog.name = "Invalid ​Character"
> dog.save # -> false

Which returns the following error:

ActiveRecord::StatementInvalid: Mysql2::Error: Incorrect string value: '\xE2\x80\x8BCha...' for column 'name' at row 1: UPDATE `dogs` SET `name` = 'Invalid ​Character' WHERE `dogs`.`id` = 2227

It replaced the character by '\xE2\x80\x8B' as the error said.

Is there any validation that I could use to remove these kind of weird characters?

Obs: I also saw that

> "Invalid ​Character".unpack('U*')

Returns

[73, 110, 118, 97, 108, 105, 100, 32, 8203, 67, 104, 97, 114, 97, 99, 116, 101, 114]

The weird character must be the 8230 one.

Obs2: In my application.rb, I have: config.encoding = "utf-8"

EDIT

On my console, I got:

> ActiveRecord::Base.connection.charset # -> "utf8"
> ActiveRecord::Base.connection.collation # -> "utf8_unicode_ci"

I also ran (on the rails db mySql console):

> SELECT table_collation FROM INFORMATION_SCHEMA.TABLES where table_name = 'dogs';

and got "utf8_unicode_ci"

EDIT2

If I change the table's character set to utf8mb4 I don't get the error. But still, I have to filter those characters.

Mauricio Moraes
  • 6,889
  • 5
  • 34
  • 54

1 Answers1

0

On the rails db MySql console, I used:

SHOW CREATE TABLE dogs;

To find out that the charset for the table was latin1.

I just added a migration with this query:

ALTER TABLE dogs CONVERT TO CHARACTER SET utf8mb4;

And it started to work fine.

Mauricio Moraes
  • 6,889
  • 5
  • 34
  • 54
  • It's been quite some time and I still didn't find or get a better solution to my problem than this one. I Wish the one that downvoted could add to the discussion. ;) – Mauricio Moraes Jul 31 '17 at 23:45
  • You should not use utf8 in MySQL, but utf8mb4, that's why you were downvoted. You can find more on the reason why here: https://stackoverflow.com/questions/30074492/what-is-the-difference-between-utf8mb4-and-utf8-charsets-in-mysql – Theemuts Nov 21 '17 at 09:10
  • @Theemuts, thanks. That's simple then. I've updated it after reading your reference. – Mauricio Moraes Nov 21 '17 at 18:13