-2

I've a varbinary column profile_name in my accounts table. Initially, I thought my users will not enter special characters and will keep it short and sweet.

But, that was my terrible mistake. They started keeping long profile names with lots of special characters, which is causing me trouble.

Now, I've limited their options to alphanumerics, underscore and hyphen and should be 3 to 10 characters. In other words this regex: [\w-]{3,10}

The task is uphill now.

I've 570 users already and I want to update their profile names to fit into the above rules.

I've some barebone code:

public NpMaintainance()
{
    UTF8Encoder encode = new UTF8Encoder();
    List<string> names = new List<string>();
    using (con)
    {
        con.Open();
        using (cmd)
        {
            cmd.Parameters.Clear();
            cmd.Connection = con;
            cmd.CommandText = "select profile_name from accounts";
            using (dr = cmd.ExecuteReader())
            {
               names.Add((dr[0] as byte[]).ToUnicodeString(encode));
            }
        }
    }
}

I've just the code, which I've not tried, because it is a production site. Now names contains the profile names, which I've to reinsert into the table. And to add to my problem, profile_name is unique, so no two names should be same.

I've provided all the details I could, and hoping to get an answer!

What should I do? Is there any other way(using only sql?), or if not please help me with the way I'm currently doing!

Amit Joki
  • 53,955
  • 7
  • 67
  • 89
  • Wait! How can you take freedom to change their user name? Will the users be happy with that? – Sriram Sakthivel May 14 '14 at 12:06
  • @SriramSakthivel, they aren't their names, just profile names, which they have given some damn things, which I didn't expect `:(`. Their user names will remain untouched. Just the profile names, and the site is new too, so not much functionality is based on profile names. – Amit Joki May 14 '14 at 12:08
  • I'm unsure if you could do it only using SQL, but you could surely write a script and just remove all special characters and update the names with the new inputs from the script? see: http://stackoverflow.com/questions/7552253/how-to-remove-special-characters-from-an-string – CoderDojo May 14 '14 at 12:09
  • @CoderDojo, yes, I know I could do it with plain c#, but am seeking guidance, so that nothing goes wrong. – Amit Joki May 14 '14 at 12:11
  • Well if I could give my input on that, you should probably do tests first on a copy of the data you want to change rather than to take guidance and apply it directly onto your website. That's the only way to really reduce risk. – CoderDojo May 14 '14 at 12:14
  • http://stackoverflow.com/questions/1120198/most-efficient-way-to-remove-special-characters-from-string – CoderDojo May 14 '14 at 12:17

1 Answers1

1

UPDATE:
Here is MySQL version, with the help of cleanString function written by Shay Anderson with just a minor change to accept numbers an letters.

IF ASCII(c) > 31 AND ASCII(c) < 127 THEN 
replaced with
IF (ASCII(c) > 47 AND ASCII(c) < 57) 
OR (ASCII(c) > 64 AND ASCII(c) < 91) 
OR (ASCII(c) > 96 AND ASCII(c) < 122) THEN

The basic idea is the same create a temp table with profile_name converted to varchar, cleaning it and updateing it back to accounts table.

-- for testing purposes create and populate test table
CREATE TABLE accounts(id INT, profile_name VARBINARY(8000));

INSERT INTO accounts(id, profile_name)
SELECT 1,  CAST('User|%$&&/(/' AS BINARY);

-- check what's in there
SELECT id, profile_name, CAST(profile_name AS CHAR) FROM accounts;

CREATE TEMPORARY TABLE IF NOT EXISTS tmp_accounts AS (
    SELECT id, profile_name, CAST(profile_name AS CHAR) AS 'converted_name'
    FROM accounts
);

-- clean converted profile_name in temp table using function udf_cleanString
UPDATE tmp_accounts
SET converted_name = UDF_CLEANSTRING(converted_name);

/* do any other change necessary.... length etc. 
  when satisfied with changes in temp table update accounts table with new profile_name
*/
UPDATE accounts a
INNER JOIN tmp_accounts t ON t.id = a.id
SET a.profile_name = CAST(t.converted_name AS BINARY);

-- check what was done with profile_name after changes
SELECT id, profile_name, CAST(profile_name AS CHAR) FROM accounts;

-- DROP TABLE tmp_accounts

In SQL you could try converting varbinary(max) to nvarchar(max) type something like this (assuming accountid is primary key of accounts table)

SELECT accountid, profile_name, CONVERT(NVARCHAR(MAX),profile_name) as converted_name 
INTO #tmpaccounts
FROM accounts

then use some character stripping function like the one from How to strip all non-alphabetic characters from string in SQL Server?. It might need some tweaking to allow hyphens and underscores, but it shouldn't be too hard

UPDATE #tmpaccounts SET converted_name = [dbo].[RemoveNonAlphaCharacters](converted_name)

that should take care of the character limitations, if you have longer profile names then 10 trim them, hopefully there shouldn't be a lot of them. (I'm am somewhat skeptical of changing profile names without permission or at least awareness from the account owners) After that you can update the accounts table from #tmpaccounts.

UPDATE a
SET profile_name = CONVERT(VARBINARY(max), converted_name)
FROM accounts a
INNER JOIN #tmpaccounts t on t.accountid = a.accountid

Of course DON'T do this on production database, use demo or training DB.

Community
  • 1
  • 1
OttO
  • 421
  • 4
  • 8