21

I found this Perl script while migrating my SQLite database to mysql

I was wondering (since I don't know Perl) how could one rewrite this in Python?

Bonus points for the shortest (code) answer :)

edit: sorry I meant shortest code, not strictly shortest answer

#! /usr/bin/perl

while ($line = <>){
    if (($line !~  /BEGIN TRANSACTION/) && ($line !~ /COMMIT/) && ($line !~ /sqlite_sequence/) && ($line !~ /CREATE UNIQUE INDEX/)){

        if ($line =~ /CREATE TABLE \"([a-z_]*)\"(.*)/){
                $name = $1;
                $sub = $2;
                $sub =~ s/\"//g; #"
                $line = "DROP TABLE IF EXISTS $name;\nCREATE TABLE IF NOT EXISTS $name$sub\n";
        }
        elsif ($line =~ /INSERT INTO \"([a-z_]*)\"(.*)/){
                $line = "INSERT INTO $1$2\n";
                $line =~ s/\"/\\\"/g; #"
                $line =~ s/\"/\'/g; #"
        }else{
                $line =~ s/\'\'/\\\'/g; #'
        }
        $line =~ s/([^\\'])\'t\'(.)/$1THIS_IS_TRUE$2/g; #'
        $line =~ s/THIS_IS_TRUE/1/g;
        $line =~ s/([^\\'])\'f\'(.)/$1THIS_IS_FALSE$2/g; #'
        $line =~ s/THIS_IS_FALSE/0/g;
        $line =~ s/AUTOINCREMENT/AUTO_INCREMENT/g;
        print $line;
    }
}

Some additional code was necessary to successfully migrate the sqlite database (handles one line Create table statements, foreign keys, fixes a bug in the original program that converted empty fields '' to \'.

I posted the code on the migrating my SQLite database to mysql Question

Community
  • 1
  • 1
Jiaaro
  • 67,024
  • 38
  • 154
  • 182

8 Answers8

52

Here's a pretty literal translation with just the minimum of obvious style changes (putting all code into a function, using string rather than re operations where possible).

import re, fileinput

def main():
  for line in fileinput.input():
    process = False
    for nope in ('BEGIN TRANSACTION','COMMIT',
                 'sqlite_sequence','CREATE UNIQUE INDEX'):
      if nope in line: break
    else:
      process = True
    if not process: continue
    m = re.search('CREATE TABLE "([a-z_]*)"(.*)', line)
    if m:
      name, sub = m.groups()
      line = '''DROP TABLE IF EXISTS %(name)s;
CREATE TABLE IF NOT EXISTS %(name)s%(sub)s
'''
      line = line % dict(name=name, sub=sub)
    else:
      m = re.search('INSERT INTO "([a-z_]*)"(.*)', line)
      if m:
        line = 'INSERT INTO %s%s\n' % m.groups()
        line = line.replace('"', r'\"')
        line = line.replace('"', "'")
    line = re.sub(r"([^'])'t'(.)", r"\1THIS_IS_TRUE\2", line)
    line = line.replace('THIS_IS_TRUE', '1')
    line = re.sub(r"([^'])'f'(.)", r"\1THIS_IS_FALSE\2", line)
    line = line.replace('THIS_IS_FALSE', '0')
    line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')
    print line,

main()
dr jimbob
  • 15,644
  • 5
  • 53
  • 74
Alex Martelli
  • 762,786
  • 156
  • 1,160
  • 1,345
  • 12
    I love getting code in answers... I always learn something new :) – Jiaaro Jul 01 '09 at 03:32
  • 8
    Always happy to answer with code when feasible -- most concise and least ambiguous way, after all!-) – Alex Martelli Jul 01 '09 at 04:33
  • 1
    I could be wrong, but shouldn't it be line.replace('THIS_IS_FALSE', '0') – Jiaaro Jul 01 '09 at 12:35
  • 6
    I thought Perl was supposed ot be hard to read. This is way harder the Perl is intuative by comparison. – Martin York Jul 01 '09 at 19:52
  • @martin this is a translation... there are better ways to do it in python – Jiaaro Jul 01 '09 at 20:52
  • 3
    @Jim - And ergo there are much better ways of doing it in Perl – draegtun Jul 02 '09 at 10:15
  • In the lines using regular expression substitution, the insertion of the matched groups must be double-escaped OR the replacement string must be prefixed with r to mark is as regular expression: line = re.sub(r"([^'])'t'(.)", "\\1THIS_IS_TRUE\\2", line) or line = re.sub(r"([^'])'f'(.)", r"\1THIS_IS_FALSE\2", line) Also, this line should be added before print: line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT') Last, the column names in create statements should be backticks in MySQL. Add this in line 15: sub = sub.replace('"','`') – bb. Sep 09 '10 at 10:48
  • Obviously, comments are not treated as markdown, so I'll create a new answer. – bb. Sep 09 '10 at 10:51
  • `if any(s in line for s in ['BEGIN TRANSACTION','COMMIT', 'sqlite_sequence','CREATE UNIQUE INDEX']): continue` could be used instead of `process = False ... if not process: continue` – jfs Feb 07 '11 at 22:23
  • Thanks, this is run like so for example (save the snippet to run.py for example): 1$: sqlite3 campaign.sqlite .dump > backup.dmp 2$: python run.py backup.dmp > backupinmysql.dmp 3$: mysql -h server -p -u username databasename < backupinmysql.dmp – David d C e Freitas May 16 '11 at 14:45
  • I added line = re.sub(r"(CHECK\s+\(.+?\))([^,]+)(,?)", r"\2 \1\3", line) because MySQL doesn't seem to like CHECK before DEFAULT. – powlo Jun 06 '12 at 15:36
  • Folks, you must include the reserved words backticking as in https://stackoverflow.com/questions/23446377/syntax-error-due-to-using-a-reserved-word-as-a-table-or-column-name-in-mysql – Mário de Sá Vera May 25 '17 at 14:33
  • any chance this can be updated to Python 3 esp. 3.8? – evandrix Aug 25 '20 at 08:16
12

Alex Martelli's solution above works good, but needs some fixes and additions:

In the lines using regular expression substitution, the insertion of the matched groups must be double-escaped OR the replacement string must be prefixed with r to mark is as regular expression:

line = re.sub(r"([^'])'t'(.)", "\\1THIS_IS_TRUE\\2", line)

or

line = re.sub(r"([^'])'f'(.)", r"\1THIS_IS_FALSE\2", line)

Also, this line should be added before print:

line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')

Last, the column names in create statements should be backticks in MySQL. Add this in line 15:

  sub = sub.replace('"','`')

Here's the complete script with modifications:

import re, fileinput

def main():
  for line in fileinput.input():
    process = False
    for nope in ('BEGIN TRANSACTION','COMMIT',
                 'sqlite_sequence','CREATE UNIQUE INDEX'):
      if nope in line: break
    else:
      process = True
    if not process: continue
    m = re.search('CREATE TABLE "([a-z_]*)"(.*)', line)
    if m:
      name, sub = m.groups()
      sub = sub.replace('"','`')
      line = '''DROP TABLE IF EXISTS %(name)s;
CREATE TABLE IF NOT EXISTS %(name)s%(sub)s
'''
      line = line % dict(name=name, sub=sub)
    else:
      m = re.search('INSERT INTO "([a-z_]*)"(.*)', line)
      if m:
        line = 'INSERT INTO %s%s\n' % m.groups()
        line = line.replace('"', r'\"')
        line = line.replace('"', "'")
    line = re.sub(r"([^'])'t'(.)", "\\1THIS_IS_TRUE\\2", line)
    line = line.replace('THIS_IS_TRUE', '1')
    line = re.sub(r"([^'])'f'(.)", "\\1THIS_IS_FALSE\\2", line)
    line = line.replace('THIS_IS_FALSE', '0')
    line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')
    if re.search('^CREATE INDEX', line):
        line = line.replace('"','`')
    print line,

main()
Community
  • 1
  • 1
bb.
  • 1,353
  • 1
  • 13
  • 21
  • Just noticed this after copying his script and finding the same error. (Edited the earlier answer so its now correct). – dr jimbob Jul 29 '11 at 19:18
  • Two more corrections I required to migrate a phpBB board: 1) escabe backslash, and 2) `mediumtext(n)` --> `mediumtext`; – davide Apr 02 '14 at 23:51
  • 1
    I had to modify this script a bit to get it to work for me (had some issues importing dates, escaped semi-colons in strings, and having the table name in quotes) https://gist.github.com/rednebmas/df2535924bbbf7f32f3dd6040e370d15 – SamB May 20 '18 at 00:54
8

Here is a slightly better version of the original.

#! /usr/bin/perl
use strict;
use warnings;
use 5.010; # for s/\K//;

while( <> ){
  next if m'
    BEGIN TRANSACTION   |
    COMMIT              |
    sqlite_sequence     |
    CREATE UNIQUE INDEX
  'x;

  if( my($name,$sub) = m'CREATE TABLE \"([a-z_]*)\"(.*)' ){
    # remove "
    $sub =~ s/\"//g; #"
    $_ = "DROP TABLE IF EXISTS $name;\nCREATE TABLE IF NOT EXISTS $name$sub\n";

  }elsif( /INSERT INTO \"([a-z_]*)\"(.*)/ ){
    $_ = "INSERT INTO $1$2\n";

    # " => \"
    s/\"/\\\"/g; #"
    # " => '
    s/\"/\'/g; #"

  }else{
    # '' => \'
    s/\'\'/\\\'/g; #'
  }

  # 't' => 1
  s/[^\\']\K\'t\'/1/g; #'

  # 'f' => 0
  s/[^\\']\K\'f\'/0/g; #'

  s/AUTOINCREMENT/AUTO_INCREMENT/g;
  print;
}
Brad Gilbert
  • 32,263
  • 9
  • 73
  • 122
  • 1
    A better way would be to actually parse the input, and output it, in the correct format. – Brad Gilbert Jul 01 '09 at 18:00
  • Initially it seemed like a SQL export from sqlite would be easy enough to massage into a mysql compatible format, but the more time I spend on this the more I agree with you :) – Jiaaro Jul 01 '09 at 18:43
7

all of scripts on this page can't deal with simple sqlite3:

PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE Filename (
  FilenameId INTEGER,
  Name TEXT DEFAULT '',
  PRIMARY KEY(FilenameId) 
  );
INSERT INTO "Filename" VALUES(1,'');
INSERT INTO "Filename" VALUES(2,'bigfile1');
INSERT INTO "Filename" VALUES(3,'%gconf-tree.xml');

None were able to reformat "table_name" into proper mysql's `table_name` . Some messed up empty string value.

Mickey Mouse
  • 91
  • 1
  • 3
3

Based on http://docs.python.org/dev/howto/regex.html ...

  1. Replace $line =~ /.*/ with re.search(r".*", line).
  2. $line !~ /.*/ is just !($line =~ /.*/).
  3. Replace $line =~ s/.*/x/g with line=re.sub(r".*", "x", line).
  4. Replace $1 through $9 inside re.sub with \1 through \9 respectively.
  5. Outside a sub, save the return value, i.e. m=re.search(), and replace $1 with the return value of m.group(1).
  6. For "INSERT INTO $1$2\n" specifically, you can do "INSERT INTO %s%s\n" % (m.group(1), m.group(2)).
Brad Gilbert
  • 32,263
  • 9
  • 73
  • 122
Ken_g6
  • 83
  • 1
  • 5
3

I am not sure what is so hard to understand about this that it requires a snide remark as in your comment above. Note that <> is called the diamond operator. s/// is the substitution operator and // is the match operator m//.

Sinan Ünür
  • 113,391
  • 15
  • 187
  • 326
2

Real issue is do you know actually how to migrate the database? What is presented is merely a search and replace loop.

hpavc
  • 1,294
  • 7
  • 7
1

Shortest? The tilde signifies a regex in perl. "import re" and go from there. The only key differences are that you'll be using \1 and \2 instead of $1 and $2 when you assign values, and you'll be using %s for when you're replacing regexp matches inside strings.

anschauung
  • 3,627
  • 3
  • 22
  • 33