
Update: The reason I am using code to write to the MySQL server from another server is that this is an experiment: I need to record the time it takes to finish inserting all these rows. Loading the data into MySQL locally through other methods is not the concern.

I was trying to insert over 15 million rows of records into a MySQL database located on another server. I used the "github.com/go-sql-driver/mysql" driver and wrapped the inserts in transactions. However, it takes about 75 minutes to finish inserting. The bandwidth between the two servers is over 1 GB/s, and the CPU and memory utilization of the MySQL server stays below 20% the whole time, with 1 CPU core and 2 GB of memory. The code snippet is as follows:

fmt.Println("Connecting to MySQL...")

db, err := sql.Open("mysql", "root:password@tcp(ipAddress:3306)/ssc")
if err != nil {
    panic(err.Error())
}

fmt.Println("Connected to MySQL")

defer db.Close()

var transaction *sql.Tx
var stmt *sql.Stmt
transaction, err = db.Begin()
stmt, err = transaction.Prepare(`INSERT INTO Star2002PrimaryTracks(primaryTracks, offsetInByte, sizeInByte) VALUE (?, ?, ?)`)
start := time.Now()
// primaryTracksSlice has over 10 million elements.
for i, ele := range primaryTracksSlice {
    if i != 0 && i%100000 == 0 {
        // Commit the current transaction, then start a new transaction
        // and re-prepare the statement for the next batch.
        transaction.Commit()
        elapsed := time.Since(start)
        fmt.Printf("transaction committed with 100000 rows each. time since: %s\n", elapsed)

        transaction, err = db.Begin()
        stmt, err = transaction.Prepare(`INSERT INTO Star2002PrimaryTracks(primaryTracks, offsetInByte, sizeInByte) VALUE (?, ?, ?)`)
        elapsed = time.Since(start)
        fmt.Printf("transaction and stmt initialization. time since: %s\n", elapsed)
    } else {
        stmt.Exec(ele.PrimaryTracks, ele.Offset, ele.Size)
    }
}
transaction.Commit()

transaction.Commit(), db.Begin() and transaction.Prepare() each take well under a second to finish. It seems that 100000 stmt.Exec() calls take about 30 seconds. I do not think they should take that long, since stmt.Exec() should not have anything to do with transmitting data to the MySQL server until transaction.Commit(). I would like to ask you guys out there: what might be wrong?

I feel like I should make an update. The table is created with:

CREATE TABLE `Star2002PrimaryTracks` (`id` INT UNSIGNED AUTO_INCREMENT, `primaryTracks` int, `offsetInByte` int, `sizeInByte` int, PRIMARY KEY (`id`)) ENGINE=InnoDB DEFAULT CHARSET=utf8;

without additional indexes, which could otherwise have been a cause. There is only the primary key.

I think there is something wrong with stmt.Exec(). Why does it take so long to execute when it does not have to commit data to the MySQL server? Isn't transaction.Commit() the one method that is supposed to do the job of committing data to the MySQL server, and the one that should take a relatively long time to finish?
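To sanity-check that assumption, a minimal sketch like the following could time a few individual Exec calls (same DSN, statement and table as above; the ten dummy rows and their values are purely illustrative):

package main

import (
    "database/sql"
    "fmt"
    "time"

    _ "github.com/go-sql-driver/mysql"
)

func main() {
    db, err := sql.Open("mysql", "root:password@tcp(ipAddress:3306)/ssc")
    if err != nil {
        panic(err.Error())
    }
    defer db.Close()

    tx, err := db.Begin()
    if err != nil {
        panic(err.Error())
    }
    stmt, err := tx.Prepare(`INSERT INTO Star2002PrimaryTracks(primaryTracks, offsetInByte, sizeInByte) VALUE (?, ?, ?)`)
    if err != nil {
        panic(err.Error())
    }

    // If each Exec turns out to be a synchronous round trip to the
    // server, the latency will show up in these per-call timings.
    for i := 0; i < 10; i++ {
        t := time.Now()
        if _, err := stmt.Exec(i, i, i); err != nil {
            panic(err.Error())
        }
        fmt.Printf("Exec %d took %s\n", i, time.Since(t))
    }
    if err := tx.Commit(); err != nil {
        panic(err.Error())
    }
}

If every Exec shows a latency close to one network round trip, the per-statement round trips, rather than the commit, would account for the 30 seconds per 100000 rows.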

The output of SHOW CREATE TABLE Star2002PrimaryTracks, as requested in the comments, is:

CREATE TABLE `Star2002PrimaryTracks` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `primaryTracks` int(11) DEFAULT NULL,
  `offsetInByte` int(11) DEFAULT NULL,
  `sizeInByte` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1067367 DEFAULT CHARSET=utf8
  • Does a single stmt.Exec() take about 30 seconds to finish, or do the whole 10 million rows take 30 sec? – Nameless May 27 '20 at 02:11
  • Try this method: https://stackoverflow.com/questions/19682414/how-can-mysql-insert-millions-records-faster and https://dba.stackexchange.com/questions/112409/fastest-way-to-do-bulk-import-in-mysql. The other thought is: let it take 75 minutes. By the time you get an answer and test it out, it's likely to be 75 minutes. If you just let that process run, you'd be done - assuming it is a one time effort. – zedfoxus May 27 '20 at 02:12
  • 100000 stmt.Exec()s take about 30 seconds to finish, @RajivShah. The whole 15 million rows take about 75 minutes, which, I think, is oddly slow. – Bruce May 27 '20 at 02:16
  • Or look at the [LOAD DATA LOCAL INFILE](https://godoc.org/github.com/go-sql-driver/mysql#RegisterLocalFile) hooks to push the data that way (a sketch of this approach appears after these comments). – danblack May 27 '20 at 02:17
  • How much time does it take to commit one batch of the transaction? – Nameless May 27 '20 at 02:20
  • @RajivShah you mean transaction.Commit()? transaction.Commit() takes way less than 1 second. But before transaction.Commit(), there are 100000 stmt.Exec()s, which take about 30 seconds. – Bruce May 27 '20 at 02:22
  • @danblack That would work fine, but this is an experiment. I need to insert the records into the MySQL table from another server and record the time it takes to insert those 15 million rows, and 75 minutes is far slower than I expected. – Bruce May 27 '20 at 02:26
  • What MySQL version? What my.cnf configuration is applied to the MySQL server to help it load data quickly? Can you include `SHOW CREATE TABLE Star2002PrimaryTracks` in your question? 2G may be insufficient depending on keys etc. Can you include a link to `SHOW GLOBAL STATUS` after a load? OT, but it looks like you're dropping every 100Kth record. – danblack May 27 '20 at 02:38
  • @danblack mysql Ver 14.14 Distrib 5.7.30, for Linux (x86_64) using EditLine wrapper. my.cnf: !includedir /etc/mysql/conf.d/ !includedir /etc/mysql/mysql.conf.d/ – Bruce May 27 '20 at 02:54
  • Hi @danblack, what is "a load"? I used a transaction to make insertion faster, but it does not work. – Bruce May 27 '20 at 02:54
  • @RajivShah @danblack @zedfoxus I think there is something wrong with stmt.Exec(). Why does it take so long to execute when it does not have to commit data to the MySQL server? Isn't transaction.Commit() the method that is supposed to do the job of committing data to the MySQL server and to take a relatively long time to finish? – Bruce May 27 '20 at 02:57
  • Your assumption that commit does all the work is an unfounded one! Commit finalises the transaction, but each insert is sent to the MySQL server one by one and entered into the transaction log one by one. Using a prepared statement saves some time, but not that much. – Shadow May 27 '20 at 03:00
  • This is why MySQL has a bulk insert syntax and also provides the LOAD DATA statement. These are the statements you should use for transferring large amounts of data. – Shadow May 27 '20 at 03:02
  • @Shadow Really? Thank you for pointing it out. So with this in mind, how should I change the code to make insertions faster? Do I really have to build "INSERT INTO Star2002PrimaryTracks(primaryTracks, offsetInByte, sizeInByte) VALUE (?, ?, ?)(?, ?, ?)(?, ?, ?)..."? – Bruce May 27 '20 at 03:03
  • stop repeating yourself. Read [insert optimization](https://dev.mysql.com/doc/refman/5.7/en/insert-optimization.html) and [innodb bulk loading](https://dev.mysql.com/doc/refman/5.7/en/optimizing-innodb-bulk-data-loading.html). You look like you haven't done any tuning - set innodb_buffer_pool_size to 1.5G and [innodb_log_file_size](https://dba.stackexchange.com/questions/1261/how-to-safely-change-mysql-innodb-variable-innodb-log-file-size#1265) to 5G as a first estimate. – danblack May 27 '20 at 03:03
  • @danblack Thank you. I'm sorry. I am reading the three links you posted. – Bruce May 27 '20 at 03:10
  • I will let you guys know when I make a "breakthrough." – Bruce May 27 '20 at 03:11
  • @Bruce, first, try what danblack mentioned about `load data infile` or `load data local infile`. https://stackoverflow.com/questions/14127529/mysql-import-data-from-csv-using-load-data-infile. Take the 100K records and dump it to a text file. Then, ask MySQL to load that text file into your table (or even a temporary table) as a start. – zedfoxus May 27 '20 at 03:20
  • Hi guys, I managed to achieve pretty good insertion performance by using "INSERT INTO Star2002PrimaryTracks(primaryTracks, offsetInByte, sizeInByte) VALUE (?, ?, ?),(?, ?, ?),(?, ?, ?)..." as this post mentions: https://stackoverflow.com/questions/12486436/how-do-i-batch-sql-statements-with-package-database-sql (a sketch of this appears after these comments). Thank you guys for helping. I really appreciate your time, insights and patience. – Bruce May 27 '20 at 05:13
  • @Bruce could you type in your own answer and mark it as accepted so that others who come to the question can see that it is a solved question and can refer to your answer for guidance? How long did it take 100K transactions to get inserted? – zedfoxus May 27 '20 at 13:22
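A minimal sketch of the multi-row VALUES batching Bruce describes above, assuming a track struct matching the question's element type (field names assumed) and an illustrative batch size of 1,000; each batch must stay under the server's max_allowed_packet:

package main

import (
    "database/sql"
    "strings"
)

// track mirrors the element type used in the question; field names are assumed.
type track struct {
    PrimaryTracks int
    Offset        int
    Size          int
}

// insertBatched sends batchSize rows per INSERT statement, so one round
// trip to the server carries many rows instead of one.
func insertBatched(db *sql.DB, rows []track, batchSize int) error {
    for start := 0; start < len(rows); start += batchSize {
        end := start + batchSize
        if end > len(rows) {
            end = len(rows)
        }
        chunk := rows[start:end]

        // Build "(?, ?, ?),(?, ?, ?),..." and flatten the arguments to match.
        placeholders := make([]string, 0, len(chunk))
        args := make([]interface{}, 0, len(chunk)*3)
        for _, r := range chunk {
            placeholders = append(placeholders, "(?, ?, ?)")
            args = append(args, r.PrimaryTracks, r.Offset, r.Size)
        }
        query := "INSERT INTO Star2002PrimaryTracks(primaryTracks, offsetInByte, sizeInByte) VALUES " +
            strings.Join(placeholders, ",")
        if _, err := db.Exec(query, args...); err != nil {
            return err
        }
    }
    return nil
}

With roughly 1,000 rows per round trip instead of one, the number of network round trips drops by three orders of magnitude, which is consistent with the speedup Bruce reports.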
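For completeness, a minimal sketch of the LOAD DATA LOCAL INFILE route danblack links above, assuming the rows have first been dumped to a hypothetical /tmp/tracks.csv; the driver must allowlist the file via mysql.RegisterLocalFile, and the server must have local_infile enabled:

package main

import (
    "database/sql"

    "github.com/go-sql-driver/mysql"
)

func main() {
    // Allowlist the CSV so the driver is willing to send it to the server.
    // /tmp/tracks.csv is a hypothetical dump of the 15 million rows.
    mysql.RegisterLocalFile("/tmp/tracks.csv")

    db, err := sql.Open("mysql", "root:password@tcp(ipAddress:3306)/ssc")
    if err != nil {
        panic(err.Error())
    }
    defer db.Close()

    // One statement streams the whole file; the server applies it in bulk.
    _, err = db.Exec(`LOAD DATA LOCAL INFILE '/tmp/tracks.csv'
        INTO TABLE Star2002PrimaryTracks
        FIELDS TERMINATED BY ','
        (primaryTracks, offsetInByte, sizeInByte)`)
    if err != nil {
        panic(err.Error())
    }
}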

0 Answers