
I have a DataTable with thousands of records, and a Postgres table with the same fields as the DataTable. Every day I want to truncate this table and fill it again with the data from the DataTable. I have looked at SqlBulkCopy, but it is not available for Postgres. So which is the most effective way?

  • One insert per record
  • A multi-row insert: insert into table values (1,1),(1,2),(1,3),(2,1);
  • Select from the DataTable and insert into Postgres with LINQ? No idea...

Thanks.

Za7pi

5 Answers


PostgreSQL definitely does have a bulk copy (it's actually called COPY), and Npgsql provides a nice wrapper for it in .NET. If you are loading data, you want NpgsqlCopyIn; if you are extracting data, you can use NpgsqlCopyOut.

Your question is a little vague on details -- I don't know the fields in your DataTable or anything about your actual database, so take this as a brief example of how to bulk insert data into a table using C#/PostgreSQL:

    // Start a COPY ... FROM STDIN operation (old Npgsql 2.x API)
    NpgsqlCopyIn copy = new NpgsqlCopyIn(
        "COPY table1 FROM STDIN WITH NULL AS '' CSV;", conn);
    copy.Start();

    // The serializer writes delimited rows into the COPY stream
    NpgsqlCopySerializer cs = new NpgsqlCopySerializer(conn);
    cs.Delimiter = ",";

    foreach (var record in RecordList)
    {
        cs.AddString(record.UserId);
        cs.AddInt32(record.Age);
        cs.AddDateTime(record.HireDate);
        cs.EndRow();
    }

    cs.Close();
    copy.End();

-- Edit 8/27/2019 --

The API for Npgsql has completely changed. Below is boilerplate for the same example as above, using binary import (a text import is also available):

using (var writer = conn.BeginBinaryImport(
    "COPY user_data.part_list FROM STDIN (FORMAT BINARY)"))
{
    foreach (var record in RecordList)
    {
        writer.StartRow();
        writer.Write(record.UserId, NpgsqlTypes.NpgsqlDbType.Text);
        writer.Write(record.Age, NpgsqlTypes.NpgsqlDbType.Integer);
        writer.Write(record.HireDate, NpgsqlTypes.NpgsqlDbType.Date);
    }

    writer.Complete();
}
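The original answer also mentions extracting data (NpgsqlCopyOut); in current Npgsql the counterpart is BeginTextExport (or BeginBinaryExport). A minimal sketch, assuming the same hypothetical table1 and an open NpgsqlConnection conn as above:

```csharp
using System;
using System.IO;
using Npgsql;

// Stream rows out of PostgreSQL as CSV text via COPY ... TO STDOUT.
using (TextReader exportReader = conn.BeginTextExport(
    "COPY table1 TO STDOUT WITH (FORMAT CSV)"))
{
    string line;
    while ((line = exportReader.ReadLine()) != null)
    {
        Console.WriteLine(line); // each line is one CSV row
    }
}
```

This mirrors the import example: the server does the CSV formatting, and the client just consumes the stream, so it scales to large tables without materializing them in memory.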
Hambone

You can check my other answer, in which I describe a little helper I created for this problem that makes using another library really easy: https://stackoverflow.com/a/46063313/6654362

Edit: I recently ran into a similar problem, but we were using PostgreSQL. I wanted to use an effective bulk insert, which turned out to be pretty difficult. I haven't found any proper free library to do it on this DB. I only found this helper: https://bytefish.de/blog/postgresql_bulk_insert/, which is also on NuGet. I wrote a small mapper that auto-maps properties the way Entity Framework does:

public static PostgreSQLCopyHelper<T> CreateHelper<T>(string schemaName, string tableName)
{
    var helper = new PostgreSQLCopyHelper<T>(schemaName, "\"" + tableName + "\"");
    var properties = typeof(T).GetProperties();
    foreach (var prop in properties)
    {
        // Skip key columns; they are generated by the database
        if (Attribute.IsDefined(prop, typeof(KeyAttribute)))
            continue;

        var type = prop.PropertyType;
        switch (type)
        {
            case Type intType when intType == typeof(int) || intType == typeof(int?):
                helper = helper.MapInteger("\"" + prop.Name + "\"", x => (int?)prop.GetValue(x, null));
                break;
            case Type stringType when stringType == typeof(string):
                helper = helper.MapText("\"" + prop.Name + "\"", x => (string)prop.GetValue(x, null));
                break;
            case Type dateType when dateType == typeof(DateTime) || dateType == typeof(DateTime?):
                helper = helper.MapTimeStamp("\"" + prop.Name + "\"", x => (DateTime?)prop.GetValue(x, null));
                break;
            case Type decimalType when decimalType == typeof(decimal) || decimalType == typeof(decimal?):
                helper = helper.MapMoney("\"" + prop.Name + "\"", x => (decimal?)prop.GetValue(x, null));
                break;
            case Type doubleType when doubleType == typeof(double) || doubleType == typeof(double?):
                helper = helper.MapDouble("\"" + prop.Name + "\"", x => (double?)prop.GetValue(x, null));
                break;
            case Type floatType when floatType == typeof(float) || floatType == typeof(float?):
                helper = helper.MapReal("\"" + prop.Name + "\"", x => (float?)prop.GetValue(x, null));
                break;
            case Type guidType when guidType == typeof(Guid):
                helper = helper.MapUUID("\"" + prop.Name + "\"", x => (Guid)prop.GetValue(x, null));
                break;
        }
    }
    return helper;
}

I use it the following way (I had an entity named Undertaking):

var undertakingHelper = BulkMapper.CreateHelper<Model.Undertaking>("dbo", nameof(Model.Undertaking));
undertakingHelper.SaveAll(transaction.UnderlyingTransaction.Connection as Npgsql.NpgsqlConnection, undertakingsToAdd);

I showed an example with a transaction, but it can also be done with a normal connection retrieved from the context. undertakingsToAdd is an enumerable of ordinary entity records that I want to bulk insert into the DB.

This solution, which I arrived at after a few hours of research and experimentation, is, as you would expect, much faster, and it is easy to use and free! I really advise you to use this solution, not only for the reasons mentioned above, but also because it's the only one that gave me no problems with PostgreSQL itself; many other solutions only work flawlessly with, for example, SqlServer.

Michał Pilarek

There are some options for bulk inserting into PostgreSQL.

For example, in my library, I'm using the SQL COPY command:

COPY TableName (Column1, Column2, Column3) FROM STDIN BINARY

Disclaimer: I'm the owner of the project Bulk-Operations.NET

This library makes it very easy to perform any kind of bulk operation:

  • BulkInsert
  • BulkUpdate
  • BulkDelete
  • BulkMerge

across multiple database providers, including PostgreSQL:

// Easy to use
var bulk = new BulkOperation(connection);
bulk.BulkInsert(dt);
bulk.BulkUpdate(dt);
bulk.BulkDelete(dt);
bulk.BulkMerge(dt);
Jonathan Magnan

As said in other answers, there is no built-in solution for this, just some helper libraries (free and non-free); I personally came up with my own solution. Its advantages are:

  • Free and easy to use
  • Doesn't demand extra setup for mappings; it reuses metadata from the DB itself and from the EF DbContext
  • Uses dynamic code generation for performance

Usage is like this:

var uploader = new NpgsqlBulkUploader(context);
var data = GetALotOfData();
uploader.Insert(data);
// OR
uploader.Update(data);

I described it there

Tony

The solutions mentioned above require you to specify the number of columns and their types, which makes your code table-specific. If your tables are relatively small and have the same number of columns and the same/compatible column types, you can do it in a generic way. Suppose you want to migrate an SQLite table to PostgreSQL:

// Get data from the SQLite database table
SQLiteConnection sqliteConnection = new SQLiteConnection(new SQLiteConnectionStringBuilder() { DataSource = @"C:\dataBase.sqlite" }.ConnectionString);
sqliteConnection.Open();
var reader = new SQLiteCommand("SELECT * FROM table_which_we_want_to_migrate", sqliteConnection).ExecuteReader();
var dataFromSqliteTable = new DataTable() { CaseSensitive = true };
dataFromSqliteTable.Load(reader);

// Connect to PostgreSql database
var connection = new NpgsqlConnection(new NpgsqlConnectionStringBuilder()
{
    Host = "localhost",
    Port = 5432,
    Database = "DatabaseName",
    Username = "UserName",
    Password = "Password"
}.ToString());
connection.Open();

// Insert every row from the Sqlite table into PostgreSql table
foreach (DataRow row in dataFromSqliteTable.Rows)
{
    // Create an NpgsqlParameter for every field in the row
    var parameters = new List<DbParameter>();
    for (var i = 0; i < dataFromSqliteTable.Columns.Count; i++)
    {
        parameters.Add(new NpgsqlParameter($"@p{i}", row[i]));
    }
    var parameterNames = string.Join(", ", parameters.Select(p => p.ParameterName));
    
    // Create an INSERT SQL query which inserts the data from the current row into PostgreSql table
    var command = new NpgsqlCommand(
        $"INSERT INTO table_which_we_want_to_migrate VALUES ({parameterNames})",
        connection);
    command.Parameters.AddRange(parameters.ToArray());
    command.ExecuteNonQuery();
}

An alternative is to use command-line utilities and import/export via a CSV file. This way is much faster and works even with big tables:

sqlite3 dataBase.sqlite ".output 'temporaryFile.csv.tmp'" ".headers off" ".mode csv" "SELECT * FROM table_which_we_want_to_migrate;" ".quit"
psql --command="\copy table_which_we_want_to_migrate FROM 'temporaryFile.csv.tmp' DELIMITER ',' CSV"
PolarBear