1

I know similar questions have been posted before, but when I try to follow similar approaches as per the suggestions in the comments, it simply does not help. My query is the following:

LOAD DATA INFILE 'File.txt'
IGNORE
INTO TABLE table_name
FIELDS TERMINATED BY '^~'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(RUN_DATE, PROC_DT, STL_DT, TRD_DT)
SET RUN_DATE = STR_TO_DATE(RUN_DATE, '%d-%b-%y'); 

The records in the file look something like this:

RUN_DATE^~PROC_DT^~STL_DT^~TRD_DT
21-DEC-20^~23-DEC-20^~23-DEC-20^~21-DEC-20

The dates that get loaded are all populated as '0000-00-00 00:00:00' which I know are the default values when there is a datatype error and IGNORE is used. From what I found online, the issue has to do with the in-file date not being in yyyy-mm-dd format which is the default for mySQL, but the '%d-%b-%y' in the STR_TO_DATE function should help alleviate this issue since

%d: Day of the month as a numeric value (01 to 31) -

%b: Abbreviated month name (Jan to Dec) -

%y: Year as a numeric, 2-digit value

Why is this not helping? I also tried making the months lower case using LOWER() thinking maybe the abbreviated months needed to be all lower case, but this produces the same result. What am I missing here?

Bill Karwin
  • 462,430
  • 80
  • 609
  • 762

1 Answers1

2

To read from the file but store a modified value, you need to use variables:

LOAD DATA INFILE 'File.txt'
IGNORE
INTO TABLE table_name
FIELDS TERMINATED BY '^~'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(@RUN_DATE, @PROC_DT, @STL_DT, @TRD_DT)
SET RUN_DATE = STR_TO_DATE(@RUN_DATE, '%d-%b-%y'),
    PROC_DT = STR_TO_DATE(@PROC_DT, '%d-%b-%y'),
    STL_DT = STR_TO_DATE(@STL_DT, '%d-%b-%y'),
    TRD_DT = STR_TO_DATE(@TRD_DT, '%d-%b-%y');
ysth
  • 88,068
  • 5
  • 112
  • 203
  • Hi @ysth, thank you so much for your response. It fixed the issue in some of the columns, however, in some of the other columns it now has NULL instead. – Jesus Rodriguez Jan 11 '21 at 03:31
  • 1
    sorry, I had the wrong variable name one place – ysth Jan 11 '21 at 04:14
  • 1
    likely STR_TO_DATE is failing to parse a date and returning NULL. show the values that are ending up NULL? – ysth Jan 11 '21 at 04:15
  • 1
    Either something is wrong in your data or something is wrong in your sql. Show your full sql and the lines of data that are not working correctly? – ysth Jan 11 '21 at 05:05
  • Sorry, for deleting the comments, I'm trying to be clear and succinct. So for example the columns above RUN_DATE, PROC_DT, STL_DT, TRD_DT have values '21-DEC-20^~17-DEC-20^~21-DEC-20^~18-DEC-20' and the logic is exactly identical for each of these, but RUN_DATE is NULL while the others are perfectly populated – Jesus Rodriguez Jan 11 '21 at 05:08
  • 1
    if you do `SHOW WARNINGS;` immediately after the LOAD DATA, what does that show? – ysth Jan 11 '21 at 05:09
  • 1
    does `SHOW CREATE TABLE table_name` show any differences between the date column types? – ysth Jan 11 '21 at 05:11
  • 1
    are you sure you have @RUN_DATE both places, not RUN_DATE? – ysth Jan 11 '21 at 05:14
  • So the SHOW WARNINGS shows that I'm using the previous column, which is an 8 digit ID as RUN_DATE. RUN_DATE is the second column and its using the first column to populate it. I'm guessing only a few got correctly populated because the prev column is a date and the other errors are due to the same column offset. How do I fix this? When I load the whole table everything except the dates get correctly populated. – Jesus Rodriguez Jan 11 '21 at 05:15
  • 1
    sorry, I thought you meant you had found the problem. if the columns are off by one, then the list just before the SET doesn't match the fields in the file. make it match (specifying the column names for the non-date fields and the variable names for the date fields) – ysth Jan 12 '21 at 05:34
  • Yes, I actually did find a solution by doing just as you said. I had to write all 154 columns and then do the preprocessing logic on just the date columns. Thank you SO much for your help! – Jesus Rodriguez Jan 12 '21 at 15:54