0

I have 2 csv files. In one file I have a phone number with prices and in the second file I have a phone number with the name of its owner.

First file: file1.csv

491732234332;30,99
491723427343;12,59
491732097232;33,31

Second file: file2.csv

01732/234332;Ben Jefferson
01723/427343;Jon Doe
01732/097232;Benjamin Franklin

My problem is, that the phone number columns are formatted differently and I can not find a way to compare them.

Desired output is:

01732/234332;30,99;Ben Jefferson
01723/427343;12,59;Jon Doe
01732/097232;33,31;Benjamin Franklin

My sql statement is

create temp table FILETB1
      (phonenr    char(30),
       price      char(30)
      );

create temp table FILETB2
      (phonenr char(40),
       owner char(60)   
      );

       load from "file1.csv"
       insert into FILETB1;

       load from "file2.csv"
       insert into FILETB2;

       unload to "output.csv"

       select FILETB1.phonenr, FILETB1.price, FILETB2.owner
       from FILETB1, FILETB2
       where FILETB1.phonenr = FILETB2.phonenr

How do I have to modify my where clause to be able to compare both columns?

We are working on linux with IBM INFORMIX-SQL Version 7.50.UC5 which makes finding a working solution not easier since many functions are not supported...

Any help is highly appreciated!

Alexander Baltasar
  • 1,006
  • 1
  • 11
  • 25
  • What OS are you using. I hope is UNIX/Linux if so tell me what shell flavour is in use. Are those prefix (4917? for the csv1 and 0173?/ for csv2) all the same size (5 digits for the csv1 and 5 digits and the '/' for the csv2)? If so, in the csv2 you can use the `tr` command to switch the '/' to a ';' and use `cut` on the csv1 to drop the prefix then you can unload it to a temp table and cross directly the 1st column on the csv1 with the 2nd one of the cvs2. – Ricardo Henriques Feb 12 '16 at 11:56
  • The point is not the "tr"! I have problems with the different formating of the phonenumbers. "491732234332" is not the same as "01732/234332" so my "where clause" does not work. – Alexander Baltasar Feb 12 '16 at 12:08
  • Note that ISQL is a set of front-end tools; the server version is separate from, and largely independent of, the version of ISQL. – Jonathan Leffler Feb 13 '16 at 04:16

2 Answers2

1

Using just the facilities of ISQL, you can use:

CREATE TEMP TABLE FILETB1
(
    phonenr    CHAR(30),
    price      CHAR(30)
);

CREATE TEMP TABLE FILETB2
(
    phonenr CHAR(40),
    owner CHAR(60)   
);

LOAD FROM "file1.csv" DELIMITER ';' INSERT INTO FILETB1;
LOAD FROM "file2.csv" DELIMITER ';' INSERT INTO FILETB2;

UNLOAD TO "output.csv" DELIMITER ';'
    SELECT FILETB2.phonenr, FILETB1.price, FILETB2.owner
      FROM FILETB1, FILETB2
     WHERE FILETB1.phonenr[3,6] = FILETB2.phonenr[2,5]
       AND FILETB1.phonenr[7,12] = FILETB2.phonenr[7,12];

Testing with DB-Access, I got:

$ dbaccess stores so-35360310.sql


Database selected.


Temporary table created.


Temporary table created.


3 row(s) loaded.


3 row(s) loaded.


3 row(s) unloaded.


Database closed.

$ cat output.csv
01732/234332;30,99;Ben Jefferson;
01723/427343;12,59;Jon Doe;
01732/097232;33,31;Benjamin Franklin;
$

The key is using the built-in substring [start,end] operator. You compare the two parts of the phone numbers that are comparable. And you select the number from file2.csv (table FILETB2) because that's the format you wanted.

For the sample data, of course, you could simply use Unix command line tools to do the job, but knowing how to do it inside the DBMS is helpful too.

You could also use the SUBSTR(col, start, len) function:

UNLOAD TO "output2.csv" DELIMITER ';'
    SELECT FILETB2.phonenr, FILETB1.price, FILETB2.owner
      FROM FILETB1, FILETB2
     WHERE SUBSTR(FILETB1.phonenr, 3, 3) = SUBSTR(FILETB2.phonenr, 2, 3)
       AND SUBSTR(FILETB1.phonenr, 7, 6) = SUBSTR(FILETB2.phonenr, 7, 6);

This produces the same output from the sample data.

If ISQL does not recognize the DELIMITER ';' clause to the UNLOAD (or LOAD) pseudo-SQL statements, then you can set the environment variable DBDELIMITER=';' before running the script and remove those clauses from the SQL.

Jonathan Leffler
  • 666,971
  • 126
  • 813
  • 1,185
0

The sugestion is, for the file2.csv if you use tr you get:

[infx1210@tardis ~]$ cat file2.csv | tr '/' ';' > file.2
[infx1210@tardis ~]$ cat file.2
01732;234332;Ben Jefferson
01723;427343;Jon Doe
01732;097232;Benjamin Franklin
[infx1210@tardis ~]$

For the file1.csv if you know that the prefix is always 6 digits long you can use:

    [infx1210@tardis ~]$ cut -c7- file1.csv > file.1
    [infx1210@tardis ~]$ cat file.1
    234332;30,99
    427343;12,59
    097232;33,31
    [infx1210@tardis ~]$

As you can see you can use the 1st field of the file.1 to cross directly with the 2nd one on the file.2.

Then you can execute:

CREATE TEMP TABLE filetb1(
    phonenr CHAR(30),
    price   CHAR(30)
);

CREATE TEMP TABLE filetb2(
    prefix  CHAR(30),
    phonenr CHAR(30),
    owner   CHAR(60)   
);

LOAD FROM 'file.1' DELIMITER ';' INSERT INTO filetb1;

LOAD FROM 'file.2' DELIMITER ';' INSERT INTO filetb2;

UNLOAD TO 'output.csv' DELIMITER ';'
SELECT 
    TRIM(f2.prefix )|| '/' || TRIM(f2.phonenr),
    f1.price,
    f2.owner
FROM 
    filetb1 f1, filetb2 f2
WHERE 
    f1.phonenr = f2.phonenr;

And you'll get the disered ouput:

[infx1210@tardis ~]$ cat output.csv
01732/234332;30,99;Ben Jefferson;
01723/427343;12,59;Jon Doe;
01732/097232;33,31;Benjamin Franklin;
[infx1210@tardis ~]$

If you're not sure that the prefix on the file1.csv is a 6 digit length leave it and use LIKE:

CREATE TEMP TABLE filetb1(
    phonenr CHAR(30),
    price   CHAR(30)
);

CREATE TEMP TABLE filetb2(
    prefix  CHAR(30),
    phonenr CHAR(30),
    owner   CHAR(60)   
);

LOAD FROM 'file.1' DELIMITER ';' INSERT INTO filetb1;

LOAD FROM 'file.2' DELIMITER ';' INSERT INTO filetb2;

UNLOAD TO 'output.csv' DELIMITER ';'
SELECT 
    TRIM(f2.prefix )|| '/' || TRIM(f2.phonenr),
    f1.price,
    f2.owner
FROM 
    filetb1 f1, filetb2 f2
WHERE 
    f1.phonenr LIKE TRIM(f2.phonenr)||'%';
Ricardo Henriques
  • 1,026
  • 6
  • 13