I have a huge mainframe file extracted from a legacy system. The file is encoded in ASCII. I want to convert it to COMP-3. Is there an algorithm available in Java to do that? I also need help unpacking the COMP-3 fields: I tried some Java code to unpack them, but it produced incorrect results.

This is the code I used to unpack the COMP-3 fields:

import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Converts between integer and an array of bytes in IBM mainframe packed
 * decimal format. The number of bytes required to store an integer is (digits +
 * 1) / 2. For example, a 7 digit number can be stored in 4 bytes. Each pair of
 * digits is packed into the two nibbles of one byte. The last nibble contains
 * the sign, 0F for positive and 0C for negative. For example 7654321 becomes
 * 0x76 0x54 0x32 0x1F.
 **/ 


public class PackedDecimalToComp {

    public static void main(String[] args) {

        try {
            // test.unpackData(" 0x12345s");
            Path path = Paths.get("C:\\Users\\AV00499269\\Desktop\\Comp3 data file\\Comp3Test.txt");
            byte[] data = Files.readAllBytes(path);
            // unpackData is static, so call it directly rather than through an instance.
            unpackData(data);
        } catch (Exception ex) {
            System.out.println("Exception is :" + ex.getMessage());
        }    
    }

    // Treats the whole byte array as one packed-decimal field.
    private static String unpackData(byte[] packedData) {
        StringBuilder unpackedData = new StringBuilder();

        final int negativeSign = 0x0D; // x'D' in the last nibble marks a negative value
        for (int currentCharIndex = 0; currentCharIndex < packedData.length; currentCharIndex++) {
            int firstDigit = (packedData[currentCharIndex] >>> 4) & 0x0F;
            int secondDigit = packedData[currentCharIndex] & 0x0F;
            unpackedData.append(firstDigit);
            if (currentCharIndex == (packedData.length - 1)) {
                // The low nibble of the last byte is the sign, not a digit.
                if (secondDigit == negativeSign) {
                    unpackedData.insert(0, '-');
                }
            } else {
                unpackedData.append(secondDigit);
            }
        }
        System.out.println("Unpacked data is: " + unpackedData);

        return unpackedData.toString();
    }
}
  • You can't have COMP-3 packed fields in an ASCII file. They are not ASCII. – user207421 Dec 10 '18 at 06:20
  • I checked the file; each character was represented by its ASCII code @user207421 – Ashwini vijaykumar Dec 10 '18 at 06:36
  • @ashwini If the file was always ASCII (e.g. from a Unix box), you are OK; but if the source was EBCDIC and someone has done an EBCDIC-to-ASCII conversion, the COMP-3 fields are corrupted. See my answer in https://stackoverflow.com/questions/46313332/how-do-you-generate-javajrecord-code-for-a-cobol-copybook For binary EBCDIC files you **must** transfer the file as EBCDIC (or convert the binary fields to text **before** the translation). – Bruce Martin Dec 10 '18 at 07:46
  • @BruceMartin Is there any tool we can use to convert mainframe EBCDIC to ASCII? By the way, the file provided to me is an EBCDIC file, not ASCII; I stated that incorrectly above. – Ashwini vijaykumar Dec 13 '18 at 09:41
  • JRecord can (with Java code); you can use a COBOL copybook, use an XML file description, or define the fields in Java. Have a look at the question/answer in my last post. Also look at the CobolToCsv (https://sourceforge.net/projects/coboltocsv/) and CobolToXml child projects. – Bruce Martin Dec 13 '18 at 12:25
  • You cannot convert binary EBCDIC data to ASCII safely, because the conversion is ambiguous. – Jonathan Rosenne May 11 '19 at 18:55

1 Answer

The comment in your code is incorrect. Packed data with a positive sign has x'A', x'C', x'E', or x'F' in the last nibble. Mainframes also have a concept of "preferred sign" which is x'C' in the last nibble for positive and x'D' in the last nibble for negative.
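The sign rules above can be sketched in Java as follows. This is a minimal illustration, not the code from the question: the class name `PackedDecimalSketch` and its `pack`/`unpack` methods are hypothetical, and it additionally assumes that x'B' and x'D' both mean negative and that each byte array holds exactly one packed field.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration; not part of the question's code.
final class PackedDecimalSketch {

    /** Unpacks a single packed-decimal (COMP-3) field into an integer value. */
    static BigInteger unpack(byte[] packed) {
        if (packed.length == 0) {
            throw new IllegalArgumentException("empty field");
        }
        StringBuilder digits = new StringBuilder();
        boolean negative = false;
        for (int i = 0; i < packed.length; i++) {
            int high = (packed[i] >> 4) & 0x0F;
            int low = packed[i] & 0x0F;
            appendDigit(digits, high);
            if (i < packed.length - 1) {
                appendDigit(digits, low);
            } else if (low == 0xB || low == 0xD) {
                negative = true;                 // negative sign nibbles
            } else if (low < 0xA) {
                // 0-9 in the sign position means the field is not valid packed data
                throw new IllegalArgumentException("invalid sign nibble: " + low);
            }                                    // A, C, E, F: positive
        }
        BigInteger value = new BigInteger(digits.toString());
        return negative ? value.negate() : value;
    }

    /** Packs a value into byteLength bytes using the preferred signs C and D. */
    static byte[] pack(BigInteger value, int byteLength) {
        String digits = value.abs().toString();
        int capacity = byteLength * 2 - 1;       // one nibble is reserved for the sign
        if (digits.length() > capacity) {
            throw new IllegalArgumentException("value does not fit in " + byteLength + " bytes");
        }
        StringBuilder nibbles = new StringBuilder();
        for (int i = digits.length(); i < capacity; i++) {
            nibbles.append('0');                 // left-pad with zero digits
        }
        nibbles.append(digits);
        int sign = value.signum() < 0 ? 0xD : 0xC;
        byte[] out = new byte[byteLength];
        for (int i = 0; i < byteLength; i++) {
            int high = nibbles.charAt(2 * i) - '0';
            int low = (i == byteLength - 1) ? sign : nibbles.charAt(2 * i + 1) - '0';
            out[i] = (byte) ((high << 4) | low);
        }
        return out;
    }

    private static void appendDigit(StringBuilder sb, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("invalid digit nibble: " + nibble);
        }
        sb.append(nibble);
    }

    public static void main(String[] args) {
        System.out.println(unpack(new byte[] {0x76, 0x54, 0x32, 0x1F})); // 7654321
        System.out.println(unpack(new byte[] {0x12, 0x3D}));             // -123
        for (byte b : pack(BigInteger.valueOf(-123), 2)) {
            System.out.printf("%02X", b & 0xFF);                         // 123D
        }
        System.out.println();
    }
}
```

Rejecting a sign nibble in the range 0 to 9 is a deliberate choice here: it is a cheap way to notice that the bytes have already been through a code page conversion.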

It is common for mainframe data to include both text and binary data in a single record, for example a name, a currency amount, and a quantity:

Hopper    Grace     ar% .

...which would be...

x'C8969797859940404040C799818385404040404081996C004B'

...in hex. This is code page 37, commonly referred to as EBCDIC.

The record layout matters here: the family name is confined to the first 10 bytes, the given name to the subsequent 10 bytes, the currency amount is in packed decimal (also known as binary coded decimal) in the next 3 bytes, and the quantity is in the next 2 bytes. Without knowing this, you cannot accurately transfer the data, because code page conversion will destroy the currency amount. Converting to code page 1250, commonly in use on Microsoft Windows, you would end up with...

x'486F707065722020202047726163652020202020617225002E'

...where the text data is translated but the packed data is destroyed. The packed data no longer has a valid sign in the last nibble (the lower half of the last byte), and the currency amount itself has changed. So has the quantity: it went from decimal 75 to decimal 11,776, due to both code page conversion and reading a big-endian number as a little-endian one.
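The damage described above can be checked with a few lines of Java. This is only a sketch: the class and method names are hypothetical, and the byte values are copied from the two hex dumps in this answer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; byte values are taken from the hex dumps above.
final class ConversionDamageSketch {

    /** True if the low nibble of the last byte is a sign nibble (A through F). */
    static boolean hasValidSign(byte[] packed) {
        int sign = packed[packed.length - 1] & 0x0F;
        return sign >= 0xA;
    }

    /** Reads a 2-byte unsigned binary field in the given byte order. */
    static int readUnsignedShort(byte[] twoBytes, ByteOrder order) {
        return ByteBuffer.wrap(twoBytes).order(order).getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] amountBefore = {(byte) 0x81, (byte) 0x99, 0x6C}; // original packed amount
        byte[] amountAfter = {0x61, 0x72, 0x25};                // after code page conversion
        byte[] quantityBefore = {0x00, 0x4B};                   // original binary quantity
        byte[] quantityAfter = {0x00, 0x2E};                    // after code page conversion

        System.out.println(hasValidSign(amountBefore));  // true  (sign nibble C)
        System.out.println(hasValidSign(amountAfter));   // false (sign nibble 5)
        System.out.println(readUnsignedShort(quantityBefore, ByteOrder.BIG_ENDIAN));   // 75
        System.out.println(readUnsignedShort(quantityAfter, ByteOrder.LITTLE_ENDIAN)); // 11776
    }
}
```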

This question has some answers that may help you. My recommendation is to have all the data converted to text on the mainframe prior to transferring it to another platform. There are mainframe utilities that excel at this.
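If converting on the mainframe is not an option and the file is transferred in binary, each field has to be decoded according to the record layout. The sketch below does that for the example record, assuming the 10/10/3/2 byte layout described earlier and assuming the JDK in use provides the `IBM037` charset (it is part of the extended charsets, so a trimmed-down runtime may lack it). The class and method names are hypothetical, and digit and sign validation is omitted for brevity.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical field-aware decoder for the example record in this answer.
final class RecordDecodeSketch {

    // Assumption: the running JDK includes the IBM037 (code page 37) charset.
    private static final Charset EBCDIC = Charset.forName("IBM037");

    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Text field: translate the code page, then drop the blank padding. */
    static String text(byte[] record, int offset, int length) {
        return new String(record, offset, length, EBCDIC).trim();
    }

    /** Packed field: read nibbles directly; no code page translation. */
    static BigInteger packed(byte[] record, int offset, int length) {
        StringBuilder digits = new StringBuilder();
        int last = offset + length - 1;
        for (int i = offset; i <= last; i++) {
            digits.append((record[i] >> 4) & 0x0F);
            if (i < last) {
                digits.append(record[i] & 0x0F);
            }
        }
        int sign = record[last] & 0x0F;
        BigInteger value = new BigInteger(digits.toString());
        return (sign == 0xB || sign == 0xD) ? value.negate() : value;
    }

    /** Binary field: two bytes, big-endian (ByteBuffer's default order). */
    static int binary(byte[] record, int offset) {
        return ByteBuffer.wrap(record, offset, 2).getShort() & 0xFFFF;
    }

    public static void main(String[] args) {
        byte[] record = fromHex("C8969797859940404040C799818385404040404081996C004B");
        System.out.println(text(record, 0, 10));   // Hopper
        System.out.println(text(record, 10, 10));  // Grace
        System.out.println(packed(record, 20, 3)); // 81996
        System.out.println(binary(record, 23));    // 75
    }
}
```

The position of the decimal point in the amount is not stored in the data; it comes from the field definition (for example the COBOL copybook), so the sketch returns the raw digits only.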

cschneid