I have a text file containing 10 rows. Each row has 10 comma-separated elements, and each row is already sorted, like this:

3463,34957,44443,50481,71036,73503,74289,76671,82462,92527
1456,2731,18159,20440,32962,38562,49321,64220,67615,72541
1073,6217,9695,27372,30624,38021,47851,68479,76834,88021
7930,11882,17681,27267,32131,45096,59008,69156,72843,94146
2381,4359,30194,40730,73714,74721,75127,78830,86753,89475
1466,21335,21369,23342,36973,50888,67891,78069,90346,99970
15015,16628,21012,25483,42387,42519,45472,49552,57193,71449
1751,8833,35433,39972,44475,47604,51601,59108,87957,94764
10728,17248,31885,41453,41479,54785,81400,83554,86014,87105
228,9479,25187,50956,70720,71878,78744,84341,86637,88225

Now I want to sort these 100 elements without disturbing the row order (i.e. the smallest number (228) should be at the first position and the largest number (99970) at the last position), and I need to store the fully sorted numbers in another file.

I am having trouble adding these numbers to an array, and I also want to know how to sort them. The constraint is that no more than 10 elements may be in RAM at a time.

I have started writing some code to read the data from the file:

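import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Class name chosen for illustration only
public class ReadSortedLines
{
    public static void main(String args[])
    {
        File file = new File("SortedLines.txt");
        // try-with-resources closes the stream even if reading fails
        try (FileInputStream fis = new FileInputStream(file))
        {
            int content;
            // read() returns one byte at a time, or -1 at end of file
            while ((content = fis.read()) != -1)
            {
                // convert to char and display it
                System.out.print((char) content);
            }
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}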
tshepang
  • What do you mean by "without disturbing the row order?" Your explanation for it makes no sense. Do you mean you want to sort each row separately? – 2rs2ts Apr 04 '14 at 14:25
  • No, the rows are already sorted. Actually, I need to sort every data element in that file. After sorting, each row should contain exactly 10 elements, as there were initially 10 elements per row in my input file. Sorry for my bad English – Harpreet Singh Apr 04 '14 at 16:37

3 Answers

For each row:

  1. Split the String read from the file by ','.
  2. Create a new Integer[splittedString.length], iterate over the string array, convert each String with Integer.parseInt(..), and put the result in the matching position of the created Integer[].
  3. Call Arrays.sort(..) with the created array.
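
The three steps above can be sketched roughly as follows. This is only a minimal illustration: the class name RowSorter and the method parseAndSort are made up for the example, and it uses an int[] instead of the Integer[] mentioned above to avoid boxing.

```java
import java.util.Arrays;

class RowSorter {
    // Hypothetical helper: parses one comma-separated line and sorts it ascending.
    static int[] parseAndSort(String line) {
        String[] parts = line.split(",");           // step 1: split by ','
        int[] values = new int[parts.length];       // step 2: one slot per token
        for (int i = 0; i < parts.length; i++) {
            values[i] = Integer.parseInt(parts[i].trim());
        }
        Arrays.sort(values);                        // step 3: sort the row
        return values;
    }

    public static void main(String[] args) {
        int[] sorted = parseAndSort("50481,3463,71036,34957,44443");
        System.out.println(Arrays.toString(sorted)); // [3463, 34957, 44443, 50481, 71036]
    }
}
```

Note that this only sorts within a single row, so it keeps at most one row in memory but does not order the rows relative to each other.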
markusw

If the numbers to sort are laid out as you describe above, you could simply use Arrays.sort():

  • Create one array containing all numbers (I understand these are integers), say myUnsortedArray

  • Call Arrays.sort(myUnsortedArray).

That should do the job of sorting the array. You can then transform it the way you want.
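
A rough sketch of that idea, assuming the whole file fits in memory (so it ignores the RAM constraint from the question). The names WholeFileSorter, sortAll and sortFile are invented for the example; the output is regrouped into rows of a given length so that each row keeps 10 elements.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WholeFileSorter {
    // Sorts every number found in the given lines and regroups them into
    // comma-separated rows of rowLength values (the last row may be shorter).
    static List<String> sortAll(List<String> lines, int rowLength) {
        int[] myUnsortedArray = lines.stream()
                .filter(line -> !line.trim().isEmpty())
                .flatMap(line -> Arrays.stream(line.split(",")))
                .mapToInt(token -> Integer.parseInt(token.trim()))
                .toArray();
        Arrays.sort(myUnsortedArray);

        List<String> rows = new ArrayList<>();
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < myUnsortedArray.length; i++) {
            if (row.length() > 0) {
                row.append(',');
            }
            row.append(myUnsortedArray[i]);
            boolean rowFull = (i + 1) % rowLength == 0;
            boolean lastValue = i == myUnsortedArray.length - 1;
            if (rowFull || lastValue) {
                rows.add(row.toString());
                row.setLength(0);
            }
        }
        return rows;
    }

    // Hypothetical file wrapper: reads the input, sorts it, writes the result.
    static void sortFile(Path input, Path output, int rowLength) throws IOException {
        Files.write(output, sortAll(Files.readAllLines(input), rowLength));
    }

    public static void main(String[] args) {
        List<String> demo = Arrays.asList("5,1,9", "4,8,2");
        System.out.println(sortAll(demo, 3)); // prints [1,2,4, 5,8,9]
    }
}
```

For the file in the question, something like sortFile(Paths.get("SortedLines.txt"), Paths.get("FullySorted.txt"), 10) would be the call, where the output file name is just a placeholder.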

Hope this helps.

Diferdin
  • The problem is that in reality I have a file containing 1 million entries in a 1000 * 1000 matrix, and I can't hold more than 10000 elements in RAM at a time. – Harpreet Singh Apr 04 '14 at 16:42
  • This is my problem Statement "You are given a Million of numbers and you have to find the 100th smallest number." PROBLEM CONSTRAINT: Not more than 10,000 elements can be in RAM at a time – Harpreet Singh Apr 04 '14 at 16:42
  • I am using files to accomplish that, but unfortunately I'm not good at file handling – Harpreet Singh Apr 04 '14 at 16:44
  • @HarpreetSingh If that's your actual problem you should have asked that. – 2rs2ts Apr 04 '14 at 17:02
  • If you have a solution for that then please provide it to me. I have solved almost half of my problem and now I'm stuck – Harpreet Singh Apr 04 '14 at 18:29
  • @HarpreetSingh I think it is impolite to approach this forum asking for solutions the way you do in the above comment. This forum is about helping each other (hence you are supposed to help others out the same way you are being helped here), and your score does not really show much commitment to others. So just bear in mind that nobody owes you ready-made solutions. Having said that, please look at the other answer I'm about to post. – Diferdin Apr 07 '14 at 08:48

The comments to answers above seem to slightly shift the problem here. You stated the problem as follows:

in reality I have a file containing 1 million entries in a 1000 * 1000 matrix, and I can't hold more than 10000 elements in RAM at a time

And then:

"You are given a Million of numbers and you have to find the 100th smallest number." PROBLEM CONSTRAINT: Not more than 10,000 elements can be in RAM at a time

I would suggest editing your original question to reflect this problem, if this is really the ultimate problem you are trying to solve.

What I understand of your problem is: you have to find the 100th smallest number in a 1000*1000 matrix (please note: this is quite different from saying you've got 1 million numbers), with the constraint that no more than 10,000 numbers can be kept in memory. If I am correct, a potential solution may be:

  1. Load one matrix row into memory as an array.
  2. Sort it with Arrays.sort() as suggested before, and keep its 100 smallest values in an array; let's call it minValues.
  3. Keep the lowest and highest values of minValues in temporary variables; let's call them a and b.
  4. For each value of the subsequent rows, let's call it x, check whether x < b (values below a qualify too, so testing a < x < b would wrongly skip them). If that's the case, insert x into minValues at its sorted position. This pushes the last element out of the array, so you'll have to update b (and a, if x is the new minimum).
  5. At the end of all iterations, minValues will contain the 100 smallest elements in the matrix; just pick the last (i.e. b) and that will be your 100th smallest element.
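
Here is a minimal sketch of this approach, under a few assumptions of mine: the input is comma-separated text read one row at a time, the class and method names (KthSmallest, kthSmallest) are invented, and the first row is fed through the same insertion logic as the others instead of being sorted separately. It compares each value against b only (the last element of minValues), so values smaller than a are also picked up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

class KthSmallest {
    // Returns the k-th smallest value (1-based) among all comma-separated numbers,
    // holding only the current row plus k candidates in memory.
    static int kthSmallest(Reader source, int k) {
        if (k < 1) throw new IllegalArgumentException("k must be at least 1");
        int[] minValues = new int[k]; // kept sorted ascending over [0, size)
        int size = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) continue;
                for (String token : line.split(",")) {
                    int x = Integer.parseInt(token.trim());
                    // b is minValues[k - 1]; once full, skip anything not below it
                    if (size == k && x >= minValues[k - 1]) continue;
                    int pos = Arrays.binarySearch(minValues, 0, size, x);
                    if (pos < 0) pos = -pos - 1; // convert to insertion point
                    // shift the tail right by one; when full, the old b falls off
                    int last = Math.min(size, k - 1);
                    System.arraycopy(minValues, pos, minValues, pos + 1, last - pos);
                    minValues[pos] = x;
                    if (size < k) size++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (size < k) throw new IllegalArgumentException("fewer than k numbers in input");
        return minValues[k - 1];
    }

    public static void main(String[] args) {
        String rows = "3463,34957,44443\n1456,2731,18159\n228,9479,25187";
        System.out.println(kthSmallest(new StringReader(rows), 3)); // prints 2731
    }
}
```

For the real file you would pass a FileReader (or Files.newBufferedReader) and k = 100 instead of the in-memory demo string.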

You can parameterise this method with any value (e.g. if you need the 157th smallest element), and the memory footprint is

100 elements (for minValues) + 1000 elements (the row under inspection) + 2 (a and b) = 1102 elements

That is still way below the maximum limit of 10,000 elements. Performance in terms of speed may not be great, but this requirement was not in the picture, and anyway, when dealing with large amounts of data under tight memory limits you do have to trade away some performance.

I'd love to hear of a better way of achieving the goal.
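
One possible refinement, offered only as a sketch and not as part of the method above: keep the 100 candidates in a bounded max-heap (java.util.PriorityQueue with a reversed comparator) instead of a sorted array, so each replacement costs O(log K) rather than shifting array elements. The class name HeapSelect is made up for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.PriorityQueue;

class HeapSelect {
    // Returns the k-th smallest value (1-based) using a max-heap of at most k elements.
    static int kthSmallest(Iterable<Integer> values, int k) {
        if (k < 1) throw new IllegalArgumentException("k must be at least 1");
        PriorityQueue<Integer> heap = new PriorityQueue<>(k, Collections.reverseOrder());
        for (int x : values) {
            if (heap.size() < k) {
                heap.add(x);              // still filling the first k slots
            } else if (x < heap.peek()) {
                heap.poll();              // drop the current k-th smallest
                heap.add(x);              // and keep the smaller newcomer
            }
        }
        if (heap.size() < k) throw new IllegalArgumentException("fewer than k values");
        return heap.peek();               // largest of the k smallest
    }

    public static void main(String[] args) {
        System.out.println(kthSmallest(Arrays.asList(3463, 1456, 228, 9479, 2731), 3)); // prints 2731
    }
}
```

The memory footprint stays at K boxed values plus whatever is needed to read the current row.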

EDIT: I'd also suggest checking out the Frederickson and Johnson algorithm. It solves the issue in O(K) time, where K is the rank of the element sought (100 in your case). Not sure about the memory footprint though.

Hope this helps.

Community
Diferdin