Background:
(1) Here is an extract from a huge ASCII file of around 700 MB:
0, 0, 0, 0, 0, 0, 0, 0, 3.043678e-05, 3.661498e-05, 2.070347e-05,
2.47175e-05, 1.49877e-05, 3.031176e-05, 2.12128e-05, 2.817522e-05,
1.802658e-05, 7.192285e-06, 8.467806e-06, 2.047874e-05, 9.621194e-05,
4.467542e-05, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.000421869,
5.0003081213, 0.0001938675, 8.70334e-05, 0.0002973858, 0.0003385935,
8.763598e-05, 2.743326e-05, 0, 0.0001043894, 3.409237e-05, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0;
(2) I would like to do two tasks:
(2.1) Find the maximum among the numbers, which are separated by commas and semicolons. In the extract above it is 5.0003081213.
(2.2) Find the largest 4 (say) values. In the extract above they are 5.0003081213, 0.000421869, 0.0003385935 and 0.0002973858.
My thought:
(3) I expect to do the work with perl.
(4) I think that I can match the numbers with ([0-9.e-]+).
My Problem:
(5) However, I am new to perl and unix, and I do not know how to proceed to find the maximum values.
(6) I searched for similar questions for half a day and found that I may be able to use List::Util. I do not know whether it is an appropriate choice for my problem, or how its subroutines would be used here.
(7) Say the numbers are contained in a file named input.txt. Is it possible to finish the tasks with a one-line script?
Thank you for your understanding; I really appreciate your help.
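To make the two tasks concrete, here is a minimal sketch using a grep and sort pipeline instead of perl. It assumes GNU grep (for -o) and a sort that supports -g (general numeric sort, which understands exponents such as 3.043678e-05). The function name top_n and the scratch file are hypothetical, and the pattern is a slightly stricter variant of the one in (4) that only handles non-negative numbers.

```shell
# Hypothetical helper: print the N largest numbers found in a file.
# $1 = how many values to print, $2 = input file.
top_n() {
    # Pull out every number (optional fraction, optional exponent),
    # sort numerically in descending order, keep the first N.
    # LC_ALL=C keeps "." as the decimal point regardless of locale.
    LC_ALL=C grep -oE '[0-9]+(\.[0-9]+)?(e[-+]?[0-9]+)?' "$2" \
        | LC_ALL=C sort -rg \
        | head -n "$1"
}

# A few values copied from the extract in (1), written to a scratch file.
sample=$(mktemp)
printf '%s\n' \
    '0, 0, 3.043678e-05, 0.000421869,' \
    '5.0003081213, 0.0001938675, 8.70334e-05, 0.0002973858, 0.0003385935,' \
    '8.763598e-05, 2.743326e-05, 0, 0.0001043894, 0;' > "$sample"

top_n 1 "$sample"   # task (2.1): the maximum
top_n 4 "$sample"   # task (2.2): the four largest values
```

For the real data the call would be top_n 4 input.txt. Since sort spills to temporary files when needed, this should cope with a 700 MB input, though I have not timed it.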
Further Question raised:
Thanks to the many warm replies and help from Stack Overflow users, the question above is solved. However, suppose I would like to find the maximum only from line 3 to line 6 of the following data:
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1.193129938e-07, 0, 0, 0, 0, 0, 0,
0, 2.505016514e-05, 4.835713883e-05, 6.128770648e-05, 1.38018881e-05, 2.303402101e-05,
0, 0, 0, 0, 3.5838803e-05, 0.000104883779, 0, 0, 1.813278467e-05, 0.0001350646297,
0.0007846746908, 0.001728603877, 0.001082733652, 0.001511217708, 0.0009537032505,
0.0004436753321, 0.002182536356, 0.0005719495782, 9.055173127e-05, 1.245663419e-05,
0.0004568318755, 0.0003056741688, 3.186642459e-05, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0.000101613512, 5.451410965e-05, 0, 0, 0, 0, 0.001172270099, 7.088900819e-05, 0,
1.848198352e-06, 0.0006870109246, 0.00276857581, 0.002038545509, 0.001111047938,
0.0007607533934, 0.0007915864957, 0.001105735631, 0.001456989534, 0.0007245351113,
0.0004262289031, 0.0003041285247, 0.0001528418892, 2.332078749e-05, 9.695149464e-05,
1.004024021e-07, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
That is,
0, 0, 0, 0, 3.5838803e-05, 0.000104883779, 0, 0, 1.813278467e-05, 0.0001350646297,
0.0007846746908, 0.001728603877, 0.001082733652, 0.001511217708, 0.0009537032505,
0.0004436753321, 0.002182536356, 0.0005719495782, 9.055173127e-05, 1.245663419e-05,
0.0004568318755, 0.0003056741688, 3.186642459e-05, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Then, how can I modify the script
grep -o '[0-9e.-]*' file | sort -rg | head -1
to achieve this purpose?
I know that the command sed can work on selected lines of a file when given a line range such as 3,6p. So I am wondering whether I can modify the above script by adding something like this. I appreciate your help again.
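A sketch of the idea, assuming sed -n 'M,Np' is used to print only the wanted lines before they reach the grep pipeline quoted above. The function name max_in_lines is hypothetical, the pattern is written with -E and + rather than *, and the scratch file holds the first ten lines of the data above.

```shell
# Hypothetical helper: maximum of the numbers on lines $1..$2 of file $3.
max_in_lines() {
    # sed -n suppresses default output; "M,Np" prints only lines M to N.
    sed -n "${1},${2}p" "$3" \
        | LC_ALL=C grep -oE '[0-9e.-]+' \
        | LC_ALL=C sort -rg \
        | head -n 1
}

# First ten lines of the data above, copied into a scratch file.
data=$(mktemp)
cat > "$data" <<'EOF'
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1.193129938e-07, 0, 0, 0, 0, 0, 0,
0, 2.505016514e-05, 4.835713883e-05, 6.128770648e-05, 1.38018881e-05, 2.303402101e-05,
0, 0, 0, 0, 3.5838803e-05, 0.000104883779, 0, 0, 1.813278467e-05, 0.0001350646297,
0.0007846746908, 0.001728603877, 0.001082733652, 0.001511217708, 0.0009537032505,
0.0004436753321, 0.002182536356, 0.0005719495782, 9.055173127e-05, 1.245663419e-05,
0.0004568318755, 0.0003056741688, 3.186642459e-05, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0.000101613512, 5.451410965e-05, 0, 0, 0, 0, 0.001172270099, 7.088900819e-05, 0,
1.848198352e-06, 0.0006870109246, 0.00276857581, 0.002038545509, 0.001111047938,
EOF

max_in_lines 3 6 "$data"    # maximum restricted to lines 3 to 6
max_in_lines 1 10 "$data"   # maximum over all ten lines, for comparison
```

On this sample the restricted call should give 0.002182536356, while the larger 0.00276857581 on line 10 is ignored unless the range includes it.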