Perl : Trying to deference an array after sorting it

Question

I am currently writing a perl script where I have a reference to an array (students) of references. After adding the hash references to the array. Now I add the references to the array of students and then ask the user how to sort them. This is where it gets confusing. I do not know how to deference the sorted array. Using dumper I can get the sorted array but in a unorganized output. How can I deference the array of hash references after sorting?

#!bin/usr/perl
use strict;
use warnings;
use Data::Dumper;
use 5.010;
#reference to a var $r = \$var; Deferencing $$r
#reference to an array $r = \@var ; Deferencing @$r
#referenc to  a hash $r = \%var ; deferencing %$r

my $filename = $ARGV[0];
my $students = [];




open ( INPUT_FILE , '<', "$filename" ) or die "Could not open to read \n ";


sub readLines{
    
    while(my $currentLine = <INPUT_FILE>){
        chomp($currentLine);
        my @myLine = split(/\s+/,$currentLine);
        my %temphash = (
        name => "$myLine[0]",
        age  => "$myLine[1]",
        GPA  => "$myLine[2]",
        MA   => "$myLine[3]"
        );
        pushToStudents(\%temphash);
    }
    
}

sub pushToStudents{
    my $data = shift;
    push $students ,$data;
}

sub printData{
    
    my $COMMAND = shift;
    
    if($COMMAND eq "sort up"){
        my @sortup = sort{ $a->{name} cmp $b->{name} } @$students;
        print Dumper @sortup;
        
    }elsif($COMMAND eq "sort down"){
        my @sortdown = sort{ $b->{name} cmp $a->{name} } @$students;
        print Dumper @sortdown;
        //find a way to deference so to make a more organize user friendly read.
        
    }else{
        print "\n quit";
    }
    
    
}

readLines();
#Output in random, the ordering of each users data is random
printf"please choose display order : ";
my $response = <STDIN>;
chomp $response;
printData($response);

My mistake. `print Dumper(\@sortdown);` All the same, I would have expected *something*, not "doesn't do anything". — cdarke, Apr 25 '15 at 07:39
What do you mean by *“unorganised output”*? Are you trying to get a tidy, formatted output using `Dumper`? — Borodin, Apr 25 '15 at 09:09

Sobrique · Accepted Answer · 2015-04-25T11:16:14.753

The problem here is that you're expected Dumper to provide an organised output. It doesn't. It dumps a data structure to make debugging easier. The key problem will be that hashes are explicitly unordered data structures - they're key-value mappings, they don't produce any output order.

With reference to perldata:

Note that just because a hash is initialized in that order doesn't mean that it comes out in that order.

And specifically the keys function:

Hash entries are returned in an apparently random order. The actual random order is specific to a given hash; the exact same series of operations on two hashes may result in a different order for each hash.

There is a whole section in perlsec which explains this in more detail, but suffice to say - hashes are random order, which means whilst you're sorting your students by name, the key value pairs for each student isn't sorted.

I would suggest instead of:

my @sortdown = sort{ $b->{name} cmp $a->{name} } @$students;
print Dumper @sortdown;

You'd be better off with using a slice:

my @field_order = qw ( name age GPA MA );

foreach my $student ( sort { $b -> {name} cmp $a -> {name} } @$students ) {
   print @{$student}{@field_order}, "\n";
}

Arrays (@field_order) are explicitly ordered, so you will always print your student fields in the same sequence. (Haven't fully tested for your example I'm afraid, because I don't have your source data, but this approach works with a sample data snippet).

If you do need to print the keys as well, then you may need a foreach loop instead:

foreach my $field ( @field_order ) {
    print "$field => ", $student->{$field},"\n";
}

Or perhaps the more terse:

print "$_ => ", $student -> {$_},"\n" for @field_order;

I'm not sure I like that as much though, but that's perhaps a matter of taste.

Say I wanted to separate each element in an array by a character. Would it be acceptable to use $, = ',' — MooCow, Apr 26 '15 at 17:34
That is what it is for. Be slightly cautious though, as it is a global. localising it to a scope is a good thing. But mostly I would say just use "join" instead — Sobrique, Apr 26 '15 at 19:20

score 2 · Answer 2 · edited May 23 '17 at 11:51

The essence of your mistake is to assume that hashes will have a specific ordering. As @Sobrique explains, that assumption is wrong.

I assume you are trying to learn Perl, and therefore, some guidance on the basics will be useful:

#!bin/usr/perl

Your shebang line is wrong: On Windows, or if you run your script with perl script.pl, it will not matter, but you want to make sure the interpreter that is specified in that line uses an absolute path.

Also, you may not always want to use the perl interpreter that came with the system, in which case #!/usr/bin/env perl maybe helpful for one-off scripts.

use strict;
use warnings;
use Data::Dumper;
use 5.010;

I tend to prefer version constraints before pragmata (except in the case of utf8). Data::Dumper is a debugging aid, not something you use for human readable reports.

my $filename = $ARGV[0];

You should check if you were indeed given an argument on the command line as in:

@ARGV or die "Need filename\n";
my $filename = $ARGV[0];

open ( INPUT_FILE , '<', "$filename" ) or die "Could not open to read \n ";

File handles such as INPUT_FILE are called bareword filehandles. These have package scope. Instead, use lexical filehandles whose scope you can restrict to the smallest appropriate block.

There is no need to interpolate $filename in the third argument to open.

Always include the name of the file and the error message when dying from an error in open. Surrounding the filename with ' ' helps you identify any otherwise hard to detect characters that might be causing the problem (e.g. a newline or a space).

open my $input_fh, '<', $filename
    or die "Could not open '$filename' for reading: $!";

sub readLines{

This is reading into an array you defined in global scope. What if you want to use the same subroutine to read records from two different files into two separate arrays? readLines should receive a filename as an argument, and return an arrayref as its output (see below).

   while(my $currentLine = <INPUT_FILE>){
        chomp($currentLine);

In most cases, you want all trailing whitespace removed, not just the line terminator.

       my @myLine = split(/\s+/,$currentLine);

split on /\s+/ is different than split ' '. In most cases, the latter is infinitely more useful. Read about the differences in perldoc -f split.

       my %temphash = (
        name => "$myLine[0]",
        age  => "$myLine[1]",
        GPA  => "$myLine[2]",
        MA   => "$myLine[3]"
        );

Again with the useless interpolation. There is no need to interpolate those values into fresh strings (except maybe in the case where they might be objects which overloaded the stringification, but, in this case, you know they are just plain strings.

       pushToStudents(\%temphash);

No need for the extra pushToStudents subroutine in this case, unless this is a stub for a method that will later be able to load the data to a database or something. Even in that case, it be better to provide a callback to the function.

sub pushToStudents{ my $data = shift; push $students ,$data; }

You are pushing data to a global variable. A program where there can only ever be a single array of student records is not useful.

sub printData{
    my $COMMAND = shift;
    if($COMMAND eq "sort up"){

Don't do this. Every subroutine should have one clear purpose.

Here is a revised version of your program.

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

use Carp qw( croak );

run(\@ARGV);

sub run {
    my $argv = $_[0];
    @$argv
        or die "Need name of student records file\n";

    open my $input_fh, '<', $argv->[0]
        or croak "Cannot open '$argv->[0]' for reading: $!";

    print_records(
        read_student_records($input_fh),
        prompt_sort_order(),
    );

    return;
}

sub read_student_records {
    my $fh = shift;

    my @records;

    while (my $line = <$fh>) {
        last unless $line =~ /\S/;
        my @fields = split ' ', $line;
        push @records, {
            name => $fields[0],
            age => $fields[1],
            gpa => $fields[2],
            ma => $fields[3],
        };
    }

    return \@records;
}

sub print_records {
    my $records = shift;
    my $sorter = shift;

    if ($sorter) {
        $records = [ sort $sorter @$records ];
    }

    say "@{ $_ }{ qw( age name gpa ma )}" for @$records;

    return;
}

sub prompt_sort_order {
    my @sorters = (
        [ "Input order",  undef ],
        [ "by name in ascending order",  sub { $a->{name} cmp $b->{name} } ],
        [ "by name in descending order", sub { $b->{name} cmp $a->{name} } ],
        [ "by GPA in ascending order",   sub { $a->{gpa} <=> $b->{gpa} } ],
        [ "by GPA in descending order",  sub { $b->{gpa} <=> $a->{gpa} } ],
    );

    while (1) {
        print "Please choose the order in which you want to print the records\n";
        print "[ $_ ] $sorters[$_ - 1][0]\n" for 1 .. @sorters;
        printf "\n\t(%s)\n", join('/', 1 .. @sorters);

        my ($response) = (<STDIN> =~ /\A \s*? ([1-9][0-9]*?) \s+ \z/x);
        if (
            $response and
            ($response >= 1) and
            ($response <= @sorters)
        ) {
            return $sorters[ $response - 1][1];
        }
    }

    # should not be reached
    return;
}

As a matter of style, I tend not to just take the first entry off `@ARVG` when taking file arguments - I'll usually do it as a `foreach` loop. — Sobrique, Apr 25 '15 at 13:50
This is quite something I will try substituting some of these methods into the program. I am not so familiar with regex. This will take sometime to read! — MooCow, Apr 26 '15 at 17:33

Perl : Trying to deference an array after sorting it

2 Answers2