0

I'm trying to create a sorted list in ascending order. I must read a file which contains a year in every line ( so I want to order from the earliest date to the most recent).

What I'm trying to accomplish with my code is:

  1. List item
  2. Retrieve data (year) from a line of the .csv file;
  3. Iterate over the list until the place for the node containing data is found;
  4. Repeat until the file is over;
  5. Print;

Whenever I try to run it, the Virtual box starts lagging and doesn't do anything. (even when I remove the print function).

I've been trying to solve this for 3 days now, so I'm quite desperate.

Here's my code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>

struct node {
int data;
int key;
struct node *next;
} node;


struct node *head_countries = NULL;
struct node *current = NULL;

//display the list
void printList() {
   struct node *ptr = head_countries;
   printf("\n[ ");


    //start from the beginning
    while(ptr != NULL) {
    printf("(%d,%d)",ptr->key,ptr->data);
    ptr = ptr->next;
}

printf(" ]");
}

//insert link at the first location
struct node* insertFirst(int key, int data) {
    //create a link
    struct node *link = (struct node*) malloc(sizeof(struct node));

    link->key = key;
    link->data = data;

    //point it to old first node
    link->next = head_countries;

    //point first to new first node
    head_countries = link;

    return link;
    }

void insertAfter(struct node* PrevNode, int key, int data)
{
    struct node *NewNode = (struct node*) malloc(sizeof(struct node));

    if ( PrevNode == NULL )
    {
        printf("Erro 1");
        return;
    }

    NewNode->key = key;
    NewNode->data = data;

    NewNode->next = PrevNode->next;
    PrevNode->next = NewNode;

}


void CreateCountriesList()
{
    char linha[150];
    char cabecalho[100];
    int key = 0, data = 0;
    int test[15] = {0};

    test[0] = 10;
    test[1] = 25;
    test[2] = 7;
    test[3] = 5;
    test[4] = 30;
    test[5] = 40;
    test[6] = 3;
    test[7] = 4;
    test[8] = 98;
    test[10] = 4;
    test[11] = 14;
    test[12] = 23;
    test[13] = 16;
    test[14] = 35;
    test[15] = 6;

    //dados_temp New_Entry;

    struct node* curr_head = NULL;
    struct node* curr = NULL;

    FILE *inputf;
    inputf = fopen("tempcountries_all.csv", "r");


    if (inputf == NULL)
    {
        printf("Nao da pa abrir o ficheiro");
        exit(EXIT_FAILURE);
    }

    fgets(cabecalho, 100, inputf);

    for (key = 0; key <  20 ; key++)
    {
        data = test[key]

    if ( head_countries == NULL || key == 2 )
    {
        insertFirst( key, data );
    }

    else
    {
        curr = head_countries;
        //insertAfter(curr, key, data);
        //printf("%d\n", curr->data);
        //curr = curr->next;

        while ( curr->next != NULL )
        {
            //printf("%d", key);
            if ( curr->data < data && curr->next->data > data )
            {
                insertAfter(curr, key, data);
            }

            if ( data == curr->data )
            {
                //insertAfter(curr, key, data);
            }

            curr = curr->next;
        }
    }


}

printList();

fclose(inputf);
}

int main() {
CreateCountriesList();
return EXIT_SUCCESS;
}

Is it because the list is too big? If so, how do you recommend I proceed for a list this big?

Thank you in advance!

EDIT: removed warnings from code, and unused functions.

EDIT: added test.

  • Welcome to Stack Overflow. Please read the [About] and [Ask] pages before long. Please note that an MCVE ([MCVE]) includes sample input data and the actual and expected output. As to your question, fix the "few warnings about initialization" and come back with the fixed code. If the compiler is complaining, it means you have bugs. If you're on a box with GCC, use `gcc -std=c11 -Wall -Wextra -Werror -Wmissing-prototypes -Wstrict-prototypes` (the `-Werror` means you won't be able to run the code until it compiles with no warnings). I don't bother with running code that isn't that clean. – Jonathan Leffler May 25 '18 at 15:52
  • Also note that `void main()` is wrong except, perhaps, on [Windows](http://stackoverflow.com/questions/204476/). – Jonathan Leffler May 25 '18 at 15:56
  • Note that `NULL` is a pointer constant, not a character constant: `error: initialization of ‘char’ from ‘void *’ makes integer from pointer without a cast [-Werror=int-conversion]` — `char linha[150] = {NULL};` — to fix that, use `'\0'` instead of `NULL`. Also, there are 4 unused functions in the code; those are inherently not part of an MCVE. – Jonathan Leffler May 25 '18 at 16:00
  • You don't check that `fgets()` succeeds. You don't check that `sscanf()` succeeds. After you manage to insert the first row, you can never insert another because `while (curr->next != NULL)` is never true. And your VM goes spinning its wheels because you aren't detecting EOF because you don't check `fgets()`. – Jonathan Leffler May 25 '18 at 16:06
  • @JonathanLeffler I've checked, but I decided not to include it in the code, since I didn't find it necessary. I don't understand? How is it never true? My goal with that while is to iterate over all the list's elements. – Carlos Martins May 25 '18 at 16:14
  • 1
    Step through the code with a debugger. And don't make us try to debug an approximation to the code that is causing trouble — that is totally unfair to us. Make sure that you are debugging the MCVE that you post, and that the MCVE reproduces the exact problem. You've not supplied even 5 lines of sample data (which should be enough). – Jonathan Leffler May 25 '18 at 16:18
  • When you insert the first record, you set `link->next = head_countries;` where the current value of `head_countries` is null (it gets set in the next line). (Incidentally, you don't check that the memory allocation succeeds; you should! It often feels like 50% or more of C is checking for errors.) Then when you enter the alternative code, you set `curr = head_country` and test `while (curr->next != NULL)` but it is null, so you skip the while-loop and ... go read the next line, etc.\ – Jonathan Leffler May 25 '18 at 16:22
  • Because you scan linearly through the list to find each insertion position, the overall cost of finding all the insertion positions scales with the square of the total number of nodes. If you really have in excess of 500k nodes, then that is going to take a while. – John Bollinger May 25 '18 at 16:24
  • For certain input orders you *could* do much better than that, but you won't, because you keep scanning even after performing the insertion. – John Bollinger May 25 '18 at 16:25
  • @JonathanLeffler Unfortunately I've yet to learn how to use the debugger since I'm very new to C. I've updated the code with some test values. Before I tried sorting, I've created an unsorted list with all the values and everything was working fine, no issues. However I will find out how to check for allocation errors, thank you! Before entering the while loop, it goes trough an if that checks if head_country is NULL, and attributes a value to it, so that when it reaches the while it's never NULL. – Carlos Martins May 25 '18 at 16:35
  • @JohnBollinger is there a simplier way to do this? Can you specify how much is a 'while'? More or less than 10 minutes? I don't understand, what do you mean "you keep scanning even after performing the insertion"? – Carlos Martins May 25 '18 at 16:37
  • But when you’ve got a single node, `head_country->next` is NULL and that’s what the loop tests. – Jonathan Leffler May 25 '18 at 16:38
  • @JonathanLeffler but it never has a single node when it reaches the while loop, since I add a node if head_country is NULL and then for when key = 2 it adds another one, so it only goes to the while loop when key = 3, and by then the list already has 2 nodes – Carlos Martins May 25 '18 at 16:47
  • OK; if you say so, I'll believe you. I've only read the code, not run it. Time to learn how to debug — either with a debugger or with print statements. I usually use print statements; they suit my style better. – Jonathan Leffler May 25 '18 at 17:04
  • Out of curiosity, what's the point of a sorted linked list? Typically the reason you'd want to sort the contents of a data structure is to enable faster searching, but you don't get that benefit with a list since you can only access its elements sequentially. Are you sure this is the right data structure for your application? – Joe Farrell May 25 '18 at 17:47
  • @JoeFarrell It's part of an assignment, and this is a requirement – Carlos Martins May 25 '18 at 18:30

2 Answers2

1

You have several problems, but the most significant one seems to revolve around how you insert data into the list:

    while ( curr->next != NULL )
    {
        //printf("%d", key);
        if ( curr->data < data && curr->next->data > data )
        {
            insertAfter(curr, key, data);
        }

        if ( data == curr->data )
        {
            //insertAfter(curr, key, data);
        }

        curr = curr->next;
    }

You do not break from the loop after performing an insertion, so observe what happens:

  • You find an appropriate predecessor, P, for the data you want to insert. curr will at that time point to P.
  • You insert the a new node, N, after P.
  • You continue iterating through the list. The next node you consider is the successor to P, which is now N.
  • N also meets the conditions for predecessor to your data (data == N.data), so you insert another new node. And another. And another ...

This will continue indefinitely, and the computer is indeed likely to start slowing down before too long, as its physical and virtual memory become loaded down with all the nodes the program allocates.

John Bollinger
  • 121,924
  • 8
  • 64
  • 118
  • I should add a break after each if then? – Carlos Martins May 25 '18 at 16:38
  • @CarlosMartins, what point is there in continuing to scan the list when you've already inserted the data? Even if you avoided inserting duplicate nodes, it would be wasteful. But no, you should not break *after* each `if`. You should break after each insertion. – John Bollinger May 25 '18 at 16:40
  • yes I understand! So I could insert a break after the insertAfter function? Or even put some nested else if. – Carlos Martins May 25 '18 at 16:45
  • Yes, @CarlosMartins, the simplest thing to do, at least to resolve this, most significant, part of your problem, would be to insert a `beak` after each call to `insertAfter()`. – John Bollinger May 25 '18 at 18:31
  • I've tried it, and now the VM stopped slowing down when I run the program, however it still doesn't print anything – Carlos Martins May 25 '18 at 18:54
0

So i guess your biggest problem is to make that list...

This is how i would make those functions.

    #include<stdio.h>
#include<stdlib.h>


typedef struct node {
int data;
int key;
struct node *next;
} node;


node *new_node(int key, int data)
{
    node * n =malloc(sizeof(node));
    if(!n)
    {
        fprintf(stderr,"malloc failed");
        exit(EXIT_FAILURE);
    }
    n->data = data;
    n->key = key;
    n->next = NULL;
}

node *insert_middle(node *next, int key, int data)
{
    node *new = new_node(key,data);
    new->next = next;
    return new;
}


node * insert_node(node *n, int key, int data)
{
    if(!n)
    {
        return new_node(key, data);
    }
    else if(n->data < data)
    {
        n->next = insert_node(n->next, key, data);
    }
    else
    {
        return insert_middle(n, key, data);
    }

}

int main()
{
    node *head=NULL;
    node *tmp = NULL;
    int i,j;
    for(i=0;i<10;i++)
    {
        scanf("%d",&j);
        tmp = insert_node(head, 0, j);
         //keep track of the head
        if(tmp!=head)
            head = tmp;
    }
    printf("====================");
    while(head)
    {
        printf("%d\n",head->data);
        head=head->next;
    }
    return 0;
}

Some bugs in your code i have found:

for (key = 0; key <  20 ; key++)
{
    //you never change the value of data
    data = test[0];

if ( head_countries == NULL || key == 2 )
{
     //you should place curr = insertFirst(key, data);
     //the way you have written the code nothing happens
    //inserFirst function just return the vale and you don't store it anywhere so you can not use it
    insertFirst( key, data );
}

else
{
    curr = head_countries;
    //after this line of code curr=NULL in first iteration 

     //curr->next will cause SIGSEGV since curr=NULL, so NULL->next is not possible
    while ( curr->next != NULL )
    {
        //printf("%d", key);


        if ( curr->data < data && curr->next->data > data )
        {
            insertAfter(curr, key, data);
        }
        if ( data == curr->data )
        {
            //insertAfter(curr, key, data);
        }

        curr = curr->next;
    }

Too many bugs... I think you should start again. If you dont understand my example just comment and i will give my best to explain

Aleksa Jovanovic
  • 187
  • 1
  • 11
  • First of all, thank you for your answer! I wrote the test part wrong, but it wasn't in the original code. About the SIGSEGV, I must admit I don't know anything about it, but from what you wrote, that wouldn't occur since curr->next will never be NULL, because of the while loop. – Carlos Martins May 25 '18 at 18:31
  • You can accept an answer or like if it helped you – Aleksa Jovanovic May 25 '18 at 20:10
  • I did leave a like.. but it's not publicly displayed until I have 15 rep :( But I still have the issue so I wouldn't say there has been an answer yet – Carlos Martins May 25 '18 at 20:15
  • Alright then leave in comment what is still bothering you – Aleksa Jovanovic May 25 '18 at 20:23