5

I am currently developing a kind of static analysis tool that performs pattern matching. I am using Flex to generate lexical analyzer, and I wrote code to manage the symbol table. I am not very experienced with C, so I decided to implement the symbol table as a linear linked list.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

struct symtab {
   int id;
   char *name;
   int type;
   struct symtab *next;
};

enum types {
   KEYWORD = 1,
   CONSTANT,
   IDENTIFIER,
   OPERATOR,
   DELIMITER,
   WHITESPACE
};

struct symtab *last_entry(struct symtab *start)
{
   struct symtab *p;
   p = start;
   while(p -> next != NULL) {
      p = p -> next;
   }
   return p;
}

void add_entry(char* name, int type, struct symtab *start)
{
   struct symtab *new;
   new = last_entry(start);
   int id;
   if(new == start) {
      new = start;
      id = 0;
   }
   else {
      new = malloc(sizeof(struct symtab));
      id = last_entry(start) -> id;
      last_entry(start) -> next = new;
   }
   new -> id = id + 1;
   new -> name = name;
       new -> type = type;
   new -> next = NULL;
}

struct symtab *find_entry(char* name, struct symtab *start)
{
   struct symtab *p;
   p = start;
   while(p -> next != NULL) {
      if(strcmp(p -> name, name) == 0) {
         return p;
      }
   }
}

However, when I use add_entry() to add symbols, and then try to find them with find_entry(), find_entry() returns null. Can someone please assist?

rici
  • 201,785
  • 23
  • 193
  • 283
Melvin Smiley
  • 99
  • 2
  • 8
  • If I were not very experienced in C, I would not implement a symbol table as a linked list, I would implement it as a large, fixed sized array. If I were experienced in C (which I am), I'd do the same. You can always convert to a linked list (or better, a dynamic array) later, but get it working first. And pro tip - a linked list is the very last data structure you should consider, in most situations. –  May 28 '11 at 14:13
  • Instead of implementing the data structure on your own, you should use an already implemented solution like Glib. There are many different data structures available for C programs: http://developer.gnome.org/glib/2.28/glib-data-types.html BTW: a symbol table isn't a list but a hash. – ceving May 28 '11 at 14:21
  • 1
    Please do not put spaces around the `->` operator. Both it and the `.` operator bind extremely tightly, and it just looks weird to anyone experienced in C or C++ to see spaces there. – Jonathan Leffler May 28 '11 at 14:29
  • Yes, even I (who gets flak a lot here for so-called "excessive" use of spaces) would not do that. –  May 28 '11 at 14:31
  • 2
    @ceving: a symbol table is not usually created using a simple linked list, but it can be created that way. Your statement "a symbol table isn't a list but a hash" is misleading, therefore. – Jonathan Leffler May 28 '11 at 14:34
  • @Jonathan: A symbol table is a hash. Everything else is wasting resources. – ceving May 28 '11 at 14:49
  • 3
    @ceving Rubbish. Suppose the symbol table need only ever contain say 3 symbols - an array (or even a linked list) implementation may well be faster and consume less memory. Many compilers use lots of small symbol tables. –  May 28 '11 at 14:59
  • @Neil: Set the initial capacity to 1 if you have a gusto for linked lists. – ceving May 29 '11 at 18:44
  • @Jonathan There is no hard rule here, it depends on the application. At runtime, a link list via direct pointer references is faster to access than a hash table. – rhody Oct 08 '20 at 18:22

1 Answers1

5

It looks like you're trying to represent the list as a header object (start), followed by the actual elements of the list. This is a good idea since it simplifies the empty-list case, but you've not got the implementation right.

When you add, you need to remove the special case code you've got for last_entry being start. The start node will never contain symbol data.

When you lookup, you've got to make sure you skip the head (start) since it doesn't contain symbol data. A second bug in your lookup code is that you stop searching when p->next is NULL (which means you can never return the final element in your list.) You should stop when p is NULL.

Of course, you shouldn't be using a linked list at all: a hash table would be a better choice since it's got better performance and memory efficiency.

  • 2
    Good points - all of them. Curiously, putting the new symbols at the end of the list as this does is also probably bad for performance; putting them at the head means that newly defined symbols are found quicker, which is often beneficial. – Jonathan Leffler May 28 '11 at 14:39
  • Thank you very much.I understand that a hash table would be more efficient, and in the future i will probably re-implement symbopl table as a hash table. – Melvin Smiley May 28 '11 at 14:44