Structures in C

Heterogeneous data

char *Name[50] ;
char  Abbrev[50][2] ;
int   Population[50] ;
Name[12]       = "North Carolina" ;
Abbrev[12][0]  = 'N' ;
Abbrev[12][1]  = 'C' ;
Population[12] = 9222414 ;

Is placing these values in three different arrays convenient, logical, or efficient?

Structure

The C struct constructor allows the definition of heterogeneous data structures with fields, or members, of different types. The name of each field within a structure definition must be unique, but different structure definitions often share a common field name.

struct state {
  char *Name ;
  char Abbrev[2] ;
  int Population ;
} ;

struct state NC ;

NC.Name       = "North Carolina" ;
NC.Abbrev[0]  = 'N' ;
NC.Abbrev[1]  = 'C' ;
NC.Population = 9222414 ;

Effectively, every struct definition introduces a new programmer-defined type. Many programmers use typedefs for structures.

typedef struct state State ;

State NC ;

The FILE "type" used for file I/O operations in C is typically defined with a typedef of a C structure. The quotes are used because technically a typedef is not a new type, but rather a "synonym" for an existing type.

By the way, the C structure is rather like a method-less class in either C++ or Java.

The dot operator

The dot operator joins a structure variable with a field or member name.

The . operator is at the highest precedence level and associates left to right. Other operators at this level are function call (), array subscript [], the arrow ->, and postfix ++ and --.

A horrendous example of this would be f("United States")[30].Population++ used as an expression; however, you do often find long sequences such as USA[12].Name[0].
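To see how these postfix operators group from left to right, here is a small sketch that writes USA[12].Name[0] both ways. It assumes the struct state definition from earlier; the function names FirstLetter and FirstLetterExplicit are made up for this illustration.

```c
struct state {
  char *Name ;
  char Abbrev[2] ;
  int Population ;
} ;

/* Hypothetical helper: first letter of the name of state number i. */
char FirstLetter(struct state *USA, int i) {
  /* [] . [] are all at the same level and group left to right */
  return USA[i].Name[0] ;
}

/* The same expression with every grouping written out. */
char FirstLetterExplicit(struct state *USA, int i) {
  return ((USA[i]).Name)[0] ;
}
```

Both functions should return the same character for any valid index, since the parentheses only make the default grouping visible.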

Structures and symbols

Consider each structure as having its own symbol table. Here's a possible one for a 32-bit Intel architecture.

field       struct  offset  size
Abbrev      state   4       2
Name        state   0       4
Population  state   8       4

The address of NC.Population would be the address of NC plus the offset of Population.

In C you are allowed to assume that fields will be allocated in memory in the same order as they appear in the definition; however, because the compiler is free to insert padding between fields, this assumption is rarely useful.
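Rather than guessing offsets, a program can ask the compiler for them with the standard offsetof macro from <stddef.h>. The sketch below assumes the struct state definition from earlier; PopulationAddress and PaddingBytes are hypothetical helpers written only for this illustration.

```c
#include <stddef.h>   /* offsetof, size_t */

struct state {
  char *Name ;
  char Abbrev[2] ;
  int Population ;
} ;

/* Hypothetical helper: the address of the Population field computed by
   hand as "address of the structure plus offset of the field". */
int *PopulationAddress(struct state *s) {
  return (int *)((char *)s + offsetof(struct state, Population)) ;
}

/* Hypothetical helper: bytes in the structure that belong to no field,
   that is, the padding inserted by the compiler. */
size_t PaddingBytes(void) {
  return sizeof(struct state)
         - (sizeof(char *) + 2 * sizeof(char) + sizeof(int)) ;
}
```

PopulationAddress(&NC) should equal &NC.Population on any compiler. With the 32-bit layout tabulated above, PaddingBytes() would be 2, but the value depends on the compiler and the architecture.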

Structures as variables

In ANSI C, one structure can be assigned (with =) to another and structures can be passed to and returned from functions. This involves copying the entire structure and is reasonable only with very small structures.
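The copying is easy to observe. In this sketch, which assumes the struct state definition from earlier, the function names GrowCopy, GrowInPlace, and Grown are invented for the illustration.

```c
struct state {
  char *Name ;
  char Abbrev[2] ;
  int Population ;
} ;

/* The parameter s is a copy of the caller's structure, so this
   increment is lost when the function returns. */
void GrowCopy(struct state s) {
  s.Population++ ;
}

/* Passing a pointer avoids the copy and lets the function
   change the caller's structure. */
void GrowInPlace(struct state *s) {
  (*s).Population++ ;
}

/* Returning a structure copies it back to the caller. */
struct state Grown(struct state s) {
  s.Population++ ;
  return s ;
}
```

After GrowCopy(NC) the population of NC is unchanged, while GrowInPlace(&NC) increases it by one. Note that copying a structure copies the Name pointer, not the characters it points to, so the original and the copy share one string.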

Initialization of structures

struct state NC = { "North Carolina", {'N','C'}, 9222414 } ;

Arrays of structures

There is nothing unusual here.

struct state USA[50] ;

USA[12].Name       = "North Carolina" ;
USA[12].Abbrev[0]  = 'N' ;
USA[12].Abbrev[1]  = 'C' ;
USA[12].Population = 9222414 ;

Nested structures

Structures nested a couple of levels can be useful and are quite common.

struct county {
  char *Name ;
  int   Population ;
  struct state State ;
} ;

struct county Haywood ;

Haywood.Name = "Haywood" ;
Haywood.State.Name = "North Carolina" ;

Dynamic allocation of structures using pointers

Dynamically allocated structures are quite common in C programs, particularly those written for second programming courses.

struct state *pNC ;

pNC = (struct state *)malloc(sizeof(struct state)) ;

Accessing a field through a pointer is so common that C provides P->F as shorthand for (*P).F. (Question: Why are the parentheses needed?)

pNC->Name       = "North Carolina" ;
pNC->Abbrev[0]  = 'N' ;
pNC->Abbrev[1]  = 'C' ;
pNC->Population = 9222414 ;
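Since malloc can fail, real code should check its result before using the pointer. Here is a minimal sketch of that pattern, assuming the struct state definition from earlier; NewState is a hypothetical helper, and it mixes both notations on purpose to show that they are interchangeable.

```c
#include <stdlib.h>   /* malloc, NULL */

struct state {
  char *Name ;
  char Abbrev[2] ;
  int Population ;
} ;

/* Hypothetical helper: allocate one structure and fill it in.
   Returns NULL if malloc fails. The caller must free() the result. */
struct state *NewState(char *name, char a0, char a1, int pop) {
  struct state *p = (struct state *)malloc(sizeof(struct state)) ;
  if (p == NULL) {
    return NULL ;
  }
  p->Name         = name ;   /* same as (*p).Name = name    */
  (*p).Abbrev[0]  = a0 ;     /* same as p->Abbrev[0] = a0   */
  p->Abbrev[1]    = a1 ;
  p->Population   = pop ;
  return p ;
}
```

A call such as NewState("North Carolina", 'N', 'C', 9222414) would be paired with a later free of the returned pointer.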

Recursive structure definitions

It only gets worse. Structures are often defined to contain pointers to other structures, and some of those structures may even be of the same type! Often this is quite natural. After all, a structure representing a person may need a reference to another person, such as a mother!

typedef struct Person *PersonRef ;

typedef struct Person {
  char Name[80] ;
  PersonRef mother ;
} PersonNode ;
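As a sketch of how such a definition might be used, the following follows mother pointers until it reaches a person whose mother is unknown. SetPerson and MaternalAncestors are hypothetical helpers, and treating a NULL mother as "unknown" is an assumption of this example.

```c
#include <stddef.h>   /* NULL */
#include <string.h>   /* strncpy */

typedef struct Person *PersonRef ;

typedef struct Person {
  char Name[80] ;
  PersonRef mother ;
} PersonNode ;

/* Hypothetical helper: fill in a person. A NULL mother means "unknown". */
void SetPerson(PersonRef p, const char *name, PersonRef mother) {
  strncpy(p->Name, name, sizeof p->Name - 1) ;
  p->Name[sizeof p->Name - 1] = '\0' ;   /* strncpy may not terminate */
  p->mother = mother ;
}

/* Hypothetical helper: count the known maternal ancestors of p. */
int MaternalAncestors(PersonRef p) {
  int count = 0 ;
  while (p != NULL && p->mother != NULL) {
    ++count ;
    p = p->mother ;
  }
  return count ;
}
```

For a child whose mother and grandmother are both recorded, MaternalAncestors would return 2.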

Linked list examples

Recursively defined structures are often chained together to form linked lists.

struct state {
  char *Name ;
  char Abbrev[2] ;
  int Population ;
} ;

typedef struct state State ;

typedef struct stateLink *StateRef ;

typedef struct stateLink {
  State    value ;
  StateRef next ;
} StateNode ;

Useful function

Here's a useful function for our example.

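#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* strlen, strcpy */

StateRef AllocateState(char *name, char *abbr, int pop) {
  /* Copy the name so the node owns its own string. */
  char *nameCopy = (char *)malloc(strlen(name)+1) ;
  if (nameCopy == NULL) {
    return (StateRef)NULL ;
  }
  strcpy(nameCopy, name) ;

  StateRef R = (StateRef)malloc(sizeof(StateNode)) ;
  if (R == NULL) {
    free(nameCopy) ;   /* do not leak the name if the node cannot be allocated */
    return (StateRef)NULL ;
  }
  R->value.Name       = nameCopy ;
  R->value.Abbrev[0]  = abbr[0] ;
  R->value.Abbrev[1]  = abbr[1] ;
  R->value.Population = pop ;
  R->next             = (StateRef)NULL ;
  return R ;
}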
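Every node built by AllocateState owns two malloc'ed blocks, the node itself and its copy of the name, so a list should eventually be released. The following is one possible counterpart, not part of the original example; FreeStates is a hypothetical name, and it assumes every Name in the list came from malloc.

```c
#include <stdlib.h>   /* free, NULL */

struct state {
  char *Name ;
  char Abbrev[2] ;
  int Population ;
} ;

typedef struct state State ;

typedef struct stateLink *StateRef ;

typedef struct stateLink {
  State    value ;
  StateRef next ;
} StateNode ;

/* Hypothetical helper: free every node of a list along with the name
   copy each node owns. Returns the number of nodes freed. */
int FreeStates(StateRef s) {
  int freed = 0 ;
  while (s != NULL) {
    StateRef next = s->next ;   /* save the link before the node is freed */
    free(s->value.Name) ;
    free(s) ;
    s = next ;
    ++freed ;
  }
  return freed ;
}
```

The link is read into a local variable first because s->next must not be touched once s has been freed.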

Dangerous function

Consider the folly of the following: it returns the address of the local variable P, which ceases to exist when the function returns.

StateRef BadAllocateState(char *name, char *abbr, int pop) {
  StateNode P ;  
  char *nameCopy = (char *)malloc(strlen(name)+1) ;
  if (nameCopy == NULL) {
    return (StateRef)NULL ;
  }
  strcpy(nameCopy, name) ;

  StateRef             R = &P ;
  R->value.Name       = nameCopy ;
  R->value.Abbrev[0]  = abbr[0] ;
  R->value.Abbrev[1]  = abbr[1] ;
  R->value.Population = pop ;
  R->next             = (StateRef)NULL ;
  return R ;
}

Allocating for the first state

There are some historical inaccuracies in this example. There were no computers when Delaware ratified the Constitution.

firstNode = AllocateState("Delaware", "DE", 55000) ;
lastNode  = firstNode ;

Allocating for the twelfth state

  newNode = AllocateState("North Carolina", "NC", 37000) ;
  lastNode->next = newNode ;
  lastNode = newNode ;
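The first state and every later state are handled differently above, because an empty list has no last node to link from. One way to fold both cases into a single routine is sketched here; AppendState is a hypothetical helper, and keeping separate first and last pointers is an assumption carried over from the fragments above.

```c
#include <stddef.h>   /* NULL */

struct state {
  char *Name ;
  char Abbrev[2] ;
  int Population ;
} ;

typedef struct state State ;

typedef struct stateLink *StateRef ;

typedef struct stateLink {
  State    value ;
  StateRef next ;
} StateNode ;

/* Hypothetical helper: append node to the list whose ends are *first
   and *last. Does nothing if node is NULL (a failed allocation). */
void AppendState(StateRef *first, StateRef *last, StateRef node) {
  if (node == NULL) {
    return ;
  }
  node->next = NULL ;
  if (*first == NULL) {
    *first = node ;           /* empty list: node is also the first */
  } else {
    (*last)->next = node ;    /* otherwise link it after the old last */
  }
  *last = node ;
}
```

With both pointers initialized to NULL, each state could then be added with a call of the form AppendState(&first, &last, AllocateState(...)).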

Counting the number of states

int NumberStates(StateRef s) {
  int number = 0 ;
  while (s != NULL) {
    ++number ;
    s = s->next ;
  }
  return number ;
}

Finding the state with the largest population

char * MostPeople(StateRef s) {
  int mostPeople = 0 ;
  char *biggestState = NULL ;   /* stays NULL if the list is empty */
  while (s != NULL) {
    if (s->value.Population > mostPeople) {
      mostPeople = s->value.Population ;
      biggestState = s->value.Name ;
    }
    s = s->next ;
  }
  return biggestState ;
}

Updating the population of North Carolina

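void PlusPlusNC(StateRef s) {
  /* Abbrev has no terminating '\0', so compare exactly two characters. */
  while (s != NULL && strncmp(s->value.Abbrev, "NC", 2) != 0) {
    s = s->next ;
  }
  if (s != NULL) {   /* do nothing if North Carolina is not in the list */
    s->value.Population++ ;
  }
}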
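The search loop in PlusPlusNC is useful for any state, so it could be factored into a routine of its own. This is only a sketch; FindState is a hypothetical name, and returning NULL for "not found" is a design choice of this example.

```c
#include <stddef.h>   /* NULL */
#include <string.h>   /* strncmp */

struct state {
  char *Name ;
  char Abbrev[2] ;
  int Population ;
} ;

typedef struct state State ;

typedef struct stateLink *StateRef ;

typedef struct stateLink {
  State    value ;
  StateRef next ;
} StateNode ;

/* Hypothetical helper: return the first node whose two-letter
   abbreviation matches abbr, or NULL if there is none. */
StateRef FindState(StateRef s, const char *abbr) {
  /* Abbrev is not '\0'-terminated, hence strncmp with a length of 2. */
  while (s != NULL && strncmp(s->value.Abbrev, abbr, 2) != 0) {
    s = s->next ;
  }
  return s ;
}
```

A caller would test the result before using it, for example incrementing the population only when FindState(list, "NC") is not NULL.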

Reversing the Dakotas

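void DakotaSwitch(StateRef s) {
  StateRef Dakota1 ;
  StateRef Dakota2 ;
  /* Stop at the node just before the first Dakota. This assumes that
     neither Dakota is the first node and that the two are adjacent. */
  while (s != NULL && s->next != NULL
         && strncmp(s->next->value.Abbrev, "ND", 2)
         && strncmp(s->next->value.Abbrev, "SD", 2)) {
    s = s->next ;
  }
  if (s == NULL || s->next == NULL || s->next->next == NULL) {
    return ;   /* no Dakota found, or nothing follows it */
  }
  Dakota1 = s->next ;
  Dakota2 = Dakota1->next ;
  s->next = Dakota2 ;              /* predecessor now points at the second Dakota */
  Dakota1->next = Dakota2->next ;  /* first Dakota takes over the rest of the list */
  Dakota2->next = Dakota1 ;        /* second Dakota now precedes the first */
}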