char *Name[50] ; char Abbrev[50][2] ; int Population[50] ;
Name[12] = "North Carolina" ; Abbrev[12][0] = 'N' ; Abbrev[12][1] = 'C' ; Population[12] = 9222414 ;
Is placing these values in three different arrays convenient, logical, or efficient?
The C struct
construct allows the definition of
heterogeneous data structures with fields, or members,
of different types.
The name of each field within a structure definition must be unique,
but different structure definitions often share a common field name.
struct state {
    char *Name ;
    char Abbrev[2] ;
    int Population ;
} ;

struct state NC ;
NC.Name = "North Carolina" ;
NC.Abbrev[0] = 'N' ;
NC.Abbrev[1] = 'C' ;
NC.Population = 9222414 ;
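As a small illustration of the naming rule, the sketch below declares a second, hypothetical structure (struct city, which is not part of the original example) that reuses the member names Name and Population. The town name and its population are made-up placeholders.

```c
/* The state structure from the text. */
struct state { char *Name ; char Abbrev[2] ; int Population ; } ;

/* Hypothetical second structure: it reuses the member names Name and
   Population, which is legal because each struct has its own member names. */
struct city { char *Name ; int Population ; } ;

/* Adds same-named members taken from two different structure types. */
int CombinedPopulation(void)
{
    struct state NC ;
    struct city  Town ;

    NC.Name = "North Carolina" ;
    NC.Abbrev[0] = 'N' ;
    NC.Abbrev[1] = 'C' ;
    NC.Population = 9222414 ;

    Town.Name = "Exampleville" ;   /* placeholder name */
    Town.Population = 1000 ;       /* placeholder figure */

    return NC.Population + Town.Population ;
}
```

The compiler resolves each member name against the type of the variable on the left of the dot, so the two Population members never collide.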
Effectively, every struct
definition
introduces a new programmer-defined type.
Many programmers use typedefs
for structures.
typedef struct state State ; State NC ;
The FILE
"type" used for file I/O operations in C
is typically defined with a typedef
of a C structure.
The quotes are used because
technically a typedef
is not a new type, but rather
a "synonym" for an existing type.
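One way to see that a typedef is only a synonym is to mix the two spellings freely. The sketch below is illustrative; the function names PopulationOf and TypedefDemo are assumptions made for this example.

```c
struct state { char *Name ; char Abbrev[2] ; int Population ; } ;
typedef struct state State ;

/* Declared with the struct tag... */
int PopulationOf(struct state s)
{
    return s.Population ;
}

/* ...but callable with a variable declared through the typedef,
   because State and struct state name the same type. */
int TypedefDemo(void)
{
    State NC ;
    struct state copy ;

    NC.Name = "North Carolina" ;
    NC.Abbrev[0] = 'N' ;
    NC.Abbrev[1] = 'C' ;
    NC.Population = 9222414 ;

    copy = NC ;    /* no cast needed: both variables have the same type */
    return PopulationOf(NC) == PopulationOf(copy) ;
}
```

No conversion or cast appears anywhere, which would not be possible if State were a genuinely distinct type.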
By the way, the C structure is rather like a method-less class in either C++ or Java.
The dot operator joins a structure variable with a field or member name.
The .
operator has the highest precedence level and
associates left to right. Other operators at this level are:
( )
[ ]
.
->
++ (postfix)
-- (postfix)
A horrendous example of this would be
f("United States")[30].Population++
used as an expression;
however, you do often find long sequences such as
USA[12].Name[0].
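Because all of these operators share one precedence level and group left to right, a long sequence is read from the left one step at a time. The sketch below spells out that grouping for the expressions mentioned above; the array USA and the two function names are assumptions made for this illustration.

```c
struct state { char *Name ; char Abbrev[2] ; int Population ; } ;

static struct state USA[50] ;

/* USA[12].Name[0] groups as ((USA[12]).Name)[0]:
   index the array, select the member, then index the string. */
char FirstLetter(void)
{
    USA[12].Name = "North Carolina" ;
    return ((USA[12]).Name)[0] ;
}

/* USA[12].Population++ groups as ((USA[12]).Population)++:
   the postfix ++ applies to the member, not to USA or USA[12]. */
int BumpPopulation(void)
{
    int before ;

    USA[12].Population = 9222414 ;
    before = USA[12].Population++ ;    /* value of the expression is the old count */
    return USA[12].Population - before ;
}
```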
Consider each structure as having its own symbol table. Here's a possible one for a 32-bit Intel architecture.
field | struct | offset (bytes) | size (bytes) |
---|---|---|---|
Abbrev | state | 4 | 2 |
Name | state | 0 | 4 |
Population | state | 8 | 4 |
The address of NC.Population
would be the
address of NC
plus the offset of Population.
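The address arithmetic described above can be checked with the standard offsetof macro from <stddef.h>, which reports the offset the compiler actually chose. This is a minimal sketch; the function name AddressMatchesOffset is an assumption made for this example, and the offsets on a given machine need not match the 32-bit table.

```c
#include <stddef.h>   /* offsetof */

struct state { char *Name ; char Abbrev[2] ; int Population ; } ;

/* Returns 1 when the address of s->Population equals the address of
   the structure plus the offset of the Population member. */
int AddressMatchesOffset(const struct state *s)
{
    const char *base = (const char *)s ;    /* byte address of the struct */
    const char *viaOffset = base + offsetof(struct state, Population) ;

    return viaOffset == (const char *)&s->Population ;
}
```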
In C you are allowed to assume that fields will be allocated in memory in the same order as they appear in the definition; however, because the compiler is free to insert padding between fields, this assumption is rarely useful.
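Padding can be observed directly by comparing member offsets and sizes. The sketch below uses a hypothetical two-member structure (struct padded, not from the original notes); the amount of padding it reports depends on the compiler and target, so no particular value is assumed.

```c
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical structure chosen so that padding is likely:
   many compilers align an int on a multiple of its size. */
struct padded { char Flag ; int Count ; } ;

/* Number of unused bytes the compiler placed between Flag and Count. */
size_t PaddingAfterFlag(void)
{
    return offsetof(struct padded, Count)
         - (offsetof(struct padded, Flag) + sizeof(char)) ;
}

/* Returns 1 when the members appear in declaration order,
   which the language guarantees. */
int InDeclarationOrder(void)
{
    return offsetof(struct padded, Flag) < offsetof(struct padded, Count) ;
}
```

On many common platforms PaddingAfterFlag() would return 3, but only the ordering, not the spacing, is guaranteed.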
In ANSI C, one structure can be assigned (with =
) to another
and structures can be passed to and returned from functions.
This involves copying the entire structure and is reasonable only with
very small structures.
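The copying behavior can be seen in a short sketch. The function names Grow and CopyDemo are assumptions made for this illustration.

```c
struct state { char *Name ; char Abbrev[2] ; int Population ; } ;

/* s is a copy of the caller's structure, so changing it here
   does not affect the original. The result is returned by value. */
struct state Grow(struct state s, int births)
{
    s.Population += births ;
    return s ;
}

/* Returns 1 when the original is untouched by assignment and by the call. */
int CopyDemo(void)
{
    struct state NC ;
    struct state copy ;
    struct state later ;

    NC.Name = "North Carolina" ;
    NC.Abbrev[0] = 'N' ;
    NC.Abbrev[1] = 'C' ;
    NC.Population = 9222414 ;

    copy = NC ;               /* copies every member */
    copy.Population = 0 ;     /* changes only the copy */

    later = Grow(NC, 100) ;   /* NC is copied in, the result is copied out */

    return NC.Population == 9222414
        && copy.Population == 0
        && later.Population == 9222514 ;
}
```

Note that the copy is shallow: copy.Name and NC.Name point at the same string.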
struct state NC = { "North Carolina", {'N','C'}, 9222414 } ;
There is nothing unusual about arrays of structures.
struct state USA[50] ; USA[12].Name = "North Carolina" ; USA[12].Abbrev[0] = 'N' ; USA[12].Abbrev[1] = 'C' ; USA[12].Population = 9222414 ;
Structures nested a couple of levels can be useful and are quite common.
struct county {
    char *Name ;
    int Population ;
    struct state State ;
} ;

struct county Haywood ;
Haywood.Name = "Haywood" ;
Haywood.State.Name = "North Carolina" ;
Dynamically allocated structures are quite common in C programs, particularly those written for second programming courses.
struct state *pNC ; pNC = (struct state *)malloc(sizeof(struct state)) ;
It is so common that P->F
is a way of saying (*P).F
in C. (Question: Why are the parentheses needed?)
pNC->Name = "North Carolina" ; pNC->Abbrev[0] = 'N' ; pNC->Abbrev[1] = 'C' ; pNC->Population = 9222414 ;
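One way to approach the question about parentheses is to recall that the dot operator binds more tightly than the unary * operator. The sketch below, with a function name (ArrowDemo) assumed for this example, shows the two equivalent spellings side by side.

```c
#include <stdlib.h>   /* malloc, free */

struct state { char *Name ; char Abbrev[2] ; int Population ; } ;

/* Returns 1 when P->F and (*P).F refer to the same member,
   or -1 if the allocation fails. */
int ArrowDemo(void)
{
    struct state *pNC = malloc(sizeof *pNC) ;
    int same ;

    if (pNC == NULL) {
        return -1 ;
    }

    (*pNC).Population = 9222414 ;    /* dereference first, then select */
    same = (pNC->Population == (*pNC).Population) ;

    /* Without the parentheses, *pNC.Population would be read as
       *(pNC.Population), which does not compile because pNC is a
       pointer and has no members. */

    free(pNC) ;
    return same ;
}
```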
It only gets worse. Structures are often defined to contain pointers to other structures, and some of those structures may even be of the same type! Often this is quite natural. After all, a structure representing a person may need a reference to another person, such as a mother!
typedef struct Person *PersonRef ; typedef struct Person { char Name[80] ; PersonRef mother ; } PersonNode ;
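A short sketch of how such a self-referential structure might be used follows. The helper names InitPerson and MotherName are assumptions made for this illustration, as are any names passed to them.

```c
#include <stddef.h>   /* NULL */
#include <string.h>   /* strncpy */

typedef struct Person *PersonRef ;
typedef struct Person { char Name[80] ; PersonRef mother ; } PersonNode ;

/* Copies name into the fixed-size Name array (truncating if necessary)
   and records the reference to the mother, which may be NULL. */
void InitPerson(PersonRef p, const char *name, PersonRef mother)
{
    strncpy(p->Name, name, sizeof p->Name - 1) ;
    p->Name[sizeof p->Name - 1] = '\0' ;    /* strncpy may not terminate */
    p->mother = mother ;
}

/* Follows the mother pointer; returns NULL when no mother is recorded. */
const char *MotherName(const PersonNode *p)
{
    return (p->mother != NULL) ? p->mother->Name : NULL ;
}
```

The pointer typedef is declared before the structure itself, which is what lets the structure mention its own type.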
Often structures are chained together into linked lists using recursively defined structures.
struct state {
    char *Name ;
    char Abbrev[2] ;
    int Population ;
} ;
typedef struct state State ;

typedef struct stateLink *StateRef ;
typedef struct stateLink {
    State value ;
    StateRef next ;
} StateNode ;
Here's a useful function for our example.
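#include <stdlib.h>   /* malloc, free */
#include <string.h>   /* strlen, strcpy */

/* Allocates a list node that holds a private copy of name.
   Returns NULL if either allocation fails. */
StateRef AllocateState(char *name, char *abbr, int pop) {
    char *nameCopy = (char *)malloc(strlen(name)+1) ;
    if (nameCopy == NULL) {
        return (StateRef)NULL ;
    }
    strcpy(nameCopy, name) ;
    StateRef R = (StateRef)malloc(sizeof(StateNode)) ;
    if (R == NULL) {
        free(nameCopy) ;    /* do not leak the name copy */
        return (StateRef)NULL ;
    }
    R->value.Name = nameCopy ;
    R->value.Abbrev[0] = abbr[0] ;
    R->value.Abbrev[1] = abbr[1] ;
    R->value.Population = pop ;
    R->next = (StateRef)NULL ;
    return R ;
}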
Consider the folly of the following.
StateRef BadAllocateState(char *name, char *abbr, int pop) {
    StateNode P ;    /* automatic storage: gone once the function returns */
    char *nameCopy = (char *)malloc(strlen(name)+1) ;
    if (nameCopy == NULL) {
        return (StateRef)NULL ;
    }
    strcpy(nameCopy, name) ;
    StateRef R = &P ;    /* R points at the local variable P */
    R->value.Name = nameCopy ;
    R->value.Abbrev[0] = abbr[0] ;
    R->value.Abbrev[1] = abbr[1] ;
    R->value.Population = pop ;
    R->next = (StateRef)NULL ;
    return R ;    /* returns a dangling pointer */
}
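The trouble with BadAllocateState is that it returns the address of a local variable, which no longer exists once the function returns. Besides allocating the node with malloc, one possible alternative is to have the caller supply the storage. The sketch below assumes a hypothetical function InitState that is not part of the original notes; unlike AllocateState it does not copy the name.

```c
#include <stddef.h>   /* NULL */

struct state { char *Name ; char Abbrev[2] ; int Population ; } ;
typedef struct state State ;
typedef struct stateLink *StateRef ;
typedef struct stateLink { State value ; StateRef next ; } StateNode ;

/* Hypothetical alternative: fills in a node owned by the caller.
   The returned pointer is valid for as long as the caller's node is.
   The name is stored as given, not copied. */
StateRef InitState(StateRef node, char *name, const char *abbr, int pop)
{
    node->value.Name = name ;
    node->value.Abbrev[0] = abbr[0] ;
    node->value.Abbrev[1] = abbr[1] ;
    node->value.Population = pop ;
    node->next = NULL ;
    return node ;
}
```

Here the lifetime of the node is decided by the caller, so the function never hands back a pointer to its own stack frame.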
There are some historic inaccuracies in this example. There were no computers when Delaware ratified the Constitution.
firstNode = AllocateState("Delaware", "DE", 55000) ; lastNode = firstNode ;
newState = AllocateState("North Carolina", "NC", 37000) ; lastNode->next = newState ; lastNode = newState ;
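/* Counts the nodes in the list that starts at s. */
int NumberStates(StateRef s) {
    int number = 0 ;
    while (s != NULL) {
        ++number ;
        s = s->next ;
    }
    return number ;
}

/* Returns the name of the most populous state,
   or NULL if no state has a positive population. */
char * MostPeople(StateRef s) {
    int mostPeople = 0 ;
    char *biggestState = NULL ;
    while (s != NULL) {
        if (s->value.Population > mostPeople) {
            mostPeople = s->value.Population ;
            biggestState = s->value.Name ;
        }
        s = s->next ;
    }
    return biggestState ;
}

/* Adds one to North Carolina's population;
   does nothing if NC is not in the list. */
void PlusPlusNC(StateRef s) {
    while (s != NULL && strncmp(s->value.Abbrev, "NC", 2)) {
        s = s->next ;
    }
    if (s != NULL) {
        s->value.Population++ ;
    }
}

/* Swaps the first Dakota in the list with the node that follows it.
   Assumes the two Dakotas are adjacent and that neither is the first node. */
void DakotaSwitch(StateRef s) {
    StateRef Dakota1 ;
    StateRef Dakota2 ;
    if (s == NULL) {
        return ;
    }
    /* Stop at the node just before the first Dakota. */
    while (s->next != NULL
           && strncmp(s->next->value.Abbrev, "ND", 2)
           && strncmp(s->next->value.Abbrev, "SD", 2)) {
        s = s->next ;
    }
    if (s->next == NULL || s->next->next == NULL) {
        return ;    /* no Dakota found, or nothing after it to swap with */
    }
    Dakota1 = s->next ;
    Dakota2 = Dakota1->next ;
    s->next = Dakota2 ;
    Dakota1->next = Dakota2->next ;
    Dakota2->next = Dakota1 ;
}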
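Since every node in this list comes from malloc, along with its copy of the name, a program will eventually want to release them. The sketch below shows one way this might be done; FreeStates is a hypothetical helper, not part of the original notes, and it assumes every Name pointer was obtained from malloc, as AllocateState does.

```c
#include <stdlib.h>   /* free */

struct state { char *Name ; char Abbrev[2] ; int Population ; } ;
typedef struct state State ;
typedef struct stateLink *StateRef ;
typedef struct stateLink { State value ; StateRef next ; } StateNode ;

/* Hypothetical helper: frees every node in the list along with its
   name copy, and returns the number of nodes released. */
int FreeStates(StateRef s)
{
    int freed = 0 ;

    while (s != NULL) {
        StateRef next = s->next ;    /* save the link before freeing the node */
        free(s->value.Name) ;
        free(s) ;
        s = next ;
        ++freed ;
    }
    return freed ;
}
```

The link is saved before the node is freed because reading s->next afterward would touch memory that is no longer valid.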