char *Name[50] ; char Abbrev[50][2] ; int Population[50] ;
Name[12] = "North Carolina" ; Abbrev[12][0] = 'N' ; Abbrev[12][1] = 'C' ; Population[12] = 9535483 ;
Is placing these values in three different arrays convenient, logical, or efficient?
The C struct
constructor allows the definition of
heterogeneous data structures with fields or members
of different types.
The name of each field within a structure definition must be unique,
but different structure definitions often share the same field names.
struct state { char *Name ; char Abbrev[2] ; int Population ; } ; struct state NC ; NC.Name = "North Carolina" ; NC.Abbrev[0] = 'N' ; NC.Abbrev[1] = 'C' ; NC.Population = 9535483 ;
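As a small sketch of the point about shared field names, the two definitions below both use Name and Population without any conflict, because each structure keeps its own set of member names. The struct county here is a simplified definition and SameName is a hypothetical helper, both introduced only for illustration.

```c
#include <string.h>

/* Two structure definitions may reuse the same field names. */
struct state  { char *Name ; char Abbrev[2] ; int Population ; } ;
struct county { char *Name ; int Population ; } ;   /* simplified for illustration */

/* Hypothetical helper: returns 1 when both Name fields hold the same text. */
int SameName(struct state s, struct county c)
{
    return strcmp(s.Name, c.Name) == 0 ;
}
```

The compiler decides which Name is meant from the type of the expression to the left of the dot.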
Effectively every struct
definition
introduces a new programmer-defined type.
Many programmers use typedef
's for structures.
typedef struct state State ; State NC ;
The FILE
“type” used for file I/O operations in C
is typically defined with a typedef
of a C structure.
The quotes are used because
technically a typedef
is not a new type, but rather
a “synonym” for an existing type.
By the way, the C structure is rather like a method-less class in either C++ or Java.
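The "synonym" behavior can be seen in a short sketch: a value declared as struct state can be passed where a State is expected, and the reverse, with no cast. The function names GetPopulation and PopulationOf are made up for this example.

```c
struct state { char *Name ; char Abbrev[2] ; int Population ; } ;
typedef struct state State ;

/* Declared with the typedef name ... */
int GetPopulation(State s)
{
    return s.Population ;
}

/* ... and declared with the struct tag. Both names denote one type. */
int PopulationOf(struct state s)
{
    State copy = s ;               /* no cast or conversion needed */
    return GetPopulation(copy) ;
}
```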
The dot operator joins a structure expression with a field or member name.
The .
operator has the highest precedence level and
left-to-right associativity. Other operators at this level are:
( )
[ ]
.
->
++ (postfix)
-- (postfix)
A horrendous example of this would be
f("United States")[30].Population++
used as an expression;
however, you do often find long sequences such as
USA[12].Name[0]
.
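Here is a rough sketch of how such chains are grouped. Because all of these postfix operators share one precedence level and associate left to right, the expression is read from the left one operator at a time. The helper names FirstLetter and BumpPopulation are hypothetical.

```c
struct state { char *Name ; char Abbrev[2] ; int Population ; } ;

/* USA[i].Name[0] is grouped as ((USA[i]).Name)[0] :
   index the array, select the field, index the string. */
char FirstLetter(struct state USA[], int i)
{
    return USA[i].Name[0] ;
}

/* USA[i].Population++ is grouped as ((USA[i]).Population)++ :
   the field is incremented and its old value is the result. */
int BumpPopulation(struct state USA[], int i)
{
    return USA[i].Population++ ;
}
```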
Consider each structure as having its own symbol table. Here's a possible one for a 32-bit Intel architecture.
field | struct | offset | size |
---|---|---|---|
Abbrev | state | 4 | 2 |
Name | state | 0 | 4 |
Population | state | 8 | 4 |
The address of NC.Population
would be the
address of NC
plus the offset of Population
.
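The address arithmetic can be written out by hand with the standard offsetof macro from <stddef.h>. This is only a sketch to show the idea; real code would simply write &NC.Population. The actual offsets depend on the compiler and target, so they need not match the 32-bit table above. PopulationAddress is a hypothetical name.

```c
#include <stddef.h>   /* offsetof */

struct state { char *Name ; char Abbrev[2] ; int Population ; } ;

/* Computes the address of the Population field by hand:
   address of the structure plus the offset of the field. */
int *PopulationAddress(struct state *s)
{
    return (int *)((char *)s + offsetof(struct state, Population)) ;
}
```

The cast to char * makes the addition count in bytes, which is the unit offsetof reports.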
In C you are allowed to assume that fields will be allocated in memory in the same order as they appear in the definition; however, because the compiler is free to insert padding between fields, this assumption is rarely useful.
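One way to see padding, sketched under the assumption that you just want a byte count, is to compare sizeof for the whole structure with the sum of the sizes of its fields. The function names below are hypothetical, and the amount of padding reported will vary from one compiler and machine to another.

```c
#include <stddef.h>   /* size_t */

struct state { char *Name ; char Abbrev[2] ; int Population ; } ;

/* Bytes occupied by the fields themselves, ignoring any padding. */
size_t FieldBytes(void)
{
    return sizeof(char *) + 2 * sizeof(char) + sizeof(int) ;
}

/* Padding added anywhere in the structure (possibly zero). */
size_t PaddingBytes(void)
{
    return sizeof(struct state) - FieldBytes() ;
}
```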
In ANSI C, one structure can be assigned (with =
) to another
and structures can be passed to and returned from functions.
This involves copying the entire structure and is reasonable only with
very small structures.
In C you can't return an array from a function, but you can return a structure containing an array.
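A minimal sketch of returning an array by wrapping it in a structure follows. The type struct abbrev and the function MakeAbbrev are invented for this example; the array holds three characters so that the result is also a valid C string.

```c
/* A structure whose only job is to carry an array. */
struct abbrev { char Letters[3] ; } ;   /* two letters plus '\0' */

struct abbrev MakeAbbrev(char first, char second)
{
    struct abbrev a ;
    a.Letters[0] = first ;
    a.Letters[1] = second ;
    a.Letters[2] = '\0' ;
    return a ;      /* the whole structure, array included, is copied */
}
```

Assigning the result to another struct abbrev copies the array again, so changing one copy leaves the other untouched.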
struct state NC = { "North Carolina", {'N','C'}, 9535483 } ;
There is nothing unusual about arrays of structures.
struct state USA[50] ; USA[12].Name = "North Carolina" ; USA[12].Abbrev[0] = 'N' ; USA[12].Abbrev[1] = 'C' ; USA[12].Population = 9535483 ;
Structures nested a couple of levels can be useful and are quite common.
struct county { char *Name ; int Population ; struct state State ; } ; struct county Haywood ; Haywood.Name = "Haywood" ; Haywood.State.Name = "North Carolina" ;
Dynamically allocated structures are quite common in C programs, particularly those written for second programming courses.
struct state *pNC ; pNC = (struct state *)malloc(sizeof(struct state)) ;
Of course, you do need to use a pointer to refer to fields within
dynamically allocated structures. For example,
you'd need to use (*pNC).Population
to access the Population
field of our dynamic state.
Because dynamic structures are so popular, C has a special
syntax for referring to their fields:
P->F
is a
way of saying (*P).F
.
(Think of ->
as a sign pointing in a direction.)
This allows you to refer to our state's population as
pNC->Population
pNC->Name = "North Carolina" ; pNC->Abbrev[0] = 'N' ; pNC->Abbrev[1] = 'C' ; pNC->Population = 9535483 ;
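Putting the pieces together, here is one possible sketch of allocating a state dynamically and filling it in through the arrow operator. It assumes you want a NULL return when malloc fails, which the fragments above do not check. NewState is a hypothetical name.

```c
#include <stdlib.h>   /* malloc */

struct state { char *Name ; char Abbrev[2] ; int Population ; } ;

/* Allocates and fills in one state; returns NULL if malloc fails.
   The name string is not copied, so it must outlive the structure. */
struct state *NewState(char *name, char a0, char a1, int pop)
{
    struct state *p = (struct state *)malloc(sizeof(struct state)) ;
    if (p == NULL) {
        return NULL ;
    }
    p->Name = name ;          /* same as (*p).Name = name */
    p->Abbrev[0] = a0 ;
    p->Abbrev[1] = a1 ;
    p->Population = pop ;
    return p ;
}
```

The caller owns the result and should pass it to free when it is no longer needed.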
It only gets worse. Structures are often defined to contain pointers to other structures, and some of those structures may even be of the same type! Often this is quite natural. After all, a structure representing a person may need a reference to another person, such as a mother!
typedef struct Person *PersonRef ; typedef struct Person { char Name[80] ; PersonRef mother ; } PersonNode ;
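To show how such a self-referential structure is used, the sketch below follows the mother pointers upward until it reaches a person whose mother is unknown. It assumes that an unknown mother is stored as NULL; the function name is hypothetical.

```c
#include <stddef.h>   /* NULL */

typedef struct Person *PersonRef ;
typedef struct Person { char Name[80] ; PersonRef mother ; } PersonNode ;

/* Counts how many mothers are recorded above p.
   A NULL mother marks the end of the known maternal line. */
int KnownMaternalGenerations(PersonRef p)
{
    int count = 0 ;
    while (p != NULL && p->mother != NULL) {
        ++count ;
        p = p->mother ;
    }
    return count ;
}
```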
Structures are often chained into linked lists using recursively defined structures.
struct state {
    char *Name ;
    char Abbrev[2] ;
    int Population ;
} ;
typedef struct state State ;
typedef struct stateLink *StateRef ;
typedef struct stateLink {
    State value ;
    StateRef next ;
} StateNode ;
Here’s a useful function for our example. Draw it!
/* Requires <stdlib.h> (malloc, free) and <string.h> (strlen, strcpy). */
StateRef AllocateState(char *name, char *abbr, int pop)
{
    StateRef R ;
    char *nameCopy = (char *)malloc(strlen(name)+1) ;
    if (nameCopy == NULL) {
        return (StateRef)NULL ;
    }
    strcpy(nameCopy, name) ;
    R = (StateRef)malloc(sizeof(StateNode)) ;
    if (R == NULL) {
        free(nameCopy) ;    /* node allocation failed: release the name copy */
        return (StateRef)NULL ;
    }
    R->value.Name = nameCopy ;
    R->value.Abbrev[0] = abbr[0] ;
    R->value.Abbrev[1] = abbr[1] ;
    R->value.Population = pop ;
    R->next = (StateRef)NULL ;
    return R ;
}
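Consider the folly of the following, which returns a pointer to a structure allocated on the stack. The structure ceases to exist when the function returns, so the caller is left with a dangling pointer.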
StateRef BadAllocateState(char *name, char *abbr, int pop)
{
    StateNode P ;       /* lives on the stack only until this function returns */
    char *nameCopy = (char *)malloc(strlen(name)+1) ;
    if (nameCopy == NULL) {
        return (StateRef)NULL ;
    }
    strcpy(nameCopy, name) ;
    StateRef R = &P ;
    R->value.Name = nameCopy ;
    R->value.Abbrev[0] = abbr[0] ;
    R->value.Abbrev[1] = abbr[1] ;
    R->value.Population = pop ;
    R->next = (StateRef)NULL ;
    return R ;          /* BAD: R points at P, which no longer exists */
}
firstState = AllocateState("Delaware", "DE", 55000) ; lastState = firstState ;
There are some historical inaccuracies in this example. There were no computers when Delaware ratified the Constitution.
newState = AllocateState("North Carolina", "NC", 37000) ; lastState->next = newState ; lastState = newState ;
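The pattern of linking a new node after lastState and then advancing lastState can be wrapped in a helper. The following is a sketch only: AppendState is a hypothetical function, and it assumes the caller keeps first and last consistent (both NULL for an empty list).

```c
#include <stddef.h>   /* NULL */

struct state { char *Name ; char Abbrev[2] ; int Population ; } ;
typedef struct state State ;
typedef struct stateLink *StateRef ;
typedef struct stateLink { State value ; StateRef next ; } StateNode ;

/* Links node onto the end of the list and returns the new last node.
   *first is set when the list was empty. A NULL node is ignored. */
StateRef AppendState(StateRef *first, StateRef last, StateRef node)
{
    if (node == NULL) {
        return last ;
    }
    node->next = NULL ;
    if (*first == NULL) {
        *first = node ;         /* empty list: node becomes the first state */
    } else {
        last->next = node ;     /* same step as lastState->next = newState */
    }
    return node ;
}
```

With this helper, each addition becomes lastState = AppendState(&firstState, lastState, newState), including the very first one.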
int NumberStates(StateRef s)
{
    int number = 0 ;
    while (s != NULL) {
        ++number ;
        s = s->next ;
    }
    return number ;
}
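/* Returns the name of the most populous state, or NULL for an empty list. */
char *MostPeople(StateRef s)
{
    StateRef biggest = NULL ;   /* remember the node, not just its population */
    while (s != NULL) {
        if (biggest == NULL || s->value.Population > biggest->value.Population) {
            biggest = s ;
        }
        s = s->next ;
    }
    return (biggest == NULL) ? NULL : biggest->value.Name ;
}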
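/* Adds one to North Carolina's population; does nothing if NC is not in the list. */
void PlusPlusNC(StateRef s)
{
    while (s != NULL && strncmp(s->value.Abbrev, "NC", 2)) {
        s = s->next ;
    }
    if (s != NULL) {
        s->value.Population++ ;
    }
}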
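/* Swaps the first Dakota found with the node after it, which is assumed
   to be the other Dakota. The first node of the list is never examined. */
void DakotaSwitch(StateRef s)
{
    StateRef Dakota1 ;
    StateRef Dakota2 ;
    /* Stop at the node just before the first Dakota. */
    while (s != NULL && s->next != NULL
           && strncmp(s->next->value.Abbrev, "ND", 2)
           && strncmp(s->next->value.Abbrev, "SD", 2)) {
        s = s->next ;
    }
    if (s == NULL || s->next == NULL || s->next->next == NULL) {
        return ;    /* no Dakota found, or nothing follows it */
    }
    Dakota1 = s->next ;
    Dakota2 = Dakota1->next ;
    s->next = Dakota2 ;
    Dakota1->next = Dakota2->next ;
    Dakota2->next = Dakota1 ;
}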
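Since AllocateState obtains two blocks from malloc for every node (the node itself and the copy of the name), a list built with it should eventually be released in the same way. The function below is a sketch of one way to do that; FreeStates is a hypothetical name, and returning the number of nodes freed is just a convenience for checking.

```c
#include <stdlib.h>   /* free */

struct state { char *Name ; char Abbrev[2] ; int Population ; } ;
typedef struct state State ;
typedef struct stateLink *StateRef ;
typedef struct stateLink { State value ; StateRef next ; } StateNode ;

/* Frees every node and the name copy stored in it.
   Returns the number of nodes freed. */
int FreeStates(StateRef s)
{
    int freed = 0 ;
    while (s != NULL) {
        StateRef next = s->next ;   /* save the link before freeing the node */
        free(s->value.Name) ;
        free(s) ;
        s = next ;
        ++freed ;
    }
    return freed ;
}
```

Only call this on lists whose nodes and names all came from malloc, as they do with AllocateState.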