ECE 209 -- Files

Definitions

file A sequence of bytes with a name maintained by the operating system. Example filenames include /home/clauss/NiceList.txt and C:\Users\Claus\NiceList.txt.
FILE pointer The C value of type FILE *. This mysterious value is created when files are opened and is used in almost all C I/O operations. Evidently, a lot of stuff is stored in a FILE, but no one wants to know what it is.
stream Not everything with a filename is a real file. Examples include named pipes in Unix and Windows and special devices like /dev/null and /dev/random in Unix and NUL and COM1 in Windows. C and, especially, C++ gurus always call opened files or file-like entities streams. In fact, the terms stream and FILE pointer are often used as synonyms.
mode The way in which a stream is being used: reading, writing, or both. Can also apply to opening files as binary or text. (In Unix, mode also confusingly refers to the protection associated with a file.)
file offset This is the point where an application is presently reading or writing in a stream. It is expressed as the number of bytes from the beginning of the file. It is sometimes called the file position indicator or even (especially by non-C programmers) the file pointer, which is easily confused with the FILE pointer.
sequential access Reading or writing a stream in serial order, that is, from the first byte to the last. This is how files are usually processed.
random access Reading or writing a stream by moving directly to specific offsets within the file. These locations are not random. They are chosen by the C program.
binary mode The values read and written by the C application are exactly those that are stored within the file as maintained by the operating system. When C is run on a Unix computer, all file operations are performed in binary mode. For this reason, Unix does not distinguish between binary and text mode.
text mode In Unix and Mac OS X operating systems, lines of text are separated by a single character, the linefeed (aka, '\n' or new line or ASCII 10). In the Windows operating system, lines of text are separated by two characters, the carriage return (aka, '\r' or ASCII 13) and then the linefeed. To maintain program portability, when a C program writes "\n" to a text file in Windows, the C I/O library actually writes "\r\n". A similar trick is done when reading text files. (C++ "solves" this problem by using a special constant endl to write the end-of-line sequence.)
standard files When Unix programs are started they are given three opened file descriptors (the operating system view of a file): standard input, for reading characters; standard output, for writing characters; and standard error, for writing urgent output. Windows and Mac OS provide similar file descriptors. In C, these are represented by three FILE pointers: stdin, stdout, and stderr.
buffer When a C program reads or writes data, the C I/O system does not directly request data from the operating system. Instead it maintains a buffer of data that is sent to the operating system when appropriate. This approach significantly improves program performance. Unfortunately, there are times when it is necessary to flush the buffer to make sure changes are written to the file system or the terminal.
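
The difference between binary mode and text mode can be observed directly. The following is a minimal sketch rather than part of the course code: the helper name countBytesWritten is made up for illustration. It writes two short lines using whatever mode it is given and then counts the bytes actually stored in the file.

```c
#include <stdio.h>

/* Hypothetical helper: write "a\nb\n" to path using the given fopen
   mode ("w" for text, "wb" for binary), then reopen the file in binary
   mode and count the bytes stored.  Returns -1 on any failure. */
long countBytesWritten(const char *path, const char *mode) {
  FILE *fp ;
  long count = 0 ;
  int ch ;

  if ((fp = fopen(path, mode)) == NULL) {
    return -1 ;
  }
  if (fputs("a\nb\n", fp) == EOF) {
    fclose(fp) ;
    return -1 ;
  }
  if (fclose(fp) != 0) {
    return -1 ;
  }

  /* Binary mode shows the bytes exactly as the file holds them */
  if ((fp = fopen(path, "rb")) == NULL) {
    return -1 ;
  }
  while ((ch = fgetc(fp)) != EOF) {
    ++count ;
  }
  fclose(fp) ;
  return count ;
}
```

With mode "wb" the count should be 4 on any system. With mode "w" it should be 4 on Unix and 6 on Windows, where each '\n' is stored as "\r\n".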

Opening and closing files

fopen Opens a file and associates a stream with it. Returns NULL if the open fails.
fclose Closes a stream. Very rarely fails.
stdin Standard input stream
stdout Standard output stream
stderr Standard error stream
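
A typical open, use, close sequence might look like the sketch below. The function name writeGreeting is hypothetical. It checks the result of fopen, which can fail for ordinary reasons such as a missing directory or lack of permission, and also the result of fclose, which returns EOF on failure because buffered output may not reach the file until the stream is closed.

```c
#include <stdio.h>

/* Hypothetical helper: create (or overwrite) path with one line of text.
   Returns 0 on success and -1 on failure. */
int writeGreeting(const char *path) {
  FILE *fp ;

  if ((fp = fopen(path, "w")) == NULL) {
    fprintf(stderr, "Unable to open %s for output\n", path) ;
    return -1 ;
  }

  if (fputs("Hello, file!\n", fp) == EOF) {
    fclose(fp) ;
    return -1 ;
  }

  /* fclose flushes the buffer, so a write error can show up here */
  if (fclose(fp) != 0) {
    return -1 ;
  }
  return 0 ;
}
```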

Reading streams

fscanf Complicated function for formatted input.
fgetc Reads a single character.
fgets Reads a line.
fread Reads raw bytes from the stream.
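
As a rough illustration of fgetc and fgets, the sketch below counts the lines in a file one character at a time and fetches the first line of a file. The function names countLines and readFirstLine are invented for this example. Note that fgetc returns an int so that EOF can be distinguished from every valid character, and that fgets keeps the newline at the end of the line it reads.

```c
#include <stdio.h>
#include <string.h>

/* Count the newline characters in a file.
   Returns -1 if the file cannot be opened. */
long countLines(const char *path) {
  FILE *fp ;
  long lines = 0 ;
  int ch ;   /* int, not char, so EOF can be told apart from data */

  if ((fp = fopen(path, "r")) == NULL) {
    return -1 ;
  }
  while ((ch = fgetc(fp)) != EOF) {
    if (ch == '\n') {
      ++lines ;
    }
  }
  fclose(fp) ;
  return lines ;
}

/* Read the first line of a file into buf (at most size-1 characters)
   and strip the trailing newline that fgets keeps.
   Returns 0 on success, -1 on failure or if the file is empty. */
int readFirstLine(const char *path, char *buf, int size) {
  FILE *fp ;
  char *result ;

  if ((fp = fopen(path, "r")) == NULL) {
    return -1 ;
  }
  result = fgets(buf, size, fp) ;
  fclose(fp) ;
  if (result == NULL) {
    return -1 ;
  }
  buf[strcspn(buf, "\n")] = '\0' ;
  return 0 ;
}
```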

Writing streams

fprintf Complicated function for formatted output.
fputc Writes a single character.
fputs Writes a string. Unlike puts, it does not add a newline.
fwrite Writes raw bytes to the stream.
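
To show what raw I/O looks like, here is a small sketch that stores an array of int values with fwrite and reads it back with fread. The names saveInts and loadInts are hypothetical. Both functions return item counts, not byte counts, so the results are compared against the number of items requested.

```c
#include <stdio.h>

/* Write n ints to path as raw bytes.
   Returns 0 on success, -1 on failure. */
int saveInts(const char *path, const int *values, size_t n) {
  FILE *fp ;
  size_t written ;

  if ((fp = fopen(path, "wb")) == NULL) {
    return -1 ;
  }
  written = fwrite(values, sizeof(int), n, fp) ;
  if (fclose(fp) != 0 || written != n) {
    return -1 ;
  }
  return 0 ;
}

/* Read up to max ints from path.  Returns the number actually read
   (0 if the file cannot be opened). */
size_t loadInts(const char *path, int *values, size_t max) {
  FILE *fp ;
  size_t got ;

  if ((fp = fopen(path, "rb")) == NULL) {
    return 0 ;
  }
  got = fread(values, sizeof(int), max, fp) ;
  fclose(fp) ;
  return got ;
}
```

A file written this way holds the in-memory representation of each int, so it is not portable between machines with a different int size or byte order.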

Random access I/O

fseek Moves the offset of a stream.
ftell Returns the current offset within a stream.
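
Two common uses of fseek and ftell are sketched below: finding the size of a file and fetching the byte at a given offset. The helper names fileSize and byteAt are made up for this example. Seeking to SEEK_END works on common systems, although the C standard does not promise that binary streams support it meaningfully.

```c
#include <stdio.h>

/* Size of a file in bytes, or -1 on failure. */
long fileSize(const char *path) {
  FILE *fp ;
  long size ;

  if ((fp = fopen(path, "rb")) == NULL) {
    return -1 ;
  }
  if (fseek(fp, 0L, SEEK_END) != 0) {
    fclose(fp) ;
    return -1 ;
  }
  size = ftell(fp) ;   /* offset of the end is the size in bytes */
  fclose(fp) ;
  return size ;
}

/* Byte stored at the given offset from the start of the file,
   or EOF if the file cannot be opened or the offset is not valid. */
int byteAt(const char *path, long offset) {
  FILE *fp ;
  int ch ;

  if ((fp = fopen(path, "rb")) == NULL) {
    return EOF ;
  }
  if (fseek(fp, offset, SEEK_SET) != 0) {
    fclose(fp) ;
    return EOF ;
  }
  ch = fgetc(fp) ;
  fclose(fp) ;
  return ch ;
}
```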

Buffering

fflush Forces the buffered output of a stream to really be written.
ungetc Unreads a character. Allows peeking ahead. Best to avoid.
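
The two functions might be used as in the sketch below; the names peekChar and prompt are hypothetical. The first looks at the next character without consuming it, relying on the one character of pushback that ungetc guarantees. The second prints a prompt that does not end in a newline and flushes it so that it appears before the program waits for input.

```c
#include <stdio.h>

/* Look at the next character of a stream without consuming it.
   Returns EOF if there is nothing left to read. */
int peekChar(FILE *fp) {
  int ch = fgetc(fp) ;

  if (ch != EOF) {
    ungetc(ch, fp) ;   /* push it back for the next read */
  }
  return ch ;
}

/* Print a prompt with no newline and make sure it appears at once. */
void prompt(const char *message) {
  fputs(message, stdout) ;
  fflush(stdout) ;
}
```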

Example program for writing a file

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
  int totalLines, lineNum ;
  FILE *fileStream ;

  if (argc != 3
      || (totalLines = atoi(argv[2])) <= 0) {
    fprintf(stderr, "Usage: %s OutputFileName NumberOfLines\n", argv[0]) ;
    return(EXIT_FAILURE) ;
  }

  if ((fileStream = fopen(argv[1], "w")) == NULL) {
    fprintf(stderr, "Unable to open %s for output\n", argv[1]) ;
    return(EXIT_FAILURE) ;
  }

  for (lineNum=1; lineNum<=totalLines; ++lineNum) {
    fprintf(fileStream, "Line %4d\n", lineNum) ;
  }

  fclose(fileStream) ;

  return(EXIT_SUCCESS) ;
}

Updating a file

This program is passed three arguments: (1) a file where characters should be replaced, (2) an offset within the file where the replacement should start, and (3) a string of characters to replace the present characters. The program is obsessive in checking function return codes.

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>

const char usageMsg[] = "Usage:  UpdateChars file offset string\n" ;

int main(int argc, char *argv[]) {
  char *fileName ;     /* Arg 1:  The file name */
  long int offset ;    /* Arg 2:  Offset within file */
  char *newString ;    /* Arg 3:  String to write to file */

  FILE *fileHand ;     /* File stream */
  char *oldString ;    /* Characters read from file */
  int   strSize ;      /* Size of string to be read or written */
  long int filSize ;   /* Size of file */

  /* Useful variables.... */
  char *strChecker ;   /* used for checking strtol */
  int i ;

  if ( argc != 4) {
    fputs(usageMsg, stderr) ;
    fputs("Wrong number of arguments\n", stderr) ;
    return (EXIT_FAILURE) ;
  }
    
  if (! isdigit((unsigned char)argv[2][0])
      || (offset = strtol(argv[2], &strChecker, 10)) < 0
      || strChecker == NULL
      || *strChecker != '\0') {
    fputs(usageMsg, stderr) ;
    fprintf(stderr, "%s is not a valid integer\n", argv[2]) ;
    return (EXIT_FAILURE) ;   /* do not continue with a bad offset */
  }
    
  newString = argv[3] ;
  strSize = strlen(newString) ;

  if ((oldString = (char *)malloc(strSize))==NULL) {
    /* This really couldn't happen */
    fputs("Unable to allocate temporary storage\n", stderr) ;
    return(EXIT_FAILURE) ;
  }

  fileName = argv[1] ;
  if ((fileHand=fopen(fileName,"r+")) == NULL) {
    fputs(usageMsg, stderr) ;
    fprintf(stderr, "Unable to open %s\n", fileName) ;
    return (EXIT_FAILURE) ;
  }

  if (fseek(fileHand, 0l, SEEK_END)
      || (filSize = ftell(fileHand))<0) {
    fprintf(stderr, "Unable to determine size of %s\n", fileName) ;
    fclose(fileHand) ;
    return (EXIT_FAILURE) ;
  }

  if (offset + strSize > filSize) {
    fprintf(stderr, "Attempting to write past end of %s (%ld bytes)\n",
            fileName, filSize) ;
    fclose(fileHand) ;
    return (EXIT_FAILURE) ;
  }

  if (fseek(fileHand, offset, SEEK_SET)
      || fread((void *)oldString, 1, strSize, fileHand) != strSize) {
    fprintf(stderr, "Unable to read present characters from %s\n", fileName) ;
    fclose(fileHand) ;
    return (EXIT_FAILURE) ;
  }

  for (i=0; i<strSize; ++i) {
    putchar(oldString[i]) ;
  }
  putchar('\n') ;

  if (fseek(fileHand, offset, SEEK_SET)
      || fwrite((void *)newString, 1, strSize, fileHand) != strSize) {
    fprintf(stderr, "Unable to write new characters to %s\n", fileName) ;
    fclose(fileHand) ;
    return (EXIT_FAILURE) ;
  }

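  free(oldString) ;

  /* Buffered output may only be written here, so check fclose too */
  if (fclose(fileHand) != 0) {
    fprintf(stderr, "Unable to close %s\n", fileName) ;
    return (EXIT_FAILURE) ;
  }
  return (EXIT_SUCCESS) ;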
}