UNIX/Linux & C Programming:
Chapter 3: System Libraries & I/O



Coverage: [UPE] Chapters 3 and 6


Standard I/O vs. file I/O

  • file streams and standard streams (really one and the same)
  • to what do you connect the stream? file or device?
  • what streams are automatically open for you?
    • stdin
      • C analog of cin in C++
      • connected to keyboard by default
    • stdout
      • C analog of cout in C++
      • connected to display by default
    • stderr (connected to display by default)
  • can redirect stdin, stdout, and stderr to files


Standard I/O redirection

(set up for free by the shell)
  • < (redirects stdin)
  • << (redirects stdin to a here document)
  • > (redirects stdout, overwrites)
  • >> (redirects stdout, but appends)


Demo of cat

  • cat [<file(s)>] (concatenate): displays contents of one or more files to standard output
  • capable of reading from file input or standard input
  • only writes to standard output


Redirecting standard I/O

devices in UNIX are represented as files,
for instance, /dev/console.

UNIX commands receive their input from the standard input (stdin) and send their output to the standard output (stdout); by default these files are the console or terminal.
  • shells allow I/O to be redirected to other devices; thus UNIX commands are unaware of the device from which their input may originate or to what device their output may be sent.

  • the output redirection symbol, >, sends a command's output to the specified file instead of the console/terminal screen.

      command name [args] > filename

      ls -l > ls.out

      the file ls.out is created if it does not exist (or emptied and overwritten if it exists prior to command execution) and sent the directory listing.

  • programs which send their output to the console/terminal may also be redirected.

      ./a.out > out.txt

      this allows execution results of a program to be captured in a file.

  • the output redirection symbol >> appends the output to a file, rather than overwriting an existing file.

      ./a.out >> out.txt

      the results of several program runs can thus be saved in a single file out.txt.

  • the input redirection symbol, <, sends the command input from the specified file instead of the console/terminal keyboard.

      command name [args] < filename



      ./a.out < in.txt > out.txt

      this allows program execution with several data sets very easily.

  • the input redirection symbol << is known as the "here is" symbol (it introduces a here document), and provides a mechanism for reading a command's input from the same file, typically a shell script, that contains the command (why might one want to do this?).

      $ cat << HERE
      hello folks
      stop reading when you
      see
      HERE


      ./a.out > out.txt << HERE
      first line of data
      second line of data
      . . .
      last line of data
      HERE


      the string following << (in this example, HERE), when it appears alone on a line, terminates the data input. The terminating HERE logically acts like end-of-file (<ctrl-d> at the keyboard).

  • pipes | are the logical extension of I/O redirection.
    • pipes allow the stdout of one program to become the stdin of another program.
    • specifically, a pipe redirects the standard output of the command to the immediate left from the screen to the standard input of the command to the immediate right.

      ls -l | more

      this command allows the viewing of the long listing of a large directory one screen at a time.

      ls -l > ls.out
      more < ls.out
      rm ls.out

      this sequence of commands is equivalent to the pipeline above, but requires the temporary file ls.out.
    • pipes support interprocess communication and introduce concurrency
    • recall UNIX model of computation
      • pipes are the powerful communication mechanism
      • pipes are the glue
    • how can you verify that the processes in a pipeline are running concurrently?

  • tee: reads from standard input and writes to standard output and files (e.g., cat .profile | tee profile.bak)


More on redirecting standard error

  • shell time vs. UNIX time
    • former writes to terminal
    • latter writes to standard error; use fully qualified path: /usr/bin/time
  • examples:
    $ time cat /etc/termcap >/dev/null 2>timelog.txt
    $ /usr/bin/time cat /etc/termcap >/dev/null 2>timelog.txt
    
  • redirecting stderr to the same place as stdout
    $ /usr/bin/time 2>&1 | wc -l
    $ wc ~/.profile ~/.kshrc > output 2>output # which will happen first?
    $ wc ~/.profile ~/.kshrc > output 2>&1 # order not preserved
    $ cat ~/.profile doesnotexist &>output-and-error
    
  • investigate tee (e.g., $ cat ~/.profile doesnotexist | tee output-and-error)
  • writing directly to the terminal /dev/tty
  • why might you want to write directly to the terminal? consider a pipeline of commands requiring a prompt for user input where stderr is redirected through each pipe


File descriptors

  • 0 for stdin, 1 for stdout, 2 for stderr
  • the 0 is implicit in < when redirecting stdin
  • the 1 is implicit in > and >> when redirecting stdout
  • must use the file descriptor to redirect stderr
    • for instance, wc -q 2> errors
    • often to /dev/null (e.g., wc -q 2> /dev/null)


I/O in C

  • scanf and printf
    • what do they return?
    • we must develop the habit of checking return values
  • opening and closing files: fopen and fclose
  • file pointer: FILE*
  • fscanf is the C analog of the C++ extraction operator (>>)
  • fprintf is the C analog of the C++ insertion operator (<<)
  • conversion specifiers
    • %d for decimal
    • %f for floating-point
    • %c for single character
    • %s for string
    • %x for lower-case hexadecimal (e.g., fa0b)
    • %X for upper-case hexadecimal (e.g., FA0B)
  • formatted output ([CPL] §7.2, pp. 153-155): between the % and the conversion character there may be, in order
    • a minus sign, - indicating left justification
    • a minimum field-width
    • a period which separates the field-width from the precision
    • a precision
  • EOF
    • a #defined constant in stdio.h
    • signaled from the keyboard with <ctrl-d> on UNIX systems (EOF is a return value, not a character stored in the file)
  • getchar
  • declared in stdio.h: the C analog of iostream in C++


Effect of a Successful Open on a File

    (ref. [C] 7-17)

                           "r"       "w"                       "a"
                           read      write                     append
    File Exists            -         Old contents discarded    -
    File Does Not Exist    Error     File created              File created


  • "r+", "w+", "a+"         Updating; allows reading and writing

  • "r+"                                Commonly used to read and change an existing file


Analogs from C++ to C

    C++         C
    iostream    stdio.h
    cin         stdin
    cout        stdout
    >>          fscanf
    <<          fprintf

    Also, unlike C++, C89 requires that all variables in a block be declared at the start of that block, before any statements; declarations cannot be interleaved with code. (C99 and later relax this restriction.)


Review of standard I/O functions

    (ref. [C] 7-7)

                 stdin and stdout    file I/O
    character    getchar, putchar    getc, putc, fgetc, fputc, ungetc
    line         gets, puts          fgets, fputs
    formatted    scanf, printf       fscanf, fprintf
    record       -                   fread, fwrite

Never use gets. It has no way of knowing the size of the passed buffer and will continue to store characters past its end, so it is dangerous to use (it was removed from the language in C11). See man gets. Use fgets instead.


Developing cat in C

    (ref. [CPL] Chapter 7, §§7.5-7.6, pp. 160-164)
    /* ref. [CPL] Chapter 7, 7.5, p. 162 with minor modification by Perugini */
    #include<stdio.h>
    
    /* cat: version 1 */
    void filecopy (FILE* ifp, FILE* ofp) {
    
       int c;   /* int, not char: must hold EOF as well as every character value */
    
       while ((c = getc (ifp)) != EOF)
          putc (c, ofp);
    }
    
    int main (int argc, char** argv) {
    
       FILE* fp = NULL;
    
       if (argc == 1)
          filecopy (stdin, stdout);
       else
          while (--argc > 0)
             if ((fp = fopen (*(++argv), "r")) == NULL) {
                printf ("cat: can't open %s\n", *argv);
             } else {
                filecopy (fp, stdout);
                fclose (fp);
             }
    
       return 0;
    }
    
    /* ref. [CPL] Chapter 7, 7.6, p. 163 with minor modifications by Perugini */
    #include<stdio.h>
    #include<stdlib.h>
    #include<errno.h>
    #include<string.h>
    
    /* cat: version 2 */
    int main (int argc, char** argv) {
    
       void filecopy (FILE* ifs, FILE* ofs);
    
       int exit_status = 0;
    
       char* pgm = *argv;
    
       char s[256];   /* prefix for perror: "<pgm>: <file>" */
    
       FILE* fp = NULL;
    
       if (argc == 1)
          filecopy (stdin, stdout);
       else
          while (--argc > 0)
             if ((fp = fopen (*(++argv), "r")) == NULL) {
                //fprintf (stderr, "%s: %s: No such file or directory\n", pgm, *argv);
    
                //fprintf (stderr, "%s: %s: %s.\n", pgm, *argv, strerror(errno));
    
                //perror("prepended message");
                snprintf (s, sizeof s, "%s: %s", pgm, *argv);   /* bounded: cannot overrun s */
                perror(s);
                //exit (1);
                /* or use following line to continue processing */
                exit_status = 1;
             } else {
                filecopy (fp, stdout);
                fclose (fp);
             }
    
       if (ferror (stdout)) {
          fprintf (stderr, "%s: error writing stdout\n", pgm);
          //perror("error writing stdout.");
          exit_status = 2;
       }
    
       exit (exit_status);
    }
    
    void filecopy (FILE* ifp, FILE* ofp) {
       int c;
    
       while ((c = getc (ifp)) != EOF)
          putc (c, ofp);
    }
    


Portability (safety)

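    char c;   /* not portable: c must be an int */
    while ((c = getchar()) != EOF) { ... }
    
  • getchar returns an int so that EOF (a negative value) is distinguishable from every valid character; if char is unsigned the loop never ends, and if char is signed a byte such as 0xFF can be mistaken for EOF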
  • use /* C-style comments */ vs. // C++-style comments
  • also, do not use TABs in your code


String copy code from first day of class

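    #include <stdio.h>
    #include <stdlib.h>   /* malloc, free */
    
    int main (void) {
    
       const char* q = "copy this";
       char* p = malloc (sizeof (char)*10);   /* 9 characters + '\0' */
       char* r = p;                           /* remember the start of the copy */
    
       if (p == NULL)
          return 1;
    
       printf ("%s\n", q);
       while ((*p++ = *q++));   /* copies up to and including the '\0' */
       /* *p = '\0';  necessary? no: the loop already copied the '\0',
          and p now points one past the end of the buffer */
       printf ("%s\n", r);
       free (r);
       return 0;
    }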
    


String functions

  • prototyped, not defined, in <string.h>
  • size_t strlen (const char*),
  • int strcmp (const char*, const char*), int strncmp (const char*, const char*, size_t)
  • char* strcpy (char*, const char*), char* strncpy (char*, const char*, size_t),
  • char* strcat (char*, const char*), char* strncat (char*, const char*, size_t)
  • char* strdup (const char*) // strdup = malloc + strcpy
  • char* strstr (const char* haystack, const char* needle)
  • char* strchr (const char* s, int c)
  • when copying or concatenating strings, make sure destination string has sufficient space (memory)


`s' family of printf/scanf functions

  • sprintf
  • snprintf
  • sscanf


Command-line arguments

  • argc (argument count; command name is included)
  • argv (argument vector, terminated by null pointer; argv[0] is command name)
  • main (int argc, char* argv[]) or main (int argc, char** argv)
  • echoargs.c
    #include<stdio.h>
    #include<stdlib.h>
    
    int main (int argc, char* argv[]) {
       int i;
    
       printf ("argc is %d\n", argc);
    
       for (i = 0; i < argc; i++)
          printf ("argv[%1d] is %s\n", i, argv[i]);
    
       exit (0);
    }
    
  • echopargs.c
    #include<stdio.h>
    #include<stdlib.h>
    
    int main (int argc, char** argv) {
    
       printf ("argc is %d\n", argc);
    
       for (; *argv; argv++)
          printf ("Next argument is %s\n", *argv);
    
       exit (0);
    }
    


argv array for the call ./a.out -wlc myfile

    argv[0] --> "./a.out"
    argv[1] --> "-wlc"
    argv[2] --> "myfile"
    argv[3] --> NULL

(regenerated with minor modifications from [USP] Fig. 2.2, p. 32)


Using a pointer to traverse an array


Demo of wc

  • word, line, and byte count program
  • capable of standard or file input
  • always writes to standard output
  • example: $ wc ~/.login ~/.tcshrc


FILEs


References

    [C] C Language for Experienced Programmers, Version 2.0.0, AT&T, 1988.
    [CPL] B.W. Kernighan and D.M. Ritchie. The C Programming Language. Prentice Hall, Upper Saddle River, NJ, Second edition, 1988.
    [UPE] B.W. Kernighan and R. Pike. The UNIX Programming Environment. Prentice Hall, Upper Saddle River, NJ, 1984.
    [USP] K.A. Robbins and S. Robbins. UNIX Systems Programming: Concurrency, Communication, and Threads. Prentice Hall, Upper Saddle River, NJ, Second edition, 2003.
