Computer Science 010

Lecture Notes 1

Getting Started with C and Unix

Unix originated in the early 1970s at Bell Labs. It built upon ideas developed in other operating systems to become an extremely popular operating system, dominating the market of multi-user, timesharing operating systems. Since its inception, it has evolved greatly and now comes in many variations. The variations are similar in more ways than they differ. Basic functionality and the user interface are the same across all variations. The most notable differences show up in the window manager chosen and when doing systems programming or shell programming. The more recent variations of Unix include Linux, FreeBSD and several other free variations that are primarily targeted at the PC. Unix has a long-standing reputation as one of the most stable and secure operating systems in wide use, and also as one of the least user-friendly. Nevertheless, once you become familiar with Unix, it is very fast to use and powerful.

C originated in the early 1970s so that the folks at Bell Labs could use it to create Unix. At the time, C was considered a "high-level" language. By today's standards it is not high-level at all. By high-level we mean that the programming model a language provides is convenient for a programmer to use, and very likely harder to write a compiler for, because it provides a higher level of abstraction than the underlying hardware supports directly. C hovers nanometers above the hardware. This has several implications. First, the good news:

It's fast both to compile and to execute.
You can control the computer more directly.

Now, the bad news:

C assumes you know what you're doing. It will do what you tell it to do, without asking questions. For example, if an array has 10 elements, it will happily let you use an index of 10 or 100 or -100 and give you back an answer.
There is no magic. Nothing happens automatically just because a user clicked a mouse. This is good news in that it is easier to understand why things are happening. It's bad news because it means you need to define more yourself.
C does not come with a huge standard library like Java does. There is a much smaller standard library and beyond that everything (including graphical user interfaces) is system-specific.
C requires you to manage your own memory. If you've only programmed in Java, your response should be huh? You will learn what this means over the coming weeks.
C does not have classes, interfaces, inheritance and other features you have encountered in Java.

Given all these weaknesses of C, why should you learn it?

Knowing C may help you get a job as lots of existing programs are written in C and must be maintained.
For programs with real-time constraints, C is often still the language of choice.
C is very nearly a subset of C++: most C code is also valid C++, with a few exceptions. Thus learning C is a step toward learning C++.
Some of our upper level courses use C or C++ so this will give you a head start in those courses.

C was created in an era in which machines were slow and had little memory. To get maximum speed out of programs, the C runtime system does minimal error checking. The reasoning was that programmers could insert error checking code at just those places in which it was needed. Of course, the end result is that most C programs do insufficient error checking and can crash in unexpected ways.

Since C and Java have so much in common, we will not spend a great deal of time discussing their similarities, but will instead focus on their differences. The goals for today are to discuss the basic data types and statements, functions, and how to do input and output. By the end of today, you should be able to read, write, compile and execute simple C programs.

Basic Data Types

The basic data types in C are:

int: integer (generally 32 bits)


char: character

Booleans

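C, as originally defined, has no boolean type. (The C99 standard later added _Bool and, through the header <stdbool.h>, the names bool, true and false, and C23 made bool a keyword. These notes use the older convention, which assumes a compiler running in a pre-C23 mode and a program that does not include <stdbool.h>.) Of course, C still has if-statements and while-statements, so you would expect there to be a boolean type. Instead, the expressions in these statements are integers! A value of 0 is equivalent to false; a non-zero value is equivalent to true. It is generally a good idea to define a type to represent booleans with two constant values TRUE and FALSE. This can be done by putting the following 3 lines at the beginning of your C program: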

typedef int bool;
#define TRUE 1
#define FALSE 0

The first of these statements creates a new type name, bool, that is simply another name for int and behaves exactly the same. The reason to do this is to make your programs more understandable. If you are using a variable as a boolean, it would be good to declare it as such so that you (and others) can better understand your code. The next two lines each define a constant. #define is the preprocessor directive for defining constants: before the program is compiled, each occurrence of the symbol is replaced by its value. The word following the #define is the symbol being defined while the expression after that is the value it is given. You can now use TRUE and FALSE as symbols even though they will be interpreted as integers, as follows:

bool done = FALSE;
while (!done) {
    /* do more processing; set done = TRUE when finished */
}
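To see the integer-as-boolean convention inside a complete function, here is a minimal sketch. The function name count_down and the type name Boolean are made up for this illustration; Boolean stands in for bool only so that the sketch also compiles with newer compilers that already define bool.

```c
typedef int Boolean;   /* hypothetical name; plays the role of bool above */
#define TRUE 1
#define FALSE 0

/* Counts how many times n can be decremented before it reaches 0.
   Assumes n >= 0. */
int count_down(int n) {
    Boolean done = FALSE;
    int steps = 0;

    while (!done) {           /* !done is 1 while done is 0 */
        if (n == 0) {
            done = TRUE;      /* setting the flag ends the loop */
        } else {
            n--;
            steps++;
        }
    }
    return steps;
}
```

Calling count_down(3) from main runs the loop body four times: three decrements, then one pass that sets the flag.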

Unfortunately, the use of integer expressions in conditionals leads to the following extremely common C programming error:

if (a = 1) {
    ...
}

Note that the expression contains an assignment operator rather than an equality operator. The result is that 1 is assigned to a. This assignment expression returns the value assigned, in this case 1. 1 is equivalent to true so the body of the if-statement is always executed. This is almost certainly not what the programmer intended! Beware of this very common error and be sure to use == inside expressions rather than =.
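The difference between the two operators can be sketched with a pair of small functions. The names is_one and value_of_assignment are invented for this example. Many compilers can also warn about an assignment used as a condition when warnings are enabled (for example, with gcc's -Wall option).

```c
/* The intended test: is a equal to 1?  Uses ==, so a is only compared. */
int is_one(int a) {
    if (a == 1) {
        return 1;
    }
    return 0;
}

/* Shows what the buggy condition computes.  The value of the
   assignment expression (a = 1) is the value assigned, so the
   result is 1 no matter what a held on entry. */
int value_of_assignment(int a) {
    int result = (a = 1);
    return result;
}
```

Here is_one(5) returns 0, while value_of_assignment(5) returns 1, which is why an if-statement written with = always takes its body.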

Unsigned integers

It is possible to restrict the range of values that an integer takes on to be only non-negative integers (zero and up). You do this by prefacing the type name with the keyword unsigned:

unsigned int counter;

Assuming a 32-bit int, the declaration above gives you an integer that can take on the values from 0 to 2³²-1. If the unsigned keyword is missing, the value can range from -2³¹ to 2³¹-1 on the two's-complement machines in common use.

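As an example of C's lack of runtime checking, neither the C compiler nor the C runtime system prevents you from assigning negative numbers to variables declared to be unsigned. The value is silently converted by wrapping around: with a 32-bit int, assigning -1 stores 2³²-1. The unsigned keyword does change how arithmetic and comparisons behave, but as a statement that "this value is never negative" it is documentation of your intent, not something that is checked for you.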
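What happens to a negative value stored in an unsigned variable can be sketched as follows. The function name store_as_unsigned is invented for this example; UINT_MAX is the standard constant from <limits.h> holding the largest unsigned int.

```c
#include <limits.h>

/* Stores a (possibly negative) int in an unsigned int and returns it.
   No error is reported: the value wraps around modulo UINT_MAX + 1. */
unsigned int store_as_unsigned(int value) {
    unsigned int counter = value;
    return counter;
}
```

With this sketch, store_as_unsigned(-1) returns UINT_MAX, and non-negative values such as 5 come back unchanged.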

Enums

The enum keyword allows you to define enumerated types. This is a feature that was missing from Java until version 5 added its own enum types. When you define an enumerated type you provide a name for the type and the values that comprise the type:

enum Marsupial {
    kangaroo,
    wallaby,
    wallaroo
};

In this example the type is named Marsupial. It can take on the values kangaroo, wallaby, or wallaroo. The main things you do with enum values are to assign them to variables and to compare them, as in:

enum Marsupial pet;
pet = kangaroo;

As with all the types we have seen so far, C really treats these the same as integers. Thus, you could assign them to integer variables or assign an integer to an enum variable. These are generally not sensible things to do, but C will not complain if you try.
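The integer nature of enums can be made visible with a short sketch. By default the values are numbered from 0 in the order they are listed. The function names is_kangaroo and as_integer are invented for this illustration.

```c
enum Marsupial {
    kangaroo,   /* 0 */
    wallaby,    /* 1 */
    wallaroo    /* 2 */
};

/* Comparing enum values: returns 1 if pet is a kangaroo, 0 otherwise. */
int is_kangaroo(enum Marsupial pet) {
    return pet == kangaroo;
}

/* Returns the integer that C uses to represent an enum value. */
int as_integer(enum Marsupial pet) {
    return pet;
}
```

Here as_integer(wallaroo) returns 2, and the compiler accepts the conversion without complaint.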

Arrays

Arrays are basically the same as in Java with a few differences:

You can declare the size of an array when you declare the array as follows:
```
int scores[10];
```
This gives you an array of 10 integers indexed from 0 to 9.
There is no equivalent of the .length variable that Java's arrays have. The sizeof operator can compute the size of an array, but only where the array's own declaration is visible; once an array has been passed to a function, there is no way to ask it how big it is.
Since an array does not know how big it is, it cannot determine if a value you use to index into an array is within the size of the array. Even more silly, C does not even warn you if you use a negative index even though the lowest bound for any array is 0. So, what happens if you use an array index that is out of bounds? The behavior is undefined. Typically some random piece of memory will be returned or modified (depending on whether the bad index is on the right or left of an assignment statement), and the program may crash then or much later. What should you do about it? If you have any doubt at all about whether an index is valid, you should add code to compare it to the array bounds:
```
if (i >= 0 && i < 10) {
    scores[i] = 100;
} else {
    /* Do something else! */
}
```
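One way to keep the bounds check and the declaration in agreement is to name the size once and wrap the check in a function. This is only a sketch: NUM_SCORES, set_score and score_count are names invented here, not part of C or of the course code.

```c
#define NUM_SCORES 10   /* hypothetical constant naming the array size */

int scores[NUM_SCORES];

/* Stores value at index i only if i is within bounds.
   Returns 1 on success and 0 if the index was rejected. */
int set_score(int i, int value) {
    if (i >= 0 && i < NUM_SCORES) {
        scores[i] = value;
        return 1;
    }
    return 0;
}

/* sizeof can work out the element count, but only here, where the
   declaration of scores itself is visible. */
int score_count(void) {
    return (int)(sizeof(scores) / sizeof(scores[0]));
}
```

With this sketch, set_score(3, 100) succeeds, while set_score(10, 100) and set_score(-1, 100) return 0 and leave memory untouched.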

Basic Statements

The syntax for statements is the same in C as it is in Java. C has:

Assignment statements:
```
a = b;
```
If statements:
```
if (a == b) {
    ...
}
```
While statements:
```
while (a == b) {
    ...
}
```
Do statements:
```
do {
    ...
} while (a == b);
```

For statements:

for (i = 0; i < 10; i++) {
    ...
}
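To show a few of these statements working together, here is a small sketch. The function names sum and count_digits are invented for this example.

```c
/* Adds up the first n elements of values using a for statement. */
int sum(int values[], int n) {
    int i;
    int total = 0;

    for (i = 0; i < n; i++) {
        total = total + values[i];
    }
    return total;
}

/* Counts the decimal digits of n (assumes n >= 0) using a do statement.
   The body runs at least once, so 0 is reported as having one digit. */
int count_digits(int n) {
    int digits = 0;

    do {
        digits++;
        n = n / 10;
    } while (n > 0);
    return digits;
}
```

The do statement is used in count_digits because its body must execute once even when the condition is false from the start.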
