CodingBison - C Basics: Variable Types

Variables allow us to store data of different types. It is fairly easy to define and use a C variable. Let us see this using an example. The example (provided below) defines four variables: var_num1, var_num2, var_sum, and var_product. Next, it places the sum and product of the first two variables (var_num1 and var_num2) into the last two variables (var_sum and var_product).

 #include <stdio.h>

 int main () {
     int var_num1 = 10;
     int var_num2 = 5;
     int var_sum, var_product;

     /* Store the sum of var_num1 and var_num2 in var_sum */
     var_sum = var_num1 + var_num2;

     /* Store the product of var_num1 and var_num2 in var_product */
     var_product = var_num1 * var_num2;

     printf("Variables: var_num1: %d, var_num2: %d\n", var_num1, var_num2);
     printf("var_sum: %d, var_product: %d\n", var_sum, var_product);
     return 0;
 }

Let us compile and run the program.

 $ gcc variable.c -o var_o
 $ ./var_o
 Variables: var_num1: 10, var_num2: 5 
 var_sum: 15, var_product: 50

The above program shows how to assign a value to a variable, as in "var_num1 = 10" or "var_sum = var_num1 + var_num2". In each statement, the value on the right hand side (e.g. 10) gets assigned to the variable on the left hand side (e.g. var_num1). For assignment operations, the order is important: assigning in the opposite way (e.g. "10 = var_num1") would be a compilation error. The variable on the left hand side is also sometimes known as an lvalue.

The place where we define a variable determines its scope. For example, if we define a variable outside of any function (including main()), then it can be used throughout the rest of the file. On the other hand, if we define a variable inside a function, then it can be used only inside that function. We will revisit both functions and variable scope later!

Storage Types for Variables

In the above program, both var_num1 and var_num2 store an integer (we provide the "int" type before the variable names); an integer is a whole number without any fractional part. C provides storage for several types of numbers. For example, if we were to store "10.50" as the value of var_num1, the fraction would be lost: the program would still compile, but var_num1 would hold only 10 (try assigning "10.50" to var_num1 in the above program and recompiling it!). For such cases, C provides two additional types: float and double. On most current platforms, both int and float require 4 bytes, whereas a double requires 8 bytes.

Besides the int type, C has other types for storing integers, including short and char (as well as the larger long and long long). On most platforms, variables of type "short" require 2 bytes; variables of type "char" always require exactly 1 byte.

A char (or character) type can also store letters of the alphabet or other non-numeric characters. For example, we can use a char to store values like 'A' (which has an ASCII value of 65). Since a char variable uses only 1 byte (typically 8 bits), the maximum value that an unsigned char can store is 255 ((2 to the power 8) - 1). Technically speaking, when we assign a character to a variable, the variable actually stores the numeric ASCII value.

All of the above types can store both positive and negative values (for a plain char, this depends on the compiler). To do this, they use the leftmost bit to record whether the value is positive or negative. Thus, to store the sign of a value, we lose one bit of storage. When working with only non-negative numbers, we can avoid this loss by using the "unsigned" versions of char, short, and int: "unsigned char", "unsigned short", and "unsigned int". C does not provide unsigned variants of float and double!

With that, let us use a small example to investigate variables further. This example defines variables of different storage types and prints their values/sizes; for that, it uses the built-in sizeof operator to retrieve the size of a variable. Please note that the printf() function uses "%f" for printing fractional numbers (float and double), "%d" for printing integer numbers, "%u" for printing unsigned integers, and "%zu" for printing the size_t values that sizeof produces. We will revisit printf() a little later.

 #include <stdio.h>

 int main () {
     char var_char = -10;
     short var_short = -10;
     int var_int = -10;
     float var_float = -10.50;
     double var_double = -10.50;

     /* These are the unsigned variants */
     unsigned char var_uchar = 10;
     unsigned short var_ushort = 10;
     unsigned int var_uint = 10;

     /* sizeof yields a size_t, so its format specifier is %zu */
     printf("  var_char: %-5d (sizeof char: %zu)\n", var_char, sizeof(var_char));
     printf(" var_short: %-5d (sizeof short: %zu)\n", var_short, sizeof(var_short));
     printf("   var_int: %-5d (sizeof int: %zu)\n", var_int, sizeof(var_int));
     printf(" var_float: %.1f (sizeof float: %zu)\n", var_float, sizeof(var_float));
     printf("var_double: %.1f (sizeof double: %zu)\n", var_double, sizeof(var_double));
     printf(" var_uchar: %-5u (sizeof unsigned char: %zu)\n", var_uchar, sizeof(var_uchar));
     printf("var_ushort: %-5u (sizeof unsigned short: %zu)\n", var_ushort, sizeof(var_ushort));
     printf("  var_uint: %-5u (sizeof unsigned int: %zu)\n", var_uint, sizeof(var_uint));
     return 0;
 }

Assuming that the file name of the above program is "sizes.c", here is its compilation and output:

 $ gcc sizes.c -o size
 $
 $ ./size
   var_char: -10   (sizeof char: 1)
  var_short: -10   (sizeof short: 2)
    var_int: -10   (sizeof int: 4)
  var_float: -10.5 (sizeof float: 4)
 var_double: -10.5 (sizeof double: 8)
  var_uchar: 10    (sizeof unsigned char: 1)
 var_ushort: 10    (sizeof unsigned short: 2)
   var_uint: 10    (sizeof unsigned int: 4)

If we were to assign "-10" to var_uchar or any of the unsigned variables, then the value printed would not be "-10". C converts the negative number by wrapping it around the range of the unsigned type: with an 8-bit unsigned char, -10 becomes 246 (256 - 10). Clearly, it is not a good idea to assign a negative value to an unsigned variable!

Variable Operators

Besides simple operators for addition and multiplication, C provides additional arithmetic operators as well: subtraction using "-", division using "/", modulo using "%", etc. We should note that the modulo operator has an important constraint: both of its operands must be of an integer type (such as int, short, or char) and not float or double.

C also provides a set of operators that operate on the value of a variable and store the result back into it. For example, if we were to increase the value of "var1" by 10, we could do "var1 = var1 + 10" or more crisply, "var1 += 10". Likewise, if we were to multiply the value of "var1" by 10, we could do "var1 = var1 * 10" or "var1 *= 10". The same rule applies to several other operators like subtraction ("-"), division ("/"), modulo ("%"), bit-shifts ("<<" or ">>"), bitwise and ("&"), bitwise or ("|"), etc.

Another set of handy operators are the unary operators "++" and "--"; these operators increment and decrement a variable by 1, respectively. Thus, "var1 = var1 + 1" is the same as "var1++". Likewise, "var1 = var1 - 1" is the same as "var1--".

However, we should note that the above unary operators can be applied either before or after a variable, and the two placements have different meanings. For example, "var1++" means that we use the variable first and then increment it, whereas "++var1" means we increment the variable first and then use it. Thus, "var2 = ++var1" and "var2 = var1++" mean different assignments to var2. It is a bad idea to have complicated expressions involving these operators; overzealous usage of these operators can create confusing statements that will reduce code-readability!

Let us use a simple example to illustrate the usage of above operators.

 #include <stdio.h>

 int main () {
     int var_num1, var_num2, var_num3; 

     var_num1 = 10;
     var_num1 += 100;
     printf("[Line: %2d] var_num1 is: %d\n", __LINE__, var_num1);

     var_num1 = 10;
     var_num1 *= 10;
     printf("[Line: %2d] var_num1 is: %d\n", __LINE__, var_num1);

     var_num1 = 10;
     var_num1 %= 7;
     printf("[Line: %2d] var_num1 is: %d\n", __LINE__, var_num1);

     var_num1 = 10;
     var_num1 /= 2;
     printf("[Line: %2d] var_num1 is: %d\n", __LINE__, var_num1);

     var_num1 = 10;
     /* Increment var_num1 first, then assign */
     var_num2 = ++var_num1;

     var_num1 = 10;
     /* Assign first, then increment var_num1 */
     var_num3 = var_num1++;

     printf("[Line: %2d] var_num2 is: %d, var_num3 is : %d\n", 
           __LINE__, var_num2, var_num3);
     return 0;
 }

We provide the output of the above program below. Please note that the predefined macro __LINE__ expands to the number of the source line on which it appears.

 $ gcc operations.c  -o operations
 $ 
 $ ./operations
 [Line:  8] var_num1 is: 110
 [Line: 12] var_num1 is: 100
 [Line: 16] var_num1 is: 3
 [Line: 20] var_num1 is: 5
 [Line: 31] var_num2 is: 11, var_num3 is : 10

C also provides advanced storage types like arrays and strings. In the following sections, we provide a brief introduction for both arrays and strings. We will revisit both of them later.

Arrays

A C array is a series of data, all of them being of the same type. When we define an array, we need to provide a name, the storage type of array elements and the number of the elements. Thus, if we say "int painting_array[1000]", then C defines an array with name painting_array that holds 1000 integer values.

Each array element is identified using an index. For an array with n values, the first element has an index of 0, the second element has the next index value of 1, and in the end, the last element has an index of (n-1). We can easily navigate, and update the integer values stored in this array.

The following figure shows an array (with name painting_array) that has n elements (for the sake of simplicity, we keep the values of all elements as 0).

               painting_array[1]              painting_array[n-1]
                       |                               |
                       |                               |
                       V                               V
                 -----------------------------------------
                 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 
                 -----------------------------------------
                   ^
                   |
                   |
                 painting_array[0]

                     Figure: An Array with n elements

We provide a small example that shows the usage of an array. The program begins by declaring "int painting_array[4]", which means that painting_array is an array of 4 integer elements. This program uses a "for" loop; put simply, a "for" loop helps us traverse all the elements of the array. We will revisit the "for" loop later.

 #include <stdio.h>

 int main () {
     int i;
     int painting_array[4]; /* Define the array */

     /* Assign values to array elements */
     painting_array[0] = 1000;
     painting_array[1] = 1001;
     painting_array[2] = 1002;
     painting_array[3] = 1003;

     for (i=0; i < 4; i++) {
         printf("[i: %d] painting id: %d \n", i, painting_array[i]);
     }   

     printf("Length of array: %zu \n", 
         sizeof(painting_array)/sizeof(painting_array[0]));
     return 0;
 }

The above example uses the sizeof operator to print the number of elements in the array. First, it uses sizeof to find the total storage of the array and then, it uses sizeof again to find the storage of the first element. Next, it divides the total storage of the array by the storage of the first element to get the total number of elements in the array. Note that since all elements are of the same size, it does not matter if we take the size of the first element or any other. Here is the output:

 $ gcc arrays.c -o array
 $ 
 $ ./array
 [i: 0] painting id: 1000 
 [i: 1] painting id: 1001 
 [i: 2] painting id: 1002 
 [i: 3] painting id: 1003 
 Length of array: 4

Strings

A C string is an array of char elements. As we saw earlier, a char type requires a storage of one byte. So, a string is essentially an array of 1 byte-sized elements. To mark the end of a string, a C string has '\0' as its last character, which is the NUL terminator. The NUL terminator has an ASCII value of 0 and should not be confused with the '0' character, which has an ASCII value of 48!

Once again, we provide a program to illustrate C strings. The example begins by defining a string, var_string; the var_string is defined with "[]" to indicate that it is an array. The program also initializes the var_string with characters ("Mona Lisa"). Next, we find the length of the array and provide this as a limit to the "for" loop. After that, the program uses a "for" loop to iterate through the character elements of the string, var_string.

The example calculates the size of the string, var_string, using two methods. The first method is our earlier method, where we find the storage of the array and divide it by the storage of the first element; this method counts every element of the array, including the NUL terminator and any unused elements after it. The second method uses the standard library function strlen(); this function returns the length of a string and we need to include the "string.h" header to use it. The strlen() function does not count the NUL terminator.

Here is the example:

 #include <stdio.h>
 #include <string.h>

 int main () {
     int counter, size;
     char var_string[12] = "Mona Lisa";

     size = sizeof(var_string)/sizeof(var_string[0]);
     for (counter=0; counter < size; counter++) {
         printf("[i: %d] var_string: %c \n", counter, var_string[counter]);
     }

     printf("The length of this string is %d \n", size);
     printf("The length of this string is %zu \n", strlen(var_string));
     return 0;
 }

Note that, in the above definition of var_string, we did not strictly need to provide the number of elements in the array. C allows us to omit it when we initialize an array with elements. C uses the type of elements (char in this case) and then allocates just enough space to accommodate the passed initial value ("Mona Lisa" plus the NUL terminator, which is 10 bytes in this case). Thus, "char var_string[] = "Mona Lisa";" would also have worked, although the array would then hold 10 elements instead of 12!

The output (provided below) shows that each element stores one of the characters of the string, "Mona Lisa". Note that the elements after index 8 hold NUL characters (the terminator at index 9, followed by the unused elements, which C fills with zeros) and hence print as empty characters.

 $ gcc string.c -o string
 $
 $ ./string
 [i: 0] var_string: M 
 [i: 1] var_string: o 
 [i: 2] var_string: n 
 [i: 3] var_string: a 
 [i: 4] var_string:   
 [i: 5] var_string: L 
 [i: 6] var_string: i 
 [i: 7] var_string: s 
 [i: 8] var_string: a 
 [i: 9] var_string:  
 [i: 10] var_string:  
 [i: 11] var_string:  
 The length of this string is 12 
 The length of this string is 9