C defines several built-in data types such as char, short, int, and long, and it supports arrays, including the character arrays used as strings. It does not stop there: it also lets us build data structures that act as custom data types. A data structure combines members of various types (int, char, and so on) into a single unit tailored to the application's requirements. Data structures are versatile and can include other data structures as well. It is this ability to bind together storage of various types that makes data structures so important in C programming.
We use the "struct" keyword to define a structure. The keyword is followed by the name of the structure and then, enclosed within braces, a list of member declarations, each terminated by a semi-colon. Each of these declarations is referred to as a member of the data structure.
Let us say, we want to write an application that handles an inventory of paintings and would like to use a data structure to hold information relevant to a painting. We can define a structure as follows; do not forget the semi-colon at the end of the structure definition.
struct painting_frame {
    int painting_id;
    int width;
    int height;
    float price;
    char painter[50];
};
The above structure contains data members (or fields) that store the painter's name, the price of the framed painting, and the width and height of the frame; the structure also stores a unique painting ID for each painting. Thus, we can have one instance of this data structure for each painting and a common data representation for all paintings.
We do not have to stop here! We can even define an array of structures. The code snippet provided below defines all_frames as an array of painting_frame data structures. The value of MAX_FRAMES is 1000, which means all_frames can store information about 1000 paintings.
#define MAX_FRAMES 1000

struct painting_frame all_frames[MAX_FRAMES];
Once we have defined a data structure, we need to access its members. C allows us to refer to members of a data structure using the dot (".") operator.
Let us see an example that illustrates the use of data structures. The example (provided below) initializes a single structure as well as an array of the same structure, with MAX_FRAMES reduced to 5 to keep the output short, and also accesses their members. To keep the code compact, we provide a function, print_painting_frame(), that prints various members of the painting_frame data structure.
#include <stdio.h>
#include <string.h>

#define MAX_FRAMES 5

typedef struct painting_frame {
    int painting_id;
    int width;
    int height;
    float price;
    char painter[50];
} painting_frame_t;

void print_painting_frame(painting_frame_t *frame) {
    printf("ID: %d Width: %d Height: %d Painter: %s\n",
           (*frame).painting_id, (*frame).width,
           (*frame).height, (*frame).painter);
}

int main() {
    painting_frame_t sample_frame;           /* Single data structure */
    painting_frame_t all_frames[MAX_FRAMES]; /* An array of data structures */
    int i;

    /* Initialize members of sample_frame */
    sample_frame.painting_id = 100;
    sample_frame.width = 50;
    sample_frame.height = 100;
    memcpy(sample_frame.painter, "Leonardo da Vinci",
           strlen("Leonardo da Vinci") + 1);

    printf("Let us print info about sample_frame\n");
    print_painting_frame(&sample_frame);

    /* Initialize members of an array of painting_frame */
    for (i=0; i < MAX_FRAMES; i++) {
        all_frames[i].painting_id = i;
        all_frames[i].width = 50;
        all_frames[i].height = 100;
        /* Let us say, these paintings are painted by Van Gogh */
        memcpy(all_frames[i].painter, "Van Gogh", strlen("Van Gogh") + 1);
    }

    printf("\nLet us print info about painting array\n");
    for (i=0; i < MAX_FRAMES; i++) {
        print_painting_frame(&all_frames[i]);
    }
    return 0;
}
Here is the output of the above code:
Let us print info about sample_frame
ID: 100 Width: 50 Height: 100 Painter: Leonardo da Vinci

Let us print info about painting array
ID: 0 Width: 50 Height: 100 Painter: Van Gogh
ID: 1 Width: 50 Height: 100 Painter: Van Gogh
ID: 2 Width: 50 Height: 100 Painter: Van Gogh
ID: 3 Width: 50 Height: 100 Painter: Van Gogh
ID: 4 Width: 50 Height: 100 Painter: Van Gogh
Note that it is possible to initialize the members of a structure in the declaration itself. The following code defines a simple structure and initializes its members as part of the definition of temp_painting_d; the values are assigned to the members in the order in which the members are declared.
typedef struct painting_dimensions {
    float length;
    float width;
    float height;
} painting_d;

painting_d temp_painting_d = {100.0, 80.0, 10.0};
Lastly, data structures can include other data structures. For example, if we want a structure that holds the details of a bid on a frame along with the painting_frame_t itself, then here is how we can define it:
typedef struct frame_bid {
    painting_frame_t painting_frame;
    int bid_value;
    char *bidder_name;
} frame_bid_t;
Data structures behave like any other variable: even though a structure is a collection of variables, we can still assign one structure to another of the same type, and the assignment copies every member. The following example uses a painting_frame_t structure to define two variables, frame1 and frame2. Then it assigns frame1 to frame2. When we run the example, we find that frame2 has the same values as frame1.
#include <stdio.h>

typedef struct painting_frame {
    int painting_id;
    int width;
    int height;
} painting_frame_t;

void print_painting_frame(painting_frame_t *frame) {
    printf("ID: %d Width: %d Height: %d\n",
           (*frame).painting_id, (*frame).width, (*frame).height);
}

int main() {
    /* All three members are ints, so use integer initializers */
    painting_frame_t frame1 = {100, 80, 10};
    painting_frame_t frame2;

    frame2 = frame1;   /* Copies every member of frame1 into frame2 */

    print_painting_frame(&frame1);
    print_painting_frame(&frame2);
    return 0;
}
Since data structures can potentially store large amounts of data, we often access data structures using pointers; this is especially true when we pass data structures to functions. Passing by pointer ensures that only the address of the structure is passed instead of a copy of the data contained by the structure; passing by value can be inefficient.
When we refer to a member of a data structure through a pointer, we use the structure pointer operator, "->", in place of the dot operator. Applying the dot operator directly to a pointer (as in "new_frame.painting_id") is a compilation error.
painting_frame_t sample_frame;
painting_frame_t *new_frame = &sample_frame;

// Change the ID to new ID
new_frame->painting_id = 101;
Of course, the alternative is to write "(*new_frame).painting_id", where we dereference the pointer first and then use the dot operator. Clearly, using "->" is more convenient when dealing with pointers!
Next, we show a sample address of a data structure and a pointer pointing to it. In the figure provided below, new_frame is a pointer that points to the sample_frame data structure. As with other pointers, the value of new_frame is simply the address of sample_frame.
  Address: 0x6fff520                        Address: 0x211f000
  ------------------                        -------------------------
 | Value: 0x211f000 | -----------------> | Value painting_id: 101  |
  ------------------                       | ------------------------|
  Name: new_frame                          | Value width: 0          |
                                           | ------------------------|
                                           | Value height: 0         |
                                           | ------------------------|
                                           | Value price: 0.0        |
                                           | ------------------------|
                                           | Value painter: ""       |
                                            -------------------------
                                            Name: sample_frame

  Figure: A pointer stores an address of a data structure
We do not necessarily have to define a structure as a variable and then use a pointer to refer to it. We can define a pointer from the very beginning, allocate memory for it, and then continue to use it. Do not forget to free the allocated memory once you are done!
Here is a sample program that modifies the earlier program to make sample_frame a pointer from the get-go!
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

typedef struct painting_frame {
    int painting_id;
    int width;
    int height;
    float price;
    char painter[50];
} painting_frame_t;

void print_painting_frame(painting_frame_t *frame) {
    printf("ID: %d Width: %d Height: %d Painter: %s\n",
           frame->painting_id, frame->width,
           frame->height, frame->painter);
}

int main() {
    painting_frame_t *sample_frame;

    sample_frame = (painting_frame_t *)malloc(sizeof(painting_frame_t));
    if (sample_frame == NULL) {
        printf("Error: Malloc Failed\n");
        return -1;
    }

    /* Initialize members of sample_frame */
    sample_frame->painting_id = 100;
    sample_frame->width = 50;
    sample_frame->height = 100;
    memcpy(sample_frame->painter, "Leonardo da Vinci",
           strlen("Leonardo da Vinci") + 1);

    printf("Let us print info about sample_frame\n");
    print_painting_frame(sample_frame);

    free(sample_frame);
    return 0;
}
Here is the output:
Let us print info about sample_frame
ID: 100 Width: 50 Height: 100 Painter: Leonardo da Vinci
Like structures, unions are another construct that allows us to declare multiple data members. The difference between the two is that a union can hold only one value at a given time, because all the members of a union share the same storage. The size of a union is therefore at least the size of its largest member; the compiler may add padding to satisfy alignment.
We define a union using the "union" keyword. Union members are accessed just like members of data structures. If we are using a union variable, then we use the dot operator ("."). If we are using a union pointer, then we use the pointer operator ("->").
Since a union can hold values of different types at different times, one way to avoid ambiguity is to maintain a state for each union instance that records which member is currently valid. Whenever we store a value of a different type in the union, we need to update the state as well.
With that, let us see an example that uses a union to hold various tax-related values for a painting. The example uses a union painting_tax that contains members of different types reflecting taxes and commissions. It also uses an enum, tax_type, with one value for each of these members. The example also prints the addresses of the union members.
#include <stdio.h>
#include <string.h>

typedef union painting_tax {
    char tax_code[8];
    double commission;
    int city_tax;
} painting_tax_t;

typedef enum tax_type {
    TAX_CODE,
    COMMISSION,
    CITY_TAX
} tax_type_t;

/* Prints the member selected by type; %p expects a void pointer */
void print_tax (painting_tax_t *tax, tax_type_t type) {
    switch (type) {
    case TAX_CODE:
        printf("The Tax code is %s (Address: %p)\n",
               tax->tax_code, (void *)tax->tax_code);
        break;
    case COMMISSION:
        printf("The Commission is %f (Address: %p)\n",
               tax->commission, (void *)&tax->commission);
        break;
    case CITY_TAX:
        printf("The City tax is %d (Address: %p)\n",
               tax->city_tax, (void *)&tax->city_tax);
        break;
    default:
        printf("Error: unknown category\n");
        break;
    }
}

int main () {
    painting_tax_t tax;
    tax_type_t type;

    /* sizeof yields a size_t, which is printed with %zu */
    printf("The sizeof painting_tax_t is %zu (Address: %p)\n",
           sizeof(tax), (void *)&tax);

    type = TAX_CODE;
    strcpy(tax.tax_code, "TXC-101");
    print_tax(&tax, type);

    type = COMMISSION;
    tax.commission = 9.25;
    print_tax(&tax, type);

    type = CITY_TAX;
    tax.city_tax = 10;
    print_tax(&tax, type);

    /* It is an error to not update the type variable */
    strcpy(tax.tax_code, "TXC-101");
    print_tax(&tax, type);

    return 0;
}
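The output demonstrates that each time we call print_tax(), it prints the member selected by the variable type. In the last case, we update tax_code but do not update type to TAX_CODE, so print_tax() still treats the union as a city_tax and prints a meaningless number: the first four bytes of the string "TXC-101" interpreted as an int (759388244 is 0x2D435854, that is, the characters 'T', 'X', 'C', '-' on a little-endian machine).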
The sizeof painting_tax_t is 8 (Address: 0xbff16734)
The Tax code is TXC-101 (Address: 0xbff16734)
The Commission is 9.250000 (Address: 0xbff16734)
The City tax is 10 (Address: 0xbff16734)
The City tax is 759388244 (Address: 0xbff16734)
The output also shows that the size of the union is 8, which equals the size of its largest members, tax_code and commission. Lastly, the address of the union and the addresses of all of its members are identical, which confirms that the members share the same memory location.