Data types
C has several types of variables, but there are a few basic types:
- Integers - whole numbers which can be either positive or negative. Defined using char, int, short, long or long long.
- Unsigned integers - whole numbers which can only be positive. Defined using unsigned char, unsigned int, unsigned short, unsigned long or unsigned long long.
- Floating point numbers - real numbers (numbers with fractions). Defined using float and double.
- Structures - will be explained later, in the Structures section. The different types of variables define their bounds. A char can range only from -128 to 127, whereas a long can range from -2,147,483,648 to 2,147,483,647 (long and other numeric data types may have another range on different computers, for example - from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 on 64-bit computer).
Note that C does not have a boolean type. Usually, it is defined using the following notation:
#define BOOL char
#define FALSE 0
#define TRUE 1
C uses arrays of characters to define strings, and will be explained in the Strings section.
Defining variables For numbers, we will usually use the type int. On most computers today, it is a 32-bit number, which means the number can range from -2,147,483,648 to 2,147,483,647.
To define the variables foo and bar, we need to use the following syntax:
int foo;
int bar = 1;
The variable foo can be used, but since we did not initialize it, we don’t know what’s in it. The variable bar contains the number 1.
Now, we can do some math. Assuming a, b, c, d, and e are variables, we can simply use plus, minus and multiplication operators in the following notation, and assign a new value to a:
int a = 0, b = 1, c = 2, d = 3, e = 4;
a = b - c + d * e;
printf("%d", a); /* will print 1-2+3*4 = 11 */
Arrays
Arrays are special variables which can hold more than one value under the same variable name, organised with an index. Arrays are defined using a very straightforward syntax:
/* defines an array of 10 integers */
int numbers[10];
Accessing a number from the array is done using the same syntax. Notice that arrays in C are zero-based, which means that if we defined an array of size 10, then the array cells 0 through 9 (inclusive) are defined. numbers[10] is not an actual value.
int numbers[10];
/* populate the array */
numbers[0] = 10;
numbers[1] = 20;
numbers[2] = 30;
numbers[3] = 40;
numbers[4] = 50;
numbers[5] = 60;
numbers[6] = 70;
/* print the 7th number from the array, which has an index of 6 */
printf("The 7th number in the array is %d", numbers[6]);
Arrays can only have one type of variable, because they are implemented as a sequence of values in the computer’s memory. Because of that, accessing a specific array cell is very efficient.
Multidimensional Arrays
In the previous tutorials on Arrays, we covered, well, arrays and how they work. The arrays we looked at were all one-dimensional, but C can create and use multi-dimensional arrays. Here is the general form of a multidimensional array declaration:
type name[size1][size2]...[sizeN];
For example, here’s a basic one for you to look at -
int foo[1][2][3];
or maybe this one -
char vowels[1][5] = {
{'a', 'e', 'i', 'o', 'u'}
};
Two-dimensional Arrays
The simplest form of multidimensional array is the two-dimensional array. A two-dimensional array is pretty much a list of one-dimensional arrays. To declare a two-dimensional integer array of size [ x ][ y ], you would write something like this β
type arrayName [x][y];
Where type can be any C data type (int, char, long, long long, double, etc.) and arrayName will be a valid C identifier, or variable. A two-dimensional array can be considered as a table which will have [ x ] number of rows and [ y ] number of columns.
In this sense, every element in the array a is identified by an element name in the form a[i][j], where ‘a’ is the name of the array, and ‘i’ and ‘j’ are the indexes that uniquely identify, or show, each element in ‘a’.
And honestly, you really don’t have to put in a [ x ] value really, because if you did something like this -
char vowels[][5] = {
{'A', 'E', 'I', 'O', 'U'},
{'a', 'e', 'i', 'o', 'u'}
};
the compiler would already know that there are two “dimensions” you could say, but, you NEED a [ y ] value!! The compiler may be smart, but it will not know how many integers, characters, floats, whatever you’re using you have in the dimensions. Keep that in mind.
Initializing Two-Dimensional Arrays
Multidimensional arrays may be used by specifying bracketed[] values for each row. Below is an array with 3 rows and each row has 4 columns. To make it easier, you can forget the 3 and keep it blank, it’ll still work.
int a[3][4] = {
{0, 1, 2, 3} , /_ initializers for row indexed by 0 _/
{4, 5, 6, 7} , /_ initializers for row indexed by 1 _/
{8, 9, 10, 11} /_ initializers for row indexed by 2 _/
};
The inside braces, which indicates the wanted row, are optional. The following initialization is the same to the previous example β
int a[3][4] = {0,1,2,3,4,5,6,7,8,9,10,11};
Accessing Two-Dimensional Array Elements
An element in a two-dimensional array is accessed by using the subscripts, i.e., row index and column index of the array. For example β
int val = a[2][3]; The above statement will take the 4th element from the 3rd row of the array.
Conditions
Decision Making
In life, we all have to make decisions. In order to make a decision we weigh out our options and so do our programs.
Here is the general form of the decision making structures found in C.
int target = 10;
if (target == 10) {
printf("Target is equal to 10");
}
The if statement
The if statement allows us to check if an expression is true or false, and execute different code according to the result.
To evaluate whether two variables are equal, the == operator is used, just like in the first example.
Inequality operators can also be used to evaluate expressions. For example:
int foo = 1;
int bar = 2;
if (foo < bar) {
printf("foo is smaller than bar.");
}
if (foo > bar) {
printf("foo is greater than bar.");
}
We can use the else keyword to execute code when our expression evaluates to false.
int foo = 1;
int bar = 2;
if (foo < bar) {
printf("foo is smaller than bar.");
} else {
printf("foo is greater than bar.");
}
Sometimes we will have more than two outcomes to choose from. In these cases, we can “chain” multiple if else statements together.
int foo = 1;
int bar = 2;
if (foo < bar) {
printf("foo is smaller than bar.");
} else if (foo == bar) {
printf("foo is equal to bar.");
} else {
printf("foo is greater than bar.");
}
You can also nest if else statements if you like.
int peanuts_eaten = 22;
int peanuts_in_jar = 100;
int max_peanut_limit = 50;
if (peanuts_in_jar > 80) {
if (peanuts_eaten < max_peanut_limit) {
printf("Take as many peanuts as you want!\n");
}
} else {
if (peanuts_eaten > peanuts_in_jar) {
printf("You can't have anymore peanuts!\n");
}
else {
printf("Alright, just one more peanut.\n");
}
}
Two or more expressions can be evaluated together using logical operators to check if two expressions evaluate to true together, or at least one of them. To check if two expressions both evaluate to true, use the AND operator &&. To check if at least one of the expressions evaluate to true, use the OR operator ||.
int foo = 1;
int bar = 2;
int moo = 3;
if (foo < bar && moo > bar) {
printf("foo is smaller than bar AND moo is larger than bar.");
}
if (foo < bar || moo > bar) {
printf("foo is smaller than bar OR moo is larger than bar.");
}
The NOT operator ! can also be used likewise:
int target = 9;
if (target != 10) {
printf("Target is not equal to 10");
}
Strings
Defining strings
Strings in C are actually arrays of characters. Although using pointers in C is an advanced subject, fully explained later on, we will use pointers to a character array to define simple strings, in the following manner:
char * name = "John Smith";
This method creates a string which we can only use for reading. If we wish to define a string which can be manipulated, we will need to define it as a local character array:
char name[] = "John Smith";
This notation is different because it allocates an array variable so we can manipulate it. The empty brackets notation [] tells the compiler to calculate the size of the array automatically. This is in fact the same as allocating it explicitly, adding one to the length of the string:
char name[] = "John Smith";
/_ is the same as _/
char name[11] = "John Smith";
The reason that we need to add one, although the string John Smith is exactly 10 characters long, is for the string termination: a special character (equal to 0) which indicates the end of the string. The end of the string is marked because the program does not know the length of the string - only the compiler knows it according to the code.
String formatting with printf
We can use the printf command to format a string together with other strings, in the following manner:
char \* name = "John Smith";
int age = 27;
/_ prints out 'John Smith is 27 years old.' _/
printf("%s is %d years old.\n", name, age);
Notice that when printing strings, we must add a newline (\n) character so that our next printf statement will print in a new line.
String Length
The function ‘strlen’ returns the length of the string which has to be passed as an argument:
char \* name = "Nikhil";
printf("%d\n",strlen(name));
String comparison
The function strncmp compares between two strings, returning the number 0 if they are equal, or a different number if they are different. The arguments are the two strings to be compared, and the maximum comparison length. There is also an unsafe version of this function called strcmp, but it is not recommended to use it. For example:
char \* name = "John";
if (strncmp(name, "John", 4) == 0) {
printf("Hello, John!\n");
} else {
printf("You are not John. Go away.\n");
}
String Concatenation
The function ‘strncat’ appends first n characters of src string to the destination string where n is min(n,length(src)); The arguments passed are destination string, source string, and n - maximum number of characters to be appended. For Example:
char dest[20]="Hello";
char src[20]="World";
strncat(dest,src,3);
printf("%s\n",dest);
strncat(dest,src,20);
printf("%s\n",dest);
For loops
For loops in C are straightforward. They supply the ability to create a loop - a code block that runs multiple times. For loops require an iterator variable, usually notated as i.
For loops give the following functionality:
- Initialize the iterator variable using an initial value
- Check if the iterator has reached its final value
- Increases the iterator
For example, if we wish to iterate on a block for 10 times, we write:
int i;
for (i = 0; i < 10; i++) {
printf("%d\n", i);
}
This block will print the numbers 0 through 9 (10 numbers in total).
For loops can iterate on array values. For example, if we would want to sum all the values of an array, we would use the iterator i as the array index:
int array[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int sum = 0;
int i;
for (i = 0; i < 10; i++) {
sum += array[i];
}
/_ sum now contains a[0] + a[1] + ... + a[9] _/
printf("Sum of the array is %d\n", sum);
While loops
While loops are similar to for loops, but have less functionality. A while loop continues executing the while block as long as the condition in the while remains true. For example, the following code will execute exactly ten times:
int n = 0;
while (n < 10) {
n++;
}
While loops can also execute infinitely if a condition is given which always evaluates as true (non-zero):
while (1) {
/_ do something _/
}
Loop directives
There are two important loop directives that are used in conjunction with all loop types in C - the break and continue directives.
The break directive halts a loop after ten loops, even though the while loop never finishes:
int n = 0;
while (1) {
n++;
if (n == 10) {
break;
}
}
In the following code, the continue directive causes the printf command to be skipped, so that only even numbers are printed out:
int n = 0;
while (n < 10) {
n++;
/* check that n is odd */
if (n % 2 == 1) {
/* go back to the start of the while block */
continue;
}
/* we reach this code only if n is even */
printf("The number %d is even.\n", n);
}
Functions
C functions are simple, but because of how C works, the power of functions is a bit limited.
Functions receive either a fixed or variable amount of arguments. Functions can only return one value, or return no value. In C, arguments are copied by value to functions, which means that we cannot change the arguments to affect their value outside of the function. To do that, we must use pointers, which are taught later on.
Functions are defined using the following syntax:
int foo(int bar) {
/* do something */
return bar * 2;
}
int main() {
foo(1);
}
The function foo we defined receives one argument, which is bar. The function receives an integer, multiplies it by two, and returns the result.
To execute the function foo with 1 as the argument bar, we use the following syntax:
foo(1);
In C, functions must be first defined before they are used in the code. They can be either declared first and then implemented later on using a header file or in the beginning of the C file, or they can be implemented in the order they are used (less preferable).
The correct way to use functions is as follows:
/* function declaration */
int foo(int bar);
int main() {
/* calling foo from main */
printf("The value of foo is %d", foo(1));
}
int foo(int bar) {
return bar + 1;
}
We can also create functions that do not return a value by using the keyword void:
void moo() {
/* do something and don't return a value */
}
int main() {
moo();
}
Static
static is a keyword in the C programming language. It can be used with variables and functions.
What is a static variable?
By default, variables are local to the scope in which they are defined. Variables can be declared as static to increase their scope up to file containing them. As a result, these variables can be accessed anywhere inside a file.
Consider the following scenario β where we want to count the runners participating in a race:
#include<stdio.h>
int runner() {
int count = 0;
count++;
return count;
}
int main()
{
printf("%d ", runner());
printf("%d ", runner());
return 0;
}
We will see that count is not updated because it is removed from memory as soon as the function completes. If static is used however, we get the desired result:
#include<stdio.h>
int runner()
{
static int count = 0;
count++;
return count;
}
int main()
{
printf("%d ", runner());
printf("%d ", runner());
return 0;
}
What is a static function?
By default, functions are global in C. If we declare a function with static, the scope of that function is reduced to the file containing it.
The syntax looks like this:
static void fun(void) {
printf("I am a static function.");
}
Pointers
Pointers are also variables and play a very important role in C programming language. They are used for several reasons, such as:
- Strings
- Dynamic memory allocation
- Sending function arguments by reference
- Building complicated data structures
- Pointing to functions
- Building special data structures (i.e. Tree, Tries, etc…)
And many more.
What is a pointer?
A pointer is essentially a simple integer variable which holds a memory address that points to a value, instead of holding the actual value itself.
The computer’s memory is a sequential store of data, and a pointer points to a specific part of the memory. Our program can use pointers in such a way that the pointers point to a large amount of memory - depending on how much we decide to read from that point on.
Strings as pointers
We’ve already discussed strings, but now we can dive in a bit deeper and understand what strings in C really are (which are called C-Strings to differentiate them from other strings when mixed with C++)
The following line:
char * name = "John";
does three things:
- It allocates a local (stack) variable called name, which is a pointer to a single character.
- It causes the string “John” to appear somewhere in the program memory (after it is compiled and executed, of course).
- It initializes the name argument to point to where the J character resides at (which is followed by the rest of the string in the memory).
If we try to access the name variable as an array, it will work, and will return the ordinal value of the character J, since the name variable actually points exactly to the beginning of the string.
Since we know that the memory is sequential, we can assume that if we move ahead in the memory to the next character, we’ll receive the next letter in the string, until we reach the end of the string, marked with a null terminator (the character with the ordinal value of 0, noted as \0).
Dereferencing
Dereferencing is the act of referring to where the pointer points, instead of the memory address. We are already using dereferencing in arrays - but we just didn’t know it yet. The brackets operator - [0] for example, accesses the first item of the array. And since arrays are actually pointers, accessing the first item in the array is the same as dereferencing a pointer. Dereferencing a pointer is done using the asterisk operator *.
If we want to create an array that will point to a different variable in our stack, we can write the following code:
/* define a local variable a */
int a = 1;
/* define a pointer variable, and point it to a using the & operator */
int * pointer_to_a = &a;
printf("The value a is %d\n", a);
printf("The value of a is also %d\n", *pointer_to_a);
Notice that we used the & operator to point at the variable a, which we have just created.
We then referred to it using the dereferencing operator. We can also change the contents of the dereferenced variable:
int a = 1;
int * pointer_to_a = &a;
/* let's change the variable a */
a += 1;
/* we just changed the variable again! */
*pointer_to_a += 1;
/* will print out 3 */
printf("The value of a is now %d\n", a);
Structures
C structures are special, large variables which contain several named variables inside. Structures are the basic foundation for objects and classes in C. Structures are used for:
- Serialization of data
- Passing multiple arguments in and out of functions through a single argument
- Data structures such as linked lists, binary trees, and more The most basic example of structures are points, which are a single entity that contains two variables - x and y. Let’s define a point:
struct point {
int x;
int y;
};
Now, let’s define a new point, and use it. Assume the function draw receives a point and draws it on a screen. Without structs, using it would require two arguments - each for every coordinate:
/* draws a point at 10, 5 */
int x = 10;
int y = 5;
draw(x, y);
Using structs, we can pass a point argument:
/* draws a point at 10, 5 */
struct point p;
p.x = 10;
p.y = 5;
draw(p);
To access the point’s variables, we use the dot . operator.
Typedefs
Typedefs allow us to define types with a different name - which can come in handy when dealing with structs and pointers. In this case, we’d want to get rid of the long definition of a point structure. We can use the following syntax to remove the struct keyword from each time we want to define a new point:
typedef struct {
int x;
int y;
} point;
This will allow us to define a new point like this:
point p;
Structures can also hold pointers - which allows them to hold strings, or pointers to other structures as well - which is their real power. For example, we can define a vehicle structure in the following manner:
typedef struct {
char * brand;
int model;
} vehicle;
Since brand is a char pointer, the vehicle type can contain a string (which, in this case, indicates the brand of the vehicle).
vehicle mycar;
mycar.brand = "Ford";
mycar.model = 2007;
Function arguments by reference
Assuming you now understand pointers and functions, you are aware that function arguments are passed by value, by which means they are copied in and out of functions. But what if we pass pointers to values instead of the values themselves? This will allow us to give functions control over the variables and structures of the parent functions and not just a copy of them, thus directly reading and writing the original object.
Let’s say we want to write a function which increments a number by one, called addone. This will not work:
void addone(int n) {
// n is local variable which only exists within the function scope
n++; // therefore incrementing it has no effect
}
int n;
printf("Before: %d\n", n);
addone(n);
printf("After: %d\n", n);
However, this will work:
void addone(int *n) {
// n is a pointer here which point to a memory-adress outside the function scope
(*n)++; // this will effectively increment the value of n
}
int n;
printf("Before: %d\n", n);
addone(&n);
printf("After: %d\n", n);
The difference is that the second version of addone receives a pointer to the variable n as an argument, and then it can manipulate it, because it knows where it is in the memory.
Notice that when calling the addone function, we must pass a reference to the variable n, and not the variable itself - this is done so that the function knows the address of the variable, and won’t just receive a copy of the variable itself.
Pointers to structures
Let’s say we want to create a function which moves a point forward in both x and y directions, called move. Instead of sending two pointers, we can now send only one pointer to the function of the point structure:
void move(point * p) {
(*p).x++;
(*p).y++;
}
However, if we wish to dereference a structure and access one of it’s internal members, we have a shorthand syntax for that, because this operation is widely used in data structures. We can rewrite this function using the following syntax:
void move(point * p) {
p->x++;
p->y++;
}
Dynamic allocation
Dynamic allocation of memory is a very important subject in C. It allows building complex data structures such as linked lists. Allocating memory dynamically helps us to store data without initially knowing the size of the data in the time we wrote the program.
To allocate a chunk of memory dynamically, we need to have a pointer ready to store the location of the newly allocated memory. We can access memory that was allocated to us using that same pointer, and we can use that pointer to free the memory again, once we have finished using it.
Let’s assume we want to dynamically allocate a person structure. The person is defined like this:
typedef struct {
char * name;
int age;
} person;
To allocate a new person in the myperson argument, we use the following syntax:
person * myperson = (person *) malloc(sizeof(person));
This tells the compiler that we want to dynamically allocate just enough to hold a person struct in memory and then return a pointer of type person to the newly allocated data. The memory allocation function malloc() reserves the specified memory space. In this case, this is the size of person in bytes.
The reason we write (person _) before the call to malloc() is that malloc() returns a “void pointer,” which is a pointer that doesn’t have a type. Writing (person _) in front of it is called typecasting, and changes the type of the pointer returned from malloc() to be person. However, it isn’t strictly necessary to write it like this as C will implicitly convert the type of the returned pointer to that of the pointer it is assigned to (in this case, myperson) if you don’t typecast it.
Note that sizeof is not an actual function, because the compiler interprets it and translates it to the actual memory size of the person struct.
To access the person’s members, we can use the -> notation:
myperson->name = "John";
myperson->age = 27;
After we are done using the dynamically allocated struct, we can release it using free:
free(myperson);
Note that the free does not delete the myperson variable itself, it simply releases the data that it points to. The myperson variable will still point to somewhere in the memory - but after calling myperson we are not allowed to access that area anymore. We must not use that pointer again until we allocate new data using it.
Arrays and Pointers
In a previous tutorial on Pointers, you learned that a pointer to a given data type can store the address of any variable of that particular data type. For example, in the following code, the pointer variable pc stores the address of the character variable c.
char c = 'A';
char *pc = &c;
Here, c is a scalar variable that can store only a single value. However, you are already familiar with arrays that can hold multiple values of the same data type in a contiguously allocated memory block. So, you might wonder, can we have pointers to arrays too? Indeed, we can.
Let us start with an example code and look at its output. We will discuss its behavior subsequently.
char vowels[] = {'A', 'E', 'I', 'O', 'U'};
char *pvowels = vowels;
int i;
// Print the addresses
for (i = 0; i < 5; i++) {
printf("&vowels[%d]: %p, pvowels + %d: %p, vowels + %d: %p\n", i, &vowels[i], i, pvowels + i, i, vowels + i);
}
// Print the values
for (i = 0; i < 5; i++) {
printf("vowels[%d]: %c, *(pvowels + %d): %c, *(vowels + %d): %c\n", i, vowels[i], i, *(pvowels + i), i, *(vowels + i));
}
A typical output of the above code is shown below.
&vowels[0]: 0x7ffee146da17, pvowels + 0: 0x7ffee146da17, vowels + 0: 0x7ffee146da17
&vowels[1]: 0x7ffee146da18, pvowels + 1: 0x7ffee146da18, vowels + 1: 0x7ffee146da18
&vowels[2]: 0x7ffee146da19, pvowels + 2: 0x7ffee146da19, vowels + 2: 0x7ffee146da19
&vowels[3]: 0x7ffee146da1a, pvowels + 3: 0x7ffee146da1a, vowels + 3: 0x7ffee146da1a
&vowels[4]: 0x7ffee146da1b, pvowels + 4: 0x7ffee146da1b, vowels + 4: 0x7ffee146da1b
vowels[0]: A, *(pvowels + 0): A, *(vowels + 0): A
vowels[1]: E, *(pvowels + 1): E, *(vowels + 1): E
vowels[2]: I, *(pvowels + 2): I, *(vowels + 2): I
vowels[3]: O, *(pvowels + 3): O, *(vowels + 3): O
vowels[4]: U, *(pvowels + 4): U, *(vowels + 4): U
As you rightly guessed, &vowels[i] gives the memory location of the ith element of the array vowels. Moreover, since this is a character array, each element occupies one byte so that the consecutive memory addresses are separated by a single byte. We also created a pointer, pvowels, and assigned the address of the array vowels to it. pvowels + i is a valid operation; although in general, this may not always be meaningful (explored further in Pointer Arithmetics ). In particular, the output shown above indicates that &vowels[i] and pvowels + i are equivalent. Feel free to alter the data types of the array and pointer variables to test this out.
If you look carefully at the previous code, you will notice that we also used another apparently surprising notation: vowels + i. Moreover, pvowels + i and vowels + i returns the same thing β address of the ith element of the array vowels. On the other hand, _(pvowels + i) and _(vowels + i) both return the ith element of the array vowels. Why is that so?
This is because the name of an array itself is a (constant) pointer to the first element of the array. In other words, the notations vowels, &vowels[0], and vowels + 0 all point to the same location.
Dynamic Memory Allocation for Arrays
By now we know that we can traverse an array using pointers. Moreover, we also know that we can dynamically allocate (contiguous) memory using blocks pointers. These two aspects can be combined to dynamically allocate memory for an array. This is illustrated in the following code.
// Allocate memory to store five characters
int n = 5;
char *pvowels = (char *) malloc(n * sizeof(char));
int i;
pvowels[0] = 'A';
pvowels[1] = 'E';
*(pvowels + 2) = 'I';
pvowels[3] = 'O';
*(pvowels + 4) = 'U';
for (i = 0; i < n; i++) {
printf("%c ", pvowels[i]);
}
printf("\n");
free(pvowels);
In the above code, we allocated five contiguous bytes of memory to store five characters. Subsequently, we used array notations to traverse the blocks of memory as if pvowels is an array. However, remember that pvowels actually is a pointer. Pointers and arrays, in general, are not the same thing.
So when is this useful? Remember that while declaring an array, the number of elements that it would contain must be known beforehand. Therefore, in some scenarios it might happen that the space allocated for an array is either less than the desired space or more. However, by using dynamic memory allocation, one can allocate just as much memory as required by a program. Moreover, unused memory can be freed as soon as it is no longer required by invoking the free() function. On the down side, with dynamic memory allocation, one must responsibly call free() wherever relevant. Otherwise, memory leaks would occur.
We conclude this tutorial by looking at dynamic memory allocation for a two-dimensional array. This can be generalized to n-dimensions in a similar way. Unlike one-dimensional arrays, where we used a pointer, in this case we require a pointer to a pointer, as shown below.
int nrows = 2;
int ncols = 5;
int i, j;
// Allocate memory for nrows pointers
char **pvowels = (char **) malloc(nrows * sizeof(char *));
// For each row, allocate memory for ncols elements
pvowels[0] = (char *) malloc(ncols * sizeof(char));
pvowels[1] = (char *) malloc(ncols * sizeof(char));
pvowels[0][0] = 'A';
pvowels[0][1] = 'E';
pvowels[0][2] = 'I';
pvowels[0][3] = 'O';
pvowels[0][4] = 'U';
pvowels[1][0] = 'a';
pvowels[1][1] = 'e';
pvowels[1][2] = 'i';
pvowels[1][3] = 'o';
pvowels[1][4] = 'u';
for (i = 0; i < nrows; i++) {
for(j = 0; j < ncols; j++) {
printf("%c ", pvowels[i][j]);
}
printf("\n");
}
// Free individual rows
free(pvowels[0]);
free(pvowels[1]);
// Free the top-level pointer
free(pvowels);
Recursion
Recursion occurs when a function contains within it a call to itself. Recursion can result in very neat, elegant code that is intuitive to follow. It can also result in a very large amount of memory being used if the recursion gets too deep.
Common examples of where recursion is used :
Walking recursive data structures such as linked lists, binary trees, etc. Exploring possible scenarios in games such as chess Recursion always consists of two main parts. A terminating case that indicates when the recursion will finish and a call to itself that must make progress towards the terminating case.
For example, this function will perform multiplication by recursively adding :
#include <stdio.h>
unsigned int multiply(unsigned int x, unsigned int y)
{
if (x == 1)
{
/* Terminating case */
return y;
}
else if (x > 1)
{
/* Recursive step */
return y + multiply(x-1, y);
}
/* Catch scenario when x is zero */
return 0;
}
int main() {
printf("3 times 5 is %d", multiply(3, 5));
return 0;
}
Linked lists
Introduction
Linked lists are the best and simplest example of a dynamic data structure that uses pointers for its implementation. However, understanding pointers is crucial to understanding how linked lists work, so if you’ve skipped the pointers tutorial, you should go back and redo it. You must also be familiar with dynamic memory allocation and structures.
Essentially, linked lists function as an array that can grow and shrink as needed, from any point in the array.
Linked lists have a few advantages over arrays:
Items can be added or removed from the middle of the list There is no need to define an initial size However, linked lists also have a few disadvantages:
There is no “random” access - it is impossible to reach the nth item in the array without first iterating over all items up until that item. This means we have to start from the beginning of the list and count how many times we advance in the list until we get to the desired item. Dynamic memory allocation and pointers are required, which complicates the code and increases the risk of memory leaks and segment faults. Linked lists have a much larger overhead over arrays, since linked list items are dynamically allocated (which is less efficient in memory usage) and each item in the list also must store an additional pointer.
What is a linked list?
A linked list is a set of dynamically allocated nodes, arranged in such a way that each node contains one value and one pointer. The pointer always points to the next member of the list. If the pointer is NULL, then it is the last node in the list.
A linked list is held using a local pointer variable which points to the first item of the list. If that pointer is also NULL, then the list is considered to be empty.
------------------------------ ------------------------------
| | | \ | | |
| DATA | NEXT |--------------| DATA | NEXT |
| | | / | | |
------------------------------ ------------------------------
Let’s define a linked list node:
typedef struct node {
int val;
struct node * next;
} node_t;
Notice that we are defining the struct in a recursive manner, which is possible in C. Let’s name our node type node_t.
Now we can use the nodes. Let’s create a local variable which points to the first item of the list (called head).
node_t * head = NULL;
head = (node_t *) malloc(sizeof(node_t));
if (head == NULL) {
return 1;
}
head->val = 1;
head->next = NULL;
We’ve just created the first variable in the list. We must set the value, and the next item to be empty, if we want to finish populating the list. Notice that we should always check if malloc returned a NULL value or not.
To add a variable to the end of the list, we can just continue advancing to the next pointer:
node_t * head = NULL;
head = (node_t *) malloc(sizeof(node_t));
head->val = 1;
head->next = (node_t *) malloc(sizeof(node_t));
head->next->val = 2;
head->next->next = NULL;
This can go on and on, but what we should actually do is advance to the last item of the list, until the next variable will be NULL.
Iterating over a list
Let’s build a function that prints out all the items of a list. To do this, we need to use a current pointer that will keep track of the node we are currently printing. After printing the value of the node, we set the current pointer to the next node, and print again, until we’ve reached the end of the list (the next node is NULL).
void print_list(node_t * head) {
node_t * current = head;
while (current != NULL) {
printf("%d\n", current->val);
current = current->next;
}
}
Adding an item to the end of the list
To iterate over all the members of the linked list, we use a pointer called current. We set it to start from the head and then in each step, we advance the pointer to the next item in the list, until we reach the last item.
void push(node_t * head, int val) {
node_t * current = head;
while (current->next != NULL) {
current = current->next;
}
/* now we can add a new variable */
current->next = (node_t *) malloc(sizeof(node_t));
current->next->val = val;
current->next->next = NULL;
}
The best use cases for linked lists are stacks and queues, which we will now implement:
Adding an item to the beginning of the list (pushing to the list)
To add to the beginning of the list, we will need to do the following:
- Create a new item and set its value
- Link the new item to point to the head of the list
- Set the head of the list to be our new item
This will effectively create a new head to the list with a new value, and keep the rest of the list linked to it.
Since we use a function to do this operation, we want to be able to modify the head variable. To do this, we must pass a pointer to the pointer variable (a double pointer) so we will be able to modify the pointer itself.
void push(node_t ** head, int val) {
node_t * new_node;
new_node = (node_t *) malloc(sizeof(node_t));
new_node->val = val;
new_node->next = *head;
*head = new_node;
}
Removing the first item (popping from the list)
To pop a variable, we will need to reverse this action:
- Take the next item that the head points to and save it
- Free the head item
- Set the head to be the next item that we’ve stored on the side
Here is the code:
int pop(node_t ** head) {
int retval = -1;
node_t * next_node = NULL;
if (*head == NULL) {
return -1;
}
next_node = (*head)->next;
retval = (*head)->val;
free(*head);
*head = next_node;
return retval;
}
Removing the last item of the list
Removing the last item from a list is very similar to adding it to the end of the list, but with one big exception - since we have to change one item before the last item, we actually have to look two items ahead and see if the next item is the last one in the list:
int remove_last(node_t * head) {
int retval = 0;
/* if there is only one item in the list, remove it */
if (head->next == NULL) {
retval = head->val;
free(head);
return retval;
}
/* get to the second to last node in the list */
node_t * current = head;
while (current->next->next != NULL) {
current = current->next;
}
/* now current points to the second to last item of the list, so let's remove current->next */
retval = current->next->val;
free(current->next);
current->next = NULL;
return retval;
}
Removing a specific item
To remove a specific item from the list, either by its index from the beginning of the list or by its value, we will need to go over all the items, continuously looking ahead to find out if we’ve reached the node before the item we wish to remove. This is because we need to change the location to where the previous node points to as well.
Here is the algorithm:
Iterate to the node before the node we wish to delete Save the node we wish to delete in a temporary pointer Set the previous node’s next pointer to point to the node after the node we wish to delete Delete the node using the temporary pointer There are a few edge cases we need to take care of, so make sure you understand the code.
int remove_by_index(node_t ** head, int n) {
int i = 0;
int retval = -1;
node_t * current = *head;
node_t * temp_node = NULL;
if (n == 0) {
return pop(head);
}
for (i = 0; i < n-1; i++) {
if (current->next == NULL) {
return -1;
}
current = current->next;
}
if (current->next == NULL) {
return -1;
}
temp_node = current->next;
retval = temp_node->val;
current->next = temp_node->next;
free(temp_node);
return retval;
}
Binary trees
Introduction
A Binary Tree is a type of data structure in which each node has at most two children (left child and right child). Binary trees are used to implement binary search trees and binary heaps, and are used for efficient searching and sorting. A binary tree is a special case of a K-ary tree, where k is 2. Common operations for binary trees include insertion, deletion, and traversal. The difficulty of performing these operations varies if the tree is balanced and also whether the nodes are leaf nodes or branch nodes. For balanced trees the depth of the left and right subtrees of every node differ by 1 or less. This allows for a predictable depth also known as height. This is the measure of a node from root to leaf, where root is 0 and sebsequent nodes are (1,2..n). This can be expressed by the integer part of log2(n) where n is the number of nodes in the tree.
g s 9
/ \ / \ / \
b m f u 5 13
/ \ / \ / \
c d t y 11 15
The operations performed on trees requires searching in one of two main ways: Depth First Search and Breadth-first search. Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. One starts at the root and explores as far as possible along each branch before backtracking. There are three types of depth first search traversal: pre-order visit, left, right, in-order left, visit, right, post-order left, right, visit. Breadth-first search (BFS) is an algorithm for traversing or searching tree or graph structures. In level-order, where we visit every node on a level before going to a lower level.
Unions
C Unions are essentially the same as C Structures, except that instead of containing multiple variables each with their own memory a Union allows for multiple names to the same variable. These names can treat the memory as different types (and the size of the union will be the size of the largest type, + any padding the compiler might decide to give it)
So if you wanted to be able to read a variable’s memory in different ways, for example read an integer one byte at a time, you could have something like this:
union intParts {
int theInt;
char bytes[sizeof(int)];
};
Allowing you to look at each byte individually without casting a pointer and using pointer arithmetic:
union intParts parts;
parts.theInt = 5968145; // arbitrary number > 255 (1 byte)
printf("The int is %i\nThe bytes are [%i, %i, %i, %i]\n",
parts.theInt, parts.bytes[0], parts.bytes[1], parts.bytes[2], parts.bytes[3]);
// vs
int theInt = parts.theInt;
printf("The int is %i\nThe bytes are [%i, %i, %i, %i]\n",
theInt, *((char*)&theInt+0), *((char*)&theInt+1), *((char*)&theInt+2), *((char*)&theInt+3));
// or with array syntax which can be a tiny bit nicer sometimes
printf("The int is %i\nThe bytes are [%i, %i, %i, %i]\n",
theInt, ((char*)&theInt)[0], ((char*)&theInt)[1], ((char*)&theInt)[2], ((char*)&theInt)[3]);
Combining this with a structure allows you to create a “tagged” union which can be used to store multiple different types, one at a time.
For example, you might have a “number” struct, but you don’t want to use something like this:
struct operator {
int intNum;
float floatNum;
int type;
double doubleNum;
};
Because your program has a lot of them and it takes a bit too much memory for all of the variables, so you could use this:
struct operator {
int type;
union {
int intNum;
float floatNum;
double doubleNum;
} types;
};
Like this the size of the struct is just the size of the int type + the size of the largest type in the union (the double). Not a huge gain, only 8 or 16 bytes, but the concept can be applied to similar structs.
use:
operator op;
op.type = 0; // int, probably better as an enum or macro constant
op.types.intNum = 352;
Also, if you don't give the union a name then it's members are accessed directly from the struct:
struct operator {
int type;
union {
int intNum;
float floatNum;
double doubleNum;
}; // no name!
};
operator op;
op.type = 0; // int
// intNum is part of the union, but since it's not named you access it directly off the struct itself
op.intNum = 352;
Another, perhaps more useful feature, is when you always have multiple variables of the same type, and you want to be able to use both names (for readability) and indexes (for ease of iteration), in that case you can do something like this:
union Coins {
struct {
int quarter;
int dime;
int nickel;
int penny;
}; // anonymous struct acts the same way as an anonymous union, members are on the outer container
int coins[4];
};
In that example you can see that there is a struct which contains the four (common) coins in the United States.
since the union makes the variables share the same memory the coins array matches with each int in the struct (in order):
union Coins change;
for(int i = 0; i < sizeof(change) / sizeof(int); ++i)
{
scanf("%i", change.coins + i); // BAD code! input is always suspect!
}
printf("There are %i quarters, %i dimes, %i nickels, and %i pennies\n",
change.quarter, change.dime, change.nickel, change.penny);
Pointer Arithmetics
You previously learned what is a pointer and how to manipulate pointers. In this tutorial you will be learning the arithmetic operations on pointers. There are multiple arithmetic operations that can be applied on C pointers: ++, –, -, +
Incrementing a Pointer with (++)
Just like any variable the ++ operation increases the value of that variable. In our case here the variable is a pointer hence when we increase its value we are increasing the address in the memory that pointer points to. Let’s combine this operation with an array in our example:
#include <stdio.h>
int main()
{
int intarray[5] = {10,20,30,40,50};
int i;
for(i = 0; i < 5; i++)
printf("intarray[%d] has value %d - and address @ %x\n", i, intarray[i], &intarray[i]);
int *intpointer = &intarray[3]; //point to the 4th element in the array
printf("address: %x - has value %d\n", intpointer, *intpointer); //print the address of the 4th element
intpointer++; //now increase the pointer's address so it points to the 5th elemnt in the array
printf("address: %x - has value %d\n", intpointer, *intpointer); //print the address of the 5th element
return 0;
}
Decreasing a Pointer with (–)
Just like in our previous example we increased the pointer’s pointed-to address by one using the ++ operator, we can decrease the address pointed-to by one using the decrement operator (–).
#include <stdio.h>
int main()
{
int intarray[5] = {10,20,30,40,50};
int i;
for(i = 0; i < 5; i++)
printf("intarray[%d] has value %d - and address @ %x\n", i, intarray[i], &intarray[i]);
int *intpointer = &intarray[4]; //point to the 5th element in the array
printf("address: %x - has value %d\n", intpointer, *intpointer); //print the address of the 5th element
intpointer--; //now decrease the point's address so it points to the 4th element in the array
printf("address: %x - has value %d\n", intpointer, *intpointer); //print the address of the 4th element
return 0;
}
Adding Pointers with (+)
We previously increased a pointer’s pointed-to address by one. We can also increase it by an integer value such:
#include <stdio.h>
int main()
{
int intarray[5] = {10,20,30,40,50};
int i;
for(i = 0; i < 5; i++)
printf("intarray[%d] has value: %d - and address @ %x\n", i, intarray[i], &intarray[i]);
int *intpointer = &intarray[1]; //point to the 2nd element in the array
printf("address: %x - has value %d\n", intpointer, *intpointer); //print the address of the 2nd element
intpointer += 2; //now shift by two the point's address so it points to the 4th element in the array
printf("address: %x - has value %d\n", intpointer, *intpointer); //print the addres of the 4th element
return 0;
}
Note how in the output the address shifted by 8 steps in the memory. You might be wondering why? The answer is simple: Because our pointer is an int-pointer and the size of an int variable is 4 bytes the memory is shift-able by 4 blocks. In our code we shifted by 2 (added +2) to the initial address so that makes them 2 x 4 byte = 8.
Subtracting Pointers with (-)
Similarly we can subtract:
#include <stdio.h>
int main()
{
int intarray[5] = {10,20,30,40,50};
int i;
for(i = 0; i < 5; i++)
printf("intarray[%d] has value: %d - and address @ %x\n", i, intarray[i], &intarray[i]);
int *intpointer = &intarray[4]; //point to the 5th element in the array
printf("address: %x - has value %d\n", intpointer, *intpointer); //print the address of the 5th element
intpointer -= 2; //now shift by two the point's address so it points to the 3rd element in the array
printf("address: %x - has value %d\n", intpointer, *intpointer); //print the address of the 3rd element
return 0;
}
again the address is shifted by blocks of 4bytes (in case of int).
Other Operations
There are more operations such as comparison >, <, ==. The idea is very similar of comparing variables, but in this case we are comparing memory address.
Function Pointers
Remember pointers? We used them to point to an array of chars then make a string out of them. Then things got more interesting when we learned how to control these pointers. Now it is time to do something even more interesting with pointers, using them to point to and call functions.
Why point to a function?
The first question that may come to your mind is why would we use pointers to call a function when we can simply call a function by its name: function(); - that’s a great question! Now imagine the sort function where you need to sort an array. Sometimes you want to order array elements in an ascending order or descending order. How would you choose? Function pointers!
Function Pointer Syntax
void (*pf)(int);
I agree with you. This definitely is very complicated, or so you may think. Let’s re-read that code and try to understand it point by point. Read it inside-out. *pf is the pointer to a function. void is the return type of that function, and finally int is the argument type of that function. Got it? Good.
Let’s insert pointers into the function pointer and try to read it again:
char* (*pf)(int*)
Again: 1. *pf is the function pointer. 2. char* is the return type of that function. 3. int* is the type of the argument.
Ok enough with theory. Let’s get our hands dirty with some real code. See this example:
#include <stdio.h>
void someFunction(int arg)
{
printf("This is someFunction being called and arg is: %d\n", arg);
printf("Whoops leaving the function now!\n");
}
main()
{
void (*pf)(int);
pf = &someFunction;
printf("We're about to call someFunction() using a pointer!\n");
(pf)(5);
printf("Wow that was cool. Back to main now!\n\n");
}
Remember sort() we talked about earlier? We can do the same with it. Instead of ordering a set in an ascending way we can do the opposite using our own comparison function as follows:
#include <stdio.h>
#include <stdlib.h> //for qsort()
int compare(const void* left, const void* right)
{
return (*(int*)right - *(int*)left);
// go back to ref if this seems complicated: http://www.cplusplus.com/reference/cstdlib/qsort/
}
main()
{
int (*cmp) (const void* , const void*);
cmp = &compare;
int iarray[] = {1,2,3,4,5,6,7,8,9};
qsort(iarray, sizeof(iarray)/sizeof(*iarray), sizeof(*iarray), cmp);
int c = 0;
while (c < sizeof(iarray)/sizeof(*iarray))
{
printf("%d \t", iarray[c]);
c++;
}
}
Let’s remember again. Why do we use function pointers? 1. To allow programmers to use libraries for different usages -> “Flexibility”
Bitmasks
Bit masking is simply the process of storing data truly as bits, as opposed to storing it as chars/ints/floats. It is incredibly useful for storing certain types of data compactly and efficiently.
The idea for bit masking is based on boolean logic. For those not familiar, boolean logic is the manipulation of ’true’ (1) and ‘false’ (0) through logical operations (that take 0s and 1s as their argument). We are concerned with the following operations:
- NOT a - the final value is the opposite of the input value (1 -> 0, 0 -> 1)
- a AND b - if both values are 1, the final value is 1, otherwise the final value is 0
- a OR b - if either value is 1, the final value is 1, otherwise the final value is 0
- a XOR b - if one value is 1 and the other value is 0, the final value is 1, otherwise the final value is 0
In computing, one of these true/false values is a bit. Primitives in C (int, float, etc) are made up of some number of bits, where that number is a multiple of 8. For example, an int may be at least 16 bits in size, where a char may be 8 bits. 8 bits is typically referred to as a byte. C guarantees that certain primitives are at least some number of bytes in size. The introduction of stdint.h in C11 allows the programmer to specify integer types that are exactly some number of bytes, which is extremely useful when using masks.
Bit masks are often used when setting flags. Flags are values that can be in two states, such as ‘on/off’ and ‘moving/stationary’.
Setting bit n
Setting bit n is as simple as ORing the value of the storage variable with the value 2^n.
storage |= 1 << n;
As an example, here is the setting of bit 3 where storage is a char (8 bits):
01000010 OR 00001000 == 01001010
The 2^n logic places the ‘1’ value at the proper bit in the mask itself, allowing access to that same bit in the storage variable.
Clearing bit n
Clearing bit n is the result of ANDing the value of the storage variable with the inverse (NOT) of the value 2^n:
storage &= ~(1 << n);
Here’s the example again:
01001010 AND 11110111 == 01000010
Flipping bit n
Flipping bit n is the result of XORing the value of the storage variable with 2^n:
storage ^= 1 << n;
01000010 01001010 XOR XOR 00001000 00001000 == == 01001010 01000010
Checking bit n
Checking a bit is ANDing the value of 2^n with the bit storage:
bit = storage & (1 << n);
01000010 01001010 AND AND 00001000 00001000 == == 00000000 00001000