C++ Tutorial - Data Types

13 Feb 2013

Introduction

In the previous tutorial, we discussed how to use functions to make your code more modular and reusable. In this tutorial, we are going to talk about the different data types that exist in C++ and when to use each one.

Integer Values

When we write the following:

int num = 0;

We are telling the computer that we want to store some information and that we want to call it num, so that whenever we need it again, we can ask the computer for it by that name. We are also telling the computer that the type of information we want to store is an integer. Now, what exactly does that tell the Raspberry Pi? From a mathematical standpoint, integers range from negative infinity all the way to positive infinity, so can this store any integer value?

The answer is a resounding no. By telling the Raspberry Pi that we want to store an integer, we are in fact telling it that we want to set aside 32 bits to store that number in. With 32 bits, we can store the numbers from -2^31 to 2^31 - 1, or -2,147,483,648 to 2,147,483,647, which is much more limited than the mathematical idea of an integer.

So, what are we supposed to do if we want to store values outside of this range? Well, we can tell the Raspberry Pi to set aside more bits to store the number in. The next larger size is a long long, and it is 64 bits, which allows for storage of numbers from -2^63 to 2^63 - 1, or -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

Sometimes, you might need numbers even larger still. One example would be if you wanted to compute 52 factorial, which is roughly 8 × 10^67. That’s much larger than what can be stored in 64 bits. There’s nothing built into C++ to handle it, but others have written custom libraries to handle even larger numbers.
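To get a feel for how quickly factorials outgrow 64 bits, here is a minimal sketch. The function name largest_factorial_that_fits is made up for this example, and it assumes that unsigned long long is 64 bits, as it is on the Raspberry Pi.

```cpp
#include <limits>

// Hypothetical helper (not part of the tutorial's earlier code): returns the
// largest n for which n! still fits in an unsigned long long.
unsigned int largest_factorial_that_fits() {
    const unsigned long long max = std::numeric_limits<unsigned long long>::max();
    unsigned long long factorial = 1;  // holds n! for the current n
    unsigned int n = 1;
    // Multiplying by (n + 1) would overflow exactly when
    // factorial > max / (n + 1), so we check with a division first.
    while (factorial <= max / (n + 1)) {
        ++n;
        factorial *= n;
    }
    return n;
}
```

With a 64-bit unsigned long long, this returns 20, so 52 factorial is far out of reach for the built-in types.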

What if I know that there is no way that my number will ever get as large as an integer can store? One example would be the guess my number game, where the numbers were limited to be between 1 and 10. In this case, we could use a short instead, which is 16 bits, or even a char, which is 8 bits. Whether a plain char is signed or unsigned depends on the platform, so when you use one to store a number, it is safest to write signed char or unsigned char. Additionally, if we know that the number is only going to be positive, then we can use an unsigned version of any of those data types. Here is a helpful table showing the data type, the number of bits required to store the value, and the range of values that can be stored for integer types:

Name	Bits	Range of Values
signed char	8	-128 to 127 (-2^7 to 2^7-1)
unsigned char	8	0 to 255 (0 to 2^8-1)
short	16	-32,768 to 32,767 (-2^15 to 2^15-1)
unsigned short	16	0 to 65,535 (0 to 2^16-1)
int	32	-2,147,483,648 to 2,147,483,647 (-2^31 to 2^31-1)
unsigned int	32	0 to 4,294,967,295 (0 to 2^32-1)
long long	64	-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (-2^63 to 2^63-1)
unsigned long long	64	0 to 18,446,744,073,709,551,615 (0 to 2^64-1)
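The sizes in the table are what you will typically see on the Raspberry Pi and on desktop machines, but the C++ standard only guarantees minimum sizes. If you want to check the values on your own machine, something like the following sketch should work. The describe function is a made-up helper for this example, and it is only meant for the integer types in the table.

```cpp
#include <climits>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper: builds a line of text with the number of bits that
// type T occupies and the smallest and largest values it can hold.
// Intended for integer types only.
template <typename T>
std::string describe(const std::string& name) {
    std::ostringstream out;
    // CHAR_BIT is the number of bits in a byte (8 on the Raspberry Pi).
    // The unary + promotes char-sized types to int, so that they are
    // written as numbers instead of as characters.
    out << name << ": " << sizeof(T) * CHAR_BIT << " bits, "
        << +std::numeric_limits<T>::min() << " to "
        << +std::numeric_limits<T>::max();
    return out.str();
}

// Example use inside main():
//     cout << describe<short>("short") << endl;
//     cout << describe<long long>("long long") << endl;
```

Calling describe for each type in the table lets you compare your machine against the values listed above.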

Fractional Values

Now, up until this point, we have only talked about storing integers. What if we need to store a fractional value, like 1/4 = 0.25? For fractional values, we use what are called floating point values. Floating point values mimic scientific notation, meaning that there is a coefficient and an exponent:

c * 10^e

However, unlike scientific notation, the base for the exponent is 2 and not 10. Much like integers, they still have a certain range of values that they can store and take a certain number of bits of memory. The table looks like this:

Name	Total Bits	Coefficient Bits	Exponent Bits	Approximate Range
float	32	23	8	±1.4 × 10^−45 to ±3.4 × 10^38
double	64	52	11	±4.9 × 10^−324 to ±1.8 × 10^308

If you add up the coefficient bits and the exponent bits, you come up one bit short from the total number of bits, so where did it go? It stores the sign (positive or negative), and unlike the integer data types, there is no unsigned version.
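To make the sign, exponent, and coefficient bits a little more concrete, here is a rough sketch that pulls a float apart into its three fields. The names FloatParts and split_float are made up for this example, and the code assumes that a float is a 32-bit IEEE 754 value, which is the case on the Raspberry Pi and most other machines.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical struct for illustration: the three fields packed into a float.
struct FloatParts {
    std::uint32_t sign;         // 1 bit: 0 for positive, 1 for negative
    std::uint32_t exponent;     // 8 bits, stored with 127 added to it
    std::uint32_t coefficient;  // 23 bits
};

// Assumes float is a 32-bit IEEE 754 value.
FloatParts split_float(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copy the raw bits of the float

    FloatParts parts;
    parts.sign = bits >> 31;               // top bit
    parts.exponent = (bits >> 23) & 0xFF;  // next 8 bits
    parts.coefficient = bits & 0x7FFFFF;   // lowest 23 bits
    return parts;
}
```

For 0.25f, which is 1.0 × 2^-2, this gives a sign of 0, a stored exponent of 125 (that is, -2 + 127), and a coefficient of 0, because the leading 1 is implied rather than stored.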

The big thing to always remember about floating point data types is that they cannot store every value in their defined range! The reason why is pretty simple. Take the range from 0.1 to 0.2 for example. There are an infinite number of values in that range, and since we are storing the value in a finite number of bits (32 or 64), there is no way that we can store every possible value. This is a big contrast to integers, which do store every possible integer in their given range.
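One way to see this for yourself is to print a value with more digits than cout shows by default. The show_digits function below is a made-up helper for this example, not something from the standard library.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats a double with more significant digits than
// cout shows by default, which exposes the value that is really stored.
std::string show_digits(double value, int digits = 20) {
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}

// Example use inside main():
//     cout << show_digits(0.25) << endl;  // 0.25 can be stored exactly
//     cout << show_digits(0.1) << endl;   // 0.1 cannot, so extra digits appear
```

A value like 0.25 is a power of 2, so it is stored exactly, while 0.1 is stored as the closest value that fits in 64 bits.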

So, let’s go through an example:

double x = 3.0 / 10.0;

double y = 0.1 + 0.1 + 0.1;

double z = x - y;

cout << z << endl;

So, x is 3 / 10 = 0.3 and y is 0.1 + 0.1 + 0.1 = 0.3. Subtracting y from x, z should be 0.0, but running this code on the Raspberry Pi, we get:

-5.55112e-17

That’s close to zero, but it is definitely not zero. The takeaway is that floating point numbers aren’t exact. One consequence is that whenever you compare floating point numbers, you should check whether they are approximately equal (abs(x - y) < 1E-6) instead of exactly equal (x - y == 0). In cases like the one above, the exact check fails, because -5.55112e-17 != 0.
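The approximate comparison can be wrapped up in a small function so that you don’t have to repeat it everywhere. This is only a sketch: the name approximately_equal is made up, and the default tolerance of 1E-6 is simply the value used above. The right tolerance depends on how large the numbers you are comparing are.

```cpp
#include <cmath>

// Hypothetical helper: treats two doubles as equal when they differ by less
// than tolerance. The default of 1E-6 is an assumption taken from the text.
bool approximately_equal(double a, double b, double tolerance = 1E-6) {
    return std::fabs(a - b) < tolerance;
}

// Example use inside main():
//     double x = 3.0 / 10.0;
//     double y = 0.1 + 0.1 + 0.1;
//     if (approximately_equal(x, y)) { /* treat x and y as the same value */ }
```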

Characters and Booleans

In the integers section, we talked about the char data type, but it has another use and that is to store letters. It does this by storing an integer value of the letter that it wishes to represent. For example, the number 65 maps to the upper case letter A. This mapping is done through the ASCII table. Therefore, the following two statements are equivalent:

char A = 65;

char A = 'A';

So, that’s how characters work. They are really just numbers, and the Raspberry Pi uses the ASCII table to translate numbers into letters and vice versa.
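Because a char is really a number, you can convert back and forth between the two and even do arithmetic on letters. The two functions below are made-up helpers for illustration, and both assume the ASCII table described above.

```cpp
// Hypothetical helpers showing that a char is just a small number.
// Both assume the ASCII table.

// Returns the number stored for a character, such as 65 for 'A'.
int code_of(char letter) {
    return static_cast<int>(letter);
}

// Returns the next character in the table, such as 'B' for 'A'.
char next_letter(char letter) {
    return static_cast<char>(letter + 1);
}
```

In the ASCII table, the uppercase letters are stored one after another, so adding 1 to 'A' gives 'B'.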

The final data type is the bool. A bool stores just two values: true and false. Unfortunately, since the smallest addressable unit of storage for the Raspberry Pi (and most other computers) is 8 bits, bools take up 8 bits, even though they really only require 1 bit.

Numbers

Every time you write a number in a program, the compiler automatically gives that number a data type. So, if you put the number 5 in your program (or any other integer small enough to fit in an int), it will be treated as an int. To get the compiler to interpret it as a long, you can add an l or L to the end, so 5L would be a long instead of an int. The uppercase L is easier to read, since a lowercase l looks a lot like the digit 1.

The same thing goes with fractional values. If you put the number 5.2 or 5.0 in your program, then the compiler will interpret it as a double. If you want the compiler to treat it as a float, then you can add an f to the end, so 5.0f would be a float.
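If you want to double check which type the compiler picks for a number, you can ask it directly. The sketch below uses decltype and std::is_same, which are a bit more advanced than what we have covered so far and require a C++11 compiler. Each line is checked when the program is compiled, so the file only compiles if the claim in the message holds. The LL suffix for long long is included as well, since it works the same way as L.

```cpp
#include <type_traits>

// decltype(expression) names the type the compiler gave to the expression,
// and std::is_same checks whether two types are identical.
static_assert(std::is_same<decltype(5), int>::value, "5 is an int");
static_assert(std::is_same<decltype(5L), long>::value, "5L is a long");
static_assert(std::is_same<decltype(5LL), long long>::value, "5LL is a long long");
static_assert(std::is_same<decltype(5.2), double>::value, "5.2 is a double");
static_assert(std::is_same<decltype(5.0f), float>::value, "5.0f is a float");
```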

Starting with the C++11 standard, you can use the auto keyword (with g++, you may need to pass the -std=c++11 flag). This tells the compiler that you want it to automatically determine the data type to use. So, the following two statements are equivalent:

int x = 5;

auto x = 5;

Now auto really isn’t magic. Since the compiler knows that a plain integer like 5 is an int, it just creates a variable of that type. If it can’t figure out what type you want, then it reports an error and you have to specify the type yourself. Also, variables can’t change type, so the following code doesn’t work like you might expect it to:

auto x = 5;

x = 6.5;

At the end, x = 6, because x is an int and can only store integers, so the fractional part of 6.5 is thrown away.
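Here is the same idea as a small sketch that you can experiment with. The two function names are made up for this example. The only difference between them is whether auto starts from an integer or from a fractional value.

```cpp
// Hypothetical helper: x is deduced as an int from the 5, so assigning a
// fractional value to it drops everything after the decimal point.
int assign_to_auto_int(double value) {
    auto x = 5;  // x is an int
    x = value;   // value is converted to an int, losing the fraction
    return x;
}

// Hypothetical helper: starting from 5.0 makes auto deduce a double, so the
// fractional part is kept.
double assign_to_auto_double(double value) {
    auto x = 5.0;  // x is a double
    x = value;
    return x;
}
```

Calling assign_to_auto_int(6.5) gives 6, while assign_to_auto_double(6.5) gives 6.5.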

Summary

In this tutorial, we discussed the various data types that exist in C++ and the range of values that they can store. I’ll admit that in practice, most programmers just use an int whenever they want to store an integer and a double whenever they want to store a fractional value. However, it is very important to know what the limitations of these data types are, in case you ever run up against them.

In the next tutorial, we will be discussing arrays, which are a way to store many related values (like a matrix), and we are going to create a simple maze game.

If you have any questions or comments about what was covered here, post them to the comments. I watch them closely and will respond and try to help you out.