unsigned vs. long, an integer odyssey

Note: This entry has been restored from old archives.

It’s probably no surprise that I’m a little hyper-aware of secure coding of late. Sure, I’ve got the certification – but there’s still so much I don’t know. One thing I note in much of my own code and most code I’ve read is that there is little regard for integer errors. I’d take a stab that this is the most common coding error out there.

Of particular note is the problem of assigning the result of an operation on two variables to a variable of a different type, for example (unsigned) int versus (signed) long. Intuitively you’d think something like “long has more bits than int, so it’s safe to assign an unsigned int to a signed long.” That reasoning leads to code like this:

unsigned aardvarkCageCount(void);
unsigned aardvarkCount(void);

int main(void) {
    ...
    long x = aardvarkCount();
    x -= aardvarkCageCount();
    ...
}

The assumption is that because the called functions return an unsigned int it is safe to put the result into a signed long, because a long’s gotta have at least one more bit than an int, right? Yeah?

It is best never to make assumptions. The C99 standard defines only minimum magnitudes for the integer bounds; per the standard, the relevant limits are:

Bound       Minimum value
UINT_MAX    65,535
LONG_MAX    2,147,483,647

Hey, looks pretty safe, doesn’t it? Recall that I said these are minimum bounds! Compiler writers can go as far beyond those minimums as they like. Compile and run this code:

#include <limits.h>
#include <stdio.h>

int main(void) {
    printf("UINT_MAX: %u (%d bytes)n", UINT_MAX, sizeof(unsigned));
    printf("LONG_MAX: %ld (%d bytes)n", LONG_MAX, sizeof(long));
    return 0;
}

For me:

:; gcc stdint.c -o stdint
:; ./stdint 
UINT_MAX: 4294967295 (4 bytes)
LONG_MAX: 2147483647 (4 bytes)
:;

Oh oh! An unsigned int can be larger than a signed long, so the assumption that the latter is big enough to hold the former is wrong! Unfortunately this is an intuitive “easy fix” for those worrying about overflow: it seems good at the time, in theory. But the loose specification for int is that it represents the most natural value size for the underlying hardware, so 32 bits on most of the machines we’ve grown up with, and increasingly 64 bits. We’ll get to the relationship between int and long shortly.

#include <limits.h>
#include <stdio.h>

/* We have four billion aardvarks!  That's a lot of stew. */
unsigned aardvarkCount(void) { return 4000000000u; }
/* And only 20 cages. */
unsigned aardvarkCageCount(void) { return 20; }

int main(void) {
    long x = aardvarkCount();
    printf("There are %ld aardvarks.n", x);
    x -= aardvarkCageCount();
    printf("Aardvarks minus cages is %ld.n", x);
    return 0;
}

Gives:

:; gcc aardvark.c -Wall -o aardvark
:; ./aardvark 
There are -294967296 aardvarks.
Aardvarks minus cages is -294967316.

No surprise. Surely we can get the compiler to tell us when we’re doing something stupid like this?

:; gcc aardvark.c -Wall -Wextra -ansi -pedantic -o aardvark
:;

No luck. As far as I can tell from scanning man gcc there isn’t an option that’s going to work for us. Some might be thinking, “what about -ftrapv?!” Sorry, not gonna help. The -ftrapv option in gcc only works when both arguments to an arithmetic operation are signed; technically it catches overflow but not wrap. The reason it can’t help here is the usual arithmetic conversions: mix a signed and an unsigned operand of the same width and the signed one is converted to unsigned, making the operation well-defined wrapping arithmetic rather than signed overflow. If we compile this code with -ftrapv it runs fine. For an example of how not-so-useful this can be:

:; cat ftrapv.c 
#include <stdio.h>
int main(void) {
    long x = 2000000000;
    unsigned y = 2000000000;
    long z = x + y;
    printf("z = %ldn", z);
    return 0;
}
:; gcc ftrapv.c -Wall -Wextra -ftrapv -o ftrapv
:; ./ftrapv 
z = -294967296
:;

To make -ftrapv do something the arguments to arithmetic operations must both be signed. As in:

:; cat ftrapv.c 
#include <stdio.h>
int main(void) {
    long x = 2000000000;
    long y = 2000000000;
    long z = x + y;
    printf("z = %ldn", z);
    return 0;
}
:; gcc ftrapv.c -Wall -Wextra -ftrapv -o ftrapv
:; ./ftrapv 
Aborted (core dumped)
:;

Unfortunately, as far as I can see, even then that’s all we can do. Aborting isn’t entirely useful, but I guess it is better than continuing on in an undefined state! Furthermore, -ftrapv turns signed arithmetic operations into function calls. If we look at the disassembly of the generated code we see:

...
080483a4 <main>:
...
 80483b5: c7 45 f0 00 94 35 77 movl   $0x77359400,-0x10(%ebp)
 80483bc: c7 45 f4 00 94 35 77 movl   $0x77359400,-0xc(%ebp)
 80483c3: 8b 45 f4             mov    -0xc(%ebp),%eax
 80483c6: 89 44 24 04          mov    %eax,0x4(%esp)
 80483ca: 8b 45 f0             mov    -0x10(%ebp),%eax
 80483cd: 89 04 24             mov    %eax,(%esp)
 80483d0: e8 1b 00 00 00       call   80483f0 <__addvsi3>
...
...
080483f0 <__addvsi3>:
 80483f0: 55                 push  %ebp
 80483f1: 89 e5              mov   %esp,%ebp
 80483f3: 53                 push  %ebx
 80483f4: 83 ec 04           sub   $0x4,%esp
 80483f7: 8b 45 0c           mov   0xc(%ebp),%eax           # arg2
 80483fa: 8b 4d 08           mov   0x8(%ebp),%ecx           # arg1
 80483fd: e8 2a 00 00 00     call  804842c <__i686.get_pc_thunk.bx>
 8048402: 81 c3 e2 11 00 00  add   $0x11e2,%ebx
 8048408: 85 c0              test  %eax,%eax                # set SF if arg2 < 0
 804840a: 8d 14 01           lea   (%ecx,%eax,1),%edx       # 'lea' trick for arg1 + arg2
 804840d: 78 11              js    8048420 <__addvsi3+0x30> # if arg2 < 0 goto cmp below
 804840f: 39 d1              cmp   %edx,%ecx                # else compare result and arg1
 8048411: 0f 9f c0           setg  %al                      # %al = 1 if arg1 > result
 8048414: 84 c0              test  %al,%al
 8048416: 75 0f              jne   8048427 <__addvsi3+0x37> # if %al != 0 jump to abort!
 8048418: 83 c4 04           add   $0x4,%esp
 804841b: 89 d0              mov   %edx,%eax
 804841d: 5b                 pop   %ebx
 804841e: 5d                 pop   %ebp
 804841f: c3                 ret   
 8048420: 39 d1              cmp   %edx,%ecx                 # compare result and arg1
 8048422: 0f 9c c0           setl  %al                       # %al = 1 if arg1 < result
 8048425: eb ed              jmp   8048414 <__addvsi3+0x24>  # jump to test above
 8048427: e8 b0 fe ff ff     call  80482dc <abort@plt>
...

Ick! That’s a fair bit of code for an addition! A more readable representation of __addvsi3 would be:

int add(int a, int b) {
    int res = a + b;   /* may wrap; checked below */
    int good = 0;
    if (b < 0) {
        if (a >= res) good = 1;  /* adding a negative must not increase a */
    } else {
        if (a <= res) good = 1;  /* adding a non-negative must not decrease a */
    }
    if (good == 0) abort();
    return res;
}

(Strictly speaking the a + b here is itself undefined on overflow in portable C; the real __addvsi3 gets away with it because it is built with known wrapping semantics, so treat this as illustrative.)

Before you think “but what about -O[123]!?” … Yep, they get rid of the call. The optimisation takes precedence: since the code is adding two simple constants, the optimiser does the addition at compile time. In this way optimisation reduces the effectiveness of -ftrapv. It doesn’t disable it entirely though; a simple trick is to declare one of the variables volatile and do an optimised compile. In this case you’ll observe that the call to __addvsi3 is included despite optimisation. This is heading towards another can of worms now though: how often does optimisation really matter for most software? I’ll save the rest of this line of thought for another day…
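
For the record, here’s a quick sketch of that volatile trick (my own minimal example): compiled with -O2 -ftrapv the addition should still go through the __addvsi3 check and abort.

#include <stdio.h>

int main(void) {
    volatile long x = 2000000000;  /* volatile defeats compile-time folding */
    long y = 2000000000;
    long z = x + y;  /* signed overflow: still trapped under -O2 -ftrapv */
    printf("z = %ld\n", z);
    return 0;
}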

Anyway, this is all beside the point, as our original problem involved assignment with operands of mixed signedness. I.e. it boils down to this:

    long x = 4000000000u;
    printf("x = %ld\n", x);

We’ve failed to find any -W flags or any other trap/exception for gcc that will help us out here (or for g++ for that matter; my original examples were all C++ but I decided to “purify” them.)

What to do? I think the answer is: don’t make the mistake in the first place! Be aware of the limits, and the limitations of the limits. When necessary, check the values, but try not to get into problematic situations to begin with. If you can afford the overhead, use an arithmetic library that deals with the details for you (I’d guess that the majority of code can afford it.) Adding your own boundary guards can be tricky work and I’d recommend sticking to safe alternatives or, at least, recipes from a reliable source.

Let us follow some “intuition” again. An approach is to first store the return values in local variables of the right type, then test the values before trying to store the result in a long (yes, the example is rather contrived; live with it.)

int main(void) {
    long x = 0;
    unsigned a = aardvarkCount();
    unsigned c = aardvarkCageCount();
    if ((a - c) > LONG_MAX) {
        printf("Oops: %u - %u = %u!n", a, c, (a-c));
        printf("LONG_MAX=%ld LONG_MIN=%ldn", LONG_MAX, LONG_MIN);
        return 1;
    }
    x = a - c;
    printf("There are %u aardvarks.n", a);
    printf("Aardvarks minus cages is %ld.n", x);
    return 0;
}

This is only a guard for the “subtraction result too large to store in a long” case. It looks reasonable enough to cover this though, right? Nup! What happens when a is 1 and c is 2?

:; ./aardvark 
Oops: 1 - 2 = 4294967295!
LONG_MAX=2147483647 LONG_MIN=-2147483648

The problem here is that the (unsigned - unsigned) expression yields an unsigned value, thus we get integer wrap! So, we’re not there yet. We’d better guard out the wrap case:

    if ((a > c) && ((a - c) > LONG_MAX)) {
        printf("Oops: %u - %u = %u!\n", a, c, (a - c));
        printf("LONG_MAX=%ld LONG_MIN=%ld\n", LONG_MAX, LONG_MIN);
        return 1;
    }

Now, we also want to guard the case where (a - c) is too small for LONG_MIN. Don’t go thinking this is good enough either:

    ((a < c) && ((c - a) < LONG_MIN)) /* BAD! */

Remember that (c - a) always yields an unsigned result; depending on the platform either the comparison is always false (an unsigned value can never be below a negative number) or, as on our 32-bit setup, LONG_MIN itself gets converted to unsigned, so the test does nothing like what we intended. The good thing about this case is that the compiler will actually warn you about the comparison between signed and unsigned values. (You always compile with warnings on, right?) We’ll need to rearrange the arithmetic.

    if ( ((a > c) && ((a - c) > LONG_MAX)) ||
         ((a < c) && (c > (LONG_MIN + a))) ) {
        printf("Oops: %u - %u = %u!\n", a, c, (a - c));
        printf("LONG_MAX=%ld LONG_MIN=%ld\n", LONG_MAX, LONG_MIN);
        return 1;
    }

This raises the question: “is (LONG_MIN + unsigned int) a safe expression?” We can look at section 6.3.1.1 of the C standard to answer this. The integer conversion rank definition states, in part, “The rank of a signed integer type shall be greater than the rank of any signed integer type with less precision.” (Precision is defined as the number of bits used to represent values, excluding sign and padding bits.) The following ranking clause is “The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, …” These two clauses tell us, I think, that if an int is 64 bits then a long must be at least 64 bits, which would be the primary concern here. Phew, we seem to be safe.
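
As a quick sanity check, here’s a sketch (my own, assuming the 32-bit sizes from earlier) of what that mixed expression actually evaluates to:

#include <limits.h>
#include <stdio.h>

int main(void) {
    unsigned a = 5;
    /* With 32-bit int and long the usual arithmetic conversions turn
       both operands into unsigned long, so this is well-defined modular
       arithmetic: LONG_MIN becomes 2147483648. */
    unsigned long r = LONG_MIN + a;
    printf("LONG_MIN + %u = %lu\n", a, r);  /* 2147483653 here */
    return 0;
}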

I’ll leave this here since I don’t have time to dig any deeper. There’s more to consider though! For example, what about non-two’s-complement representations: can we derive the same safety from the rules? What about performance optimisations? What else? I suggest you read the specification. It’s hard going! And reading is very different to understanding. Of course, understanding what is specified also requires understanding exactly what isn’t specified! I’m not even close myself…

The main point of this exercise is to show that the problem is a difficult one and the solutions are often non-obvious. For this reason I’d recommend the previously suggested path of relying on a third-party secure/safe arithmetic library. This offloads the worry to someone else, who is hopefully better informed and also does their best to keep their solutions optimal. Personally, I’d prefer to let someone else solve the problem.
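
To give a flavour of what such a library has to get right, here’s a sketch of a checked unsigned-to-long subtraction. This is my own illustrative code, not from any library, and it assumes two’s complement longs; treat it as a sketch, not an audited implementation.

#include <limits.h>

/* Subtract c from a (both unsigned) into *out (a long).
   Returns 0 on success, -1 if the true result will not fit in a long.
   Assumes two's complement, i.e. LONG_MIN == -LONG_MAX - 1. */
int checked_sub(unsigned a, unsigned c, long *out) {
    if (a >= c) {
        unsigned d = a - c;  /* exact: no wrap possible when a >= c */
        if (d > (unsigned long)LONG_MAX) return -1;
        *out = (long)d;
    } else {
        unsigned d = c - a;  /* magnitude of the negative result */
        if (d > (unsigned long)LONG_MAX + 1u) return -1;
        *out = -(long)(d - 1) - 1;  /* reaches LONG_MIN without overflowing */
    }
    return 0;
}

Note the little -(long)(d - 1) - 1 dance needed to produce LONG_MIN without overflowing a long along the way; it is exactly this kind of detail I’d rather leave to a library author.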

A good starting point for learning more yourself is the CERT Secure Coding Standard wiki section on integers (in C). Then try to find a reasonably safe integer arithmetic library; I’m afraid I don’t have the experience to recommend any particular one. The ones documented in literature I’ve covered are for Windows (I’m most familiar with IntegerLib on the CERT site; it is a Visual Studio project and needs some tweaking to compile on Linux. I’ve done enough tweaking to make it compile for me, see note [1] below.) Alternatively (and perhaps most simply) change your integer representation completely by switching to one of the various arbitrary-precision arithmetic libraries available. GMP, for example, has been around for ages, is multi-platform, and even has a C++ wrapper.
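
For a taste of the GMP route, here’s a minimal sketch of the aardvark arithmetic (my own, untested example; link with -lgmp). With arbitrary precision the subtraction simply cannot wrap:

#include <gmp.h>
#include <stdio.h>

int main(void) {
    mpz_t a, c;
    mpz_init_set_ui(a, 4000000000u);  /* aardvarks */
    mpz_init_set_ui(c, 20u);          /* cages */
    mpz_sub(a, a, c);                 /* a = a - c, computed exactly */
    gmp_printf("Aardvarks minus cages is %Zd.\n", a);
    mpz_clear(a);
    mpz_clear(c);
    return 0;
}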


[1] I got the CERT IntegerLib compiling under Linux. In theory this is written by experts who got all the guards right… Unfortunately it was clearly written with only Windows in mind, as there is no support for a non-Windows build. I’ve worked through the code and added the trivial changes, plus a build system, that make it compile for me (up-to-date Ubuntu gutsy.) My mods to IntegerLib are here (or here for just a diff.) Don’t mistake “compiles” for “is correct” though; it has not gone through rigorous testing. You’ll need a typical C build environment (with all the autotools/libtool junk if you want to mod and regenerate the configure stuff.) To compile, enter IntegerLib/IntegerLib and do a ./configure then a make. If you’re lucky this will give you the test binary; run it to see examples of how the library works. Have a look at their IntegerLib.h file for a listing of the available methods. Note especially that correct use of the library requires that all parameters be of the types defined in its API, and the same goes for the variable used to store the return value. This code doesn’t directly solve our original problem: subtracting unsigned int values into a signed long result. The main problem is that the unsigned int subtraction may produce a value that is perfectly reasonable for a signed long but no good for an unsigned int. The library as it stands doesn’t cover arithmetic operations assigned to a different type (where the result may possibly be valid.)