The “unsigned” Conundrum

October 16, 2016 bulldozer00

A few weeks ago, CppCon16 conference organizer Jon Kalb gave a great little lightning talk titled “unsigned: A Guideline For Better Code”. Right up front, he asked the audience what they thought this code would print out to the standard console:

[Figure: mixed-mode-math code listing]

Even though -1 is obviously less than 1, the program prints out “a is not less than b”. WTF?

The apparently erroneous result is due to the convoluted type conversion rules inherited from C regarding unsigned/signed types.

Before evaluating the (a < b) expression, the rules dictate that the signed int object, a, gets implicitly converted to an unsigned int type. For an 8-bit CPU, the figure below shows how the bit pattern 0xFF is interpreted differently by C/C++ compilers depending upon how it is declared:

Thus, after the implicit type conversion of a from -1 to 255, the comparison expression becomes (255 < 1) – which produces the “a is not less than b” output.

Since it’s unreasonable to expect most C++ programmers to remember the entire arcane rule set for implicit conversions/promotions, what heuristic should programmers use to prevent nasty unsigned surprises like Mr. Kalb’s example? Here is his list of initial candidates:

If you’re trolling this post and you’re a C++ hater, then the first guideline is undoubtedly your choice :). If you’re a C++ programmer, the second two are pretty much impractical – especially since unsigned (in the form of size_t) is used liberally throughout the C++ standard library. (By the way, I once heard Bjarne Stroustrup say in a video talk that requiring size_t to be unsigned was a mistake.) The third and fourth guidelines are reasonable suggestions, and those are the ones I use in writing my own code and reviewing the code of others.

At the end of his interesting talk, Mr. Kalb presented his own guideline:

I think Jon’s guideline is a nice, thoughtful addition to the last two guidelines on the previous chart. I would like to say that “Don’t use ‘unsigned’ for quantities” subsumes those two, but I’m not sure it does. What do you think?

Categories: C++ Tags: Bjarne Stroustrup, Jon Kalb, unsigned

syaghmour

October 16, 2016 at 12:54 pm

I agree it was a thoughtful talk, although the fact that we have size_t everywhere does make the advice a bit troublesome. Chandler’s undefined behavior talk made a more interesting case. Around 40 minutes in: https://youtu.be/yG1OZ69H_-o?t=39m41s he gives an interesting example where using an unsigned type as an index results in a pessimization. This is a more convincing case, but I had problems godbolting an idiomatic example using containers and size_t or uint32_t that had a similar issue.
Vittorio Romeo

October 16, 2016 at 3:54 pm

Typo: it’s “Jon Kalb”, not “John Kalb”.
- bulldozer00
  
  October 16, 2016 at 5:11 pm
  
  I fixed it. Thanks 🙂
  - JonKalb
    
    October 19, 2016 at 2:03 pm
    
    I think you missed it in one case: “I think John’s guideline…” and in the tags.
  - bulldozer00
    
    October 19, 2016 at 8:02 pm
    
    Fixed ’em. 🙂
pip010 (@ppetrovdotnet)

October 17, 2016 at 6:06 am

implicit casts, the root of all evil 😉
- JonKalb
  
  October 19, 2016 at 2:06 pm
  
  Not all evil, just a lot of evil. 🙂
fisherro

October 18, 2016 at 6:19 pm

I’ve been playing with foonathan’s type_safe library, which promises to fix both the mixing of unsigned/signed and the pessimistic optimization problems.
Brolloks
October 19, 2016 at 2:01 am

I follow the rule of never using unsigned types (unless you absolutely have to). The biggest problem with this is when you have to call the size() member of the standard containers and you get back a size_t. To get around this problem I use a simple template function to cast it to the signed type I want. So instead of using the ugly:
```
 static_cast(myvec.size()) 
```
I use
```
 size(myvec) 
```
which (at least to me) is slightly less ugly.
- Brolloks
  
  October 19, 2016 at 2:08 am
  
  Looks like the commenting system messed up the templates… it should have the type int after static_cast, and also after my size function… I give up.
- Brolloks
  
  October 19, 2016 at 7:39 am
  
  Let me try again: Instead of
  static_cast<int>(myvec.size())
  I use
  size<int>(myvec)
Eelis

October 19, 2016 at 6:03 am

Q1. “What does this code do?”
A1. Trigger a warning.

Q2. “What guideline will help?”
A2. Compile with warnings enabled.
- __vic
  
  October 19, 2016 at 7:13 am
  
  +1
  
  It’s stupid advice to use signed ints instead of unsigned ones everywhere. Just look at Java: https://blogs.oracle.com/darcy/entry/unsigned_api. After 20 years they added some clumsy support for unsigned integers.
  
  The rule is straightforward: use unsigned integers where you need non-negative integer values. But don’t mix them in one expression – use explicit casts as appropriate.
  - Brolloks
    
    October 19, 2016 at 7:59 am
    
    Over the years I have been burnt with unsigned just too many times… not worth it for that one extra bit. So now unless I am doing something like bit manipulation I find it safer to just avoid unsigned altogether.
  - __vic
    
    October 19, 2016 at 8:25 am
    
    It is not just “one extra bit”. It is a natural representation for object size. In 16-bit address space you need either uint16_t or int32_t which is a double word on this arch.
    
    Second, it is stupid to write things like this:
    
    void f(int arg);
    
    void f(int arg)
    {
        if (arg < 0) throw std::domain_error("The argument must be non-negative");
        …
    }
    
    When you must state it directly in the interface of the function:
    
    void f(unsigned arg);
  - __vic
    
    October 19, 2016 at 8:35 am
    
    When you must state it directly
    Please read as
    “When you can state it directly”
  - Brolloks
    
    October 19, 2016 at 8:39 am
    
    I actually prefer your f(int) to this:
    void f(unsigned arg) { if(arg > (UINT_MAX>>1)) throw std::domain_error("Wow, that seems a bit high. Are you sure your calculation did not wrap into negative"); … }
  - skelband
    
    October 19, 2016 at 12:28 pm
    
    I agree with __vic. size_t is exactly what it says on the tin: the size of an object in memory. To make it signed is a total perversion of that concept.
    I’ve seen so many awful examples of mixing signed and unsigned and there lies the real danger.
    I do agree with the logic of not doing arithmetic on size_t values though. It makes no semantic sense to do so, which is why we have things like ptrdiff_t.
- Bitbeisser
  
  October 19, 2016 at 10:55 am
  
  +1
  The compiler should already bark at the illegal assignment of the negative value.
JonKalb

October 19, 2016 at 2:14 pm

Bulldozer00, Thanks for the posting.

Eelis and __vic, I encourage you to watch the video (it is only six minutes). It may not change your minds, but I feel you owe it to yourselves to at least understand the arguments.

Brolloks, I really appreciated your comment (with the domain_error), but it looks like I can’t reply directly to it (I assume because it is too far down the hierarchy).

Jon
- JonKalb
  
  October 19, 2016 at 9:35 pm
  
  It seems that I got the wrong link:
- __vic
  
  October 20, 2016 at 2:12 pm
  
  Watched, thanks. Still find the argumentation doubtful.
  
  Taking the opportunity I would like to thank you for your talks/writings about exceptions. They were really eye-opening some time for me 🙂
  - JonKalb
    
    October 20, 2016 at 4:36 pm
    
    __vic, Thank you for the kind words.
- bulldozer00
  
  October 20, 2016 at 4:42 pm
  
  Ur welcome. Thanks for the interesting lightning talk Jon!
Robert Ramey

October 19, 2016 at 3:21 pm

For the definitive solution to this, check out the safe numerics project at the Boost Library Incubator – http://www.blincubator.com