Monday, 3 February 2014

When Python goes wrong: Off-by-one

Many niggling problems with Python were fixed in Python 3, but here's one that recently bit me. It concerns rounding in format strings, so it is a potential pitfall for anyone handling floats.

The thing is, "%f" rounds to the nearest value, whereas "%d" simply truncates.
>>> "%.1f" % 1.56 ### I expect 1.6
'1.6'
>>> "%d" % 1.6    ### By analogy, I expect 2
'1'
The correct solution is either "%.0f" (yes, it exists) or to round the value explicitly however you like, e.g. "%d" % int(x+0.5) (which only works for non-negative x) or using round().
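To make the options concrete, here is a minimal sketch comparing them side by side. The helper names (format_truncated, format_rounded, format_half_up) are made up for illustration, and math.floor is my own substitution for int() so that the half-up variant also behaves for negative inputs. The expected outputs assume CPython 3.

```python
import math

def format_truncated(x):
    # "%d" converts the float to an integer by dropping the fractional part
    return "%d" % x

def format_rounded(x):
    # "%.0f" rounds to the nearest integer as part of the formatting
    return "%.0f" % x

def format_half_up(x):
    # Explicit round-half-up; floor() instead of int() so negatives round correctly
    return "%d" % math.floor(x + 0.5)

x = 1.6
print(format_truncated(x))  # 1
print(format_rounded(x))    # 2
print(format_half_up(x))    # 2
```

Which one is "right" depends on what the number means in your code; the point is only that "%d" is the odd one out.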

This was causing off-by-one results in my code, which I fortunately identified prior to submitting a manuscript. But I'm disappointed, Python; and don't blame C.

Note: As a further issue, if you also need to handle negative values, double-check the rounding behaviour of your code... it may not always be what you expect.
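As a rough illustration of the negative-value caveat, the sketch below runs the same approaches on -1.6, plus a couple of exact ties. The variable names are only for illustration, and the tie results assume Python 3 (in Python 2, round() behaves differently).

```python
import math

x = -1.6

truncated = "%d" % x                   # truncates toward zero
naive = "%d" % int(x + 0.5)            # int(-1.1) also truncates toward zero
rounded = "%.0f" % x                   # rounds to nearest
half_up = "%d" % math.floor(x + 0.5)   # floor(-1.1) is -2

print(truncated, naive, rounded, half_up)  # -1 -1 -2 -2

# Exact ties go to the even neighbour, both for "%.0f" and for round() in Python 3
print("%.0f" % 0.5, "%.0f" % 1.5, "%.0f" % 2.5)  # 0 2 2
print(round(0.5), round(1.5), round(2.5))        # 0 2 2
```

So the int(x+0.5) trick silently gives the wrong answer below zero, and neither "%.0f" nor round() does schoolbook round-half-up on ties.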

6 comments:

remeznik said...

With a jump to python3 I've started to use "".format().
In this case, it gives me:
>>> a=1.56
>>> "{0:.0f} is rounded {0:.2f}".format(a)
'2 is rounded 1.56'
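
For comparison, here is a small sketch of how str.format() treats the integer code "d" when handed a float. As far as I know it refuses with a ValueError rather than truncating silently. The helper try_format is hypothetical, written only for this example, and the last line assumes Python 3, where round() with one argument returns an int.

```python
def try_format(spec, value):
    # Return the formatted string, or None if the format code rejects the value
    try:
        return spec.format(value)
    except ValueError:
        return None

a = 1.56
print(try_format("{0:.0f}", a))       # 2
print(try_format("{0:d}", a))         # None: "d" is not accepted for a float
print(try_format("{0:d}", int(a)))    # 1 (explicit truncation)
print(try_format("{0:d}", round(a)))  # 2 (explicit rounding, Python 3)
```

With this syntax you have to say which conversion you want, which would have caught the original off-by-one.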

John Mayfield said...

Interestingly, in Java there's no compiler warning, but it fails with a runtime exception.

IllegalFormatConversionException: d != java.lang.Double

In C there is a compiler warning, but it ploughs on regardless, giving 1435806904 for 1.56.

Noel O'Boyle said...

@Nikita: Is that the preferred syntax now, rather than using %?

@John: What's the equivalent in Java, just for interest?

So some C convention isn't to blame. Hmmm...I don't get it. I never heard of %.0f before, but it seems that you need to know it even though it's counter-intuitive.

remeznik said...

Yes, I think it's now the preferred syntax (and it works in 2.7).
Here you have some syntax and examples.
http://docs.python.org/3.2/library/string.html#format-specification-mini-language

John Mayfield said...

String.format("%.0f", 1.56);

or

System.out.printf("%.0f", 1.56);

String.format("%d", 1.56);

will fail with the exception.

Anonymous said...

This behavior of Python is perfectly consistent with the intention of how data is represented in most programming languages, including C and Java. The only thing "broken" in Python is that it allows a float value to be directly formatted into an integer, whereas the other languages actually expect any data formatted as %d to actually be integer data.

%d is intended to format integer data, and in all these computer languages, casting a float to an int is handled via truncation, not rounding. If you want to round a float to the nearest integer value, then %.0f is more appropriate to your intention.