This is the simplest ray-box test I was able to write:
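For anyone who wants something concrete to compile, here is a minimal sketch of a slab-style ray-box test built on `std::fmin` and `std::fmax`. The `Vec3`, `Ray`, and `Box` types and the name `hit_simple` are hypothetical stand-ins chosen for this illustration, not necessarily the ones used in the ray tracer, so treat it as one plausible shape for such a test rather than the exact listing.

```cpp
#include <cmath>

// Hypothetical minimal types, for illustration only.
struct Vec3 { double e[3]; };
struct Ray  { Vec3 origin, dir; };
struct Box  { Vec3 lo, hi; };

// Slab test: clip the ray interval [tmin, tmax] against each axis-aligned slab.
// The ray hits the box if the interval is still non-empty after all three axes.
inline bool hit_simple(const Box& b, const Ray& r, double tmin, double tmax) {
    for (int a = 0; a < 3; ++a) {
        // A zero direction component gives +/-infinity here, which the
        // comparisons below handle as long as the origin is off the slab plane.
        const double inv = 1.0 / r.dir.e[a];
        const double t0 = (b.lo.e[a] - r.origin.e[a]) * inv;
        const double t1 = (b.hi.e[a] - r.origin.e[a]) * inv;
        tmin = std::fmax(tmin, std::fmin(t0, t1));  // latest entry so far
        tmax = std::fmin(tmax, std::fmax(t0, t1));  // earliest exit so far
    }
    return tmin <= tmax;
}
```

The division is done once per axis and nothing is cached on the ray, which keeps the code short but repeats the same work for every box the ray visits.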
The overall code seemed pretty slow, so I tried some older, more complicated code that stores the signs of the ray direction components as well as their inverses:
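The idea of caching signs and inverses can be sketched as follows. This is an assumption-laden illustration, not the original listing: `FastRay`, `Box2`, `make_fast_ray`, and `hit_precomputed` are hypothetical names, and the box is stored as a two-row `bounds` array so that the cached sign can index the near and far planes directly.

```cpp
// Hypothetical ray that caches per-axis inverse direction and sign.
struct FastRay {
    double origin[3];
    double inv_dir[3];
    int    sign[3];   // 1 if the direction component is negative, else 0
};

// bounds[0] is the minimum corner, bounds[1] is the maximum corner.
struct Box2 { double bounds[2][3]; };

inline FastRay make_fast_ray(const double o[3], const double d[3]) {
    FastRay r;
    for (int a = 0; a < 3; ++a) {
        r.origin[a]  = o[a];
        r.inv_dir[a] = 1.0 / d[a];          // done once per ray, not per box
        r.sign[a]    = r.inv_dir[a] < 0.0;  // picks which plane is "near"
    }
    return r;
}

inline bool hit_precomputed(const Box2& b, const FastRay& r,
                            double tmin, double tmax) {
    for (int a = 0; a < 3; ++a) {
        // The sign selects the near and far planes, so no min/max call is needed.
        const double t_near = (b.bounds[r.sign[a]][a]     - r.origin[a]) * r.inv_dir[a];
        const double t_far  = (b.bounds[1 - r.sign[a]][a] - r.origin[a]) * r.inv_dir[a];
        if (t_near > tmin) tmin = t_near;
        if (t_far  < tmax) tmax = t_far;
        if (tmin > tmax) return false;      // interval became empty
    }
    return true;
}
```

The per-ray setup costs three divisions, which pays off when the same ray is tested against many boxes, as happens when walking a BVH.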
This was a lot faster in the overall ray tracer, so I started digging into it a bit. I wrote an isolation test that sends lots of rays from near the corner of a box in random directions (so a little less than 1/8 should hit). It occurred to me that fmax and fmin might be slower due to their robust NaN handling, and that does appear to be the case. I ran the test on my old MacBook Pro, compiled with -O. Perhaps it gives a little too much amortization advantage to the longer code, because it runs 100 trials for each ray, but maybe not (think big BVH trees).
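A rough sketch of that kind of isolation test is below. Everything specific in it is an assumption made for illustration: the unit box, the origin offset of 0.01, the Gaussian trick for picking isotropic directions, and the names `hit_box` and `count_hits`. The `mymin`/`mymax` helpers are a guess at plain comparison-based versions, which (unlike `fmin`/`fmax`) make no promise about NaN inputs.

```cpp
#include <cmath>
#include <random>

// Assumed comparison-based helpers; no special NaN handling.
inline double mymin(double a, double b) { return a < b ? a : b; }
inline double mymax(double a, double b) { return a > b ? a : b; }

// Slab test parameterized on the min/max functions being compared.
template <class Min, class Max>
bool hit_box(const double lo[3], const double hi[3],
             const double o[3], const double d[3], Min mn, Max mx) {
    double tmin = 0.0, tmax = 1e30;
    for (int a = 0; a < 3; ++a) {
        const double inv = 1.0 / d[a];
        const double t0 = (lo[a] - o[a]) * inv;
        const double t1 = (hi[a] - o[a]) * inv;
        tmin = mx(tmin, mn(t0, t1));
        tmax = mn(tmax, mx(t0, t1));
    }
    return tmin <= tmax;
}

// Fires n_rays random rays from just outside a box corner, testing each one
// `trials` times, and returns the total number of reported hits.
template <class Min, class Max>
long count_hits(int n_rays, int trials, unsigned seed, Min mn, Max mx) {
    const double lo[3] = {0.0, 0.0, 0.0}, hi[3] = {1.0, 1.0, 1.0};
    const double o[3]  = {-0.01, -0.01, -0.01};   // just outside the corner
    std::mt19937 gen(seed);
    std::normal_distribution<double> gauss(0.0, 1.0);
    long hits = 0;
    for (int i = 0; i < n_rays; ++i) {
        // Three independent Gaussians give an isotropic (unnormalized) direction.
        const double d[3] = {gauss(gen), gauss(gen), gauss(gen)};
        for (int t = 0; t < trials; ++t)
            hits += hit_box(lo, hi, o, d, mn, mx) ? 1 : 0;
    }
    return hits;
}
```

To time the two variants, one could wrap calls such as `count_hits(n, 100, seed, mymin, mymax)` and `count_hits(n, 100, seed, [](double a, double b) { return std::fmin(a, b); }, [](double a, double b) { return std::fmax(a, b); })` in `std::chrono::steady_clock` readings. The lambdas are needed because `std::fmin` and `std::fmax` are overloaded and cannot be passed by name to a template parameter. Accumulating the hit count and using it afterwards helps keep the optimizer from discarding the loop.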
The runtimes on my machine are:
traditional code: 4.3 s
using mymin/mymax: 5.4 s
using fmin/fmax: 6.9 s
As Tavian B points out, the properties of fmax and NaN can be used to the programmer's advantage, and vectorization might make the system max functions the winner. But the gaps between all three versions were bigger than I expected.
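The NaN behavior in question can be seen in a few lines. In a slab test, a NaN appears when the ray origin lies exactly on a slab plane and the matching direction component is zero, because the computation becomes 0 times infinity. The sketch below uses hypothetical helper names (`slab_nan`, `mymax`) and assumes IEEE 754 arithmetic without flags like `-ffast-math`: `std::fmax` returns the non-NaN argument, while a plain comparison returns whichever argument the failed comparison falls through to.

```cpp
#include <cmath>
#include <limits>

// Assumed comparison-based max; its result with a NaN depends on argument order.
inline double mymax(double a, double b) { return a > b ? a : b; }

// The degenerate slab case: (plane - origin) is 0 and 1/direction is infinite.
inline double slab_nan() {
    const double inf = std::numeric_limits<double>::infinity();
    return 0.0 * inf;   // NaN under IEEE 754
}

// std::fmax treats a NaN argument as missing data and returns the other one.
inline bool fmax_ignores_nan() {
    const double n = slab_nan();
    return std::fmax(n, 1.0) == 1.0 && std::fmax(1.0, n) == 1.0;
}

// The comparison version passes the NaN through in one order but not the other.
inline bool mymax_is_order_dependent() {
    const double n = slab_nan();
    return mymax(n, 1.0) == 1.0 && std::isnan(mymax(1.0, n));
}
```

With `fmin`/`fmax`, a NaN slab therefore leaves the running interval untouched, whereas with comparison-based helpers the outcome depends on how the arguments happen to be ordered.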
Here is my code. If you try it on other compilers or machines, please let us know what you see.