Wednesday, April 14, 2021

I am not a believer in trying to learn two things at once.

 I just spent this week learning to program with OpenGL in Python using the fabulous PyOpenGL.  I was barely able to do it.  I have written a whole textbook on computer graphics and it derives the OpenGL matrix stack and the viewing model conceptually.  The OpenGL graphics pipeline model, if I recall accurately, I was also barely able to learn!  I think there is a zero percent chance I could learn the ins and outs of calls to and behavior of PyOpenGL and the underlying conceptual model.

One reason I am pretty sure of that is that I once tried learning TensorFlow, Python, and neural nets.  It did not go well.  Once I was comfortable with Python and kind of understood neural nets, I tried again (about a year later) with PyTorch.  It was not pretty.  Finally, I implemented a neural net in C, then one in Python, from scratch.  It was painful, but I made it.  Barely.  (And thanks to some advice from Shalini Gupta, Rajesh Sharma, and Hector Lee).  Then used PyTorch and  managed to train a network from scratch.   Again, barely.

My empirical experience is that it takes a focused and concerted to learn anything really new.   And if it were any harder, I don't think I could do it.  And if I am trying to learn two things at once (particularly an API and the concepts/algorithms that the API is abstracting for me) then forget it.

Conclusion: I should never try to learn two things at once.  If you have trouble with that, break it down.  It may seem like it takes longer, but nothing is longer than never learning it!

Corollary:  when you see somebody quickly picking up packages and wondering why you don't, maybe you are just like me, or maybe they are really something special.  Either way, if you develop competence in a technical area, you are in the most fortunate 0.1% of humans, and take a bow.  You need to find a way that works for you.

Monday, December 7, 2020

Debugging refraction in a ray tracer

 Almost everybody has trouble with getting refraction right in a ray tracer.  I have a previous blog post on debugging refraction direction if you support boxes.  But usually your first ray tracers just supports spheres and when the picture just looks wrong and/or is black, what do you do?

So if you are more disciplined than I usually am, write a visualization tool that shows you in 3D the rays generated for a given pixel and you will often see what the problem is.  Like this super cool program for example.

But if you are a stone age person like me:

Pebbles Flintstone -- Evangelist! 

Then you have exactly two debugging tools: 1) printf() and 2) output various frame buffers.  For debugging refraction, I like #1.  First, create some model where you know exactly what the behavior of a ray and its descendants should be.  Real glass reflects and refracts.  Let's get refraction right.  So comment out any possibility of reflection.  A ray goes in, and refracts (or if that is impossible, prints that).

Now let's set up the simplest ray and single sphere possible.  The one I usually use is this:

 The viewing ray A from the camera starts at the eye and goes straight long the minus Z axis.  I assume here it is a unit length vector but it may not be depending on how you generate them.


How do you generate that ray?  You could hard-code it, or you could instrument your program to take a parameter or command line argument or whatever for which pixel to trace (like -x 250 -y 300 or whatever).  If you do that you may need to be careful to get the exact center-- like what are the pixel offsets?   That is why I usually just hard code it.  Then let the program recurse and make sure that you get:

A hits at Q which is point (0,0,1)

The surface normal vector at Q is (0,0,1)

The ray is refracted to create B with origin Q and direction (0,0,-1)

B hits at R is is point (0,0,-1)

The surface normal at R is (0,0,-1)

The ray is refracted to create C with origin R and direction (0,0,-1)

The ray C goes and computes a color H of the background in that direction. 

The recursion returns that color H all the way to the original caller

That will find 99% of bugs in my experience

Tuesday, November 17, 2020

How to write a paper when you suck at writing

Most papers are bad and almost unintelligible.  Most of my papers are bad and almost unintelligible.  The ubiquity of this is evidence that writing good papers must be hard.  So what am I to do about this if I suck at writing?  This is not false modesty; I do suck at writing.  But I am an engineer, so there must be a way to apply KISS here.  KISS works.  This blog post describes my current methodology to produce a paper.  My version of the famous Elmore Leonard quote is:

Concentrate on the parts most readers actually look at.

Your title, figures, and their captions should work as a stand-alone unit.  Write/make these first.  Doing this in the form of a Powerpoint presentation is a good method.  Explaining your paper to a friend using the white board and then photographing what evolves under Q&A is a good method.  Then ruthlessly edit.  Ask yourself: what is this figure conveying?  Each idea should have exactly one figure, and you should know exactly what each of those ideas is.

Let's do an example.  Suppose I were to write a paper on why it is better to feed your cats in the evening than in the morning.  First, you should have a figure on what is a cat, along with a caption.


File:Scheme cat anatomy-io.svg - Wikimedia Commons
This is a cat

Don't assume your reader knows much of anything (see caveat at end of article).  Now the figure above has details that are not relevant to the point, so you probably need a different one.  Getting exactly the right figure is an iterative process.

If my key reason for not feeding the cat is that in the morning they will wake me earlier each day, I need some figure related to that, such as:

Does Your Cat Wake You Up? - Cat Tree UK
The problem with a 6am feeding, is the cat starts thinking about it before 6am

(credit cattree)

Finally, you will need a figure that gets across the idea that feeding the cat in the evening works.*&output-format=auto&output-quality=auto
Feeding the cat before bedtime is effective

(credit buzzfeed)


So you now have an argument sequence where the reader can "get" the point of your paper.  Your actual paragraphs should make your case more airtight, but 90% of the battle is getting the reader to understand what you are even talking about.  "What is this?" should never be a struggle, but usually is.

For the paper writing itself, write to the figures.  First, add paragraphs that speak to the point of the figure and reference it.  Then, add paragraphs as needed to make the point convincing.  Each paragraph should have a definite purpose, and you should be able to say explicitly what that is.  The results section should convince the reader that, for example, cats do in fact behave better when fed in the evening.

Now, there is a caveat for peer-reviewed papers.  Reviewers often think that if something is clear, it must not be academic.  They will want you to omit the first figure.  This is an example of "you can't fight city hall".  But make things as clear as they will let you.   If this irritates you, suck it up, and do writing any way you want to in books, blog posts, and emerging forms of expression that you have full control over.

So, in summary, make the figures the skeleton of the paper and do them first.  An important point that applies to more than this topic: develop your own process that works for YOU; mine may be that process for you, and may not.  Final note-- I feed my cats in the morning.

Wednesday, August 19, 2020

Grabbing a web CSV file

I am tutoring an undergrad non-CS person on Python and we tried to graph some online data.  This turned out to be pretty painful to get to work just because of versioning and encodings and my own lack of Python experience.  But in the end it was super-smooth code because Python packages rule.

Here it is.  The loop over counter is super kludgy.  If you use this, improve that ;)

import csv
import urllib
import codecs
import matplotlib.pyplot as plt

url = ''

response = urllib.request.urlopen(url)
cr = csv.reader( codecs.iterdecode(response, 'utf-8') )


x = []
y = []

for row in cr:
    if counter == 3:
        for j in range(2,len(row)):
        plt.plot(x, y)
    counter=counter + 1

Thursday, July 2, 2020

Warping samples paper at rendering symposium

Here is a paper at EGSR being presented tomorrow at the time of this writing.

It deals with sampling lights.  The paper is pretty daunting because it has some curve fitting which always looks complicated and it works on manifolds so there are Jacobians running around.   But the basic ideas are less scary than the algebra implies and is clean in code.  I will cover a key part of it in a simple case.

If you are doing direct lighting from a rectangular light source, this is the typical Monte Carlo code for shading a point p:

pick point q on light with PDF p(q) = pdf_q
if (not_shadowed(p, q))
    d = unit_vector(q-p)
    light += BRDF * dot(d, n_p) * dot(-d, n_q) * emitted(q) / (distance_squared(p,q) * pdf_q)

Most renderings, or most of mine anyway, pick the point uniformly on the rectangle so pdf_q = 1/area.   There are more sophisticated methods that try to pick the PDF intelligently.  This is hard in practice.  See the paper for some references to cool work.

But we can also just try to make it one or two steps better than constant PDF.  The first obvious candidate is PDF that is bilinear which just computes this at each corner:

weight = BRDF * dot(d_corner, n_p) * dot(-d, n_corner) * emitted(corner) / distance_squared(p,corner)

The PDF gets 4 weights and needs to be normalized so it a PDF, and you need to pick points from it.  This is an algebraic book-keeping job.  But not bad (see the paper!).  The disruption to the code is no big deal.   For my particular C++ implementation here is my sampling code (which is a port of David Hart's shadertoy-- see that for the code in action-- this is just to show you it's no big deal)

// generate a sample from a bilinear set of weights
vec2 sample_quad(const vec2& uv, const vec4& w) {
    vec2 newuv;
    real u = pdf_u(uv.x(), w);
    real v = pdf_v(uv.y(), u, w);
    return vec2(u,v);

// helper functions
inline real pdf_u(real r, const vec4& w) {
    real A = w.z() + w.w() - w.y() - w.x();
    real B = 2.*(w.y()+w.x());
    real C = -(w.x() + w.y() + w.z() + w.w())*r;
    return solve_quadratic(A,B,C);

inline real pdf_v(real r, real u, const vec4& w) {
    real A = (1.-u)*w.y() + u*w.w() - (1.-u)*w.x() - u*w.z();
    real B = 2.*(1.-u)*w.x() + 2.*u*w.z();
    real C = -((1.-u)*w.y() + u*w.w() + (1.-u)*w.x() + u*w.z())*r;
    return solve_quadratic(A,B,C);

// evaluate the PDF
inline real eval_quad_pdf(const  vec2& uv, const vec4& w ) {
    return 4.0 *
        ( w.x() * (1.0 - uv.x()) * (1. - uv.y())
        + w.y() * (1.0 - uv.x()) * uv.y()
        + w.z() * uv.x()        * (1. - uv.y())
        + w.w() * uv.x()        * uv.y()
    ) / (w.x() + w.y() + w.z() + w.w());

The same basic approach works on triangles and works if you sample in a solid angle space.  I hope you try it!

Friday, April 24, 2020

Deugging refraction in a sphere

Here is a setup to help debugging refraction using a sphere and a particular ray.  I have do track single rays like this every time I implement refraction.   Here is the setup:

So let's put a sphere of radius 1 the with center (0,0,0).  Now all of the action is in XY so I will leave off the Zs.   First, what is A?

If it is to hit where the surface of the sphere is at 45 degrees, it has to have y component cos(45) = sqrt(2)/2 .  So the origin of the ray should be o = (-something, sqrt(2)/2).  And v = (1, 0).  B, the reflected direction should have origin (-sqrt(2)/2, sqrt(2)/2) and direction (0,1).

What about C?  It has origin  (-sqrt(2)/2, sqrt(2)/2) and direction (1+sqrt(2)/2, -sqrt(2)/2) which has a length of about 1.84776.  So C = (0.9239, -0.3827) approximately.

But what refractive index produces such a C?   We an see that two of the sides have length 1, so the two angles are the same, and we can see that sin(theta) = (sqrt(2)/2) / 1.8776 = 0.38268.  So the refractive index must be the ratio of that ans sin(45).  So refractive_index = 1.8478.

So what is D?  By symmetry it must leave at 45 degrees so D = (1, -1) / sqrt(2).

And finally E reflects across the X axis and = (-0.9239, -0.3827) approximately

Friday, March 13, 2020

Fresnel Equations, Schlick Approximation, Metals, and Dielectrics

If you look at a renderer you will probably see some references to "Schlick" and "Fresnel" and see some 5th order polynomial equations.  This is all about smooth metals and "dielectrics".

You will notice that the reflectivity of smooth surfaces vary with angle.  This is especially easy to see on water:

Three Worlds.jpgNote how the reflectivity increases as the angle of the viewer increases as the incident angle of the viewing direction to the normal increases.   Work by Escher.

The fraction of how much light is transmitted and how much is reflected adds to one.  This is true tracing the light in either direction:

Left: the light hitting the water divides between reflected and refracted based on angle (for this particular one it is 23% reflected).  Right: the eye sees 77% of whatever color comes from below the water and 23% of whatever color comes from above the water.

So in a ray tracer we would have the color of the ray be

color = 0.77*color(refracted_ray) + 0.23*color(reflected_ray)

But how do we compute those reflected and refracted constants?

These are given by the Fresnel Equations.  For a non-metal like water, all that you need for them is the refractive index of the material.  Back in the dark ages, I was very excited about these equations and used them in my ray tracer.  Here they are from that wikipedia page:

Here R is the reflectance, theta is the angle to the normal of the incident direction, n1 is the refractive index of the material the incident light is in, and n2 is the refractive index of the material the light is going into.  But there are two of them?!  One is for one type of polarization, and one is for the other.  The "types" are relative to the surface normal.  This is a cool topic but we would need would delve into some pretty serious optics we wont use (though many applications do where optical accuracy is needed and Alexander Wilkie has been doing great work in that area).  So what if I don't have polarization in my ray tracer.  I used to just average them:

R = (R_s + R_p) /2

So how do those look?  Here is from that same wikipedia page as above:

 Well there are some cool things there.  Especially that R_p goes to zero for one angle.  This is Brewster's Angle and why I have polarized sun glasses-- so I can get x-ray vision into water!  But I wont simulate that in a ray tracer.

So what does R = (R_s + R_p) /2 look like for the example above.  I'll type it into grapher on my mac, where x is in theta_i in radians:

Wow that is certainly a complicated expensive function for such a smooth curve.  The x axis in in radians so 90 degrees is x = PI/2 = 1.57.  So 0 to 1.57 is the interesting part.

Now I ran that code for years.  But Christophe Schlick has some sweet papers in the 1990s about using simple approximations to smooth functions in rendering.  I mean why are we solving the approximate equation (we threw away polarization) with exact, expensive equations??!  His equation for approximate fresnel reflection was almost immediately adopted by all of us.  Here it is from that wikipedia page:

 That R_0 is the reflectance for theta=0, so light going straight at the surface.  That is right where transparent surfaces are most clear.  It is just the big fancy equations above which both collapse to that for theta=0.   If I graph that too for the n=1.5 I get:

 OK that is close!  And a lot easier and cheaper.

So where do you get n1 and n2?   If one of the surfaces is air use 1.0 for that.  From that wikipedia article on refractive index above we have:

When in doubt, use 1.5.

Now what happens if you want the rainbows you can see along edges in glass?  That is dispersion and it is a pain in RGB renderers... you probably need more samples than 3 in color and you need to convert to a spectral renderer.  This has been done a few times (here is a fun one with lovely images) but I wont here!

Note that this same equation is used not only for transparent surfaces but also for the reflective coat of "diffuse-specular" surfaces.  Here is one way to think of these surfaces-- a chunk of glass/plastic/whatever with junk in it:

Note that the *surface* (specular) component is just like glass-- use the Schlick for that!  What refractive index?  When in doubt, 1.5 :)

Now what about metals?  Does the above apply?  No, they are nothing like the Fig 2.11 above.  All the action is at the surface.  But the Fresnel equations are more complex-- literally-- the refractive index of metals is complex!   This is sometimes written as real refractice index and conductivity (the real and imaginary part).

Here are the equations:

Wow.  Those are even worse (with k, the conductivity set to zero, you get the non-conductor ones).  So for some applications, this is needed.  But not for most graphics programs.  I rarely use them.  Note that they are real valued.  So can we use Fresnel?  The answer is yes!   So where do I get n and k?  I implemented them in the dark ages and tried it for a few metals.  Here is one:

That looks like we can probably use the Schlick equations but with different base cases for R, G, B.  So where do we get the normal incident reflectance?  We could by evaluating normal incidence.   But the data isn't widely available and a lot of metals are alloys of unknown composition.  So we make R0 up (like grab them from an image-- it's an RGB color, but be careful about gamma!)

vec3 R0 = (guess by looking)
vec3 R = R0 + (1-R0)*pow(1-cosine,5)

But with RGB not just floats.  This is because the refractive index for metals can vary a lot with wavelength (that is why some metals like gold are colored).

Now most graphics programs have two Schlick Equations running around-- one for dielectrics that are float and set by a refractive index, and one that is RGB and set by an RGB R0. This seems odd at first, because it is, but it makes sense if you go searching for the refractive index of glass, then gold (try it!).

Naty H writes:
I think this is good advice by Naty.

For the curious-- does light refract into metal?  Yes it does, but it gets absorbed quickly.  This is why a gold film is transparent if it is thin enough.   Here is information on refraction for metals (I have never implemented this) using Snell's law:
Thanks to Alain Galvan for pointing out some errors!