Sunday, May 15, 2016

Prototyping video processing

I got a prototype of my 360 video project done in Quartz Composer using a custom Core Image filter. I am in love with Quartz Composer and Core Image because together they make such a nice prototyping environment and because I can stay in 2D for the video. Here is the whole program:

A cool thing is I can use an image for debugging, where I can stick in whatever calibration points I want in Photoshop. Then I just connect the video part and no changes are needed: the Core Image filter takes an image or video equally happily, and the Billboard patch displays either one the same way.

The filter is pretty simple and is approximately GLSL.

One thing to be careful about is the return range of atan (the two-argument GLSL atan is the atan2 we know and love).
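
Here is a sketch of the kind of kernel I mean (Core Image's kernel language, which is approximately GLSL). The names and the simple pinhole view mapping are placeholders rather than my exact patch; the spherical-coordinate math is from the May 12 post below.

kernel vec4 equirectView(sampler src, vec2 outSize)
{
    const float PI = 3.14159265;

    // Output pixel, normalized to [0,1]^2 (outSize is the output
    // resolution, passed in as a parameter).
    vec2 p = destCoord() / outSize;

    // Placeholder view mapping: a pinhole camera looking down -z with
    // a 90 degree horizontal field of view. A real patch would map the
    // pixel into the selected quad instead.
    vec3 dir = normalize(vec3(2.0*p.x - 1.0, 2.0*p.y - 1.0, -1.0));

    // Direction to spherical coords: the two-argument atan is atan2,
    // so phi comes back in [-pi, pi], not [0, 2*pi).
    float phi = atan(dir.y, dir.x);
    float cos_theta = dir.z;  // rho = 1 for a normalized direction

    // Spherical coords to equirectangular texture coords; the shift
    // by pi is exactly the atan-range issue mentioned above.
    float u = (phi + PI) / (2.0*PI);
    float v = (1.0 + cos_theta) / 2.0;

    return sample(src, vec2(u, v) * samplerSize(src));
}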

I need to test this with some higher-res equirectangular video, preferably with a fixed viewpoint and unmodified time. If anyone can point me to some I would appreciate it.

Saturday, May 14, 2016

What resolution is needed for 360 video?

I got my basic 360 video viewer working and was not pleased with the resolution. I've realized that people are serious when they say they need very high res. I was skeptical of these claims because I am not that impressed with 4K TVs relative to 2K TVs unless they are huge. So what minimum res do we need? Let's say I have the following 1080p TV (we'll call that 2K to conform to the 4K terminology: 2K horizontal pixels):

Image from https://wallpaperscraft.com
If we wanted to tile the wall horizontally with that TV we would need 3-4 of them. For a 360 degree surround we would need 12-20. Let's call it 10 because we are after an approximate minimum res. So that's 20K pixels: to get "good" surround video we need roughly 20K pixels horizontally. Spread around the full circle, 4K is much more like NTSC. As we know, in some circumstances that is good enough.
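
Spelling out that arithmetic (back-of-envelope, using the estimates above):

10 TVs * 2K pixels per TV = 20K pixels around the full 360 degrees
20K pixels / 360 degrees = roughly 55 pixels per degree
4K pixels / 360 degrees = roughly 11 pixels per degree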

Facebook engineers have a nice talk on some of the engineering issues these large numbers imply. 

Edit: Robert Menzel pointed out on Twitter that the same logic is why 8K does suffice for current HMDs.


Thursday, May 12, 2016

Equirectangular image to spherical coords

An equirectangular image, popular in 360 video, is a projection where equal areas on the rectangle correspond to equal areas on the sphere. Here it is for the Earth:

Equirectangular projection (source: Wikipedia)
This projection is much simpler than I would have expected. The area on the unit-radius sphere between theta_1 and theta_2 (I am using the graphics convention that theta is the angle down from the pole) is:

area = 2*Pi * integral from theta_1 to theta_2 of sin(theta) d_theta = 2*Pi*(cos(theta_1) - cos(theta_2))

In Cartesian coordinates this is just:

area = 2*Pi*(z_1 - z_2)
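
As a sanity check, the whole sphere (z from 1 to -1) gives area = 2*Pi*(1 - (-1)) = 4*Pi, as it should.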

So we can just project the sphere points radially (parallel to the xy plane) onto the unit-radius cylinder and unwrap it! If we have such an image with texture coordinates (u,v) in [0,1]^2, then

phi = 2*Pi*u
cos(theta) = 2*v - 1

and the inverse:

u = phi / (2*Pi)
v = (1 + cos(theta)) / 2

So yes this projection has singularities at the poles, but it's pretty nice algebraically!
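
In code the two mappings are a couple of lines each. A GLSL-style sketch (the function names are mine):

// equirectangular texture coords (u,v) in [0,1]^2 to spherical coords
vec2 uvToSpherical(vec2 uv)
{
    float phi = 2.0 * 3.14159265 * uv.x;   // phi in [0, 2*pi)
    float cos_theta = 2.0 * uv.y - 1.0;    // cos(theta) in [-1, 1]
    return vec2(phi, cos_theta);
}

// ... and the inverse, from (phi, cos(theta)) back to (u,v)
vec2 sphericalToUV(float phi, float cos_theta)
{
    float u = phi / (2.0 * 3.14159265);
    float v = (1.0 + cos_theta) / 2.0;
    return vec2(u, v);
}

Note that we can carry cos(theta) around instead of theta itself, so no inverse trig is needed for the lookup.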

Spherical to Cartesian coords

This would probably have been easy to google if I had used the right keywords. Apparently I didn't. I will derive it here for my own future use.

The three formulas I remember learning in the dark ages:

x = rho cos(phi) sin(theta)
y = rho sin(phi) sin(theta)
z = rho cos(theta)

We know this from geometry but we could also square everything and sum it to get:

rho = sqrt(x^2 + y^2 + z^2)

This lets us solve for theta pretty easily:

cos(theta) = z / sqrt(x^2 + y^2 + z^2)

Because sin^2 + cos^2 = 1 we can get:

sin(theta) = sqrt(1 - z^2/( x^2 + y^2 + z^2))

We can also get phi from geometry, using the ever-useful atan2:

phi = atan2(y, x)
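
Collecting those into code, again as a GLSL-style sketch (the name is mine):

// cartesian point to spherical coords using the formulas above;
// returns (rho, phi, cos_theta), with phi in [-pi, pi] from the
// two-argument atan (add 2*pi to negative values for [0, 2*pi))
vec3 cartesianToSpherical(vec3 p)
{
    float rho = length(p);        // sqrt(x^2 + y^2 + z^2)
    float cos_theta = p.z / rho;  // and sin(theta) = sqrt(1.0 - cos_theta*cos_theta)
    float phi = atan(p.y, p.x);   // GLSL atan(y, x) is atan2
    return vec3(rho, phi, cos_theta);
}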



Friday, May 6, 2016

Advice sought on 360 video processing SDKs

For a demo I would like to take some 360 video (panoramic, basically a moving environment map) such as that in this image:

An image such as you might get as a frame in a 360 video (http://www.airpano.com/files/krokus_helicopter_big.jpg)
And I want to select a particular convex quad region (a rectangle will do in a pinch):


And map that to my full screen.

A canned or live source will do, but if live, the camera needs to be cheap. MacOS-friendly preferred.

I'm guessing there is some terrific infrastructure/SDK that will make this easy, but my google-fu is so far inadequate.

Tuesday, May 3, 2016

Machine learning in one weekend?

I was excited to see the title of this Quora answer: What would be your advice to a software engineer who wants to learn machine learning? However, I was a bit intimidated by the length of the answer.

What I would love to see is Machine Learning in One Weekend. I cannot write that book; I want to read it! If you are a machine learning person, please write it! If not, send this post to your machine learning friends.

For machine learning people: my Ray Tracing in One Weekend has done well and people seem to have liked it. It basically finds the sweet spot between a "toy" ray tracer and a "real" ray tracer, and after a weekend people "get" what a ray tracer is and whether they like it enough to continue in the area. Just keep the real stuff that is easy, skip the worst parts, and use a real language that is used in the discipline. Make the results satisfying in a way that is similar to really working in the field. Please feel free to contact me about details of my experience.