Wednesday, March 6, 2019

OpenCL Notes

If you're interested in OpenCL please please watch Houdini 16.5 Masterclass | OpenCL by Jeff Lait
After that, I'm sure you'll find most of the stuff I'll write in this post redundant. I'm writing this post mostly as a note to myself with code snippets (cause my memory sucks) and I thought you guys might find it useful as well. 

On top of that, I want your opinion. Please correct me if I say or write something wrong, or you know a better way. I'm really looking forward to learn.

I'll keep updating this same post with new code snippets periodically, so stay tuned.

VECTOR ATTRIBUTES


Ok let's start with an example:
I want to move all the points in my geometry up , in the positive y direction , by 1 unit.

This is how you do it in VEX ...
  • drop a Point Wrangle node
  • connect your geometry to input 1
  • write the following code in the parameter VEXpression:


... now in  OpenCL:

  • drop an OpenCL node
  • go in the bindings Tab and set it up as follows. The purpose of this tab is make geometry attributes available to the GPU. Differently from Wrangle nodes, in order to access the attributes you've to bind whatever attribute you want to read / write to a variable. In this case we bound the attribute P to the openCL variable P ( I could have chosen a different name, like for instance 'ciccio').
    It's important to note that the OpenCL node ny default will loop over every element of the first parameter appearing in the Binding Tab (as specified in the 'Options' Tab) . So , in this case, will loop over P (just like a Point Wrangle) !
  • go in the Kernel Tab and check the parameter 'Use Code Snippet'. This will allow you to write code directly on this node, opposed to point the node to a file containing the code.
  • now press the button "Generate Kernel". A window will pop up with some code. Copy the content of this window, close it and paste it in the parameter 'Kernel Code'.
  • now modify the code as follows:

You should see your geometry moving up by 1 unit.

Let's translate the OpenCL code above in English:

kernel void KernelName ( int P_length, global float * P )
Dear Houdini, please define a kernel function that I'll submit to the GPU if I press CTRL+Enter. Along with this function I'll deliver two pieces of data : one is a long list of numbers (float *) that is coming from the input connection (global) and it's called P (between me and you, this is a list of 3 numbers at a time, cause it's a list of vectors, but you don't have to worry about that for now). The other one is just an integer number (int), named P_length and will contain the length of the list P, so we know when we run out of elements.

{
...cool, let's start looping over P using idx as index

int idx = get_global_id(0);  
if (idx >= P_length) 
          return;
To start, kindly put the index of the data you're currently looping over in the variable idx. I know you'll be increasing idx at every iteration, but please if you run out of elements to loop over (if (idx>=P_length)...) , just stop (return) so we don't get one of those annoying crashes.

float3 pos = vload3(idx, P);
... fetch the current vector P in a variable called pos ...

pos.y += 1;
... now take whatever number was in the component y of the vector pos , and add 1 to it. Then store it back into pos.

vstore3 (pos, idx, P);
... and finally replace the current P with the content of pos.

}
... let's go to the next element in P and start a new loop iteration.



ARRAY ATTRIBUTES

Example:
I want to put the number -100 in the 3rd element of the array attribute A attached to point with @ptnum 14.

In VEX:


In OpenCL:



Why ?
Well, let's pretend we have this array point attribute named A :

in Vex the syntax would be 
i[]@A

and this is how we're used to see this data in the Geometry Spreadsheet:


in OpenCL the syntax would be :
global int * A
and the data structure looks like this :

... 
mem loc 45 : 395  
mem loc 46 : 302
mem loc 47 : 296
mem loc 48 : 396
mem loc 49 : 631
mem loc 50 : 614
mem loc 51 : 397
mem loc 52 : 326
mem loc 53 : 396
mem loc 54 : 634
mem loc 55 : 618
mem loc 56 : 649
mem loc 57 : 326
mem loc 58 : 396
... 

In other words, all the arrays are concatenated in a long array somewhere on some memory location in the GPU. For this reason we need an additional array that contains the index of the first element of each array in order to re-build the original data structure.

global int * A_index
...
idx 13 : mem loc 45
idx 14 : mem loc 51
idx 15 : mem loc 56
...

so, for instance, if we want to access the 3rd element of the array A[14] (which is number 396), first off we need to find the memory location of that array:

int mem_loc = A_index[14]

and then count 3 elements from that memory location, like so :

int element = A[mem_loc+3]


VOLUMES

Let's say we have a vector grid called "vel".
First off, make sure this is a Volume Grid (not a VDB). (If you find a way to pass VDBs into OpenCL please I'd love to know how).
In other words we're going to need 3 float grids (vel.x, vel.y, vel.z).
In the example below, I chose to add the Voxel Resolution, because I wanted to know weather or not I was outside of the grid boundaries, but it's not strictly necessary.


When you import a volume grid , for instance in this case it's called 'vel.x', this will result in this code in the Kernel:

...
int velx_stride_x
int velx_stride_y
int velx_stride_z
int velx_stride_offset
global float * velx ,
...

If you select "Voxel Resolution" for the grid 'vel.x', this will result in this code in the Kernel:

...
int velx_res_x
int velx_res_y
int velx_res_z
...

If you select "Volume Transform to Voxel" for the grid 'vel.x', this will result in this code in the Kernel:

...
float16 velx_xformtovoxel
...
(note this is a 4x4 matrix)


Access voxels from a world position
If you want to find the 3 voxel indices starting from a world position P, here's how (this was kindly described in the Masterclass mentioned above).

// first off we need to fetch pos from the P attribute ....
float3 pos = vload3(idx,P);

// now this suuuuper weird formula to find the volume indice
float4 voxelpos = pos.x * velx_xformtovoxel.lo.lo + pos.y * velx_xformtovoxel.lo.hi + pos.z * velx_xformtovoxel.hi.lo + 1 * velx_xformtovoxel.hi.hi;

// and of course the indices must be integers so ...
// NOTE : if you are not 100% sure P is within the grid boundaries you better modify this code to make sure you're not trying to fetch a non existing voxel (Houdini might very easily crash if you do). In this case let's pretend I'm sure P is within the volume boundaries.
int x = (int)(floor(voxelpos.x));
int y = (int)(floor(voxelpos.y));
int z = (int)(floor(voxelpos.z));

// and finally we can access the content of the voxel with indices (x,y,z)
float vx = velx[ velx_stride_offset + velx_stride_x * x + velx_stride_y * y + velz_stride_z * z ];

Monday, March 4, 2019

Sand without grains - PART 2 - VEX Implementation

I'll assume you've read PART 1 so you've a good grasp of this super simple algorithm.

Oh, and in case you're not familiar with HeightFields or Vex , I recommend to watch these two amazing and life changing Masterclasses first.


Let's quickly recap from PART 1 the ingredients:
  • the ideal data structure for this algorithm is a 2 dimensional grid.
  • we need 2 grids: one to store the height, one to store the 'toppling' direction
Now there is one data structure in Houdini that happens to be perfect for what we need :

                                                                   HeightFields (HF)

For those who don't know what HeightFields are in Houdini:
A heightfield is a grid, just instead of being a polygonal grid, it's a volumetric grid. To be more precise it's a 2d Volume. Each voxel of the grid contains a floating point number that corresponds to the height of the visual representation of that voxel.

Let's get familiar with HF first in Houdini.

In a SOP context drop a HeightField SOP node , set size to 100 , 100 and middle click on it to check the info.



You can see the node generates 2 volumes: 'height' and 'mask'.
Please note that even if there are 2 volumes here, in the view port you'll see only the 'height' volume. Mask volume is pretty much identical to Height , but is invisible and in the view port is used to color the Height volume in red for those Mask voxels bigger than 0. Mask is used to select areas of the Height volume to perform arbitrary HF operations. We will use Mask as a debugging tool.

In order to get familiar with these new concepts let's now drop a Volume Wrangle and enter the following code in the "VEXpression" field.


You'll notice that all the voxels that lay in the negative X plane (v@P.x < 0) are now 20 units taller (f@height=20). And you'll notice that all the voxels in the negative Z plane (v@P.z < 0) are now red (f@mask=1).

Remember that every Wrangle node is really a loop that iterates over all the specific elements the wrangle node is set to. In the case of a Point Wrangle the loop will iterate over all the points. In our case, Volume Wrangle , the loop will iterate over all the voxels. We don't have to write any code for the loop to happen, we only have to write the code for what happens for each iteration. So when I say 'current voxel' what I mean is 'the voxel that is currently being looped over'.




In this example we used the voxel World position (v@P) to choose where to change Height or Mask. This means that if you translate or rotate the grid before the Wrangle node, the result will 'slide'.

What if we want the result to be independent from the grid position / orientation ?
Well, we'll need to use a different way to address the voxels : indexing !
The Volume Wrangle node provides several pre-built attributes (v@P is one of those). The ones we want to use now are i@ix, and i@iy (we'll ignore i@iz since we're dealing with a 2d volume, so we'll only consider the components x and y , and always assume the z component as 0).
i@ix and i@iy contain the voxel index. Just what we need.
Ok let's drop a different Volume Wrangle in a new branch and write this code in the VEXpression field.


Since we set the grid to be 100x100 in the HeightField SOP node, with the code i@iy<50 I'm selecting only half of the voxels in the y component to change the Height to 20, and only half of the voxels in the x component to change the Mask to 1.

The result in the viewport did not change at all, but now the result of the Volume Wrangle is not affected by the grid location.
You can test the difference between the two wrangle nodes dropping a Transform SOP for each branch before the Wrangle node, and trying to move the HeightField around.


Now let's try something silly, just because I want to prove a point.

Let's attach a new Volume Wrangle to volumewrangle2 and let's try to color in red (using f@mask) all those voxels that are now higher.
It should be easy since we just set Height to 20 for certain voxels. All we need to do is select all the voxels whose world position y is bigger than, say ... 10 ?



It didn't work !
The reason is explained by the fact that we never changed the actual voxel position. We changed the data *contained* in voxels ('height'), but we never changed v@P.y (the y component of the voxel world position). Even if we wanted to change the voxel v@P.y, we wouldn't be able to do it using a Volume Wrangle node.
If you think about that, it makes perfect sense : a volume is a grid. It wouldn't make sense to be able to move voxels around. If we were able to do so , we would lose the advantages that comes from being ... a grid (we'll see this advantage in a minute).

I don't want to go too deep into HF (please make sure to watch the Masterclasses I mentioned before), but there are a couple of important facts to keep in mind:
  • Even if you see a HeightHield appearing as a 3d object in the viewport , internally it is a 2d grid.
  • The reason cause you see a 3d object, is because the 'height' volume has a special visualizer that shows the 'height' content to visually y-displace the grid in the view port.
  • The y component of a HF voxel world position *is not* 'height'.
  • Voxels cannot be moved in space by a Volume Wrangle. Volume Wrangle can only change the data (for example 'height' or 'mask') contained in the voxel.
  • Voxels can be moved in space with a Transform SOP or the HeightField SOP ('Center' parameter).
  • HF voxels can be uniquely identified by 2d coordinates (x, y). In VEX this coordinate is { i@ix , i@iy , 0 }
  • this is weird : the unit for the 'height' field is 10x bigger than Houdini Unit Length, which is 1 m by default. This means that in order to move a voxel up by 1 unit (or 1 meter) you've to set height = 0.1 (not 1).



Ok enough about HeightFields. Let's dive into the sand algorithm.

In PART 1 I named each cell 'stack'. Since we're now dealing with Volumes, I'll use the term 'voxel' to refer to the stacks.

First off we need to create an environment that will allow us to play with HeightFields. We'll use the Mask volume (generated by the HeightField SOP) as a debugging tool, as we have already done previously.

Cool, let's create a new Geo context, and name it "sandpile_VEX". Dive inside and create the following.


heightfield1 node generates a 20x20 sized grid. Grid Spacing is 1, meaning that there is 1 voxel per unit. If we put 0.5 in there we would have twice the number of voxels. This parameter is a very quick way to increase the HF resolution.

So we have a total of 400 voxels, and the volume indices are as follow:

i@ix = [ 0:19 ]
i@iy = [ 0:19 ]

heightfield_noise1 node modifies the 'height' field using a noise function. This is a quick way to create an initial sand distribution.
In the view port you should see something similar to this.



In PART 1 we split the algorithm in 2 steps. Let's work on step 1.

STEP 1

For each voxel (or stack), find out if any of the 4 adjacent voxels is lower by a certain amount (let's say 1).

Drop a Volume Wrangle node and write the following code in the parameter 'VEXpression'


In the first line we set the topple_threshold. If the current voxel is taller than the adjacent voxels by a height value bigger than 'topple_threshold', it means the current voxel should topple. But don't worry, we'll use this variable later.

Now, the biggest advantage of using volumes as a data structure is that I can find the adjacent voxels very very quickly. The grid position of the 'current voxel' is given by i@ix and i@iy. So if I want to access the voxel to the left, right , top or bottom of the current voxel all I need to do is add or subtract a 1 here and there.



The following 4 lines read the 'height' value from the 4 voxels adjacent to the current voxel.
We'll ignore the voxels in the corners.

Let's translate the following line in English :


Dear Houdini, please read the content of the node attached the first input (@OpInput1), which I assume is a volume. But since there are 2 volumes flowing through that connection ('height' and 'mask') I want to be more specific : the volume I am interested in is named "height", but hey ... not the whole thing. No way. I want the content of one voxel, and more specifically the one sitting at the index ( ix+1 , iy , 0 ). Once you get that data contained in that voxel, please put it in a float variable named 'hxp'.

@OpInput1 is just a fancy way of saying "node input 0". You can write directly 0 in there, and it will still work perfectly. I prefer to use the fancy form cause it's more descriptive for my lazy brain.

Ok now the variables hxp, hxn, hyp, hyn contain the height value of the 4 adjacent voxels. At the moment we're doing nothing with this data.

Let's add some more Vex code:



The purpose of the new 4 lines is to flag which of the adjacent voxels is a candidate for the current voxel to topple towards to.

Let's make an example:
Let's assume the height of the current voxel (f@height) is 3.
Let's assume the height of the adjacent voxel on the positive x (hxp) is 6.
And we know that topple_threshold is 1.
3 - 6 = -3
is -3 bigger than 1 ? False . Then txp = False (or 0).
Meaning, the current voxel cannot topple towards the adjacent voxel on the positive x, cause that voxel is actually taller ! Actually that voxel could topple towards the current, but that will be evaluated in some other iteration of the Volume Wrangle node loop.

Another example:
Let's assume the height of the current voxel (f@height) is 3.
Let's assume the height of the adjacent voxel on the negative y (hyn) is 1.
And we know that topple_threshold is 1.
3 - 1 = 2
is 2 bigger than 1 ? True . Then tyn = True (or 1).
This means that the current voxel could topple towards the adjacent voxel on the negative y, cause it's lower than the threshold value.

Once we collect all the values for txp, txn, typ, tyn we can put this into an array.



The line with hdiff[] is initializing the array to an array of 4 integers. All zeros to start.
Then I'm replacing each element of the array with txp ... tyn, that we know can be only 0 or 1.

Please note that in this moment we just established an important convention. Meaning that the first element in the array corresponds to the upper voxel (yp). The 3rd element of the array corresponds to the lower voxel (yn) and so on. We could have chosen a different convention of course, I just like starting from top and going clockwise. Feel free to go crazy with it, just remember to maintain the same convention later.

Fig 1

In the very last line I sum all the values contained in the array and store it in a new variable called 'topple_candidates'. This variable ultimately answers the most important question : "is the current stack going to topple ?"
If this variable is zero, it means that the array is composed by 4 zeroes, which in turn means that none of the adjacent voxels is a candidate for the current stack to topple.
In other words, if this variable is bigger than zero, the current stack is unstable. And it's really easy now to color code those voxels using the grid 'mask'.
All we need to do is add the following line :

f@mask = topple_candidates > 0;

Which brings the entire code to this :


Now if you look at the view port, you'll probably see this :


Yes it does look like a napkin full of blood. Let's try to ignore that.
This is a very comforting result : note how the voxels that lie on the ground area are all white, cause none of those have adjacent voxels that are much lower. Those white voxels are stable and will not topple. The red voxels instead are on steep slopes and it makes a lot of sense that they might topple down.

But there is one issue we need to resolve before moving to the next step: at the moment if you change the resolution of the grid ( heightfield1 -- Grid Spacing Parameter) you'll notice that the toppling areas (in red) shrink or grow. 


This is not what we want. We want to be able to develop our algorithm on a fast grid (say 20x20), and once we're happy with it, move to a way higher res grid without fearing that the result will change. In other words we need to make our system resolution independent.

First off let's try to understand why this is happening.
When we change the resolution of the grid , we effectively change the size of each voxel. But the 'topple_threshold' height is always 1. Somehow we need to tie grid spacing with topple_threshold.

when grid spacing becomes smaller, that means that the voxel becomes smaller, consequently even  topple_threshold should become smaller.

grid_spacing ---- voxel size ---- topple_threshold

So the parameter Grid Spacing and the variable 'topple_threshold' seem to be connected by a linear dependency. Let's just multiply those, and use the result as our new 'adaptive' topple_threshold.


First off I parametrized a new variable called 'topple_threshould_mult', and created a new parameter called 'grid_spacing' linked to the parameter 'grid_spacing' on the heightfield1 node.
At this point I can finally calculate the new 'topple_threshold' :

topple_threshold = grid_spacing * topple_threshold_mult ;

Now the red areas will stay consistent even if we change the resolution, on top of that we've an additional slider to change the toppling threshold !


Switch to a more decent resolution, something that will show some cool detail. On the node "Heightfield1" set the parameter Grid Spacing to 0.05.

Great !
Now let's get back for a moment to the variable 'topple_candidates'.
We said that if this is bigger than 0 it means the current stack will topple.
Now the question is "in which direction will it topple ?".
In PART 1 I decided to choose the toppling direction randomly. Note that I made this choice cause it's the simplest one. More elaborate solutions would be to choose the toppling direction based on a vector (which could be wind, or forces like mm .. collisions / proximity with other objects). For the sake of simplicity we'll go for the random route, but please feel free to experiment with other solutions on your own.

We need to choose one of the non-zero elements in the array randomly and store the index in an additional grid that we'll name 'dropdir'.

Let's abstract the problem here :

I have an array of 1s and 0s

example :
A = [ 0 , 1 , 0 , 1 ]
I want my code to *randomly* return 1 or 3 (the indices of the array that correspond to 1s).

another example : 
A = [ 1 , 0 , 0 , 1 ]
I want my code to *randomly* return 0 or 3 (the indices of the array that correspond to 1s).

another example : 
A = [ 0 , 0 , 1 , 0 ]
I want my code to always return 2 (the only index of the array that correspond to 1).

A really simple solution is the following (if you find a smarter and/or faster solution please please let me know):
  1. count how many 1s are present in the array (we already have this value, it's the variable 'topple_candidates')
    Example :
    if the array A = [  0 , 1 , 0 , 1 ]  , topple_candidates = 2
    if the array A = [  1 , 0 , 1 , 1 ]  , topple_candidates = 3
  2. if the array A = [  0 , 0 , 0 , 0 ]  , topple_candidates = 0
  3. find a random integer number between 0 and topple_candidates ( let's call it R )
    if topple_candidates = 2 , R = 1 or 2
    if topple_candidates = 3 , R = 1 , 2 or 3
    if topple_candidates = 1 , R = 1
  4. loop over the array until I find as many 1s as R
  5. store the array index in 'dropdir'
Ok first off we need to create an additional grid named 'dropdir'. As we just mentioned earlier each voxel of this grid will contain a number between 0 and 3. This number will give an indication of the toppling direction, according to the convention established in fig.1.

Modify your network this way:


In other words I'm duplicating the field 'height', renaming it to '_toppledir' (I like to ad a leading underscore to temporary data) and merging it back with the other 2 fields 'height' and 'mask'. The 'primitive1' node will disable the visibility of this new grid. For now I disabled this node cause we want to see what happens on this grid when we start storing values. We'll hide it after we are sure it's working properly.

In the view port you should see something like this


Now if you middle click on the merge1 node you should see 3 grids.

  • height - the bloody napkin, now more similar to a bloody mountain
  • mask - not visible as a grid, but visible as red mask on the 'height' field
  • _toppledir - a copy of the height field, but completely flat since all the voxels have a zero value.
Now modify the code in volumewrangle1 this way:


( I know , you hate me cause you can't copy and paste the code. But guess what, I did it on purpose ! 😼 I want you to write it all cause I want you to get used to code in VEX, opposed to become a copy and paste black belt. On top of that formatting code with Blogger is a real pain ... )

Let's quickly go over the new code.
First off, the whole new code is almost entirely enclosed in a big if-statement: if topple_candidates is zero, we mark that voxel as stable ( or -1 ) and we move on to the next voxel.
Otherwise ... 
first off we set a random number 'rnd' between 0.0 and 1.0.

let's focus on this line :
int r = 1 + floor ( rnd * topple_candidates )

The first thing to keep in mind is that the function floor( a ) returns the integer part of a. 
For instance :
floor(0.3) will return 0
floor(1.543) will return 1
floor(12.543) will return 12

now ...
(rnd * topple_candidates) is just a way to re-map a number that is in the range between 0.0 and 0.99999... to a number in a range between 0.0 and topple_candidates-1.

Floor will return the integer part of it, which is ... 0 - (topple_candidates-1)

But, remember that the purpose here is to create a random number between 1 and topple_candidates cause we want to use this number (r) as a counter ... so we just need to add 1. Hence ...
int r = 1 + floor ( rnd * topple_candidates )

In the for-loop below I start looping over the 4 elements of the array hdiff[]. This array will have at least one 1, otherwise we would have been kicked out of the if-statement earlier. 
In this loop , a variable 'c' will be increased every time we meet a 1 in the array. As soon as 'c' reaches the desired count 'r', we have found the index, and we can exit the loop (break statement).
All we need to do now is store the index in '_toppledir' !

Now if you look at the view port you'll see something weird ...


... and kinda beautiful.
The weird looking spiky thing is the '_toppledir' grid. This grid should contain numbers between -1 and 3. In fact, if you switch to the front view , you'll notice exactly that ! 


On top of that, if you look closely, you'll notice the following:
  • areas corresponding to white areas in the 'height' grid are flat and at level -1 
  • areas corresponding to red areas in the 'height' grid are always higher than -1
  • if you scrub the frame in the timeline, the peaks should change randomly (cause the variable 'rnd' is dependent on @Time) , but not the ones in the white areas of the 'height' field.
  • there are a few flat areas that don't correspond to white areas of the 'height' field ! Those should correspond to slopes where there can be only one toppling direction. These areas will be flat, yes, but higher than -1 level.
Cool.
The good news is that Step 1 is done !
Let's rename volumewrangle1 to step1.

Oh and let's store the value calculated for 'topple_threshold' in a detail attribute, on the geometry itself. This way I don't need to calculate it again in Step 2.


 The VEX line above will use the command 'setdetailattribute' to create a detail attribute

A detail attribute is simply a per-object attribute, opposed to a per-point, per-vertex, per-prim attributes.
I generally use detail attributes to store values that exist for the whole object. They're super convenient cause they're kinda similar to parameters, but attached to the geometry itself, so they flow along the whole graph and you can access it from everywhere. On top of that, they're very light since there's only one per object, and the size is independent on how heavy is the geometry.
Why not call them 'object attributes' instead of  'detail attributes' ? MMMmmm ... well ...

STEP 2

Let's quickly recap.
At this stage we've 3 volumes:
  • Height : contains our sand. It's a 2d volume. Each voxel contains the height of the corresponding stack of sand.
  • _dropdir : contains the info about which stack is stable or unstable. On top of that, for those who are unstable, it even tells us in which direction they'll topple. More specifically (check Fig.1) :
    • -1 : stable
    • 0 : topple up
    • 1 : topple right
    • 2 : topple down
    • 3 : topple left 
  • Mask : used only for debugging purpose. At the moment it flags red all the unstable Height voxels.
So, well ... we know everything. All we need to do is actually do the toppling thing.

All we need to do is this:

For each voxel V in the grid 'Height':
  • find which of the 4 adjacent voxels is marked to topple towards V and increase V's Height accordingly.

    Example:
Looking at V's 4 adjacent voxels, two of them will topple on V. One is the voxel on the right that contains the value -3 (which, according to out convention, means it's going topple left), and the other is the voxel down that contains the value 0 (which , according to our convention, means it's going to topple up). So a total of 2 'grains of sand' will land on V, increasing it's Height.

  • if V itself is not stable (meaning that it will topple on one of its 4 adjacent vectors) decrease V's 'Height" accordingly. In other words, if the current voxel V itself is not stable (meaning _toppledir contains any value different than -1) it means that one 'grain of sand' is going to topple , decreasing V's Height.

Let's implement Step 2 in Houdini : drop a new volumewrangle node and get the data about the 4 adjacent voxels.



The first 4 lines should be familiar. We're fetching the current voxel's adjacent voxels content for the grid _toppledir that we populated in step 1.

In line 5, I'm just fetching topple_threshold stored in step 1.
Why am I dividing it by 4 ? I'll explain it in a moment.

Cool, now we know in what direction the adjacent voxels will topple. Now we need to check if any of the adjacent stacks are going to topple on the current voxel. In that case we need to increase the current voxel's Height by the topple_threshold.
Let's remember that topple_threshold is the threshold value that decides if a voxel is unstable. So , in step 2, if all 4 the adjacent voxels are toppling towards the current voxel, it means that the current voxel's Height will be increased by topple_threshold*4.
Now, this wouldn't be 'fair' towards the adjacent voxels, because in Step 1, each one of them alone had a reason to topple on the current voxel. If when an adjacent stack topples on the current voxel I increase it by the *whole* topple_threshold amount, this alone would be enough to re-evaluate the reasons for the remaining voxels to topple on the current voxels. We need, instead, to preserve the consistency with the decisions made in Step 1. This is why I'm dividing the value by 4.0.
This way, if all the adjacent stacks topple towards the current stack , Height will never get higher than any of those adjacent stacks.

Cool, all we need to do, is to check if the adjacent stacks have any intention to topple on the current stack. In other words ...
  • if the stack on the right wants to topple on the left increase the current stack by threshold/4
  • if the stack on the left wants to topple on the right increase the current stack by threshold/4
  • if the stack up wants to topple down increase the current stack by threshold/4
  • if the stack down wants to topple up increase the current stack by threshold/4
 ... that's not it yet. We need to check if the current stack is not stable ..
  • if the current stack wants to topple (we don't care where) decrease the current stack by threshold/4
Let's write all this in VEX :










... first off I'm initializing to zero a new variable height_incr where I want to store the toppling contributions coming from all the directions.

All the if... statements are doing exactly what I described earlier.
  • if (dxp == 3) ...
    means ... if the stack on the right (xp = x positive) wants to topple towards the left (remember that according to our convention in Fig.1 the number 3 means 'left').
  • if (dxn == 1) ...
    means ... if the stack on the right (xn = x negative) wants to topple towards the right (remember that according to our convention in Fig.1 the number 1 means 'right').
  • ...
  • if (f@_toppledir!=-1)
    means ... if the current voxel is not stable (meaning _toppledir does not contain -1) then , regardless of what direction it'll topple, has to go down, decrease.
After accumulating all the sand contribution in height_incr , I'll add it to the current height (last line).

Done. Finish. Caput. Finito.

Almost.

Ok to recap this is what your network should look like


This is the vex code contained in 'step1' Point Wrangle node:


And this is the code contained in 'step2' Point Wrangle node:


Cool.
Now, if you check your bloody napk .. uhm your sand distribution in the viewport, not much is changed. Well, check better ... it is actually changed, but veeery very little. You need to really zoom in a detail, and if you enable and disable the node 'step2' you'll notice that something is actually changing.



This happens of course because this is only one iteration. We need to repeat this process over and over in order to see the something interesting, and of course Solver SOP is once again our friend.
All we need to do is move the nodes step1 and step2 in a Solver SOP. The Solver SOP will fetch the geometry only at the frame specified in the parameter Start Frame (typically frame 1). For all the subsequent frames, the Solver will use the result of the previous iteration. In other words, will perform a 'Feedback Loop'.

Let's do that. 

  • Create a Solver SOP node, and copy and paste the nodes step1 and step2 inside the Solver SOP network.
  • inside the Solver SOP network connect Prev_Frame input to the node step1
  • inside the Solver SOP , change the expression on the node step2 to

    ch("../../../../heightfield1/gridspacing")
  • inside the Solver SOP, on the node step 1 change the parameter Topple Threshold Mult to 2. This will make sure that only the very steep parts of the sand distribution will be modified, which is more realistic.

  • outside of the Solver SOP connect the node heightfield_noise1 with input 1 of the Solver SOP.
  • outside of the Solver SOP connect the output of the Solver SOP to a delete node, and use it to delete the field '__toppledir'.
    This field was only a temporary storage to store data between step1 and step2.  Once we're out of the Solver SOP we no longer need it.
  • outside of the Solver SOP, Set the number of Sub Steps to 20.



Make sure you're on Frame 1 , and in your ViewPort you should see something like this :


Notice that the mask field is marking in red what elements of the sand distribution are not stable.
Let's press play and see what happens.


It's cool to see how the unstable sand areas get smaller and smaller , that is comforting cause it means the sand distribution is adapting to find a stable state. Let's remove the red areas, so we can actually see the result of all our work.

In the delete node, append @name=mask to the Group parameter to remove the mask feedback from the viewport.


Now the red areas should be gone and we can run a new simulation...
(for the sim below I increased Sub Steps to 100. It's a little slower per frame, but it takes less frames to see the progress.)


Note how the steep area collapse ...


.. and the sand accumulates at the base , creating an advancing leading edge ..


Furthermore, note how the process reaches an equilibrium , and little by little the sand distribution becomes stable and the result is an entirely new distribution.

Advanced ideas for a more production-ready tool :
  • use f@mask to art direct the areas that should be affected by this process
  • use a noise function to create patterns / more interesting toppling effect
  • implement a wind direction
  • emit sand (this time particles) from the non-stable areas to simulate something like this
  • use other objects as 'colliders'. For instance create a sand footstep fx where the sand gets pushed on the sides and accumulates in taller areas around the foot, when the foot sinks in the sand (I haven't implemented this one, but would be cool to see an example of this !) (Thank you for the idea Daniel !)
I hope you enjoyed this article and apologies for taking so long before releasing part-2.
Next time we'll implement this same setup using OpenCL and compare the speed with the VEX implementation.

You'll find this setup HERE.
Note : I've implemented this on H17.0.XXX., but it should work great even on 16.

Saturday, March 24, 2018

Sand without grains - PART 1 - Abelian Sandpile Model

About a month ago I stumbled upon a project with an interesting FX : Moving Sand Dunes.
Yes ... like in the movie 'Dune' but without the worms.
And with 3 weeks dev time.

UPDATE:
The commercial is finally online so I can show you where I did apply this technique !




For some reason this video keeps being moved from YouTube and eventually has been deleted. Don't ask me why. This is the only version I found on the Net.

During the first day I kept watching and re-watching this YouTube video.


The part I found really fascinating is around second 16, where you can see the leading edge of the falling sand moving up, and the pile of sand underneath accumulating trying to chase the leading edge. Strangely enough this results in something visually moving up , opposed to what we would expect from sand falling down. 

What causes this behavior in real life is the result of millions and millions of tiny sand rocks. Each little rock has no choice than falling when there's nothing else supporting it, roll down colliding with other rocks and eventually stop as soon as the inclination is not strong enough to win the friction with the other rocks. 
All these little rocks will then eventually accumulate until the whole system becomes stable. 

If I had to recreate this in Houdini exactly as I just described, it would take a huge amount of CPU power, RAM and weeks of simulation. Not to mention disk space. In other words, a lot of resources. And the result would probably never be remotely close, because even with a terrific render farm, I could never have THAT many particles.

Actually, how many particles are we talking about ?

Well, given that the average medium-grained sized sand is 0.375 mm in diameter, how many grains of sand can I fit ...
Houdini Sand Solver is NOT an option in this case.

Let's make a few considerations:
  • looking at the video, we cannot see the single sand rocks ! All we see is a large surface deforming. So maybe we can skip the particles approach entirely and focus on the behavior of this large surface.
  • naively put, the behavior of the sand dune looks 'simple', hence there must be a 'simple' mechanism to reproduce it.
  • we don't have to be physically correct ! (unless you're animating a cube : proof).
After a few minutes on YouTube I found this video :





Is really that simple ? Well ... seems too good to be true.
Good news is that since the algorithm is so simple, it will take no time to figure out if we're going in the right direction !

Before we start, let me show you the kind of result we are aiming for.


What you're looking at is a flipbook generated in Houdini and the surface is a HeightField.
The sand starts from a veeery unstable configuration and slowly rearranges in a way that is stable. Note how the little mountain in the foreground becomes stable soon, cause its height is lower. Note how the sand accumulates at the base, and the walls slowly become flat while growing and covering underlying elements.


Sand Pile Algorithm in 2D

After watching that video, I Googled "sandpile algorithm" and found this : Abelian Sandpile Model

I started reading and after a few lines I got to this point :


I stopped reading here, and I was already on Houdini.

NOTE:
Houdini has an HeightField Erode solver which is super mega cool, but in the long 5 minutes I tried I couldn't reproduce the sandpile behaviour described below. Regardless, I still wanted to experiment this idea, primarily because it's really quick to implement and once I do it myself I can control every single aspect of it.

Cool, let's get started.

As always, let's work in 2D first. It'll be easy to switch to 3D later.
Let's pretend this is our initial sand distribution:



Each column is a stack of sand grains.
For instance the stack E has 2 grains piled , the stack G has 3, and so on.

The idea is that if a stack of sand is too tall compared to the adjacent stacks (say by 2 grains), one grain of sand would fall down, towards one of the adjacent lower stacks.
When this happens, the taller stack becomes 'one grain' shorter , while the shorter stack will become 'one grain' taller ! Well, believe it or not, this is pretty much it.


Iteration - 1
Let's analyze the stacks one by one to identify the 'unstable' stacks:
stack C is 2 grains taller than B
stack D is 2 grains taller than E
stack G is 2 grains taller than F
stack H is 3 grains taller than I

So ...

C = C - 1 and B = B +1 (one grain of sand moves from C to B)
D = D - 1 and E = E +1 (one grain of sand moves from D to E)
G = G - 1 and F = F +1 (one grain of sand moves from G to F)
H = H - 1 and I = I +1  (one grain of sand moves from H to I)

Which brings us to the next state:


Iteration - 2
Let's analyze the stacks one by one again:

stack B is now 2 grains taller than A

So ...

B = B - 1 and A = A +1 (one grain of sand moves from B to A)


Iteration - 3
If we analyze the stacks again we notice there's nothing left to do. This means our sand distribution has reached a stable state and we're done !

What is interesting is that if you sum all the heights in each of the steps above , they'll return the same number.

0+1+3+4+2+1+3+3+0 = 17
0+2+2+3+3+2+2+2+1 = 17
1+1+2+3+3+2+2+2+1 = 17

This is good. It means that sand is not magically appearing or disappearing (the mass of the system is stable, like in reality). We just managed to re-arrange the stacks in a stable distribution using a simple iterative algorithm.

Assumptions about our initial sand distribution:
  • if there is sand, there is sand underneath it.

This distribution would work with the algorithm ! Sand is supported by more sand underneath
This distribution would NOT work with the algorithm ! Sand in the red area is not supported.

  • our sand grains are arranged into a grid.
Given the two assumptions above, we can simplify the system considering only the grains at the surface and ignore the sand below. In other words we'll only consider the visible surface and ignore whatever is happening in the depth.

The ideal data structure for this approach turns out to be a simple grid. Each cell of the grid correspond to one virtual stack of sand rocks (seen from above) and the number in it corresponds to the height of each stack.

Sand Pile Algorithm in 3D


Let's set this initial 'sand' distribution.



Remember that each square is a stack of sand, and the number is the height of that stack.

Before proceeding we need to set two 'global' variables:
  • Mass transfer (M) : how much sand will fall from an unstable stack to one of the adjacent stacks.
  • Threshold (T) : the minimum height difference between the current stack and its adjacent stacks that will mark that stack as unstable (and cause an amount M of sand to be transferred to some adjacent lower stack).
For this example let's set T = 2 and M = 0.5
Now, let's identify the stacks whose adjacent stacks are lower by T.


We can identify 3 ! For instance, both '2's are surrounded by many '0's (2-0 = T!), and '4' is obviously way taller than it's neighbors by a value bigger than T.
'1' instead is not selected because the difference between it's height and the adjacent stack's height is smaller than T.

The next step is to transfer some sand (M) from those 3 stacks to some adjacent lower stack.

Let's check our options:



I marked the possible 'falling' directions with arrows. For instance '4' can fall in all the directions cause it's literally surrounded by lower stacks. So ... what direction shall we choose ? Let's pick one randomly.
In the pic above I randomly choose one arrow (the red one), as the 'falling' direction.

Pic A

Cool. So the first '2' will fall left, '4' will fall down, and the last '2' will fall right.
What does 'fall' mean ?
It means that we'll have to remove M=0.5 from the tall stack (see pic below) ...


... and transfer it to the selected adjacent lower stack (see pic below).


More specifically, in this last step for each stack we need to add M as many times as incoming arrows.

At this point we have a new state, and we need to run the same steps again and again, until all the stacks are stable.

Remember that the orange stacks will become smaller (M will be subtracted) and the green stacks will become bigger (M will be added).

Step 2 (unstable)
  

Step 3 (unstable)

Step 4 (unstable)

Step 5 (stable !!)

Why is the last state stable ? Because no stack is surrounded by stacks lower than itself by T.
In other words, sand rocks have no reason to fall anywhere cause they're all supported by solid neighbors.

Cool.
There's an important consideration to make.
You might be tempted to do the following:

For each stack A ...
  • get the 4 neighbors
  • if any neighbor is lower than A by (say) 2 then
    • reduce A by (say) 0.5
    • increase one of those 'low' neighbors by 0.5

Well this would NOT work because during one iteration you're modifying one neighbor. But in the next iteration that neighbor might become the current stack !
So .. we need to do this in 2 steps, as showed above in Sandpile Algorithm 3d.

STEP 1 - identifying the unstable stacks only

For each stack A ...
  • if it's surrounded by any stack lower than T, choose one randomly ... 
For instance if the stacks Up and Left of A are 'low', we can randomly choose one, say Up. So we'll mark A with 'U'. (so the choices are, U, D, L, R for an unstable A stack).
If A is NOT surrounded by 'low' stacks, then we mark A as 'S' (for stable).

Pic B: This corresponds to Pic A (scroll up a few pics).

STEP 2 - sand transfer

Now we can loop over each stack again and actually perform the transfer.
More specifically.
For every stack A...
  • if A is NOT marked with S, reduce A by M (remember we set M = 0.5).
    Or, simply put, if A is unstable has to drop some sand.

  • if the up stack is marked D (down) it means that the north stack wants to transfer some sand to the stack to his south, which is A. So increase A by M.
  • if the down stack is marked U (up) it means that the south stack wants to transfer some sand to the stack to his north, which is A. So increase A by M.
  • if the left stack is marked R (right) it means that the left stack wants to transfer some sand to the stack to his right, which is A. So increase A by M.
  • if the right stack is marked L (left) it means that the right stack wants to transfer some sand to the stack to his left, which is A. So increase A by M.
Pic C

Note that in both Step 1 and 2 we're modifying only the stack currently looped on (A). The neighbors are inspected, never modified. Which happens to be exactly how Houdini VEX works ! What an incredible coincidence (it's not, I did it on purpose since the beginning ha!).

Cool.
Let's summarize the whole thing in pseudo-code.

STEP 1 (pseudo code)

  1. set T = 2 (T is the height threshold)
  2. for each stack A
    1. h = height of A
    2. hU = A's UP neighbor height
    3. hD = A's DOWN neighbor height
    4. hL = A's LEFT neighbor height
    5. hR = A's RIGHT neighbor height
    6. fU = 1 if hU is smaller than h by at least T, 0 otherwise.  (or ... h-hU>=T) 
    7. fD = 1 if hD is smaller than h by at least T, 0 otherwise.  (or ... h-hD>=T)
    8. fL = 1 if hL is smaller than h by at least T, 0 otherwise.  (or ... h-hL>=T)
    9. fR = 1 if hR is smaller than h by at least T, 0 otherwise.  (or ... h-hR>=T)
    10. if (any of fU or fD or fL or fR is 1) 
      1. then pick one randomly and tag A with that label (example if we choose fU, A is marked with 'U').
      2. else, mark A with 'S' (as stable)
At this stage all the stacks should have a tag that can be 'U', 'D', 'L', 'R', 'S' (like in Pic B).

STEP 2 (pseudo code)

  1. set M = 0.5 (M is the mass transfer between sand stacks)
  2. for each stack A
    1. h = A's height
    2. u = label of A's UP neighbor (could be 'U', 'D', 'L', 'R', 'S')
    3. d = label of A's DOWN neighbor (could be 'U', 'D', 'L', 'R', 'S')
    4. l = label of A's LEFT neighbor (could be 'U', 'D', 'L', 'R', 'S')
    5. r = label of A's RIGHT neighbor (could be 'U', 'D', 'L', 'R', 'S')
    6. if (u is 'D') then h = h + M
    7. if (d is 'U') then h = h + M
    8. if (l is 'R') then h = h + M
    9. if (r is 'L') then h = h + M
    10. if (A's label is different than 'S') then h = h - M
Cool ! This is pretty much the whole idea.


PART 3 - Implementing a HeightField Sandpile Solver in Houdini with OpenCL (coming a little later than soon)
This algorithm is already fast cause it involves very simple math, nevertheless if you want to have production quality (don't we all ? ) you might want to use a very large grid with a ton of detail. In that case, why not speed up the calculation by say ... 10 or 20 times converting the code into OpenCL ?