Scape: 2. Procedural basics

December 20, 2011 Graphics

This is the second article in a series on the Scape terrain editor. The first article introduced the project and focused on the rendering side of the terrain editor. This second article covers the first (and simpler) half of the procedural algorithms behind Scape's brushes. The fifth article on Scape provides a summary of the project, together with all links to the project's research, articles, source code and binaries.

Panoramic canyon terrain in Scape

Scape allows the user to edit the terrain through different type of customizable brush types, including:

  • Push/pull smoothly (raise and lower terrain)
  • Push/pull procedurally (paint with procedural algorithms)
  • Smooth/contrast (suppress and boost smaller details)
  • Level (bulldoze all brushed terrain to the height of the stroke’s begin position).

To apply these brushes, the user can use a mouse or stylus to ‘stroke’ or 'paint' over the areas that need to be edited. The strength of the brush can be controlled as a brush parameter, but also by using a pressure-sensitive input device (e.g. a Wacom tablet). The width of the affected area is controlled the brush’s an inner and outer radius. The whole inner area is transformed with full strength, while the brush’s influence gradually fades to zero between the inner and outer radius.

Many of these brush types found in Scape can be found in other terrain editors as well. But the combination of exposing procedural techniques through fast user-controlled brushes is less common. This article will focus on its procedural brushes only, starting off with the simpler ones. And in the next article, two even more advanced algorithms are described. How all these procedural terrain-generating algorithms are used as fast brushes will be the topic of a future article in this series.

A number of different procedural algorithms have been implemented in Scape, but all are based on combining multiple samples from a 2D Perlin noise function. The reason for that is that Perlin noise is well-behaved (e.g. produces nicely band-limited noise), is fast, and can be evaluated locally and in parallel. This makes Perlin noise preferable over other techniques like iterative simulations, Poisson faulting, midpoint displacement, Voronoi summing and Fourier synthesis.

2D Perlin noise

2D Perlin noise

In case you don't know what Perlin noise looks like: Sampling it as a 2D image greyscale generates (something like) the image on the right.  But for the particular case of heightfield terrain, a Perlin noise function would typically be used to calculate the vertical y value for a given set of two-dimensional horizontal (x, z) coordinates. When the image is interpreted like that, white would indicate a high altitude, and black a low altitude.

Of course, this noise on its own wouldn't create a very interesting heightfield. But by mixing differently scaled versions of the Perlin noise functions, more complex and realistic terrain can be generated. But I’ll cover the implementation of the main building block first.

Perlin noise

Perlin noise functions can be written for any number of input and output dimensions, but for the particular case of heightfields, having one 2D horizontal position input (called p below), one 1D seed input and one 1D output (the calculated y component) is exactly enough. And for that particular case, the following implementation has been specifically written and optimized.

Note that all code in this article is written in the Cg language, allowing direct evaluation of the functions as (pixel) shader effects.

uniform sampler2D samplerPerlinPerm2D;
uniform sampler2D samplerPerlinGrad2D;

float perlinNoise(float2 p, float seed)
{
    // Calculate 2D integer coordinates i and fraction p.
    float2 i = floor(p);
    float2 f = p - i;

    // Get weights from the coordinate fraction
    float2 w = f * f * f * (f * (f * 6 - 15) + 10);
    float4 w4 = float4(1, w.x, w.y, w.x * w.y);

    // Get the four randomly permutated indices from the noise lattice nearest to
    // p and offset these numbers with the seed number.
    float4 perm = tex2D(samplerPerlinPerm2D, i / 256) + seed;

    // Permutate the four offseted indices again and get the 2D gradient for each
    // of the four permutated coordinates-seed pairs.
    float4 g1 = tex2D(samplerPerlinGrad2D, perm.xy) * 2 - 1;
    float4 g2 = tex2D(samplerPerlinGrad2D, perm.zw) * 2 - 1;

    // Evaluate the four lattice gradients at p
    float a = dot(g1.xy, f);
    float b = dot(g2.xy, f + float2(-1,  0));
    float c = dot(g1.zw, f + float2( 0, -1));
    float d = dot(g2.zw, f + float2(-1, -1));

    // Bi-linearly blend between the gradients, using w4 as blend factors.
    float4 grads = float4(a, b - a, c - a, a - b - c + d);
    float n = dot(grads, w4);

    // Return the noise value, roughly normalized in the range [-1, 1]
    return n * 1.5;
}

This implementation is loosely based on the GPU Gems 2 article ‘Implementing Improved Perlin Noise’ by Simon Green. See my MSc. thesis or his article for more background information on Perlin noise. A noticeable implementational difference is that Green's version defines a full 3D Perlin noise function, while the function above calculates only a full 2D Perlin noise result, randomized by a seed number. Obviously, you could interpret the seed as the third input dimension, but the conceptual difference here is that no smooth transition from ‘depth’ seed to seed+1 is required for the purpose of heightfield generation. Hence, no interpolation is needed between results from neighboring seeds, making it possible to considerably optimize Green’s implementation.

samplerPerlinPerm2D's permutation texture

Permutation texture

samplerPerlinPerm2D is used to sample a carefully constructed 256 x 256 permutation RGBA texture. This texture is effectively a lookup table that, given 'lattice point' i, returns the hash of (i.x, i.y), of (i.x, i.y + 1), of (i.x + 1, i.y) and of (i.x + 1, i.y + 1) in its R, G, B and A channel, respectively. In other words, perm will contain not only contain a randomly permutated version of i, but also that of its neighbours to the south, east and south-east. The R8G8B8A8 repeated texture is to be sampled at the nearest point (so, without bilinear blending) with mipmapping turned off.

samplerPerlinGrad2D's gradient texture

Gradient texture

samplerPerlinGrad2D uses a similar trick. Its 2D input (the hash of two north-south neighbors on the lattice, offsetted by the seed number) is used to do two lookups in one sample. Sampling its 256 x 256 R8G8B8A8 texture permutates its input coordinates one last time (properly hashing the seed together with the coordinates) and returns the two 2D gradients for the two neighbors in its RGBA components. Consequently, only two samplerPerlinGrad2D samples are needed to calculate the 2D gradients for all four cell corners (i.e. lattice points) surrounding p. Next, the four planes (defined by the four corners’ positions and gradients) are used to calculate the four heights at p. Lastly, these heights are blended using a quintic interpolator, and normalized. Like samplerPerlinPerm2D, the texture sampler mode should be set to repeating, nearest point sampling and no mipmapping.

So, in effect, only three texture lookups in total are necessary to shuffle the 2D coordinates plus seed three times through an 8-bit permutation table and look up the four gradients at the four nearest cell corners. 8-bit shuffling proved to be more than enough because, in practice, Perlin noise is always sampled at multiple different locations and seed numbers before being combined and outputted as terrain, hiding its (eventually) repetitive behaviour even better.

Near the end this article, you’ll find a link to a code snippet that generates the permutation and gradient textures for use in DirectX. Please note that DirectX and OpenGL define differently what 'the next line' is in a texture. To get the code to work correctly under OpenGL, you could add fragment program code to flip the v coordinate in the texture reads. But that would only work when additional scale-and-bias code is added as well, countering artefacts due to differences in rounding behaviour. Consequently, it's better to simply use a permutation and gradient image that have been flipped vertically in advance and keep the Cg code unaltered when using OpenGL, as demonstrated in the FX Composer project.

fBM

(fBm) turbulence

(fBm) turbulence

fBM (fractional Brownian motion) is the simplest of the composite Perlin noise algorithms. Because more complex turbulence algorithms are extensions of this basic idea, this algorithm is simply called 'turbulence' in Scape.

It calculates the weighted sum a number of scaled Perlin noise ‘octaves’, adding details over a broad range of scales. Each octave is responsible for a different size of features, or, inversely, a dominant 'frequency' in the generated 2D signal.

float turbulence(float2 p, float seed, int octaves,
                 float lacunarity = 2.0, float gain = 0.5)
{
    float sum = 0;
    float freq = 1.0, amp = 1.0;
    for (int i=0; i < octaves; i++)
    {
        float n = perlinNoise(p*freq, seed + i / 256.0);
        sum += n*amp;
        freq *= lacunarity;
        amp *= gain;
    }
    return sum;
}

The seed should be any number between 0.0 and 1.0. The lacunarity (the frequency multiplier between octaves) is typically set to a number just under 2.0 (e.g. 1.92) to prevent regular overlap in Perlin noise lattices.

To cover all possible feature scales, the number of octaves is typically a bit less than log(terrain_width) / log(lacunarity). So, for a 1024 x 1024 heightfield, about 10 octaves are needed. The gain is controlled by the user and directly influences the terrain roughness.

Billowy turbulence

(Billowy) turbulence

(Billowy) turbulence

When the following billowedNoise function is called instead of perlinNoise in the turbulence function, a more ‘billowy’ and eroded terrain with sharp creases can be created. This type of turbulence noise is often called simply ‘the’ turbulence function in other literature. As you can see below, it simply is defined as the absolute value of the standard perlinNoise function.

float billowedNoise(float2 p, float seed)
{
    return abs(perlinNoise(p, seed));
}

Ridged turbulence

Ridged turbulence

Ridged turbulence

Using the complement of the billowedNoise function results in terrain with sharp ridges instead of creases. That is, by calling ridgedNoise instead of perlinNoise in the turbulence function, a different, less eroded kind of mountainous terrain is approximated. Note that these algorithms have little to do with actually occurring processes in nature, but produce (somewhat) convincing results nonetheless.

float ridgedNoise(float2 p, float seed)
{
    return 1.0f-abs(perlinNoise(p, seed));
}

IQ Noise

The algorithms above are originally all due to K. Perlin. They work by summing noise functions at different scales, but the noise functions are completely independent from each other. Consequently, both peaks and valleys have equal amounts of (statistically similar) smaller features.

IQ noise

IQ noise

One way to create more heterogeneous terrain is to modify the amplitude of finer detail octaves based on (intermediate) output from coarser octaves while summing over them as before.

Originally described by I. Quillez, the following algorithm applies that principle quite nicely:

float iqTurbulence(float2 p, float seed, int octaves,
                   float lacunarity = 2.0, float gain = 0.5)
{
    float sum = 0.5;
    float freq = 1.0, amp = 1.0;
    float2 dsum = float2(0,0);
    for (int i=0; i < octaves; i++)
    {
        float3 n = perlinNoisePseudoDeriv(p*freq, seed + i / 256.0);
        dsum += n.yz;
        sum += amp * n.x / (1 + dot(dsum, dsum));
        freq *= lacunarity;
        amp *= gain;
    }
    return sum;
}

n is the output of a noise function called perlinNoisePseudoDeriv that returns 3 components. n.x is simply the current octave’s Perlin noise output, equal to what float n was in previous algorithms. n.y and n.z are kind of like the local derivatives of the Perlin noise function at the sampled coordinates. I say ‘kind of like’ because, in fact, they output something different because the derivative of the noise function was derived incorrectly by the original author. Nevertheless (or perhaps thanks to), they output the right kind of information for this algorithm to create much more interesting terrain.

While iterating over the octaves, the pseudo derivatives are collected and summed into dsum, which is used to (kind of) suppress finer details on slopes detected in coarser octaves. Note that when dsum would be zero, the algorithm is equivalent to the turbulence function. The perlinNoisePseudoDeriv function is defined as follows:

float3 perlinNoisePseudoDeriv(float2 p, float seed)
{
    // Calculate 2D integer coordinates i and fraction p.
    float2 i = floor(p);
    float2 f = p - i;

    // Get weights from the coordinate fraction
    float2 w = f * f * f * (f * (f * 6 - 15) + 10);
    float4 w4 = float4(1, w.x, w.y, w.x * w.y);

    // Get pseudo derivative weights
    float2 dw = f * f * (f * (30 * f - 60) + 30);

    // Get the four randomly permutated indices from the noise lattice nearest to
    // p and offset these numbers with the seed number.
    float4 perm = tex2D(samplerPerlinPerm2D, i / 256) + seed;

    // Permutate the four offseted indices again and get the 2D gradient for each
    // of the four permutated coordinates-seed pairs.
    float4 g1 = tex2D(samplerPerlinGrad2D, perm.xy) * 2 - 1;
    float4 g2 = tex2D(samplerPerlinGrad2D, perm.zw) * 2 - 1;

    // Evaluate the four lattice gradients at p
    float a = dot(g1.xy, f);
    float b = dot(g2.xy, f + float2(-1,  0));
    float c = dot(g1.zw, f + float2( 0, -1));
    float d = dot(g2.zw, f + float2(-1, -1));

    // Bi-linearly blend between the gradients, using w4 as blend factors.
    float4 grads = float4(a, b - a, c - a, a - b - c + d);
    float n = dot(grads, w4);

    // Calculate pseudo derivates
    float dx = dw.x * (grads.y + grads.w*w.y);
    float dy = dw.y * (grads.z + grads.w*w.x);

    // Return the noise value, roughly normalized in the range [-1, 1]
    // Also return the pseudo dn/dx and dn/dy, scaled by the same factor
    return float3(n, dx, dy) * 1.5;
}

FX Composer

If you want to use or experiment with any of the code mentioned in this article, you can simply download the NVIDIA FX Composer project from the Downloads section below and start from there. It contains all discussed algorithms as different DX9/DX10 FX and (OpenGL) CgFX techniques, outputting the result as a greyscale procedural image on a plane.

Procedural pixel shader in FX Composer

That's it for now. In the next article in this series I will cover more complex extensions and variations on the techniques described above.

Downloads

  • FX Composer 2.5 project, containing all discussed algorithms (ZIP).
  • C++ permutation and gradient image generation snippet (TXT)

Licensed under the Simplified BSD license.

Comments (8)

Crysta
January 3, 2012

Wonderful information, I am checking back again frequently looking for upgrades.

interval training
April 6, 2012

I like your fantastic web site, I was searching for this all over.
best regards
Ron

Arturo Pliego
June 3, 2012

Some truly prime posts on this website, saved to my bookmarks.

Jérémie St-Amand
June 10, 2012

How would you calculate terrain normals effectively?

I tried finding the tangent and bitangent vectors in the vertex program and then normalizing their cross product. It works, but requires a LOT of shader instructions because I'm computing the noise functions 3 times (so I must reduce the octaves count significantly).

What would be a more efficient solution?

Thanks for your great articles!

Giliam
June 10, 2012

Hi Jérémie.

As always, the answer depends on the context. ;-) . If you're using perlin noise summing without any form of input distortion to compute the 'standard' turbulence function, then the perlinNoiseDeriv function defined in the next article is basically all you need. Start with float height = 0; and float3 normal = float3(0,0,0); Then, for each octave, execute float3 n = perlinNoiseDeriv(freq*pos); height += amp * n.x; normal += amp * freq * float3(-n.y,1,-n.z); and after summing all octaves, simply normalize normal once, and your done!

If that approach is somehow unsuitable for your purposes, then you'll need to compute the result of at least 3 noise function per position, as you already said. But you might be able to optimize it at least a bit further. For example, if your noise function implementation is currently pretty serial and uses mainly scalars instead of vectors, then you might be able to calculate a lot of the steps for the three different noise evalutions in parallel at the same time 'for free' by using xyz instead of only x.

Another approach would be to use pixel shaders to render out the heights to an intermediate buffer in a separate pass, which can then be sampled at 4 locations in the vertex shader to calculate the 2 deltas (or, if you prefer you could have a second pixel shader pass to do this, and then sample its result written to a second intermediate buffer from within the vertex shader). But depending on the type of terrain renderer, this approach might or might not be feasible (and could produce LOD-dependent normals, for example).

But in any case, the cross product could at least be optimized away, as float3 normal = normalize(cross(float3(0, (h2-h0)/dist, 1), float3(1, (h1-h0)/dist, 0))) = normalize(float3(-(h1-h0)/dist, 1, -(h2-h0)/dist)) = normalize(float3(h0-h1, dist, h0-h2)).

Good luck!
Giliam

Jérémie St-Amand
June 12, 2012

Thank you Giliam, your first suggestion works great! However, when I try to do this with Billowed and Ridged noise, the normals don't come up right. I guess I'm not supposed to abs() the whole float3 vector of perlinNoiseDeriv..?

Thanks again!
Jérémie

Giliam
June 13, 2012

That's good to hear! And you're right: for the analytic approach to work, any change to how n.x is applied should also be incorporated into how the gradient n.yz is used. For stuff like multifractals and input distortion, this can become quite complex. But for billowed and ridged turbulence, this is still relatively easy. And you're already quite close. But if the derivative of any function f(x) with respect to x is called g(x), then the derivative of abs(f(x)) with respect to x is sign(f(x))*g(x), and not abs(g(x)). So, instead of doing n = abs(n) for billowy noise, use n = float3(abs(n.x), (n.x>0?1:-1)*n.yz). Likewise, use n = float3(1-abs(n.x), (n.x>0?-1:1)*n.yz) for ridged turbulence (notice the change in order of 1 and -1 due to the MINUS abs(x)).
As a side note, i wrote x>0?1:-1 above instead of sign(x) as that can result is slightly better performance (in Cg, that is). And what I've mentioned in my previous comment about summing 'normal' could also be optimized just a bit further: use 'dsum += .... * n.yz' in the loop instead of 'normal += .... * float3(-n.y,1,-n.z)', where dsum (i.e. the 2D derivatives or gradient sum) is initailized before the loop with float2(0,0), and the 3D normal is only calculated once AFTER the loop (so: float3 normal = normalize(float3(-dsum.x,1,-dsum.y));).
Btw, I never compared the performance of the analytic normal approach to the performance of a sample-three-times differencing approach myself. So, I'm quite interested to hear if and how much difference in performance you noticed in practice.
Regards, Giliam

Florian
January 13, 2013

The ridged noise is nice, except all valleys get round. I'd propose a double ridged operator that works like this (for a statistical range of the noise between -0.5 to 0.5)

abs(0.5-abs(v)*2.0)*2.0-0.5

What this does it gives both the valleys and tops ridges, leading to a more fractured/mountainous landsacpe. It's a distinct difference from just abs(v).

Leave a comment