This is the second article in a series on the Scape terrain editor. The first article introduced the project and focused on the rendering side of the terrain editor. This second article covers the first (and simpler) half of the procedural algorithms behind Scape’s brushes. The fifth article on Scape provides a summary of the project, together with all links to the project’s research, articles, source code and binaries.
Scape allows the user to edit the terrain with a number of customizable brush types, including:
- Push/pull smoothly (raise and lower terrain)
- Push/pull procedurally (paint with procedural algorithms)
- Smooth/contrast (suppress and boost smaller details)
- Level (bulldoze all brushed terrain to the height of the stroke’s starting position).
To apply these brushes, the user can use a mouse or stylus to ‘stroke’ or ‘paint’ over the areas that need to be edited. The strength of the brush can be controlled as a brush parameter, but also by using a pressure-sensitive input device (e.g. a Wacom tablet). The width of the affected area is controlled by the brush’s inner and outer radius. The whole inner area is transformed at full strength, while the brush’s influence gradually fades to zero between the inner and outer radius.
Many of the brush types found in Scape are available in other terrain editors as well, but the combination of fast, user-controlled brushes and procedural techniques is less common. This article will focus on its procedural brushes only, starting off with the simpler ones. In the next article, two even more advanced algorithms are described. How all these procedural terrain-generating algorithms are used as fast brushes will be the topic of a future article in this series.
A number of different procedural algorithms have been implemented in Scape, but all are based on combining multiple samples from a 2D Perlin noise function. The reason for that is that Perlin noise is well-behaved (e.g. produces nicely band-limited noise), is fast, and can be evaluated locally and in parallel. This makes Perlin noise preferable over other techniques like iterative simulations, Poisson faulting, midpoint displacement, Voronoi summing and Fourier synthesis.
In case you don’t know what Perlin noise looks like: sampling it as a greyscale 2D image produces (something like) the image on the right. But for the particular case of heightfield terrain, a Perlin noise function would typically be used to calculate the vertical y value for a given pair of horizontal (x, z) coordinates. When the image is interpreted like that, white indicates a high altitude and black a low altitude.
Of course, this noise on its own wouldn’t create a very interesting heightfield. But by mixing differently scaled versions of the Perlin noise function, more complex and realistic terrain can be generated. I’ll cover the implementation of the main building block first.
Perlin noise
Perlin noise functions can be written for any number of input and output dimensions, but for the particular case of heightfields, having one 2D horizontal position input (called p below), one 1D seed input and one 1D output (the calculated y component) is exactly enough. And for that particular case, the following implementation has been specifically written and optimized.
Note that all code in this article is written in the Cg language, allowing direct evaluation of the functions as (pixel) shader effects.
uniform sampler2D samplerPerlinPerm2D;
uniform sampler2D samplerPerlinGrad2D;

float perlinNoise(float2 p, float seed)
{
    // Calculate 2D integer coordinates i and fraction f.
    float2 i = floor(p);
    float2 f = p - i;

    // Get weights from the coordinate fraction
    float2 w = f * f * f * (f * (f * 6 - 15) + 10);
    float4 w4 = float4(1, w.x, w.y, w.x * w.y);

    // Get the four randomly permutated indices from the noise lattice nearest to
    // p and offset these numbers with the seed number.
    float4 perm = tex2D(samplerPerlinPerm2D, i / 256) + seed;

    // Permutate the four offset indices again and get the 2D gradient for each
    // of the four permutated coordinate-seed pairs.
    float4 g1 = tex2D(samplerPerlinGrad2D, perm.xy) * 2 - 1;
    float4 g2 = tex2D(samplerPerlinGrad2D, perm.zw) * 2 - 1;

    // Evaluate the four lattice gradients at p
    float a = dot(g1.xy, f);
    float b = dot(g2.xy, f + float2(-1,  0));
    float c = dot(g1.zw, f + float2( 0, -1));
    float d = dot(g2.zw, f + float2(-1, -1));

    // Bi-linearly blend between the gradients, using w4 as blend factors.
    float4 grads = float4(a, b - a, c - a, a - b - c + d);
    float n = dot(grads, w4);

    // Return the noise value, roughly normalized in the range [-1, 1]
    return n * 1.5;
}
This implementation is loosely based on the GPU Gems 2 article ‘Implementing Improved Perlin Noise’ by Simon Green. See my MSc. thesis or his article for more background information on Perlin noise. A noticeable implementation difference is that Green’s version defines a full 3D Perlin noise function, while the function above calculates only a full 2D Perlin noise result, randomized by a seed number. Obviously, you could interpret the seed as the third input dimension, but the conceptual difference here is that no smooth transition in ‘depth’ from seed to seed + 1 is required for the purpose of heightfield generation. Hence, no interpolation is needed between results from neighbouring seeds, making it possible to considerably optimize Green’s implementation.
samplerPerlinPerm2D is used to sample a carefully constructed 256 x 256 permutation RGBA texture. This texture is effectively a lookup table that, given ‘lattice point’ i, returns the hash of (i.x, i.y), of (i.x, i.y + 1), of (i.x + 1, i.y) and of (i.x + 1, i.y + 1) in its R, G, B and A channels, respectively. In other words, perm will not only contain a randomly permutated version of i, but also those of its neighbours to the south, east and south-east. The repeated R8G8B8A8 texture is to be sampled at the nearest point (so, without bilinear blending) with mipmapping turned off.
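To make that layout concrete, here is a small Python sketch of how such a lookup table could be built. This is my own illustration, not the C++ snippet linked below: the names hash2 and perm_texture are made up, and the nested-list ‘texture’ stands in for the real R8G8B8A8 texture.

```python
import random

# Build a hypothetical 256-entry random permutation, as in classic Perlin noise.
random.seed(42)
perm = list(range(256))
random.shuffle(perm)

def hash2(x, y):
    # Two-step permutation hash of a 2D lattice point, everything mod 256,
    # so the resulting texture tiles/repeats correctly.
    return perm[(perm[x % 256] + y) % 256]

# Each texel (x, y) stores the hashes of lattice point (x, y) and its three
# neighbours in its R, G, B and A channels, as described in the text.
perm_texture = [[(hash2(x, y),          # R: hash of (i.x,     i.y)
                  hash2(x, y + 1),      # G: hash of (i.x,     i.y + 1)
                  hash2(x + 1, y),      # B: hash of (i.x + 1, i.y)
                  hash2(x + 1, y + 1))  # A: hash of (i.x + 1, i.y + 1)
                 for x in range(256)]
                for y in range(256)]
```

One nearest-point sample of this texture thus fetches the hashes of all four cell corners surrounding p at once.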
samplerPerlinGrad2D uses a similar trick. Its 2D input (the hash of two north-south neighbours on the lattice, offset by the seed number) is used to do two lookups in one sample. Sampling its 256 x 256 R8G8B8A8 texture permutates its input coordinates one last time (properly hashing the seed together with the coordinates) and returns the two 2D gradients for the two neighbours in its RGBA components. Consequently, only two samplerPerlinGrad2D samples are needed to calculate the 2D gradients for all four cell corners (i.e. lattice points) surrounding p. Next, the four planes (defined by the four corners’ positions and gradients) are used to calculate the four heights at p. Lastly, these heights are blended using a quintic interpolator, and normalized. Like samplerPerlinPerm2D, the texture sampler mode should be set to repeat, nearest-point sampling and no mipmapping.
So, in effect, only three texture lookups in total are necessary to shuffle the 2D coordinates plus seed three times through an 8-bit permutation table and look up the four gradients at the four nearest cell corners. 8-bit shuffling proved to be more than enough because, in practice, Perlin noise is always sampled at multiple different locations and seed numbers before being combined and outputted as terrain, hiding its (eventually) repetitive behaviour even better.
Near the end of this article, you’ll find a link to a code snippet that generates the permutation and gradient textures for use in DirectX. Please note that DirectX and OpenGL define differently what ‘the next line’ in a texture is. To get the code to work correctly under OpenGL, you could add fragment program code to flip the v coordinate in the texture reads. But that would only work when additional scale-and-bias code is added as well, countering artefacts due to differences in rounding behaviour. Consequently, it’s better to simply use permutation and gradient images that have been flipped vertically in advance and keep the Cg code unaltered when using OpenGL, as demonstrated in the FX Composer project.
fBM
fBM (fractional Brownian motion) is the simplest of the composite Perlin noise algorithms. Because more complex turbulence algorithms are extensions of this basic idea, this algorithm is simply called ‘turbulence’ in Scape.
It calculates the weighted sum of a number of scaled Perlin noise ‘octaves’, adding details over a broad range of scales. Each octave is responsible for a different size of features, or, inversely, a dominant ‘frequency’ in the generated 2D signal.
float turbulence(float2 p, float seed, int octaves,
                 float lacunarity = 2.0, float gain = 0.5)
{
    float sum = 0;
    float freq = 1.0, amp = 1.0;
    for (int i = 0; i < octaves; i++)
    {
        float n = perlinNoise(p * freq, seed + i / 256.0);
        sum += n * amp;
        freq *= lacunarity;
        amp *= gain;
    }
    return sum;
}
The seed should be any number between 0.0 and 1.0. The lacunarity (the frequency multiplier between octaves) is typically set to a number just under 2.0 (e.g. 1.92) to prevent regular overlap in Perlin noise lattices.
To cover all possible feature scales, the number of octaves is typically a bit less than log(terrain_width) / log(lacunarity). So, for a 1024 x 1024 heightfield, about 10 octaves are needed. The gain is controlled by the user and directly influences the terrain roughness.
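As a quick sanity check of that rule of thumb (plain Python; the function name is my own):

```python
import math

def octaves_needed(terrain_width, lacunarity=2.0):
    # Rounded rule of thumb from the text:
    # octaves ~ log(terrain_width) / log(lacunarity)
    return round(math.log(terrain_width) / math.log(lacunarity))

print(octaves_needed(1024))        # → 10, for a 1024 x 1024 heightfield
print(octaves_needed(1024, 1.92))  # a lacunarity below 2 needs slightly more
```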
Billowy turbulence
When the following billowedNoise function is called instead of perlinNoise in the turbulence function, a more ‘billowy’ and eroded terrain with sharp creases can be created. This type of turbulence noise is often simply called ‘the’ turbulence function in other literature. As you can see below, it is simply defined as the absolute value of the standard perlinNoise function.
float billowedNoise(float2 p, float seed)
{
    return abs(perlinNoise(p, seed));
}
Ridged turbulence
Using the complement of the billowedNoise function results in terrain with sharp ridges instead of creases. That is, by calling ridgedNoise instead of perlinNoise in the turbulence function, a different, less eroded kind of mountainous terrain is approximated. Note that these algorithms have little to do with actually occurring processes in nature, but produce (somewhat) convincing results nonetheless.
float ridgedNoise(float2 p, float seed)
{
    return 1.0f - abs(perlinNoise(p, seed));
}
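To see what these two remappings do, here is a tiny Python illustration applied to a few hypothetical perlinNoise outputs in [-1, 1] (the sample values are made up):

```python
# Hypothetical raw Perlin noise samples in the range [-1, 1]
samples = [-0.75, -0.25, 0.0, 0.25, 0.75]

# Billowed noise folds the signal upwards: sharp creases wherever n crosses 0.
billowed = [abs(n) for n in samples]

# Ridged noise is the complement: sharp ridges wherever n crosses 0.
ridged = [1.0 - abs(n) for n in samples]

print(billowed)  # → [0.75, 0.25, 0.0, 0.25, 0.75]
print(ridged)    # → [0.25, 0.75, 1.0, 0.75, 0.25]
```

Note that the zero crossings of the raw noise become the lowest points (creases) of billowed noise and the highest points (ridges) of ridged noise.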
IQ Noise
The algorithms above are originally all due to K. Perlin. They work by summing noise functions at different scales, but the noise functions are completely independent from each other. Consequently, both peaks and valleys have equal amounts of (statistically similar) smaller features.
One way to create more heterogeneous terrain is to modify the amplitude of finer detail octaves based on (intermediate) output from coarser octaves while summing over them as before.
Originally described by I. Quilez, the following algorithm applies that principle quite nicely:
float iqTurbulence(float2 p, float seed, int octaves,
                   float lacunarity = 2.0, float gain = 0.5)
{
    float sum = 0.5;
    float freq = 1.0, amp = 1.0;
    float2 dsum = float2(0, 0);
    for (int i = 0; i < octaves; i++)
    {
        float3 n = perlinNoisePseudoDeriv(p * freq, seed + i / 256.0);
        dsum += n.yz;
        sum += amp * n.x / (1 + dot(dsum, dsum));
        freq *= lacunarity;
        amp *= gain;
    }
    return sum;
}
n is the output of a noise function called perlinNoisePseudoDeriv that returns three components. n.x is simply the current octave’s Perlin noise output, equal to what float n was in the previous algorithms. n.y and n.z are kind of like the local derivatives of the Perlin noise function at the sampled coordinates. I say ‘kind of like’ because they actually output something different: the derivative of the noise function was derived incorrectly by the original author. Nevertheless (or perhaps thanks to that), they output the right kind of information for this algorithm to create much more interesting terrain.
While iterating over the octaves, the pseudo derivatives are collected and summed into dsum, which is used to (kind of) suppress finer details on slopes detected in coarser octaves. Note that if dsum were always zero, the algorithm would be equivalent to the turbulence function. The perlinNoisePseudoDeriv function is defined as follows:
float3 perlinNoisePseudoDeriv(float2 p, float seed)
{
    // Calculate 2D integer coordinates i and fraction f.
    float2 i = floor(p);
    float2 f = p - i;

    // Get weights from the coordinate fraction
    float2 w = f * f * f * (f * (f * 6 - 15) + 10);
    float4 w4 = float4(1, w.x, w.y, w.x * w.y);

    // Get pseudo derivative weights
    float2 dw = f * f * (f * (30 * f - 60) + 30);

    // Get the four randomly permutated indices from the noise lattice nearest to
    // p and offset these numbers with the seed number.
    float4 perm = tex2D(samplerPerlinPerm2D, i / 256) + seed;

    // Permutate the four offset indices again and get the 2D gradient for each
    // of the four permutated coordinate-seed pairs.
    float4 g1 = tex2D(samplerPerlinGrad2D, perm.xy) * 2 - 1;
    float4 g2 = tex2D(samplerPerlinGrad2D, perm.zw) * 2 - 1;

    // Evaluate the four lattice gradients at p
    float a = dot(g1.xy, f);
    float b = dot(g2.xy, f + float2(-1,  0));
    float c = dot(g1.zw, f + float2( 0, -1));
    float d = dot(g2.zw, f + float2(-1, -1));

    // Bi-linearly blend between the gradients, using w4 as blend factors.
    float4 grads = float4(a, b - a, c - a, a - b - c + d);
    float n = dot(grads, w4);

    // Calculate the pseudo derivatives
    float dx = dw.x * (grads.y + grads.w * w.y);
    float dy = dw.y * (grads.z + grads.w * w.x);

    // Return the noise value, roughly normalized in the range [-1, 1].
    // Also return the pseudo dn/dx and dn/dy, scaled by the same factor.
    return float3(n, dx, dy) * 1.5;
}
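One part of this function that is exact (rather than ‘pseudo’) is the dw term: it is the analytic derivative of the quintic fade w = 6t^5 - 15t^4 + 10t^3. A quick finite-difference comparison in Python confirms this (my own verification, not part of the original code):

```python
def fade(t):
    # Quintic interpolator w = 6t^5 - 15t^4 + 10t^3, as in the shader
    return t * t * t * (t * (t * 6 - 15) + 10)

def dfade(t):
    # dw = 30t^4 - 60t^3 + 30t^2, matching the shader's dw term
    return t * t * (t * (30 * t - 60) + 30)

# Compare the analytic derivative against a central finite difference
h = 1e-6
for t in (0.1, 0.25, 0.5, 0.9):
    numeric = (fade(t + h) - fade(t - h)) / (2 * h)
    assert abs(numeric - dfade(t)) < 1e-5
```

Note that dfade(0) == dfade(1) == 0, which is exactly why the quintic fade yields smooth (C2-continuous) noise across lattice cell boundaries.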
FX Composer
If you want to use or experiment with any of the code mentioned in this article, you can simply download the NVIDIA FX Composer project from the Downloads section below and start from there. It contains all discussed algorithms as different DX9/DX10 FX and (OpenGL) CgFX techniques, outputting the result as a greyscale procedural image on a plane.
That’s it for now. In the next article in this series I will cover more complex extensions and variations on the techniques described above.
Downloads
- FX Composer 2.5 project, containing all discussed algorithms (ZIP).
- C++ permutation and gradient image generation snippet (TXT)
Licensed under the Simplified BSD license.
Comments
How would you calculate terrain normals effectively?
I tried finding the tangent and bitangent vectors in the vertex program and then normalizing their cross product. It works, but requires a LOT of shader instructions because I’m computing the noise functions 3 times (so I must reduce the octaves count significantly).
What would be a more efficient solution?
Thanks for your great articles!
June 10, 2012
Hi Jérémie.
As always, the answer depends on the context. ;-) If you’re using Perlin noise summing without any form of input distortion to compute the ‘standard’ turbulence function, then the perlinNoiseDeriv function defined in the next article is basically all you need. Start with float height = 0; and float3 normal = float3(0,0,0);. Then, for each octave, execute float3 n = perlinNoiseDeriv(freq*pos); height += amp * n.x; normal += amp * freq * float3(-n.y,1,-n.z);. After summing all octaves, simply normalize normal once, and you’re done!
If that approach is somehow unsuitable for your purposes, then you’ll need to compute the result of at least 3 noise functions per position, as you already said. But you might be able to optimize it at least a bit further. For example, if your noise function implementation is currently pretty serial and uses mainly scalars instead of vectors, then you might be able to calculate a lot of the steps for the three different noise evaluations in parallel ‘for free’ by using xyz instead of only x.
Another approach would be to use pixel shaders to render out the heights to an intermediate buffer in a separate pass, which can then be sampled at 4 locations in the vertex shader to calculate the 2 deltas (or, if you prefer you could have a second pixel shader pass to do this, and then sample its result written to a second intermediate buffer from within the vertex shader). But depending on the type of terrain renderer, this approach might or might not be feasible (and could produce LOD-dependent normals, for example).
But in any case, the cross product could at least be optimized away, as float3 normal = normalize(cross(float3(0, (h2-h0)/dist, 1), float3(1, (h1-h0)/dist, 0))) = normalize(float3(-(h1-h0)/dist, 1, -(h2-h0)/dist)) = normalize(float3(h0-h1, dist, h0-h2)).
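A quick numerical check of that identity, with made-up heights (in Python rather than Cg, and all names my own):

```python
import math

def normalize(v):
    # Scale a 3D vector to unit length
    l = math.sqrt(sum(c * c for c in v))
    return tuple(c / l for c in v)

def cross(a, b):
    # Standard 3D cross product
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Made-up sampled heights and sample distance
h0, h1, h2, dist = 1.0, 1.4, 0.7, 0.5

# Full tangent/bitangent cross-product form...
full = normalize(cross((0, (h2 - h0) / dist, 1), (1, (h1 - h0) / dist, 0)))
# ...and the simplified form with the cross product optimized away
short = normalize((h0 - h1, dist, h0 - h2))

assert all(abs(a - b) < 1e-9 for a, b in zip(full, short))
```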
Good luck!
Giliam
June 12, 2012
Thank you Giliam, your first suggestion works great! However, when I try to do this with Billowed and Ridged noise, the normals don’t come up right. I guess I’m not supposed to abs() the whole float3 vector of perlinNoiseDeriv..?
Thanks again!
Jérémie
June 13, 2012
That’s good to hear! And you’re right: for the analytic approach to work, any change to how n.x is applied should also be incorporated into how the gradient n.yz is used. For stuff like multifractals and input distortion, this can become quite complex. But for billowed and ridged turbulence, this is still relatively easy. And you’re already quite close. But if the derivative of any function f(x) with respect to x is called g(x), then the derivative of abs(f(x)) with respect to x is sign(f(x))*g(x), and not abs(g(x)). So, instead of doing n = abs(n) for billowy noise, use n = float3(abs(n.x), (n.x>0?1:-1)*n.yz). Likewise, use n = float3(1-abs(n.x), (n.x>0?-1:1)*n.yz) for ridged turbulence (notice the change in order of 1 and -1 due to the MINUS abs(x)).
As a side note, I wrote x>0?1:-1 above instead of sign(x) as that can result in slightly better performance (in Cg, that is). And what I mentioned in my previous comment about summing ‘normal’ could also be optimized just a bit further: use ‘dsum += …. * n.yz’ in the loop instead of ‘normal += …. * float3(-n.y,1,-n.z)’, where dsum (i.e. the 2D derivative or gradient sum) is initialized before the loop with float2(0,0), and the 3D normal is only calculated once AFTER the loop (so: float3 normal = normalize(float3(-dsum.x,1,-dsum.y));).
Btw, I never compared the performance of the analytic normal approach to the performance of a sample-three-times differencing approach myself. So, I’m quite interested to hear if and how much difference in performance you noticed in practice.
Regards, Giliam
January 13, 2013
The ridged noise is nice, except all valleys get rounded. I’d propose a double-ridged operator that works like this (for a statistical range of the noise between -0.5 and 0.5):
abs(0.5-abs(v)*2.0)*2.0-0.5
What this does is give both the valleys and the tops ridges, leading to a more fractured/mountainous landscape. It’s a distinct difference from just abs(v).
November 2, 2014
awesome post! huge thanks for sharing Giliam.
For normals calculation I use the built-in derivative functions in GLSL (dFdx/dFdy, or ddx/ddy in Cg); it works well and the performance is improved. This method produces some visual artifacts though: flat-shaded normals when passed from the vertex shader, and pixely normals in the fragment shader…
in Frag shader:
h = perlinNoise(p, seed); //can also come from Vertex shader
vec3 pos = vec3(p.xy,h);
vec3 dx = dFdx(pos);
vec3 dy = dFdy(pos);
vec3 nVec = normalize(cross(dx ,dy));
January 3, 2012