Category Archives: All

Boundaries and nesting of land and water areas

One of the problems that arises in the process of drawing lands and lakes boundaries is the problem of determining the relative nesting of land and water areas. It’s a common thing, when there are lakes inside the continent, and, in turn, there can be islands in those lakes. In principle, infinite nesting is possible: there can be pools of water on the islands, in which there may be small islets, and so on.

When the boundaries between land and water are determined the role of these boundaries are not always clear. A boundary line may be the mainland coastline or the shore of a lake. The task here is to strictly define the boundary of which object is the given boundary line.

a lake island

An example of in-lake island

You can accomplish this task using PostGIS extension, just as we used it to create rivers with variable widths.

As an example, we consider four simple geographical objects as in the picture: red squares are the continents, blue square is the lake, and yellow square is the island in this lake.

It is natural to consider that the lakes’ areas belong to the mainland, and the lake islands’ areas belong to the lakes. That is, all our objects are polygons without holes. In this case, the objects will be located either side by side (if they do not intersect) or one above the other, as if in layers.

Partial overlapping of objects are not allowed (in the project this is provided by geography objects creating procedures).

Let’s create a GIS database of those objects.


    fid           Bigint NOT NULL,
    pfid          Bigint,
    zindex        Smallint,
    area          Double Precision,
    geom          Geometry(POLYGON, 3785)

polygons table has following fields:

  • fid is feature id of the polygon.
  • pfid is the id of the polygon container in which the given polygon is directly contained.
  • zindex is the layer number in which a polygon is located.
  • area is the area of the polygon. In this example, we calculate the 2D Cartesian area of the polygon; if you need an exact area on the sphere, then you need to go to PostGIS Geography Type.
  • geom is the field with geographic data in the Mercator projection (EPSG:3785).

For efficiency we should create some indices.

CREATE INDEX area_idx ON polygons (area);
CREATE INDEX pfid_idx ON polygons (pfid);
CREATE INDEX polygons_geom_idx ON polygons USING GIST (geom);

area_idx and pfid_idx are the standard indices on area and pfid fields. polygons_geom_idx is the GIST index on geom field, which speeds up the process of finding the nested polygons. You may prefer to create indices after filling out the table.

The procedure arrangePolygons() scans the polygons table and finds the container polygon id. If there is no container for a polygon then NULL value remains.

    RETURNS Void AS $proc$
    curs_feats CURSOR FOR
        SELECT fid, area, geom FROM polygons
        ORDER BY area FOR UPDATE;
    curs_parents CURSOR (
          feat_area double precision, feat_geom Geometry) FOR
        SELECT fid, geom FROM polygons
        WHERE area>feat_area AND geom && feat_geom
        ORDER BY area;
    pfidV Bigint;
    SET enable_seqscan TO off;
    FOR feat IN curs_feats LOOP
        pfidV = NULL;
        IF feat.geom IS NOT NULL THEN
            FOR pfeat IN curs_parents (
                  feat_area:=feat.area, feat_geom:=feat.geom) LOOP
                IF _ST_Contains(pfeat.geom, feat.geom)=true THEN
                    pfidV = pfeat.fid;
                    EXIT parents_loop;
                END IF;
            END LOOP parents_loop;
            UPDATE polygons SET
                    pfid = pfidV
                WHERE CURRENT OF curs_feats;
        END IF;
    END LOOP feats_loop;
$proc$ LANGUAGE plpgsql;

The procedure is quite simple and consists of two nested loops. In curs_parents cursor we use ordering on area to find the object’s container: from those polygons that contain the given one the container will, of course, have the smallest area.

Note that we use a variant of the _ST_Contains function with an underscore, since ST_Contains could preliminary access the polygons_geom_idx index, which we have already used with && operator on the bounding boxes.

The expression `SET enable_seqscan TO off;‘ is here for forcing usage of defined indices.

The second procedure setZIndex(Bigint, Smallint) sets zindex of polygons and it is recursive. We pass the container id (NULL if there is no container) and the current zindex as arguments to it.

CREATE OR REPLACE FUNCTION setZIndex(pfid Bigint, zindex Smallint)
    RETURNS Void AS $proc$
    feat record;
    curs refcursor;
    IF pfid IS NULL THEN
        OPEN curs NO SCROLL FOR
            SELECT fid FROM polygons AS f
            WHERE f.pfid IS NULL;
        OPEN curs NO SCROLL FOR
            SELECT fid FROM polygons AS f
            WHERE f.pfid=setZIndex.pfid;
    END IF;
        FETCH NEXT FROM curs INTO feat;
        EXIT curs_loop WHEN NOT FOUND;
        PERFORM setZIndex(feat.fid, zindex+1::Smallint);
        UPDATE polygons SET
                zindex = setZIndex.zindex
            WHERE CURRENT OF curs;
    END LOOP curs_loop;
$proc$ LANGUAGE plpgsql;

At the very beginning, one of the two applicable cursors is selected depending on the container id value. Inside the loop, the procedure is called again for those polygons whose container is current polygon. zindex value is increased by one on every recursive call.

Let’s execute this code.

INSERT INTO polygons (fid, geom) VALUES (1, ST_GeomFromText(
  'POLYGON((0 0,100 0,100 100,0 100,0 0))',3785));
INSERT INTO polygons (fid, geom) VALUES (2, ST_GeomFromText(
  'POLYGON((150 0,200 0,200 50,150 50,150 0))',3785));
INSERT INTO polygons (fid, geom) VALUES (3, ST_GeomFromText(
  'POLYGON((20 20,80 20,80 80,20 80,20 20))',3785));
INSERT INTO polygons (fid, geom) VALUES (4, ST_GeomFromText(
  'POLYGON((40 40,60 40,60 60,40 60,40 40))',3785));
UPDATE polygons SET area=ST_Area(geom);
SELECT arrangePolygons();
SELECT setZIndex(NULL::Bigint,0::Smallint);
SELECT fid, pfid, area, zindex FROM polygons;

We should get following results.

fid pfid area zindex
2 2500 0
4 3 400 2
3 1 3600 1
1 10000 0

The pfid column gives us the container id, the value in the column zindex shows to us the type of the polygon object: if the value is even then it is the land coastline, otherwise it is the lake shore.

Zero value corresponds to the isolated polygons: continents or oceanic islands.

Change log: 0.7.1

  • Correct lakes and lands borders. Excluding only small features which size comparable to the map resolution.
  • Fixed invalid river channel heights in riversz shapefile.
  • Water level of the lakes is stored in the new field ‘hlevel’ of lakes shapefile. For lowlakes water level is always 0 (water level of the ocean), so there is no need to store it.
  • Some minor bug fixes.

The Nishke and Shkay planets data are replaced by new ones generated under the same configuration parameters. The planets named Nishke II and Shkay II respectively.

This is an important correction, which gives way to some interesting applications.


Change Log: 0.7

  • The mechanism for combining two or more height meshes to obtain a great variety in planet relief forms. In particular, it is now possible to make continents with alternate mountainous and lowland areas.
  • Rivers are now more likely to flow into each other than flow in the same direction to the ocean.
  • River width variations fixed. Earlier, the value of the channel shift depended on the latitude.
  • The way of generating relief inside rhombs is again changed: sharp uplifts near the rivers are removed and there are no ubiquitous bulges.

The two new example planets added to the site.

Making river polygons with PostGIS

In the practice of the procedural generation of maps, I found it convenient to use the facilities introduced by PostGIS extender for PostgreSQL. One of the problems that arises, the creation of river polygons, has a fairly simple solution by means of PostGIS.

Here, I will give almost all code that is needed for making river polygons. Suppose, that we already have somehow obtained the rivers line segments of different streamflow (discharge) value. All river lines consist of such segments.

Also, we can assume that the segments with the same streamflow can be one or more for any river. More than one segments are needed to create variations in the width of the river channel (see the previous post).

If the database is new, then do not forget to add the PostGIS extension.


There are two tables in the database. One contains river segments and other contains resulting river polygons.

CREATE TABLE riversegs (
    rid           bigint NOT NULL,
    streamflow    integer NOT NULL,
    geom_wgs84    geometry(LINESTRING, 4326) NOT NULL

CREATE INDEX riversegs_idx ON riversegs (rid);

    rid bigint PRIMARY KEY,
    geom geometry(MULTIPOLYGON, 3785)

CREATE INDEX rivers_geom_idx ON rivers USING GIST (geom);

We specify different projections for fields of geometry type. EPSG:4326 or WGS 84 (World Geodetic System) projection is convenient for the initial creation of geographic objects during procedural generation, because its unit of measurement is usual degrees of longitude and latitude. EPSG:3785 (Web Mercator) projection is more convenient when it is necessary to make various transformations of geometries, and its unit of measurement is meter. Since the projections are different, at the certain stage we will have to make a transformation from one projection to another.

This code also includes the creation of two indices, the second of which is the spatial index. Such indices are a good tool for speed up many operations with geospatial data; see details in the PostGIS documentation. In this example, rivers_geom_idx index is not particularly useful, but if you intend to use the resulting rivers data further, then including it more than desirable. On the other hand, we do not create any spatial index with the geom_wgs84 field, since we do not need bounding box property of its values in our procedures.

The river line segments are of LINESTRING spatial type, and resulting rivers are of MULTIPOLYGON spatial type. We do not specify the POLYGON spatial type in favor of the more general type, since the rivers can consist of separate pieces, for example, if a lake divides the river into parts.

For contrived, but simple, example we write down the data of the river consisting of three segments.

INSERT INTO riversegs VALUES (1,1,ST_GeomFromText('LINESTRING(1 0,2 0,2 1,3 1)',4326));
INSERT INTO riversegs VALUES (1,2,ST_GeomFromText('LINESTRING(3 1,3 2,3 2,3 4)',4326));
INSERT INTO riversegs VALUES (1,3,ST_GeomFromText('LINESTRING(3 4,4 4,5 4,6 4)',4326));

Below is the main procedure in which we collect all possible rivers and write them into the rivers table.

        halfw double precision
    ) RETURNS void AS
    curs_rivers CURSOR FOR SELECT rid FROM riversegs GROUP BY rid;
    target_geom Geometry;
    FOR river IN curs_rivers LOOP
    SELECT INTO target_geom (
        SELECT ST_Multi(ST_Union(segs.geom))
            FROM (SELECT 
                makeBuffer(geom_wgs84, halfw, sqrt(streamflow)) as geom
                FROM riversegs
                WHERE rid=river.rid
            ) AS segs);
        target_geom = validateMultiPolygon(target_geom);
        INSERT INTO rivers VALUES (river.rid, target_geom);
    END LOOP river_loop;
$BODY$ LANGUAGE plpgsql;

The procedure takes one argument, the half value of the minimum river width in meters. Further, in the cycle for all possible rivers, we select all segments of each river and make the combined geometry with ST_Union function. We use the variant of ST_Union function which accepts a set of spatial objects as a single argument. With ST_Multi function we ensure that the geometry will be returned as a MULTI* geometry.

makeBuffer function is defined as following.

        geom Geometry,
        halfw double precision,
        sqrt_sf double precision
    ) RETURNS Geometry AS
    buffed_geom Geometry;
    w double precision;
    w = sqrt_sf * halfw;
    buffed_geom = ST_Buffer(
        ST_Transform(geom, 3785), w, 'endcap=round join=round');
    RETURN buffed_geom;
$BODY$ LANGUAGE plpgsql;

In this procedure we turn the river line segment into the ‘thick’ segment, which is a polygon. The function ST_Buffer helps us in this. 'endcap=round join=round' argument forces the function to use this rounding method for line joinings and endings. As the buffer width we take the product of half of the minimum width and the square root of the river current streamflow.

In accordance with the project’s agreement of semi-circular profile of the river bed, we take that the river width is proportional to the square root of the current streamflow. For the sake of justice, it should be noted that there are other ways of calculating the width of rivers depending on the streamflow.

Also, note how the function ST_Transform transforms our original geometry projection into another projection, in which the unit of measurement is meter.

The validateMultiPolygon procedure does a validation check and, if necessary, changes the geometry type to MULTIPOLYGON. If resulting geometry is invalid the procedure returns NULL. We try to get a valid object two times, firstly by ST_MakeValid procedure, and then by ST_Buffer procedure, which also is used to obtain valid geometries.

CREATE OR REPLACE FUNCTION validateMultiPolygon(geom Geometry)
    RETURNS Geometry AS
    new_geom Geometry;
    IF ST_IsValid(geom)=false THEN
        new_geom = findSubGeom(ST_MakeValid(geom));
        IF (new_geom IS NULL) OR ST_IsEmpty(new_geom)=true THEN
                new_geom = findSubGeom(ST_Buffer(geom, 1));
                WHEN SQLSTATE 'XX000' THEN new_geom = NULL;
        END IF;
        IF ST_GeometryType(geom)<>'ST_MultiPolygon' THEN
            new_geom = findSubGeom(geom);
            new_geom = geom;
        END IF;
    END IF;
    RETURN new_geom;
$BODY$ LANGUAGE plpgsql;

The last procedure tries to make MULTIPOLYGON geometry from input geometry; returns NULL if this can not be done.

CREATE OR REPLACE FUNCTION findSubGeom(org_geom Geometry)
    RETURNS Geometry AS
    new_geom Geometry;
    new_geom = NULL;
    SELECT ST_Multi(ST_Collect(a.gdump)) as b INTO new_geom
        FROM (SELECT (ST_Dump(org_geom)).geom AS gdump) AS a
        WHERE ST_GeometryType(a.gdump)='ST_Polygon';
    RETURN new_geom;
$BODY$ LANGUAGE plpgsql;

That’s all. As you can see, with four small procedures we were able to perform a difficult job. To start the process with minimum river width 100, just run the command

SELECT makerivers(50);

To view resulting data

SELECT ST_AsText(geom) FROM rivers;

More realistic rivers

Looking at the rivers from the previous post it seems that everything is quite pretty. But perhaps, like me, you feel that something is wrong. Oh ye, there are no such pretty smooth rivers in nature!

Let’s open google or other Earth maps and look at the shape of any river banks. Usually, the width of a river does not remain constant and gradually changes, even when the amount of water in it is invariably. Furthermore, the banks do not vary simultaneously. This, as well as the fact that such behavior of the banks does not depends on the width of the river, suggests that the shape of the river is a fractal (do not confuse this fractal with the fractal of the river system). All this is true for majority of small and medium-width rivers, but the widest lowland rivers often have more complex banks and have large well visible islands.

Here is a pair of random Earth rivers of medium width from two continents.

It is known that in nature there are almost no classical geometric forms, but fractal forms are very common.

So far we consider only the most frequent behavior of the river channel; ducts, deltas, islands, and old channels that have become oblong lakes we are not considering now. Although, the method proposed further, with some changes, can be applied to create models of these nature forms; at least I find it as promising.

Surely, there are already scientific works devoted to the formation of rivers and it would be possible to model river forms based on the strict physical theory. However, the shape of the mountains could be obtained by modeling based on the strict theory of lithospheric plates movement, weathering, and erosion; but, in order to obtain a plausible terrain for virtual worlds, no one does it (correct me if I’m wrong). For this purpose we use one of the algorithms of the Brownian surface generation.

Similarly, with rivers, we should try to get a fractal shape close to the real river forms, and then we could use it to represent procedural generated rivers on maps.

Again, carefully studying the real earth’s rivers, we can notice some special behavior of the river banks. Two banks most often do not diverge or taper simultaneously (it also happens, but less often). Everything looks so that one bank shore departs from the imaginary central line of the river or approaches to it. This phenomenon makes one think to put one more channel (river bad), different from the first but close to it, atop each other. Then we should get the right behavior of the banks.

So it turned out to be valid.

The second channel is constructed by the random walk method. Each point of the second channel is obtained by shifting the corresponding point of the first channel by a certain vector, which is the same in absolute value (for the current river width), but its direction makes a random walk. When passing to each next point, we add a random value to the last angle value, and the random values is distributed according to the normal (Gaussian) distribution. The points thus obtained are also smoothed.

Obtained channels must be merged in the resulting river polygon. How exactly I merge channels I will recount in the next post. For now a few examples from the planet Admete.

Click the images for bigger image size.