Boundaries and nesting of land and water areas

One of the problems that arises in the process of drawing lands and lakes boundaries is the problem of determining the relative nesting of land and water areas. It’s a common thing, when there are lakes inside the continent, and, in turn, there can be islands in those lakes. In principle, infinite nesting is possible: there can be pools of water on the islands, in which there may be small islets, and so on.

When the boundaries between land and water are determined the role of these boundaries are not always clear. A boundary line may be the mainland coastline or the shore of a lake. The task here is to strictly define the boundary of which object is the given boundary line. An example of in-lake island

You can accomplish this task using PostGIS extension, just as we used it to create rivers with variable widths.

As an example, we consider four simple geographical objects as in the picture: red squares are the continents, blue square is the lake, and yellow square is the island in this lake. It is natural to consider that the lakes’ areas belong to the mainland, and the lake islands’ areas belong to the lakes. That is, all our objects are polygons without holes. In this case, the objects will be located either side by side (if they do not intersect) or one above the other, as if in layers.

Partial overlapping of objects are not allowed (in the project this is provided by geography objects creating procedures).

Let’s create a GIS database of those objects.

CREATE EXTENSION postgis;

CREATE TABLE IF NOT EXISTS polygons
(
fid           Bigint NOT NULL,
pfid          Bigint,
zindex        Smallint,
area          Double Precision,
geom          Geometry(POLYGON, 3785)
);

polygons table has following fields:

• fid is feature id of the polygon.
• pfid is the id of the polygon container in which the given polygon is directly contained.
• zindex is the layer number in which a polygon is located.
• area is the area of the polygon. In this example, we calculate the 2D Cartesian area of the polygon; if you need an exact area on the sphere, then you need to go to PostGIS Geography Type.
• geom is the field with geographic data in the Mercator projection (EPSG:3785).

For efficiency we should create some indices.

CREATE INDEX area_idx ON polygons (area);
CREATE INDEX pfid_idx ON polygons (pfid);
CREATE INDEX polygons_geom_idx ON polygons USING GIST (geom);

area_idx and pfid_idx are the standard indices on area and pfid fields. polygons_geom_idx is the GIST index on geom field, which speeds up the process of finding the nested polygons. You may prefer to create indices after filling out the table.

The procedure arrangePolygons() scans the polygons table and finds the container polygon id. If there is no container for a polygon then NULL value remains.

CREATE OR REPLACE FUNCTION arrangePolygons()
RETURNS Void AS \$proc\$
DECLARE
curs_feats CURSOR FOR
SELECT fid, area, geom FROM polygons
ORDER BY area FOR UPDATE;
curs_parents CURSOR (
feat_area double precision, feat_geom Geometry) FOR
SELECT fid, geom FROM polygons
WHERE area>feat_area AND geom && feat_geom
ORDER BY area;
pfidV Bigint;
BEGIN
SET enable_seqscan TO off;
<<feats_loop>>
FOR feat IN curs_feats LOOP
pfidV = NULL;
IF feat.geom IS NOT NULL THEN
<<parents_loop>>
FOR pfeat IN curs_parents (
feat_area:=feat.area, feat_geom:=feat.geom) LOOP
IF _ST_Contains(pfeat.geom, feat.geom)=true THEN
pfidV = pfeat.fid;
EXIT parents_loop;
END IF;
END LOOP parents_loop;
UPDATE polygons SET
pfid = pfidV
WHERE CURRENT OF curs_feats;
END IF;
END LOOP feats_loop;
END;
\$proc\$ LANGUAGE plpgsql;

The procedure is quite simple and consists of two nested loops. In curs_parents cursor we use ordering on area to find the object’s container: from those polygons that contain the given one the container will, of course, have the smallest area.

Note that we use a variant of the _ST_Contains function with an underscore, since ST_Contains could preliminary access the polygons_geom_idx index, which we have already used with && operator on the bounding boxes.

The expression `SET enable_seqscan TO off;‘ is here for forcing usage of defined indices.

The second procedure setZIndex(Bigint, Smallint) sets zindex of polygons and it is recursive. We pass the container id (NULL if there is no container) and the current zindex as arguments to it.

CREATE OR REPLACE FUNCTION setZIndex(pfid Bigint, zindex Smallint)
RETURNS Void AS \$proc\$
DECLARE
feat record;
curs refcursor;
BEGIN
IF pfid IS NULL THEN
OPEN curs NO SCROLL FOR
SELECT fid FROM polygons AS f
WHERE f.pfid IS NULL;
ELSE
OPEN curs NO SCROLL FOR
SELECT fid FROM polygons AS f
WHERE f.pfid=setZIndex.pfid;
END IF;
<<curs_loop>>
WHILE TRUE LOOP
FETCH NEXT FROM curs INTO feat;
PERFORM setZIndex(feat.fid, zindex+1::Smallint);
UPDATE polygons SET
zindex = setZIndex.zindex
WHERE CURRENT OF curs;
END LOOP curs_loop;
END;
\$proc\$ LANGUAGE plpgsql;

At the very beginning, one of the two applicable cursors is selected depending on the container id value. Inside the loop, the procedure is called again for those polygons whose container is current polygon. zindex value is increased by one on every recursive call.

Let’s execute this code.

INSERT INTO polygons (fid, geom) VALUES (1, ST_GeomFromText(
'POLYGON((0 0,100 0,100 100,0 100,0 0))',3785));
INSERT INTO polygons (fid, geom) VALUES (2, ST_GeomFromText(
'POLYGON((150 0,200 0,200 50,150 50,150 0))',3785));
INSERT INTO polygons (fid, geom) VALUES (3, ST_GeomFromText(
'POLYGON((20 20,80 20,80 80,20 80,20 20))',3785));
INSERT INTO polygons (fid, geom) VALUES (4, ST_GeomFromText(
'POLYGON((40 40,60 40,60 60,40 60,40 40))',3785));
UPDATE polygons SET area=ST_Area(geom);
SELECT arrangePolygons();
SELECT setZIndex(NULL::Bigint,0::Smallint);
SELECT fid, pfid, area, zindex FROM polygons;

We should get following results.

fid pfid area zindex
2 2500 0
4 3 400 2
3 1 3600 1
1 10000 0

The pfid column gives us the container id, the value in the column zindex shows to us the type of the polygon object: if the value is even then it is the land coastline, otherwise it is the lake shore.

Zero value corresponds to the isolated polygons: continents or oceanic islands.