4 Variables

There are several different sets of variables implemented that the system recognises. Any undeclared variable you use that does not occur in this list will be treated as a zero or as undefined (depending on where it is used), which means it will probably not behave as you expect. An example declaring variables is given below.

As a general rule, uppercase letters denote absolute values, lower case letters denote signed values (positive or negative). Positive values are north, east, above, or to the right. Negative values are south, west, below or to the left.

$D is the absolute euclidean distance from the processing group to a candidate neighbour group across all dimensions.

$D[0], $D[1] are the absolute euclidean distances in dimension 0 and 1. In most cases $D[0] will be the X (longitude) dimension, $D[1] will be the Y (latitude) dimension. The library functions can actually handle more dimensions than this (eg $D[2] for altitude or depth), but the GUI is not set up to display them (it will plot the data using the first two axes, so only the first of any overlapping groups will be visible).

$d[0], $d[1] and so forth are the signed euclidean distance in dimension 0, 1 etc. This allows us to extract all groups within some distance in some direction. As with standard Cartesian plots, negative values are to the left or below (west or south), positive values to the right or above (east or north). As with $D[0], $d[0] will normally be the X dimension, $d[1] will be the Y dimension.

Note that using abs($d[1]) is the same as using $D[1].

$C, $C[0], $C[1], $c[0], $c[1] are the same as the euclidean distance variables ($D etc) but operate directly in group (cell) units. If your groups were imported using a cellsize of 100,000, then $D[1] < 100000 is the same as $C[1] < 1. Note, however, that if you used a different resolution in each dimension, then the map and cell distances are not directly comparable. For example, if cell sizes of 100 and 200 were used for axes 0 and 1 then $C<1 is the same as sqrt($C[0]**2 + $C[1]**2) < 1 which is sqrt(($D[0]/100)**2 + ($D[1]/200)**2) < 1, and not $D<100.

$coord_id1 is the name of the processing coord, $coord_id2 is the name of the neighbour coord.

$coord[0], $coord[1] are the coordinate values of the processing group in the first and second dimensions. As per the above, think of these as X and Y, except that $coord[5] will also work if your groups have six or more dimensions. Note that the $coord[] variables do not necessarily work properly with the spatial index, so you might need to turn the index off when using them.

$nbrcoord[0] etc are analogous to $coord[0] etc, except that they are the coordinates for the current neighbour group.

Note that the array index starts from zero, so $coord[1] is the second coordinate axis and not the first. This differs from systems like Fortran, R and AWK, but is consistent with many other programming languages like C, Rust and Python.

4.1 Examples using variables

Set the neighbours to be those groups where the absolute distance from the processing group is less than 100,000.

$D <= 100000

Select all groups to the west of the processing group.

$d[0] < 0

Select all groups to the north-east of the processing group.

$d[0] > 0 && $d[1] > 0

The absolute distance in the first (eg x) dimension is less than 100,000 AND the signed distance is greater than 100,000. This will result in a neighbourhood that is a column of groups 200,000 map units east-west, and including all groups 100,000 map units north of the processing group. Not that you would normally want a neighbourhood like this...

$D[0] <= 100000 && $d[1] >= 100000

Select everything north of 6000000 (e.g. if using UTM coordinates as axes 0 and 1). This is an example that could be used as a definition query, and will not work well as a neighbourhood (use $nbr_y instead of $y for that).

$y > 6000000

Select everything within a rectangle. This is another useful definition query.

$y > 6000000 && $y <= 6100000 && $x > 580000 && $x <= 600000

Select a specific processing coord (495:595), useful as a definition query to use only one group. Note the use of the eq operator - this matches text.

$coord_id1 eq '495:595'

4.2 Declaring variables and using more complex functions

Variable declaration is done as per Perl syntax. For example:

my $some_var = 10;
return ($D / $some_var) <= 100;

This trivial example evaluates to true if the absolute distance divided by 10 (the value in variable $some_var) is less than 100. The semicolon denotes a separation of statements to be processed in sequence, such that this example could be written on one line. The result of the last statement is what is returned to the analysis to determine if the group is part of the neighbourhood or not. It is evaluated as true or false. The word return is not actually needed in this case, but does make things clearer when there are multiple lines of code.

A more complex function might involve an ellipse (although you could just use sp_ellipse (major_radius => 300000, minor_radius => 100000, rotate_angle => 1.5714))

my $major_radius = 300000; # longest axis
my $minor_radius = 100000; # shortest axis

# set the offset in radians, anticlockwise (1.5714 = PI/2 = north)
my $rotate_angle = 1.5714;

# now calc the bearing to rotate the coords by
my $bearing = atan2 ($d[0], $d[1]) + $rotate_angle;

#  and rotate them
my $r_x = cos ($bearing) * $D; # rotated x coord
my $r_y = sin ($bearing) * $D; # rotated y coord

#  get the scaled distances in each direction
my $a_dist = ($r_y ** 2) / ($major_radius ** 2);
my $b_dist = ($r_x ** 2) / ($minor_radius ** 2);

# this last line evaluates to 1 (true) if the candidate
#   neighbour is within or on the edge of the ellipse, 
#   and 0 (false) otherwise
return ($a_dist + $b_dist) <= 1;

Note the use of the word my. This is required to declare your own variables in the correct scope. If it is not used then the variables will not work properly. Do not declare any of the pre-calculated Biodiverse variables with this ($D etc) - they already exist and redeclaring will overprint them, causing unpredictable results.

If you wish to know if your function works with the spatial index then run a moving window analysis twice, once with and once without the spatial index, using the list indices generated by the Element Counts calculations to get the lists of neighbours. Export the results to CSV and use a difference tool to compare the results.

Functions available by default are those in the Math::Trig library, plus POSIX::fmod().

To access environment variables you have set, just use $ENV{variable_name}, eg $ENV{my_default_radius}.