Computer Science Technical Report 95-10
LIBRA: A Lazy Interpreter of Binary Relational Algebra
Barry Dwyer
Department of Computer Science
University of Adelaide
G.P.O. Box 498, Adelaide
South Australia 5005
October 30, 1995.
LIBRA is a general-purpose programming language based on the algebra of binary
relations. It is an attempt to unify functional and logic programming,
retaining the advantages of both, and avoiding some of the problems. It has
all the features needed of a programming language, including a straightforward
semantic interpretation. Since program specifications are easily expressed as
relations, it offers a simple path from a specification to a program and from
the program to its proof of correctness. The algebra of binary relations has
several operators whose effects are like those of familiar procedural language
constructs, for example, relational composition is analogous to sequential
execution.
The syntax and semantics of the Libra language are given in detail. The Prolog
implementation of a lazy evaluator for Libra programs is described. Libra is
illustrated by its application to some simple programming exercises.
PROGRAMMING Languages, Binary Relations, Specification Languages, Prolog.
IN MATHEMATICS, the expression `X r Y' is true if X and Y satisfy the
relation `r'. For example `X<Y' is true if X and Y satisfy the
relation `<'. There are three ways the `<' relation can be considered:
as the (infinite) set of all (X,Y) pairs for which X<Y; a predicate that can
be applied to (X,Y) pairs; or as a `relator' [11, 12], that given X, will yield
all Y values greater than X.
In Libra, a (finite) relation to describe changes in temperature could be
defined as follows:
let transition->{'Cold','Warm';'Warm','Hot';'Hot','Warm';'Warm','Cold'}.
The braces enclose a set of 4 elements, separated by semicolons. Each element
is an ordered pair of terms, separated by a comma. The relation is given the
name `transition'. The two means used in this example--set formation and pair
formation--are the only ways of building data structures in Libra. The members
of a set are unordered, but pairs are ordered.
There are three ways this relation could be used:
- It could generate the four pairs of values.
- It could be used to test whether a pair of values satisfy the relation.
- Given the first term of a pair or pairs, it could give the corresponding
second terms as a result. For example, given 'Warm', it could yield both
'Cold' and 'Hot'.
The first two ways of looking at relations are shared by
sets in general. We may enumerate their members, or test elements for
membership of them. The third view puts the relation in a more active role,
and is described as `applying' the relation to an argument to yield one or more
values as a result. This is analogous to the view we take of applying
functions in a functional programming language.
Libra distinguishes these three roles syntactically. To generate the members
of the set, we would write:
find @transition.
('Cold','Warm')
('Warm','Hot')
('Warm','Cold')
('Hot','Warm')
(The `find' command enumerates the values of an expression.)
Here, the `@' operator generates each pair in the set, although not necessarily
in the order shown. It may be read as `all' or `for all', depending on the
context. In the existing Prolog-based interpreter, these pairs are enumerated
by back-tracking, but in principle, Libra allows them to be generated in
parallel. In general, we may imagine that the computation splits into several
independent threads--in this case, four of them.
To test set membership, we might write:
find ('Warm','Hot')?transition.
'True'
whereas:
find ('Cold','Hot')?transition.
'False'
The `?' operator may be read as `is a member of' and corresponds to epsilon, the usual mathematical symbol for set membership.
To apply a relation to an argument, we write:
find 'Warm'!transition.
'Hot'
'Cold'
where `!' is read as `apply'. (It has no connection with Prolog's `cut' operator.)
There is an important distinction to make here. Applying a relation to an
argument `enumerates' a set of result values, splitting this computation into
two threads. It does not yield the values as the single set, {'Hot';
'Cold'}--which can actually be generated by Libra's `image' operator:
find {'Warm'} image transition.
{'Cold'; 'Hot'}
This means that applying relations that yield several values is computation
intensive rather than storage intensive, although, as the `image' operator
illustrates, the programmer has the means to trade space for time if desired.
The distinction between the members of a set and the set itself is a subtle
one, and the reader should take careful note of it.
Finally, it is worth mentioning that Libra does not support a fourth mode of
using relations: given the second term of a pair, to determine its
corresponding first terms--at least not directly. Accordingly, the first term
of a pair is called its argument, and the second term is called its value. It
is possible to go from argument to value, but, in general, not from value to
argument--any more that existing programming languages allow inputs to be
derived from outputs. This is stressed by the alternative notation for pairs:
using `->':
let transition ->
{'Cold'->'Warm';'Warm'->;'Hot';'Hot'->'Warm';'Warm'->'Cold'}.
from which it will be seen that the name `transition' and the relation it defines
are also an ordered pair.
WHY SHOULD a programming language be based on the algebra of binary relations?
One view is that it is an attempt to combine the advantages of both functional
programming and logic programming. An often stated advantage of functional
programming is `referential transparency': the idea that a program is an
algebraic expression, which when simplified, yields the value of the program,
ie., its output. An aspect of this idea is that functions can be given names,
and that any occurrence of the name can be replaced by the corresponding
function. A difficulty that besets functional programming languages is that a
function always has exactly one value. This makes it hard to deal with
exceptions: dividing a number by zero yields no result, yet finding the square
root of a number yields two results. This difficulty is avoided in a logic
language such as Prolog, which can produce no result by `failing', or produce
two results by back-tracking. On the other hand, Prolog has its own
difficulties. Although a subset of Prolog can be understood in terms of
predicate calculus, most useful Prolog programs need to use extra-logical
predicates, which can only be understood with reference to a specific model of
program execution. A language based on the algebra of binary relations can
combine the best aspects of both approaches, while avoiding some of their
problems.
A second view is that relations are to general-purpose programming languages
what matrices are to scientific languages. They are large scale data
structures, which can be manipulated as wholes. In the scientific field,
matrices have led to the evolution of specialised parallel processing computer
architectures such as vector processors, but there has been no similar
well-established concept that has formed the basis of parallel architectures in
more diverse fields--unless one counts the n-ary relations used in databases.
Perhaps binary relations will provide such a concept. (One might consider that
the original Connection Machine [4] was based on similar ideas, but failed to
exploit the full power of binary relational algebra.) It is probable that the
ideas in this paper will need additional refinement before this goal can be
achieved, perhaps by amalgamating them with the related notion of `conceptual
graphs' [12].
A third view justifying using binary relational algebra is that it contains a
rich supply of useful operators that closely match the things that programmers
do. (This sets it apart from the relational algebra of tuples used in database
query languages, which has but a few general-purpose operators: `select',
`project', and `join'.) For example, an exercise often given to student
programmers is, given a list of words, to display for each word a list of all
its positions in the list. In the algebra binary relations, if `W' is the list
of words, the result is essentially given by the expression
`W-1'.
A fourth justification is that relations are very general mathematical objects,
which include functions as special cases. They are like multi-valued
functions, and can do anything that functions can. They are also known to be a
universal model for data representation, being the building blocks of
relational databases. In Libra, there is no distinction between a relation as
data or a relation as an operator, except how it is used. The syntax of Libra
is such that, the limitations of the ASCII character set aside, any legal
program is a valid expression in the algebra of binary relations, which in turn
specifies the program's behaviour. This gives Libra a flavour similar to the
specification language, `Z'. Although Libra is not an attempt to imitate `Z',
it does mean that it is easy to prove the correctness of a Libra program
derived from a `Z' specification, using only the tools of discrete
mathematics.
Finally, lest the reader assume that Libra is only suitable for solving
abstruse problems in discrete mathematics, before reading on, it may help first
to skim Section 8, which explains some perfectly ordinary example programs.
The proof of any pudding is in the eating, so Libra is best tested by personal
experience. The implementation of Libra described here is portable, in that it
uses core Prolog almost exclusively, and should prove easy to adapt to
different Prolog environments. (It was originally developed using Open Prolog
[1].) The interpreter allows readers the chance to experiment. It is
acknowledged that the interpreter is slow, but this is a property of the
implementation, not an inherent property of relational languages.
The Libra language owes its inspiration to Sanderson's Relator Calculus [10,
11], which first demonstrated to the author the power of relational languages.
However, there is a fundamental shift of viewpoint from that work. Sanderson
regarded programming constructs such as `if...else' and `while...do' as having
proved their worth (Wirth?), and based the relator calculus on similar
constructs. The disadvantage of this is that their mathematical properties are
not as clean as those of the roughly similar concepts of relational union and
closure, except when the program is deterministic (ie., functional). (Even
so, there are situations where something closer to `if...else' or `while...do'
is needed, which is why Libra offers the `else' and limit (^^) operators.) The
Relator Calculus [10, 11] is an algebra rather than a programming language,
although an experimental interpreter for it has been implemented.
Another source of inspiration is the work of MacLennan [5, 6], who demonstrated
several examples of relational programs, using the language RPL. RPL was
really a functional language that could operate on relational data structures.
Its control structure did not exploit backtracking or parallelism. It also had
the blemish that different operators had to be written to denote the same
operation depending on whether it was applied to relations expressed as data
(extensional relations) or as program structures (intensional relations). For
example, the union of two sets and the union of two programs were formed by
different operators. It was also difficult to mix operations on intensional
and extensional structures. This can be defended on the grounds that a
programmer should be aware of the distinction between data and executable
instructions, but it makes an RPL program less abstract, and harder to read.
The defects of RPL were addressed by Drusilla [2], which allowed the same
operator to be used uniformly on intensional or extensional relations, or a
mixture of both. This is really a radical point. It would be one thing to
devise a language that supported operations on matrices; it would be quite
another to devise a language where the programs themselves were matrices that
could be combined using matrix operations. Likewise, programs that are
structured using relational operations are not like programs in other
languages.
Drusilla also distinguishes the three modes of using relations identified
earlier: as sets of pairs, as predicates that can be applied to pairs, and as
means of constructing results from an argument. The Drusilla language system
includes a compiler that uses an extension of Milner type checking [7] to
deduce the correct ways of combining relations from their properties. This
process also performs conventional type checking, detecting potential
programming errors. In this respect, Drusilla is superior to Libra, which
leaves what little type checking it does until execution time. On the other
hand, Libra allows operators to be over-loaded, so that, for example, the
programmer can extend the built-in `+' operator, which applies only to
integers, to apply to vectors and matrices. This kind of overloading is not
possible with Milner type checking, because the types of operands and results
are deduced from the types of operators, which therefore have to be fixed.
The `Z' specification language [8] has been another influence on Libra. Libra
includes many constructs found in `Z', although it is certainly not an attempt
to turn `Z' into a programming language. Rather, it is a language into which
it should be easy to convert a `Z' specification. Alternatively, Libra is
itself sufficiently close to mathematical notation that it can serve as its own
specification language. A specification can be written in Libra. Although it
might not be an executable program, it should be possible to turn it into one
by using the ordinary theorems of discrete mathematics.
The immediate precursors to Libra are a language proposal [3] written by the
author--while in ignorance of Drusilla's existence--and Hydra [9], a partial
implementation of the proposal. Like Drusilla, the language proposal addressed
the defects of MacLennan's RPL, but adopted different solutions from Drusilla
in almost every case. This is probably because Drusilla has a functional
programming background, whereas Libra is based on logic programming. Drusilla
is implemented using Miranda; Libra is implemented using Prolog. Both Drusilla
and Libra attempt to meld the functional and logic styles, but are towards
opposite ends of a spectrum. Hydra was influenced by Drusilla, incorporating a
similar type checking system, and revealed several problems inherent in the
original proposal. However, only a small subset of the proposed language was
implemented, and Hydra is not currently useable.
There are several differences between Libra and Drusilla. Some result from
the means of implementation. For example, Drusilla explores the alternative
results of relations by lazy evaluation of lists, whereas Libra uses a mixture
of lazy evaluation and backtracking. Other differences are experiments--if
Drusilla does it one way, Libra usually does it another. For example, as
already mentioned, Libra uses dynamic type checking.
An important difference between Drusilla and Libra is in the treatment of
iteration. Drusilla simulates iteration by recursion, which is to be expected
of a language based on functional programming; whereas Libra uses relational
closure. (The fact that closure is implemented internally by recursion is
irrelevant; what matters is the effect on the source program.) A typical Libra
program is free of recursive definitions, making its structure much more
transparent. (Recursion can simulate `go to' statements and, badly used, can
make a program just as hard to understand as they do.) Some consideration was
given to banning recursion in Libra altogether. It remains available--in case
of emergency.
Another difference is that Libra uses sets and pairs as its basic structuring
operations, whereas Drusilla uses lists and tuples. The author considers that
Libra's approach is much more appropriate to a relational language. Compared
with lists, sets have advantages and disadvantages. The elements of sets have
no particular order, which means that they can be dealt with in parallel, at
least in principle. (The current implementation uses back-tracking to simulate
parallelism.) A consequence is that operations on different set elements
cannot possibly interact with one another, simplifying understanding of the
program. A disadvantage of sets is that they are not as easily implemented as
lists are--although they have several efficient implementations, such as trees
and hash tables, that allow set membership to be tested with lower
computational complexity than is possible for testing membership of lists.
(Unfortunately, the current implementation of Libra is badly deficient in this
respect, because it represents sets as lists. Correcting this is a top
priority in a later version of Libra.) Lists exist in Libra, where they are
called sequences, but they are implemented as functions from the integers 1 to
N onto their terms. Given a tree or hash table implementation of sets,
efficient access would be possible to any term of a sequence.
A Libra feature worthy of note is reduction, which enables the elements of a
set or sequence to be combined under a suitable operation. For example,
reduction under `+' forms the total of a set of numbers. A special case of
reduction is to choose an arbitrary member of a set. This is useful in
problems that admit of many solutions, but where only one solution is needed.
(It also subsumes one of the functions of Prolog's `cut' operator.)
Libra is intended to encourage the development of relational programming
languages by allowing further experiment. To this end, the current interpreter
is designed for simplicity and ease of extension and modification. Even so, it
incorporates much useful knowledge, being essentially an expert system for
evaluating relational expressions.
THE SYNTAX of Libra is based on that of Prolog. Given suitable operator
precedences and associativities, a Libra program is simply a Prolog term. Its
lowest level building blocks are variables, integers, atoms, delimiters and
operators. Variables and integers are defined exactly as in Prolog. Variables
must begin with a capital letter or underscore character, and may comprise
letters, underscores or digits.
There are two kinds of atoms. Those that begin with capital letters, such as
"'Warm'" are called `literals', and simply stand for themselves. (They must be
enclosed in quotes to distinguish them from variables.) They are used to give
meaningful names to values. The only pre-defined literals are the boolean
values 'True' and 'False', but the programmer can invent new ones, such as
'Cold', 'Warm' and 'Hot'.
All other atoms are called `names'. There are three classes of names: strings
beginning with a small letter and comprising letters, underscores or digits,
strings of the symbols +-*/\^<>=:.?@#$&,
and any string enclosed in quotes that is not a literal. A name always stands
for an expression defined by a `let' statement. For example, `transition' was
defined earlier as a name that stands for a set of four pairs. The basic rule
is that where a name appears, it can always be replaced by the thing it stands
for.
Because Libra is based on relations rather than functions, it seems reasonable
to let a name map to more than one value if desired. This lets the meanings of
names be overloaded; for example, the built-in `+' relation applies only to
integers, but the programmer may easily define it to apply to vectors and
matrices as well.
Operators are a notational convenience. Almost any name can be defined as an
operator. The arrangements for doing this are similar to those of Prolog. The
same name may be both a unary and a binary operator. The expression `1+2' is
indistinguishable from the expression `+(1,2)'. Both are interpreted as
relational application, ie., as `(1,2)!(+)'. It may seem strange to treat `+'
as a relation rather than a function, but a function is merely a special case
of a relation.
Libra is activated within a Prolog environment by consulting the file `libra'.
Libra declares many operators that are not found in Prolog itself, so that
although the programmer is really using Prolog, the Libra programming
environment seems quite different.
The `find' command evaluates an expression:
find 1+2.
3
If
there are several results, `find' displays them all:
find @{0;2}+ @{0;1}.
0
1
2
3
If
the expression has no value because it contains an inapplicable operator,
nothing is displayed:
find "abc"+2.
The
`?' command is an alternative to `find':
?1+2.
3
Usually,
an expression will use definitions that have been created by an earlier `let'
statement:
let x->1.
let x->"abc".
let y->2.
let y->4.
?x+y.
3
5
The word `let' is optional:
x->1.
x->"abc".
y->2.
y->4.
?x+y.
3
5
A
`let' statement maps a name onto an expression. Often the expression is a
relation, but this is not a requirement. The same name may map onto several
expressions. If so, all possible substitutions are made.
It is possible to overload the name of any Libra relation, command or operator,
with one exception, `drop'. (This ensures that if overloading a name causes
confusion, the definition can be deleted.)
However, there is a penalty. Libra makes checks on the operands of built-in
relations, so that for example, if the operands of `+' are not integers, a
warning message results. However, if the programmer overloads the `+'
operator, Libra must turn off the warning, otherwise it would complain about
the calls on the programmer's own definition. (One might argue that it should
first test to see whether the new definition proved to be inapplicable--but in
some cases the new definition might be intended to be inapplicable!) As a
result, overloading Libra's built-in names is better left until after a program
has been debugged. (The converse of this is that the programmer can suppress
Libra's warning messages if desired.)
The `show' command displays the current definitions associated with a name:
show x.
x -> 1 .
x -> [97,98,99] .
If no name is specified, `show' displays all the current definitions:
show.
x -> 1 .
x -> [97,98,99] .
y -> 2 .
y -> 4 .
The
format displayed by `show' may not agree exactly with what was typed. Among
other things, variables are replaced by `A', `B', `C', and so on, and
parentheses are introduced to make operator precedences explicit:
add1->{X->X+1}.
show add1.
add1 -> {(A,(A + 1))} .
Definitions
remain in force until they are dropped:
drop x.
find x+y.
produces no output. The form:
drop.
drops all definitions and wipes the slate clean.
The definitions associated with a name can be dropped selectively using the
`edit' command:
x->1.
x->"abc".
edit x.
x -> 1 .
Keep it (+) or drop it (-)? +
x -> [97,98,99] .
Keep it (+) or drop it (-)? -
show x.
x -> 1 .
The
programmer may add new operators to the language. There are seven commands for
this purpose, depending on the properties of the operator required:
- fx non-associative unary prefix
- fy associative unary prefix
- xf non-associative unary postfix
- yf associative unary postfix
- xfx non-associative binary infix
- xfy right-associative binary infix
- yfx left-associative binary infix
(These conventions are the same as in
Prolog.) The same name may be declared as an operator twice: once as a unary
operator, and once as a binary operator. Attempting to declare a name as a
unary or a binary operator for the second time drops the existing definition.
(This restriction is imposed to make the parsing of Libra programs unambiguous,
and is inherited from Prolog. It does not prevent the same operator having
several overloaded meanings.
An operator's precedence can be specified by any expression:
<+ yf (binary_prec(*)+unary_prec(+))/2.
which
declares `<+' as an associative postfix operator with precedence midway
between that of the binary `*' and unary `+' operators. (The built-in
functions `unary_prec' and `binary_prec' are very special because their
operands are not interpreted. In any other context, an attempt is always made
to interpret names by replacing them with their definitions.) For portability
reasons, the programmer should use expressions that give operators relative
precedences, rather than absolute ones.
The `show' command displays operator definitions too:
<+ yf binary_prec(*)+unary_prec(+))/2.
show <+ .
<+ yf 450.
Programs
may be typed interactively, but it is usually easier to edit them as separate
files (e.g., using the Prolog editor), then load them with the `use' command:
use farmer.
Reading farmer ...
Done.
A
file used in this way is treated exactly as if the same commands had been
entered from the keyboard, and it may contain any sequence of commands
whatsoever.
Because Libra is a relational language, it allows names to have multiple
definitions. If a `use' file is corrected and `used' a second time, its old
(erroneous?) definitions will remain active until they are dropped. The
`reuse' command is equivalent to `drop' followed by `use'.
reuse farmer.
Reading farmer ...
Done.
(This
drops all definitions, not only those in the file `farmer'. In general, it
would be unwise to use the `reuse' command within a program file.)
The `dump' command is the converse of `use'; it writes all existing definitions
to a file. The text of the file is the same as would be displayed by `show'.
Since the variable names in the saved file are meaningless, this command is of
limited use.
Finally, `commands.' lists all Libra's commands, and `help X.' describes the
purpose of the command `X' in more detail.
LIBRA'S basic structuring operations are set formation and pair formation.
Sets are written as lists of elements between braces `{}', separated by
semicolons. The order in which the members are written is unimportant, and
duplicate elements are ignored, e.g, `{1;2;2;3}' and `{2;3;1}' both denote the
set containing the integers 1, 2, and 3. The members of sets may be structured,
e.g., `{(1,2);{1,2}}'. A set may contain members of different kinds, e.g.,
`{1;{};'True'}'.
A special shorthand is provided for ranges of integers: {M..N} denotes the set
containing the range of integers M to N inclusive:
find{3..5}.
{3;4;5}
In
contrast to `extensional' sets whose members are listed explicitly, it is also
possible to define an `intensional' set by a means of a pattern and a
predicate. For example:
{(X,Y):X<Y}
denotes
the set of all (X,Y) pairs such that X is less than Y, ie., it is the same as
the relation `<' itself. (In this context, `:' should be read as `such
that'.)
There is an important restriction on a set defined in this way. It is possible
to test an element for membership of it, but it is not possible to enumerate
its members. Apart from the task of generating all (X,Y) pairs such that
X<Y being never-ending, Libra does not have the knowledge to deduce what set
is implied by a pattern and a predicate.
Such sets are called `filters' to distinguish them from those whose members are
listed explicitly, which are called `generators'. Enumerating the members of a
set (using the `@' operator) is described as `using the set as a generator'.
Testing an element for membership of a set is described as `using the set as a
filter'. A generator can be used as a filter, but a filter can't be used as a
generator.
(These two modes of use sometimes have subtle consequences. In Libra, `AB',
the intersection of sets `A' and `B', denoted by `A meet B', is found by
generating the members of `A', then testing to see which are also members of
`B'. Thus `A' must be a generator, but `B' may be either a generator or a
filter. Although `AB' and `BA' denote the same mathematical object, `A meet B'
and `B meet A' denote different Libra programs. Consequently, if a
specification called for the value of `BA', it might be necessary to first
transform it to `AB' to make the program workable.
Filters are often used in a `generate and test' strategy where it is desired to
test given values for set membership. Thus, by defining the set of odd numbers:
odd -> {X:X mod 2=1}.
the command:
? {1..100} meet odd.
{1;3;5;...;99}
yields the set of all odd numbers between 1 and 100.
Ordered pairs are defined by an argument and a value. The pair:
Arg->Value
or
Arg,Value
maps the input `Arg' onto the output `Value'.
Pairs are formed by the infix operators `,' and `->',
which differ only in that `->'
binds its operands more loosely than `,'. Having the choice of two operators
often saves using parentheses. Within Libra however, `->' and `,' are
treated identically.
It is sometimes possible to treat pairs as tuples, written as a list of terms
within parentheses separated by commas. Because `,' is a right-associative
binary operator, the tuple `(1,2,3)' is indistinguishable from `(1,(2,3))', `(1->2,3)' or `(1->2->3)'.
However, all four are distinct from `(1,2->3)', ie., `((1,2),3)'.
There is no such thing as the empty tuple; `()' is a syntax error. A 1-tuple
is indistinguishable from its only element, ie., `(1)' is the same as `1'.
However, when operators are used as operands, it is sometimes necessary to
enclose them in parentheses, e.g. `(+)', to ensure correct parsing.
Binary relations are merely sets of pairs. For example:
swap->{X,Y->Y,X}.
Defines
a relation that swaps a pair of values. It is possible to use this relation as
a filter to test if one pair is the reverse of another, for example:
? ((1,2),(2,1))?swap.
'True'
It
is not possible to use `swap' as a generator. However, it allows a third mode
of use: as a relator [10, 11] or `constructor'. `Swap' can be applied to the
pair (1,2) to construct the pair (2,1) as follows:
? (1,2)!swap.
(2,1)
It is possible to write what appears to be a ternary relation, e.g:
ternary->{X,Y,Z:X<Y<Z}.
But the associativity of the comma is such that this is identical to:
ternary->{X,(Y,Z):X<Y<Z}.
which is really a binary relation. This means that ternary, quaternary relations and
so on do not exist in Libra, although it is usually harmless to pretend that
they do. Since the second term in `ternary' introduces new variables, it is a
filter.
A relation may be non-deterministic, in that an argument in its domain may map
onto more than one value in its codomain, e.g.,
transition ->
{'Cold','Warm'; 'Warm','Hot'; 'Hot','Warm'; 'Warm','Cold'}.
? 'Warm'!transition.
'Cold'
'Hot'
The pairs comprising a relation may have different types, e.g.:
staff->{
1001->{'Name',"John Citizen";'Age',25};
1002->{'Name',"Jane Doe";'Age',40}
}.
is a set of two records.
? 1001!staff.
{'Name',"John Citizen";'Age',25}
? 'Name'!(1001!staff).
"John Citizen"
However,
it is debatable whether records have a useful place in a relational language.
An alternative is to define two simpler relations, as follows:
name->{1001->"John Citizen"; 1002->"Jane Doe"}.
age ->{1001->25; 1002->40}.
There is a hierarchy of relations in Libra: `generators', `constructors', and
`filters'. As we have just seen, a constructor may be used as a constructor or
used as a filter. A generator may be used as a generator, a constructor or a
filter. A filter may be used only as a filter. The distinction is made as
follows. A generator contains no variables, but constructors and filters do.
In a constructor, only the first term of a pair may introduce new variable
names, which may be reused in the second term. A filter contains a pair whose
second term introduces new variables. If a relation contains several pairs,
its lowest ranking pair determines the rank of the relation.
(This classification is a little simplistic. The terms of relations may
themselves be sets or relations defined in terms of variables. For the
purposes of classification, variables defined within these sets or relations
may be ignored, and the sets or relations treated as constants. Thus:
let switch ->{1->{X:X>0}; 2->0}.
find 1!switch.
{(A : (A > 0))}
find 2!switch.
0
defines
a relation that pairs 1 with a filter but pairs 2 with 0. It is a generator.
However, a relation may not yield an unbound variable except as part of a set
definition. For example:
switch ->{1->X; 2->0}.
is
dangerous. The expression `switch(1)' would yield `X', which is considered an
error.
A frequent way to define relations is by a series of cases. For example:
max->{X,Y->X:X>Y; X,Y->Y:Y>X}.
defines
a relation that finds the larger of a pair of values, if there is one. In such
a context, `:' is best read as `if and only if'.
There are no parameters associated with the name of a relation; each pair in
the definition carries its own argument pattern. Thus the previous definition
of `max' could also be written:
max -> {X,Y->X:X>Y; A,B->B:B>A}
In
addition to the means of definition given so far, which are simply those for
sets, constructors also allow the second term of a pair to be defined by an
expression. For example:
succ->{X->X+1}.
defines
`succ' as a relation that adds one to its argument. It may be used as a
constructor or a filter. For example:
? 3!succ.
4
? (5,6)?succ.
'True'
Variables
in constructors and filters are unified with input arguments (in the Prolog
sense). To match a pattern of the form `(X,Y)', the argument must also be a
pair. However, the triple `(1,2,3)' will unify with (X,Y) successfully, giving
X=1, Y=(2,3), because `(1,2,3)' is really `(1,(2,3))'.
A constructor may not introduce new variables on its right hand side. For
example:
confused->{X,Y->X+Y,Z}.
defines
`confused' as a relation that, given two pairs, applies if the first term of
the second pair equals the sum of the terms in the first pair. This is not a
constructor, because there is no way to derive Z from X and Y. It is a filter,
and must be written as one:
confused->{(X,Y),(Sum,Z):Sum=X+Y}.
(This
restriction results from the way Libra does pattern matching, but seems to be
harmless.)
Although `->' and `,' have the same mathematical meaning, there is an
important difference between them. Each term of a Libra generator must be
written in the form:
pattern -> expression
or
the form:
pattern -> expression : predicate
where
the `->' cannot, in general, be replaced by a comma. The pattern may
consist only of variables, perhaps structured as pairs or tuples. An
expression to the right of `->'
or a predicate may be general expressions--but should not introduce new
variables. Thus, although:
confused->{X,Y->Sum,Z:Sum=X+Y}.
has
the same meaning as the preceding definition of `confused', its definition will
cause two warnings, because `Sum' and `Z' first appear to the right of `->'.
Conversely, the definition:
confused->{(X,Y),(X+Y,Z)}.
will
cause a warning, because `X+Y' is not to the right of `->'.
`Sequences' are special relations whose arguments occupy the range 1..n,
for some n>=0. Sequences are useful enough to deserve a special
shorthand. They may be written as lists of terms separated by commas, and
enclosed in square brackets; for example `[97,98,99]' denotes the relation
{1,97; 2,98; 3,99}. `Strings' are sequences whose values are characters, ie.,
small integers. A special notation is provided for strings of characters;
`[97, 98, 99]' may be written more conveniently as `"abc"'.
The empty sequence is denoted by `[]'. It is the same object as the empty set,
denoted by `{}'.
There is a special sequence notation similar to the range notation for sets.
The expression `[M..N]' denotes the sequence whose first term is the integer M
and whose last term is the integer N. So that:
[3..5]
denotes
the relation:
{1->3;2->4;3->5}
Such
a sequence is called a `shifter', as it is often used to shift or select the
terms of another sequence. For example:
? [3..5] o "abcdefg".
[99,100,101]
Sequences,
sets, and tuples may be nested as required, e.g., `{(1,2);{3;4};[5,(6,7)]}'
denotes the set containing the pair `(1,2)', the set `{3;4}', and the sequence
`[5,(6,7)]' of length 2.
It is important to realise that relations are not recurrence formulae. Asking
for the values of the relation:
? @{0->1; X->X+1}.
does
not enumerate the natural numbers, but yields two values; `(0,1)' and a
structure representing `X->X+1'.
(Such a structure does not mean anything outside set braces, and causes a
warning.)
A FUNDAMENTAL difference between an algebra and a programming language is the
idea of `application'. Applying a relation to an input enumerates outputs:
? 3!{X->X-1;X->X+1}.
2
4
If
a relation is applied to an argument that fails to unify with any of its
argument patterns, it generates no output, and is said to be `inapplicable'.
Even if unification occurs, a relation may be inapplicable because of a
predicate introduced by `:'. For example, the relation `max' given earlier is
inapplicable to (1,1), or any other pair of equal values, and therefore yields
no output. A more useful definition of `max' is:
max -> {X,Y->X:X>=Y; X,Y->Y:X=<Y}.
which,
if presented with two equal values would yield the same output twice. In
practice, this would almost certainly be harmless, although it would usually
waste time.
A better definition of `max' is:
max -> {X,Y->X:X>=Y; X,Y->Y:X<Y}.
which
is how the built-in `max' operator is defined.
There are three ways of writing a relational application. We may use the apply
operator (`!'):
? (1,3)!max.
3
or
it may be written as a call:
? max(1,3).
3
or,
if the relation is declared as an infix operator (as `max' actually is), we may
write:
? 1 max 3.
3
These
three ways of writing relational application have exactly the same effect, and
are merely syntactic conveniences. Depending on the context, one may seem more
appropriate than the others.
There are several other operators closely related to relational application.
The first is called `functional application' (`~'), and is written thus:
? max~(1,3).
3
Functional
application differs from relational application only when relational
application yields several values; functional application yields only one of
them, chosen arbitrarily. Among other things, it is useful when a problem
admits several solutions but any solution will do.
A set may be considered as a degenerate relation whose argument-value pairs
have values, but no arguments. In this way, the `for all' operator, `@', may
be considered as a special form of relational application, applied to an empty
argument. Thus, generating the elements of a set is analogous to applying a
relation. (The reader may find this analogy either helpful or confusing. It
is not important.)
The `@' operator has a `functional' form, called `i', (which can be read as
`one of'). So that:
? i{1;2;3}.
1
will
choose 1, 2 or 3, although it is not possible to say which in advance. It is
useful when it is desired to force elements of a set to be considered
sequentially rather than independently. This can be done by choosing an
element from a set, removing it, choosing an element from what remains, and so
on, until the empty set is reached. (However, this style of programming may
indicate that the programmer is thinking too procedurally.)
An alternative way to look at a set is as a relation whose argument-value pairs
have arguments, but empty values. Thus, given an argument, an empty output
results. Libra does not deal in empty results. A similar effect is provided
by the membership operator, `?', which returns a boolean result.
Membership testing never yields more than one result, even if there is more
than one way an argument can be counted as a member of the set. For example:
? 10?{X:X mod 2=0;X:X mod 5=0}.
'True'
(The
membership operator is infix; the `find' operator is prefix.)
A membership test yields 'False' only if there is no way to count the element
as a member of the set. (Prolog typically handles membership testing by
distinguishing success from failure. A membership test in Libra results in one
of three outcomes: 'True' or 'False', or it may be inapplicable, e.g., when the
second operand does not define a set.)
There is an asymmetry between the evaluation of the two operands of the
application operators. The argument is always fully evaluated, but the
relation is evaluated lazily--only those values matching the argument are
evaluated. For example, given the definition:
factorial -> {0->1; N->N*factorial(N-1):N>0}.
and the query:
? factorial(2*3-6).
1
the expression `2*3-6' is fully evaluated to yield `0', which then unifies with `0'
in the `factorial' relation. The argument also unifies with `N' in the second
term, but fails the predicate `N>0'. Thus the expression `N*factorial(N-1)'
is not evaluated. This illustrates an important point. A program typically
defines a complex relation--usually infinite, in that it can be applied to an
unbounded set of different arguments. If the whole program had to be evaluated
(ie., its output were pre-computed for every possible input) before it could
be applied to an argument, Libra would not be a practical programming language.
Finally, what happens if a relation cannot be applied to its argument? It
simply yields no result. (In Prolog terms, it fails.) This is considered a
normal event; there is no run-time diagnostic. On the other hand, the relation
from names to the expressions they define is special; reference to an undefined
name causes an error.
EXPRESSIONS may be used to specify the results of constructors, and in
predicates following the `:' (such that) operator. Both uses of expressions
may use only the variables in the pattern that matches the argument.
A name may be defined to yield any form of expression, not just a relation. An
occurrence of a name is exactly equivalent to an occurrence of any expression
it defines. If a name is overloaded, each expression is substituted in turn,
and all the applicable expressions are used. For example, the two definitions:
neighbour->{X->X+1}.
and
neighbour->{X->X-1}.
are
exactly equivalent to the single definition:
neighbour->{X->X+1; X->X-1}.
Complex
expressions are made more readable by using operators. Appendix A lists all
Libra's built-in operators.
Apart from syntax, operators are simply relations. Unary operators take a
single argument; binary operators have pairs as arguments, e.g.,
(+)->{X,Y->(X,Y)\\(+):X?relations & Y?relations}.
overloads
the definition of binary `+' so that it also signifies vector and matrix
addition.
It is possible for the programmer to define new operators, using the `fx',
`fy', `xf', `yf', `xfx', `xfy', and `yfx' commands. (See Section 2.2.)
The semantics of operator use are the same as relational application, not
functional application. That is, the expression `X+Y' is equivalent to
`+(X,Y)' or `(X,Y)!(+)'. It is not equivalent to (+)~(X,Y). This means that:
find @{0;2}+ @{0;1}.
0
1
2
3
yields
all four results (in some order), and not just one of them as:
find (+)~(@{0;2},@{0;1}).
0
would
do.
The precedence of operators in Libra follows the following general plan:
- Relations and sets
- Arithmetic
- Comparison
- Booleans
- Structures
- Commands
where the most tightly binding operators are listed first,
and the most loosely binding operators are listed last. (Confusingly, as in
Prolog, tightly binding operators have low precedence values, and loosely
binding ones have high precedence values.)
The reason why relational operators were chosen to bind so tightly is that it
is often the case that a relational expression will yield a numerical value
(e.g., `X!r+X!s'). On the other hand, for a numerical expression to yield a
relational value (e.g. `{1,2;3,4}', it would always have to appear in brackets
or braces, making precedences irrelevant.
THE PREFIX `#' operator returns the number of elements of a set, and therefore
the number of pairs in a relation, and therefore, in turn, the length of a
sequence, e.g:
? #"abcd".
4
The
operator `?' tests for set membership, as in:
X?{1..8}
which
yields 'True' for all values of X in the range 1-8 and 'False' otherwise. The
complementary operator `\?' tests for non-membership:
X\?{1..8}
yields
'False' for all values of X in the range 1-8 and 'True' otherwise..
Testing membership of a sequence must be done with care, because sequences are
sets of pairs. For example,
? "a"?"abc".
'False'
? (1,97)?"abc".
'True'
? "a" subset "abc".
'True'
(In
the first example, the expression "a" is the relation `{(1,97)}', not the pair
`(1,97)'.)
Care is also needed with:
? @[1..3].
(1,1)
(2,2)
(3,3)
which
does not generate the outputs 1-3 in sequence, but the outputs (1,1), (2,2) and
(3,3) in an arbitrary order.
The operator `join' forms the union of two sets; `A join B' denotes
the set whose members are either members of `A' or members of `B' (AB). If an
element is a member of both A and B, it appears only once in their union. For
example:
? {1;2}join{2;3}.
{1;2;3}
The
intersection of two sets is formed by `meet'; `A meet B' denotes the set of the
elements of `A' that are also elements of `B'. For example:
? {1;2}meet{2;3}.
{2}
The
operator `omit' forms the asymmetric difference of two sets; `A omit B' denotes
all the elements of `A' except those that are also elements of `B' (A\B). For
example:
? {1;2}omit{2;3}.
{1}
The
cartesian product of two sets is formed by the operator (the letter) `x'; `A x
B' denotes the relation comprising all pairs whose first term is chosen from
`A' and whose second term is chosen from `B' (AB). For example:
? {1;2}x{2;3}.
{1,2; 1,3; 2,2; 2,3}
Finally,
`sets_of A' denotes the powerset of `A', the set whose elements are all the
possible subsets of the elements of `A', including the empty set and `A' itself
(2A). For example:
? sets_of {1;2;3}.
{{}; {1}; {2}; {3}; {1;2}; {1;3}; {2;3}; {1;2;3}}
A
set of N members has a powerset containing 2N elements, so that explicitly
forming a powerset can be costly. On the other hand:
? {2;3}?sets_of {1;2;3}.
'True'
is
efficient, because Libra simply tests whether {2;3} is a subset of {1;2;3}.
SINCE RELATIONS are sets, they may be combined using any of the set operators.
For example:
? 1!({1->2}join{1->3}).
2
3
whereas:
? 1!({1->2}meet{1->3}).
yields
nothing.
(The effect is as if the relation were evaluated, then applied, although the
reality is that the relation is applied lazily. For example, to evaluate `1!({1->2}
meet {1->3}',
Libra applies the first relation to `1' to construct `2', then rejects it using
the second relation as a filter.)
In addition to the set operators, there are several operators that apply only
to relations.
The composition `AB' of relations `A' and `B' is denoted by `A o B' (the letter
`o'). It consists of all pairs (X,Z) such that there is some Y such that (X,Y)
is in A, and (Y,Z) is in B. For example:
? {1,3; 2,3; 3,4} o {1,2; 3,4; 3,5}.
{1,4; 1,5; 2,4; 2,5}
(Although
the first relation can yield both 3 and 4, the second relation is not
applicable to 4.) Relational composition is an analogue of functional
composition. In Libra, `X!(g o f)' has the same effect as `f(g(X))' or
`(X!g)!f'.
(Functional composition is evaluated from right to left, but relational
composition is evaluated from left to right. Relational composition is also
similar to sequential execution in a procedural language. It is guaranteed
that the second relation is applied to the result of the first, ie., in
left-to-right order. This is important when input or output has to be
considered.)
Two operators, `else' and `but', provide the less familiar operations of
`relational extension' (similar to functional extension) and `over-ride' (AB).
If `A' is applicable to an input argument, `A else B' maps it to the outputs of
`A', and `B' is ignored. However, if `A' is inapplicable to the argument, `A
else B' maps it to the outputs of `B'. For example:
max->{X,Y->X:X>Y}else{X,Y->Y}.
maps
(X,Y) to Y only when it cannot map it to X.
The expression `A but B' is equivalent to `B else A'. It is often used when it
is desired to `update' part of a relation, ie., to derive a relation that is
like an existing one, except in some particular, e.g:
{X->X but[2!X,1!X]}
is
a relation that `updates' sequence `X' by swapping its first two elements. `X'
is a sequence, and therefore a relation. The expression `1!X' applies X to 1,
yielding the first element of `X'. Similarly, `2!X' applies X to 2, yielding
the second element of `X'. Since the sequence `[2!X,1!X]' is applicable to 1
or 2, the first two terms of X are over-ridden. Since `[2!X,1!X]' is not
applicable to 3, 4, and so on, the remaining terms of the sequence are the same
as those of X.
The inverse `A-1'
of relation `A' (denoted in Libra by `A^-1') is such that
(X,Y) is a pair in `A-1' if and only if (Y,X) is a pair in `A'. For example:
? "abc"^-1.
{(97,1); (98,2); (99,3)}
(The
`^-' operator reverses the pairs of a relation, not a string; the result isn't
"cba".)
Asking for the inverse of a relation is to ask what inputs can produce a given
output. For example, the relation:
square->{X->X*X}
has
as its inverse the relation finds the positive and negative square roots of the
perfect squares. Since Libra cannot solve equations, only generator relations
can be reversed.
The domain of relation R (dom R) is the set of all X such that (X,Y) is a
member of R. The codomain of relation R (codom R) is the set of all Y such
that (X,Y) is a member of R. For example:
? dom[5,6,7,8].
{1; 2; 3; 4}
? codom[5,6,7,8].
{5; 6; 7; 8}
There
are four `restriction' operators. Each modifies a relation by restricting its
arguments or values according to the members of a set.
The left restriction operator `<?' is such that `s<?r' denotes `r'
restricted to arguments in `s'. It comprises those pairs in `r' whose first
terms are members of `s'. For example:
? {1;3;5}<?[5,6,7,8].
{(1,5); (3,7)}
The
left anti-restriction operator `<\?' is such that `s<\?r' denotes `r'
restricted to arguments not in `s'. For example:
? {1;3;5}<\?[5,6,7,8].
{(2,6); (4,8)}
Left
restriction is often used to apply different relations to different classes of
input. For example, given the following definitions:
negative ->{X:X<0}.
positive ->{X:X>0}.
increment->{X->X+1}.
decrement->{X->X-1}.
then
the relation defined by:
seek_zero -> negative<?increment join positive<?decrement.
has
exactly the same effect as:
seek_zero ->{X->X+1:X<0; X->X-1:X>0}.
(The
restriction operators are more useful when the relations and sets involved are
more complex than in this example.)
A non-obvious but important use of left restriction occurs when it desired to
convert a constructor into a generator. Consider the relation:
add1->{X->X+1}.
It
is not possible to express this relation as a generator in Libra because it is
infinite--or too big to store, at any rate. But if it is known that it will be
used over a finite range of arguments, it can be evaluated, e.g:
? {1..5}<?add1.
{(1,2); (2,3); (3,4); (4,5); (5,6)}
The
right restriction operator `?>' is such that `r?>s' denotes `r'
restricted to values in `s'. It comprises those pairs in `r' whose second
terms are members of `s'. For example:
? [5,6,7,8]?>{1;3;5}.
{(1,5)}
The
right anti-restriction operator `\?>' is such that `r\?>s' denotes `r'
restricted to values not in `s'. For example:
? [5,6,7,8]\?>{1;3;5}.
{(2,6); (3,7); (4,8)}
The
right restriction operators are typically used in `generate and test'
situations. For example:
? {1..100}?>{X:X mod 2=1}.
{1; 3; 5;... ;99}
Actually,
the left and right restriction operators can be simulated by using the built-in
`id' relation, which yields an identity relation on a set. `S!id' or `id(S)'
yields a relation that, when applied to a member of `S', yields the same member
of `S' as its argument, but yields nothing when applied to a non-member of `S'.
For example:
? 5!{1;2;3}
? 1!id{1;2;3}
1
The
left restriction `S<?R' could therefore be written as `id(S) o R', and the
right restriction `R?>S' could be written as `R o id(S)'.
SEQUENCES are relations, so most set and relational operators can be applied to
them. However, they have a few operators of their own.
Sequences may be concatenated, using the operator `&&', so that
"ab"&&"cd" is the string "abcd".
It is simple to find the n-th element of a sequence, e.g., 2!"abcd" and
"abcd"~2 both yield 98. If a string has a name, it is also possible to use
functional notation:
str -> "abcd".
? str(2).
98
The
expression `[2..3]o"abcd"' simplifies to `"bc"'--illustrating how a substring
can be selected from a string. It is also straightforward to convert a single
character to and from its ASCII equivalent; 1!"a"=97, and [97]="a".
Given the argument `[5,6,7,8]', the built-in function `head' returns the term
`5', the function `tail' returns the sequence `[6,7,8]', the function `last'
returns the term `8', and the function `front' returns the sequence
`[5,6,7]'.
The built-in `sort' function takes a set as argument and yields a sequence
whose terms are the elements of the set in Prolog's standard order. For
example:
? {2;3;1}!sort.
[1,2,3]
The
`sort' function always sorts into ascending order. However, the `<-'
operation reverses a sequence, so that:
? ({2;3;1}!sort)<-.
[3,2,1]
is
a combination that sorts a set into descending order.
There is sometimes a need to invent identifiers for objects. The function
`unique' appends a unique serial number to a string, so that for example,
successive calls of `unique("Vertex_")' will return the strings "Vertex_1",
"Vertex_2", and so on.
RELATIONS that have the same argument and value types are called homogeneous.
They can be applied to their own outputs, and allow the concept of closure.
Closure is a very important concept in Libra, analogous to iteration or
search.
The infix operator `^+' is used to apply a relation to its own output a fixed
number of times; `R^+2' is equivalent to `R o R', `R^+3' is equivalent to
`R o R o R', and so on (R2,
R3, etc. in mathematical
notation). `R^+1' (R1)
is simply `R' itself, and `R^+0' (R0) yields its
argument as a result, provided `R' can be applied to it.
The postfix operator `^+' forms the transitive closure of a homogeneous
relation (R+ in mathematical notation).
It is defined by the infinite union:
R+ = R U (R o R) U (R o R o R) ...
For example:
?{1->2;2->3;3->4}^+ .
{(1,2);(1,3);(1,4);(2,3);(2,4);(3,4)}
Transitive
closure is best understood by thinking of the graph of the relation:

where each edge represents one of its pairs. The members of the transitive
closure are the paths in the graph. For example, since there is a path in the
graph from 1 to 4, the pair (1->4) is a member of the transitive closure.
There is an important restriction on the relations whose closure Libra can
compute: they must have acyclic graphs. Consider the cyclic graph:

Finding the transitive closure of the corresponding relation:
?{1->2;2->3;3->4;4->1}^+ .
will
result in an infinite computation because the graph contains an infinite number
of paths of unbounded length. Libra's implementation of transitive closure is
not sophisticated enough to recognise that in exploring longer and longer
paths, no new terms are added to the closure.
It is entirely possible to devise an algorithm that finds the transitive
closure of a cyclic relation. One way is, when exploring a path, to test
whether the next vertex to be added to the path is already on it. Another way
is to test whether a newly generated pair is already part of the result. Why
aren't these methods built into Libra?
There are three reasons. The first is that transitive closure is often used in
a simple way, where cycles obviously cannot occur:
?(0,1)!({M,N->N,M+N}^+).
(1,1)
(1,2)
(2,3)
(3,5)
(5,8)
...
(the
Fibonnaci series). Testing whether a cycle has occurred would be an
unnecessary overhead in such a situation.
The second reason is that the transitive closure of a relation can be extremely
large. Complex search problems can be modelled by closures. A typical search
has the form:
?start!(improve^+ ?>solution).
which
finds all solutions by repeatedly using `improve' to transform the initial
state `start' until it satisfies `solution'. Keeping track of the states
generated by `improve^+' might use up all available storage. An important part
of solving a search problem is often to find a clever way of keeping track of
previous states.
The third reason also relates to search problems: it is usually necessary to
record how a solution has been reached. This typically means storing the
sequence of moves chosen by `improve'. This sequence becomes part of the input
and output of `improve', so, unfortunately, it would defeat an automatic cycle
detector. On each iteration around a cycle, the sequence of moves would grow
longer. However, a built-in cycle detector could not distinguish the move
sequence from the rest of the state, and would consider each iteration to
generate a new state.
Despite these remarks, there is nothing to stop the programmer constructing
more sophisticated tools for finding transitive closures, and in fact Libra
uses one itself when it must evaluate the transitive closure of a relation in
extensional form (rather than applying the closure of an intensional form.)
Thus:
?{1,2;2,1}.
{1,1; 1,2; 2,1; 2,2}
works
correctly, without being trapped. Since Libra must store the whole result
anyway, it uses the partial result to check if progress is being made.
The `^+' operator always applies its relation at least once. The similar `^*'
operator forms the reflexive transitive closure of a relation:
R* = id U R U (R o R) U (R o R o R) ...
where `id' is the identity function on the domain of R.
In terms of the graph of the relation, the reflexive transitive closure
consists of all pairs of vertices linked by paths, including those of zero
length. For example:
? {1->2;2->3;3->4}^* .
{(1,1); (1,2); (1,3); (1,4); (2,2); (2,3); (2,4); (3,3); (3,4); (4,4)}
(The
difference between `^+' and `^*' can be likened to that between the `repeat'
loop and the `while' loop in a procedural language such as Pascal.)
The closure of a function can act like a simple loop:
factorials->{N->N,1}o{N,F->N-1,F*N:N>0}^*o{0,F->F}.
This
defines a factorial relation that uses iteration rather that recursion. It is
the composition of three relations. The first maps its argument N onto the
pair (N,1). The second is the reflexive transitive closure of a function that
forms a product. The third relation suppresses its first argument. Some care
was needed in this definition. Omitting the predicate in the second component
relation:
factorials->{N->N,1}o{N,F->N-1,F*N}^*o{0,F->F}.
would
compute the correct result, but then continue to seek further solutions
indefinitely. Omitting the zero from the pattern in the third component:
factorials->{N->N,1}o{N,F->N-1,F*N:N>0}^*o{N,F->F}.
Would
deliver the correct solution, but also output a trace of the steps taken to
find it.
Because a closure operator often needs the termination condition to be written
twice, a limit (^^) operator is provided for greater convenience.
Mathematically, the limit R^ of relation R
yields the pairs that are in the
reflexive transitive closure of R, but to whose second term R is applicable.
In terms of graphs, the limit operator finds all the paths that cannot be
extended further. For example:
? {1->2;2->3;3->4}^^ .
{(1,4); (2,4); (3,4); (4,4)}
The
factorial function may also be defined using the postfix limit operator `^^':
factorials->{N->N,1}o{N,F->N-1,F*N:N>0}^^o{N,F->F}.
It
is no longer necessary to restrict the output in the third term because the
second term is inapplicable to a pair of the form `(0,F)'. The limit operator
applied to a function is the closest thing in Libra to a `while' loop in a
procedural language.
As mentioned earlier, the inverse `A-1'
of relation `A' (denoted in Libra by
`A^-1') is such that (X,Y) is a pair in `A-1'
if and only if (Y,X) is a pair in
`A'. The `^-' operator can take any non-negative exponent. The expression
`A^-N' is directly equivalent to `(A^-1)^+N'; ie., it yields all paths of
length N in the reversed graph of the relation. For example:
? {1->2;2->3;3->4}^-2.
{(3,1); (4,2)}
(There
is no `^-' postfix closure operator analogous to `^+'.)
A HIGHER-ORDER relation is one whose values are other relations. The following
example defines a class of functions that add a number to their arguments:
add->{X->{Y->X+Y}}.
When
`add' is given the argument `1':
?add(1).
{(A,(1 + A)}
it
yields a function that adds 1 to its argument.
A similar kind of higher-order relation may be used to yield a set:
gtr->{X->{Y:Y>X}}.
Which
when given an argument:
?gtr(0).
{(A:(A > 0))}
yields
a filter that contains the positive integers:
Among other things, such relations can be combined using restriction operators,
e.g:
?2!gtr(0)<?add(1).
3
Higher-order
relations of this form allow `currying', commonly used in functional
programming.
`Set reduce' (>>->) is an important operator that applies its second
operand to combine all the values of its first operand. For example:
? @{1;2;3;4}>>->(+).
10
adds
together 1, 2, 3 and 4.
Each different second operand of `>>->' yields a different relation.
For example:
? @{1;2;3;4}>>->(*).
24
multiplies
together 1, 2, 3 and 4.
`Set reduce' offers the neatest way to define a factorial function:
factorial->{N->@{1..N}>>-> *}.
The
function maps its argument onto the set of integers from 1 to N, whose members
are then generated and multiplied together.
If the set being reduced contains one element, the `>>->' operator
returns the element itself:
?{1;2;3;4}>>->(+).
{1;2;3;4}
(Its
first operand has only one value--a set.)
If an attempt is made to reduce the empty set, the `>>->' operator is
inapplicable. The definition of `factorial' just given does not deal correctly
with an argument of zero, although this is easily corrected:
factorial ->{0->1; N->@{1..N}>>-> *}.
Reduction
operators have an important property: they first spawn, then gather together,
several threads into a single result. Reduction provides the only way in which
parallel threads can interact. In the preceding example, `@(1..N)' generates N
threads, which are reduced to one by `*'. The symbol for set reduction
emphasises this many-to-one aspect. The values `>>->'
gathers together are precisely those of its first operand.
The set reduction operator can be applied to a relation rather than a set. For
example:
unit_image ->{X,R->X!R o{Y->{Y}}>>->join}.
defines
the `unit image' function, which given an argument and a relation, finds the
set of all values yielded by applying the relation to the argument. For
example:
? unit_image(1,{1->2;1->3}).
{2; 3}
In
this example, the set of values to be reduced is generated by the `!' (apply)
operator. Each result obtained by applying R to X is made into a singleton set
{Y}, and the resulting sets are joined to give a single result.
In practice, reduction using the join of singleton sets is so common that there
is a unary form of the `>>->' for precisely this purpose. Using it,
the definition of `unit_image' can be given more concisely:
unit_image -> {X,R->X!R>>->}.
Libra
has a built-in `image' operator, such that `S image R' yields the set
of all the values that can be obtained by applying R to the members of S.
`Image' could be defined as follows:
image -> {(R,S)-> @S!R>>->}.
? {1;2}image{1,3;1,4;2,5;3,6}.
{3;4;5}
The
2nd operand of `reduce' should be a homogeneous, binary, commutative operator.
That is, it should be a relation of the form {(X,Y)->Z},
where X, Y and Z have the same type, and also:
op(X,Y) = op(Y,X),
and
op(op(X,Y),Z) = op(X,op(Y,Z)).
For
example, the expression:
@{1;2;3}>>->(-)
would
give unpredictable results, depending on the order in which pairs of terms were
subtracted.
The sequence reduction operator, `>>=>', operates on sequences in a
similar way. However:
? @[1,2,3,4]>>=>(-).
-2
is
guaranteed to yield `1-(2-(3-4))'. In the case of sequence reduction, the
operator does not need to be commutative.
One use of `>>=>' is to flatten a sequence:
?["abc","def","xyz"]>>=>(&&).
[97,98,99,100,101,102,120,121,122]
ie.,
"abcdefxyz".
Another potential use of a higher-order relation is as a `package', as follows:
stack -> {
'Push'-> {X,S->[X]&&S};
'Pop' -> {S->head(S),tail(S) };
'Top' -> {S->head(S)}
}.
where
'Push', 'Pop' and 'Top' name relations within the `stack' package. The
relations can be referred to as `stack('Push')', `stack('Pop')' and
`stack('Top').
Given two relations with the same domain and codomain, any binary operator can
be `amplified' using the `zip' operator, `\\'.
? ([1,2,3],[4,5,6])\\(+).
[5,7,9]
The
effect of this is to add the terms pairwise, similar to vector addition.
Here is a second example:
("abc","xyz")\\(->)
which
yields:
{(1,(97,120));(2,(98,121));(3,(99,122))}
The
`zip' operator may be defined as follows:
'\\'-> {(R,S),Op->@(dom R meet dom S)!{X->X,(X!R,X!S)!Op}}.
As a result, if an argument yields multiple values, all combinations are
enumerated.
({1->2;1->3},{1->5;1->7})\\(*).
{(1,10);(1,14);(1,15);(1,21)}
The
`\\' operator may be used to define new operators, or to overload existing ones:
'+'->{R1,R2->(R1,R2)\\(+)}.
defines
an operator that is capable of vector addition--among other things. Because of
the way `\\' and the built-in `+' operator are defined, there is no ambiguity
about which `+' applies in a given case; `\\' can only apply to relations, and
the built-in `+' operator only applies to integers. Another nice feature is
that vector `+' applies to matrices represented in either of two ways. If a
matrix is represented as a function from number pairs to values, the existing
definition works straightforwardly, because `\\' makes no assumptions about the
domains of the relations involved. If a matrix is represented as a vector of
vectors, `+' works by calling itself recursively.
The precedences and associativities of all the set and relational operators are
as follows:
50 yf ^*,^+,<-,^^
50 fy sets_of,seqs_of,dom,codom
50 yfx ^+,^-
100 yfx meet,o,x,<?,?>,<\?,\?>
125 yfx &&,join,omit,else,but
150 yfx !,~,\\
150 fy @,i
175 yfx >>->,>>=>,image
175 yf >>->
175 fx #
CURRENTLY, only integer arithmetic is supported. The arithmetic operators are
`+' (addition), `-' (subtraction), `*' (multiplication), `/' (division), `^'
(exponentiation), `>>' (right shift), `<<' (left shift), `/\'
(bitwise and) `\/' (bitwise or), and `mod' (remainder after division), all with
their usual Prolog meanings. In addition, `X min Y' yields the
smaller of X and Y; and `X max Y' yields the greater of X and Y.
Arithmetic expressions are formed in the usual way. The following relation
defines the successor of an integer:
succ->{X->X+1}.
Arithmetic
expressions may include functional applications, of the form `fn~(Args)'. The
following is yet another definition of a factorial function:
factorial-> {0->1; N->N*factorial~(N-1):N>0}.
Since
all functions are relations, relational application is an alternative to
functional application:
factorial-> {0->1; N->(N-1)!factorial*N:N>0}.
or
factorial-> {0->1; N->N*factorial(N-1):N>0}.
The
precedences and associativities of the arithmetic operators are as follows:
200 xfy ^
300 yfx mod
400 yfx *,/,<<,>>
500 fx -,+
500 yfx +,-,/\,\/
600 yfx max,min
THE comparison operators `=', `<', `>', `=<', `>=', `\=' may be
used to compare arithmetic expressions in the usual way. In addition, `\<'
and `\>' are synonyms for `>=' and `=<' respectively.
The same comparison operators may also be used to compare strings in
lexicographical order, ie., if A and B are strings, `A<B' is true if A
would precede B in a dictionary. Strings are special cases of vectors, so
vectors (and vectors of vectors) may be compared similarly. When these
operators are applied to other sets or relations, their effects are
unpredictable.
Sets may be compared using the following operators:
Operator Meaning
A equal B A and B have the same members.
A unequal B A and B don't have the same members.
A subset B Every member of A is a member of B.
A inside B Every member of A is a member of B,
but some member of B is not a member of A.
A encloses B Equivalent to `B inside A'.
A includes B Equivalent to `B subset A'.
A disjoint B No member of A is a member of B.
(Since sequences are relations and therefore sets, it is important to
distinguish `=<' from `subset', for example.)
`A equal B' is true if and only if `A' and `B' represent the same set. `A
unequal B' is true if and only if `A' and `B' represent different sets. `A
subset B' is true if and only if every member of `A' is also a member of `B'.
`A inside B' is true if and only if every member of `A' is also a member of `B'
and `B' has at least one member that is not a member of `A'. `A includes B' is
a syntactic variant of `B subset A'. `A encloses B' is a syntactic variant of
`B inside A'. Finally, `A disjoint B' is true if and only if `A' and `B' share
no common members.
All the comparison operators yield the boolean values 'True' and 'False'.
Sets or relations defined using variables are considered equal only if they are
isomorphic; that is, if there is a one-one correspondence between their
variable names. For example `{X->X+2}' and `{Y->Y+2}' are considered
equal, but neither is equal to `{X->X+1+1'}. Although all three clearly
define the same function in this case, a computable test for equivalence is
impossible in general, and Libra does not attempt one.
The set membership operators `?' and `\?' discussed earlier are also classed as
comparison operators.
The precedences and the associativities of the comparison operators are as
follows:
700 xfy =,\=,<,>,>=,=<,\>,\<,
equal,subset,inside,encloses,includes,disjoint
700 xfx ?,\?,..
(The range operator `..' is not a comparison operator,
but has the same precedence.)
The boolean operators are `&' (and), `v' (or), and `\' (not), and `=>'
(implies), and `<=>' (logically equivalent).
The expression `P&Q' yields 'True' only when both `P' and `Q' are 'True',
otherwise it yields 'False'. The expression `P v Q' yields 'True' if either
`P' or `Q' are 'True'; it yields 'False' when both `P' and `Q' are 'False'.
The expression `\P' yields 'False' when `P' is 'True' and 'True' when `P' is
'False'.
The expression `P=>Q' yields 'False' when `P' is 'True' and `Q' is 'False',
and yields 'True' otherwise. The expression `P<=>Q' yields 'True' if and
only if `P' and `Q' have the same truth value.
The boolean variables 'True' and 'False' are distinct from the concepts
`applicable' and `inapplicable'. A relation may be applicable to 'False', or
yield the output 'False'. A relation may be inapplicable to 'True'. An
inapplicable relation yields no result, not 'False'.
In the construction:
{X->X+1:X<8}
the
predicate `X<8' is inapplicable if the argument `X' is not an integer. If
the predicate of a relation is inapplicable, the whole relation is
inapplicable.
(In this context, the logical complement of `X<8' is not simply `X\>8',
but actually `X\?integers v X\>8'.)
Libra allows a shorthand often used in mathematical notation: the expression
`X<Y<Z' is interpreted as `X<Y&Y<Z'. This example can be
generalised to any number of terms and any choice of comparison operators.
Boolean expressions are evaluated from left to right, until the result is
known.
The precedences and associativities of the boolean operators are as follows:
900 fy \
925 xfy &
950 xfy v
975 xfx =>,<=>
The `succeeds' relation is a `hook' to the
underlying Prolog system, which may allow access to special operating system
features. Its argument is a Prolog goal, and its value is `true' is the goal
succeeds, and `false' otherwise. It may be used in the following way:
unifies -> {X,Y:succeeds(X=Y)}. ? unifies(1)
1
? unifies(1,2)
false
The precedences and associativities of the structure operators are as follows:
1000 xfy ,
1050 xfy ->
1075 xfx :
1100 xfy ;
The commands bind very loosely. In particular, `unary_prec'
and `binary_prec' bind looser than any potential operand. Their precedences
and associativities are as follows:
1150 fx let, find, ?, edit, use, reuse, show, dump, drop
1150 xfx fx, fy, xf, yf, xfx, xfy, yfx
1175 fy help
1200 fx unary_prec, binary_prec
BECAUSE operators may be overloaded, it is sometimes necessary to test the type
of the argument of a relation. Types are simply sets, and any set expression
can be used as a type. Types may be checked using the set membership operator,
e.g., `X?integers' is true if `X' is an integer--or a character.
Some useful sets are built-in:
- integers
- naturals
- positives
- characters
- booleans
- literals
- sets
- relations
- sequences
- strings
- any
A few of these sets are defined as generators, although their
main use is as filters. For example, `X?integers' yields 'True' if and only if
`X' is an integer. On the other hand `@integers' will (start to) generate all
the integers.
`Naturals' is the set of non-negative integers, and `positives' is the set of
positive integers. `Characters' is the set of integers from 0 to 255. These
are all generators, as is `booleans', which generates the set `{'True';
'False'}'.
The remaining sets are filters. The set `literals' includes all atoms
beginning with a capital letter--including 'True' and 'False'. The set `sets'
contains sets of all kinds, including relations, sequences and strings.
`Relations' contains only those sets that consist of ordered pairs.
`Sequences' are functions whose arguments range from 1 to N. `Strings' are
sequences whose values are characters. `Any' denotes the universal set.
`X?any' is always 'True'.
Because any valid set expression may be used to define a type, some of the
built-in sets are related, for example:
sets -> sets_of any.
relations -> sets_of (any x any).
It
is also possible to define new types:
integer_pairs -> integers x integers.
Finally,
the set `grounded' comprises all expressions that are free of variables. This
is useful because some operators can only be applied to grounded expressions,
for example, it is only possible to evaluate the inverse of a relation if it is
in extensional form. The set `symbolic' is the complement of `grounded'.
IT IS IMPORTANT to understand how a Libra program is executed, for two reasons:
to know what ranks of relation (generator, constructor or filter) allow
execution to be possible, and to be able to make some estimate of program
complexity.
A Libra program is best thought of as a collection of threads of control. A
thread of control carries an argument with it. When a relation is applied to
the argument, each result generates a new thread of control. In the current
interpreter each result thread is explored in turn by back-tracking, but it is
better to consider that they are all executed in parallel; there is no way for
the program to communicate between the threads, and the order in which they
will be executed is unpredictable anyway. A second way in which several
threads can arise is by a thread invoking the `@' (for all) operator, which
generates a thread for each member of a set. Closure operators apply a
relation to its own results, and can multiply the number of threads
exponentially.
What mechanisms remove threads? The simplest is when a relation is
inapplicable to its argument. A thread reaches the relation, but no result
emerges. This is typical of a `generate and test' strategy, or pruning during
a tree search. A similar effect is achieved by the restriction operators,
which kill off threads according to whether they are members of a given set or
not.
Reduction first multiplies threads and then reduces their number to one, or
even zero. When a thread of control reaches a reduction operator, it generates
a thread for each value of its first operand. The operand can either simply
generate a set, or it can be a multi-valued relation applied to an argument.
These threads are generated within an envelope that allows them to be collected
together again and combined using the second operand of the closure operator.
(The current implementation relies on Prolog's `findall' predicate.) Thus, one
thread emerges to match the one that reached the construct--or, if the
collection proves empty, none emerges at all.
Functional application (f~X) is a special case of reduction, where the
combining operation has the form {(X,Y)->X}, ie., one result is chosen
arbitrarily. In the back-tracking interpreter this is implemented by making a
`cut' as soon as any thread succeeds. In a parallel implementation it could
perhaps be done by selecting the first thread to complete, then killing off the
slower threads.
An important aspect of control is ensuring that a program is not trapped in a
loop. This can happen in one of two ways: through the use of recursion, or
through using a closure operator. Recursion is discouraged in Libra (but not
forbidden). (The use of recursion may imply that the programmer is thinking
sequentially rather than relationally.) We have already pointed out some of
the pitfalls of the closure operator; the programmer must ensure that a closure
terminates without being trapped by a cycle. There is little that can be
assumed about the order of execution. Libra does not guarantee that
alternatives will be chosen in any particular order. For example, the
expression:
? 0!{X->X+1;X->X-1}^* .
is
capable of generating all the integers, but only in principle. Depth-first
search might generate 0,1,2,3... or 0,-1,-2,-3... .
A better strategy would be breadth-first search, generating the series
0,1,-1,2,-2,3,-3... . However, Libra does not promise to do either of
these things, but might alternately add and subtract one, cycling between the
values 0 and 1 indefinitely.
What does Libra promise about closure? Only this: a child result cannot be
generated until after its parent result has been generated. In the above
example, 3 is either the child of 2 or of 4, so 3 cannot be generated until
after 2 or 4 has been generated. (Since 4 cannot be generated until after 3 or
5 has been generated, it is easy to prove that 2 must be generated before 3.)
This sequential property derives from the composition operator. In the
expression `X!r o s', first `r' is applied to `X', then `s' is
applied to the results. This is guaranteed in any implementation of Libra.
Another programming consideration is the asymmetry between the treatment of the
operands of `apply' (!). The first operand is always evaluated as fully as
possible, but the second is always evaluated as little as possible. Normally,
the first operand will contain no variables, and is said to be `grounded'. If
it does contain variables, and cannot be reduced to a grounded form, it will
passed in `symbolic' form, ie., still containing variables, although it may
become partially simplified. This allows a constructor or filter to be passed
as an argument to a higher order relation. However, if a first operand can be
reduced to a grounded form, it always will be. This means that passing an
expression as a first operand to some other relation is one way to force its
evaluation.
The programmer should be aware that each time any relation is applied to an
argument its results are calculated afresh. If a complex relation is likely to
be applied to the same argument several times, it may be worthwhile evaluating
the relation first, and storing it as a set of argument-value pairs--provided
it is not too large. For example:
r->{0,0; 1,1; 2,0; 3,1; 4,0; 5,1}.
add1->{X->X+1}.
? @{0..5}!(r o add1).
1
2
1
2
1
2
causes
`add1' to be applied to 0 three times, and to 1 three times. This could be
avoided as follows:
? {0;1}<?add1!{X->@{0..5}!(r o X)}.
1
2
1
2
1
2
which
evaluates `{0;1}<?add1' to give {0,1;1,2}, which is then used as argument
`X' in the second relation. Although the results are the same, the expressions
`0+1' and `1+1' are only evaluated once each, then remembered. Although this
is not worthwhile in this example, it can become an efficient technique when a
complex relation is substituted for `+'.
There are several other ways of forcing evaluation. For example, finding the
inverse of a relation (R^-1), first forces the relation to be evaluated.
It is difficult to summarise the rules governing the ranks of relations needed
by different operators in different modes of use. For example, if `A meet B'
is used as a generator, `A' must be a generator, but `B' may be a filter.
However, if `A join B' is used as a generator, both `A' and `B' must be
generators. As a general rule, the first operand will need to have at least
the same rank as the mode of use, so that a generator is needed to generate a
set, a constructor is needed to construct a result, but only a filter is needed
to test membership. The second operand may need the same or a lower rank,
depending on the operator. If a relation has insufficient rank, an error will
be detected during program execution. As a rule of thumb, the programmer is
advised always to place the higher-ranking operator first, and hope for the
best. Currently, the only alternative is to consult the source program of the
interpreter.
A PROGRAM is a set of named definitions of relations. Such a set is itself a
relation--from names to expressions. In the current implementation of Libra,
all definitions are global in scope.
Variable names are local to the elements of a relation or set. For example, in
the definition:
let change->{X->X+1;X->X-1}.
the
two occurrences of X are independent. They are also independent in the
following construct:
find 1!(X->X+1}o{X->X+1}.
3
If
they were not, the argument `1' would not only bind the first relation to the
pair (1,1+1), but the second relation to (1,1+1) as well, so the second
relation would not be applicable to the intermediate result (ie., `2').
A relation or set that has another set as an element may cause a potential
ambiguity. For example, what should the following definition do?
let neighbours -> {X->{X-1;X+1}}.
The
intention seems clear. We would expect:
find neighbours(1).
{0; 2}
The
rule is adopted that a variable name has scope throughout the set element in
which it occurs, including any sets occurring within the element.
This means caution is needed in defining higher-order relations. For example,
the intention of the following example is to define a function that will
evaluate relation `R' over the domain `X', thus producing a generator:
eval->{X,R->{@X!{X->X,X!R}>>->}}.
However,
in the construction `{X->{X,X!R}}'
`X' is bound to the first argument of `eval', but the programmer's intention
was for `X' to be bound to an element of the first argument. Libra will detect
the problem and issue a warning. The problem is easily corrected:
eval->{S,R->{@S!{X->X,X!R}>>->}}.
The normal action of Libra is to `find' the values of an expression, displaying
them in Libra's input format. This is adequate for ad hoc problem solving and
program development, but it is not always user-friendly enough. One problem is
that character strings are displayed as sequences of numbers. Although control
is needed over input and output operations, they are a problem for Libra,
because they have no meaning within the binary relational algebra.
Consequently, input and output are mostly treated as side-effects of identity
functions, ie., functions that transparently copy their arguments to their
results but produce an action outside the program.
A good solution would be to treat files as relations [3]. An indexed file maps
an argument--which becomes its key--onto a tuple containing one or more
attributes. A sequential file is a sequence: e.g., a text file is a mapping
from integers to lines, ie., strings. Alternatively a file could contain a
sequence of Libra expressions in input format. The user of Libra can also be
treated as a relation, one that maps a question to a response. Unfortunately,
there is no easy way--at least that I can see--to implement these ideas
portably in Prolog. So for the present, Libra input-output is, frankly, a
kludge.
A second problem with input-output in Libra is that, in principle at least,
different threads of execution can proceed in parallel. This means that input
and output from different threads could become hopelessly inter-twined. For
example, two threads could attempt to ask the user a question, but both could
complete their output before either got the user's reply. How will the user
know which question to answer first? Even in a back-tracking implementation,
the order of execution of different threads is unpredictable.
The way this matter is currently resolved is that Libra currently allows only
one input and one output file to be in use at one time. They must be assigned
using `see' and `tell', in a similar way to Prolog--even to converse with the
user. If no output file has been assigned by `tell' no output is possible.
Likewise, if no input file has been assigned by `see' no input is possible.
Files are released using `seen' and `told'.
In a parallel implementation of Libra, an attempt to `see' an input file or
`tell' an output file when one is already is use would cause the requesting
thread to wait until the file was released. In the current back-tracking
implementation, such an attempt could only happen because another thread has
failed to release the file--and never will. This is a programming error. The
error is reported, and the attempt fails.
Input-output operations within a thread will occur in a predictable order if
they are linked by composition operators. The limit (^^) of a function (which
is equivalent to a series of compositions) will also generate an ordered
sequence of operations, similar to a while loop. Likewise, the construct `A
else B' guarantees to test the applicability of `A' before attempting to apply
`B'. Composition, limit and `else' are the only constructs that can will
safely guarantee the sequence of input-output operations.
If a thread that has acquired a file branches non-deterministically, Libra does
not forbid its branches to read input or write output. Although the order in
which branches will transfer data is unpredictable, this might not matter a
great deal, and may even be useful during debugging. However, it is not
recommended. It is usually better for each thread to acquire the file and
release it as needed, or to work within a single thread.
The `see' function takes a string as its argument, and yields an identity
function as its result. It has the side-effect of opening for input the file
whose name is given by the string. This odd-seeming specification has the
effect of making `see(X)' transparent. For example;
find 1!see("user").
1
makes
input from the user's keyboard possible. (The expression `see("user")' yields
the identity function, which is applied to `1'.)
Similarly, the `tell' function also takes a string as its argument, and yields
an identity function as its result. It has the side-effect of opening for
output the file whose name is given by the string. For example;
? 1!tell("user").
1
makes
output to the user's display possible.
The `speak' function is an identity function that has the two side-effects of
opening the user's keyboard for input and opening the user's display for
output.
The `seen' relation is an identity function that releases the current input
file. The `told' relation is an identity function that releases the current
output file. The `spoken' relation is an identity function that releases
user's keyboard and display.
The `write' function writes its argument (to the current output file) in input
format (the same format as `find'), and yields its argument as a result. For
example:
? "abc"!(speak o write o spoken).
[97,98,99][97,98,99]
(There
is no line-feed; `find' displays the result again.)
The `debug' function is similar to `write' but always sends its output to
`user', preceding it with `debug: ', and following it with a new line.
The `put' function expects a string for its argument, which it writes (to the
current output file) as a series of characters. It yields an identity function
as a result. For example:
? "abc"!(speak o put("The answer is ") o put o spoken).
The answer is abc{(A,A)}
(Since
`speak' is an identity function, and the first `put' yields an identity
function, the second `put' operates on the string "abc", yielding an identity
function that is copied by `spoken'.)
The `nl' function is an identity function that writes (to the current output
file) a new-line as a side-effect. For example:
find "abc"!(speak o put o nl o spoken).
abc
{(A,A)}
Because
the output of `find' can be a nuisance, the `null' function may be used to
suppress it. The `null' function yields a special result that cannot be
generated in any other way, which is recognised by `find', and ignored. For
example:
? "Type a number: "!(speak o put o nl o spoken o null).
Type a number:
yields
no visible result. (However, `null' is not inapplicable. This matters; e.g.,
if `null' is used within the first operand of an `else' operator.)
The relation `read' reads (from the current input file) a Libra expression
terminated by a period (`.'), possibly extending over several lines. Its
argument is ignored, and it yields the expression read. At the end of file,
`read' is inapplicable.
The relation `get' reads one line from the current input file. Its argument is
ignored, and it yields the line read as a string of characters. The last line
of the file may be terminated by either an end of line or an end of file.
There is no way for Libra to tell which. At the end of file, `get' is
inapplicable.
Input and output operations involving the user may be combined to form a
dialogue:
? "Type a number: "!(speak o put o get o spoken).
Type a number: -1
[45,49]
If
the argument of the conversion function `str_to_int' is a string representing
an integer, it returns the integer, otherwise it is inapplicable. For example:
?[45,49]!str_to_int.
-1
This
could obviously be used as follows:
? "Type a number: "!(speak o put o get o spoken o str_to_int).
Type a number: -1
-1
The
function `int_to_str' is the inverse of `str_to_int', which makes it useful for
output:
? -1!(int_to_str o speak o put o nl o spoken).
-1
[45,49]
This section explains the development of three example programs. Accordingly,
the programs are presented piecemeal. The full texts of the second and third
examples are given in Appendix B.
The first example program is just a `one-liner', which, like most one-liners,
is very tricky. Here it is:
copy_file -> {In,Out->[] ! see(In)
o tell(Out)
o (get o put o nl)^^
o told
o seen}.
The
intention of the definition is that a command of the form:
? copy_file("input","output").
{(A,A)}
will
copy the text file "input" to the file "output". (Setting the output to "user"
will display the file.)
Basically, the definition is a composition. Because it is a composition,
everything happens in sequence from left to right. First, `see(In)' assigns
the input channel to `In'. Second, `tell(Out)' assigns the output channel to
`Out', Third, a limit construct (^^) does the actual copying. Fourth, `told'
releases the output channel. Fifth and last, `seen' releases the input
channel.
What about the limit construct? This comprises three steps: `get' reads the
next line of input as a string, `put' writes it, and `nl' writes a new line.
Because of the limit operator, these three steps are repeated until they cannot
be applied any more. This happens when `get' attempts to read past the end of
file (`get' is inapplicable).
So much for the definition as a procedure. What about its behaviour as a
relation? Why does `find' display `{(A,A)}'? The expression `see(In)' yields
an identity function, as does `tell(Out)'. So the result of `[]!see(In) o
tell(Out)' is simply `[]'. The `get' relation yields a string irrespective of
its argument. Its result becomes the argument of the `put' function, which
yields an identity function as a result. Since `nl' is transparent, the `get o
put o nl' composition yields an identity function. The limit construct as a
whole yields the value of the last successful iteration, which is an identity
function. The `told' and `seen' relations are also transparent, so that is the
result of `copy_file' as a whole. An exception arises if the file is empty,
when the original argument to `see' is copied transparently.
It really doesn't matter what the argument of `see' is, as long as it has one.
To see why, consider the sub-expression `see(In) o tell(Out)'. This is
equivalent to `{X->X}o{X->X}', because each term yields an identity
function. Libra can apply this composition to any given argument. For
example, the first relation may be applied to `1', because `1' matches the
pattern `X'. The intermediate result is `1', and in a similar way, the second
relation also yields `1'. However, Libra is not smart enough to realise that
the composition of two identity functions is also an identity function--or even
that `{X->X}' is an identity function. It can only evaluate a composition
explicitly if the first relation is a generator. It therefore attempts to
treat `{X->X}' as a generator, which yields the term `(X,X)'. Because
`(X,X)' contains variables, but is not a set or relation, Libra recognises that
it is in trouble, and issues a warning message.
As the second example of program development we consider a simple planning
problem, of the kind that is easily solved in Prolog.
A farmer has with him a sack of corn, a chicken, and a rather vicious dog. He
reaches a river, which he must cross in a small boat. The boat has only space
enough for the farmer and one more thing. He must therefore ferry the corn,
chicken and dog from the left bank to the right bank of the river one at a
time. The problem is that he cannot leave the dog alone with the chicken, for
it will certainly eat it, nor can he trust the chicken alone with the corn.
How can he ferry them all across safely?
We may sketch a solution immediately. Starting with an initial state in which
the farmer, corn, chicken and dog are all on the left bank, the program must
choose a sequence of moves that result in all four being moved to the right
bank. This suggests the following:
solution -> initial_state ! safe_move^+ = final_state.
We
assume that `safe_move' is a relation that transforms a state into a new state,
such that the new state is `safe', ie., nothing gets eaten. `Safe_move' is a
relation rather than a function, because we expect that several moves are
possible from any given state. Notice how the closure operator will compose
all possible sequences of choices to implement a search--provided we are
careful to make sure it doesn't get trapped in a loop.
This solution has a fatal flaw: it will yield `True' if the problem has a
solution, or it will be inapplicable if it doesn't. To be useful, the program
should yield a plan showing how the job should be done. The plan could be
either a sequence of moves, or a sequence of states, or perhaps both. In the
solution given here, the plan will be a sequence of states, from which it is
easy to deduce the moves.
A second problem is that the closure operator allows the program to be trapped
in a cycle. For example, the farmer could ferry the chicken back and forth
across the river indefinitely. The way to avoid such cycles is to make sure
that each new state added to the plan is not already part of it. This is why
making the plan a sequence of states rather than a sequence of moves is the
better choice. A good solution should not pass through the same state twice,
but it might need to make the same move several times.
The solution should therefore have the form:
solution -> [initial_state] ! add_to_plan^+ ?>solved.
Here,
the plan initially consists of a single term: the initial state. We include
the initial state in the plan because we want the program to check that it
doesn't return to it. Since the test for completion is no longer a simple
equality, we use right restriction to make sure the finished plan is in the set
`solved'.
We may now elaborate `initial_state':
initial_state -> (everything,{}).
everything -> {'Farmer';'Dog';'Corn';'Chicken'}.
We
represent a state as a pair. The first member of the pair is the set of things
on the left bank of the river, and its second member is the set of things on
the right bank. We need to keep to track of the position of the farmer, but it
is not necessary to worry about the boat; where the farmer goes, the boat goes
too.
The problem is solved when the last term of the plan is the desired final state:
solved -> {Plan:last(Plan)=final_state}.
final_state -> ({},everything).
States
may be added to the plan using a generate and test strategy:
add_to_plan -> suggest o verify.
where
`suggest' first generates a possible new state, then `verify' ensures that it
is not already in the plan.
The argument of `suggest' is obviously the existing plan, and its output should
include the new state, but in addition it needs to copy the existing
plan--otherwise `verify' could not check the new state or add it to the plan.
suggest -> {Plan->Plan,last(Plan)!cross_river}.
This
definition extracts the existing latest state from the plan, and uses
`cross_river' (explained shortly) to generate new states.
`Verify' is straightforward, provided that we remember that the members of the
plan are not states, but (integer, state) pairs. The set of states already
visited is the codomain of the plan:
verify->{Plan,State->Plan&&[State]:State\?codom Plan}.
This
appends the sequence consisting of the new state to the existing plan. (The
`&&' operator joins two sequences, not a sequence and a term.)
We now must make a short digression to explain why we didn't write:
add_to_plan -> {Plan -> Plan && [last(Plan)!cross_river]
:last(Plan)!cross_river\?codom Plan}.
The
expression `last(Plan)!cross_river' generates a new state. We add it to the
plan, provided that it is a new one. Unfortunately, the expression appears
twice, and `cross_river' being a relation, there is no guarantee that it will
yield the same value in each place. (In fact, it typically won't.) Therefore
the state being added to the plan is not necessarily the one that proved to be
a new one. The correct two step approach given earlier ensures that the same
new state is used in both places.
The `cross_river' relation operates on states rather than on plans. Crossing
can occur from left to right or from right to left, which need different
treatment:
cross_river -> left_to_right join right_to_left.
(Since
the two cases are mutually exclusive, we could have equally well written `else'
in place of `join'.)
Crossing from left to right occurs only when the farmer is on the left bank,
which is why we check the precondition of `ferry_object' to ensure it is in the
set `farmer_on_left':
left_to_right -> farmer_on_left<?ferry_object\?>unsafe.
Similarly,
we use right anti-restriction to ensure the post-condition is not unsafe, e.g.,
leaving the dog alone with the chicken. (An alternative would have been a
right-restriction to `safe'.)
`Farmer_on_left' is a set, which may be defined as a filter, because it is
always given a known state:
farmer_on_left -> {Left,Right:'Farmer'?Left}.
(A
state is a (left bank, right bank) pair. We want to check that the farmer is a
member of the set of things on the left bank.)
Ferrying an object from left to right is a two step process consisting of first
choosing each object on the left bank in turn, then moving it from the left
bank to the right. As with `add_to_plan', a two step approach guarantees that
the object removed from the left bank is the same object added to the right
bank. Whatever object is chosen, the farmer must go with it:
ferry_object -> {Left,Right->Left,Right,@Left}
o {Left,Right,Choice-> Left omit{Choice}omit{'Farmer'},
Right join{Choice}join{'Farmer'}}.
A
significant point here is that since the farmer is on the left bank, the farmer
can be chosen to move. If this happens, the farmer is removed from the left
bank twice, and added to the right bank twice, but since we are dealing with
sets, this won't matter. When the farmer is the chosen object, this models his
crossing the river alone.
We now need to define the set of unsafe states. Assuming the right bank was
left in a safe state by the previous move and will certainly be safe once the
farmer reaches it, we only need to concern ourselves with the safety of the
left bank. There are three unsafe situations, when dog is left with the
chicken, when the chicken is left with the corn, or when all three are left
together. An interesting way to express this is as follows:
unsafe -> {Left,Right: Left includes{'Chicken';'Corn'}
v Left includes{'Dog';'Chicken'}}.
That
is to say, the unsafe states are those where the set of things on the left bank
includes the dog and the chicken or the chicken and the corn.
This completes the strategy for moving things from left to right. We now have
a choice, write an analogous `left_to_right' relation with left and right
interchanged, or get the program to exchange left and right:
swap -> {Left,Right->Right,Left}.
Armed
with which, the `right_to_left' relation is easy to write:
right_to_left -> swap o left_to_right o swap.
That
is the problem solved. However, the output generated by:
? solution.
does
not look especially elegant, and it would be far easier to read if each state
were displayed on a new line. We therefore modify `solution' as follows:
solution -> [initial_state] ! add_to_plan^+ ?>solved
o display_seq o null.
where
`null' is used to suppress the normal output of `find'.
The `display_seq' relation has to be written with care, since we have to be
sure of the order of execution. The only really safe operators are
composition, which guarantees left to right execution, and limit (^^) of a
function, which is similar to a loop in a procedural language:
display_seq -> speak
o ({X->head(X)!write o nl,tail(X)} o {H,T->T})^^
o nl o spoken.
`Display_seq'
begins by opening the "user" file. (`tell("user")' would have been just as good
as `speak' here), and finishes by displaying a new line and releasing the
"user" file. The interesting part is the limit construct. Its argument is the
sequence to be displayed. Its body is the composition of two relations. The
first relation generates a pair of values, the head and tail of the sequence.
`Write' and `nl' display the head of the sequence and a new line as a
side-effect. The second relation discards the head of the sequence, so the
overall effect of the composition is simply to find the tail of the sequence.
Because of the limit operator, the composition is applied until the sequence
becomes empty, whereupon the `head' and `tail' functions become inapplicable,
so the limit (the empty sequence) has been found.
Two warnings are in order here. It would not do at all to replace the `^^'
(limit) operator by the `^*' (closure) operator, because although the same
sequence of `write' operations would occur, the construct would yield the
resulting sequence as an output after every step. The result would be that the
"user" file would be released after the first step, and the second `write'
would fail. The second warning is that `display_seq' cannot be simplified as
follows:
display_seq -> speak o{X->head(X)!write o nl o tail(X)}^^ o nl o spoken.
This
looks plausible because it seems that the limit construct should yield
`tail(X)' as before. To see why this won't work, remember that `write' and
`nl' are transparent, so that the program (less output) is equivalent to:
display_seq -> {X->head(X)!tail(X)}^^ .
The
expression `tail(X)' yields a sequence--a relation from integers to pairs of
states. The program tries to apply it to `head(X)', which is a pair of states,
and finds that it is inapplicable.
The third example uses relations as data. The problem is, given a character
string, to display a list of the words it contains, and the number of times
each occurs in the text.
There are several subproblems here: to find the words in the text, to count
their frequencies, and to display the results prettily. It is easiest to
approach the solution by concentrating on the second sub-problem first.
Suppose that we have discovered the list of words in the text; e.g., given the
string `"to be or not to be"', we have already derived the sequence
`["to","be","or","not","to","be"]'. (It is important to form a sequence, not a
set--a set would contain each word only once.)
The inverse of the sequence looks like this:
{"be",2; "be",6; "not",4; "or",3; "to",1; "to",5}
The
frequency of a word, e.g., `be', is the size of its image, e.g., `{2;6}'. It
would therefore be useful to have a function that could find the size of the
image of an argument in a relation. This is easily constructed from Libra's
built-in `#' (size) and `image' functions:
image_size->{X,R-> #({X}image R)}.
To
create a frequency table, it is merely necessary to apply `image_size' to the
arguments in the domain of the inverted word list, and collect the results into
a set:
frequencies -> {WordList ->
WordList^-1 !{Inv-> @dom Inv !{X->{X,image_size(X,Inv)}}>>->}
}.
The
expression `WordList^-1' finds the inverse of `WordList', to which it applies a
reduction construct. The operand of `>>->'
generates each member of the domain of `WordList^-1'--which is the set of
words--and then maps each word to a set containing a pair of values: the word
itself, and the size of its image. The postfix `>>->'
operator collects all these unit sets into a single set.
? ["to","be","or","not","to","be"]!frequencies.
{"be",2; "not",1; "or",1; "to",2}
(Actually,
the output of `find' is not so pretty as this; "be" is displayed as `[98,101]'
for example. But for the reader's benefit, this problem will be overlooked.)
Having solved the nub of the problem, let us consider how to convert a string
into a list of words. Although we could scan the string from left to right
using a simple two-state machine to detect the boundaries between words, this
is an inherently sequential method. It is better--though more challenging--to
devise a parallel method.
Recalling that the composition:
? [10..12]o"to be or not to be".
"not"
will
extract a substring, to find a list of words, it is first necessary to find a
list of their positions. We are aiming to define `letter_positions' and
`word_positions' so that they work together as follows:
?"to be or not to be"!letter_positions.
{1;2;4;5;7;8;10;11;12;14;15;17;18}
?{1;2;4;5;7;8;10;11;12;14;15;17;18}!word_positions.
[[1,2],[4,5],[7,8],[10,11,12],[14,15],[17,18]]
The
method we adopt is to find a list of the positions of first letters of word,
and of the last letters of words, then thread them together.
How do we discover the initial and final positions of the letters? An initial
letter is one that does not have a letter on its left. A final letter is one
that does not have a letter on its right:
initial_positions -> {Posns->(@Posns!{X->X:X-1\?Posns}>>->)!sort}.
final_positions -> {Posns->(@Posns!{X->X:X+1\?Posns}>>->)!sort}.
where
`Posns' is the set of positions of all letters in the original text:
? {1;2;4;5;7;8;10;11;12;14;15;17;18}!initial_positions.
[1,4,8,10,14,17]
? {1;2;4;5;7;8;10;11;12;14;15;17;18}!final_positions.
[2,5,8,12,15,18]
How
do we thread the two sequences together?
word_positions -> {LetterPositions ->
(LetterPositions!initial_positions,
LetterPositions!final_positions)
\\
{L,R->[L..R]}
}.
which
uses the `\\' (zip) operator to combine corresponding initial and final
positions with the `[M..N]' construct.
? {1;2;4;5;7;8;10;11;12;14;15;17;18}!word_positions.
[[1,2],[4,5],[7,8],[10,11,12],[14,15],[17,18]]
The next thing to do is to define which characters are part of words, and which are
spaces between words. The decision is somewhat arbitrary:
letters -> {1!"A"..1!"Z"}join{1!"a"..1!"z"}join{1!"0"..1!"9"}.
(Libra syntax does not allow:
letters -> {1!"A"..1!"Z";1!"a"..1!"z";!"0"..1!"9"}.
There is no generalised `M..N' construct, only `{M..N}' or `[M..N]').
Finally, we need to find the letter positions. This can be done by taking the
original text--which is a function from positions to characters, and
right-restricting it to the set of letters. The domain of this restricted
function is the set of letter positions:
letter_positions -> {Text->dom(Text?>letters)}.
?"to be or not to be"!letter_positions.
{1;2;4;5;7;8;10;11;12;14;15;17;18}
which
operates as promised earlier. Composing this with `word_positions' gives the
following:
? "to be or not to be"!letter_positions!word_positions.
[[1,2],[4,5],[7,8],[10,11,12],[14,15],[17,18]]
The
final step is to map the sequences of positions onto sequences of letters. It
is tempting to believe that it is merely necessary to apply the original text
to the result--after all, the text is a mapping from positions to letters.
However, this is not quite right, because the codomain of the sequence above is
the set of sequences of positions, but the domain of the text is individual
positions. What we need to do is to compose the elements of its codomain with
the text. This is an example of a general problem, which we can call
`compose_codom':
compose_codom->{R1,R2->(@R1)!{X,Y->X,Y o R2}>>->}.
This
higher-order relation takes two other relations as arguments. It generates all
(X,Y) pairs of the relation `R1', mapping each pair onto a singleton set
containing the same value of X, but with Y composed with the relation `R2'.
These singleton sets are then joined into a new relation. In terms of this
particular problem, `R1' will be the sequence of position sequences above, and
`R2' will be the text. Each position sequence will therefore be mapped onto a
letter sequence.
Putting it all together, we get:
word_list-> {Text->
compose_codom(Text!letter_positions!word_positions, Text)}.
?"to be or not to be"!word_list.
["to","be","or","not","to","be"]
a
surprisingly painful process. (The author is by no means convinced that this
is the best way to solve this problem.)
In practice, such a program would read the text string from a file. A simple
adaptation of the file copying program in Section 8.1 will do the job:
file_to_str -> {F->[]!see(F)o ({Str->Str&&" "&&get(0)})^^ o seen}.
? file_to_str("hamlet").
" to be or not to be"
Each
iteration of the limit construct appends a space and a line of text to `Str'.
The value of `Str' before the first iteration is the empty sequence, which is
passed transparently through `see(F)'. The argument of `get' doesn't matter;
the program could just as well say `get(Str)', `get([])', or whatever.
It is nice to be able supply the file name interactively. The `ask' relation
asks the user a question, and yields the answer:
ask -> speak o put o get o spoken.
This
can then be tailored to ask for a file name:
ask_file_name -> ask("Please type the file name: ").
? file_to_str(ask_file_name)!word_list.
Please type the file name: hamlet
["to","be","or","not","to","be"]
We
have already defined the `frequencies' relation, which will analyse the word
list. However, for better presentation, we want to display the words in
decreasing order of frequency. To do this, we need to sort the frequency table
by frequency, then reverse the resulting sequence. However, since the
frequency table contains (word, frequency) pairs, `sort' would put into
alphabetical order. Before sorting it, each pair must be reversed, which is
easily done by finding the inverse of the table.
We may therefore sketch the analysis program as follows:
analysis -> ((file_to_str(ask_file_name)!word_list!frequencies<)^-1!sort)<- .
? analysis.
Please type the file name: hamlet
[2,"be"; 2,"to";1,"or";1,"not"]
Bearing
in mind that `find' actually displays the words as lists of integers, the
output should be cleaned up, and the output from `find' should be suppressed:
analysis -> ((file_to_str(ask_file_name)!word_list!frequencies)^-1!sort)<-
! display_list o null.
The
`display_list' relation, which displays the word list, is similar in structure
to `display_seq' in Section 8.2.
display_list-> {WordList->
WordList ! speak
o ({L->display_line(head(L)),tail(L)}o{H,T->T})^^
o spoken}.
which
reduces the problem to how to display each line. Given that the terms of the
inverted frequency table are (frequency, word) pairs, the following relation
will do the job:
display_line-> {Freq,Word->
[] ! put(justify(int_to_str(Freq),6))
o put(" ") o put(Word) o nl}.
where
`justify' will right justify a string:
justify-> {String,Width->String!{Str->" "&&Str: #Str<Width}^^}.
This
relation is a good example of using a global variable. The `justify' relation
finds the limit of an inner relation that operates on `Str' by adding leading
spaces until it has the width given by `Width'. `Width' is a constant global
variable as far as the inner relation is concerned. If `Width' had to be
passed as a parameter, the inner relation would have to operate on
(string,integer) pairs, and its final output would need be simplified by
another relation.
It is important to distinguish `Str' from `String'. Suppose the relation had
read:
justify-> {Str,Width->Str!{Str->" "&&Str: #Str<Width}^^}.
Then
a call of the form:
? justify("12",6).
would
bind `justify' as follows:
justify->{"12",6->"12"!{"12"->" "&&"12": #"12"<6}^^}.
making the inner relation quite useless. Such a definition would trigger Libra to
issue a warning, because `Str' appears on the left of `->'. On the other
hand, `Width' is merely used in a predicate, so it would not cause a warning.
LIBRA is implemented as a
Prolog program.
Because it was developed using
Open Prolog
[1], and the Open Prolog editor does not support files exceeding 32K
bytes--and because the Libra Source program has a lot of comments--it consists
of four large files, and a small one to link them together. The four large
files are called `parser', `commands', `built_in' and `apply', and the small
one is called `libra'. Consulting `libra' causes the other four files to be
consulted, and activates Libra's command loop. When the command loop is
active, the user may issue any of the commands explained in Section 2.2.
Actually, it doesn't matter much whether the command loop is active or not,
because Libra commands are also valid Prolog goals. They are read in by
`read', which means that they are automatically parsed, whether the command
loop is active or not.
When the command loop is inactive the user may type any Prolog goal, which
proved useful when debugging Libra. The only drawback is that the word `let'
has to be typed before a definition, otherwise Prolog will try to interpret `->'
as `if ... then'--without much success as a rule. If it is necessary to abort
execution of a Libra command, the command loop will become inactive, and must
be restarted by typing `libra.'
Definitions created by `let' and by operator declarations (`fx', etc.) are
stored as Prolog facts. However, before storage, they are first preprocessed.
Preprocessing serves three main functions, it converts sets to internal format,
it enforces scope rules--including warning the programmer about suspected
abuses, and it gets rid of syntactic variants; for example `X<Y<Z'
becomes `X<Y & Y<Z'. The same preprocessing is used by `find' to
convert its goal to internal form. Currently, the weakest part of this is that
the internal form of a set is an ordered list, which is totally inefficient but
was the easiest route to a prototype. In particular, testing for membership or
applying a relation to an argument implies scanning the lists from left to
right. There is no fast access to individual elements. As a result, the
complexity of one relational composition is N2, the complexity of two
compositions is N3, and so on, so Libra programs soon become computationally
complex.
The `parser' file contains the operator declarations that define Libra's
syntax, and the preprocessor. The rest of the command interpreter is in the
`commands' file, including a pretty printer used by `show' that converts
internal form back to external form. It cannot restore the original text
because Prolog's `read' predicate loses the original names of variables, and
ignores comments totally. However, it does display definitions in canonical
form, making it clear exactly how they were parsed. Theoretically, saving a
program with `dump', which uses the same format, then reading it in again with
`use', should completely restore it. The commands file also contains the
`simplify' predicate, which is responsible for simplifying expressions, ie.,
executing programs. It is pretty simple however, because all the real work is
done by `built_in' or `apply'.
The `built_in' file implements all Libra's built-in definitions. To make
execution as lazy as possible, each operator is responsible for deciding when
to evaluate its operands. However, `built_in' only implements one mode of
using relational operators: full evaluation. When a relational expression is
applied to an argument, it is handled by one of three predicates in the `apply'
file: `generate', `construct', or `filter'. In contrast to `built_in', these
predicates expect their arguments to have been fully evaluated. Which one is
called depends on how the argument is used. Generating a set or relation
(using `@', etc.) calls `generate', applying a relation to an argument (using
`!', etc.) calls `construct', and testing set membership (using `?', etc.)
calls `filter'. For any given operator, these three predicates are closely
related, so their rules are grouped in threes, by operator. There is a lot of
mutual recursion between them, to ensure relations are evaluated as lazily as
possible. The Prolog rules for these operators are mostly straightforward, so
it is easy to check the ranks of relations involved. For example, the rules
for composition are as follows:
generate(R1 o R2,(X,Z)) :- !,generate(R1,(X,Y)),construct(Y,R2,Z).
filter((X,Z),(R1 o R2)) :- construct(X,R1,Y),filter((Y,Z),R2),!.
filter((_,_),(_ o _)) :- !,fail.
construct(X,(R1 o R2),Z) :- !,construct(X,R1,Y),construct(Y,R2,Z).
The
rule for `generate' requires the first relation to be a generator, and the
second to be at least a constructor. The rules for `filter' require the first
relation to be at least a constructor, but the second needs only to be a
filter. The rules for `construct' require both relations to be at least
constructors.
When `built_in' has to fully evaluate a relational expression, it usually calls
`generate' to do the work, collecting its results using Prolog's `findall', and
finally sorting them to comply with the rule that sets are represented by
ordered lists. The exceptions occur where there is a simple method that will
do the job faster. The most important of these exceptions is finding the
transitive closure of an explicit relation, which is guaranteed to terminate
even if the relation is cyclic.
WHAT, if anything, has been learnt from implementing an interpreter for Libra?
The first lesson--which came as a surprise to the author--is that although the
relational programs developed as examples are quite short, they were
surprisingly tricky to get right. Some of the difficulties arose because the
interpreter was under development, some were due to the author's unfamiliarity
with relational programming, and some due to the design of Libra itself. The
main problem in the design is the notion of inapplicability. If a relation is
presented with an argument to which it does not apply, it simply produces no
result. It is fundamental to relational programming that this should occur,
but it means that if a programming error leads to the wrong type of argument
being passed, then the program as a whole typically produces no result. It is
not easy to deduce exactly where the problem lies from an empty output.
There would seem to be several solutions to this problem. They all involve
some kind of type-checking. Some checking is already done by Libra's built-in
relations. If the wrong kind of argument is passed to one of them, a warning
results. However, if the relation is overloaded by the programmer's own
definition, Libra turns off its warning, because it cannot tell which
definition is meant to be effective in any given case. Even if the
programmer's definition proves inapplicable, Libra can't know whether it
amounts to an error.
Although the definition of a relation can test the type of its arguments, this
only serves to restrict it further. Arguments that do not pass the type test
are inapplicable, so the situation is unchanged. It would be possible for a
programmer to define a relation so that if its argument had the wrong type, it
wrote an error message. However, this is not a modular solution, because
overloading the relation to accept a different type of argument (perhaps by a
separate package) would require the error condition in the first definition to
be modified.
One solution, that adopted by Drusilla, is to infer data types statically from
the types of operators, and in turn, to deduce the types of programmer defined
relations. In its basic form, this means that the built-in operators cannot be
overloaded. However, it is possible to imagine an extension of this idea to
allow for overloading. Indeed, Drusilla does precisely this in treating
generators, constructors and filters as different types. It seems to the
author that allowing overloading is the only logical decision in a language
that allows relations to have multiple results.
The author's personal view is that static type-checking is unduly restrictive.
Although it can detect many programming errors, it is just one of many checks
that could be made. Programs can fail because of division by zero or other
arithmetic errors, or by entering an infinite loop or infinite recursion.
However, nobody suggests that we should insist that all such errors should be
detectable statically. Indeed, if we devise a language in which all programs
are guaranteed to terminate, we know it is not Turing-complete, and less than
universal. I believe the same kind of objection applies, in a less obvious
way, to languages that are statically type-checkable. What the exact problem
is, I don't know, but I believe it is the reason why many artificial
intelligence languages avoid static type-checking, and why systems programming
languages need to have type loop-holes.
One way to solve these problems would be to make a distinction between
`inapplicability' and `no result'. If no definition is applicable to an
argument, this could be treated as a programming error; or, the definition
could remain applicable, but deliver no result. An easy way to implement this
would be to extend the use of Libra's `null' relation to deliver no result as a
general facility, not merely in connection with `find'.
This leads naturally to a related problem: the question of what defines the
domain of a relation. A relation is a mapping from its domain to its codomain.
In mathematics, this presents no particular difficulty because a relation is a
static object. In a programming language, there are three possibilities for
the domain or codomain: their types, their potential values, or their actual
values. The distinctions are that the type of a domain or codomain might be
the integers, its potential values might be the prime numbers, and its actual
values for some given state of execution might be 2, 3 and 5. The `Z'
specification language [8] distinguishes two of these cases: the values
of a domain are chosen from a `source' type--and the values of a
codomain from a `target' type. It might be argued that the set of potential
values and the type of an object are the same thing. This is true if the
language has a flexible enough way of specifying types. Libra treats types as
ordinary sets, so--depending on one's point of view--it either offers a very
flexible means of type definition, or none at all. However, in most statically
typed languages the means for defining types isn't flexible enough to allow
`prime numbers' to be declared as a type.
We have already discussed one situation where these distinctions cause a
problem: Libra can't tell when a relation is meant to apply to a given
argument. If Libra could determine that an argument was in the domain of the
relation but that the relation did not apply, it could consider this a normal
result; but if it could determine that the argument wasn't in the domain of the
relation--or in the case of overloading, in the domain of any definition of a
name--it could consider it an error.
A second aspect of the domain problem arises in finding the reflexive
transitive closure (^*) of a relation. Here, it is best to visualise the graph
of the relation. The graph contains a number of edges, and the reflexive
transitive closure consists of all paths of length zero or more. In addition
to the paths of length one or more found by transitive closure (^+), Libra also
adds to the closure a loop linking each vertex to itself. The set of vertices
is found from the edges of the graph: each edge must leave and enter a vertex
of the graph. However, it is possible for a graph to contain isolated
vertices. Libra has no way of knowing what these are. The programmer may
overcome this shortcoming by defining a graph as a triple: a domain and
codomain representing its vertices, and a relation representing its edges.
(The domain and codomain would be the same set for a homogeneous graph.) The
programmer would be able to construct the loops on its vertices using Libra's
built-in `id' function, thus finding its correct reflexive transitive closure.
A related problem occurs in using the limit operator (^^). The limit operator
repeatedly applies a relation to an argument until it is no longer applicable.
The question is: if the relation is inapplicable to the original argument, is
the original argument the limit, or has an error occurred? The answer depends
on whether the argument is in the domain of the relation, but Libra cannot
tell.
A solution is to force each relation to specify its domain and codomain.
Perhaps this might be written as follows:
(+)->{(X,Y) -> (X,Y)\\(+)}::relations x relations -> relations.
The
point however is that any set of values could be specified in place of
`relations', so that the domain and codomain could be arbitrary sets. One
aspect of this is that the existing classification of relations as generators,
constructors and filters would change, for example, a constructor whose domain
was a generator would become a generator, because all the pairs in the relation
could be generated by applying the constructor to the values generated by the
domain.
Naturally, any solution to the domain problem would also solve the
applicability problem of Section 10.1.
When Libra evaluates a relational expression or applies a complex relation to
an argument, it is possible for it to generate the same result several times.
When these results are collected or considered as a whole, they form a `bag'
rather than a set. They can only be made into a set by removing duplicates.
The time-complexities of Libra algorithms are often determined by the sizes of
their results--therefore understanding results as bags is needed in order to
estimate complexity. Why does Libra support set operations but not bag
operations?
This matter received much thought in the design of Hydra [9]. One argument
against using bags as a basic data structure is that storing duplicate results
isn't usually a good idea. Forming a set of results using reduction is a
useful way of saving space and reducing complexity. On the other hand, bags
can already be represented within Libra as functions from elements to natural
numbers. Such a representation would be no less efficient than any used for
sets.
A corollary to using bags is to replace relations by multi-relations, in which
the same pair can appear as often as desired. The problem with this is that
multi-relations are equivalent to matrices of natural numbers, so why not
matrices of integers or reals? In other words, the resulting language would be
based on matrices. There is no real problem with this, but developing a
language based on bags is really a separate issue, and hasn't been seriously
considered here.
The argument that bags are needed to understand aspects of Libra is specious
anyway. Many computer languages are interpreted using a run-time stack. That
doesn't mean that stacks have to be a basic data structure within the languages.
In the implementation of Libra, a decision was made to evaluate the arguments
of relations--using `simplify'--before applying them, except for the built-in
relations, which try to achieve the same effect more lazily. Originally, this
was motivated by an analogy with procedural languages, which typically require
their input parameters to be fully evaluated, but which evaluate conditional
constructs lazily. On reflection, it is seen to be an arbitrary decision. For
example:
and->{A,B->A&B}
defines
an `and' function exactly like the built-in and relation, except that it always
evaluates both operands. It would clearly be an advantage if no operand was
ever evaluated until its value was needed. It would seem to follow that
arguments should be passed by reference rather than by value, or, in terms of
the implementation, `simplify' should not be used until necessary.
Such a change would have an important consequence. Consider the definition:
double->{X->X+X}.
which
is intended to double the value of its argument. Under the existing
implementation of Libra, this operates as follows:
? @{1;2}!double.
2
4
However,
if `simplify' wasn't called until X was evaluated, the effect would be to
evaluate:
? @{1;2}+@{1;2}.
2
3
3
4
This
is certainly referential transparency, but is it sensible? In Section 8.2, we
explained exactly why two instances of the same variable should be given the
same value in connection with the relation `add_to_plan'. In that example, we
wanted the program to make sure that the move it added to its plan was the same
move that it had just tested for safety. But the converse is that the same
rule destroys the notion of referential transparency, at least as it is usually
understood.
It is possible to argue--as some functional programming languages do--that
variables are an unnecessary concept, and should be eliminated from the
language. But to the author, this would be simply sweeping the problem under
the carpet. To achieve the effect of `double', it would be necessary to have
some way of duplicating an argument. Suppose that there were a built-in
relation called `replicate' as follows:
replicate->{X->X,X}.
Then
we could define `double' as:
double->replicate o (+).
without
using variables. But how are we to know that the two values generated by
`replicate' are the same? Surely,
? @{1;2}!replicate.
should
be equivalent to:
? (@{1;2},@{1;2}).
which generates:
(1,1)
(1,2)
(2,1)
(2,2)
Looking
on the positive side, the language of conceptual graphs [12], one of whose aims
is to present a variable-free version of the predicate calculus, introduces
variables into its linear notation for precisely the same purpose, to indicate
when two occurrences of an expression refer to the same object. (In its
graphical notation, this is done simply by pointing at the same vertex
twice.)
Although variable names are potentially an evil no better than the infamous
`goto', the author believes that in the restricted context of a relational
definition they are harmless, even beneficial. After all, the alternative is
to use built-in relations such as `left' and `right', effectively defined as:
left->{L,R->L}.
right->{L,R->R}.
to
dissect structured arguments. Libra variables are merely local names for
compositions of such relations.
The current treatment of files within Libra is very unsatisfactory, and should
be improved as soon as possible. Part of a remedy would be to treat files as
relations. A possible set of file operators was suggested in the language
proposal on which Libra is based [3].
In a way, a relational language should have less of a problem integrating with
files than a functional language has. One of the properties of files is that
they can be modified. This means that the same argument may retrieve a
different result depending on when it is applied. Considered as a relation, an
argument can be expected to have multiple results. Reading a file could be
likened to using the `~' operator, which chooses an arbitrary result. But this
is only part of the story. Updating a file has a side-effect on how it will be
read, which is not expressed by the Libra programming language. (It isn't
expressed by any other programming language either, but that is beside the
point.) The core problem is that Libra programs--like functional programs--are
expressions, not procedures taking place in time. It is not easy to see how to
integrate Libra with the idea of a time-dependent state.
The problems of file input-output are still more manifest when dealing with the
user of a Libra program. We have previously discussed how the dialogue
generated by different threads can become confusingly intertwined: question and
answer have to be treated atomically. But what happens if several threads
decide to ask the user the same question? Should each question be answered
separately, or should Libra ask the question once and remember the answer? In
a relational language, only the first choice is logical, but it may not always
prove convenient.
In the language proposal on which Libra was based [3], it was proposed that
names might given limited scope. The reason for this is that in defining a
module or library of relations for general use, it might be useful to define
subsidiary relations, which would not be of general interest. A programmer
using the module should not have to worry about these subsidiary relations. In
particular, there is a danger that the programmer will define a new relation
with the same name as one in a module, and on invoking it, find that spurious
results are produced.
One possible way of avoiding this problem would be to introduce a `where'
operator, so that:
{add2->add1 o add1} where {add1->{X->X+1}}.
would
have the effect of making `add2' global, but `add1' would be private to the
declaration. In general, a set of global definitions would be able to share
access to a set of private ones.
Simple as this idea is, it is not currently implemented. For one thing, it is
hard to see how to integrate it with the one-definition-at-a-time approach of
the current command interpreter--whose serial view of the program text is
already inappropriate to a relational language.
Another aspect to consider is the object-oriented approach to programming. In
the context of Libra, where objects are unimportant, many of the problems that
object-oriented languages solve aren't present. On the other hand, it would be
useful to have some concept similar to inheritance. This means that if a
general type of object has been defined, and a more specialised type of object
has been based on it, the specialised object should be able to inherit the
relations from the more general object, or over-ride them with its own. For
example, if the `number_of_legs' relation applied to members of the set
`mammal' yields `4', we would want to over-ride this for the set `humans' to
yield `2'. Libra does not provide any means of over-riding one relational
definition by another; if two definitions apply, both take effect. Nor does
Libra have any way at present of using the knowledge that `humans' are
`mammals'.
A partial solution to this problem would be to use `else' rather than `join' to
link definitions of relations having the same name. If a name defines several
relations, they could be tried in turn. As soon as a definition is found that
applies to the argument, no further definitions would be tried. Some means of
specifying the search order is needed. Letting the search order be defined by
the sequence of the program text seems a poor approach, but Libra does not
currently offer any other language feature that could be exploited. A better
suggestion, which fits well with the nature of relations, is to allow the
program text to contain packages, which are relations from names to
definitions, and link them together by an arbitrary relational expression.
Treating data and program text in a uniform way causes a major problem in the
implementation of Libra. Consider an operation such a composition. The
expression `R o S' should compile to a Prolog rule something like:
'R o S'(X,Z) :- R(X,Y),S(Y,Z).
if
R and S are static program objects, but to:
'o'(R,S,'R o S').
if
they are data objects (where `o' is a built-in predicate). The present
implementation represents everything as data, which deals with static objects
very poorly. The converse, of expressing data objects as facts means that an
implementation would spend much of its time asserting and retracting facts. A
better solution would be to have different approaches for static (facts and
rules) and dynamic (data structure) sets and relations.
One approach is to follow Drusilla, and choose suitable representations hin a
compiler. This preserves a uniform syntactic treatment for static and dynamic
objects--although there are problems when they are mixed, and in compiling
higher-order relations. However, having programmed a few examples has
convinced the author that the programmer needs to keep the distinction between
dynamic and static objects clear in order to make sensible space-time
trade-offs. Perhaps they could be distinguished syntactically, without
sacrificing uniformity of notation. For example:
compose->{R,S->{X->X!(R o S)}}.
could
define a relation that could be applied to an argument (X), leading to the
first translation above; whereas:
compose->{R,S->R o S}.
might
apply to data objects. However, this approach raises the new problem of
converting between static and dynamic representations.
1. M. Brady, Open Prolog Version 1.0.3d22, Dept. of Computer Science,
Trinity College Dublin, URL:
ftp://ftp.cs.tcd.ie/pub/languages/open-prolog, 1995.
2. D.M. Cattrall, The Design and Implementation of a Relational Programming
System, PhD Thesis, Dept. of Computer Science, University of York, 1992.
3. B. Dwyer, Programming Using Binary Relations: a proposed programming
language, Technical Report 94-04, Dept. of Computer Science, University of
Adelaide, 1994.
4 W.D. Hillis, The Connection Machine, MIT Press, 1985.
5. B. J. MacLennan, Relational Programming, Technical Report NPS52-83-012,
Naval Postgraduate School, Monterey, CA, 1983.
6. B. J. MacLennan, Four Relational Programs, Technical Report NPS52-86-023,
Naval Postgraduate School, Monterey, CA, 1983.
7 R. Milner, "A theory of type polymorphism in programming", Journal of
Computer and System Sciences, 17(3) pp348-375, 1978.
8 B. Potter, J. Sinclair, and D. Till, An Introduction to Formal
Specification and Z, Prentice-Hall 1991.
9. J.D. Prosser, A Programming Language Based on the Algebra of Binary
Relations, Honours Report, Dept. of Computer Science, University of
Adelaide, 1994.
10. J.G. Sanderson, "A Relational Theory of Computing", Lecture Notes in
Computer Science 82, Springer-Verlag Berlin 1980.
11. J.G. Sanderson, Relator Calculus, Technical Report 84-02, Dept. of
Computer Science, University of Adelaide, 1984.
12. W.M. Tepfenhart, J.P. Dick, J.F. Sowa (Eds.), "Conceptual Structures:
Current Practices", Lecture Notes in Computer Science 835,
Springer-Verlag Berlin 1994.
Operator Type Precedence
^* yf 50
^+ yf,yfx 50
^- yfx 50
^^ yf 50
<- yf 50
sets_of fy 50
seqs_of fy 50
dom fy 50
codom fy 50
?> yfx 100
\?> yfx 100
<? xfy 100
<\? xfy 100
meet yfx 100
o yfx 100
x yfx 100
&& yfx 125
join yfx 125
omit yfx 125
else yfx 125
but yfx 125
! yfx 150
~ yfx 150
\\ yfx 150
@ fx 150
i fx 150
# fy 175
>>-> yf,yfx 175
>>=> yfx 175
image yfx 175
Operator Type Precedence
^ xfy 200
mod yfx 300
* yfx 400
/ yfx 400
<< yfx 400
>> yfx 400
- fx,yfx 500
+ fx,yfx 500
/\ yfx 500
\/ yfx 500
max yfx 600
min yfx 600
Operator Type Precedence
= xfy 700
\= xfy 700
< xfy 700
> xfy 700
=< xfy 700
>= xfy 700
\> xfy 700
\< xfy 700
equal xfy 700
unequal xfy 700
subset xfy 700
includes xfy 700
inside xfy 700
encloses xfy 700
disjoint xfy 700
.. xfx 700
? xfx 700
\? xfx 700
Operator Type Precedence
\ fy 900
& yfx 925
v yfx 950
=> xfx 975
<=> xfx 975
Operator Type Precedence
, xfy 1000
-> xfy 1050
: xfx 1070
; xfx 1100
Operator Type Precedence
let fx 1080
edit fx 1080
find fx 1080
? fx 1080
use fx 1080
reuse fx 1080
show fx 1080
dump fx 1080
drop fx 1080
fx xfx 1080
fy xfx 1080
xf xfx 1080
yf xfx 1080
xfx xfx 1080
xfy xfx 1080
yfx xfx 1080
help fx 1090
unary_prec fx 1200
binary_prec fx 1200
solution -> [initial_state]!add_to_plan^+ ?>solved o display_seq o null.
initial_state -> (everything,{}).
final_state -> ({},everything).
everything -> {'Farmer';'Dog';'Corn';'Chicken'}.
solved -> {Plan:last(Plan)=final_state}.
add_to_plan -> suggest o verify.
suggest -> {Plan->Plan,last(Plan)!cross_river}.
verify -> {Plan,State->Plan&&[State]:State\?codom Plan}.
cross_river -> left_to_right join right_to_left.
left_to_right -> farmer_on_left <? ferry_object \?> unsafe.
farmer_on_left -> {Left,Right:'Farmer'?Left}.
ferry_object -> {Left,Right->Left,Right,@Left}
o {Left,Right,Choice -> Left omit{Choice}omit{'Farmer'},
Right join{Choice}join{'Farmer'}}.
unsafe -> {Left,Right:Left includes{'Chicken';'Corn'}
v Left includes{'Dog';'Chicken'}}.
right_to_left -> farmer_on_left <\? swap o left_to_right o swap.
swap -> {Left,Right->Right,Left}.
display_seq -> speak
o ({X->head(X)!write o nl,tail(X)}o{H,T->T})^^o nl
o spoken.
Fig. B.1: Chicken, Corn, Dog.
analysis -> ((file_to_str(ask_file_name)!word_list!frequencies)^-1!sort)<-
! display_list o null.
frequencies -> {WordList->
WordList^-1!{Inv-> @dom Inv!{X->X,image_size(X,Inv)}>>->}}.
word_list -> {Text ->
compose_codom(Text!letter_positions!word_positions,Text)}.
compose_codom->{R1,R2->(@R1)!{X,Y->X,Y o R2}>>->}.
letter_positions -> {Str->dom(Str?>letters)}.
letters -> {1!"A"..1!"Z"}join{1!"a"..1!"z"}join{1!"0"..1!"9"}.
word_positions -> {Letters->
(Letters!initial_positions,
Letters!final_positions) \\ {L,R->[L..R]}}.
initial_positions -> {Letters-> (@Letters!{X->X:X-1\?Letters}>>->)!sort}.
final_positions -> {Letters-> (@Letters!{X->X:X+1\?Letters}>>->)!sort}.
image_size->{X,R-> #({X}image R)}.
ask -> speak o put o get o spoken.
ask_file_name -> ask("Please type the name of the input file: ").
file_to_str->{F->[]!see(F)
o ({X->X&&" "&&get(0)})^^
o seen}.
display_list -> {WordList ->
WordList ! speak
o ({Seq->display_line(head(Seq)),tail(Seq)}o{H,T->T})^^
o spoken}.
display_line -> {Freq,Word -> []!put(justify(int_to_str(Freq),6))
o put(" ") o put(Word) o nl}.
justify->{String,Width->String!{Str->" "&&Str: #Str<Width}^^}.
Fig. B.2: Frequency Analysis
This page has been accessed
times since 12th June 1997.
Up to Barry Dwyer's Home Page
|