POV-Ray: Newsgroups: povray.off-topic: Logic programming: Logic programming

POV-Ray : Newsgroups : povray.off-topic : Logic programming : Logic programming		Server Time 1 Jul 2025 13:21:21 EDT (-0400)
From: Invisible
Date: 31 May 2012 06:00:59
Message: <4fc7415b$1@news.povray.org>
I haven't attempted to actually compile and run any Java code for about 
10 years now. So for the past month, I've been writing Java code. I will 
probably write about the actual coding experience later; right now I 
want to talk about what I've produced.

It's a small program called Logic Box. It lets you do logic programming, 
after a fashion.

More exactly, Logic Box solves predicates. For the purposes of this 
discussion, a "predicate" is simply a logical statement which may be 
true or false. For example, 2 = 2 is obviously true, while 5 = 7 is 
clearly false.

Given such a statement, Logic Box can tell whether it is true or false. 
But that's pretty easy, really. The interesting part is that a predicate 
may contain /variables/. Such a predicate is now no longer simple "true" 
or "false"; it depends on what the variables contain. So we can write 
predicates which are /sometimes/ true and /sometimes/ false.

Given such a predicate, Logic Box will attempt to /make/ the predicate 
true. That is, it tries to figure out what you need to set the variables 
to in order to "solve" the predicate (i.e., make it true).

The simplest kind of predicate is the equal predicate. It compares two 
expressions (not predicates) and is satisfied if both expressions are 
/identical/, once any bound variables have been replaced with the values 
they are bound to. Not all variables have to be bound to something; for 
example, x = x is unconditionally true, regardless of what value x has.

There are currently three sorts of expression: an integer, a variable 
name, or a "record". A record has a name and zero or more fields, which 
are arbitrary expressions. (In other words, records can be nested to any 
depth.) You can use the name of the record as the actual data. For 
example, Foo[] and Bar[] are two distinct values. Or you can use the 
name to indicate the "sort" of data, and the fields to represent the 
actual data. For example, Date[28, 03, 1980] or Time[09, 27, 56]. (Both 
of these are three integers, but they are not comparable because the 
names are different.)

(Note carefully: "Foo" is a variable, "Foo[]" is a record.)

Solving a predicate such as 7 = x isn't hard. The solution is simply x 
:= 7. (Logic Box uses the notation that "=" denotes a /test for/ 
equality between two arbitrary expressions, while ":=" denotes the 
/assignment/ of a value to a variable, otherwise known as a "binding".) 
But consider the predicate

   Triple[a, x, y] = Triple[1, Pair[2, a], Pair[x, 7]]

Can this be solved? Actually yes, it can:

   a := 1; x := Pair[2, 1]; y := Pair[Pair[2, 1], 7]

If you take the original predicate and replace each variable with the 
corresponding binding, you will find that

   Triple[1, Pair[2, 1], Pair[Pair[2, 1], 7] = Triple[1, Pair[2, 1], 
Pair[Pair[2, 1], 7]

Clearly both sides are identical, so the predicate is solved.

Notice how variables are not limited to containing integers; they may 
contain arbitrary expressions - even expressions involving other 
variables. However, a variable can never be bound to an expression 
containing itself. That would result in an infinite loop. So something 
like x := Pair[1, x] is not allowed. (Imagine trying to replace every x 
with Pair[1, x]. The process never halts.)

As well as the equal predicate, you can use the usual AND ("&") and OR 
("|") operators. By default, the operator precedence is =, &, |, in that 
order. You can use brackets to override this as usual. (There is a 
slight glitch: brackets are not allowed inside expressions, only inside 
predicates. So "Pair[(x), (y)] = z" is not allowed, neither is "(Pair[x, 
y]) = z", but "(Pair[x, y] = z)" is fine.)

The meaning of AND is pretty obvious; the predicate is solved only if 
the solution solves /both/ child predicates simultaneously. The meaning 
of OR is equally obvious. What is perhaps /not/ obvious is that 
predicates involving OR can have /multiple solutions/. Consider, for 
example, x = 1 | x = 2. This has two solutions, x := 1 and x := 2. In 
general, a given predicate can have an unlimited number of solutions, 
and Logic Box will attempt to find all of them for you.

Perhaps most interestingly of all, you can define "named predicates", 
which behave something like subroutines. In particular, a named 
predicate can invoke itself recursively, creating a way to implement 
loops. (This would otherwise be impossible.)

Logic Box comes with a small bunch of named predicates, mostly to do 
with list processing. All the list predicates begin with "L". Some 
predicates take a list, but treat it as a poor man's set. These begin 
with "S".

As an example, the SMember{} predicate tests whether a given item is a 
member of a set (actually a list). So, for example,

   SMember{2, List[1, 2, 3]}

is true, while

   SMember{5, List[1, 2, 3]}

is false.

Perhaps the most fascinating thing about logic programming is that you 
can write a predicate like SMember{}, and once written, it works 
forwards /and backwards/! For example,

   SMember{5, List[1, x, 3]}

has one solution, namely x := 5. Logic Box doesn't tell you /if/ the 
predicate is true; it tells you how to /make/ it true. In this case, by 
inserting the 5 into the set. We haven't told Logic Box to do this; the 
code for SMember{} doesn't say how to accomplish such a thing. It's just 
an automatic consequence of the language.

We could similarly have written

   SMember{5, List[x, y, z]}

This generates 3 solutions:

   x := 5;
   y := 5;
   z := 5;

Clearly each of these solves the predicate. Alternatively,

   SMember{x, List[1, 2, 3]}

has solutions

   x := 1;
   y := 2;
   z := 3;

Here what we have done, in effect, is to make Logic Box loop over all 
the elements of the set. So SMember{} can mean "/is/ this a member of 
the set?", but it can also mean "/get/ a member of the set". You could 
them write a second predicate involving x, join them with an AND, and 
Logic Box will find any solutions there might be. For example,

   SMember{x, List[1, 2, 3]} & SMember{x, List[2, 4, 6, 8]}

will find any elements in the intersection of the two sets. (In this 
case, x := 2 is the only solution.) Replace the AND with an OR and you 
get the union of the sets instead.

Another, more list-oriented predicate is LJoin{}. This works in a 
slightly strange way. Given three lists, the predicate is satisfied if 
joining the first two lists together yields the last list. For example,

   LJoin{List[1, 2, 3], List[4, 5, 6], List[1, 2, 3, 4, 5, 6]}

is true, because List[1, 2, 3] + List[4, 5, 6] = List[1, 2, 3, 4, 5, 6].

In this form, the predicate is testing /if/ joining two lists yields a 
third. But if we replace the third list with a variable name, Logic Box 
will /make/ the predicate true, by computing what the join of the two 
lists is:

   LJoin{List[1, 2, 3], List[4, 5, 6], out}

   out := List[1, 2, 3, 4, 5, 6]

So we can test /if/ this is the join of two lists, or we can /get/ the 
join of two lists. But we can also run this backwards:

   LJoin{List[1, 2, 3], y, List[1, 2, 3, 4, 5, 6}

   y := List[4, 5, 6]

Logic Box has stripped the prefix 1, 2, 3 off of our long list. We 
didn't tell it how to do that in the code for LJoin{}; again, it's 
automatic. We can /even/ to this:

   LJoin{x, y, List[1, 2, 3, 4, 5, 6]}

This has 7 solutions:

   x := List[]; y := List[1, 2, 3, 4, 5, 6];
   x := List[1]; y := List[2, 3, 4, 5, 6];
   x := List[1, 2]; y := List[3, 4, 5, 6];
   x := List[1, 2, 3]; y := List[4, 5, 6];
   x := List[1, 2, 3, 4]; y := List[5, 6];
   x := List[1, 2, 3, 4, 5]; y := List[6];
   x := List[1, 2, 3, 4, 5, 6]; y := List[];

Again, Logic Box has performed a search and discovered every possible 
way of satisfying the predicate - in other words, every possible way to 
split a list in half. So the /joining/ predicate can be used for 
/splitting/.

We can of course go still further, doing loopy things like

   LJoin{List[x, y, z], w, List[1, 2, 3, 4, 5, 6]}

to fetch the first three elements of the list plus the remainder. You 
could of course implement all of this functionality in a normal 
programming language; logic programming just lets you do it all with one 
small piece of code.

Beware that it's very easy to ask a question which has infinity answers. 
For example,

   LJoin{x, y, z}

has infinity solutions, as Logic Box attempts to construct all possible 
lists of every combination of sizes. Also, sometimes a predicate 
/doesn't/ work in a particular direction; for example LReverse{} works 
forwards, but goes into an infinite search if run backwards. It depends 
on the exact order in which Logic Box searches the solution space, which 
depends on the exact way in which the predicate is phrased.

I should point out that the examples above don't actually work as 
written. I describe expressions such as List[1, 2, 3] as "packed lists". 
The language provides no way to iterate over such structures. Instead, 
you have to use "unpacked lists", which look like

   Node[1, Node[2, Node[3, End[]]]]

Clearly that's much more typing (although it allows easy iteration). 
There's an LUnpack{} predicate which can unpack a list for you. (It uses 
an internal hard-wired language primitive to achieve this feat.) There 
ought to be one LPack{} predicate, but unfortunately we need a separate 
LPack{} and LUnpack{}, due to issues to search space ordering. (The two 
predicates have identical implementations, except for the ordering of 
their terms.)

[In future, I might make LPack{} itself a hard-wired primitive, allowing 
it to work in any direction. It's mildly annoying that this is necessary 
though...]

As an example of use,

   LUnpack{List[1, 2, 3], x} & LReverse{x, y} & LPack{out, y}

In both LPack{} and LUnpack{}, the first argument is a packed list, and 
the second is an unpacked list. The difference is which one is the 
"input" and which the "output". (Alternatively, you can just write out 
the unpacked lists by hand. It's not /that/ hard to do...)

Now that we know about unpacked lists, let's look at the source code for 
LJoin{}:

   LJoin{xs, ys, zs}:
     (xs = End[] & ys = zs) |
     (~head ~tailX ~tailZ
       xs = Node[head, tailX] &
       zs = Node[head, tailZ] &
       LJoin{tailX, ys, tailZ}
     )
   ;

That's /it/. That's the whole thing. That's how we can join, strip and 
split lists all at once. That's all there is to it.

So how does it work? Well, the first line says that the predicate can be 
satisfied if the first list is empty, and the second two lists are 
identical. (I.e., prefixing nothing does nothing.)

The notation "~x ..." means "there exists a value x such that ..." 
Essentially what it does is to create a local variable. As you can see, 
LJoin{} calls itself recursively, so we don't want variables from 
different invocations clashing because they have the same name. (The 
arguments to a predicate are automatically local, by a different mechanism.)

So, the equations are saying:

1. Both xs and zs are Node[] (not End[]).
2. There exists a value (head) which is the first element of xs and zs.
3. Recursively, join the tails of the lists.

Eventually we will reach the end of xs, terminating the loop. We see 
that either xs = End[] or xs = Node[]. These two possibilities are 
mutually exclusive; they cannot both occur. So - provided xs is defined 
- eventually the loop must end.

What might /not/ be clear is how this generates multiple solutions in 
cases such as LJoin{x, y, List[1, 2, 3]}. Where is the "searching" 
happening? Well, that would be the OR operator. It tries both 
alternatives. Look again at the solutions list: the /first/ solution is 
xs = End[] and ys = zs, because that's the first line in LJoin{}.

Next, it tries xs with one element. It knows what element, because zs = 
Node[head, tailZ], and zs is defined to begin with a 1, so xs must begin 
with a 1. In this fashion, it can iterate over all the possible splits. 
Eventually we reach the point where zs = End[] rather than Node[], so 
the second case fails, and only the first case is applicable, ending the 
loop.

You may find it entertaining to try to write some more predicates of 
your own - e.g., to add a new element to a set-list if not already 
present, to interleave to lists, etc.



So that's what Logic Box /does/. But how does it /work/?

Well, implementing Logic Box is about 50 lines of Haskell. It's a bit 
more in Java, but it's not /that/ big. I spent about 90% of my 
programming effort on making Logic Box /interactive/. It actually 
/shows/ you what it's doing, as it does it.

(Unfortunately, for mildly complicated predicates, the result just ends 
up being a giant wall of text which is still disappointingly hard to 
follow. But I tried...)

So, let's start with the equal predicate. This compares two expressions. 
If there are no variables involved, you simply compare the two 
expressions, and either they are already identical, or they will never 
be identical. So that's easy. But what if there are variables?

The general algorithm is apparently called "unification". It's 
surprisingly simple, although it is quite fiddly to get right.

The general idea is to compare the two expressions. If they are 
identical, you're done. If one of the expressions is a variable, you 
bind it to the other expression. (I.e., you make the other expression 
the variable's value.) But what if the variable /already/ has a binding? 
Well, you /could/ try unifying the new value against the existing one, 
but that gets kinda messy. There's a much simpler way.

What you do is you "cook" the expressions before unifying them. That is, 
you replace any variables which already have bindings with the 
expressions they're bound to. In this way, any variables which already 
have bindings vanish from the input, and by unifying the inputs you're 
automatically unifying the contents of those variables. It's much easier 
to program that way.

There is one slight bit of trickiness. Consider the predicate

   x = y & y = 5

Logic Box answers that the solution is

   x := 5; y := 5;

which is true. However, when unifying the first equality, we get the 
partial solution x := y. When we unify the second one, we add y := 5. So 
at this point, we have

   x := y; y := 5;

To get from this to the final result, we perform a "simplification". 
This basically involves looking up every variable and cooking its 
expression using the current solution. If the expression changes in this 
process, set the cooked expression as the variable's new expression and 
restart the loop. The loop terminates when no amount of cooking changes 
any expression.

So, the complete algorithm is this:

1. Cook both expressions.

2. If both expressions are identical, success.

3. If one of the expressions is a variable:
   A. If the variable appears in the other side as well, fail.
   B. Bind the variable to the other expression.
   C. Run the simplification loop.
   D. Success.

4. If both expressions are records:
   A. If the record names do not match, fail.
   B. If the field counts do not match, fail
   C. Unify each pair of fields in turn.

So unification works with a "current solution", which is gradually 
extended as more data is processed. In particular, the final step above 
performs unification recursively. The outer unification fails if any of 
the inner ones fail. Otherwise, the updated output from each sub-step 
feeds into the next sub-step, until no more steps remain.

The AND predicate is very simple on paper, but was the hardest one to 
program. In theory, you run each predicate until it produces a result, 
and then run the next predicate with the result from the previous step 
as its initial solution.

Oh, wait, predicates can potentially produce /multiple/ solutions, remember?

This means that you have to try all possible combinations. Again, on 
paper, that's simple:

   out1 := run predicate 1 [initial solution]
   for each solution1 in out1
     out2 := run predicate 2 [solution1]
     for each solution2 in out2
       out3 := run predicate 3 [solution2]
       for each solution3 in out3
         ...

Doing it interactively is damned fiddly. You have to be able to stop and 
restart multiple nested loops. It took me a while to figure that out. 
The algorithm I ultimately came up with is this:

1. Set the current predicate to predicate 1.

2. Run the current predicate.

3. If a solution is received:
   A. If this is the last predicate, return the solution.
   B. Pass the solution to the next predicate, and make that the current 
one.

4. If EOF is received:
   A. If this is the first predicate, return EOF.
   B. Make the previous predicate the current one.

As far as I can tell, this algorithm works. But it took a heck of a lot 
of trial and error to get to this point.

By contrast, the OR predicate is very easy. Just run each predicate in 
turn until EOF is received. Note that unlike AND, with OR every 
predicate gets an identical initial solution. (The initial solution for 
OR itself.) So the rules are simple; run current predicate, pass on any 
solutions received, move to the next predicate on EOF, end when you run 
out of predicates.

The way the "exists" predicate is implemented is slightly odd. This 
predicate consists of a variable name followed by a predicate. Each time 
it's run, it generates a unique ID number, and then replaces all 
occurrences of the local variable with this ID number. The resulting 
"cooked" predicate is then run as usual.

These unique IDs are called "temporary variables" (because usually they 
don't form part of the final solution, only the temporary internal 
state). They print out as ?1?, ?2?, ?3?, etc. (You can't type them in; 
they can only be created by exists.) You may from time to time see these 
pop up in a solution. Well, now you know what they are.

Named predicate calls are also very simple. You take the predicate name, 
look it up in a dictionary, check it's got the right number of 
arguments, replace each argument variable with the supplied argument 
value, and run the resulting predicate. Done.

As I mentioned, there's also a small number of named predicates who's 
implementation is actually hard-wired into the program. (I.e., it runs 
Java code rather than Logic Box code.) In particular, there's 
PRIM_True{} and PRIM_False{}, which unconditionally succeed or fail.
Post a reply to this message