Matthias Gondan
Department of Psychology
University of Innsbruck
Innrain 9
A-6020 Innsbruck
Matthias.Gondan-Rochon@uibk.ac.at
Prolog is a classical logic programming language with many applications in expert systems, computer linguistics and traditional, that is, symbolic artificial intelligence. The main strength of Prolog is its concise representation of facts and rules for the representation of knowledge and grammar, as well as its very efficient built in search engine for closed world domains. R is a statistical programming language for data analysis and statistical modeling which is widely used in academia and industry. Besides the core library, a lot of packages have been developed for all kinds of statistical problems, including new-style artificial intelligence tools such as neural networks for machine learning and deep learning. Whereas Prolog is weak in statistical computation, but strong in symbolic manipulation, the converse may be said for the R language. SWI-Prolog is a widely used Prolog system that offers a wide range of extensions for real world applications, and there already exist two so-called Prolog packs to invoke R (rserve-client, Real) from SWI-Prolog. However, given the large user community of R, there may also be a need for a connection in the reverse direction that allows invoking Prolog queries in R computations. The R package Rolog embeds the SWI-Prolog system into an R package, thus enabling deterministic and non-deterministic queries to the Prolog interpreter. Usage of the Rolog library is illustrated by a few examples.
Statistics; Logic Programming; Artificial Intelligence
The R (R Core Team 2021) programming language and environment is a widely used open source software for statistical data analysis. The basic R is a functional language with lots of support for storage and manipulation of different data types, and a strong emphasis on operations involving vectors and arrays. Moreover, a huge number of R packages (e.g., CRAN, https://cran.r-project.org/) have been contributed that cover problems from diverse areas such as bioinformatics, machine learning, specialized statistical methods, web programming and connections to other programming languages. To my knowledge, an interface to Prolog is lacking so far.
The logic programming language Prolog was invented in the 1970ies by Colmerauer (Colmerauer and Roussel 1996), mostly for the purpose of natural language processing. Since then, logic programming has become an important driving force in research on artificial intelligence, natural language processing, program analysis, knowledge representation and theorem proving (Shoham 1994; Lally and Fodor 2011; Carro 2004; Hsiang and Srivas 1987). SWI-Prolog (Jan Wielemaker et al. 2012) is an open-source implementation of Prolog that mainly targets developers of applications, with many users in academia, research and industry. SWI-Prolog includes a large number of libraries for “the real world”, for example, a web server, encryption, interfaces to C/C++ and other programming languages, as well as a development environment and debugger. In addition, pluggable extensions (so-called packs) are available for specific tasks to enhance its capabilities.
Unlike R, Prolog is a declarative programming language consisting of
facts and rules that define relations, for example, in a problem space.
Prolog’s major strength is its builtin query-driven search engine that
efficiently deals with complex structured data, but this data is not
necessarily numerical. In fact, Prolog only provides a basic collection
of arithmetic calculations via a purely functional interface
is/2
. More complex calculations such as matrix algebra,
statistical models or machine learning need help from other systems, for
example, from R.
Angelopoulos et al. (Angelopoulos et al. 2013) summarize work at the intersection of symbolic knowledge representation and statistical interence, especially in the area of model fits (EM algoritms, MCMC, Sato and Kameya 2013; Angelopoulos and Cussens 2008) and stochastic logic programs (Cussens 2000; ProbLog 2011). One of the major strengths of logic programming is handling constraints; and a number of systems for constraint satisfaction tools have been developed (constraint logic programming on booleans, finite domains, reals, and intervals) for that purpose (e.g., Frühwirth 1998; Triska 2018). Some constraint handlers exist in R (see the CRAN task view for optimization problems), but more of them would be available via a bridge between R and Prolog.
Earlier approaches to connect Prolog and R have been published as SWI-Prolog packs (r..eal, Rserve_client, Angelopoulos et al. 2013; Jan Wielemaker 2021) and as a YAP module (YapR, Azevedo 2011). Whereas r..eal establishes a direct link to an embedded instance of R, Rserve_client communicates with a local or remote R service (Urbanek 2021). The former approach emphasises speed, the latter might be preferred from a security perspective, especially in systems such as SWISH (Jan Wielemaker, Lager, and Riguzzi 2015) that accept only a set of sandboxed commands for Prolog, but do not impose restrictions on R. A common feature of the two packages is that they provide an interface for R calls from Prolog, but not the other way round, that is, querying Prolog from R is not possible so far.
Rolog is an attempt to fill this gap, and to offer the possibilty to raise Prolog queries in R scripts, for example, to perform efficient symbolic computations, searches in complex graphs, parsing natural language and definite clause grammars. In addition, two Prolog predicates are provided that enable Prolog to ring back to the R system for bidirectional communication. Similar to r..real, tight communication between the two systems is established by linking to a shared library that embeds the current version of SWI-Prolog. The exchange of data is facilitated by the C++ interfaces of the two languages (Eddelbuettel and Balamuta 2018; J. Wielemaker 2021). A less tight connection might be established using the recently developed machine query interface (Zinda 2021) that allows socket-based communication between foreign languages and SWI-Prolog (and, in fact, the MQI documentation includes an R call).
A bidirectional bridge between R and Prolog might overcome the limitations of both languages, thereby combining the extensive numerical and statistical power of the R system with Prolog’s skills in the representation of knowledge and reasoning. In addition to the useful little tools shown in the examples below, Rolog can therefore contribute to progress at the intersection of traditional artificial intelligence and contemporary statistical programming.
The next section presents the interface of Rolog in detail. Section 3 presents possible extensions of the package at both ends, in R and Prolog. Section 4 is a list of illustrative examples that offer useful extensions to the R system. Conclusions and further perspectives are summarized in Section 5.
Rolog has a rather minimalistic syntax, providing only some basic ingredients to establish communication with an embedded SWI-Prolog. Some ways to extend the interface are described in Section 3.
After installation (for now, with
remotes::install_github("mgondan/rolog")
, later, with
install.packages("rolog")
), the package is loaded in the
standard way using R’s library
-command.
library(rolog)
#> Found R package rswipl: /home/matthias/R/x86_64-pc-linux-gnu-library/4.2/rswipl/swipl/lib/swipl
#> Welcome to SWI-Prolog (threaded, 64 bits, version 8.5.20)
#> SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
#> Please run ?- license. for legal details.
#>
#> For online help and background, visit https://www.swi-prolog.org
#> For built-in help, use ?- help(Topic). or ?- apropos(Word).
We can see SWI-Prolog’s typical welcome message.
Most of the work can done using the three R functions
query
, submit
, and clear
. The
functions consult
, once
, and
findall
are provided for convenience.
consult. In most applications, a number of Prolog facts and
rules will be loaded into the system. To facilitate this recurrent task,
the prolog directive consult/1
has been mirrored into R,
consult(filename)
, with filename given as a string
(or a list of strings if multiple files are to be consulted). Note that
the full filename should be given, including the extension (e.g.,
“.pl”). The function returns TRUE
on success, otherwise
FALSE
and an error message is shown.
query. The function query(call, options)
is
used to create a Prolog query (without invoking it yet). The first
argument call is a regular R call that is created using R’s
function call(name, ...)
. This call represents the Prolog
predicate which will be queried in the later course. The creation of
such predicates and Prolog terms is described below and can become quite
cumbersome (see the examples in Section 4). The second argument,
options, may be used for ad hoc modifications of the
translation between R and Prolog, see the section below. The function
returns TRUE
on success. Note that the function does not
check if the corresponding Prolog predicate exists (but see
submit()
below).
Only a single query can be opened at a given time. If a new query is created while another query is still open, a warning is shown and the other query is closed.
submit. Once a query has been created, it can be submitted
using submit()
. If the query fails, the return value is
FALSE
. If the query succeeds, a list of constraints is
returned, with bindings for the variables that satisfy the query.
Repeated calls to submit are possible, returning the different solutions
of a query (until it eventually fails). Programmatically distinguishing
between the different types of return values for success and failure
(list vs. FALSE
) is facilitated by the R function
isFALSE(x)
.
clear. Closes the query. The name of the function is chosen
to avoid name clashes with R’s own built-in function close. The function
returns an invisible TRUE
, even if there is no open
query.
The following short R program illustrates a query to Prolog’s
member/2
using Rolog’s syntax rules.
# member(1, [1, 2.0, a, "b", X])
query(call("member", 1L, list(1L, 2.0, quote(a), "b", expression(X), TRUE)))
#> [1] TRUE
#> attr(,"query")
#> [1] "member(1, [1, 2.0, a, b, X, true])"
# returns an empty list, stating that the query satisfied
submit()
#> list()
# returns a list, stating that the query is satisfied if X = 1
submit()
#> $X
#> [1] 1
# close the query
clear()
member/2
predicate.
once and findall. The function
once(call, options)
is a convenience function that acts as
a shortcut for query(call, options)
, submit()
,
and clear()
. Similarly, findall(call, options)
abbreviates the commands query(call, options)
, repetition
of submit()
until failure, and clear()
,
returning a list collecting the the return value of the individual calls
to submit.
Table 1 summarizes the rules for the translation from R to Prolog. Most rules work in both directions, but a few exceptions exist. For example, there is an empty atom in Prolog, but no empty symbol in R, so the empty atom is translated to a character string in R.
R | Prolog | Note/Alternatives |
---|---|---|
expression(X) |
Variable X | not necessarily uppercase |
as.symbol(abc) |
Atom abc | as.name , quote |
TRUE , FALSE ,
NULL |
Atoms true, false, null | |
"abc" |
String "abc" | |
3L |
Integer 3 | |
3 |
Float 3.0 | |
call("term", 1L, 2L) |
term(1, 2) | |
list(1L, 2L, 3L) |
List [1, 2, 3] | |
list(a=1, b=2, c=3) |
List [a-1, b-2, c-3] | |
c(1, 2, 3) |
#(1.0, 2.0, 3.0) | vectors of length > 1 |
c(1L, 2L, 3L) or 1:3 |
'%'(1, 2, 3) | |
c("a", "b", "c") |
$$("a", "b", "c") | |
c(TRUE, FALSE, NA) |
!(true, false, na) |
Moreover, R is mostly vectorized, lacking support for scalar entities, that is, scalar entities are treated as vectors of length 1. Conversely, Prolog does not natively support vectors or matrices. The problem is solved in the following way:
#/N
, %/N
, $$/N
, and
!/N
for floating points, integers, strings and logicals,
respectively.In the reverse direction, Prolog terms like #/N
are
translated back to R vectors of length N. This including the
terms #/0
and #/1
that normally don’t occur.
To summarize, the rules for translation are not fully symmetrical. A
quick check for symmetry of the representation is obtained by a query to
r_eval/2
(see also below, subsection Prolog interface):
once(call("r_eval", c(1, 2, NA, NaN, Inf), expression(X)))
#> $X
#> [1] 1 2 NA NaN Inf
A few package-specific options have been defined to allow some finetuning of the rules for translation between R and Prolog.
TRUE
(default), R vectors of
length 1 are translated to scalars in Prolog. If FALSE
, R
vectors are always translated to #/N
etc., depending on the
type.TRUE (default)
, the result of
query
, once
and findall
includes
an attribute with a representation of the query in Prolog.The command rolog_options()
returns a list with all the
options. The options can be globally modified like e.g. this:
options(rolog.intvec="%%")
In a given query, the options can be set in the optional argument, e.g.
query(call("member", expression(X), list(1:3, 4:6)), list(intvec="%%"))
Rolog offers some basic support to call R from Prolog, that is,
connecting the two systems in the reverse direction. Two predicates can
be used for this purpose, r_eval(Call)
and
r_eval(Function, Result)
. The former just invokes R with
the command Call (ignoring the result); the latter evaluates
Function and unifies the result with Result. Note that
proper quoting of R functions is needed at the Prolog end, especially
with R functions that start with uppercase letters or contain a dot in
their name.
Two use cases for r_eval/2
are shown in Section 4
below.
The package is intentionally kept minimalistic, but can easily be
extended by convenience functions on both ends, Prolog and R, to
facilitate recurrent tasks and/or avoid cumbersome syntax. R is a
functional language, whereas Prolog is declarative. Obviously, there
cannot be a perfect one-to-one correspondence between the syntactic
components of two programming languages that follow completely different
paradigms. Whereas symbols, functions, numbers and character strings are
easily mapped between R and Prolog, there are loose ends at both sides.
In particular, Prolog variables are translated from and to R
expressions (not to be confused with R symbols), and R vectors
of length N > 1 are translated to the Prolog terms
#/N
, %/N
, !/N
, and
$$/N
, as mentioned above.
These rules are, in principle, arbitrary and can be intercepted at several stages of the process:
as.rolog()
)The process is illustrated in Figure 1.
We have already seen above that raising even simple everyday Prolog
queries such as member(X, [1, 2, 3, a, b])
require
complicated R expressions such as
query(call("member", expression(X), list(1, 2, 3, quote("a"), quote(b)))
The R function as.rolog(call)
is meant to simplify this
a bit by translating symbols starting with a dot to Prolog variables,
and calls like ""[1, 2, 3, a, b]
to lists. The argument
call is typically a quoted R call or symbol:
<- quote(member(.X, ""[1, 2, 3, a, b]))
q query(as.rolog(q))
Note that the name of the variable will still be X in the
later course, not “dot-X”. A bit flexibility is lost because
quote()
treats the arguments a, b as
symbols; to evaluate them (i.e., “unquote”), they can be put in
parentheses:
<- 4
a <- 5
b <- quote(member(.X, ""[1, 2, 3, a, (b)]))
q query(as.rolog(q))
resulting in the query member(X, [1, 2, 3, a, 5])
. More
sophisticated work with quasi-quotations (i.e., “unquoting” expressions)
is described in the “Advanced R” book (Wickham
2019).
Section 4 includes an example for mathematical rendering of R expressions. In that example, a preprocessing function is used to bring function calls with named arguments to a canonical form which is then handled in Prolog.
This may again be a function that reverts some of the manipulations
during preprocessing. For once()
and submit()
,
such a function would operate on the bindings. For example, many Prolog
programmers are used to operate with atoms instead of character strings,
which is the preferred representation of symbolic information in R. The
following simple example illustrates conversion of the results for a
query like member(X, [a, b, c])
to strings.
<- function(x)
stringify
{# replace Prolog variable by the value of an R variable with the same name
if(is.name(x))
return(as.character(x))
# Recurse into lists and calls
if(is.call(x))
-1] <- lapply(x[-1], FUN=stringify)
x[
if(is.list(x))
<- lapply(x, FUN=stringify)
x
# Leave the rest unchanged
return(x)
}
# This may be the return value of a Prolog query
<- quote(member(.X, ""[a, b, c]))
q <- findall(as.rolog(q))
r stringify(r)
#> [[1]]
#> [[1]]$X
#> [1] "a"
#>
#>
#> [[2]]
#> [[2]]$X
#> [1] "b"
#>
#>
#> [[3]]
#> [[3]]$X
#> [1] "c"
Recent versions of SWI-Prolog support so-called dictionaries of the
form Tag{Key1:Value1, Key2:Value2, ...}
. The tag is
typically an atom (but can be a variable, as well), the keys are unique
atom or integers; the values can be anything. Suppose we have a Prolog
predicate that does something with dicts, and we would like to query it
from R. The simplest solution is a wrapper in Prolog that translates
key-value pairs [Key1-Value1, Key2-Value2, ...]
back and
forth to dicts:
Pairs0, Pairs1) :-
do_something_with_pairs(Dict0, my_dict, Pairs0),
dict_pairs(Dict0, Dict1),
do_something_with_dicts(Dict1, my_dict, Pairs1). dict_pairs(
do_something_with_pairs/2
can then be queried from R
using, e.g., lists with named elements.
once(call("do_something_with_pairs", list(a=1, b=2), expression(X)))
In the code above, dict_pairs/2
takes the role of both
preproc/2
and postproc/2
in Figure 1. It
illustrates that complicated syntax on the R side when can be much
simpler to do the conversion at the Prolog end.
A way to extend Prolog by add-ons (“packs”) are shown in the next section.
In this section I present a few usage examples for the Rolog package in increasing complexity. Although the code snippets are mostly self-explanatory, some familiarity with the Prolog language is helpful.
Prolog’s typical hello world example is a search through a directed acyclic graph (DAG), for example, a family tree like the one given in Listing 1.
, bob).
parent(pam, bob).
parent(tom, ann).
parent(bob, pat).
parent(bob, jim).
parent(pat
X, Z) :-
ancestor(X, Z).
parent(
X, Z) :-
ancestor(X, Y),
parent(Y, Z). ancestor(
Listing 1 is included in the package and can be accessed from R using
the function system.file(...)
. Within Prolog, the normal
workflow is to consult the code with [family]
and then to
raise queries such as ancestor(X, jim)
, which returns, one
by one, four solutions for the variable X. In R, we obtain the
following results:
library(rolog)
# [family].
consult(system.file(file.path("pl", "family.pl"), package="rolog"))
# ancestor(X, jim).
query(call("ancestor", expression(X), quote(jim)))
#> [1] TRUE
#> attr(,"query")
#> [1] "ancestor(X, jim)"
# solutions for X, one by one
submit()
#> $X
#> pat
submit()
#> $X
#> pam
submit()
#> $X
#> tom
submit()
#> $X
#> bob
submit() # no more results (closing the query)
#> [1] FALSE
submit() # warning that no query is open
#> Warning in .submit(): submit: no open query.
#> [1] FALSE
# clear() # normally used to close a query
As stated above, consult()
loads the facts and rules of
Listing 1 into the database. query(expr)
initializes the
query expr
, and the subsequent calls to submit
return the conditions under which the query succeeds. In this example,
the query succeeds if X is either pat
,
pam
, tom
or bob
. A query is
closed with clear()
, or automatically if the query fails.
Note that it is generally not possible to open two queries
simultaneously, so opening a second query while another one is still
open will raise a warning. If we are interested in just the first
solution, we can use once(expr)
as a shortcut to
query(expr)
-submit()
-clear()
. If
we want to collect all solutions of a query with a finite set of
solutions, we can use findall(expr)
.
A rather cumbersome aspect of Rolog is the construction of Prolog
terms and queries. expression(X)
encapsulates the variable
X
, R symbols from quote(jim)
or
as.symbol("jim")
are translated to Prolog atoms, and Prolog
compounds such as ancestor/2
are generated using
call("ancestor", ...)
. A simplified syntax is provided by
as.rolog(...)
that accepts quoted expressions with dots
indicating Prolog variables:
= quote(ancestor(.X, jim))
q findall(as.rolog(q))
Note that as.rolog(...)
removes the dot from the
variable name.
A more useful application of DAGs is confounder adjustment in causal
analysis (Greenland, Pearl, and Robins 1999;
Barrett 2021). The Prolog file backdoor.pl
is an
implementation of Greenland et al.’s criteria for the backdoor test for
d-separation in DAGs, with a predicate minimal/3
that searches for minimally sufficient sets of variables for confounder
adjustment on the causal path between exposure and outcome.
# [backdoor].
consult(system.file(file.path("pl", "backdoor.pl"), package="rolog"))
# Figure 12 in Greenland et al.
= function(N)
add_node invisible(once(call("assert", call("node", N))))
= function(X, Y)
add_arrow invisible(once(call("assert", call("arrow", X, Y))))
add_node("a")
add_node("b")
add_node("c")
add_node("d") # outcome
add_node("e") # exposure
add_node("f")
add_node("u")
add_arrow("a", "d")
add_arrow("a", "f")
add_arrow("b", "d")
add_arrow("b", "f")
add_arrow("c", "d")
add_arrow("c", "f")
add_arrow("e", "d")
add_arrow("f", "e")
add_arrow("u", "a")
add_arrow("u", "b")
add_arrow("u", "c")
findall(call("minimal", "e", "d", expression(S)))
#> [[1]]
#> [[1]]$S
#> [[1]]$S[[1]]
#> [1] "a"
#>
#> [[1]]$S[[2]]
#> [1] "b"
#>
#> [[1]]$S[[3]]
#> [1] "c"
#>
#>
#>
#> [[2]]
#> [[2]]$S
#> [[2]]$S[[1]]
#> [1] "f"
The query to minimal/3
returns two minimally sufficient
sets of covariates for confounder adjustment (namely, {a, b, c} and
{f}).
One of the main driving forces of Prolog development was natural language processing (Dahl 1981). Therefore, the next example is an illustration of sentence parsing using so-called definite clause grammars. As Listing 3 shows, Rolog can access modules from SWI’s standard library (here, it is “dcg/basics.pl”).
:- use_module(library(dcg/basics)).
% Translate R string to code points and invoke phrase/2
Tree, Sentence) :-
sentence(Sentence, Codes),
string_codes(phrase(s(Tree), Codes).
% Simple grammar with sentences, noun, verb and participle phrases
NP, VP)) --> np(NP, C), blank, vp(VP, C).
s(s(NP, C) --> pn(NP, C).
np(Det, N), C) --> det(Det, C), blank, n(N, C).
np(np(Det, N, PP), C) --> det(Det, C), blank, n(N, C), blank, pp(PP).
np(np(V, NP), C) --> v(V, C), blank, np(NP, _).
vp(vp(V, NP, PP), C) --> v(V, C), blank, np(NP, _), blank, pp(PP).
vp(vp(P, NP)) --> p(P), blank, np(NP, _).
pp(pp(
% Determiners, personal nouns, nouns, verbs and participles
, sg) --> `a`.
det(det(a), _) --> `the`.
det(det(the), sg) --> `john`.
pn(pn(john), sg) --> `man`.
n(n(man), pl) --> `men`.
n(n(men), sg) --> `telescope`.
n(n(telescope), sg) --> `sees`.
v(v(sees), pl) --> `see`.
v(v(see), _) --> `saw`.
v(v(saw)--> `with`. p(p(with))
sentence/2
preprocesses the R
call.
As in the first example, we first consult a little Prolog program with a minimalistic grammar and lexicon (Listing 2), and then raise a query asking for the syntactic structure of “john saw a man with a telescope”. Closer inspection of the two results reveals the two possible meanings, “john saw a man who carries a telescope” versus “john saw a man through a telescope”. More Prolog examples of natural language processing are found in Blackburn and Bos (Blackburn and Bos 2005), including the resolution of anaphoric references and the extraction of semantic meaning.
# [telescope].
consult(system.file(file.path("pl", "telescope.pl"), package="rolog"))
# findall(sentence(Tree, "john saw a man with a telescope")).
findall(call("sentence", expression(Tree), "john saw a man with a telescope"))
#> [[1]]
#> [[1]]$Tree
#> s(pn(john), vp(v(saw), np(det(a), n(man), pp(p(with), np(det(a),
#> n(telescope))))))
#>
#>
#> [[2]]
#> [[2]]$Tree
#> s(pn(john), vp(v(saw), np(det(a), n(man)), pp(p(with), np(det(a),
#> n(telescope)))))
In description of the previous example, I noted in passing that Rolog
can use the built-in libraries of SWI-Prolog (e.g., by calls to
use_module/1
). It is also possible to extend the
installation by add-ons, including add-ons that require compilation, if
the build tools (essentially, RTools) are properly configured. This is
illustrated below by the demo add-on “environ” (J. Wielemaker 2012) that collects the current
environment variables.
# pack_install(environ, [interactive(false)]).
once(call("pack_install", as.name("environ"), list(call("interactive", FALSE))))
# use_module(library(environ))
once(call("use_module", call("library", quote(environ))))
# environ(X)
once(call("environ", expression(X)))
The query then unifies X with a list with Key=Value terms.
The purpose if this example is obviously not to mimick the built-in
function Sys.getenv()
from R, but to illustrate the
installation and usage of Prolog extensions from within R.
Prolog is homoiconic, that is, code is data. In this example, we make use of Prolog’s ability to match expressions against given patterns and modify these expressions according to a few predefined “buggy rules” (Brown and Burton 1978), inspired by recurrent mistakes in the statistics exams of my students. Consider the \(t\)-statistic for comparing an observed group average to a population mean:
\[ T = \frac{\overline{X} - \mu}{s / \sqrt{N}} \]
Some mistakes may occur in this calculation, for example, omission of the implicit parentheses around the numerator and the denominator when typing the numbers into a calculator, resulting to \(\overline{X} - \frac{\mu}{s} \div \sqrt{N}\), or forgetting the square root around \(N\), or both. Prolog code for the two buggy rules is given in Listing 3.
% Correct step from task to solution
X, Mu, S, N), frac(X - Mu, S / sqrt(N))).
expert(tratio(
% Mistakes
X - Mu, S / SQRTN), X - frac(Mu, S) / SQRTN).
buggy(frac(N), N).
buggy(sqrt(
% Apply expert and buggy rules
X, Y) :-
step(X, Y).
expert(
X, Y) :-
step(X, Y).
buggy(
% Enter expressions
X, Y) :-
step(compound(X),
, X, Y),
mapargs(searchX, Y).
dif(
% Search through problem space
X, X).
search(
X, Z) :-
search(X, Y),
step(Y, Z). search(
The little e-learning system shown in Listing 2
(buggy.pl
) is invoked using the R script below.
library(rolog)
consult(system.file(file.path("pl", "buggy.pl"), package="rolog"))
<- quote(search(tratio(x, mu, s, n), .S))
q findall(as.rolog(q))
#> [[1]]
#> [[1]]$S
#> tratio(x, mu, s, n)
#>
#>
#> [[2]]
#> [[2]]$S
#> frac(x - mu, s/sqrt(n))
#>
#>
#> [[3]]
#> [[3]]$S
#> x - frac(mu, s)/sqrt(n)
#>
#>
#> [[4]]
#> [[4]]$S
#> x - frac(mu, s)/n
#>
#>
#> [[5]]
#> [[5]]$S
#> frac(x - mu, s/n)
#>
#>
#> [[6]]
#> [[6]]$S
#> x - frac(mu, s)/n
The fourth and the sixth result are combinations of the two buggy rules (parenthesis, then square root, and the other way round). Some additional filters would be needed to eliminate trivial and redundant solutions (see, e.g., the chapter on generate-and-test in Sterling and Shapiro 1994).
It should be mentioned that R is homoiconic, too, and the Prolog code
above can, in principle, be rewritten in R using non-standard evaluation
techniques (Wickham 2019). Prolog’s
inbuilt pattern matching algorithm simplifies things a lot, though. An
important feature of such a term manipulation is that the evaluation of
the term can be postponed; for example, there is no need to instantiate
the variables X
, Mu
, s
and
N
with given values before raising a query. This is
especially helpful for variables that may represent larger sets of data
in the later course.
The R extension of the markdown language (Xie,
Dervieux, and Riederer 2020) enables reproducible statistical
reports with nice typesetting in HTML, Microsoft Word, and Latex.
However, so far, R expressions such as pbinom(k, N, p)
are
typeset as-is; prettier mathematical expressions such as \(P_\mathrm{Bi}(X \le k; N, p)\) require
Latex commands like
P_\mathrm{Bi}\left(X \le k; N, p\right)
, which are
cumbersome to type in and hardly readable even for simple expressions.
Below we make use of Prolog’s grammar rules for the automatic
translation of R expressions to MathML. The result can then be used for
calculations or it can be rendered on a web page. A limited set of rules
for translation from R to MathML is found in pl/mathml.pl
of the Rolog package. The relevant code snippets are shown in the
listings below, along with their output.
library(rolog)
consult(system.file(file.path("pl", "mathml.pl"), package="rolog"))
# R interface to Prolog predicate r2mathml/2
= function(term)
mathml
{= once(call("r2mathml", term, expression(X)))
t cat(paste(t$X, collapse=""))
}
The first example is easy. At the Prolog end, there is a handler for
pbinom/3
that translates the term into a pretty MathML
syntax like P_bi(X <= k; N, pi).
= quote(pbinom(k, N, p))
term
# Pretty print
mathml(term)
# Do some calculations with the same term
= 10
k = 22
N = 0.4
p eval(term)
[1] 0.77195
The next example is interesting because Prolog needs find out the
name of the integration variable for sin
. For that purpose,
Rolog provides a predicate r_eval/2
that calls R from
Prolog (i.e., the reverse direction, see also next example). Here, the
predicate is used for the R function formalArgs(args(sin))
,
which returns the name of the function argument of sin
,
that is, x
.
= quote(integrate(sin, 0L, 2L*pi))
term mathml(term)
eval(term)
2.221501e-16 with absolute error < 4.4e-14
Note that the Prolog end, the handler for integrate/3
is
rather rigid; it accepts only these three arguments in that particular
order, and without names, that is,
integrate(sin, lower=0L, upper=2L * pi)
would not print the
desired result.
Therefore, pl/mathml.pl
includes two handlers that
accept terms with named arguments,
integrate(f=Fn, lower=Lower, upper=Upper)
, as well as terms
of the form $(integrate(Fn, Lower, Upper), value)
that are
needed for the evaluation below.
# Apply match.call to all components of a term
<- function(term)
canonical
{if(is.call(term))
{<- match.fun(term[[1]])
f if(!is.primitive(f))
<- match.call(f, term)
term
# Recurse into arguments
-1] <- lapply(term[-1], canonical)
term[
}
return(term)
}
# A custom function
<- function(u)
g
{sin(u)
}
# Mixture of (partially) named and positional arguments in unusual order
<- quote(2L * integrate(low=-Inf, up=Inf, g)$value)
term mathml(canonical(term))
# It is a bit of a mystery that R knows the result of this integral.
eval(term)
[1] 0
The extra R function canonical()
applies
match.call()
to non-primitive R calls, basically cleaning
up the arguments and bringing them into the correct order.
The basic workflow of the bridge from R to Prolog is to (A) translate
an R expression into a Prolog term (i.e., a predicate), (B) query the
predicate, and then, (C) translate the result (i.e., the bindings of the
variables) back to R. The reverse direction is straightforward, we start
by translating a Prolog term to an R expression (i.e. Step C), evaluate
the R expression, and then translate the result back to a Prolog term
(Step A). Rolog provides two Prolog predicates for this purpose,
r_eval(Expr)
and r_eval(Expr, Res)
. The former
is used to invoke an R expression Expr for its side effects
(e.g., initializing a random number generator); it does not return a
result. The latter is used to evaluate the R expression Expr
and return the result Res. The code snippet in Listing 5
(r_eval.pl
) illustrates this behavior.
Seed) :-
r_seed('set.seed'(Seed)).
r_eval(
N, L) :-
r_norm(N), L). r_eval(rnorm(
# [r_eval].
consult(system.file(file.path("pl", "r_eval.pl"), package="rolog"))
# rnorm(3)
once(call("r_norm", 3L, expression(X)))
#> $X
#> [1] 0.91560562 -0.18800519 -0.03210339
r_eval/1
and
r_eval/2
. The R call set.seed
is quoted
because the dot is an operator in Prolog.
The example in Listing 5 is a bit trivial, basically illustrating the
syntax and the workflow. More serious applications of
r_eval/1,2
are illustrated in the example on mathematical
expressions where r_eval/2
is used to obtain a names of a
function argument, as well as in the next example on interval
arithmetic, where r_eval/2
is used with monototonically
behaving R functions.
Let \(\langle\ell, u\rangle\) denote
a number between \(\ell\) and \(u\), \(\ell\le
u\). It is easily verified that the result of the difference
\(\langle\ell_1, u_1\rangle - \langle\ell_2,
u_2\rangle\) is somewhere in the interval \(\langle \ell_1 - u_2, u_1 -
\ell_2\rangle\), and a number of rules exist for basic arithmetic
operations and (piecewise) monotonically behaving functions (Hickey, Ju, and Emden 2001). For ratios,
denominators with mixed sign yield two possible intervals, for example,
\(\langle 1, 2\rangle / \langle -3, 3\rangle =
\langle -\infty, 1/3\rangle \cup \langle 1/3, \infty\rangle\), as
shown in Figure 4 in Hickey et al.’s article. The number of possible
candidates increases if more complicated functions are involved, as
unions of intervals themselves appear as arguments (e.g., if \(I_1 \cup I_2\) is added to \(I_3 \cup I_4\), the result is \(I_1 + I_3 \cup I_1 + I_4 \cup I_2 + I_3 \cup I_2 +
I_4\)). As a consequence, calculations in interval arithmetic are
non-deterministic in nature, and the number of possible results is not
foreseeable and cannot, in general, be vectorized as is often done in R.
Use cases for interval arithmetic are the limitations of floating-point
representations in computer hardware, but intervals can also be used to
represent the result of measurements with limited precision, or
truncated intermediate results of students doing hand calculations. A
few rules for basic interval arithmetic are found in
pl/interval.pl
; a few examples are shown below. Again,
Prolog rings back to R via r_eval/2
to determine the result
of dbinom(X, Size, Prob, Log)
.
# [interval].
consult(system.file(file.path("pl", "interval.pl"), package="rolog"))
# findall(1 ... 2 / -3 ... 3, Res).
<- quote(int(`...`(1, 2) / `...`(-3, 3), .Res))
q findall(as.rolog(q))
#> [[1]]
#> [[1]]$Res
#> ...(-Inf, -0.333333333333333)
#>
#>
#> [[2]]
#> [[2]]$Res
#> ...(0.333333333333333, Inf)
# t-ratio
= quote(`...`(5.7, 5.8))
D = 4
mu = quote(`...`(3.8, 3.9))
s = 24L
N = call("/", call("-", D, mu), call("/", s, call("sqrt", N)))
tratio findall(call("int", tratio, expression(Res)))
#> [[1]]
#> [[1]]$Res
#> ...(2.13545259627251, 2.32056923000512)
# Binomial density
= quote(`...`(0.2, 0.3))
prob once(call("int", call("dbinom", 4L, 10L, prob, FALSE), expression(Res)))
#> $Res
#> ...(0.088080384, 0.200120949)
= quote(`...`(0.5, 0.6))
prob once(call("int", call("dbinom", 4L, 10L, prob, FALSE), expression(Res)))
#> $Res
#> ...(0.111476736, 0.205078125)
= quote(`...`(0.2, 0.6))
prob once(call("int", call("dbinom", 4L, 10L, prob, FALSE), expression(Res)))
#> $Res
#> ...(0.088080384, 0.250822656)
The slightly cumbersome syntax for entering an interval \(\langle \ell, u\rangle\) is due to the fact that the ellipsis is a reserved symbol in R and cannot be used as an infix operator. A way more powerful and comprehensive system for constraint logic programming over intervals is available as a Prolog pack (Workman 2021) and can easily be connected to R using, for example, the present package.
R has become the primary language for statistical programming and
data science, but is currently lacking support for traditional, symbolic
artificial intelligence. There are already two addons for SWI-Prolog
that allow to run R calculations from Prolog (Angelopoulos et al. 2013; Jan Wielemaker 2021),
but a connection in the other direction was missing, so far. Rolog
bridges this gap by providing an interface to a SWI-Prolog distribution
embedded into an R package. The communication between the two systems is
mainly in the form of queries from R to Prolog, but two Prolog
predicates allow Prolog to ring back and evaluate terms in R. The design
of the package is minimalistic, providing three main functions
query(...)
, submit()
, and
clear()
, and a very limited set of convenience tools
(consult(...)
, once(...)
, and
findall(...)
) to facilitate everyday actions. As both
systems are homoiconic in nature, it was easy to establish a one-to-one
correspondence between the elements of the two languages. Most
exceptions (e.g., lack of R support for empty symbols) can be avoided
and/or circumvented by wrapper functions at both ends.
Simple ways to extend the package have been described in Section 3;
such extensions could, for example, include R objects and structures
like those returned by lm
, or S4 classes. In many use
cases, this may be realized by transforming the R object to a list with
named elements, and “reconstruct” the object on the Prolog end if
needed. After a query, the process is reversed. If speed is an issue,
more of these steps can, in principle, be moved into the package and
implemented in Rcpp.
Rolog, thus, opens up a wide of applications in logic programming for statisticians and researchers at the intersection of symbolic and connectionist artificial intelligence, where concise knowledge representation is combined with statistical power. Moreover, Rolog provides starting points for useful small-scale solutions for everyday issues in data science (term transformations, pretty mathematical output, interval arithmetic).
At its present stage, a major limitation of Rolog is its rather slow
speed. For example, translation of R lists or vectors to the respective
elements of the Prolog language (lists, #/N
) is done
element-wise, in both directions. I tried to optimize this process using
Rcpp (Eddelbuettel and Balamuta 2018), but
Prolog does not support vectors or matrices, so there remains an upper
bound in the efficiency. Of course, Prolog’s primary purpose is not
vector or matrix calculation, so this limitation may not show up in real
applications. Another issue, maybe a bit annoying, is the rather
cumbersome syntax of the interface, with the need for quoted calls and R
expressions being a bit misused for representing Prolog variables. Rolog
was deliberately chosen to be minimalistic and, so far, only depends on
base R. A more concise form of representation might be obtained by tools
from the “Tidyverse” ecosystem, as described in Chapter 19 of Advanced R
(Wickham 2019). Finally, at this stage,
Rolog is unable to deal with cyclic Prolog terms (e.g.,
once(call("=", expression(A), call("f", expression(A))))
raises an error message).
The source code of the package is found at https://github.com/mgondan/rolog/, including installation instructions for Unix, Windows and macOS. At the moment, Rolog is a source package that requires the RTools 4.2 toolchain under Windows. Binaries will be made available soon on https://cran.r-project.org/.
I am grateful to Jan Wielemaker, the main developer of SWI-Prolog; this package would never have been finalized without his help. Moreover, development of the package profited substantially from the Prolog packs rserve_client (Jan Wielemaker 2021) and real (Angelopoulos et al. 2013). The project was financially supported by the European Commission (Erasmus+ project QHelp, 2019-1-EE01-KA203-051708).