Perl in Lisp 0.1

by Stuart Sierra


This document describes the source code of a Common Lisp interface to the Perl 5 API. It consists two layers: 1. CFFI definitions for the C API of Perl and 2. a Lisp library on top of them that offers convenient entry points to evaluate strings of Perl code, call Perl functions, and convert Perl data types to and from their Common Lisp equivalents.

This is a beta release. Some parts are incomplete, but the overall package is usable.

This documentation was generated with Noweb and LaTeX.


This document is a ``literate program,'' containing both its source code and full documentation of that source code. The Makefile in Section [->] produces two output files. The first, perlapi.lisp, defines the perl-api package, which contains CFFI definitions for the Perl C API. The second, perl-in.lisp, defines the perl-in-lisp package, which exports Lisp functions that provide convenient ways to use Perl from Common Lisp.

Unit tests for both packages are defined with the Lisp-Unit testing framework.

The Perl API

On Unix/Linux, the Perl library is called simply ``libperl'' and this is sufficient for CFFI to find it. On Windows, I do not know where the Perl DLL file will be located or what it will be called. This code should work fine on Windows, but you will need to alter this chunk to tell CFFI where the Perl DLL file is located.

<Libperl foreign library definition>= (U->)
(define-foreign-library libperl
  (t (:default "libperl")))

(use-foreign-library libperl)

Most of the public Perl API is implemented as C preprocessor macros. Obviously, those macros cannot be called through a foreign function interface. There are two possible ways to proceed here. One could write a small library of C code to wrap the API macros in functions, and that's exactly what I did in early versions of this library. This proved tricky to compile and awkward to use. So I decided to dig into the Perl source and find the underlying functions those macros call. Then I can reimplement the macros in Lisp.

Perl API Primitive Types

The Perl API defines abbreviations for common C types. They are copied here to make the FFI definitions match the C source. I32, U32, IV, and UV are, usually, all 32-bit integers. I32 is actually 64 bits on Crays. If this code ever gets run on a Cray, I will eat my keyboard.

<Perl API Types>= (U->) [D->]
(defctype :i32 :int32)
(defctype :u32 :uint32)

A more difficult problem is the width of IV (signed integer) and UV (unsigned). They are usually 32 bits, but could be 64 bits on some architectures. I do not know how to determine this without crawling through the preprocessed Perl source, so I cheat and assume 32 bits. This is a bad thing and should be fixed.

<Perl API Types>+= (U->) [<-D->]
(defctype :iv  :int32)
(defctype :uv  :uint32)

NV is always a double. PV is always a char*, although Perl PV strings may contain NULL characters and may not be NULL-terminated like proper C strings, so we cannot treat them as CFFI :string types.

<Perl API Types>+= (U->) [<-D->]
(defctype :nv  :double)
(defctype :pv  :pointer) ; char*

STRLEN is a typedef, like the traditional size_t, for an unsigned integer type that can hold the length of the largest string Perl can handle. Again, this can vary by platform, so I cheat and assume 32 bits. Bad me.

<Perl API Types>+= (U->) [<-D->]
(defctype :strlen :uint32)

The Perl Interpreter

We treat the interpreter as an opaque void pointer; there is no need to access its memory directly.

<Perl API Types>+= (U->) [<-D->]
(defctype :interpreter :pointer :translate-p nil)

Initializing the Interpreter

There are four Perl API functions necessary to set up the Perl interpreter, perl_alloc, perl_construct, perl_parse, and perl_run. Despite what the perlembed man page says, my tests indicate that the PERL_SYS_INIT3 macro is not actually necessary for running an embedded interpreter.

<CFFI Definitions>= (U->) [D->]
(defcfun "perl_alloc" :interpreter)

(defcfun "perl_construct" :void
  (interpreter :interpreter))

(defcfun "perl_parse" :void
  (interpreter :interpreter)
  (xsinit :pointer)
  (argc :int)
  (argv :pointer)
  (env :pointer))

(defcfun "perl_run" :int
  (interpreter :interpreter))
Defines perl-alloc, perl-construct, perl-parse, perl-run (links are to index).

<perl-api Exports>= (U->) [D->]
#:perl-alloc #:perl-construct #:perl-parse #:perl-run

We can wrap up the complete process necessary to initialize the interpreter in a single function. It returns the pointer to the interpreter instance. This pointer will be needed later to destroy the interpreter and free the memory.

<Wrapper Library Internal Functions>= (U->) [D->]
(defun make-interpreter ()
  (let ((interpreter (perl-alloc))
        (arguments (foreign-alloc :pointer :count 3)))
    (perl-construct interpreter)
    <Create Command-Line Argument Array>
    (perl-parse interpreter (null-pointer)
                3 arguments (null-pointer))
    <Start Interpreter Running>
Defines make-interpreter (links are to index).

perl-parse receives an array of strings, which in a normal Perl executable would be the command-line arguments. To run an embedded interpreter, we need to pass three arguments: an empty string, -e, and 0. These are similar to the arguments that would be used when calling snippets of Perl code from a shell. Initializing this as a C char** array looks like this (copied from the perlembed man page):

<Embedding Command Line Arguments In C>=
char *embedding[] = { "", "-e", "0" };

But in CFFI-speak it's a little more complicated. The let in the function above allocates a foreign array, arguments, of pointers. Then we create the three strings:

<Create Command-Line Argument Array>= (<-U)
(setf (mem-aref arguments :pointer 0) (foreign-string-alloc ""))
(setf (mem-aref arguments :pointer 1) (foreign-string-alloc "-e"))
(setf (mem-aref arguments :pointer 2) (foreign-string-alloc "0"))

To start the interpreter, we call perl_run, which returns zero on success. Any other return value signals a critical error.

<Start Interpreter Running>= (<-U)
(let ((run (perl-run interpreter)))
  (unless (zerop run)
    (error "perl_run failed (return value: ~A)" run)))

Destroying the Perl Interpreter

There are separate Perl API functions to shut down the interpreter, perl_destruct, and free its memory, perl_free.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "perl_destruct" :void
  (interpreter :interpreter))

(defcfun "perl_free" :void
  (interpreter :interpreter))
Defines perl-destruct, perl-free (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:perl-destruct #:perl-free

To ensure that these functions really clean out all the memory used by Perl, we have to set the global variable PL_perl_destruct_level to one.

<CFFI Definitions>+= (U->) [<-D->]
(defcvar "PL_perl_destruct_level" :i32)
Defines *pl-perl-destruct-level* (links are to index).

<perl-api Exports>+= (U->) [<-D->]

We wrap the whole process in a single function, which takes as its argument the pointer returned by make-interpreter.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun destroy-interpreter (interpreter)
  (setf *pl-perl-destruct-level* 1)
  (perl-destruct interpreter)
  (perl-free interpreter))
Defines destroy-interpreter (links are to index).

Maintaining the Interpreter

We store the pointer to the Perl interpreter instance as a private (non-exported) global variable. Two functions will be exported to start and stop the interpreter. They are safe to call at any time; start-perl will do nothing if the interpreter is already running and stop-perl will do nothing if the interpreter is not running. Both functions explicitly return nothing with (values) so that no potentially confusing return values will be printed at the REPL.

<Wrapper Library Globals>= (U->) [D->]
(defvar *perl-interpreter* nil)
Defines *perl-interpreter* (links are to index).

<Wrapper Library Public Functions>= (U->) [D->]
(defun start-perl ()
  (unless *perl-interpreter*
    (setq *perl-interpreter* (make-interpreter)))

(defun stop-perl ()
  (when *perl-interpreter*
    (destroy-interpreter *perl-interpreter*)
    (setq *perl-interpreter* nil))
Defines start-perl, stop-perl (links are to index).

<Wrapper Library Exports>= (U->) [D->]
#:start-perl #:stop-perl

To make this code idiot-proof, we will ensure that a Perl interpreter is running before calling any of the API functions. We can define a function, need-perl, to be called a the top of every function that needs the interpreter. Since this function will be called very often, we declare it inline.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(declaim (inline need-perl))

(defun need-perl ()
  (unless *perl-interpreter* (start-perl)))
Defines need-perl (links are to index).

Perl Scalars

A Perl scalar value (abbreviated SV) can be a number, a string, or a reference. At the API level, i.e. not in Perl source code, it may also contain a pointer to other values, such as arrays and hashes.

<Perl API Types>+= (U->) [<-D->]
(defctype :sv :pointer)
Defines :sv (links are to index).

We will usually interact with scalars as opaque pointers, but it may be occasionally useful to have access to parts of their structure, particularly the reference count.

<Perl API Types>+= (U->) [<-D->]
(defcstruct sv
  (any :pointer)
  (refcnt :uint32)
  (flags :uint32))

<perl-api Exports>+= (U->) [<-D->]
#:sv #:any #:refcnt #:flags

Creating Scalars

Perl_newSV creates a generic, empty scalar with the supplied number of bytes of storage space allocated. It sets the scalar's reference count to one, as do all of the ``shortcut'' functions below.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_newSV" :sv
  (size :strlen))
Defines perl-newsv (links are to index).

<perl-api Exports>+= (U->) [<-D->]

There are ``shortcut'' functions for creating new scalars with numeric values:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_newSViv" :sv
  (int :iv))

(defcfun "Perl_newSVuv" :sv
  (uint :uv))

(defcfun "Perl_newSVnv" :sv
  (double :nv))
Defines perl-newsviv, perl-newsvnv, perl-newsvuv (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:perl-newsviv #:perl-newsvuv #:perl-newsvnv

There are two functions for creating scalars from strings. Both take the length of the string as an argument, but Perl_newSVpv will automatically calculate the length if it is given as zero. Perl_newSVpvn, which does not perform this check, is recommended as more efficient.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_newSVpv" :sv
  (string :string)
  (length :strlen)) ; automatically computed if zero

(defcfun "Perl_newSVpvn" :sv
  (string :string)
  (length :strlen)) ; NOT automatically computed
Defines perl-newsvpv, perl-newsvpvn (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:perl-newsvpv #:perl-newsvpvn

To copy existing scalars:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_newSVsv" :sv
  (scalar :sv))
Defines perl-newsvsv (links are to index).

<perl-api Exports>+= (U->) [<-D->]

Scalar Reference Counting

Perl's garbage collection works by reference counting. In Perl code, this is invisible, but when using the C interface we must explicitly increment and decrement the reference counts of the variables we use.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_sv_newref" :sv
  (scalar :sv))

(defcfun "Perl_sv_free" :void
  (scalar :sv))
Defines perl-sv-free, perl-sv-newref (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:perl-sv-newref #:perl-sv-free

perl-sv-newref will increment the reference count of the scalar; perl-sv-free will decrement the reference count and, if it drops to zero, clear the scalar and deallocate all its memory.

We can also get a scalar's reference count directly from its structure:

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun refcnt (scalar)
  (foreign-slot-value scalar 'sv 'refcnt))
Defines refcnt (links are to index).

Testing the reference count functions.

<Wrapper Library Tests>= (U->) [D->]
(define-test refcnts
  (let ((s (perl-newsv 0)))
    (assert-equal 1 (refcnt s))
    (assert-equal 2 (refcnt (perl-sv-newref s)))
    (perl-sv-free s)
    (assert-equal 1 (refcnt s))
    (perl-sv-free s)
    (assert-equal 0 (refcnt s))))

Technically the scalar gets deallocated from memory at the next-to-last line of that test, but the structure survives long enough for the final =0 test to pass.

Determining the Type of Scalars

Since scalars can contain multiple types of values, we need tests to determine what they actually are.

<Perl API Types>+= (U->) [<-D->]
(defcenum svtype
  :null ; undef
  :iv   ; Scalar (integer)
  :nv   ; Scalar (double float)
  :rv   ; Scalar (reference)
  :pv   ; Scalar (string)
  :pviv ; a pointer to an IV (used in hashes)
  :pvnv ; a pointer to an NV (used in hashes)
  :pvmg ; blessed or magical scalar
  :pvbm ; ??
  :pvlv ; ??
  :pvav ; Array
  :pvhv ; Hash
  :pvcv ; Code reference
  :pvgv ; typeglob (possibly a file handle)
  :pvfm ; ??
  :pvio ; an I/O handle?
Defines svtype (links are to index).

<perl-api Exports>+= (U->) [<-D->]

Type checking of scalars is implemented in the Perl API with macros that directly access bits of the SV structure. Copied from sv.h in the Perl source, they are:

<SvTYPE macros in C>=
#define SVTYPEMASK 0xff
#define SvTYPE(sv) ((sv)->sv_flags & SVTYPEMASK)

In Lisp, these become:

<Wrapper Library Globals>+= (U->) [<-D]
(defvar *sv-type-mask* #Xff)
Defines *sv-type-mask* (links are to index).

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun svtype (scalar)
   (logand (foreign-slot-value scalar 'sv 'flags)
Defines svtype (links are to index).

Here are tests for the most common types:

<Wrapper Library Tests>+= (U->) [<-D->]
(define-test sv-type
  (assert-eq :null (svtype (perl-newsv 0)))
  (assert-eq :iv   (svtype (perl-newsviv 100)))
  (assert-eq :nv   (svtype (perl-newsvnv 3.14d0)))
  (assert-eq :rv   (svtype (perl-newrv (perl-newsv 0))))
  (assert-eq :pv   (svtype (perl-newsvpv "hello" 0)))
  (assert-eq :pvav (svtype (perl-newav)))
  (assert-eq :pvhv (svtype (perl-newhv))))

Converting Scalars to C Types

Perl_sv_true returns the boolean value (automatically converted from an integer to t or nil by CFFI's :boolean type) of the scalar by the Perl definition of a boolean. In Perl, the value undef, the number 0, the string ``0'', and the empty string are all false; anything else is true.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_sv_true" :boolean
  (scalar :sv))
Defines perl-sv-true (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>= (U->) [D->]
(define-test scalars-true-false
  (assert-true (pointerp (perl-newsv 0)))
  (assert-equal nil (perl-sv-true (perl-newsv 0)))
  (assert-equal t (perl-sv-true (perl-newsviv 5))))

Three functions convert scalars to numeric types. These functions will attempt to coerce the scalar to an IV (signed integer), UV (unsigned integer), or NV (double float), respectively.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_sv_2iv" :iv
  (scalar :sv))

(defcfun "Perl_sv_2uv" :uv
  (scalar :sv))

(defcfun "Perl_sv_2nv" :nv
  (scalar :sv))
Defines perl-sv-2iv, perl-sv-2nv, perl-sv-2uv (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:perl-sv-2iv #:perl-sv-2uv #:perl-sv-2nv

<API Tests>+= (U->) [<-D->]
(define-test new-scalar-integers
  (assert-equal t (perl-sv-true (perl-newsviv -256)))
  (assert-equal t (perl-sv-true (perl-newsvuv 17)))
  (assert-equal nil (perl-sv-true (perl-newsviv 0)))
  (assert-equal nil (perl-sv-true (perl-newsvuv 0)))
  (assert-equal -256 (perl-sv-2iv (perl-newsviv -256)))
  (assert-equal 17 (perl-sv-2uv (perl-newsvuv 17))))

(define-test new-scalar-floats
  (assert-equal nil (perl-sv-true (perl-newsvnv 0d0)))
  (assert-equal t (perl-sv-true (perl-newsvnv 3.1459d0)))
  (assert-equal 3.1459d0 (perl-sv-2nv (perl-newsvnv 3.1459d0))))

In the Perl API documentation, scalars are normally converted to strings with the SvPV macro, which first checks if the scalar is actually storing a string and, if it is, returns a pointer directly to that string. If it is not, it uses Perl_sv_2pv_flags to coerce the scalar into a string.

However, Perl_sv_2pv_flags returns a bogus pointer when called on a scalar which already contains string, so it will not work as a functional substitute for SvPV. Instead, we must use Perl_sv_pvn_force_flags, which works for both string and non-string scalars.

The flags in the name refers to a bitfield argument. I do not know what all of the flags are for; they are simply included here for completeness. SV_GMAGIC is the standard recommended flag for use when converting scalars to strings.

<CFFI Definitions>+= (U->) [<-D->]
(defbitfield sv-flags
  (:immediate-unref 1)
  (:gmagic 2)
  (:cow-drop-pv 4) ; Unused in Perl 5.8.x
  (:utf8-no-encoding 8)
  (:nosteal 16))

(defcfun "Perl_sv_pvn_force_flags" :pointer
  (scalar :sv)
  (length :pointer) ; STRLEN*
  (flags :i32))
Defines perl-sv-pvn-force-flags, sv-flags (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:sv-flags #:perl-sv-pvn-force-flags

Converting scalars to strings is a complicated by the fact that Perl strings can contain NULL characters, so foreign-string-to-lisp must be called with null-terminated-p set to nil. The length of the string comes from the length pointer passed to Perl_sv_pvn_force_flags.

Does this leak memory when new strings are created? I'm sure I don't know.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun string-from-sv (sv)
  (with-foreign-object (length :strlen)
        (perl-sv-pvn-force-flags sv length
                                  'sv-flags '(:gmagic)))
        (mem-ref length :strlen)
        nil))) ; null-teminated-p
Defines string-from-sv (links are to index).

Note that this does not handle UTF8 (Perl's preferred flavor of Unicode) strings. Unicode and Lisp is whole can of worms that I don't want to deal with yet.

Some tests for strings:

<Wrapper Library Tests>+= (U->) [<-D->]
(define-test new-scalar-strings-of-numbers
  (assert-equal nil (perl-sv-true (perl-newsvpv "0" 0)))
  (assert-equal "0" (string-from-sv (perl-newsviv 0)))
  (assert-equal "-256" (string-from-sv (perl-newsviv -256)))
  (assert-equal "3.14" (string-from-sv (perl-newsvnv 3.14d0))))

(define-test new-scalar-strings-to-booleans
  (assert-equal t (perl-sv-true (perl-newsvpv "foo" 0)))
  (assert-equal t (perl-sv-true (perl-newsvpvn "foo" 3)))
  (assert-equal t (perl-sv-true (perl-newsvpv "1" 0)))
  (assert-equal nil (perl-sv-true (perl-newsvpv "" 0)))
  (assert-equal nil (perl-sv-true (perl-newsvpv "0" 0))))

(define-test new-scalar-strings
  (assert-true (pointerp (perl-newsvpv "hello" 0)))
  (assert-equal "hello" (string-from-sv (perl-newsvpv "hello" 0)))
  (assert-equal "good" (string-from-sv (perl-newsvpvn "goodbye" 4))))

We can also test that we can use strings containing NULL characters, which are allowed in both Perl and Lisp but not in C. The Perl character-escape syntax \00 will insert a NULL character in an interpolated string. (We have to escape the backslash to insert it in a Lisp string, giving us \\00.) We can create the equivalent string in Lisp by treating a string like an array and modifying one character.

<Wrapper Library Tests>+= (U->) [<-D->]
(define-test string-containing-null
  (assert-equal (let ((string (copy-seq "abcde")))
                  (setf (aref string 2) (code-char 0))
                  string) ; "ab" + NULL + "de"
                (string-from-sv (perl-eval-pv "qq{ab\\00de}" 0))))

Lastly, we can access a named scalar with Perl_get_sv. If the named scalar does not exist and create is true, a new scalar will be created.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_get_sv" :sv
  (name :string)
  (create :boolean))
Defines perl-get-sv (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test create-named-scalars
  (let ((x (perl-get-sv "foo" t)))
    (perl-sv-setsv-flags x (perl-newsviv 1)
                          'sv-flags '(:gmagic)))
    (assert-equal t (perl-sv-true
                     (perl-get-sv "foo" nil)))))

Setting the Value of Scalars

The standard API function for copying the value of one scalar to another scalar is Perl_sv_setsv_flags. The flags argument is sv-flags, defined above. The recommended standard flag is SV_GMAGIC.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_sv_setsv_flags" :void
  (destination :sv)
  (source :sv)
  (flags :i32))
Defines perl-sv-setsv-flags (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test sv-setsv
  (let ((x (perl-newsv 0))
        (y (perl-newsviv 55)))
    (perl-sv-setsv-flags x y (foreign-bitfield-value
                              'sv-flags '(:gmagic)))
    (assert-equal 55 (perl-sv-2iv x))))

We can abstract away the foreign bitfield:

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun set-sv (destination source &rest flags)
  (perl-sv-setsv-flags destination source
                        'sv-flags flags)))
Defines set-sv (links are to index).

There are also shortcut functions for C types. The _mg suffix means that these functions correctly handle `set' magic (i.e. tied variables).

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_sv_setiv_mg" :void
  (destination :sv)
  (source :iv))
(defcfun "Perl_sv_setuv_mg" :void
  (destination :sv)
  (source :uv))
(defcfun "Perl_sv_setnv_mg" :void
  (destination :sv)
  (source :nv))
(defcfun "Perl_sv_setpv_mg" :void
  (destination :sv)
  (source :string)
  (length :strlen)) ; automatically calculated if 0
(defcfun "Perl_sv_setpvn_mg" :void
  (destination :sv)
  (source :string)
  (length :strlen)) ; NOT automatically calculated
Defines perl-sv-setiv-mg, perl-sv-setnv-mg, perl-sv-setpv-mg, perl-sv-setpvn-mg, perl-sv-setuv-mg (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:perl-sv-setiv-mg #:perl-sv-setuv-mg #:perl-sv-setnv-mg
#:perl-sv-setpv-mg #:perl-sv-setpvn-mg

<Wrapper Library Tests>+= (U->) [<-D->]
(define-test sv-set-iv-uv-nv-pv
  (let ((x (perl-newsv 0)))
    (perl-sv-setiv-mg x -256)
    (assert-equal -256 (perl-sv-2iv x))
    (perl-sv-setuv-mg x 88)
    (assert-equal 88 (perl-sv-2uv x))
    (perl-sv-setnv-mg x 3.1459d0)
    (assert-equal 3.1459d0 (perl-sv-2nv x))
    (perl-sv-setpv-mg x "hello" 0)
    (assert-equal "hello" (string-from-sv x))
    (perl-sv-setpvn-mg x "goodbye" 4)
    (assert-equal "good" (string-from-sv x))))

Accessing Scalars By Name

We can also access and/or create a scalar variable by its name. If the named variable does not exist, it will be automatically created. I am not allowing the passing of any value other than t to the create argument of perl-get-sv because it is not obvious what we should do when the given name does not exist: signal an error or return a null pointer? To keep it simple I use a form that will always succeed.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun get-scalar-by-name (name)
  (perl-get-sv name t))
Defines get-scalar-by-name (links are to index).

For reasons I cannot divine, a variable must be created with perl-get-sv before being used in an eval context in order to be later accessed without causing a memory fault. I think it has something to do with the Perl garbage collector.

<Wrapper Library Tests>+= (U->) [<-D->]
(define-test get-scalar-by-name
  (let ((x (get-scalar-by-name "nines")))
    (declare (ignore x))
    (perl-eval-pv "$nines = 999;" nil)
      (get-scalar-by-name "nines")))))

Perl Arrays

A Perl array (type AV*) is actually an ``upgraded'' scalar that points to a C array of other scalars.

<Perl API Types>+= (U->) [<-D->]
(defctype :av :pointer)

Creating a new array:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_newAV" :av)
Defines perl-newav (links are to index).

<perl-api Exports>+= (U->) [<-D->]

To get the SV at a given index key in an array:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_fetch" :pointer
  (array :av)
  (key :i32)
  (create :boolean))
Defines perl-av-fetch (links are to index).

<perl-api Exports>+= (U->) [<-D->]

If create is true, then the function will grow the array to include the given index. Perl_av_fetch has a return type of SV**. We must derefernce the pointer to get at the regular SV*, but first we have to check that it is not NULL. A wrapper function takes care of this, and returns nil if the pointer was NULL.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun av-fetch-sv (array key create)
  (let ((ptr (perl-av-fetch array key create)))
    (if (null-pointer-p ptr) nil
      (mem-ref ptr :pointer))))
Defines av-fetch-sv (links are to index).

To store a value in an array:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_store" :pointer
  (array :av)
  (key :i32)
  (scalar :sv))
Defines perl-av-store (links are to index).

<perl-api Exports>+= (U->) [<-D->]

From av.c: ``The return value will be NULL if the operation failed or if the value did not need to be actually stored within the array (as in the case of tied arrays). Otherwise it can be dereferenced to get the original SV*. Note that the caller is responsible for suitably incrementing the reference count of val before the call, and decrementing it if the function returned NULL.''

I'm not going to write a wrapper function here, because whether or not we actually want to decrement the reference count depends on the reason we're creating the array in the first place---if we're initializing a new array with newly-created scalars, there's no reason to increment the reference count on the scalars before storing them in the array.

<API Tests>+= (U->) [<-D->]
(define-test array-store-fetch
  (let ((a (perl-newav)))
    (perl-av-store a 3 (perl-newsviv -17))
    (assert-equal -17 (perl-sv-2iv (perl-in-lisp::av-fetch-sv a 3 nil)))))

To empty an array (does not free the memory):

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_clear" :void
  (array :av))
Defines perl-av-clear (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test av-clear
  (let ((a (perl-newav)))
    (perl-av-store a 0 (perl-newsviv 34))
    (perl-av-clear a)
    (assert-equal nil (perl-in-lisp::av-fetch-sv a 0 nil))))

To undefine an array and free its memory;

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_undef" :void
  (array :av))
Defines perl-av-undef (links are to index).

<perl-api Exports>+= (U->) [<-D->]

To push a scalar onto the end of the array, automatically enlarging it if necessary:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_push" :void
  (array :av)
  (scalar :sv))
Defines perl-av-push (links are to index).

<perl-api Exports>+= (U->) [<-D->]

And pop a scalar off the end:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_pop" :sv
  (array :av))
Defines perl-av-pop (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test av-push-pop
  (let ((a (perl-newav)))
    (perl-av-push a (perl-newsvpv "a" 0))
    (perl-av-push a (perl-newsvpv "b" 0))
    (assert-equal "b" (perl-in-lisp::string-from-sv (perl-av-pop a)))
    (assert-equal "a" (perl-in-lisp::string-from-sv (perl-av-pop a)))))

To ``unshift'' an array, i.e. to add undef values to the beginning of the array:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_unshift" :void
  (array :av)
  (count :i32))
Defines perl-av-unshift (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test av-unshift
  (let ((a (perl-newav)))
    (perl-av-unshift a 10)
    (assert-equal 9 (perl-av-len a))))

To shift an SV off the beginning of the array:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_shift" :sv
  (array :av))
Defines perl-av-shift (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test av-shift
  (let ((a (perl-newav)))
    (perl-av-push a (perl-newsvpv "a" 0))
    (perl-av-push a (perl-newsvpv "b" 0))
    (assert-equal "a" (perl-in-lisp::string-from-sv (perl-av-shift a)))
    (assert-equal "b" (perl-in-lisp::string-from-sv (perl-av-shift a )))))

To get the highest index of the array, or -1 if the array is empty:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_len" :i32
  (array :av))
Defines perl-av-len (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test av-len
  (let ((a (perl-newav)))
    (perl-av-push a (perl-newsv 0))
    (perl-av-push a (perl-newsv 0))
    (perl-av-push a (perl-newsv 0))
    (assert-equal 2 (perl-av-len a))))

To ensure that an array contains elements indexed at least up to fill:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_fill" :void
  (array :av)
  (fill :i32))
Defines perl-av-fill (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test av-fill
  (let ((a (perl-newav)))
    (perl-av-fill a 3)
    (assert-equal 3 (perl-av-len a))))

To delete the element at index key from an array:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_delete" :sv
  (array :av)
  (key :i32)
  (flags :i32))
Defines perl-av-delete (links are to index).

<perl-api Exports>+= (U->) [<-D->]

flags may be :discard from perl-call-flags, in which case the SV element is freed and NULL is returned.

To test if an element at index key has been initialized:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_av_exists" :boolean
  (array :av)
  (key :i32))
Defines perl-av-exists (links are to index).

<perl-api Exports>+= (U->) [<-D->]

We can make perl-aref behave like Lisp's aref. Given an array and an index into that array, return the scalar at that index. This will not correctly handle references or arrays of arrays. Perl's array access function can potentially return a NULL pointer, which gets translated to nil by av-fetch-sv.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun perl-aref (array index)
  (lisp-from-perl (av-fetch-sv array index t)))
Defines perl-aref, perl-array (links are to index).

The scalar returned is the same scalar as the one stored in the array, not a copy. So the normal scalar setting functions will work on the return value of perl-aref, and we do not need another function to store a value in an array.

Finally, to convert a complete Perl array into the equivalent Lisp list, we have:

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun list-from-av (av)
  (loop for i from 0 upto (perl-av-len av)
        collecting (perl-aref av i)))
Defines list-from-av (links are to index).

Perl Hash Tables

A Perl hash table, noted in code as %name, always uses strings as keys.

Creating a new hash:

<Perl API Types>+= (U->) [<-D->]
(defctype :hv :pointer)

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_newHV" :hv)
Defines perl-newhv (links are to index).

<perl-api Exports>+= (U->) [<-D->]

Storing a value in it:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_hv_store" :pointer ; SV**
  (hash :hv)
  (key :string)
  (key-length :u32)
  (value :sv)
  (precomputed-hash-value :u32))
Defines perl-hv-store (links are to index).

<perl-api Exports>+= (U->) [<-D->]

key-length must be given as it is not automatically calculated. value's reference count is not automatically incremented.

Retrieving a value:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_hv_fetch" :pointer ; SV**
  (hash :hv)
  (key :string)
  (key-length :u32)
  (lvalue :i32))
Defines perl-hv-fetch (links are to index).

<perl-api Exports>+= (U->) [<-D->]

To check if a hash table entry exists:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_hv_exists" :boolean
  (hash :hv)
  (key :string)
  (key-length :u32))
Defines perl-hv-exists (links are to index).

<perl-api Exports>+= (U->) [<-D->]

To delete the entry:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_hv_delete" :pointer ; SV**
  (hash :hv)
  (key :string)
  (key-length :u32)
  (flags :i32))
Defines perl-hv-delete (links are to index).

<perl-api Exports>+= (U->) [<-D->]

If flags does not include :discard from perl-call-flags then hv_delete will create and return a mortal copy of the deleted value.

To delete all the entries in a hash without deleting the hash itself:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_hv_clear" :void
  (hash :hv))
Defines perl-hv-clear (links are to index).

<perl-api Exports>+= (U->) [<-D->]

And to delete both the entries and the hash itself:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_hv_undef" :void
  (hash :hv))
Defines perl-hv-undef (links are to index).

<perl-api Exports>+= (U->) [<-D->]

Other functions return complete key/value hash structures, or allow SVs to be used as keys, but I believe these are unnecessary for this implementation.

To copy hash tables into Lisp data structures, we will need to be able to iterate over them. The Perl API provides the following routines for this purpose.

<Perl API Types>+= (U->) [<-D->]
(defctype :he :pointer
  :documentation "An entry in a Perl hash table")

<CFFI Definitions>+= (U->) [<-D->]
;; initialize an iterator for the hash
(defcfun "Perl_hv_iterinit" :i32 ; returns # of hash entries
  (hash :hv))

;; advance to the next hash entry
(defcfun "Perl_hv_iternext" :he
  (hash :hv))

;; get the key of the hash entry
(defcfun "Perl_hv_iterkey" :pointer ; char*, may contain NULL
  (hash-entry :he)
  (key-length :pointer)) ; I32*, length of the char*

;; same as above but creates new mortal SV to hold the key
(defcfun "Perl_hv_iterkeysv" :sv
  (hash-entry :he))

;; get the value of the hash entry
(defcfun "Perl_hv_iterval" :sv
  (hash :hv)
  (hash-entry :he))
Defines perl-hv-iterinit, perl-hv-iterkey, perl-hv-iterkeysv, perl-hv-iternext, perl-hv-iterval (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:perl-hv-iterinit #:perl-hv-iternext #:perl-hv-iterkey
#:perl-hv-iterkeysv #:perl-hv-iterval

With these four functions, we can make a function to convert a Perl hash table to a Lisp hash table.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun hash-from-hv (perl-hash)
  (perl-scope ;; because SVs may be mortal copies
   (let ((lisp-hash (make-hash-table :test #'equal))
         (size (perl-hv-iterinit perl-hash)))
     (loop repeat size
           do (let ((entry (perl-hv-iternext perl-hash)))
                (setf (gethash (string-from-sv ; does not work w/ lisp-from-perl, why?
                                (perl-hv-iterkeysv entry))
                       (perl-hv-iterval perl-hash entry)))))
Defines hash-from-hv (links are to index).

And convert a Lisp hash table to a Perl hash table.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun hv-from-hash (lisp-hash)
  (let ((perl-hash (perl-newhv)))
    (maphash #'(lambda (key value)
                 (let ((string-key (string key)))
                    (cstring string-key)
                    (perl-hv-store perl-hash
                                   (length string-key)
                                   (perl-from-lisp value)
Defines hv-from-hash (links are to index).

Perl References

A Perl reference is a scalar that points to something---anything---else. References are created with newRV, which increments the reference count of the source object, and newRV_noinc, which does not.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_newRV" :sv
  (thing :sv))

(defcfun "Perl_newRV_noinc" :sv
  (thing :sv))
Defines perl-newrv, perl-newrv-noinc (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:perl-newrv #:perl-newrv-noinc

The Perl API dereferences with a macro, so we have to do it by digging into the RV/SV structure.

<Perl API Types>+= (U->) [<-D]
(defcstruct xrv
  (rv :sv))

<perl-api Exports>+= (U->) [<-D->]
#:xrv #:rv

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun deref-rv (ref)
  (foreign-slot-value (foreign-slot-value ref 'sv 'any)
                      'xrv 'rv))
Defines deref-rv (links are to index).

<API Tests>+= (U->) [<-D->]
(define-test references
  (let ((s (perl-newsv 0)))
    (assert-equality #'pointer-eq s
                     (deref-rv (perl-newrv s)))))

<API Tests>+= (U->) [<-D->]
(define-test reference-to-array
  (let ((a (perl-newav)))
    (assert-equality #'pointer-eq a
                     (deref-rv (perl-newrv a)))))

Manipulating the Perl Stack

It's a dirty job, but we have to do it. The only way to pass arguments to and return values from functions is to manipulate the Perl stack directly. Ugh. Here goes.

A Digression on Pointers

In order to manipulate the Perl stack, we need to modify the values of several global pointers---not the values the they point to, but the addresses stored in the pointers themselves. However, CFFI's :pointer type is immutable; once a foreign variable is defcvared as a :pointer, one cannot modify the address it contains.

However, foreign variables declared as integers can be modified under CFFI, and in C, pointers are just integers with some additional type information. The only question is which integer type to use, since pointers can be different sizes on different platforms. I cannot cheat here; this has to be correct. Fortunately, CFFI already knows the answer.

<Determine Pointer Size>= (U->)
(defvar *pointer-size*
  (foreign-type-size :pointer)
  "The size of a pointer on the current platform, in bytes.")
Defines *pointer-size* (links are to index).

On this basis, we can create a new ``pointer-as-integer'' type. I will name this new type :address. (This may need to be wrapped in an eval-when.)

<:address Type>= (U->)
(ecase *pointer-size*
  (1 (defctype :address :uint8))   ; unlikely
  (2 (defctype :address :uint16))  ; possible
  (4 (defctype :address :uint32))  ; most common
  (8 (defctype :address :uint64))) ; possible
Defines :address (links are to index).

Since :address is just another name for an integer, CFFI's definitions for setf and its ilk will work correctly. Arithmetic between addresses will also work correctly.

C's ++ and -- operators increment and decrement pointers by the correct number of bytes, so stack operations can be succintly written as *++stack=object or similar. But Lisp's incf will always increment by one unless given a different value. To avoid mistakes, I will define two macros to increment and decrement an address by the size of a pointer.

<Macros for Using :address>= (U->) [D->]
(defmacro address-incf (address &optional (n 1))
  `(incf ,address (* ,n *pointer-size*)))

(defmacro address-decf (address &optional (n 1))
  `(decf ,address (* ,n *pointer-size*)))
Defines address-decf, address-incf (links are to index).

Lastly, we will need to access the value the address points to. We can create an abbreviation that fills the same role as C's * operator.

<Macros for Using :address>+= (U->) [<-D]
(defmacro address-ref (address type)
  `(mem-ref (make-pointer ,address) ,type))
Defines address-ref (links are to index).

Since this is a macro, CFFI's (setf (mem-ref ...)) magic still works. This allows stack operations such as:

(setf (address-ref (address-incf x) ...))

which is equivalent to *++x = ... in C.

I will wrap this functionality in its own package, since it is not directly related to Perl.

;;; address.lisp -- CFFI extension to allow mutable pointers

<License Header>

(in-package :common-lisp-user)

(defpackage :cffi-address
  (:use :common-lisp :cffi)
  (:export #:address-incf #:address-decf #:address-ref))

(in-package :cffi-address)

<Determine Pointer Size>
<:address Type>
<Macros for Using :address>

The Stack Pointer

The Perl API uses a series of macros to create and manipulate a local copy of the stack pointer stored in the global variable PL_stack_sp, of type SV**. The local pointer is declared and given its initial value with the dSP macro.

The only reason I can see for the local copy is so that it can be optimized with the C register keyword. For our purposes, it will be simpler to manipulate the global variable directly.

<CFFI Definitions>+= (U->) [<-D->]
(defcvar "PL_stack_sp" :address)
Defines *pl-stack-sp* (links are to index).

<perl-api Exports>+= (U->) [<-D->]

Then we can translate the PUSHMARK and PUTBACK macros, used to keep track of the number of parameters being pushed onto the stack in a function call. I don't claim to understand exactly what these two macros do; I'm just transcribing their definitions into Lisp. Thep PL_* variables have different definitions in different parts of the Perl API source. They are macros, but they appear as symbols in the object code, so I am hoping it is possible to use them directly.

Here is the C definition of PUSHMARK, from pp.h in the Perl source:

<PUSHMARK macro in C>=
#define PUSHMARK(p) if (++PL_markstack_ptr == PL_markstack_max) \
                        markstack_grow();                       \
                    *PL_markstack_ptr = (p) - PL_stack_base

Translated to Lisp, using global variables, this becomes:

<CFFI Definitions>+= (U->) [<-D->]
(defcvar "PL_markstack_ptr" :address)  ; *pl-markstack-ptr*
(defcvar "PL_markstack_max" :address)  ; *pl-markstack-max*
(defcvar "PL_stack_base" :address)     ; *pl-stack-base*
(defcfun "Perl_markstack_grow" :void)  ; (perl-markstack-grow)

<perl-api Exports>+= (U->) [<-D->]
#:*pl-markstack-ptr* #:*pl-markstack-max* #:*pl-stack-base* #:perl-markstack-grow

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun pushmark ()
  (when (= (address-incf *pl-markstack-ptr*) *pl-markstack-max*)
  (setf (address-ref *pl-markstack-ptr* :address)
        (- *pl-stack-sp* *pl-stack-base*)))
Defines pushmark (links are to index).

If you try to call a function with arguments without first calling pushmark, Perl dies violently with an ``Out of memory!'' error and takes Lisp down with it.

The PUTBACK macro's purpose purpose in C is to reset the global stack pointer to the value of the local copy. Since we are working directly with the global pointer, we can omit PUTBACK.

Pushing Arguments Onto the Stack

The XPUSHs macro in the Perl API is used to push new scalar values onto the Perl stack. The first thing it does is extend the stack if necessary, using the EXTEND macro, which looks like this:

<EXTEND macro in C>=
#define EXTEND(p,n) STMT_START { if (PL_stack_max - p < (int)(n)) { \
                                 sp = stack_grow(sp,p, (int) (n));  \
                               } } STMT_END

The STMT_START and STMT_END do nothing; they are macros used by the Perl source to prevent certain C compiler warnings.

We replace this with a Lisp function that serves the same purpose. Since we are treating PL_stack_sp as an :address, we must declare the Perl_stack_grow function to return an :address (instead of a :pointer) as well.

<CFFI Definitions>+= (U->) [<-D->]
(defcvar "PL_stack_max" :address)
(defcfun "Perl_stack_grow" :address
  (sp :address) (p :address) (n :uint))

<perl-api Exports>+= (U->) [<-D->]
#:*pl-stack-max* #:perl-stack-grow

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun ensure-room-on-stack (n)
  (when (< (- *pl-stack-max* *pl-stack-sp*) n)
    (setf *pl-stack-sp*
          (perl-stack-grow *pl-stack-sp* *pl-stack-sp* n))))
Defines ensure-room-on-stack (links are to index).

The XPUSHs macro looks like this:

<XPUSHs macro in C>=
#define XPUSHs(s) STMT_START { EXTEND(sp,1); (*++sp = (s)); } STMT_END

Now we can define an equivalent to XPUSHs.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun pushs (scalar)
  "Push a scalar value (a pointer) onto the Perl stack."
  (ensure-room-on-stack 1) ; EXTEND
  (setf (address-ref (address-incf *pl-stack-sp*) :address)
        (pointer-address scalar)))
Defines pushs (links are to index).

Popping Values Off the Stack

Popping a scalar value off the Perl stack is, thankfully, much simpler.

The C macro is:

<POPs macro in C>=
#define POPs (*sp--)

Which becomes, in Lisp:

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun pops ()
  "Pop a scalar pointer off the Perl stack."
      (address-ref *pl-stack-sp* :pointer)
    (address-decf *pl-stack-sp*)))
Defines pops (links are to index).

Phew. Let's test that, shall we? We will check that pushing a value on to the stack and popping it off gives back the same value, and check that the stack pointer is in the same place after the operation as it was before.

<API Tests>+= (U->) [<-D->]
(define-test push-and-pop-one-scalar
  (let ((x (perl-newsv 0))
        (old-stack-address *pl-stack-sp*))
    (pushs x)
    (assert-equality #'pointer-eq x
                     (address-ref *pl-stack-sp* :pointer))
    (assert-equality #'pointer-eq x (pops))
    (assert-equal old-stack-address *pl-stack-sp*)))

Scope and Temporary Variables

When creating temporary variables to place on the stack as function arguments, we must define a scope for those variables to live in, and free their memory when we are finished with them. The Perl API provides a set of four macros for this purpose, described in perlcall: ENTER and SAVETMPS begin a new scope for local variables, and FREETMPS and LEAVE end the scope. Within that scope, local variables must be declared ``mortal'' with the sv_2mortal() function. We can imitate all of this in Lisp.

Normally, the SAVETMPS and FREETMPS macros fiddle with a ``temporary value stack'' to avoid calling free_tmps if not necessary. To keep it simple, we will always call free_tmps. This does no harm and should not be a major performance drain. As a result of this simplification, we can completely omit SAVETMPS.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_push_scope" :void) ; ENTER
(defcfun "Perl_free_tmps" :void)  ; FREETMPS
(defcfun "Perl_pop_scope" :void)  ; LEAVE
(defcfun "Perl_sv_2mortal" :sv
  (scalar :sv))
Defines perl-free-tmps, perl-pop-scope, perl-push-scope, perl-sv-2mortal (links are to index).

<perl-api Exports>+= (U->) [<-D->]
#:perl-push-scope #:perl-free-tmps #:perl-pop-scope #:perl-sv-2mortal

Nowe we can create a Lisp macro that neatly packages up the process of creating a Perl scoping block.

<Wrapper Library Macros>= (U->)
(defmacro perl-scope (&body body)
  (let ((return-symbol (gensym)))
       (perl-push-scope)  ; ENTER
       ;; SAVETMPS omitted
       (let ((,return-symbol (progn ,@body))) ; execute body
         (perl-free-tmps) ; FREETMPS
         (perl-pop-scope) ; LEAVE
Defines perl-scope (links are to index).

The let in the middle allows us to return values from the body, which we will want to do when we start calling Perl functions.

We can test the scope by making sure that a scalar declared ``mortal'' inside the scope has its reference count set to zero outside of the scope.

<API Tests>+= (U->) [<-D->]
(define-test perl-scope
  (let ((x))
     (setf x (perl-newsviv 999))
     (perl-sv-2mortal x)
     (assert-equal 999 (perl-sv-2iv x))
     ;; still within the scope block:
     (assert-equal 1 (foreign-slot-value x 'sv 'refcnt)))
    ;; outside the scope block here:
    (assert-equal 0 (foreign-slot-value x 'sv 'refcnt))))

Using the Perl Stack

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun push-mortals-on-stack (args)
  (loop for arg in args
        do (pushs (perl-sv-2mortal (perl-from-lisp arg)))))
Defines push-mortals-on-stack (links are to index).

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun get-from-stack (n)
  (nreverse (loop repeat n
                  collecting (lisp-from-perl (pops)))))

(defun get-stack-values (n)
  (values-list (get-from-stack n)))
Defines get-from-stack, get-stack-values (links are to index).

<Wrapper Library Tests>+= (U->) [<-D->]
(define-test stack
  (let ((things (list 1 2 "hello" -27)))
     (push-mortals-on-stack things)
     (assert-equal things (get-from-stack (length things))))))

Calling Perl Functions


Calling Perl functions (or subroutines, as Perl calls them) always boils down to a single function, Perl_call_sv, which takes a scalar argument which can be a the name of function (a string) or an anonymous function reference. All parameter passing to and from the Perl function is done on the Perl stack.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_call_sv" :i32
  (name :sv)
  (flags :i32))
Defines perl-call-sv (links are to index).

<perl-api Exports>+= (U->) [<-D->]

The second argument is a bitfield specifying the type of function to call, the context (array, scalar, or void) in which to call it, and how to handle errors. Here are the values, copied from Perl's cop.h along with their documenting comments.

<CFFI Definitions>+= (U->) [<-D->]
(defbitfield perl-call-flags
  (:scalar   0)  ; call in scalar context
  (:array    1)  ; call in array context
  (:void   128)  ; call in void context (no return values)
  (:discard  2)  ; Call FREETMPS.
  (:eval     4)  ; Assume `eval {}' around subroutine call.
  (:noargs   8)  ; Don't construct a @_ array.
  (:keeperr 16)  ; Append errors to $@, don't overwrite it.
  (:nodebug 32)  ; Disable debugging at toplevel.
  (:method  64)) ; Calling method. 
Defines perl-call-flags (links are to index).


<perl-api Exports>+= (U->) [<-D->]

A shortcut exists that takes a C string as its argument instead of a scalar:

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_call_pv" :i32
  (name :string)
  (flags :i32))
Defines perl-call-pv (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test perl-call-pv
  (assert-equal 1  ; 1 value was return on the stack
"sub meaning_of_life { print \"\\nThis should be forty-two: \",
 $_[0], \"\\n\"; return $_[0]; }" t)
                  (pushs (perl-sv-2mortal (perl-newsviv 42)))
                  (perl-call-pv "meaning_of_life"
  (assert-equal 42 (perl-sv-2iv (pops))))

We can abstract out the foreign bitfield with a function. We will export this function so that libraries that use this library will not need to import any symbols from CFFI. Here, flags should be a list.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun perl-call-scalar (scalar flags)
  (perl-call-sv scalar (foreign-bitfield-value
                         'perl-call-flags flags)))

(defun perl-call-function (name flags)
  (perl-call-pv name (foreign-bitfield-value
                         'perl-call-flags flags)))
Defines perl-call-function, perl-call-scalar (links are to index).

<API Tests>+= (U->) [<-D->]
(define-test perl-call-function
  (assert-equal 1 (progn
"sub meaning_of_life { print \"\\nThis should be forty-two: \",
 $_[0], \"\\n\"; return $_[0]; }" t)
                    (pushs (perl-sv-2mortal (perl-newsviv 42)))
                    (perl-call-function "meaning_of_life" '(:scalar))))
  (assert-equal 42 (perl-sv-2iv (pops))))

Perl Calling Contexts

Users of this library should not have to worry about the special flags used when calling Perl functions from C. However, we can't entirely shield them from Perl's notion of calling contexts. Perl functions can be called in scalar context, list context, or void context. The interpreter determines the context based on how the return value of the function is used. For example:

<sample Perl code>=
my $scalar = func();  # called in scalar context
my @array = func();  # called in list context
func();  # called in void context (return value not used)

In this example, func may return entirely different values in each of the contexts in which it is called. We cannot recreate this behavior in Lisp without doing ghastly things to syntax and functional purity.

Furthermore, in Perl the number 5 is indestinguishable from the string ``5''---both are scalars. Lisp is dynamically typed, but not that dynamic; it does has some modesty.

The simplest albeit not the prettiest way out of this dilemma is to force the user to specify the type of the return value. Thus, in the eval-perl, call-perl, and call-perl-method functions, below, the first argument specifies the return type.

An argument of nil will call the function in void context and will return nothing. Most functions will behave the same way in a void context as they do in scalar context, they simply discard their return value.

The following arguments will call the function in a scalar context:

The following arguments will call the function in a list context:

Note that Perl does not have an explicit ``hash context'' for calling functions. Perl functions that return a hash table actually return a list in the form ``key1, value1, key2, value2.'' Assigning this list to a Perl hash variable causes it to be interpreted as a hash table. Again, Lisp is not quite that dynamic, so we must specify the result type.

The following function will return the correct flag, :void, :scalar, or :array, to use when calling Perl, based on the given return type.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun context-from-type (type)
   ((null type) :void)
   ((find type '(:integer :float :string t)) :scalar)
   ((find type '(:list :array :alist :hash)) :array)
   (t (error "No Perl calling context for type ~a" type))))
Defines context-from-type (links are to index).

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun calling-flags (type methodp)
  (let ((flags (list :eval (context-from-type type))))
    (when methodp (push :method flags))
Defines calling-flags (links are to index).


<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun get-stack-by-type (type count)
  (declare (ignore type)) ;; fix me
  (get-stack-values count))
Defines get-stack-by-type (links are to index).

Public Interface


We cannot use :discard in the calling flags because that would destroy the return value before we can use it (discovered by trial and error).

<Wrapper Library Public Functions>+= (U->) [<-D->]
(defun call-perl (function return-type methodp &rest args)
   (push-mortals-on-stack args)
    (funcall (if (stringp function) #'perl-call-function
               ;; either a scalar string or a code reference
             (calling-flags return-type methodp)))))
Defines call-perl (links are to index).

<Wrapper Library Exports>+= (U->) [<-D->]

<Wrapper Library Tests>+= (U->) [<-D->]
(define-test call-perl
  (eval-perl "use CGI;")
  (assert-equal "<p>Hello, 1999</p>"
                (call-perl "CGI::p" :string nil "Hello," 1999)))

Evaluating Perl Code

We can evaluate arbitrary strings of Perl code with the Perl_eval_pv function. Its first argument is a string of Perl code, which may be an expression, a semicolon-terminated statement, or multiple semicolon-separated statements. The second argument is a boolean specifying whether or not the process should die if a Perl error occurs.

Perl_eval_pv always returns a single scalar as its result, so the given statements or expressions will be evaluated in scalar context.

<CFFI Definitions>+= (U->) [<-D->]
(defcfun "Perl_eval_pv" :sv
  (code :string)
  (die-on-error :boolean))
Defines perl-eval-pv (links are to index).

<perl-api Exports>+= (U->) [<-D->]

<API Tests>+= (U->) [<-D->]
(define-test eval-pv-expressions
  (assert-equal 7 (perl-sv-2iv (perl-eval-pv "3 + 4" nil)))
  (assert-equal "7" (perl-in-lisp::string-from-sv (perl-eval-pv "3 + 4" nil)))
  (assert-equal "olleh" (perl-in-lisp::string-from-sv
                         (perl-eval-pv "reverse 'hello'" nil))))

Anything that can go in a normal Perl script can be used in Perl_eval_pv: you can load other modules, create variables, and declare packages.

<API Tests>+= (U->) [<-D->]
(define-test eval-pv-multi-statement
  (assert-equal "<p align=\"center\">Hello, world!</p>"
                 (perl-eval-pv "
package PerlInLisp::Tests;
use CGI;
my $cgi = new CGI;
$cgi->p({align=>'center'}, 'Hello, world!');" nil))))

Note that a single call to Perl_eval_pv defines a block of Perl scope. Local variables declared with my will not retain their value between calls.

<API Tests>+= (U->) [<-D->]
(define-test eval-pv-local-scope
  (assert-equal 385 (perl-sv-2uv (perl-eval-pv "my $x = 385" nil)))
  (assert-equal 0 (perl-sv-2uv (perl-eval-pv "$x" nil))))

To keep values between calls, you must use package-global variables.

<API Tests>+= (U->) [<-D->]
(define-test eval-pv-global-scope
  (assert-equal 200 (perl-sv-2uv (perl-eval-pv "$var = 200" nil)))
  (assert-equal 200 (perl-sv-2uv (perl-eval-pv "$var" nil))))

Perl_eval_pv is actually only a shortcut for the more general Perl_eval_sv, which takes the code argument as a scalar. Its second argument is the same set of flags as those used by Perl_call_sv. Also like call_sv, its integer return value is the number of result values placed on the Perl stack.

<CFFI Definitions>+= (U->) [<-D]
(defcfun "Perl_eval_sv" :i32
  (code :sv)
  (flags :i32))
Defines perl-eval-sv (links are to index).

<perl-api Exports>+= (U->) [<-D]

<API Tests>+= (U->) [<-D->]
(define-test eval-sv
  (assert-equal 1 (perl-eval-sv (perl-newsvpv "20 + 7" 0)
                                 '(:scalar :eval))))
  (assert-equal 27 (perl-sv-2iv (pops))))

We can abstract away the bitfield here just as with the perl-call functions.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun perl-eval-scalar (scalar flags)
  (perl-eval-sv scalar (foreign-bitfield-value
                        'perl-call-flags flags)))
Defines perl-eval-scalar (links are to index).

<API Tests>+= (U->) [<-D]
(define-test perl-eval-scalar
  (assert-equal 1 (perl-eval-scalar (perl-newsvpv "33+1" 0) '(:scalar)))
  (assert-equal 34 (perl-sv-2iv (pops))))

This wrapper function will evaluate the given string of Perl code in scalar context, returning whatever that code returns, automatically converted to the most likely Lisp type.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defun eval-perl (code)
    (perl-eval-scalar (perl-sv-2mortal (perl-newsvpv code 0))
                      '(:scalar :eval)))))
Defines eval-perl (links are to index).

<Wrapper Library Tests>+= (U->) [<-D->]
(define-test eval-perl
  (assert-equal 7 (eval-perl "3+4"))
  (assert-equal "abcdef" (eval-perl "'abc' . 'def'"))
  (assert-equal '(1 2 3) (eval-perl "[1, 2, 3];")))

(define-test eval-perl-with-hash
  (let ((hash
          "{aa => 1, bb => 3.14, cc => 'hello'};")))
    (assert-true (hash-table-p hash))
    (assert-equal 1 (gethash "aa" hash))
    (assert-true (< 3.13 (gethash "bb" hash) 3.15))
    (assert-equal "hello" (gethash "cc" hash))

(define-test eval-perl-creating-hash
  (let ((hash (make-hash-table)))
    (setf (gethash 'key1 hash) "one")
    (setf (gethash 'key2 hash) "two")
    (setf (gethash 'key3 hash) "three")
    (let ((new-hash (lisp-from-perl (perl-from-lisp hash))))
      (assert-true (hash-table-p new-hash))
      (assert-equal "one" (gethash "KEY1" new-hash))
      (assert-equal "two" (gethash "KEY2" new-hash))
      (assert-equal "three" (gethash "KEY3" new-hash)))))

Loading Perl Modules

The Perl API provides the load_module function as an equivalent to the use directive in Perl code. I could never make it work correctly, and it seems to require a module version number anyway. As a simpler alternative, evaluate a standard Perl use statement in an eval-perl.

<Wrapper Library Public Functions>+= (U->) [<-D]
(defun use-perl-module (name &optional version)
  (eval-perl (format nil "use ~A ~@[~A~];" name version)))
Defines use-perl-module (links are to index).

<Wrapper Library Exports>+= (U->) [<-D]

Automatic Type Conversions

It would be really useful to have generic functions that would convert automatically between appropriate types. Perl arrays can become Lisp lists, hash tables can be hashes, and so on.

Here I follow the slightly out-of-fashion Hungarian notation of naming conversion functions ``x FROM y'' rather than ``y TO x.'' I find the former easier to read, because it puts the type name closest to the object that is of that type.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defgeneric perl-from-lisp (value))

(defmethod perl-from-lisp ((value integer))
   ((and (<= 0 value 4294967295)) ;; 32-bit unsigned integers
    (perl-newsvuv value))
   ((and (> 0 value -2147483648)) ;; 32-bit signed integers
    (perl-newsviv value))
   (t (error "Integer value out of range for Perl;
BigInts not supported"))))

(defmethod perl-from-lisp ((value float))
   ;; ensure VALUE is a double-float
   (float value 1.0d0)))

(defmethod perl-from-lisp ((value string))
  (perl-newsvpv value 0))

(defmethod perl-from-lisp ((value list))
  (let ((a (perl-newav)))
    (loop for i in value
          ;; Perl's "push" pushes to the *end* of the array
          do (perl-av-push a (perl-from-lisp i)))

(defmethod perl-from-lisp ((value hash-table))
  (hv-from-hash value))
Defines perl-from-lisp (links are to index).

While the Perl API uses &PL_sv_undef to indicate an undef value, the recommended way to add undefined values to arrays and hashes is to create a new empty scalar.

<Wrapper Library Internal Functions>+= (U->) [<-D->]
(defmethod perl-from-lisp ((value null)) ; NIL isn't a class; NULL is
  (perl-newsv 0))

<Wrapper Library Internal Functions>+= (U->) [<-D]
(defun lisp-from-perl (p)
  (ecase (svtype p)
    (:null nil)
    (:iv (perl-sv-2iv p))
    (:nv (perl-sv-2nv p))
    (:rv (lisp-from-perl (deref-rv p)))
    (:pv (string-from-sv p))
    (:pviv (cffi:mem-ref p :iv))
    (:pvnv (cffi:mem-ref p :nv))
    (:pvmg (error "Blessed or magical scalars not supported yet"))
    (:pvav (list-from-av p))
    (:pvhv (hash-from-hv p))))
Defines lisp-from-perl (links are to index).

<Wrapper Library Tests>+= (U->) [<-D]
(define-test lisp-from-perl-scalars
  (assert-equal 42 (lisp-from-perl (perl-from-lisp 42)))
  (assert-equal "Hello, world!"
                (lisp-from-perl (perl-from-lisp "Hello, world!")))
   ;; we can't get exact equality from floats
   (< 3.14589 (lisp-from-perl (perl-from-lisp 3.1459)) 3.14591))
  (assert-equal nil (lisp-from-perl (perl-from-lisp nil))))


<perl-api Package Definition>= (U->)
;;;; perlapi.lisp - CFFI definitions for the Perl C API

<License Header>

(cl:in-package :common-lisp-user)

(defpackage :perl-api
  (:use :common-lisp :cffi :cffi-address)
  (:export <perl-api Exports>))

(in-package :perl-api)
Defines perl-api (links are to index).

<perl-in-lisp Package Definition>= (U->)
;;;; Perl-in.lisp - Lisp interface to the Perl API

<License Header>

(cl:in-package :common-lisp-user)

(defpackage :perl-in-lisp
  (:use :common-lisp :cffi :cffi-address :perl-api)
  (:nicknames :perl)
  (:export <Wrapper Library Exports>))

(in-package :perl-in-lisp)

ASDF System Definition

;;;; perl-in-lisp.asd - ASDF definition for a Lisp interface to Perl

<License Header>

(defpackage :perl-in-lisp.system
  (:documentation "ASDF system package for PERL-IN-LISP.")
  (:use :common-lisp :asdf))

(in-package :perl-in-lisp.system)

(defsystem :perl-in-lisp
  :components ((:static-file "perl-in-lisp.asd")
               (:module :src
                        :serial t
                        :components ((:file "address")
                                     (:file "perlapi")
                                     (:file "perl-in"))))
  :depends-on (:cffi))

(defsystem :perl-in-lisp.test
  :components ((:module :tests
                        :serial t
                        :components ((:file "lisp-unit")
                                     (:file "tests"))))
  :depends-on (:perl-in-lisp))

(defmethod perform ((op asdf:test-op) (system (eql (find-system :perl-in-lisp))))
  (asdf:oos 'asdf:load-op :perl-in-lisp.test)
  (format t "Tests loaded.
Change to package PERL-IN-LISP and execute
(RUN-TESTS) to run all tests."))

Output Files

<perl-api Package Definition>
<Libperl foreign library definition>
<Perl API Types>
<CFFI Definitions>

<perl-in-lisp Package Definition>
<Wrapper Library Globals>
<Wrapper Library Macros>
<Wrapper Library Internal Functions>
<Wrapper Library Public Functions>

;;;; tests.lisp -- Unit tests (with Lisp-Unit) for Perl-in-Lisp

<License Header>

(in-package :common-lisp-user)

;; (defpackage :perl-in-lisp.test
;;   (:use :common-lisp :perl-in-lisp :perl-api
;;      :lisp-unit :cffi :cffi-address)
;;   (:export #:run-tests)

(eval-when (:compile-toplevel :load-toplevel :execute)
  (use-package :lisp-unit :perl-in-lisp))

(in-package :perl-in-lisp)
<API Tests>
<Wrapper Library Tests>

Development Aids



Edit this Makefile as needed, then extract it with the following command:

notangle -t8 perl-in-lisp.nw >Makefile

To generate the source code and documentation (DVI) run make.

The other defined makefile targets are:

doc --- only generate the documentation

code --- only extract the source code

pdf --- generate PDF documentation instead of DVI (requires pdflatex)

html --- generate HTML documentation

To re-extract this Makefile, run make remake.

ENSURE_DIR=mkdir -p
FASLS=*.fasl *.fas *.lib *.x86f

all: code doc

code: perl-in-lisp.nw
        $(ENSURE_DIR) src tests
        $(TANGLE) -Rperl-in-lisp.asd  perl-in-lisp.nw >perl-in-lisp.asd
        $(TANGLE) -Raddress.lisp  perl-in-lisp.nw >src/address.lisp
        $(TANGLE) -Rperlapi.lisp  perl-in-lisp.nw >src/perlapi.lisp
        $(TANGLE) -Rperl-in.lisp perl-in-lisp.nw >src/perl-in.lisp
        $(TANGLE) -Rtests.lisp  perl-in-lisp.nw >tests/tests.lisp

doc: perl-in-lisp.nw
        $(ENSURE_DIR) doc
        $(WEAVE) -t8 -latex -delay -index perl-in-lisp.nw >doc/perl-in-lisp.tex
        # run latex twice to get references right
        cd doc; $(LATEX) perl-in-lisp.tex; $(LATEX) perl-in-lisp.tex

# pdf depends on doc to ensure latex was already run once to generate
# references and table of contents
pdf: doc
        cd doc; $(PDFLATEX) perl-in-lisp.tex

html: perl-in-lisp.nw
        $(WEAVE) -index -html -filter l2h perl-in-lisp.nw | htmltoc >doc/perl-in-lisp.html

remake: perl-in-lisp.nw
        $(TANGLE) -t8 perl-in-lisp.nw >Makefile

        rm -f *~ *.out $(FASLS)
        cd src; rm -f $(FASLS)
        cd tests; rm -f $(FASLS)
        cd doc; rm -f *.aux *.log *.tex *.toc

dist: remake code doc html clean
        cd doc; rm -f *.dvi *.pdf

List of All Code Chunks

This list is automatically generated by Noweb.

Symbol Index

This list is automatically generated by Noweb. The underlined number after each symbol specifies the page and code chunk on which that symbol is defined; other numbers specify pages and chunks where that symbol is used.

License (LLGPL)

<License Header>= (<-U <-U <-U <-U <-U)
;;; Copyright 2006 Stuart Sierra

;;; This program is free software; you can redistribute it and/or
;;; modify it under the terms of the Lisp Lesser GNU General Public
;;; License (LLGPL) published by Franz, Inc., available at

;;; This program is distributed in the hope that it will be useful,
;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;;; Lesser GNU General Public License for more details.