The Guile Reference Manual
Copyright (C) 1996 Free Software Foundation
Copyright (C) 1997 Free Software Foundation
Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.
Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.
Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by Free Software Foundation.
OK, enough is enough. I can see that I'm not going to be able to fool you guys. I confess everything. You're right. It all was an evil conspiracy. There really isn't a shred of merit in Tcl, or C++, or Perl, or C; there is not a single reason on earth why anyone should use any of these languages for any programming task. Scheme truly is the perfect language that solves every problem and combines the virtues of every other language. For years we've been plotting to trick programmers into using bad languages. Yes, I mean "we". Many many people have participated in this sinister plot, including Larry Wall, Dennis Ritchie, Bill Gates, the Bureau of ATF, most of the LAPD, and Mark Fuhrman (sorry you guys, but the truth has overwhelmed me so I've been forced to expose you). I feel just terrible at how I have set the programming world back, and I promise to be a good boy from now on.
--- John Ousterhout
Part I: Preliminaries
Guile is an interpreter for the Scheme programming language, packaged for use in a wide variety of environments. Guile implements Scheme as described in the Report on the Algorithmic Language Scheme (usually known as R4RS), providing clean and general data and control structures. Guile goes beyond the rather austere language presented in R4RS, extending it with a module system, full access to POSIX system calls, networking support, multiple threads, dynamic linking, a foreign function call interface, powerful string processing, and many other features needed for programming in the real world.
Like a shell, Guile can run interactively, reading expressions from the user, evaluating them, and displaying the results, or as a script interpreter, reading and executing Scheme code from a file. However, Guile is also packaged as an object library, allowing other applications to easily incorporate a complete Scheme interpreter. An application can use Guile as an extension language, a clean and powerful configuration language, or as multi-purpose "glue", connecting primitives provided by the application. It is easy to call Scheme code from C code and vice versa, giving the application designer full control of how and when to invoke the interpreter. Applications can add new functions, data types, control structures, and even syntax to Guile, creating a domain-specific language tailored to the task at hand, but based on a robust language design.
Guile's module system allows one to break up a large program into manageable sections with well-defined interfaces between them. Modules may contain a mixture of interpreted and compiled code; Guile can use either static or dynamic linking to incorporate compiled code. Modules also encourage developers to package up useful collections of routines for general distribution; as of this writing, one can find Emacs interfaces, database access routines, compilers, GUI toolkit interfaces, and HTTP client functions, among others.
In the future, we hope to expand Guile to support other languages like Tcl and Perl by compiling them to Scheme code. This means that users can program applications which use Guile in the language of their choice, rather than having the tastes of the application's author imposed on them.
This manual assumes you know Scheme, as described in R4RS. From there, it describes:
Finally, the appendices explain how to obtain the latest version of Guile, how to install it, where to find modules to work with Guile, and how to use the Guile debugger.
In its simplest form, Guile acts as an interactive interpreter for the Scheme programming language, reading and evaluating Scheme expressions the user enters from the terminal. Here is a sample interaction between Guile and a user; the user's input appears after the $ and guile> prompts:

    $ guile
    guile> (+ 1 2 3)                ; add some numbers
    6
    guile> (define (factorial n)    ; define a function
             (if (zero? n) 1 (* n (factorial (- n 1)))))
    guile> (factorial 20)
    2432902008176640000
    guile> (getpwnam "jimb")        ; find my entry in /etc/passwd
    #("jimb" ".0krIpK2VqNbU" 4008 10 "Jim Blandy" "/u/jimb"
      "/usr/local/bin/bash")
    guile> C-d
    $
Like AWK, Perl, or any shell, Guile can interpret script files. A Guile script is simply a file of Scheme code with some extra information at the beginning which tells the operating system how to invoke Guile, and then tells Guile how to handle the Scheme code.
Before we present the details, here is a trivial Guile script:
    #!/usr/local/bin/guile -s
    !#
    (display "Hello, world!")
    (newline)
The first line of a Guile script must tell the operating system to use Guile to evaluate the script, and then tell Guile how to go about doing that. Here is the simplest case:
Guile reads the program, evaluating expressions in the order that they appear. Upon reaching the end of the file, Guile exits.
The function command-line returns the name of the script file and any command-line arguments passed by the user, as a list of strings.

For example, consider the following script file:

    #!/usr/local/bin/guile -s
    !#
    (write (command-line))
    (newline)
If you put that text in a file called `foo' in the current directory, then you could make it executable and try it out like this:
    $ chmod a+x foo
    $ ./foo
    ("./foo")
    $ ./foo bar baz
    ("./foo" "bar" "baz")
    $

As another example, here is a simple replacement for the POSIX echo command:

    #!/usr/local/bin/guile -s
    !#
    (for-each (lambda (s) (display s) (display " "))
              (cdr (command-line)))
    (newline)
Here we describe Guile's command-line processing in detail. Guile processes its arguments from left to right, recognizing the switches described below. For examples, see section Scripting Examples.
-s script arg...
    Read and evaluate Scheme source code from the file script, as the load function would. After loading script, exit. Any command-line arguments arg... following script become the script's arguments; the command-line function returns a list of strings of the form (script arg...).
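-c expr arg...
    Evaluate expr as Scheme code, and then exit. Any command-line arguments arg... following expr become command-line arguments; the command-line function returns a list of strings of the form (guile arg...), where guile is the path of the Guile executable.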
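-- arg...
    Run interactively, prompting the user for expressions and evaluating them. Any command-line arguments arg... following the -- become command-line arguments for the interactive session; the command-line function returns a list of strings of the form (guile arg...), where guile is the path of the Guile executable.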
-l file
-e function
    Make function the entry point of the script. After loading the script file (with -s) or evaluating the expression (with -c), apply function to a list containing the program name and the command-line arguments -- the list provided by the command-line function.

    A -e switch can appear anywhere in the argument list, but Guile always invokes the function as the last action it performs. This is weird, but because of the way script invocation works under POSIX, the -s option must always come last in the list.

    See section Scripting Examples.
-ds
    Treat a final -s option as if it occurred at this point in the command line; load the script here.

    This switch is necessary because, although the POSIX script invocation mechanism effectively requires the -s option to appear last, the programmer may well want to run the script before other actions requested on the command line. For examples, see section Scripting Examples.
\
--emacs
#t
.
This switch is still experimental.
-h, --help
-v, --version
Guile's command-line switches allow the programmer to describe reasonably complicated actions in scripts. Unfortunately, the POSIX script invocation mechanism only allows one argument to appear on the `#!' line after the path to the Guile executable, and imposes arbitrary limits on that argument's length. Suppose you wrote a script starting like this:
    #!/usr/local/bin/guile -e main -s
    !#
    (define (main args)
      (map (lambda (arg) (display arg) (display " "))
           (cdr args))
      (newline))

The intended meaning is clear: load the file, and then call main on the command-line arguments. However, the system will treat everything after the Guile path as a single argument -- the string "-e main -s" -- which is not what we want.

As a workaround, the meta switch \ allows the Guile programmer to specify an arbitrary number of options without patching the kernel. If the first argument to Guile is \, Guile will open the script file whose name follows the \, parse arguments starting from the file's second line (according to rules described below), and substitute them for the \ switch.
Working in concert with the meta switch, Guile treats the characters `#!' as the beginning of a comment which extends through the next line containing only the characters `!#'. This sort of comment may appear anywhere in a Guile program, but it is most useful at the top of a file, meshing magically with the POSIX script invocation mechanism.
Thus, consider a script named `/u/jimb/ekko' which starts like this:
    #!/usr/local/bin/guile \
    -e main -s
    !#
    (define (main args)
      (map (lambda (arg) (display arg) (display " "))
           (cdr args))
      (newline))
Suppose a user invokes this script as follows:
$ /u/jimb/ekko a b c
Here's what happens:
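1. The operating system recognizes the `#!' token at the top of the file, and rewrites the command line to:

       /usr/local/bin/guile \ /u/jimb/ekko a b c

   This is the usual behavior, prescribed by POSIX.

2. When Guile sees the first two arguments, \ /u/jimb/ekko, it opens `/u/jimb/ekko', parses the three arguments -e, main, and -s from it, and substitutes them for the \ switch. Thus, Guile's command line now reads:

       /usr/local/bin/guile -e main -s /u/jimb/ekko a b c

3. Guile then processes these switches: it loads `/u/jimb/ekko' as a file of Scheme code, and then applies main to the list of command-line arguments, as in (main '("/u/jimb/ekko" "a" "b" "c")).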
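The substitution step can be sketched in Scheme as follows. This is an illustration only: the procedure name expand-meta-switch is hypothetical and not part of Guile, and it assumes the arguments on the script's second line have already been parsed into a list of strings.

```scheme
;; Hypothetical helper, for illustration only (not a Guile primitive).
;; ARGS is the command line after the program name.
;; SCRIPT-ARGS is the list of arguments already parsed from the
;; second line of the script file.
(define (expand-meta-switch args script-args)
  (if (and (pair? args) (string=? (car args) "\\"))
      ;; Drop the "\" switch and splice the parsed arguments in its
      ;; place; the script name and user arguments follow unchanged.
      (append script-args (cdr args))
      ;; No meta switch: leave the command line alone.
      args))

(define expanded
  (expand-meta-switch '("\\" "/u/jimb/ekko" "a" "b" "c")
                      '("-e" "main" "-s")))
;; expanded is ("-e" "main" "-s" "/u/jimb/ekko" "a" "b" "c")
```

The result matches the rewritten command line shown in step 2, minus the path to the Guile executable.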
When Guile sees the meta switch \, it parses command-line arguments from the script file according to the following rules:
""
.
The escape sequences \n and \t are also supported. These produce argument constituents; the two-character combination \n doesn't act like a terminating newline. The escape sequence \NNN for exactly three octal digits reads as the character whose ASCII code is NNN. As above, characters produced this way are argument constituents. Backslash followed by other characters is not allowed.
To start with, here are some examples of invoking Guile directly:
guile -- a b c
    Run Guile interactively; (command-line) will return ("/usr/local/bin/guile" "a" "b" "c").

guile -s /u/jimb/ex2 a b c
    Load the file `/u/jimb/ex2'; (command-line) will return ("/u/jimb/ex2" "a" "b" "c").

guile -c '(write %load-path) (newline)'
    Write the value of the variable %load-path, print a newline, and exit.

guile -e main -s /u/jimb/ex4 foo
    Load the file `/u/jimb/ex4', and then call the function main, passing it the list ("/u/jimb/ex4" "foo").

guile -l first -ds -l last -s script
    Load the files `first', `script', and `last', in that order. The -ds switch says when to process the -s switch. For a more motivated example, see the scripts below.
Here is a very simple Guile script:
    #!/usr/local/bin/guile -s
    !#
    (display "Hello, world!")
    (newline)

The first line marks the file as a Guile script. When the user invokes it, the system runs `/usr/local/bin/guile' to interpret the script, passing -s, the script's filename, and any arguments given to the script as command-line arguments. When Guile sees -s script, it loads script. Thus, running this program produces the output:

    Hello, world!
Here is a script which prints the factorial of its argument:
    #!/usr/local/bin/guile -s
    !#
    (define (fact n)
      (if (zero? n) 1
          (* n (fact (- n 1)))))

    (display (fact (string->number (cadr (command-line)))))
    (newline)

In action:

    $ fact 5
    120
    $

However, suppose we want to use the definition of fact in this file from another script. We can't simply load the script file, and then use fact's definition, because the script will try to compute and display a factorial when we load it. To avoid this problem, we might write the script this way:

    #!/usr/local/bin/guile \
    -e main -s
    !#
    (define (fact n)
      (if (zero? n) 1
          (* n (fact (- n 1)))))

    (define (main args)
      (display (fact (string->number (cadr args))))
      (newline))
This version packages the actions the script should perform in a function, main. This allows us to load the file purely for its definitions, without any extraneous computation taking place. Then we used the meta switch \ and the entry point switch -e to tell Guile to call main after loading the script.

    $ fact 50
    30414093201713378043612608166064768844377641568960512000000000000
Suppose that we now want to write a script which computes the choose function: given a set of m distinct objects, (choose n m) is the number of distinct subsets containing n objects each. It's easy to write choose given fact, so we might write the script this way:

    #!/usr/local/bin/guile \
    -l fact -e main -s
    !#
    (define (choose n m)
      (/ (fact m) (* (fact (- m n)) (fact n))))

    (define (main args)
      (let ((n (string->number (cadr args)))
            (m (string->number (caddr args))))
        (display (choose n m))
        (newline)))
The command-line arguments here tell Guile to first load the file `fact', and then run the script, with main as the entry point. In other words, the choose script can use definitions made in the fact script. Here are some sample runs:

    $ choose 0 4
    1
    $ choose 1 4
    4
    $ choose 2 4
    6
    $ choose 3 4
    4
    $ choose 4 4
    1
    $ choose 50 100
    100891344545564193334812497256
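For readers who want to try the definitions without setting up two script files, here is a minimal single-file sketch. It repeats fact and choose from the scripts above; the argument-count check and the usage message in main are additions made for illustration and are not part of the original scripts.

```scheme
;; Factorial, as in the `fact' script above.
(define (fact n)
  (if (zero? n) 1
      (* n (fact (- n 1)))))

;; Number of n-element subsets of a set of m distinct objects.
(define (choose n m)
  (/ (fact m) (* (fact (- m n)) (fact n))))

;; Entry point: ARGS is the list returned by (command-line),
;; i.e. the script name followed by the user's arguments.
;; The usage check is an illustrative addition.
(define (main args)
  (if (= (length args) 3)
      (let ((n (string->number (cadr args)))
            (m (string->number (caddr args))))
        (display (choose n m))
        (newline))
      (begin
        (display "usage: choose N M")
        (newline))))
```

Note that string->number returns #f for text that is not a number, so a more careful script would check n and m before calling choose.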
The Guile interpreter is available as an object library, to be linked into applications using Scheme as a configuration or extension language. This chapter covers the mechanics of linking your program with Guile on a typical POSIX system.
Parts III and IV of this manual describe the C functions Guile provides. Furthermore, any Scheme function described in this manual as a "Primitive" is also callable from C; see section Relationship between Scheme and C functions.
The header file <libguile.h> provides declarations for all of Guile's functions and constants. You should #include it at the head of any C source file that uses identifiers described in this manual.

Once you've compiled your source files, you can link them against Guile by passing the flag -lguile to your linker. If you installed Guile with multi-thread support (by passing --enable-threads to the configure script), you may also need to link against the QuickThreads library, -lqt. Guile refers to various mathematical functions, so you will probably need to link against the mathematical library, -lm, as well.
To initialize Guile, use this function:
When main_func returns, scm_boot_guile calls exit (0); scm_boot_guile never returns. If you want some other exit value, have main_func call exit itself.
scm_boot_guile arranges for the Scheme command-line function to return the strings given by argc and argv. If main_func modifies argc or argv, it should call scm_set_program_arguments with the final list, so Scheme code will know which arguments have been processed.

scm_boot_guile establishes a catch-all error handler which prints an error message and exits the process. This means that Guile exits in a coherent way if a system error occurs and the user isn't prepared to handle it. If the user doesn't like this behavior, they can establish their own universal catcher in main_func to shadow this one.

Why must the caller do all the real work from main_func? Guile's garbage collector assumes that all local variables which reference Scheme objects will be above scm_boot_guile's stack frame on the stack. If you try to manipulate Scheme objects after this function returns, it's the luck of the draw whether Guile's storage manager will be able to find the objects you allocate. So, scm_boot_guile exits, rather than returning, to discourage you from making that mistake.
One common way to use Guile is to write a set of C functions which perform some useful task, make them callable from Scheme, and then link the program with Guile. This yields a Scheme interpreter just like guile, but augmented with extra functions for some specific application -- a special-purpose scripting language.
In this situation, the application should probably process its command-line arguments in the same manner as the stock Guile interpreter. To make that straightforward, Guile provides this function:
scm_shell processes command-line arguments in the manner of the guile executable. This includes loading the normal Guile initialization files, interacting with the user or running any scripts or expressions specified by -s or -e options, and then exiting. See section Invoking Guile, for more details.
Since this function does not return, you must do all application-specific initialization before calling this function.
If you do not use this function to start Guile, you are responsible for making sure Guile's usual initialization files, `init.scm' and `ice-9/boot-9.scm', get loaded. This will change soon.
Here is `simple-guile.c', source code for a main and an inner_main function that will produce a complete Guile interpreter.

    /* simple-guile.c --- how to start up the Guile interpreter from C code.  */

    /* Get declarations for all the scm_ functions.  */
    #include <libguile.h>

    static void
    inner_main (void *closure, int argc, char **argv)
    {
      /* module initializations would go here */
      scm_shell (argc, argv);
    }

    int
    main (int argc, char **argv)
    {
      scm_boot_guile (argc, argv, inner_main, 0);
      return 0; /* never reached */
    }
The main function calls scm_boot_guile to initialize Guile, passing it inner_main. Once scm_boot_guile is ready, it invokes inner_main, which calls scm_shell to process the command-line arguments in the usual way.

Here is a Makefile which you can use to compile the above program.

    # Use GCC, if you have it installed.
    CC=gcc

    # Tell the C compiler where to find <libguile.h> and -lguile.
    CFLAGS=-I/usr/local/include -L/usr/local/lib

    # Include -lqt and -lrx if they are present on your system.
    LIBS=-lguile -lqt -lrx -lm

    simple-guile: simple-guile.o
    	${CC} ${CFLAGS} simple-guile.o ${LIBS} -o simple-guile

    simple-guile.o: simple-guile.c
    	${CC} -c ${CFLAGS} simple-guile.c
If you are using the GNU Autoconf package to make your application more portable, Autoconf will settle many of the details in the Makefile above automatically, making it much simpler and more portable; we recommend using Autoconf with Guile. Here is a `configure.in' file for simple-guile, which Autoconf can use as a template to generate a configure script:

    AC_INIT(simple-guile.c)

    # Find a C compiler.
    AC_PROG_CC

    # Check for libraries.
    AC_CHECK_LIB(m, sin)
    AC_CHECK_LIB(rx, regcomp)
    AC_CHECK_LIB(qt, main)
    AC_CHECK_LIB(guile, scm_boot_guile)

    # Generate a Makefile, based on the results.
    AC_OUTPUT(Makefile)
Here is a Makefile.in template, from which the configure script produces a Makefile customized for the host system:

    # The configure script fills in these values.
    CC=@CC@
    CFLAGS=@CFLAGS@
    LIBS=@LIBS@

    simple-guile: simple-guile.o
    	${CC} ${CFLAGS} simple-guile.o ${LIBS} -o simple-guile

    simple-guile.o: simple-guile.c
    	${CC} -c ${CFLAGS} simple-guile.c
The developer should use Autoconf to generate the `configure' script from the `configure.in' template, and distribute `configure' with the application. Here's how a user might go about building the application:
    $ ls
    Makefile.in     configure*      configure.in    simple-guile.c
    $ ./configure
    creating cache ./config.cache
    checking for gcc... gcc
    checking whether the C compiler (gcc  ) works... yes
    checking whether the C compiler (gcc  ) is a cross-compiler... no
    checking whether we are using GNU C... yes
    checking whether gcc accepts -g... yes
    checking for sin in -lm... yes
    checking for regcomp in -lrx... yes
    checking for main in -lqt... yes
    checking for scm_boot_guile in -lguile... yes
    updating cache ./config.cache
    creating ./config.status
    creating Makefile
    $ make
    gcc -c -g -O2 simple-guile.c
    gcc -g -O2 simple-guile.o -lguile -lqt -lrx -lm -o simple-guile
    $ ./simple-guile
    guile> (+ 1 2 3)
    6
    guile> (getpwnam "jimb")
    #("jimb" "83Z7d75W2tyJQ" 4008 10 "Jim Blandy" "/u/jimb"
      "/usr/local/bin/bash")
    guile> (exit)
    $
Part II: Scheme Extensions
The current "standard" for the Scheme language is the Revised^4 Report on the Algorithmic Language Scheme (r4rs), and it is commonly referred to as R4RS. Most Scheme implementations conform to all the required features in R4RS as well as the optional ones.
But most Scheme implementations go beyond R4RS in some ways, mostly because R4RS does not give specifications (or even recommendations) regarding some issues that are quite important in practical programming.
[FIXME: hmm; what else goes in this chapter? we have chapters for just about everything. I'll put a reference to R4RS and leave it.]
The Scheme language implemented in Guile is R4RS compliant, so R4RS is a valid document describing the basic Guile language. This part of the Guile Reference Manual describes the extensions to Scheme provided in Guile.
In this chapter we describe some minor configurable differences from R4RS, mostly introduced to make eventual Emacs Lisp translation easier. Later chapters will introduce major extensions to Scheme.
Guile's behaviour can be modified by setting options. For example, is the language that Guile accepts case sensitive, or should the debugger automatically show a backtrace on error?
Guile has two levels of interface for managing options: a low-level control interface, and a user-level interface which allows the enabling or disabling of options.
Moreover, the options are classified in groups according to whether they configure reading, printing, debugging or evaluating.
We will use the expression <group> to represent read, print, debug or evaluator.
If scm_options is called without arguments, the current option setting is returned. If the argument is an option setting, options are altered and the old setting is returned. If the argument isn't a list, a list of sublists is returned, where each sublist contains option name, value and documentation string.
With no arguments, <group>-options returns the values of the options in that particular group. If arg is 'help, a description of each option is given. If arg is 'full, programmers' options are also shown.
arg can also be a list representing the state of all options. In this case, the list contains single symbols (for enabled boolean options) and symbols followed by values.
Here is the list of reader options generated by typing (read-options 'full) in Guile. You can also see the default values.

    keywords          #f    Style of keyword recognition: #f or 'prefix
    case-insensitive  no    Convert symbols to lower case.
    positions         yes   Record positions of source code expressions.
    copy              no    Copy source code expressions.
Notice that while Standard Scheme is case insensitive, to ease translation of other Lisp dialects, notably Emacs Lisp, into Guile, Guile is case-sensitive by default.
To make Guile case insensitive, you can type
(read-enable 'case-insensitive)
Here is the list of print options generated by typing (print-options 'full) in Guile. You can also see the default values.

    source        no    Print closures with source.
    closure-hook  #f    Hook for printing closures.
Here is the list of trap options generated by typing (traps 'full) in Guile. You can also see the default values.

    exit-frame   no    Trap when exiting eval or apply.
    apply-frame  no    Trap when entering apply.
    enter-frame  no    Trap when eval enters new frame.
Here is the list of debug options generated by typing (debug-options 'full) in Guile. You can also see the default values.

    stack        20000  Stack size limit (0 = no check).
    debug        yes    Use the debugging evaluator.
    backtrace    no     Show backtrace on error.
    depth        20     Maximal length of printed backtrace.
    maxdepth     1000   Maximal number of stored backtrace frames.
    frames       3      Maximum number of tail-recursive frames in backtrace.
    indent       10     Maximal indentation in backtrace.
    backwards    no     Display backtrace in anti-chronological order.
    procnames    yes    Record procedure names at definition.
    trace        no     *Trace mode.
    breakpoints  no     *Check for breakpoints.
    cheap        yes    *Flyweight representation of the stack at traps.
Here is an example of a session in which some read and debug option handling procedures are used. In this example, the user
1. Notices that the symbols abc and aBc are not the same
2. Examines the read-options, and sees that case-insensitive is set to "no".
3. Enables case-insensitive
4. Now notices that aBc and abc are the same
5. Disables case-insensitive and enables debugging backtrace
6. Prints aBc with backtracing enabled
[FIXME: this last example is lame because there is no depth in the
backtrace. Need to give a better example, possibly putting debugging
option examples in a separate session.]
    guile> (define abc "hello")
    guile> abc
    "hello"
    guile> aBc
    ERROR: In expression aBc:
    ERROR: Unbound variable: aBc
    ABORT: (misc-error)

    Type "(backtrace)" to get more information.
    guile> (read-options 'help)
    keywords          #f    Style of keyword recognition: #f or 'prefix
    case-insensitive  no    Convert symbols to lower case.
    positions         yes   Record positions of source code expressions.
    copy              no    Copy source code expressions.
    guile> (debug-options 'help)
    stack        20000  Stack size limit (0 = no check).
    debug        yes    Use the debugging evaluator.
    backtrace    no     Show backtrace on error.
    depth        20     Maximal length of printed backtrace.
    maxdepth     1000   Maximal number of stored backtrace frames.
    frames       3      Maximum number of tail-recursive frames in backtrace.
    indent       10     Maximal indentation in backtrace.
    backwards    no     Display backtrace in anti-chronological order.
    procnames    yes    Record procedure names at definition.
    trace        no     *Trace mode.
    breakpoints  no     *Check for breakpoints.
    cheap        yes    *Flyweight representation of the stack at traps.
    guile> (read-enable 'case-insensitive)
    (keywords #f case-insensitive positions)
    guile> aBc
    "hello"
    guile> (read-disable 'case-insensitive)
    (keywords #f positions)
    guile> (debug-enable 'backtrace)
    (stack 20000 debug backtrace depth 20 maxdepth 1000 frames 3 indent 10 procnames cheap)
    guile> aBc

    Backtrace:
    0* aBc

    ERROR: In expression aBc:
    ERROR: Unbound variable: aBc
    ABORT: (misc-error)
    guile>

Before the SLIB facilities can be used, the following Scheme expression must be executed:
(use-modules (ice-9 slib))
require can then be used as described in section `SLIB' in The SLIB Manual. For example:

    (require 'format)
    (format "~8,48D" 10)
Jacal is a symbolic math package written in Scheme by Aubrey Jaffer. It is usually installed as an extra package in SLIB (see section Packages not shipped with Guile).
You can use Guile's interface to SLIB to invoke Jacal:
    (use-modules (ice-9 slib))
    (slib:load "math")
    (math)
For complete documentation on Jacal, please read the Jacal manual. If it has been installed on line, you can look at section `Jacal' in The SLIB Manual. Otherwise you can find it on the web at http://www-swiss.ai.mit.edu/~jaffer/JACAL.html
This chapter describes Guile functions that are concerned with loading and evaluating Scheme code at run time. R4RS Scheme, because of strong differences in opinion among implementors, only provides a load function. There are many useful programs that are difficult or impossible to write without more powerful evaluation procedures, so we have provided some.
[FIXME: This needs some more text on the difference between procedures, macros and memoizing macros. Also, any definitions listed here should be double-checked by someone who knows what's going on. Ask Mikael, Jim or Aubrey for help. -twp]
proc
. By
convention, if a procedure contains more than one expression and the
first expression is a string constant, that string is assumed to contain
documentation for that procedure.
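As a small illustration of this convention, the sketch below defines a procedure whose first body expression is a string constant. It assumes Guile's procedure-documentation procedure is the accessor for that string; the name square is chosen only for the example.

```scheme
;; The body has more than one expression and the first is a string
;; constant, so by convention that string documents the procedure.
(define (square x)
  "Return the square of X."
  (* x x))

;; Assumption: procedure-documentation retrieves the string.
(define square-doc (procedure-documentation square))
;; square-doc is "Return the square of X."
```

The string has no effect on the value the procedure returns; (square 3) still evaluates to 9.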
Return #t if obj is a regular macro, a memoizing macro or a syntax transformer.

Return one of the symbols syntax, macro or macro!, depending on whether obj is a syntax transformer, a regular macro, or a memoizing macro, respectively. If obj is not a macro, #f is returned.
copy-tree
recurses down the
contents of both pairs and vectors (since both cons cells and vector
cells may point to arbitrary objects), and stops recursing when it hits
any other object.
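The difference between copying only the top-level list and copying the whole tree can be seen in the following sketch. It assumes a Guile in which copy-tree and list-copy are both available at top level; the variable names are illustrative.

```scheme
;; A list whose second element is a vector containing another list.
(define original (list 1 (vector 2 (list 3 4)) 5))

;; list-copy duplicates only the top-level pairs, so the vector is shared.
(define shallow (list-copy original))

;; copy-tree recurses into pairs and vectors, so nothing is shared.
(define deep (copy-tree original))

;; Mutate the vector reachable from the original list.
(vector-set! (cadr original) 0 'changed)

(define seen-by-shallow (vector-ref (cadr shallow) 0)) ; changed
(define seen-by-deep (vector-ref (cadr deep) 0))       ; 2
```

The shallow copy observes the mutation because it still points at the same vector, while the deep copy holds its own vector.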
(eval exp)
is
equivalent to (eval2 exp *top-level-lookup-closure*)
.
#t
if sym is defined in the top-level environment.
end-of-file
error is
signalled.
%load-hook
is defined, it should be bound to a procedure
that will be called before any code is loaded. See documentation for
%load-hook
later in this section.
#f
. Filenames may have any of the optional extensions in the
%load-extensions
list; %search-load-path
will try each
extension automatically.
primitive-load
is called. If this
procedure is defined, it will be called with the filename argument that
was passed to primitive-load
.
    (define %load-hook (lambda (file)
                         (display "Loading ")
                         (display file)
                         (write-line "....")))
    => undefined
    (load-from-path "foo.scm")
    -| Loading /usr/local/share/guile/site/foo.scm....

%search-load-path tries each of these extensions when looking for a file to load. By default, %load-extensions is bound to the list ("" ".scm").
This chapter describes Guile list functions not found in standard Scheme.
append
(see section `Pairs and Lists' in The Revised^4 Report on Scheme). The cdr field of each list's final
pair is changed to point to the head of the next list, so no consing is
performed. Return a pointer to the mutated list.
reverse
(see section `Pairs and Lists' in The Revised^4 Report on Scheme). The cdr of each cell in lst is
modified to point to the previous list element. Return a pointer to the
head of the reversed list.
Caveat: because the list is modified in place, the tail of the original list now becomes its head, and the head of the original list now becomes the tail. Therefore, the lst symbol to which the head of the original list was bound now points to the tail. To ensure that the head of the modified list is not lost, it is wise to save the return value of reverse!.
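A short sketch of this caveat, using illustrative variable names:

```scheme
;; Build a fresh, mutable list (a quoted constant should not be mutated).
(define lst (list 1 2 3 4))

;; Save the return value: it is the head of the reversed list.
(define reversed (reverse! lst))
;; reversed is (4 3 2 1)

;; lst still names the pair that held 1, which is now the last pair,
;; so rebind it if the whole reversed list is wanted under that name.
(set! lst reversed)
```

After the set!, both names refer to the same four-element list.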
list-cdr-ref and list-tail are identical. It may help to think of list-cdr-ref as accessing the kth cdr of the list, or returning the results of cdring k times down lst.
memq
, memv
and member
:
delq compares elements of lst against item with eq?, delv uses eqv? and delete uses equal?.
These procedures are destructive versions of delq, delv and delete: they modify the pointers in the existing lst rather than creating a new list. Caveat evaluator: Like other destructive list functions, these functions cannot modify the binding of lst, and so cannot be used to delete the first element of lst destructively.
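The usual way around this caveat is to assign the return value back to the variable. The sketch below assumes the destructive procedure is named delq!, following the naming of reverse! above; the variable name is illustrative.

```scheme
;; A fresh, mutable list whose first element is the one to delete.
(define lst (list 'a 'b 'c))

;; delq! cannot change what lst is bound to, so calling it only for
;; effect would leave lst still starting at the pair holding a.
;; Assigning the result back keeps lst correct in every case.
(set! lst (delq! 'a lst))
;; lst is now (b c)
```

The same pattern applies when the deleted element is not first; the assignment is then harmless.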
[FIXME: is there any reason to have the `sloppy' functions available at high level at all? Maybe these docs should be relegated to a "Guile Internals" node or something. -twp]
memq
, memv
and member
(see section `Pairs and Lists' in The Revised^4 Report on Scheme), but do
not perform any type or error checking. Their use is recommended only
in writing Guile internals, not for high-level Scheme programs.
To make it easier to write powerful applications, Guile provides many data structures not found in standard Scheme.
[FIXME: this is pasted in from Tom Lord's original guile.texi and should be reviewed]
A record type is a first class object representing a user-defined data type. A record is an instance of a record type.
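As a rough sketch of how the procedures described in this section fit together, the following defines and uses a two-field record type. It assumes the record interface as Guile provides it, including record-accessor and record-modifier; the names point-type, make-point, point?, point-x and set-point-x! are chosen only for illustration.

```scheme
;; Create a record type named point with fields x and y.
(define point-type (make-record-type 'point '(x y)))

;; Derive a constructor, a predicate, an accessor and a modifier
;; from the record-type descriptor.
(define make-point (record-constructor point-type))
(define point? (record-predicate point-type))
(define point-x (record-accessor point-type 'x))
(define set-point-x! (record-modifier point-type 'x))

;; Make an instance and inspect it.
(define p (make-point 3 4))
(define x-before (point-x p))   ; 3
(set-point-x! p 10)
(define x-after (point-x p))    ; 10
```

Here (point? p) is true, and (record-type-descriptor p) returns a descriptor for which the predicate above accepts p.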
Return #t if obj is a record of any type and #f otherwise.

Note that record? may be true of any Scheme value; there is no promise that records are disjoint with other Scheme types.
make-record-type
that created the type represented by rtd;
if the field-names argument is provided, it is an error if it
contains any duplicates or any symbols not in the default list.
make-record-type
that created the type represented by rtd.
make-record-type
that created the type represented by
rtd.
record-predicate
, the resulting predicate would return a true
value when passed the given record. Note that it is not necessarily the
case that the returned descriptor is the one that was passed to
record-constructor
in the call that created the constructor
procedure that created the given record.
eqv?
to the type-name argument given in
the call to make-record-type
that created the type represented by
rtd.
equal? to the field-names argument given in the call to make-record-type that created the type represented by rtd.

[FIXME: this is pasted in from Tom Lord's original guile.texi and should be reviewed]
A structure type is a first class user-defined data type. A structure is an instance of a structure type. A structure type is itself a structure.
Structures are less abstract and more general than traditional records. In fact, in Guile Scheme, records are implemented using structures.
A structure object consists of a handle, structure data, and a vtable. The handle is a Scheme value which points to both the vtable and the structure's data. Structure data is a dynamically allocated region of memory, private to the structure, divided up into typed fields. A vtable is another structure used to hold type-specific data. Multiple structures can share a common vtable.
Three concepts are key to understanding structures.
When a structure is created, a region of memory is allocated to hold its state. The layout of the structure's type determines how that memory is divided into fields.
Each field has a specified type. There are only three types allowed, each corresponding to a one letter code. The allowed types are:
Each field also has an associated access protection. There are only three kinds of protection, each corresponding to a one letter code. The allowed protections are:
A layout specification is described by stringing together pairs of letters: one to specify a field type and one to specify a field protection. For example, a traditional cons pair type object could be described as:
; cons pairs have two writable fields of Scheme data
"pwpw"
A pair object in which the first field is held constant could be:
"prpw"
Binary fields (fields of type "u") hold one word each. The
size of a word is a machine dependent value defined to be equal to the
value of the C expression: sizeof (long).
The last field of a structure layout may specify a tail array. A tail array is indicated by capitalizing the field's protection code ('W', 'R' or 'O'). A tail-array field is replaced by a read-only binary data field containing an array size. The array size is determined at the time the structure is created. It is followed by a corresponding number of fields of the type specified for the tail array. For example, a conventional Scheme vector can be described as:
; A vector is an arbitrary number of writable fields holding Scheme
; values:
"pW"
In the above example, field 0 contains the size of the vector and fields beginning at 1 contain the vector elements.
A kind of tagged vector (a constant tag followed by conventional vector elements) might be:
"prpW"
Structure layouts are represented by specially interned symbols whose name is a string of type and protection codes. To create a new structure layout, use this procedure:
fields must be a read-only string made up of pairs of characters strung together. The first character of each pair describes a field type, the second a field protection. Allowed types are 'p' for GC-protected Scheme data, 'u' for unprotected binary data, and 's' for fields that should point to the structure itself. Allowed protections are 'w' for mutable fields, 'r' for read-only fields, and 'o' for opaque fields. The last field protection specification may be capitalized to indicate that the field is a tail-array.
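To make the pairing of type and protection codes concrete, here is a small sketch in portable Scheme that splits a layout string into its fields and checks for a tail array. The helpers layout->fields and layout-has-tail-array? are hypothetical and written only for this illustration; they are not part of Guile.

```scheme
;; Hypothetical helper, not part of Guile: split a layout string such as
;; "prpW" into a list of (type . protection) character pairs.
(define (layout->fields layout)
  (let loop ((i 0) (fields '()))
    (if (>= (+ i 1) (string-length layout))
        (reverse fields)
        (loop (+ i 2)
              (cons (cons (string-ref layout i)
                          (string-ref layout (+ i 1)))
                    fields)))))

;; A tail array is marked by an upper-case protection code on the last field.
(define (layout-has-tail-array? layout)
  (let ((fields (layout->fields layout)))
    (and (pair? fields)
         (char-upper-case? (cdr (car (reverse fields)))))))

(layout->fields "prpW")            ; ((#\p . #\r) (#\p . #\W))
(layout-has-tail-array? "prpW")    ; #t
(layout-has-tail-array? "pwpw")    ; #f
```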
This section describes the basic procedures for creating and accessing structures.
type must be a vtable structure (See section Vtables).
tail-elts must be a non-negative integer. If the layout specification indicated by type includes a tail-array, this is the number of elements allocated to that array.
The inits are optional arguments describing how successive fields of the structure should be initialized. Only fields with protection 'r' or 'w' can be initialized -- fields of type 's' are automatically initialized to point to the new structure itself; fields of protection 'o' cannot be initialized by Scheme programs.
If the field is of type 'p', then it can be set to an arbitrary value.
If the field is of type 'u', then it can only be set to a non-negative integer value small enough to fit in one machine word.
Vtables are structures that are used to represent structure types. Each
vtable contains a layout specification in field vtable-index-layout
-- instances of the type are laid out
according to that specification. Vtables contain additional fields
which are used only internally to libguile. The variable
vtable-offset-user is bound to a field number. Vtable fields
at that position or greater are user definable.
If you have a vtable structure, V, you can create an instance of
the type it describes by using (make-struct V ...). But where
does V itself come from? One possibility is that V is an
instance of a user-defined vtable type, V', so that V is
created by using (make-struct V' ...). Another possibility is
that V is an instance of the type it itself describes. Vtable
structures of the second sort are created by this procedure:
new-fields is a layout specification describing fields
of the resulting structure beginning at the position bound to
vtable-offset-user.
tail-size specifies the size of the tail-array (if any) of this vtable.
inits initializes the fields of the vtable. Minimally, one initializer must be provided: the layout specification for instances of the type this vtable will describe. If a second initializer is provided, it will be interpreted as a print call-back function.
;;; loading ,a...
(define x (make-vtable-vtable (make-struct-layout (quote pw)) 0 'foo))
(struct? x) => #t
(struct-vtable? x) => #t
(eq? x (struct-vtable x)) => #t
(struct-ref x vtable-offset-user) => foo
(struct-ref x 0) => pruosrpwpw

(define y (make-struct x 0 (make-struct-layout (quote pwpwpw)) 'bar))
(struct? y) => #t
(struct-vtable? y) => #t
(eq? x y) => ()
(eq? x (struct-vtable y)) => #t
(struct-ref y 0) => pwpwpw
(struct-ref y vtable-offset-user) => bar

(define z (make-struct y 0 'a 'b 'c))
(struct? z) => #t
(struct-vtable? z) => ()
(eq? y (struct-vtable z)) => #t
(map (lambda (n) (struct-ref z n)) '(0 1 2)) => (a b c)
[FIXME: this is pasted in from Tom Lord's original guile.texi and should be reviewed]
Arrays read and write as a # followed by the rank
(number of dimensions) followed by what appear as lists (of lists) of
elements. The lists must be nested to the depth of the rank. For each
depth, all lists must be the same length.
(make-array 'ho 3 3) => #2((ho ho ho) (ho ho ho) (ho ho ho))
Unshared conventional (not uniform) 0-based arrays of rank 1 (dimension) are equivalent to (and can't be distinguished from) vectors.
(make-array 'ho 3) => #(ho ho ho)
When constructing an array, bound is either an inclusive range of indices expressed as a two element list, or an upper bound expressed as a single integer. So
(make-array 'foo 3 3) == (make-array 'foo '(0 2) '(0 2))
#t if the obj is an array, and #f if not.
(index1, index2) element in array.
#t if its arguments would be acceptable to array-ref.
(index1, index2) element in array to
new-value. The value returned by array-set! is unspecified.
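The following sketch shows the basic array procedures working together on a small two-dimensional array. The variable name grid and the element values are made up; array-in-bounds? is taken here to be the bounds-checking predicate described above.

```scheme
;; A 3x3 conventional array, every element initially the symbol empty.
(define grid (make-array 'empty 3 3))

(array? grid)                  ; #t
(array-ref grid 1 2)           ; empty

;; array-set! takes the new value first, then the indices.
(array-set! grid 'full 1 2)
(array-ref grid 1 2)           ; full

;; Indices run from 0 to 2 in each dimension.
(array-in-bounds? grid 2 2)    ; #t
(array-in-bounds? grid 3 0)    ; #f
```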
make-shared-array can be used to create shared subarrays of other
arrays. The mapper is a function that translates coordinates in
the new array into coordinates in the old array. A mapper must be
linear, and its range must stay within the bounds of the old array, but
it can be otherwise arbitrary. A simple example:
(define fred (make-array #f 8 8))
(define freds-diagonal
  (make-shared-array fred (lambda (i) (list i i)) 8))
(array-set! freds-diagonal 'foo 3)
(array-ref fred 3 3) => foo
(define freds-center
  (make-shared-array fred (lambda (i j) (list (+ 3 i) (+ 3 j))) 2 2))
(array-ref freds-center 0 0) => foo
The values of dim0, dim1, ... correspond to dimensions in the array to be returned, their positions in the argument list to dimensions of array. Several dims may have the same value, in which case the returned array will have smaller rank than array.
examples:
(transpose-array '#2((a b) (c d)) 1 0) => #2((a c) (b d))
(transpose-array '#2((a b) (c d)) 0 0) => #1(a d)
(transpose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1 1 0)
=> #2((a 4) (b 5) (c 6))
An enclosed array is not a general Scheme array. Its elements may not
be set using array-set!. Two references to the same element of
an enclosed array will be equal? but will not in general be
eq?. The value returned by array-prototype when given an
enclosed array is unspecified.
examples:
(enclose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1)
=> #<enclosed-array (#1(a d) #1(b e) #1(c f)) (#1(1 4) #1(2 5) #1(3 6))>

(enclose-array '#3(((a b c) (d e f)) ((1 2 3) (4 5 6))) 1 0)
=> #<enclosed-array #2((a 1) (d 4)) #2((b 2) (e 5)) #2((c 3) (f 6))>
(array-shape (make-array 'foo '(-1 3) 5)) => ((-1 3) (0 4))
Array-dimensions is similar to array-shape but replaces
elements with a 0 minimum with one greater than the maximum. So:
(array-dimensions (make-array 'foo '(-1 3) 5)) => ((-1 3) 5)
0 is returned.
array-copy! but guaranteed to copy in row-major order.
#t iff all arguments are arrays with the same shape, the
same type, and have corresponding elements which are either
equal? or array-equal?. This function differs from
equal? in that a one dimensional shared array may be
array-equal? but not equal? to a vector or uniform vector.
array-contents returns that shared array, otherwise it returns
#f. All arrays made by make-array and
make-uniform-array may be unrolled; some arrays made by
make-shared-array may not be.
If the optional argument strict is provided, a shared array will be returned only if its elements are stored internally contiguous in memory.
One can implement array-indexes as
(define (array-indexes array)
  (let ((ra (apply make-array #f (array-shape array))))
    (array-index-map! ra (lambda x x))
    ra))
Another example:
(define (apl:index-generator n)
  (let ((v (make-uniform-vector n 1)))
    (array-index-map! v (lambda (i) i))
    v))
Uniform arrays and vectors are arrays whose elements are all of the same type. Uniform vectors occupy less storage than conventional vectors. Uniform array procedures also work on vectors, uniform-vectors, bit-vectors, and strings.
prototype arguments in the following procedures are interpreted according to the table:
prototype     type                              printing character

#t            boolean (bit-vector)              b
#\a           char (string)                     a
integer >0    unsigned integer                  u
integer <0    signed integer                    e
1.0           float (single precision)          s
1/3           double (double precision float)   i
+i            complex (double precision)        c
()            conventional vector
Unshared uniform character 0-based arrays of rank 1 (dimension) are equivalent to (and can't be distinguished from) strings.
(make-uniform-array #\a 3) => "$q2"
Unshared uniform boolean 0-based arrays of rank 1 (dimension) are equivalent to (and can't be distinguished from) bit vectors (see section Bit Vectors).
(make-uniform-array #t 3) => #*000
==
#b(#f #f #f) => #*000
==
#1b(#f #f #f) => #*000
Other uniform vectors are written in a form similar to that of vectors,
except that a single character from the above table is put between
# and (. For example, '#e(3 5 9) returns a uniform
vector of signed integers.
#t if the obj is an array of type corresponding to
prototype, and #f if not.
make-uniform-array.
The optional arguments start and end allow a specified region of a vector (or linearized array) to be read, leaving the remainder of the vector unchanged.
uniform-array-read! returns the number of objects read.
port-or-fdes may be omitted, in which case it defaults to the value
returned by (current-input-port).
The optional arguments start and end allow a specified region of a vector (or linearized array) to be written.
The number of objects actually written is returned. port-or-fdes may be
omitted, in which case it defaults to the value returned by
(current-output-port).
Bit vectors can be written and read as a sequence of 0s and
1s prefixed by #*.
#b(#f #f #f #t #f #t #f) => #*0001010
Some of these operations will eventually be generalized to other uniform-arrays.
#f is returned.
#t, uve is OR'ed into bv; if bool is #f, the
inversion of uve is AND'ed into bv.
If uve is an unsigned integer vector, all the elements of uve must be
between 0 and the LENGTH of bv. The bits of bv
corresponding to the indexes in uve are set to bool.
The return value is unspecified.
(bit-count (bit-set*! (if bool bv (bit-invert! bv)) uve #t) #t).
bv is not modified.
This chapter discusses dictionary objects: data structures that are useful for organizing and indexing large bodies of information.
A dictionary object is a data structure used to index
information in a user-defined way. In standard Scheme, the main
aggregate data types are lists and vectors. Lists are not really
indexed at all, and vectors are indexed only by number
(e.g. (vector-ref foo 5)). Often you will find it useful
to index your data on some other type; for example, in a library
catalog you might want to look up a book by the name of its
author. Dictionaries are used to help you organize information in
such a way.
An association list (or alist for short) is a list of
key-value pairs. Each pair represents a single quantity or
object; the car of the pair is a key which is used to
identify the object, and the cdr is the object's value.
A hash table also permits you to index objects with arbitrary keys, but in a way that makes looking up any one object extremely fast. A well-designed hash system makes hash table lookups almost as fast as conventional array or vector references.
Alists are popular among Lisp programmers because they use only the language's primitive operations (lists, car, cdr and the equality primitives). No changes to the language core are necessary. Therefore, with Scheme's built-in list manipulation facilities, it is very convenient to handle data stored in an association list. Also, alists are highly portable and can be easily implemented on even the most minimal Lisp systems.
However, alists are inefficient, especially for storing large quantities of data. Because we want Guile to be useful for large software systems as well as small ones, Guile provides a rich set of tools for using either association lists or hash tables.
assq compares keys with eq?, assv uses eqv?
and assoc uses equal?. If key
cannot be found in alist (according to whichever equality
predicate is in use), then #f is returned. These functions
return the entire alist entry found (i.e. both the key and the value).
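The choice of lookup function follows from the kind of key. The short sketch below pairs each function with keys that its equality predicate handles reliably; the alists sizes, ratios and towns are invented for the illustration.

```scheme
;; Symbols are compared reliably by eq?, so assq is enough.
(define sizes '((small . 1) (medium . 2) (large . 3)))
(assq 'medium sizes)        ; (medium . 2)
(assq 'huge sizes)          ; #f, no such key

;; Numbers should be compared with eqv?, so use assv.
(define ratios '((0.5 . half) (0.25 . quarter)))
(assv 0.25 ratios)          ; (0.25 . quarter)

;; Strings with the same contents are equal? but not necessarily eq?,
;; so string keys call for assoc.
(define towns '(("Oregon" . "Salem") ("Ohio" . "Columbus")))
(assoc "Ohio" towns)        ; ("Ohio" . "Columbus")
```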
assq, assv and assoc, except that only the
value associated with key in alist is returned. These
functions are equivalent to
(let ((ent (associator key alist))) (and ent (cdr ent)))
where associator is one of assq, assv or assoc.
These functions do not attempt to verify the structure of alist, and so may cause unusual results if passed an object that is not an association list.
Caution: it is important to remember that the set! and
remove! functions do not always operate as intended. In some
circumstances, the functions will try to modify the first element in the
list; for example, when adding a new entry to an alist,
assoc-set! conses the new key-value pair on to the beginning of
the alist. However, when this happens, the symbol to which the alist is
bound has not been modified--it still points to the old "beginning"
of the list, which still does not contain the new entry. In order to be
sure that these functions always succeed, even when modifying the
beginning of the alist, you will have to rebind the alist symbol
explicitly to point to the value returned by assoc-set!, like so:
(set! my-alist (assq-set! my-alist 'sun4 "sparc-sun-solaris"))
Because of this restriction, you may find it more convenient to use hash tables to store dictionary data. If your application will not be modifying the contents of an alist very often, this may not make much difference to you.
Here is a longer example of how alists may be used in practice.
(define capitals '(("New York" . "Albany")
                   ("Oregon" . "Salem")
                   ("Florida" . "Miami")))

;; What's the capital of Oregon?
(assoc "Oregon" capitals) => ("Oregon" . "Salem")
(assoc-ref capitals "Oregon") => "Salem"

;; We left out South Dakota.
(set! capitals
      (assoc-set! capitals "South Dakota" "Pierre"))
capitals
=> (("South Dakota" . "Pierre")
    ("New York" . "Albany")
    ("Oregon" . "Salem")
    ("Florida" . "Miami"))

;; And we got Florida wrong.
(set! capitals
      (assoc-set! capitals "Florida" "Tallahassee"))
capitals
=> (("South Dakota" . "Pierre")
    ("New York" . "Albany")
    ("Oregon" . "Salem")
    ("Florida" . "Tallahassee"))

;; After Oregon secedes, we can remove it.
(set! capitals
      (assoc-remove! capitals "Oregon"))
capitals
=> (("South Dakota" . "Pierre")
    ("New York" . "Albany")
    ("Florida" . "Tallahassee"))
Like the association list functions, the hash table functions come
in several varieties: hashq, hashv, and hash.
The hashq functions use eq? to determine whether two
keys match. The hashv functions use eqv?, and the
hash functions use equal?.
In each of the functions that follow, the table argument must be a vector. The key and value arguments may be any Scheme object.
#f if no default argument is supplied).
The standard hash table functions may be too limited for some applications. For example, you may want a hash table to store strings in a case-insensitive manner, so that references to keys named "foobar", "FOOBAR" and "FooBaR" will all yield the same item. Guile provides you with extended hash tables that permit you to specify a hash function and associator function of your choosing. The functions described in the rest of this section can be used to implement such custom hash table structures.
If you are unfamiliar with the inner workings of hash tables, then this facility will probably be a little too abstract for you to use comfortably. If you are interested in learning more, see an introductory textbook on data structures or algorithms for an explanation of how hash tables are implemented.
ref and set! functions described above, but use hasher as a
hash function and assoc to compare keys. hasher must
be a function that takes two arguments, a key to be hashed and a
table size. assoc must be an associator function, like
assoc, assq or assv.
By way of illustration, hashq-ref table key is equivalent
to hashx-ref hashq assq table key.
-ref cousins, but return a
handle from the hash table rather than the value associated with
key. By convention, a handle in a hash table is the pair which
associates a key with a value. Where hashq-ref table key returns
only a value, hashq-get-handle table key returns the pair
(key . value).
[FIXME: this is pasted in from Tom Lord's original guile.texi and should be reviewed]
For the sake of efficiency, two special kinds of strings are available in Guile: shared substrings and the misleadingly named "read-only" strings. It is not necessary to know about these to program in Guile, but you are likely to run into one or both of these special string types eventually, and it will be helpful to know how they work.
index or strchr functions from the C library.
string-index, but search from the right of the string rather
than from the left. This procedure essentially implements the
rindex or strrchr functions from the C library.
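A brief sketch of searching from either end of a string follows. It assumes the right-to-left procedure is named string-rindex, and the sample string path is made up.

```scheme
(define path "/usr/local/bin")

;; Search from the left: the first slash is at index 0.
(string-index path #\/)     ; 0

;; Search from the right: the last slash is at index 10.
(string-rindex path #\/)    ; 10

;; A character that does not occur yields #f.
(string-index path #\!)     ; #f
```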
substring-move-right! begins copying from the rightmost character
and moves left, and substring-move-left! copies from the leftmost
character moving right.
It is useful to have two functions that copy in different directions so
that substrings can be copied back and forth within a single string. If
you wish to copy text from the left-hand side of a string to the
right-hand side of the same string, and the source and destination
overlap, you must be careful to copy the rightmost characters of the
text first, to avoid clobbering your data. Hence, when str1 and
str2 are the same string, you should use substring-move-right!
when moving text from left to right, and substring-move-left!
otherwise. If str1 and str2
are different strings, it does not matter which function you use.
substring-move-right! and substring-move-left!
#t if str's length is nonzero, and #f otherwise.
str, respectively.
Whenever you extract a substring using substring, the Scheme
interpreter allocates a new string and copies data from the old string.
This is expensive, but substring is so convenient for
manipulating text that programmers use it often.
Guile Scheme provides the concept of the shared substring to improve performance of many substring-related operations. A shared substring is an object that mostly behaves just like an ordinary substring, except that it actually shares storage space with its parent string.
substring function: the shared substring returned
includes all of the text from str between indexes start
(inclusive) and end (exclusive). If end is omitted, it
defaults to the end of str. The shared substring returned by
make-shared-substring occupies the same storage space as str.
Example:
(define foo "the quick brown fox")
(define bar (make-shared-substring foo 4 9))

foo => "t h e   q u i c k   b r o w n   f o x"
bar =========> |---------|
The shared substring bar is not given its own storage space.
Instead, the Guile interpreter notes internally that bar points to
a portion of the memory allocated to foo. However, bar
behaves like an ordinary string in most respects: it may be used with
string primitives like string-length, string-ref,
string=?. Guile makes the necessary translation between indices
of bar and indices of foo automatically.
(string-length bar) => 5 ; bar only extends from indices 4 to 9
(string-ref bar 3) => #\c ; same as (string-ref foo 7)
(make-shared-substring bar 2)
=> "ick" ; can even make a shared substring!
Because creating a shared substring does not require allocating new storage from the heap, it is a very fast operation. However, because it shares memory with its parent string, a change to the contents of the parent string will implicitly change the contents of its shared substrings.
(string-set! foo 7 #\r)
bar => "quirk"
Guile considers shared substrings to be immutable. This is because programmers might not always be aware that a given string is really a shared substring, and might innocently try to mutate it without realizing that the change would affect its parent string. (We are currently considering a "copy-on-write" strategy that would permit modifying shared substrings without affecting the parent string.)
In general, shared substrings are useful in circumstances where it is important to divide a string into smaller portions, but you do not expect to change the contents of any of the strings involved.
Type-checking in Guile primitives distinguishes between mutable strings
and read-only strings. Mutable strings answer #t to
string? while read-only strings may or may not. All kinds of
strings, whether or not they are mutable, return #t to this:
This illustrates the difference between string? and
read-only-string?:
(string? "a string") => #t
(string? 'a-symbol) => #f
(read-only-string? "a string") => #t
(read-only-string? 'a-symbol) => #t
"Read-only" refers to how the string will be used, not how the string is permitted to be used. In particular, all strings are "read-only strings" even if they are mutable, because a function that only reads from a string can certainly operate on even a mutable string.
Symbols are an example of read-only strings. Many string functions,
such as string-append, are happy to operate on symbols. Many
functions that expect a string argument, such as open-file, will
accept a symbol as well.
Shared substrings, discussed in the previous chapter, also happen to be read-only strings.
Most of the characters in the ASCII character set may be referred to by
name: for example, #\tab, #\esc, #\stx, and so on.
The following table describes the ASCII names for each character.
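  0 = #\nul    1 = #\soh    2 = #\stx    3 = #\etx
  4 = #\eot    5 = #\enq    6 = #\ack    7 = #\bel
  8 = #\bs     9 = #\ht    10 = #\nl    11 = #\vt
 12 = #\np    13 = #\cr    14 = #\so    15 = #\si
 16 = #\dle   17 = #\dc1   18 = #\dc2   19 = #\dc3
 20 = #\dc4   21 = #\nak   22 = #\syn   23 = #\etb
 24 = #\can   25 = #\em    26 = #\sub   27 = #\esc
 28 = #\fs    29 = #\gs    30 = #\rs    31 = #\us
 32 = #\sp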
The delete character (octal 177) may be referred to with the name
#\del.
Several characters have more than one name:
Every object in the system can have a property list that may be used for information about that object. For example, a function may have a property list that includes information about the source file in which it is defined.
Property lists are implemented as assq lists (see section Association Lists).
Currently, property lists are implemented differently for procedures and
closures than for other kinds of objects. Therefore, when manipulating
a property list associated with a procedure object, use the
procedure functions; otherwise, use the object functions.
[Interface bug: there should be a second level of interface in which the user provides a "property table" that is possibly private.]
Input and output devices in Scheme are represented by ports. All input and output in Scheme programs is accomplished by operating on a port: characters are read from an input port and written to an output port. This chapter explains the operations that Guile provides for working with ports.
The formal definition of a port is very generic: an input port is simply "an object which can deliver characters on command," and an output port is "an object which can accept characters." Because this definition is so loose, it is easy to write functions that simulate ports in software. Soft ports and string ports are two interesting and powerful examples of this technique.
The following procedures are used to open file ports.
See also section Ports and File Descriptors, for an interface
to the Unix open system call.
See the stdio documentation for your system for more
I/O mode options.
If a file cannot be opened, open-file throws an exception.
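A minimal sketch of a write-then-read round trip with open-file is given below. The file name is a made-up scratch path, and close-port is assumed to be the port-closing procedure; only the "w" and "r" modes are used.

```scheme
;; Hypothetical scratch file used only for this example.
(define example-file "/tmp/guile-port-example.txt")

;; Open for writing, emit one line, and close the port.
(define out (open-file example-file "w"))
(display "hello, port" out)
(newline out)
(close-port out)

;; Reopen the same file for reading and fetch the first character.
(define in (open-file example-file "r"))
(define first-char (read-char in))
(close-port in)

first-char    ; #\h
```

Closing the output port before reopening the file matters, since output may not reach the file until the port is closed.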
(open-file filename "r")
(open-file filename "w")
port-filename and reported in diagnostic output.
open-file in section File Ports.
A soft-port is a port based on a vector of procedures capable of accepting or delivering characters. It allows emulation of I/O ports.
For an output-only port only elements 0, 1, 2, and 4 need be
procedures. For an input-only port only elements 3 and 4 need be
procedures. Thunks 2 and 4 can instead be #f if there is no useful
operation for them to perform.
If thunk 3 returns #f or an eof-object
(see section `Input' in The Revised^4 Report on Scheme) it indicates that
the port has reached end-of-file. For example:
(define stdout (current-output-port))
(define p (make-soft-port
           (vector
            (lambda (c) (write c stdout))
            (lambda (s) (display s stdout))
            (lambda () (display "." stdout))
            (lambda () (char-upcase (read-char)))
            (lambda () (display "@" stdout)))
           "rw"))
(write p p) => #<input-output-soft#\space45d10#\>
The following procedures return #t if they successfully
close a port or #f if it was already
closed. They can also raise exceptions if an error occurs: some
errors arising from writing output may be delayed until close.
See also section Ports and File Descriptors, for a procedure
which can close file descriptors.
These procedures obtain and modify information about ports, but are not specific to one kind of port.
current-input-port, current-output-port and current-error-port,
respectively, so that they use the supplied port for input or output.
Extended I/O procedures are available which read or write lines of text, read text delimited by a specified set of characters, or report or set the current position of a port.
Interfaces to read/fread and write/fwrite are
also available, as uniform-array-read! and uniform-array-write!
(see section Uniform Array).
(current-input-port). Under Unix, a line of text
is terminated by the first end-of-line character or by end-of-file.
If handle-delim is specified, it should be one of the following symbols:
trim
concat
peek
split
#f is returned.
Read from port if
specified, otherwise from the value returned by (current-input-port).
(current-input-port).
handle-delim takes the same values as described for read-line.
NOTE: if the scsh module is loaded then delims must be an scsh char-set, not a string.
read-line. If buf is filled,
#f is returned for both the number of characters read and the
delimiter. Also terminates if one of the characters in the string
delims is found
or end-of-file is reached. Read from port if supplied, otherwise
from the value returned by (current-input-port).
NOTE: if the scsh module is loaded then delims must be an scsh char-set, not a string.
(current-output-port) is used. This function
is equivalent to:
(display obj [port])
(newline [port])
lseek.
One of the following variables should be supplied for whence:
If fd/port is a file descriptor the underlying system call is
lseek.
The return value is unspecified.
Some of the abovementioned I/O functions rely on the following C primitives. These will mainly be of interest to people hacking Guile internals.
(current-input-port). If start or end are specified,
store data only into the substring of buf bounded by start
and end (which default to the beginning and end of the buffer,
respectively).
Return a pair consisting of the delimiter that terminated the string and the number of characters read. If reading stopped at the end of file, the delimiter returned is the eof-object; if the buffer was filled without encountering a delimiter, this value is #f.
%read-line is called at the end of file, it returns the pair
(#<eof> . #<eof>).
Example:
(number->string (logand #b1100 #b1010) 2) => "1000"
Example:
(number->string (logior #b1100 #b1010) 2) => "1110"
Example:
(number->string (logxor #b1100 #b1010) 2) => "110"
Example:
(number->string (lognot #b10000000) 2) => "-10000001"
(number->string (lognot #b0) 2) => "-1"
(logtest j k) == (not (zero? (logand j k)))

(logtest #b0100 #b1011) => #f
(logtest #b0100 #b0111) => #t
(logbit? index j) == (logtest (integer-expt 2 index) j)

(logbit? 0 #b1101) => #t
(logbit? 1 #b1101) => #f
(logbit? 2 #b1101) => #t
(logbit? 3 #b1101) => #t
(logbit? 4 #b1101) => #f
(inexact->exact (floor (* int (expt 2 count)))).
Example:
(number->string (ash #b1 3) 2) => "1000"
(number->string (ash #b1010 -1) 2) => "101"
Example:
(logcount #b10101010) => 4
(logcount 0) => 0
(logcount -2) => 1
Example:
(integer-length #b10101010) => 8
(integer-length 0) => 0
(integer-length #b1111) => 4
Example:
(integer-expt 2 5) => 32
(integer-expt -3 3) => -27
Example:
(number->string (bit-extract #b1101101010 0 4) 2) => "1010"
(number->string (bit-extract #b1101101010 4 9) 2) => "10110"
A regular expression (or regexp) is a pattern that describes a whole class of strings. A full description of regular expressions and their syntax is beyond the scope of this manual; an introduction can be found in the Emacs manual (see section `Syntax of Regular Expressions' in The GNU Emacs Manual), or in many general Unix reference books.
If your system does not include a POSIX regular expression library, and
you have not linked Guile with a third-party regexp library such as Rx,
these functions will not be available. You can tell whether your Guile
installation includes regular expression support by checking whether the
*features* list includes the regex symbol.
[FIXME: it may be useful to include an Examples section. Parts of this interface are bewildering on first glance.]
By default, Guile supports POSIX extended regular expressions. That means that the characters `(', `)', `+' and `?' are special, and must be escaped if you wish to match the literal characters.
This regular expression interface was modeled after that implemented by SCSH, the Scheme Shell. It is intended to be upwardly compatible with SCSH regular expressions.
string-match returns a match structure which
describes what, if anything, was matched by the regular
expression. See section Match Structures. If str does not match
pattern at all, string-match returns #f.
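As a small sketch of string-match in use, the example below searches a string for a run of digits and inspects the resulting match structure. It assumes the regexp procedures are made available by loading the (ice-9 regex) module, as in current Guile releases, and it uses the match accessors match:substring, match:start and match:end; the sample strings are made up.

```scheme
;; Assumption: the regexp interface is provided by this module.
(use-modules (ice-9 regex))

;; Find the first run of digits in the string.
(define m (string-match "[0-9]+" "blah 2002 blah"))

(match:substring m)     ; "2002"
(match:start m)         ; 5
(match:end m)           ; 9

;; When nothing matches, string-match returns #f rather than a
;; match structure, so the result should be tested before use.
(string-match "[0-9]+" "no digits here")    ; #f
```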
Each time string-match is called, it must compile its
pattern argument into a regular expression structure. This
operation is expensive, which makes string-match inefficient if
the same regular expression is used several times (for example, in a
loop). For better performance, you can compile a regular expression in
advance and then match strings against the compiled regexp.
If pattern does not describe a legal regular expression, make-regexp throws a regular-expression-syntax error.
The flag arguments change the behavior of the compiled regexp. The following flags may be supplied:
regexp/icase
regexp/newline
regexp/basic
regexp/extended
If a call to make-regexp includes both the regexp/basic and regexp/extended flags, the one which comes last will override the earlier one.
Match the compiled regular expression against str. If the optional integer start argument is provided, begin matching from that position in the string. Return a match structure describing the results of the match, or #f if no match could be found.
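The compile-once pattern recommended earlier might look like the sketch below. The helper count-numeric and the sample data are invented for illustration; only make-regexp and regexp-exec come from the interface described here.

```scheme
;; Compile the pattern a single time, outside the loop.
(define digits (make-regexp "^[0-9]+$"))

;; Hypothetical helper: count the strings in LST made up only of digits.
(define (count-numeric lst)
  (let loop ((rest lst) (n 0))
    (cond ((null? rest) n)
          ((regexp-exec digits (car rest)) (loop (cdr rest) (+ n 1)))
          (else (loop (cdr rest) n)))))

(display (count-numeric '("12" "abc" "007" "4x")))   ; prints 2
(newline)

;; With a start offset of 3, matching begins after "abc", so no
;; lowercase letters remain and the result is #f.
(define from-offset (regexp-exec (make-regexp "[a-z]+") "abc123" 3))
(display from-offset)
(newline)
```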
Return #t if obj is a compiled regular expression, or #f otherwise.
Regular expressions are commonly used to find patterns in one string and replace them with the contents of another string.
port may be #f, in which case nothing is written; instead, regexp-substitute constructs a string from the specified items and returns that.
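As a small sketch of the #f port case: the example below assumes the item symbols 'pre and 'post, which in Guile's interface stand for the text before and after the match, and assumes string-match is loaded from (ice-9 regex). The input string is arbitrary.

```scheme
;; Assumption: recent Guile releases keep these procedures in (ice-9 regex).
(use-modules (ice-9 regex))

(define m (string-match "[0-9]+" "build 41 passed"))

;; 'pre is the text before the match and 'post the text after it;
;; the literal string "42" takes the place of the matched digits.
(define result (regexp-substitute #f m 'pre "42" 'post))

(display result)   ; prints: build 42 passed
(newline)
```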
Similar to regexp-substitute, but can be used to perform global substitutions on str. Instead of taking a match structure as an argument, regexp-substitute/global takes two string arguments: a regexp string describing a regular expression, and a target string which should be matched against this regular expression.
Each item behaves as in regexp-substitute, with the following exceptions:
The post symbol causes regexp-substitute/global to recurse on the unmatched portion of str. This must be supplied in order to perform global search-and-replace on str; if it is not present among the items, then regexp-substitute/global will return after processing a single match.
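A minimal sketch of a global replacement follows, again assuming the procedures are loaded from (ice-9 regex). Because 'post appears among the items, the substitution recurses over the rest of the string.

```scheme
;; Assumption: recent Guile releases keep this procedure in (ice-9 regex).
(use-modules (ice-9 regex))

;; Replace every run of spaces or tabs with a single hyphen.
(define squeezed
  (regexp-substitute/global #f "[ \t]+" "this   is   the text"
                            'pre "-" 'post))

(display squeezed)   ; prints: this-is-the-text
(newline)
```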
A match structure is the object returned by string-match and regexp-exec. It describes which portion of a string, if any, matched the given regular expression. Match structures include: a reference to the string that was checked for matches; the starting and ending positions of the regexp match; and, if the regexp included any parenthesized subexpressions, the starting and ending positions of each submatch.
In each of the regexp match functions described below, the match argument must be a match structure returned by a previous call to string-match or regexp-exec. Most of these functions return some information about the original target string that was matched against a regular expression; we will call that string target for easy reference.
Return #t if obj is a match structure returned by a previous call to regexp-exec, or #f otherwise.
#f
.
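The sketch below pulls the pieces of a match structure apart. It assumes the accessor names match:substring, match:start, match:end, match:prefix and match:suffix from Guile's (ice-9 regex) module; the target string is made up.

```scheme
;; Assumption: recent Guile releases keep these procedures in (ice-9 regex).
(use-modules (ice-9 regex))

(define m (string-match "([a-z]+)@([a-z.]+)" "mail bob@example.org today"))

(display (match:substring m))    (newline)   ; bob@example.org
(display (match:substring m 1))  (newline)   ; bob
(display (match:substring m 2))  (newline)   ; example.org
(display (match:start m))        (newline)   ; 5
(display (match:end m))          (newline)   ; 20
(write (match:prefix m))         (newline)   ; "mail "
(write (match:suffix m))         (newline)   ; " today"
```

Submatch 0 is the whole match; higher numbers follow the order of the opening parentheses in the pattern.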
Sometimes you will want a regexp to match characters like `*' or `$' exactly. For example, to check whether a particular string represents a menu entry from an Info node, it would be useful to match it against a regexp like `^* [^:]*::'. However, this won't work; because the asterisk is a metacharacter, it won't match the `*' at the beginning of the string. In this case, we want to make the first asterisk un-magic.
You can do this by preceding the metacharacter with a backslash character `\'. (This is also called quoting the metacharacter, and is known as a backslash escape.) When Guile sees a backslash in a regular expression, it considers the following glyph to be an ordinary character, no matter what special meaning it would ordinarily have. Therefore, we can make the above example work by changing the regexp to `^\* [^:]*::'. The `\*' sequence tells the regular expression engine to match only a single asterisk in the target string.
Since the backslash is itself a metacharacter, you may force a regexp to match a backslash in the target string by preceding the backslash with itself. For example, to find variable references in a TeX program, you might want to find occurrences of the string `\let\' followed by any number of alphabetic characters. The regular expression `\\let\\[A-Za-z]*' would do this: the double backslashes in the regexp each match a single backslash in the target string.
Very important: Using backslash escapes in Guile source code (as in Emacs Lisp or C) can be tricky, because the backslash character has special meaning for the Guile reader. For example, if Guile encounters the character sequence `\n' in the middle of a string while processing Scheme code, it replaces those characters with a newline character. Similarly, the character sequence `\t' is replaced by a horizontal tab. Several of these escape sequences are processed by the Guile reader before your code is executed. Unrecognized escape sequences are ignored: if the characters `\*' appear in a string, they will be translated to the single character `*'.
This translation is obviously undesirable for regular expressions, since we want to be able to include backslashes in a string in order to escape regexp metacharacters. Therefore, to make sure that a backslash is preserved in a string in your Guile program, you must use two consecutive backslashes:
(define Info-menu-entry-pattern (make-regexp "^\\* [^:]*"))
The string in this example is preprocessed by the Guile reader before any code is executed. The resulting argument to make-regexp is the string `^\* [^:]*', which is what we really want.
This also means that in order to write a regular expression that matches a single backslash character, the regular expression string in the source code must include four backslashes. Each consecutive pair of backslashes gets translated by the Guile reader to a single backslash, and the resulting double-backslash is interpreted by the regexp engine as matching a single backslash character. Hence:
(define tex-variable-pattern (make-regexp "\\\\let\\\\[A-Za-z]*"))
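The two layers of escaping can be checked directly. The sketch below uses the Info menu pattern from the text; the variable names and test strings are made up for illustration.

```scheme
;; After the reader has processed the literal, the string holds a single
;; backslash before the asterisk, so it is 11 characters long.
(define pattern-string "^\\* [^:]*::")
(display (string-length pattern-string))   ; prints 11
(newline)

(define menu-entry (make-regexp pattern-string))

;; The escaped asterisk matches a literal `*' at the start of the line.
(define hit (regexp-exec menu-entry "* Regexp Functions::"))
(define miss (regexp-exec menu-entry "Regexp Functions::"))
(display (if hit "menu entry" "not a menu entry"))
(newline)
(display (if miss "menu entry" "not a menu entry"))
(newline)

;; Four backslashes in source code yield a two-character string,
;; which the regexp engine reads as one literal backslash.
(display (string-length "\\\\"))   ; prints 2
(newline)
```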
The reason for the unwieldiness of this syntax is historical. Both regular expression pattern matchers and Unix string processing systems have traditionally used backslashes with the special meanings described above. The POSIX regular expression specification and ANSI C standard both require these semantics. Attempting to abandon either convention would cause other kinds of compatibility problems, possibly more severe ones. Therefore, without extending the Scheme reader to support strings with different quoting conventions (an ungainly and confusing extension when implemented in other languages), we must adhere to this cumbersome escape syntax.
[FIXME: this is taken from Gary and Mark's quick summaries and should be reviewed and expanded. Rx is pretty stable, so could already be done!]
Guile includes an interface to Tom Lord's Rx library (currently only to POSIX regular expressions). Use of the library requires a two-step process: compile a regular expression into an efficient structure, then use the structure in any number of string comparisons.
For example, given the regular expression `abc.' (which matches any string containing `abc' followed by any single character):
guile> (define r (regcomp "abc."))
guile> r
#<rgx abc.>
guile> (regexec r "abc")
#f
guile> (regexec r "abcd")
#((0 . 4))
guile>
The definitions of regcomp and regexec are as follows:
.
or [^...]
from matching newlines.
The logior procedure can be used to combine multiple flags.
The default is to use POSIX basic syntax, which makes + and ? literals and \+ and \? operators. Backslashes in pattern must be escaped if specified in a literal string, e.g., "\\(a\\)\\?".
Match string against the compiled POSIX regular expression regex. match-pick and flags are optional. Possible flags (which can be combined using the logior procedure) are:
If no match is possible, regexec returns #f. Otherwise match-pick determines the return value:
#t or unspecified: a newly-allocated vector is returned, containing pairs with the indices of the matched part of string and any substrings.
"": a list is returned: the first element contains a nested list with the matched part of string surrounded by the unmatched parts. Remaining elements are matched substrings (if any). All returned substrings share memory with string.
#f: regexec returns #t if a match is made, otherwise #f.
vector: the supplied vector is returned, with the first element replaced by a pair containing the indices of the matched portion of string and further elements replaced by pairs containing the indices of matched substrings (if any).
list: a list will be returned, with each member of the list specified by a code in the corresponding position of the supplied list:
a number: the numbered matching substring (0 for the entire match).
#\<: the beginning of string to the beginning of the part matched by regex.
#\>: the end of the matched part of string to the end of string.
#\c: the "final tag", which seems to be associated with the "cut operator", which doesn't seem to be available through the posix interface.
e.g., (list #\< 0 1 #\>). The returned substrings share memory with string.
Here are some other procedures that might be used when using regular expressions:
[FIXME: need some initial schmooze about keywords; this is all taken from the NEWS file, and is accurate but not very useful to someone who has not used keywords before.]
Guile supports a new R4RS-compliant syntax for keywords. A token of the form #:NAME, where NAME has the same syntax as a Scheme symbol, is the external representation of the keyword named NAME. Keyword objects print using this syntax as well, so values containing keyword objects can be read back into Guile. When used in an expression, keywords are self-quoting objects.
Guile supports this read syntax, and uses this print syntax, regardless of the current setting of the `keyword' read option. The `keyword' read option only controls whether Guile recognizes the `:NAME' syntax, which is incompatible with R4RS. (R4RS says such tokens represent symbols.)
The default behaviour is the R4RS-compliant one (#:xxx instead of :xxx).
To change between the two keyword syntaxes you use the read-options procedure documented in section General option interface and section Reader options.
To use the :xxx keyword syntax, use
(read-set! keywords 'prefix)
To make keyword syntax R4RS compliant, with the #:xxx syntax, use:
(read-set! keywords #f)
[FIXME: in state of flux right now; here are the Scheme primitives defined in `kw.c':]
keyword? returns #t if the argument kw is a keyword; it returns #f otherwise.
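A brief sketch of the predicate and of the self-quoting behaviour, using the default #:NAME read syntax. The keyword name width is arbitrary.

```scheme
;; A keyword literal needs no quote: it evaluates to itself.
(define k #:width)

(display (keyword? k))          (newline)   ; #t
(display (keyword? 'width))     (newline)   ; #f, a symbol is not a keyword
(display (keyword? ":width"))   (newline)   ; #f, neither is a string

;; Quoted and unquoted forms denote the same keyword object.
(display (eq? #:width '#:width))
(newline)
```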
keyword-dash-symbol [FIXME: have no idea what this does; it is not commented.]
(handler key args ...)
key is a symbol or #t.
thunk takes no arguments. If thunk returns normally, that is the return value of catch.
Handler is invoked outside the scope of its own catch. If handler again throws to the same key, a new handler from further up the call chain is invoked.
If the key is #t, then a throw to any symbol will match this call to catch.
key is a symbol. It will match catches of the same symbol or of #t.
If there is no handler at all, an error is signaled.
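The interplay of catch and throw can be sketched as follows. The key div-by-zero, the helper safe-div and the fallback value are all invented for this example.

```scheme
;; Hypothetical helper: divide A by B, returning the symbol `no-result'
;; when the thunk throws to the key div-by-zero.
(define (safe-div a b)
  (catch 'div-by-zero
    (lambda ()
      (if (= b 0)
          (throw 'div-by-zero a)
          (/ a b)))
    ;; The handler receives the key followed by the arguments to throw.
    (lambda (key . args)
      'no-result)))

(display (safe-div 6 3))   (newline)   ; 2
(display (safe-div 1 0))   (newline)   ; no-result

;; A key of #t matches a throw to any symbol.
(define caught
  (catch #t
    (lambda () (throw 'anything 1 2))
    (lambda (key . args) (cons key args))))
(write caught)             (newline)   ; (anything 1 2)
```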
Raise an error with key misc-error and a message constructed by displaying msg and writing args.
#f. message is the error message string, possibly containing %S and %s escapes. When an error is reported, these are replaced by formatting the corresponding members of args: %s formats using display and %S formats using write. data is a list or #f depending on key: if key is system-error then it should be a list containing the Unix errno value; if key is signal then it should be a list containing the Unix signal number; otherwise it will usually be #f.
#f is returned instead.
It is traditional in Scheme to implement exception systems using call-with-current-continuation, but this has not been done, for performance reasons. The implementation of call-with-current-continuation is a stack copying implementation. This allows it to interact well with ordinary C code. Unfortunately, a stack-copying implementation can be slow -- creating a new continuation involves a block copy of the stack.
Instead of using call-with-current-continuation, the exception primitives are implemented as built-ins that take advantage of the upward only nature of exceptions.
[FIXME: somewhat babbling; should be reviewed by someone who understands modules, once the new module system is in place]
When programs become large, naming conflicts can occur when a function or global variable defined in one file has the same name as a function or global variable in another file. Even just a similarity between function names can cause hard-to-find bugs, since a programmer might type the wrong function name.
The approach used to tackle this problem is called information encapsulation, which consists of packaging functional units into a given name space that is clearly separated from other name spaces.
The language features that allow this are usually called the module system because programs are broken up into modules that are compiled separately (or loaded separately in an interpreter).
Older languages, like C, have limited support for name space manipulation and protection. In C a variable or function is public by default, and can be made local to a module with the static keyword. But you cannot reference public variables and functions from another module with different names.
More advanced module systems have become a common feature in recently designed languages: ML, Python, Perl, and Modula 3 all allow the renaming of objects from a foreign module, so they will not clutter the global name space.
Scheme, as defined in R4RS, does not have a module system at all.
Aubrey Jaffer, mostly to support his portable Scheme library SLIB, implemented a provide/require mechanism for many Scheme implementations. Library files in SLIB provide a feature, and when user programs require that feature, the library file is loaded in.
For example, the file `random.scm' in the SLIB package contains the line
(provide 'random)
so to use its procedures, a user would type
(require 'random)
and they would magically become available, but still have the same names! So this method is nice, but not as good as a full-featured module system.
In 1996 Tom Lord implemented a full-featured module system for Guile which allows loading Scheme source files into a private name space.
This module system is regarded as being rather idiosyncratic, and will probably change to something more like the ML module system, so for now I will simply describe how it works for a couple of simple cases.
First of all, the Guile module system sets up a hierarchical name space, and that name space can be represented like Unix pathnames preceded by a # character. The root name space for all Guile-supplied modules is called ice-9.
So for example, the SLIB interface, contained in `$srcdir/ice-9/slib.scm', starts out with
(define-module (ice-9 slib))
and a user program can use
(use-modules (ice-9 slib))
to have access to all procedures and variables defined within the slib module with (define-public ...).
So here are the functions involved:
module-specification is of the form (hierarchy file). One example of this is
(use-modules (ice-9 slib))
define-module makes this module available to Guile programs under the given module-specification.
module-specification is of the form (hierarchy file). One example of this is
(use-modules (ice-9 slib))
use-modules allows the current Guile program to use all publicly defined procedures and variables in the module denoted by module-specification.
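A minimal sketch of the two forms working together is given below. For brevity it places both modules in one file, which is an assumption of this example rather than the usual layout; the module names (demo greetings) and (demo main) and the procedure greet are hypothetical.

```scheme
;; First module: only definitions made with define-public are exported.
(define-module (demo greetings))

(define prefix "hello, ")            ; private to (demo greetings)

(define-public (greet name)          ; visible to users of the module
  (string-append prefix name))

;; Second module: import the public bindings of the first.
(define-module (demo main))
(use-modules (demo greetings))

(display (greet "world"))            ; prints: hello, world
(newline)
```

In a real program each define-module form would normally open its own file, placed where use-modules can find it.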
[FIXME: must say more, and explain, and also demonstrate a private name space use, and demonstrate how one would do Python's "from Tkinter import *" versus "import Tkinter". Must also add something about paths and standards for contributed modules.]
Some modules are included in the Guile distribution; here are references to the entries in this manual which describe them in more detail:
Often you will want to extend Guile by linking it with some existing system library. For example, linking Guile with a curses or termcap library would be useful if you want to implement a full-screen user interface for a Guile application. However, if you were to link Guile with these libraries at compile time, it would bloat the interpreter considerably, affecting everyone on the system even if the new libraries are useful only to you. Also, every time a new library is installed, you would have to reconfigure, recompile and relink Guile merely in order to provide a new interface.
Many Unix systems permit you to get around this problem by using dynamic loading. When a new library is linked, it can be made a dynamic library by passing certain switches to the linker. A dynamic library does not need to be linked with an executable image at link time; instead, the executable may choose to load it dynamically at run time. This is a powerful concept that permits an executable to link itself with almost any library without reconfiguration, if it has been written properly.
Guile's dynamic linking functions make it relatively easy to write a module that incorporates code from third-party object code libraries.
Return #t if obj is a dynamic library handle, or #f otherwise.
dynamic-func. Otherwise, it should be a function handle returned by a previous call to dynamic-func. The return value is unspecified.
dynamic-call, proc should be either a function handle or a string, in which case it is first fetched from lib with dynamic-func. proc is assumed to return an integer, which is used as the return value from dynamic-args-call.
use-modules, for instance), and whose cdr is the function handle for that module's initializer function.
[FIXME: provide a brief example here of writing the C hooks for an object code module, and using dynamic-link and dynamic-call to load the module.]
Most modern Unices have something called shared libraries. This ordinarily means that they have the capability to share the executable image of a library between several running programs to save memory and disk space. But generally, shared libraries give a lot of additional flexibility compared to the traditional static libraries. In fact, calling them `dynamic' libraries is as correct as calling them `shared'.
Shared libraries really give you a lot of flexibility in addition to the memory and disk space savings. When you link a program against a shared library, that library is not closely incorporated into the final executable. Instead, the executable of your program only contains enough information to find the needed shared libraries when the program is actually run. Only then, when the program is starting, is the final step of the linking process performed. This means that you need not recompile all programs when you install a new, only slightly modified version of a shared library. The programs will pick up the changes automatically the next time they are run.
Now, when all the necessary machinery is there to perform part of the linking at run-time, why not take the next step and allow the programmer to explicitly take advantage of it from within his program? Of course, many operating systems that support shared libraries do just that, and chances are that Guile will allow you to access this feature from within your Scheme programs. As you might have guessed already, this feature is called dynamic linking(1)
As with many aspects of Guile, there is a low-level way to access the dynamic linking apparatus, and a more high-level interface that integrates dynamically linked libraries into the module system.
When using the low level procedures to do your dynamic linking, you have complete control over which library is loaded when and what gets done with it.
Normally, library is just the name of some shared library file that will be searched for in the places where shared libraries usually reside, such as in `/usr/lib' and `/usr/local/lib'.
dynobj should be a handle returned by dynamic-link. When dynamic-unlink has been called on dynobj, it is no longer usable as an argument to the functions below and you will get type mismatch errors when you try to use it.
dynamic-call to actually call this function. Right now, these Scheme objects are formed by casting the address of the function to long and converting this number to its Scheme representation.
Regardless of whether your C compiler prepends an underscore `_' to the global names in a program, you should not include this underscore in function. Guile knows whether the underscore is needed or not and will add it when necessary.
When function is a function handle returned by dynamic-func, call that function and ignore dynobj. When function is a string (or symbol, etc.), look it up in dynobj; this is equivalent to
(dynamic-call (dynamic-func function dynobj) #f)
Interrupts are deferred while the C function is executing (with SCM_DEFER_INTS/SCM_ALLOW_INTS).
Call the C function as with dynamic-call, but pass it some arguments and return its return value. The C function is expected to take two arguments and return an int, just like main:
int c_func (int argc, char **argv);
The parameter args must be a list of strings and is converted into an array of char *. The array is passed in argv and its size in argc. The return value is converted to a Scheme number and returned from the call to dynamic-args-call.
When dynamic linking is disabled or not supported on your system, the above functions throw errors, but they are still available.
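Because these procedures signal errors rather than disappearing, a program can probe for a library defensively. The following is a minimal sketch; the helper try-dynamic-link and the library name are made up, and it assumes that a failed dynamic-link throws an error that catch can intercept.

```scheme
;; Hypothetical helper: return a library handle, or #f when linking
;; fails for any reason (unsupported platform, missing file, ...).
(define (try-dynamic-link name)
  (catch #t
    (lambda () (dynamic-link name))
    (lambda (key . args) #f)))

;; The name below is deliberately one that should not exist.
(define lib (try-dynamic-link "no-such-library-xyz"))

(display (if lib "linked" "could not link"))
(newline)
```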
Here is a small example that works on GNU/Linux:
(define libc-obj (dynamic-link "libc.so"))
libc-obj
=> #<dynamic-object "libc.so">
(dynamic-args-call 'rand libc-obj '())
=> 269167349
(dynamic-unlink libc-obj)
libc-obj
=> #<dynamic-object "libc.so" (unlinked)>
As you can see, after calling dynamic-unlink on a dynamically linked library, it is marked as `(unlinked)' and you are no longer able to use it with dynamic-call, etc. Whether the library is really removed from your program is system-dependent and will generally not happen when some other parts of your program still use it. In the example above, libc is almost certainly not removed from your program because it is badly needed by almost everything.
The functions to call a function from a dynamically linked library, dynamic-call and dynamic-args-call, are not very powerful. They are mostly intended to be used for calling specially written initialization functions that will then add new primitives to Guile. For example, we do not expect that you will dynamically link `libX11' with dynamic-link and then construct a beautiful graphical user interface just by using dynamic-call and dynamic-args-call. Instead, the usual way would be to write a special Guile<->X11 glue library that has intimate knowledge about both Guile and X11 and does whatever is necessary to make them inter-operate smoothly. This glue library could then be dynamically linked into a vanilla Guile interpreter and activated by calling its initialization function. That function would add all the new types and primitives to the Guile interpreter that it has to offer.
From this setup the next logical step is to integrate these glue libraries into the module system of Guile so that you can load new primitives into a running system just as you can load new Scheme code.
There is, however, another possibility to get a more thorough access to the functions contained in a dynamically linked library. Anthony Green has written `libffi', a library that implements a foreign function interface for a number of different platforms. With it, you can extend the Spartan functionality of dynamic-call and dynamic-args-call considerably. There is glue code available in the Guile contrib archive to make `libffi' accessible from Guile.
The new primitives that you add to Guile with gh_new_procedure or with any of the other mechanisms are normally placed into the same module as all the other builtin procedures (like display). However, it is also possible to put new primitives into their own module.
The mechanism for doing so is not very well thought out and is likely to change when the module system of Guile itself is revised, but it is simple and useful enough to document it as it stands.
What gh_new_procedure and the functions used by the snarfer really do is to add the new primitives to whatever module is the current module when they are called. This is analogous to the way Scheme code is put into modules: the define-module expression at the top of a Scheme source file creates a new module and makes it the current module while the rest of the file is evaluated. The define expressions in that file then add their new definitions to this current module.
Therefore, all we need to do is to make sure that the right module is current when calling gh_new_procedure for our new primitives. Unfortunately, there is not yet an easy way to access the module system from C, so we are better off with a more indirect approach. Instead of adding our primitives at initialization time we merely register with Guile that we are ready to provide the contents of a certain module, should it ever be needed.
The function initfunc should perform the usual initialization actions for your new primitives, like calling gh_new_procedure or including the file produced by the snarfer. When initfunc is called, the current module is a newly created module with a name as indicated by name. Each definition that is added to it will be automatically exported.
The string name indicates the hierarchical name of the new module. It should consist of the individual components of the module name separated by single spaces. That is, the Scheme module name (foo bar), which is a list, should be written as "foo bar" for the name parameter.
You can call scm_register_module_xxx at any time, even before Guile has been initialized. This might be useful when you want to put the call to it in some initialization code that is magically called before main, like constructors for global C++ objects.
An example for scm_register_module_xxx appears in the next section.
Now, instead of calling the initialization function at program startup, you should simply call scm_register_module_xxx and pass it the initialization function. When the named module is later requested by Scheme code with use-modules for example, Guile will notice that it knows how to create this module and will call the initialization function at the right time in the right context.
The most interesting application of dynamically linked libraries is probably to use them for providing compiled code modules to Scheme programs. As much fun as programming in Scheme is, every now and then comes the need to write some low-level C stuff to make Scheme even more fun.
Not only can you put these new primitives into their own module (see the previous section), you can even put them into a shared library that is only then linked to your running Guile image when it is actually needed.
An example will hopefully make everything clear. Suppose we want to make the Bessel functions of the C library available to Scheme in the module `(math bessel)'. First we need to write the appropriate glue code to convert the arguments and return values of the functions from Scheme to C and back. Additionally, we need a function that will add them to the set of Guile primitives. Because this is just an example, we will only implement this for the j0 function, though.
#include <math.h>
#include <guile/gh.h>

SCM
j0_wrapper (SCM x)
{
  return gh_double2scm (j0 (gh_scm2double (x)));
}

void
init_math_bessel ()
{
  gh_new_procedure1_0 ("j0", j0_wrapper);
}
We can already try to bring this into action by manually calling the low level functions for performing dynamic linking. The C source file needs to be compiled into a shared library. Here is how to do it on GNU/Linux; please refer to the libtool documentation for how to create dynamically linkable libraries portably.
gcc -shared -o libbessel.so -fPIC bessel.c
Now fire up Guile:
(define bessel-lib (dynamic-link "./libbessel.so"))
(dynamic-call "init_math_bessel" bessel-lib)
(j0 2)
=> 0.223890779141236
The filename `./libbessel.so' should be pointing to the shared library produced with the gcc command above, of course. The second line of the Guile interaction will call the init_math_bessel function which in turn will register the C function j0_wrapper with the Guile interpreter under the name j0. This function becomes immediately available and we can call it from Scheme.
Fun, isn't it? But we are only half way there. This is what apropos has to say about j0:
(apropos 'j0)
-| the-root-module: j0 #<primitive-procedure j0>
As you can see, j0 is contained in the root module, where all the other Guile primitives like display, etc., live. In general, a primitive is put into whatever module is the current module at the time gh_new_procedure is called. To put j0 into its own module named `(math bessel)', we need to make a call to scm_register_module_xxx. Additionally, to have Guile perform the dynamic linking automatically, we need to put `libbessel.so' into a place where Guile can find it. The call to scm_register_module_xxx should be contained in a specially named module init function. Guile knows about this special name and will call that function automatically after having linked in the shared library. For our example, we add the following code to `bessel.c':
void
scm_init_math_bessel_module ()
{
  scm_register_module_xxx ("math bessel", init_math_bessel);
}
The general pattern for the name of a module init function is: `scm_init_', followed by the name of the module where the individual hierarchical components are concatenated with underscores, followed by `_module'. It should call scm_register_module_xxx with the correct module name and the appropriate initialization function. When that initialization function is called, a newly created module with the right name will be the current module so that all definitions that the initialization function makes will end up in the correct module.
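The naming pattern can be spelled out in a few lines of Scheme. This is purely illustrative: the helpers join-symbols, module-init-function-name and module-register-name are hypothetical and are not provided by Guile.

```scheme
;; Hypothetical helper: join the names of the symbols in SYMS,
;; placing the string SEP between consecutive components.
(define (join-symbols syms sep)
  (let loop ((rest (cdr syms))
             (acc (symbol->string (car syms))))
    (if (null? rest)
        acc
        (loop (cdr rest)
              (string-append acc sep (symbol->string (car rest)))))))

;; Name of the C module init function: scm_init_ + components + _module.
(define (module-init-function-name module-name)
  (string-append "scm_init_" (join-symbols module-name "_") "_module"))

;; String form of the module name as passed to scm_register_module_xxx.
(define (module-register-name module-name)
  (join-symbols module-name " "))

(display (module-init-function-name '(math bessel)))   ; scm_init_math_bessel_module
(newline)
(display (module-register-name '(math bessel)))        ; math bessel
(newline)
```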
After `libbessel.so' has been rebuilt, we need to place the shared library into the right place. When Guile tries to autoload the `(math bessel)' module, it looks not only for a file called `math/bessel.scm' in its %load-path, but also for `math/libbessel.so'. So all we need to do is to create a directory called `math' somewhere in Guile's %load-path and place `libbessel.so' there. Normally, the current directory `.' is in the %load-path, so we just use that for this example.
% mkdir math
% cd math
% ln -s ../libbessel.so .
% cd ..
% guile
guile> (use-modules (math bessel))
guile> (j0 2)
0.223890779141236
guile> (apropos 'j0)
-| bessel: j0 #<primitive-procedure j0>
That's it!
Note that we used a symlink to make `libbessel.so' appear in the right spot. This is probably not a bad idea in general. The directories that the `%load-path' normally contains are supposed to contain only architecture independent files. They are not really the right place for a shared library. You might want to install the libraries somewhere below `exec_prefix' and then symlink to them from the architecture independent directory. This will at least work on heterogeneous systems where the architecture dependent stuff resides in the same place on all machines (which seems like a good idea to me anyway).
[FIXME: this is pasted in from Tom Lord's original guile.texi and should be reviewed]
in-guard is called, then thunk, then out-guard.
If, any time during the execution of thunk, the continuation of the dynamic-wind expression is escaped non-locally, out-guard is called. If the continuation of the dynamic-wind is re-entered, in-guard is called. Thus in-guard and out-guard may be called any number of times.
(define x 'normal-binding)
=> x
(define a-cont
  (call-with-current-continuation
   (lambda (escape)
     (let ((old-x x))
       (dynamic-wind
        ;; in-guard:
        ;;
        (lambda () (set! x 'special-binding))
        ;; thunk
        ;;
        (lambda () (display x) (newline)
                   (call-with-current-continuation escape)
                   (display x) (newline)
                   x)
        ;; out-guard:
        ;;
        (lambda () (set! x old-x)))))))
;; Prints:
special-binding
;; Evaluates to:
=> a-cont
x
=> normal-binding
(a-cont #f)
;; Prints:
special-binding
;; Evaluates to:
=> a-cont ;; the value of the (define a-cont...)
x
=> normal-binding
a-cont
=> special-binding
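A simpler sketch, using throw rather than a captured continuation, shows the order in which the guards run when thunk is left non-locally. The names trace and note! and the key bail are invented for this illustration.

```scheme
;; Record each step so the order of calls can be inspected afterwards.
(define trace '())
(define (note! step) (set! trace (cons step trace)))

(catch 'bail
  (lambda ()
    (dynamic-wind
      (lambda () (note! 'in))                 ; in-guard
      (lambda ()                              ; thunk
        (note! 'body)
        (throw 'bail)
        (note! 'not-reached))
      (lambda () (note! 'out))))              ; out-guard
  (lambda (key . args) (note! 'handler)))

(write (reverse trace))   ; prints: (in body out handler)
(newline)
```

The out-guard runs before the handler because the handler is invoked outside the scope of its catch, after the dynamic-wind has been left.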
[FIXME: This is pasted in from Tom Lord's original guile.texi chapter plus the Cygnus programmer's manual; it should be *very* carefully reviewed and largely reorganized.]
A dynamic root is a root frame of Scheme evaluation. The top-level repl, for example, is an instance of a dynamic root.
Each dynamic root has its own chain of dynamic-wind information. Each has its own set of continuations, jump-buffers, and pending CATCH statements which are inaccessible from the dynamic scope of any other dynamic root.
In a thread-based system, each thread has its own dynamic root. Therefore, continuations created by one thread may not be invoked by another.
Even in a single-threaded system, it is sometimes useful to create a new dynamic root. For example, if you want to apply a procedure without allowing it to capture the current continuation, calling the procedure under a new dynamic root will do the job.
If an error occurs during evaluation, apply handler to the arguments to the throw, just as throw would. If this happens, handler is called outside the scope of the new root -- it is called in the same dynamic context in which call-with-dynamic-root was evaluated.
If thunk captures a continuation, the continuation is rooted at the call to thunk. In particular, the call to call-with-dynamic-root is not captured. Therefore, call-with-dynamic-root always returns at most one time.
Before calling thunk, the dynamic-wind chain is unwound back to the root and a new chain is started for thunk. Therefore, this call may not do what you expect:
;; Almost certainly a bug:
(with-output-to-port some-port
  (lambda ()
    (call-with-dynamic-root
     (lambda ()
       (display 'fnord)
       (newline))
     (lambda (errcode) errcode))))
The problem is, on what port will `fnord\n' be displayed? You might expect that, because of the with-output-to-port, it will be displayed on the port bound to some-port. But it probably won't -- before evaluating the thunk, dynamic winds are unwound, including those created by with-output-to-port. So, the standard output port will have been reset to its default value before display is evaluated.
(This function was added to Guile mostly to help calls to functions in C libraries that can not tolerate non-local exits or calls that return multiple times. If such functions call back to the interpreter, it should be under a new dynamic root.)
These objects are only useful for comparison using eq?. They are currently represented as numbers, but your code should in no way depend on this.
If integer exit_val is specified and if Guile is being used stand-alone and if quit is called from the initial dynamic-root, exit_val becomes the exit status of the Guile process and the process exits.
When Guile is run interactively, errors are caught from within the
read-eval-print loop. An error message will be printed and abort
called. A default set of signal handlers is installed, e.g., to allow
user interrupt of the interpreter.
It is possible to switch to a "batch mode", in which the interpreter will terminate after an error and in which all signals cause their default actions. Switching to batch mode causes any handlers installed from Scheme code to be removed. An example of where this is useful is after forking a new process intended to run non-interactively.
#f case has not been implemented.
[NOTE: this chapter was written for Cygnus Guile and has not yet been updated for the Guile 1.x release.]
Here is the reference for Guile's threads. In this chapter I simply quote verbatim Tom Lord's description of the low-level primitives written in C (basically an interface to the POSIX threads library) and Anthony Green's description of the higher-level thread procedures written in Scheme.
When using Guile threads, keep in mind that each guile thread is executed in a new dynamic root.
If an error occurs during evaluation, call error-thunk, passing it an error code describing the condition. [Error codes are currently meaningless integers. In the future, real values will be specified.] If this happens, the error-thunk is called outside the scope of the new root -- it is called in the same dynamic context in which with-new-thread was evaluated, but not in the caller's thread.
All the evaluation rules for dynamic roots apply to threads.
If an error occurs during evaluation, call error-thunk, passing it an error code describing the condition. [Error codes are currently meaningless integers. In the future, real values will be specified.] If this happens, the error-thunk is called outside the scope of the new root -- it is called in the same dynamic context in which with-new-thread was evaluated, but not in the caller's thread.
All the evaluation rules for dynamic roots apply to threads.
[FIXME: this is pasted in from Tom Lord's original guile.texi and should be reviewed]
[FIXME: This chapter is based on Mikael Djurfeldt's answer to a question by Michael Livshin. Any mistakes are not theirs, of course. ]
Weak references let you attach bookkeeping information to data so that the additional information automatically disappears when the original data is no longer in use and gets garbage collected. In a weak key hash, the hash entry for a key disappears as soon as the key is no longer referenced from anywhere else. For weak value hashes, the same happens as soon as the value is no longer in use. Entries in a doubly weak hash disappear when either the key or the value is no longer used anywhere else.
Property lists offer the same kind of functionality as weak key hashes in many situations. (see section Property Lists)
Here's an example (a little bit strained perhaps, but one of the examples is actually used in Guile):
Assume that you're implementing a debugging system where you want to associate information about filename and position of source code expressions with the expressions themselves.
Hash tables can be used for that, but if you use ordinary hash tables it will be impossible for the Scheme interpreter to "forget" old source when, for example, a file is reloaded.
To implement the mapping from source code expressions to positional information it is necessary to use weak-key tables since we don't want the expressions to be remembered just because they are in our table.
To implement a mapping from source file line numbers to source code expressions you would use a weak-value table.
To implement a mapping from source code expressions to the procedures they constitute, a doubly-weak table has to be used.
You can modify weak hash tables in exactly the same way you would modify regular hash tables. (see section Hash Tables)
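The debugging scenario above can be sketched with a weak-key table. This is only an illustration: it assumes the constructor make-weak-key-hash-table (taking a size argument) together with the ordinary hashq-set! and hashq-ref procedures, and the names source-positions, note-position! and position-of are hypothetical.

```scheme
;; A weak-key table: an entry lives only as long as its key (the
;; expression object) is referenced from somewhere else.
(define source-positions (make-weak-key-hash-table 31))

;; Hypothetical helper: remember where EXPR was read from.
(define (note-position! expr file line)
  (hashq-set! source-positions expr (cons file line)))

;; Hypothetical helper: look up the position, or #f if unknown.
(define (position-of expr)
  (hashq-ref source-positions expr #f))

(define expr (list '+ 1 2))
(note-position! expr "demo.scm" 10)

(define found (position-of expr))              ; the pair stored above
(define not-found (position-of (list '+ 1 2)))  ; a different object, so #f
```

Because the lookup uses eq?, only the very same expression object finds its entry; once that object becomes garbage the entry can be dropped by the collector.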
Weak vectors are mainly useful in Guile's implementation of weak hash tables.
weak-vector uses the list of its arguments while list->weak-vector uses its only argument l (a list) to construct a weak vector the same way list->vector would.
[FIXME: this is pasted in from Tom Lord's original guile.texi and should be reviewed]
It is often useful to have site-specific information about the current Guile installation. This chapter describes how to find out about Guile's configuration at run time.
(version) => "1.3a"
(major-version) => "1"
(minor-version) => "3a"
libguile was configured. This is used to determine whether the Guile core interpreter and the ice-9 runtime have grown out of date with one another.
--- The name of this chapter needs to clearly distinguish it from the appendix describing the debugger UI. The intro should have a pointer to the UI appendix.
@unnumbered{Part III: Unix Programming}
The low-level interfaces are designed to give Scheme programs access to as much functionality as possible from the underlying Unix system. They can be used to implement higher level interfaces such as the Scheme shell (see section The Scheme shell (scsh)).
Generally there is a single procedure for each corresponding Unix facility. There are some exceptions, such as procedures implemented for speed and convenience in Scheme with no primitive Unix equivalent, e.g., copy-file.
The interfaces are intended as far as possible to be portable across different versions of Unix, so that Scheme programmers don't need to be concerned with implementation differences. In some cases procedures which can't be implemented (or reimplemented) on particular systems may become no-ops, or perform limited actions. In other cases they may throw errors. It should be possible to use the feature system to determine what functionality is available.
General naming conventions are as follows:
recv!.
#t or #f) have question marks added, e.g., access?.
primitive-fork.
EPERM or R_OK are converted to Scheme variables of the same name (underscores are not replaced with hyphens).
Most of the procedures can be relied on to return a well-specified value. Unexpected conditions are handled by raising exceptions.
There are a few procedures which return a special value if they don't succeed, e.g., getenv returns #f if the requested string is not found in the environment. These cases will be noted in the documentation.
For ways to deal with exceptions, see section Exceptions.
Errors which the C-library would report by returning a NULL pointer or through some other means are reported by raising a system-error exception. The value of the Unix errno variable is available in the data passed by the exception. Accessing the global errno value directly would be unreliable due to continuations, interrupts or multiple threads.
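As a rough sketch of this convention, the following catches a system-error and extracts the errno value from the exception data. It assumes the helper system-error-errno found in current Guile, which is not described in this chapter; errno-of and the path in missing are made up for the example.

```scheme
;; Run THUNK; return #f if it succeeds, or the errno value carried
;; by the system-error exception if it fails.
(define (errno-of thunk)
  (catch 'system-error
    (lambda () (thunk) #f)
    ;; ARGS is the full list of throw arguments, starting with the key.
    (lambda args (system-error-errno args))))

;; A path assumed not to exist on the machine running the example.
(define missing "/no/such/file/for-this-example")

(define code (errno-of (lambda () (stat missing))))
(define ok (errno-of (lambda () (stat "/"))))
```

Here code can be compared against variables such as ENOENT, and ok is #f because stat on the root directory succeeds.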
Conventions largely follow those of scsh (see section The Scheme shell (scsh)).
Guile ports are currently based on C stdio streams. Ports can be buffered or unbuffered. Unbuffered ports can be specified by including 0 in a port mode string. Note that some system call interfaces (e.g., recv!) will accept ports as arguments, but will actually operate directly on the file descriptor underlying the port. Any port buffering is ignored, including the one character buffer used to implement peek-char and unread-char.
File descriptors are generally not useful for Scheme programs; however they can be needed when interfacing with foreign code and the Unix environment.
A file descriptor can be extracted from a port and later converted back to a port. However a file descriptor is just an integer, and the garbage collector doesn't recognise it as a reference to the port. If all other references to the port were dropped, then it's likely that the garbage collector would free the port, with the side-effect of closing the file descriptor prematurely.
To assist the programmer in avoiding this problem, each port has an associated "revealed count" which can be used to keep track of how many times the underlying file descriptor has been stored in other places. The idea is for the programmer to ensure that the revealed count will be greater than zero if the file descriptor is needed elsewhere.
For the simple case where a file descriptor is "imported" once to become a port, it does not matter if the file descriptor is closed when the port is garbage collected. There is no need to maintain a revealed count. Likewise when "exporting" a file descriptor to the external environment, setting the revealed count is not required if the port is kept open while the file descriptor is in use.
To correspond with traditional Unix behaviour, the three file descriptors (0, 1 and 2) are automatically imported when a program starts up and assigned to the initial values of the current input, output and error ports. The revealed count for each is initially set to one, so that dropping references to one of these ports will not result in its garbage collection: it could be retrieved with fdopen or fdes->ports.
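The revealed count can be observed directly. The sketch below assumes the procedures port-revealed, port->fdes and release-port-handle as provided by current Guile (their entries are not reproduced here), and uses /dev/null only as a convenient file to open.

```scheme
(define port (open-input-file "/dev/null"))

;; A freshly opened port has not had its descriptor handed out yet.
(define before (port-revealed port))

;; Extracting the descriptor increments the revealed count, which
;; protects the port from being closed by the garbage collector.
(define fd (port->fdes port))
(define during (port-revealed port))

;; When the descriptor is no longer stored elsewhere, give it back.
(release-port-handle port)
(define after (port-revealed port))

(close-port port)
```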
flags can be constructed by combining variables using logior.
Basic flags are:
See the Unix documentation of the open system call for additional flags.
open but returns a file descriptor instead of a port.
duplicate-port with an appropriate mode string.
The next group of procedures perform a dup2 system call if newfd (an integer) is supplied, otherwise a dup. The file descriptor to be duplicated can be supplied as an integer or contained in a port. The type of value returned varies depending on which procedure is used.
All procedures also have the side effect, when performing dup2, that any ports using newfd are moved to a different file descriptor and have their revealed counts set to zero.
Unexpected behaviour can result if both ports are subsequently used and the original and/or duplicate ports are buffered. The mode string can include 0 to obtain an unbuffered duplicate port.
This procedure is equivalent to (dup->port port modes).
The return value is unspecified.
Unexpected behaviour can result if both ports are subsequently used and the original and/or duplicate ports are buffered.
This procedure does not have any side effects on other ports or revealed counts.
_IONBF
_IOLBF
_IOFBF
This procedure should not be used after I/O has been performed with the port.
Ports are usually block buffered by default, with a default buffer size.
Procedures which accept a mode string (e.g., see section File Ports) allow 0 to be added to request an unbuffered port.
Values for command are:
F_DUPFD
F_GETFD
F_SETFD
F_GETFL
F_SETFL
F_GETOWN
SIGIO signals.
F_SETOWN
SIGIO signals.
FD_CLOEXEC
F_GETFL or F_SETFL.
Buffered input or output data is (currently, but this may change) ignored: select uses the underlying file descriptor of a port (char-ready? will check input buffers, output buffers are problematic).
The return value is a list of subsets of the input lists or vectors for which the requested condition has been met.
It is not quite compatible with scsh's select: scsh checks port buffers, doesn't accept input lists or a microsecond timeout, returns multiple values instead of a list and has an additional select! interface.
These procedures allow querying and setting file system attributes (such as owner, permissions, sizes and types of files); deleting, copying, renaming and linking files; creating and removing directories and querying their contents; syncing the file system and creating special files.
#t if path corresponds to an existing file and the current process has the type of access specified by how, otherwise #f. how should be specified using the values of the variables listed below. Multiple values can be combined using a bitwise or, in which case #t will only be returned if all accesses are granted.
Permissions are checked using the real id of the current process, not the effective id, although it's the effective id which determines whether the access would actually be granted.
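For illustration, a combined check might look like the following. It assumes the variables R_OK, W_OK and F_OK are among those accepted for how; readable-and-writable? is a hypothetical helper and the second path is assumed not to exist.

```scheme
;; #t only if both read and write access would be granted.
(define (readable-and-writable? path)
  (access? path (logior R_OK W_OK)))

;; F_OK asks only whether the file exists.
(define root-exists (access? "/" F_OK))
(define missing-exists (access? "/no/such/file/for-this-example" F_OK))

(define rw (readable-and-writable? "/"))   ; depends on who runs it
```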
fstat is used as the underlying system call).
The object returned by stat can be passed as a single parameter to the following procedures, all of which return integers:
stat:dev
stat:ino
stat:mode
stat:type and stat:perms below.
stat:nlink
stat:uid
stat:gid
stat:rdev
stat:size
stat:atime
stat:mtime
stat:ctime
stat:blksize
stat:blocks
In addition, the following procedures return the information from stat:mode in a more convenient form:
stat:type
stat:perms
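A short sketch of these accessors applied to the root directory; it assumes stat:type returns a symbol such as directory or regular, as in current Guile.

```scheme
(define st (stat "/"))

(define kind (stat:type st))      ; a symbol describing the file type
(define perms (stat:perms st))    ; the permission bits as an integer
(define size (stat:size st))      ; size in bytes

;; Permissions are easiest to read in octal.
(define perms-octal (number->string perms 8))
```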
stat, but does not follow symbolic links, i.e., it will return information about a symbolic link itself, not the file it points to. path must be a string.
If obj is a symbolic link, either the ownership of the link or the ownership of the referenced file will be changed depending on the operating system (lchown is unsupported at present). If owner or group is specified as -1, then that ID is not changed.
fchmod is used as the underlying system call).
mode specifies the new permissions as an integer, conventionally written in octal, e.g., (chmod "foo" #o755). The return value is unspecified.
utime sets the access and modification times for the file named by path. If actime or modtime is not supplied, then the current time is used. actime and modtime must be integer time values as returned by the current-time procedure.
E.g., (utime "foo" (- (current-time) 3600)) will set the access time to one hour in the past and the modification time to the current time.
truncate and ftruncate.
The return value is unspecified.
readdir will return the first directory entry.
E.g.,
(mknod "/dev/fd0" 'block-special #o660 (+ (* 2 256) 2))
The return value is unspecified.
tmpnam function in the system libraries.
The facilities in this section provide an interface to the user and group database. They should be used with care since they are not reentrant.
The following functions accept an object representing user information and return a selected component:
passwd:name
passwd:passwd
passwd:uid
passwd:gid
passwd:gecos
passwd:dir
passwd:shell
getpwent to read from the user database. The next use of getpwent will return the first entry. The return value is unspecified.
setpwent.
getpwent. The return value is unspecified.
setpwent and endpwent procedures are implemented on top of this.
The following functions accept an object representing group information and return a selected component:
group:name
group:passwd
group:gid
group:mem
getgrent to read from the group database. The next use of getgrent will return the first entry. The return value is unspecified.
setgrent.
getgrent. The return value is unspecified.
setgrent and endgrent procedures are implemented on top of this.
The following procedures either accept an object representing a broken down time and return a selected component, or accept an object representing a broken down time and a value and set the component to the value. The numbers in parentheses give the usual range.
tm:sec, set-tm:sec
tm:min, set-tm:min
tm:hour, set-tm:hour
tm:mday, set-tm:mday
tm:mon, set-tm:mon
tm:year, set-tm:year
tm:wday, set-tm:wday
tm:yday, set-tm:yday
tm:isdst, set-tm:isdst
tm:gmtoff, set-tm:gmtoff
tm:zone, set-tm:zone
current-time. The time zone for the calculation is optionally specified by zone (a string), otherwise the TZ environment variable or the system default is used.
current-time. The values are calculated for UTC.
zone is an optional time zone specifier (otherwise the TZ environment variable or the system default is used).
Returns a pair: the CAR is a corresponding integer time value like that returned by current-time; the CDR is a broken down time object, similar to bd-time but with normalized values.
localtime or gmtime. template is a string which can include formatting specifications introduced by a % character. The formatting of month and day names is dependent on the current locale. The value returned is the formatted string. (See section `Formatting Date and Time' in The GNU C Library Reference Manual.)
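As a small worked example, the epoch (time value 0) can be broken down with gmtime and formatted with strftime. Only numeric format specifications are used so that the result does not depend on the locale.

```scheme
;; Broken down time for time value 0, calculated for UTC.
(define epoch (gmtime 0))

(define year (tm:year epoch))    ; years since 1900
(define month (tm:mon epoch))    ; months count from 0

(define stamp (strftime "%Y-%m-%d %H:%M:%S" epoch))
;; stamp => "1970-01-01 00:00:00"
```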
strftime, parsing string according to the specification supplied in template. The interpretation of month and day names is dependent on the current locale. The value returned is a pair. The CAR has an object with time components in the form returned by localtime or gmtime, but the time zone components are not usefully set. The CDR reports the number of characters from string which were used for the conversion.
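The returned pair can be taken apart as in this sketch, which assumes the argument order (strptime template string) used by current Guile.

```scheme
(define parsed (strptime "%Y-%m-%d" "1999-12-31"))

(define bd (car parsed))      ; broken down time object
(define used (cdr parsed))    ; number of characters consumed

(define year (tm:year bd))    ; years since 1900
(define month (tm:mon bd))    ; months count from 0
(define day (tm:mday bd))
```

For this input the whole ten-character string is consumed, and the components come out as year 99, month 11 and day 31.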
tms:clock
tms:utime
tms:stime
tms:cutime
waitpid).
tms:cstime
E.g., (umask #o022) sets the mask to octal 22, decimal 18.
(feature? 'EIDs) reports whether the system supports effective IDs.
(feature? 'EIDs) reports whether the system supports effective IDs.
(feature? 'EIDs) reports whether the system supports effective IDs.
The return value is unspecified.
(feature? 'EIDs) reports whether the system supports effective IDs.
The return value is unspecified.
The value of pid determines the behaviour:
The options argument, if supplied, should be the bitwise OR of the values of zero or more of the following variables:
The return value is a pair containing:
WNOHANG was specified and no process was collected.
The following three functions can be used to decode the process status code returned by waitpid.
exit or _exit, if any, otherwise #f.
#f.
#f.
sh. The value returned is cmd's exit status as returned by waitpid, which can be interpreted using the functions above.
If system is called without arguments, it returns a boolean indicating whether the command processor is available.
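For example, the status returned by system can be decoded as follows. The accessor names status:exit-val and status:term-sig are assumed to be two of the decoding functions referred to above, as in current Guile.

```scheme
;; Ask sh to exit with status 3 and decode the result.
(define status (system "exit 3"))

(define code (status:exit-val status))    ; exit value, or #f
(define signal (status:term-sig status))  ; terminating signal, or #f
```

With a normal exit, code is 3 and signal is #f.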
argv argument to main.
Conventionally the first arg is the same as path.
All arguments must be strings.
If arg is missing, path is executed with a null argument list, which may have system-dependent side-effects.
This procedure is currently implemented using the execv system call, but we call it execl because of its Scheme calling interface.
execl, however if filename does not contain a slash then the file to execute will be located by searching the directories listed in the PATH environment variable.
This procedure is currently implemented using the execvp system call, but we call it execlp because of its Scheme calling interface.
execl, but the environment of the new process is specified by env, which must be a list of strings as returned by the environ procedure.
This procedure is currently implemented using the execve system call, but we call it execle because of its Scheme calling interface.
This procedure has been renamed from fork to avoid a naming conflict with the scsh fork.
#f unless a string of the form NAME=VALUE is found, in which case the string VALUE is returned.
If string is of the form NAME=VALUE then it will be written directly into the environment, replacing any existing environment string with name matching NAME. If string does not contain an equal sign, then any existing string with name matching string will be removed.
The return value is unspecified.
If value is #f, then name is removed from the environment. Otherwise, the string name=value is added to the environment, replacing any existing string with name matching name.
The return value is unspecified.
NAME=VALUE and values of NAME should not be duplicated.
If env is supplied then the return value is unspecified.
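The environment procedures can be exercised together as in this sketch, which assumes they are named getenv, putenv and setenv; the variable name GR_EXAMPLE_VAR is made up for the example.

```scheme
;; Add a variable, then read it back.
(setenv "GR_EXAMPLE_VAR" "one")
(define v1 (getenv "GR_EXAMPLE_VAR"))

;; putenv takes a single NAME=VALUE string and replaces the entry.
(putenv "GR_EXAMPLE_VAR=two")
(define v2 (getenv "GR_EXAMPLE_VAR"))

;; A value of #f removes the variable again.
(setenv "GR_EXAMPLE_VAR" #f)
(define v3 (getenv "GR_EXAMPLE_VAR"))
```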
Procedures to raise, handle and wait for signals.
Sends a signal to the specified process or group of processes.
pid specifies the processes to which the signal is sent:
sig should be specified using a variable corresponding to the Unix symbolic name, e.g.,
Sends a specified signal sig to the current process, where sig is as described for the kill procedure.
Install or report the signal handler for a specified signal.
signum is the signal number, which can be specified using the value of variables such as SIGINT.
If action is omitted, sigaction returns a pair: the CAR is the current signal handler, which will be either an integer with the value SIG_DFL (default action) or SIG_IGN (ignore), or the Scheme procedure which handles the signal, or #f if a non-Scheme procedure handles the signal. The CDR contains the current sigaction flags for the handler.
If action is provided, it is installed as the new handler for signum. action can be a Scheme procedure taking one argument, or the value of SIG_DFL (default action) or SIG_IGN (ignore), or #f to restore whatever signal handler was installed before sigaction was first used. Flags can optionally be specified for the new handler (SA_RESTART is always used if the system provides it, so need not be specified). The return value is a pair with information about the old handler as described above.
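A minimal sketch of installing, querying and restoring a handler. SIGUSR1 is assumed to be defined on the system, and on-usr1 and received are hypothetical names; the example does not actually deliver the signal.

```scheme
(define received '())

;; A handler takes one argument, the signal number.
(define (on-usr1 signum)
  (set! received (cons signum received)))

;; Install the handler; the result describes the previous one.
(define old (sigaction SIGUSR1 on-usr1))

;; With no action, sigaction reports the current handler and flags.
(define current (sigaction SIGUSR1))

;; Put back whatever was installed before.
(sigaction SIGUSR1 (car old) (cdr old))
```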
This interface does not provide access to the "signal blocking" facility. Maybe this is not needed, since the thread support may provide solutions to the problem of consistent access to data structures.
sigaction was made. The return value is unspecified.
SIGALRM signal after the specified number of seconds (an integer). It's advisable to install a signal handler for SIGALRM beforehand, since the default action is to terminate the process.
The return value indicates the time remaining for the previous alarm, if any. The new value replaces the previous alarm. If there was no previous alarm, the return value is zero.
#t if port is using a serial non-file device, otherwise #f.
If there is no foreground process group, the return value is a number greater than 1 that does not match the process group ID of any existing process group. This can happen if all of the processes in the job that was formerly the foreground job have terminated, and no other job has yet been moved into the foreground.
These procedures provide an interface to the popen and pclose system routines.
OPEN_READ or OPEN_WRITE.
(open-pipe command OPEN_READ).
(open-pipe command OPEN_WRITE).
open-pipe, then waits for the process to terminate and returns its status value. See section Processes for information on how to interpret this value.
close-port (see section Closing Ports) can also be used to close a pipe, but doesn't return the status.
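Reading the output of a command might look like this. The sketch assumes the pipe procedures live in the module (ice-9 popen), that the closing procedure is named close-pipe, and that read-line comes from (ice-9 rdelim), as in current Guile.

```scheme
(use-modules (ice-9 popen) (ice-9 rdelim))

;; Start a command and read the first line it prints.
(define pipe-port (open-pipe "echo hello" OPEN_READ))
(define line (read-line pipe-port))

;; close-pipe waits for the command and returns its status.
(define status (close-pipe pipe-port))
(define exit-code (status:exit-val status))
```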
This section describes procedures which convert internet addresses and query various network databases. Care should be taken when using the database routines since they are not reentrant.
(inet-aton "127.0.0.1") => 2130706433
(inet-ntoa 2130706433) => "127.0.0.1"
(inet-netof 2130706433) => 127
(inet-lnaof 2130706433) => 1
(inet-makeaddr 127 1) => 2130706433
A host object is a structure that represents what is known about a network host, and is the usual way of representing a system's network identity inside software. The hostent functions accept a host object and return some information about that host's presence on the network.
AF_INET.
The gethost functions are used to find a particular host's entry in the host database.
gethost procedure will accept either a string name or an integer address; if given no arguments, it behaves like gethostent (see below).
The following three procedures may be used to step through the host database from beginning to end. [FIXME: document the `sethost' variant of `sethostent'.]
gethostent, and may also be called afterward to reset the host entry stream.
#f if there are no more hosts to be found (or an error has been encountered). This procedure may not be used before sethostent has been called.
gethostent. The return value is unspecified.
The netent functions accept an object representing a network and return a selected component.
AF_INET.
The getnet procedures look up a particular network in the network database. Each returns a network object that describes the network's technical nature.
getnet will accept either type of argument, behaving like getnetent (see below) if no arguments are given.
[FIXME: document the `setnet' variant of `setnetent'.]
getnetent to read from the network database. The next use of getnetent will return the first entry. The return value is unspecified.
getnetent. The return value is unspecified.
The protoent procedures accept an object representing a protocol and return a selected component.
The getproto procedures look up a particular network protocol and return a protocol object.
getprotobyname takes a string argument, and getprotobynumber takes an integer argument. getproto will accept either type, behaving like getprotoent (see below) if no arguments are supplied.
[FIXME: document the `setproto' variant of `setprotoent'.]
getprotoent to read from the protocol database. The next use of getprotoent will return the first entry. The return value is unspecified.
getprotoent. The return value is unspecified.
The servent procedures accept a service object and return some information about that service.
The getserv procedures look up a network service and return a service object which describes the names and network ports conventionally assigned to the service.
The getserv procedure will take either a service name or number as its first argument; if given no arguments, it behaves like getservent (see below).
[FIXME: document the `setserv' variant of `setservent'.]
getservent to read from the services database. The next use of getservent will return the first entry. The return value is unspecified.
getservent. The return value is unspecified.
Socket ports can be created using socket and socketpair. The ports are initially unbuffered, to make reading and writing to the same port more reliable. Buffered ports can be obtained with duplicate-port (see section Ports and File Descriptors).
These procedures convert Internet addresses and port values between "host" order and "network" order as required. The arguments and return values should be in "host" order.
AF_UNIX and AF_INET. Typical values for style are the values of SOCK_STREAM, SOCK_DGRAM and SOCK_RAW.
protocol can be obtained from a protocol name using getprotobyname. A value of zero specifies the default protocol, which is usually right.
A single socket port cannot be used for communication until it has been connected to another socket.
AF_UNIX family. Zero is likely to be the only meaningful value for protocol.
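A pair of connected sockets can be tried out in a single process, as in this sketch. It assumes socketpair takes the family, style and protocol and returns a pair of ports, and it uses read-line from (ice-9 rdelim); the explicit force-output is there in case the ports turn out to be buffered.

```scheme
(use-modules (ice-9 rdelim))

;; Two connected, unnamed sockets in the AF_UNIX family.
(define ends (socketpair AF_UNIX SOCK_STREAM 0))
(define a (car ends))
(define b (cdr ends))

;; Whatever is written to one end can be read from the other.
(display "ping\n" a)
(force-output a)
(define reply (read-line b))

(close-port a)
(close-port b)
```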
SOL_SOCKET for socket-level options. optname is an integer code for the option required and should be specified using one of the symbols SO_DEBUG, SO_REUSEADDR etc.
The returned value is typically an integer but SO_LINGER returns a pair of integers.
SOL_SOCKET for socket-level options. optname is an integer code for the option to set and should be specified using one of the symbols SO_DEBUG, SO_REUSEADDR etc.
value is the value to which the option should be set. For most options this must be an integer, but for SO_LINGER it must be a pair.
The return value is unspecified.
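Setting and reading back a socket-level option could be sketched as follows, assuming the procedures are named setsockopt and getsockopt and take the socket, level and option name in that order.

```scheme
(define sock (socket AF_INET SOCK_STREAM 0))

;; Allow the local address to be reused.
(setsockopt sock SOL_SOCKET SO_REUSEADDR 1)

;; Read the option back; a non-zero integer means it is enabled.
(define reuse (getsockopt sock SOL_SOCKET SO_REUSEADDR))

(close-port sock)
```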
close-port. The shutdown procedure allows reception or transmission on a connection to be shut down individually, according to the parameter how:
The return value is unspecified.
For a socket of family AF_UNIX, only address is specified and must be a string with the filename where the socket is to be created.
For a socket of family AF_INET, address must be an integer Internet host address and arg ... must be a single integer port number.
The return value is unspecified.
The format of address and arg ... depends on the family of the socket.
For a socket of family AF_UNIX, only address is specified and must be a string with the filename where the socket is to be created.
For a socket of family AF_INET, address must be an integer Internet host address and arg ... must be a single integer port number.
The values of the following variables can also be used for address:
The return value is unspecified.
accept to accept a connection from the queue.
The return value is unspecified.
The return value is a pair in which the CAR is a new socket port for the connection and the CDR is an object with address information about the client which initiated the connection.
If the address is not available then the CDR will be an empty vector.
socket does not become part of the connection and will continue to accept new requests.
The following functions take a socket address object, as returned by accept and other procedures, and return a selected component.
sockaddr:fam
AF_UNIX or AF_INET.
sockaddr:path
AF_UNIX, returns the path of the filename the socket is based on.
sockaddr:addr
AF_INET, returns the Internet host address.
sockaddr:port
AF_INET, returns the Internet port number.
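The following loopback sketch ties bind, listen, connect, accept and the address accessors together in one process. It assumes the calling conventions (bind sock AF_INET address port) and (connect sock AF_INET address port), the variable INADDR_LOOPBACK, and the procedure getsockname; passing port 0 asks the system to choose a free port.

```scheme
(use-modules (ice-9 rdelim))

;; Server side: bind to the loopback address and start listening.
(define server (socket AF_INET SOCK_STREAM 0))
(bind server AF_INET INADDR_LOOPBACK 0)
(listen server 1)
(define port-number (sockaddr:port (getsockname server)))

;; Client side: connect to the port the system picked.
(define client (socket AF_INET SOCK_STREAM 0))
(connect client AF_INET INADDR_LOOPBACK port-number)

;; accept returns (new-socket-port . client-address).
(define conn (accept server))
(define peer (cdr conn))

(display "hello\n" client)
(force-output client)
(define line (read-line (car conn)))

(for-each close-port (list client (car conn) server))
```

The connection is queued by listen, so the single-threaded connect completes before accept is called.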
accept. On many systems the address of a socket in the AF_FILE namespace cannot be read.
accept. On many systems the address of a socket in the AF_FILE namespace cannot be read.
The optional flags argument is a value or bitwise OR of MSG_OOB, MSG_PEEK, MSG_DONTROUTE etc.
The value returned is the number of bytes read from the socket.
Note that the data is read directly from the socket file descriptor: any unread buffered port data is ignored.
Note that the data is written directly to the socket file descriptor: any unflushed buffered port data is ignored.
buf, is a string into which the data will be written. The size of buf limits the amount of data which can be received: in the case of packet protocols, if a packet larger than this limit is encountered then some data will be irrevocably lost.
The optional flags argument is a value or bitwise OR of MSG_OOB, MSG_PEEK, MSG_DONTROUTE etc.
The value returned is a pair: the CAR is the number of bytes read from the socket and the CDR an address object in the same form as returned by accept.
The start and end arguments specify a substring of buf to which the data should be written.
Note that the data is read directly from the socket file descriptor: any unread buffered port data is ignored.
connect procedure. The value returned is the number of bytes transmitted -- it's possible for this to be less than the length of message if the socket is set to be non-blocking. The optional flags argument is a value or bitwise OR of MSG_OOB, MSG_PEEK, MSG_DONTROUTE etc.
Note that the data is written directly to the socket file descriptor: any unflushed buffered port data is ignored.
The following procedures accept an object as returned by uname and return a selected component.
utsname:sysname
utsname:nodename
utsname:release
utsname:version
utsname:machine
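The five accessors mirror the fields of the POSIX struct utsname. As a point of comparison, here is a small C sketch that reads the same components directly; the helper name describe_system and the output format are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: formats "sysname release (machine)" into out.
   Returns the number of characters snprintf wanted to write,
   or -1 if uname failed. */
int describe_system(char *out, size_t outsize)
{
    struct utsname u;

    assert(out != NULL && outsize > 0);
    if (uname(&u) < 0)
        return -1;
    /* The remaining fields, u.nodename and u.version, are read the same way. */
    return snprintf(out, outsize, "%s %s (%s)", u.sysname, u.release, u.machine);
}
```

The strings themselves are system-dependent, so code should not rely on any particular value.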
Note that most varieties of Unix are considered to be simply "UNIX". That is because when a program depends on features that are not present on every operating system, it is usually better to test for the presence or absence of that specific feature. The return value of software-type should only be used for this purpose when there is no other easy or unambiguous way of detecting such features.
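If locale is omitted, returns the current value of the specified locale category as a system-dependent string. category should be specified using the values LC_COLLATE, LC_ALL etc.
Otherwise the specified locale category is set to the string locale and the new value is returned as a system-dependent string. If locale is an empty string, the locale will be set using environment variables.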
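The query and set behaviours correspond to the two ways of calling the C library function setlocale. The sketch below is only an illustration of that C-level behaviour; the helper name set_and_query is an assumption. It sets LC_ALL and then queries it by passing a NULL locale. Passing the empty string "" instead of a name would select the locale from the environment variables.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: sets LC_ALL to name and reports whether a later
   query (a NULL locale argument) returns that same name. */
int set_and_query(const char *name)
{
    const char *now;

    assert(name != NULL);
    if (setlocale(LC_ALL, name) == NULL)
        return 0;                      /* the locale is not available */
    now = setlocale(LC_ALL, NULL);     /* NULL means "query only" */
    return now != NULL && strcmp(now, name) == 0;
}
```

The "C" locale is always available, so set_and_query("C") is a safe first test; other names depend on what is installed on the system.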
The macros in this section are made available with:
(use-modules (ice-9 expect))
expect is a macro for selecting actions based on the output from a port. The name comes from a tool of similar functionality by Don Libes. Actions can be taken when a particular string is matched, when a timeout occurs, or when end-of-file is seen on the port. The expect macro is described below; expect-strings is a front-end to expect based on regexec (see the regular expression documentation).
expect-strings will read from the current input port. The first term in each clause consists of an expression evaluating to a string pattern (regular expression). As characters are read one-by-one from the port, they are accumulated in a buffer string which is matched against each of the patterns. When a pattern matches, the remaining expression(s) in the clause are evaluated and the value of the last is returned. For example:
(with-input-from-file "/etc/passwd"
  (lambda ()
    (expect-strings
      ("^nobody" (display "Got a nobody user.\n")
                 (display "That's no problem.\n"))
      ("^daemon" (display "Got a daemon user.\n")))))
The regular expression is compiled with the REG_NEWLINE flag, so that the ^ and $ anchors will match at any newline, not just at the start and end of the string.
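The difference that REG_NEWLINE makes can be checked directly against the POSIX regex library. In this C sketch the helper name matches and the sample text are assumptions; regcomp, regexec and regfree are the standard POSIX calls.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if pattern matches text, 0 if it does
   not, -1 if the pattern fails to compile.  cflags is passed to regcomp. */
int matches(const char *pattern, const char *text, int cflags)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);
    /* REG_NOSUB: only a yes/no answer is needed, not the match positions. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With the two-line text "root:x:0\ndaemon:x:1\n", the pattern "^daemon" matches when REG_NEWLINE is given and fails when it is not, because without the flag ^ only anchors at the very start of the text.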
There are two other ways to write a clause:
The expression(s) to evaluate can be omitted, in which case the result of the regular expression match (converted to strings, as obtained from regexec with match-pick set to "") will be returned if the pattern matches.
The symbol => can be used to indicate that the expression is a procedure which will accept the result of a successful regular expression match. E.g.,
("^daemon" => write)
("^d\\(aemon\\)" => (lambda args (for-each write args)))
("^da\\(em\\)on" => (lambda (all sub)
                      (write all) (newline)
                      (write sub) (newline)))
The order of the substrings corresponds to the order in which the opening brackets occur.
A number of variables can be used to control the behaviour of expect (and expect-strings). By default they are all bound at the top level to the value #f, which produces the default behaviour. They can be redefined at the top level or locally bound in a form enclosing the expect expression.
expect-port
expect-timeout
expect will terminate after this number of seconds, returning #f or the value returned by expect-timeout-proc.
expect-timeout-proc
expect-eof-proc
expect-char-proc
Here's an example using all of the variables:
(let ((expect-port (open-input-file "/etc/passwd"))
      (expect-timeout 1)
      (expect-timeout-proc
        (lambda (s) (display "Times up!\n")))
      (expect-eof-proc
        (lambda (s) (display "Reached the end of the file!\n")))
      (expect-char-proc display))
  (expect-strings
    ("^nobody" (display "Got a nobody user\n"))))
expect is used in the same way as expect-strings, but tests are specified not as patterns, but as procedures. The procedures are called in turn after each character is read from the port, with the value of the accumulated string as the argument. The test is successful if the procedure returns a non-false value.
If the => syntax is used, then if the test succeeds it must return a list containing the arguments to be provided to the corresponding expression.
In the following example, a string will only be matched at the beginning of the file:
(let ((expect-port (open-input-file "/etc/passwd")))
  (expect
    ((lambda (s) (string=? s "fnord!"))
     (display "Got a nobody user!\n"))))
The control variables described for expect-strings can also be used with expect.
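The accumulate-and-test loop that expect performs can be modelled in a few lines of C. This is a rough sketch of the idea only, not how the macro is implemented: the names expect_test, is_fnord and expect_loop are invented for the illustration, a string stands in for the port, and the buffer has a fixed size for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A test procedure receives the string accumulated so far. */
typedef int (*expect_test)(const char *accumulated);

/* Hypothetical test: succeeds only when the accumulated string is
   exactly "fnord!". */
int is_fnord(const char *s)
{
    return strcmp(s, "fnord!") == 0;
}

/* Feeds input one character at a time into a buffer and, after each
   character, calls every test in order with the buffer so far.
   Returns the index of the first test to succeed, or -1 when the
   input is exhausted or the fixed-size buffer is full. */
int expect_loop(const char *input, const expect_test *tests, size_t ntests)
{
    char buf[256];
    size_t len = 0, i;

    assert(input != NULL && tests != NULL);
    while (input[len] != '\0' && len + 1 < sizeof buf) {
        buf[len] = input[len];
        len++;
        buf[len] = '\0';
        for (i = 0; i < ntests; i++)
            if (tests[i](buf))
                return (int) i;
    }
    return -1;
}
```

Because the test sees everything read so far, an exact comparison such as is_fnord can only succeed when the input begins with the string, which is why the string=? example above matches only at the beginning of the file.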
An incomplete port of the Scheme shell (scsh) 0.5.1 is available for Guile. The idea is to allow Scheme code using scsh interfaces to be run inside the Guile interpreter.
For information about scsh on the Web see http://www-swiss.ai.mit.edu/scsh/scsh.html. The original scsh is available by ftp from ftp://swiss-ftp.ai.mit.edu:/pub/su.
The scsh code is distributed as a separate module, guile-scsh, which must be installed somewhere in Guile's load path before it can be used. This is similar to the installation of slib (you may want to install that first, since it's needed before scsh can run in Guile: see section SLIB for details).
This port of scsh does not currently use the Guile module system, but can be initialized with:
(load-from-path "scsh/init")
@unnumbered{Part IV: Using Scheme with C -- a Portable Interface}
The Guile interpreter is based on Aubrey Jaffer's SCM interpreter (see section `Overview' in SCM: a portable Scheme interpreter) with some modifications to make it suitable as an embedded interpreter, and further modifications as Guile evolves.
Part of the modification has been to provide a restricted interface to limit access to the SCM internals; this is called the gh_ interface, or libguile interface.
If you are programming with Guile, you should only use the C subroutines described in this manual, which all begin with gh_.
If instead you are extending Guile, you have the entire SCM source to play with. This manual will not help you at all, but you can consult Aubrey Jaffer's SCM manual (see section `Internals' in SCM: a portable Scheme interpreter).
If you are adding a module to Guile, I recommend that you stick to the gh_ interface: this interface is guaranteed to not change drastically, while the SCM internals might change as Guile is developed.
To use gh, you must have the following toward the beginning of your C source:
#include <guile/gh.h>
When you link, you will have to add at least -lguile to the list of libraries. If you are using more of Guile than the basic Scheme interpreter, you will have to add more libraries.
The following C constants and data types are defined in gh:
This can cause confusion because they are different from 0 and 1. In testing a boolean function in libguile programming, you must always make sure that you check the spec: gh_ and scm_ functions will usually return SCM_BOOL_T and SCM_BOOL_F, but other C functions usually can be tested against 0 and 1, so programmers' fingers tend to just type if (boolean_function()) { ... }
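The pitfall is easy to reproduce without Guile at all. In the C sketch below every name is a stand-in invented for the illustration: mock_scm, MOCK_BOOL_T and MOCK_BOOL_F only mimic the one property that matters here, namely that both boolean objects are non-zero at the C level. They are not Guile's actual definitions.

```c
#include <assert.h>

/* Stand-ins invented for this sketch; Guile's real SCM, SCM_BOOL_T and
   SCM_BOOL_F are defined by libguile and look different. */
typedef unsigned long mock_scm;
#define MOCK_BOOL_F ((mock_scm) 0x104)
#define MOCK_BOOL_T ((mock_scm) 0x204)

/* A predicate in the style of a gh_ or scm_ routine: it answers with a
   Scheme-level boolean object, not with a C 0 or 1. */
mock_scm mock_is_even(long n)
{
    return (n % 2 == 0) ? MOCK_BOOL_T : MOCK_BOOL_F;
}

/* Wrong: both boolean objects are non-zero, so this always returns 1. */
int naive_check(long n)
{
    if (mock_is_even(n))
        return 1;
    return 0;
}

/* Right: compare against the false object explicitly. */
int careful_check(long n)
{
    return mock_is_even(n) != MOCK_BOOL_F;
}
```

Here naive_check(3) reports true even though 3 is odd, while careful_check(3) gives the intended answer. With the real interface the comparison would be made against SCM_BOOL_F.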
gh_list().
In almost every case, your first gh_ call will be gh_enter(). gh_enter() never exits, and the user's code should all be in the main_prog() function. argc and argv will be passed to main_prog.
main_prog() is invoked by gh_enter() after Guile has been started up.
Please note that gh_enter does not load `ice-9/boot-9.scm', which contains much of Guile's basic functionality, including some necessary parts of Scheme. This is a limitation, and it is only so because the basic Scheme language functions have not yet been separated from the higher-level functionality provided by the `ice-9/boot-9.scm' module.
Here is a note from the Guile mailing list describing how to get around this problem if you want to run some Guile code before you invoke gh_repl(). It is a temporary solution, and a better way of handling the loading of `ice-9/boot-9.scm' will soon be introduced.
The next problem is that boot-9.scm may only be executed once, otherwise you get a stack overflow. When entering the read-eval-print-loop (repl) with gh_repl, guile loads boot-9.scm. Thus, if you did load boot-9.scm yourself and then later enter the repl, guile will abort with a stack overflow. If you look a little into the guile mailing list archives, you can find a temporary solution to the problem which I posted quite some time ago. It's a trivial fix:
1) rename boot-9.scm into boot-9-tail.scm
2) create a new boot-9.scm, which only contains the following code:
(if (not (defined? 'provide))
    (primitive-load-path "ice-9/boot-9-tail.scm"))
With this modification, boot-9.scm can be read several times.
Also note that you can use gh_repl inside gh_enter if you want the program to be controlled by a Scheme read-eval-print loop. Invoking gh_repl will load `ice-9/boot-9.scm'.
A convenience routine which enters the Guile interpreter with the standard Guile read-eval-print loop (REPL) is:
Note that gh_repl should be used inside gh_enter, since any Guile interpreter calls are meaningless unless they happen in the context of the interpreter.
Also note that when you use gh_repl, your program will be controlled by Guile's REPL (which is written in Scheme and has many useful features). Use straight C code inside gh_enter if you want to maintain execution control in your C program.
You will typically use gh_enter and gh_repl() when you want a Guile interpreter enhanced by your own libraries, but otherwise quite normal. For example, to build a Guile-derived program that includes some random number routines from GSL (GNU Scientific Library), you would write a C program that looks like this:
#include <guile/gh.h>
#include <gsl_ran.h>   /* random number suite */

/* Each gw_ wrapper converts its SCM arguments to C values, calls the
   GSL routine, and converts the result back to an SCM value. */
SCM gw_ran_seed(SCM s)
{
  gsl_ran_seed(gh_scm2int(s));
  return SCM_UNSPECIFIED;
}

SCM gw_ran_random()
{
  SCM x;

  x = gh_ulong2scm(gsl_ran_random());
  return x;
}

SCM gw_ran_uniform()
{
  SCM x;

  x = gh_double2scm(gsl_ran_uniform());
  return x;
}

SCM gw_ran_max()
{
  return gh_double2scm(gsl_ran_max());
}

/* Register the wrappers as Scheme procedures. */
void init_gsl()
{
  /* random number suite */
  gh_new_procedure("gsl-ran-seed", gw_ran_seed, 1, 0, 0);
  gh_new_procedure("gsl-ran-random", gw_ran_random, 0, 0, 0);
  gh_new_procedure("gsl-ran-uniform", gw_ran_uniform, 0, 0, 0);
  gh_new_procedure("gsl-ran-max", gw_ran_max, 0, 0, 0);
}

void main_prog (int argc, char *argv[])
{
  init_gsl();
  gh_repl(argc, argv);
}

int main (int argc, char *argv[])
{
  gh_enter (argc, argv, main_prog);
  return 0;   /* not reached: gh_enter never exits */
}
Then, supposing the C program is in `guile-gsl.c', you could compile it with gcc -o guile-gsl guile-gsl.c -lguile -lgsl.
The resulting program `guile-gsl' would have new primitive procedures (gsl-ran-random), (gsl-ran-uniform) and so forth.
[FIXME: need to fill this based on Jim's new mechanism]
Once you have an interpreter running, you can ask it to evaluate Scheme code. There are two calls that implement this:
Note that the line of code in scheme_code must be a well formed Scheme expression. If you have many lines of code before you balance parentheses, you must either concatenate them into one string, or use gh_eval_file().
gh_eval_file is completely analogous to gh_eval_str(), except that a whole file is evaluated instead of a string. Returns the result of the last expression evaluated.
gh_load is identical to gh_eval_file (it's a macro that calls gh_eval_file on its argument). It is provided to start making the gh_ interface match the R4RS Scheme procedures closely.
The real interface between C and Scheme comes when you can write new Scheme procedures in C. This is done through the routine
gh_new_procedure defines a new Scheme procedure. Its Scheme name will be proc_name, it will be implemented by the C function (*fn)(), it will take at least n_required_args arguments, and at most n_optional_args extra arguments.
When the restp parameter is 1, the procedure takes a final argument: a list of remaining parameters.
gh_new_procedure returns an SCM value representing the procedure.
The C function fn should have the form
Examples of C functions used as new Scheme primitives can be found in the sample programs learn0 and learn1.
Rationale: this is the correct way to define new Scheme procedures in C. The ugly mess of arguments is required because of how C handles procedures with variable numbers of arguments.
Note: what about documentation strings?
There are several important considerations to be made when writing the C routine (*fn)().
First of all the C routine has to return type SCM.
Second, all arguments passed to the C function will be of type SCM.
Third: the C routine is now subject to Scheme flow control, which means that it could be interrupted at any point, and then reentered. This means that you have to be very careful with operations such as allocating memory, modifying static data ...
Fourth: to get around the latter issue, you can use GH_DEFER_INTS and GH_ALLOW_INTS.
Guile provides mechanisms to convert data between C and Scheme. This allows new builtin procedures to understand their arguments (which are of type SCM) and return values of type SCM.
#f if x is zero, #t otherwise.
If start + len is off the end of dst, signal an out-of-range error.
*lenp to the string's length.
This function uses malloc to obtain storage for the copy; the caller is responsible for freeing it.
Note that Scheme strings may contain arbitrary data, including null characters. This means that null termination is not a reliable way to determine the length of the returned value. However, the function always copies the complete contents of str, and sets *lenp to the true length of the string (when lenp is non-null).
If start + len is off the end of src, signal an out-of-range error.
"'symbol-name"
If lenp is non-null, the string's length is returned in *lenp.
This function uses malloc to obtain storage for the returned string; the caller is responsible for freeing it.
vector can be an ordinary vector, a weak vector, or a signed or unsigned uniform vector of the same type as the result array. For chars, vector can be a string or substring. For floats and doubles, vector can contain a mix of inexact and integer values.
If vector is of unsigned type and contains values too large to fit in the signed destination array, those values will be wrapped around, that is, data will be copied as if the destination array was unsigned.
These C functions mirror Scheme's type predicate procedures with one important difference. The C routines return C boolean values (0 and 1) instead of SCM_BOOL_T and SCM_BOOL_F.
The Scheme notational convention of putting a ? at the end of predicate procedure names is mirrored in C by placing _p at the end of the procedure. For example, (pair? ...) maps to gh_pair_p(...).
These C functions mirror Scheme's equality predicate procedures with one important difference. The C routines return C boolean values (0 and 1) instead of SCM_BOOL_T and SCM_BOOL_F.
The Scheme notational convention of putting a ? at the end of predicate procedure names is mirrored in C by placing _p at the end of the procedure. For example, (equal? ...) maps to gh_equal_p(...).
eq? predicate, 0 otherwise.
eqv? predicate, 0 otherwise.
equal? predicate, 0 otherwise.
Many of the Scheme primitives are available in the gh_ interface; they take and return objects of type SCM, and one could basically use them to write C code that mimics Scheme code.
I will list these routines here without much explanation, since what they do is the same as documented in section `Standard Procedures' in R4RS. But I will point out that when a procedure takes a variable number of arguments (such as gh_list), you should pass the constant SCM_UNDEFINED from C to signify the end of the list.
(define name val): it binds a value to the given name (which is a C string). Returns the new object.
(cons a b) and (list l0 l1 ...) procedures. Note that gh_list() is a C macro that invokes scm_listify().
(set-car! ...) and (set-cdr! ...) procedures.
(caadar ls) procedures etc ...
(set-car! ...).
(set-cdr! ...).
gh_append() takes args, which is a list of lists (list1 list2 ...), and returns a list containing all the elements of the individual lists.
A typical invocation of gh_append() to append 5 lists together would be
gh_append(gh_list(l1, l2, l3, l4, l5, SCM_UNDEFINED));
The functions gh_append2(), gh_append3() and gh_append4() are convenience routines to make it easier for C programs to form the list of lists that goes as an argument to gh_append().
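The end marker in the gh_list() call above is needed because a C function with a variable number of arguments has no other way of knowing where its argument list stops. The following self-contained C sketch shows the general technique; it does not use Guile, and the names end_marker_object, END_OF_ARGS and sum_args are invented for the illustration. END_OF_ARGS plays the role that SCM_UNDEFINED plays in the call above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Invented for this sketch: a unique object whose address serves as the
   end marker.  No real argument can have this address. */
static int end_marker_object = 0;
#define END_OF_ARGS (&end_marker_object)

/* Sums the ints passed by address, stopping at END_OF_ARGS.
   Stores the sum in *sum and returns the number of arguments seen. */
size_t sum_args(long *sum, int *first, ...)
{
    va_list ap;
    int *p;
    size_t count = 0;

    assert(sum != NULL);
    *sum = 0;
    va_start(ap, first);
    for (p = first; p != END_OF_ARGS; p = va_arg(ap, int *)) {
        *sum += *p;
        count++;
    }
    va_end(ap);
    return count;
}
```

Forgetting the marker makes the loop read past the real arguments, which is undefined behaviour in C. The marker must also be a value that can never be a legitimate element, which is why the empty list would not be a suitable terminator for a routine that builds lists.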
scm_reverse().
(memq x ls), (memv x ls) and (member x ls), and hence use (respectively) eq?, eqv? and equal? to do comparisons.
If x does not appear in ls, the value SCM_BOOL_F (not the empty list) is returned.
Note that these functions are implemented as macros which call scm_memq(), scm_memv() and scm_member() respectively.
If no pair in alist has x as its CAR, the value SCM_BOOL_F (not the empty list) is returned.
Note that these functions are implemented as macros which call scm_assq(), scm_assv() and scm_assoc() respectively.
(make-vector n fill), (vector a b c ...), (vector-ref v i), (vector-set! v i value), (vector-length v) and (list->vector ls) procedures.
The correspondence is not perfect for gh_vector: this routine takes a list ls instead of the individual list elements, thus making it identical to gh_list_to_vector.
There is also a difference in gh_vector_length: the value returned is a C unsigned long instead of an SCM object.
These call a Scheme procedure with no arguments (gh_call0), one argument (gh_call1), and so on. You can get the same effect by wrapping the arguments up into a list, and calling gh_apply; Guile provides these functions for convenience.
catch and throw procedures, which in Guile are provided as primitives.
eq?, eqv? and equal? predicates.
For now I just include Tim Pierce's comments from the `gh_data.c' file; it should be organized into a documentation of the two functions here.
/* Data lookups between C and Scheme

   Look up a symbol with a given name, and return the object to which
   it is bound.  gh_lookup examines the Guile top level, and
   gh_module_lookup checks the module namespace specified by the
   `vec' argument.

   The return value is the Scheme object to which SNAME is bound, or
   SCM_UNDEFINED if SNAME is not bound in the given context. [FIXME:
   should this be SCM_UNSPECIFIED?  Can a symbol ever legitimately be
   bound to SCM_UNDEFINED or SCM_UNSPECIFIED?  What is the difference?
   -twp] */
@unnumbered{Part V: Using Scheme with C -- Guile's Low-Level Interface}
-- this is where we explain that all the functions marked as "Primitive Functions" are also accessible from C, and how to derive the C interface given the Scheme interface, when we don't spell it out. ... I think there's other stuff needed here ...
Error handling is based on catch and throw. Errors are always thrown with a key and four arguments:

key: a symbol which indicates the type of error. The symbols used by libguile are listed below.
subr: the name of the procedure from which the error is thrown, or #f.
message: a string (possibly language and system dependent) describing the error. The tokens %s and %S can be embedded within the message: they will be replaced with members of the args list when the message is printed. %s indicates an argument printed using "display", while %S indicates an argument printed using "write". message can also be #f, to allow it to be derived from the key by the error handler (may be useful if the key is to be thrown from both C and Scheme).
args: a list of arguments to be used to expand %s and %S tokens in message. Can also be #f if no arguments are required.
rest: a list of any additional objects required, e.g., when the key is 'system-error, this contains the C errno value. Can also be #f if no additional objects are required.

In addition to catch and throw, the following Scheme facilities are available:

(scm-error key subr message args rest): throw an error, with arguments as described above.
(error msg arg ...): throw an error using the key 'misc-error. The error message is created by displaying msg and writing the args.

The following are the error keys defined by libguile and the situations in which they are used:

error-signal: thrown after receiving an unhandled fatal signal such as SIGSEGV, SIGBUS, SIGFPE etc. The "rest" argument in the throw contains the coded signal number (at present this is not the same as the usual Unix signal number).
system-error: thrown after the operating system indicates an error condition. The "rest" argument in the throw contains the errno value.
numerical-overflow: numerical overflow.
out-of-range: the arguments to a procedure do not fall within the accepted domain.
wrong-type-arg: an argument to a procedure has the wrong type.
wrong-number-of-args: a procedure was called with the wrong number of arguments.
memory-allocation-error: memory allocation error.
stack-overflow: stack overflow error.
regex-error: errors generated by the regular expression library.
misc-error: other errors.

C support
=========

SCM scm_error (SCM key, char *subr, char *message, SCM args, SCM rest)

Throws an error, after converting the char * arguments to Scheme strings. subr is the Scheme name of the procedure, NULL is converted to #f. Likewise a NULL message is converted to #f.

The following procedures invoke scm_error with various error keys and arguments. The first three call scm_error with the system-error key and automatically supply errno in the "rest" argument: scm_syserror generates messages using strerror, scm_sysmissing is used when facilities are not available. Care should be taken that the errno value is not reset (e.g., due to an interrupt).

void scm_syserror (char *subr);
void scm_syserror_msg (char *subr, char *message, SCM args);
void scm_sysmissing (char *subr);
void scm_num_overflow (char *subr);
void scm_out_of_range (char *subr, SCM bad_value);
void scm_wrong_num_args (SCM proc);
void scm_wrong_type_arg (char *subr, int pos, SCM bad_value);
void scm_memory_error (char *subr);
static void scm_regex_error (char *subr, int code); (only used in rgx.c).

void (*scm_error_callback) (SCM key, char *subr, char *message, SCM args, SCM rest);

When a pointer to a C procedure is assigned to this variable, the procedure will be called whenever scm_error is invoked. It can be used by C code to retain control after a Scheme error occurs.
@unnumbered{Appendices and Indices}
Here is the information you will need to get and install Guile and extra packages and documentation you might need or find interesting.
Guile can be obtained from the main GNU archive site ftp://prep.ai.mit.edu/pub/gnu or any of its mirrors. The file will be named guile-version.tar.gz. The current version is 1.2a, so the file you should grab is:
ftp://prep.ai.mit.edu/pub/gnu/guile-1.2a.tar.gz
To unbundle Guile use the instruction
zcat guile-1.2a.tar.gz | tar xvf -
which will create a directory called `guile-1.2a' with all the sources. You can look at the file `INSTALL' for detailed instructions on how to build and install Guile, but you should be able to just do
cd guile-1.2a
./configure
make install
This will install the Guile executable `guile', the Guile library `libguile.a' and various associated header files and support libraries. It will also install the Guile tutorial and reference manual.
Since this manual frequently refers to the Scheme "standard", also known as R4RS, or the "Revised(4) Report on the Algorithmic Language Scheme", we have included the report in the Guile distribution; see section `Introduction' in Revised(4) Report on the Algorithmic Language Scheme. This will also be installed in your info directory.
We ship the Guile tutorial and reference manual with the Guile distribution [FIXME: this is not currently true (Sat Sep 20 14:13:33 MDT 1997), but will be soon.] Since the Scheme standard (R4RS) is a stable document, we ship that too.
Here are references (usually World Wide Web URLs) to some other freely redistributable documents and packages which you might find useful if you are using Guile.
guile> %load-path
("/usr/local/share/guile/site" "/usr/local/share/guile/1.3a" "/usr/local/share/guile" ".")
The relevant chapter (see section SLIB) has details on how to use SLIB with Guile.
Any problems with the installation should be reported to bug-guile@gnu.ai.mit.edu
[[how about an explanation of what makes a good bug report?]] [[don't complain to us about problems with contributed modules?]]
This appendix describes many functions that may be used to inspect and modify Guile's internal structure. These will mainly be of interest to people interested in extending, modifying or debugging Guile.
Guile symbol tables are hash tables. Each hash table, also called an obarray (for `object array'), is a vector of association lists. Each entry in the alists is a pair (SYMBOL . VALUE). To intern a symbol in a symbol table means to return its (SYMBOL . VALUE) pair, adding a new entry to the symbol table (with an undefined value) if none is yet present.
If obarray is #f, use the default system symbol table. If obarray is #t, the symbol should not be interned in any symbol table; merely return the pair (symbol . #<undefined>).
The soft? argument determines whether new symbol table entries should be created when the specified symbol is not already present in obarray. If soft? is specified and is a true value, then new entries should not be added for symbols not already present in the table; instead, simply return #f.
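The layout described above, a vector of association lists with a "soft" lookup that never adds entries, can be sketched in plain C. This is a model of the data structure only and makes no claim about Guile's actual implementation: the type names, the bucket count and the hash function are all assumptions made for the example, and a NULL value stands in for the undefined value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 31

/* One (SYMBOL . VALUE) pair in a bucket's association list. */
struct entry {
    char *symbol;
    const void *value;        /* NULL stands in for "undefined" */
    struct entry *next;
};

/* The table: a vector of association lists. */
struct obarray {
    struct entry *bucket[NBUCKETS];
};

static unsigned hash_name(const char *s)
{
    unsigned h = 0;

    while (*s != '\0')
        h = h * 31u + (unsigned char) *s++;
    return h % NBUCKETS;
}

/* Returns the entry for name.  If it is missing and soft is non-zero,
   returns NULL without touching the table; otherwise adds a new entry
   with an undefined value.  Also returns NULL if memory runs out. */
struct entry *intern(struct obarray *tab, const char *name, int soft)
{
    unsigned h;
    struct entry *e;

    assert(tab != NULL && name != NULL);
    h = hash_name(name);
    for (e = tab->bucket[h]; e != NULL; e = e->next)
        if (strcmp(e->symbol, name) == 0)
            return e;
    if (soft)
        return NULL;
    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->symbol = malloc(strlen(name) + 1);
    if (e->symbol == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->symbol, name);
    e->value = NULL;
    e->next = tab->bucket[h];
    tab->bucket[h] = e;
    return e;
}
```

Interning the same name twice returns the same pair, which is the property that lets symbols be compared by identity rather than by spelling.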
#t if the symbol was present and #f otherwise.
If obarray is #f, use the global symbol table. If string is not interned in obarray, an error is signalled.
symbol-bound? determines whether a symbol has been given any meaningful value.
%%gensym.
When debugging a program, programmers often find it helpful to examine the program's internal status while it runs: the values of internal variables, the choices made in if and cond statements, and so forth. Guile Scheme provides a debugging interface that programmers can use to single-step through Scheme functions and examine symbol bindings. This is different from the interface described in the section Internal Debugging Interface, which permits programmers to debug the Guile interpreter itself. Most programmers will be more interested in debugging their own Scheme programs than the interpreter which evaluates them.
[FIXME: should we include examples of traditional debuggers and explain why they can't be used to debug interpreted Scheme or Lisp?]
When a function is traced, it means that every call to that function is reported to the user during a program run. This can help a programmer determine whether a function is being called at the wrong time or with the wrong set of arguments.
function. While a program is being run, Guile will print a brief report at each call to a traced function, advising the user which function was called and the arguments that were passed to it.
function.
Example:
(define (rev ls)
  (if (null? ls)
      '()
      (append (rev (cdr ls))
              (cons (car ls) '()))))
=> rev

(trace rev)
=> (rev)

(rev '(a b c d e))
=> [rev (a b c d e)]
   |  [rev (b c d e)]
   |  |  [rev (c d e)]
   |  |  |  [rev (d e)]
   |  |  |  |  [rev (e)]
   |  |  |  |  |  [rev ()]
   |  |  |  |  |  ()
   |  |  |  |  (e)
   |  |  |  (e d)
   |  |  (e d c)
   |  (e d c b)
   (e d c b a)
   (e d c b a)
Note the way Guile indents the output, illustrating the depth of execution at each function call. This can be used to demonstrate, for example, that Guile implements self-tail-recursion properly:
(define (rev ls sl)
  (if (null? ls)
      sl
      (rev (cdr ls)
           (cons (car ls) sl))))
=> rev

(trace rev)
=> (rev)

(rev '(a b c d e) '())
=> [rev (a b c d e) ()]
   [rev (b c d e) (a)]
   [rev (c d e) (b a)]
   [rev (d e) (c b a)]
   [rev (e) (d c b a)]
   [rev () (e d c b a)]
   (e d c b a)
   (e d c b a)
Since the tail call is effectively optimized to a goto statement, there is no need for Guile to create a new stack frame for each iteration. Using trace here helps us see why this is so.
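To make the "tail call as goto" point concrete, here is the same accumulator-style reversal written twice in C, operating on a string instead of a list. The function names are invented for the illustration. Unlike Scheme, C does not guarantee that a compiler turns the recursive form into a jump, so the second function spells out by hand what that optimization amounts to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Tail-recursive form: out is filled from the back, one character per
   call, and the recursive call is the last thing the function does. */
void rev_tail(const char *ls, char *out, size_t remaining)
{
    if (remaining == 0)
        return;
    out[remaining - 1] = *ls;
    rev_tail(ls + 1, out, remaining - 1);
}

/* The same function with the tail call replaced by a jump: the
   parameters are overwritten in place and control returns to the top,
   so no new stack frame is needed. */
void rev_goto(const char *ls, char *out, size_t remaining)
{
top:
    if (remaining == 0)
        return;
    out[remaining - 1] = *ls;
    ls = ls + 1;
    remaining = remaining - 1;
    goto top;
}

/* Convenience wrapper: reverses src into dst using the given worker.
   dst must have room for strlen(src) + 1 bytes. */
void reverse_with(void (*worker)(const char *, char *, size_t),
                  const char *src, char *dst)
{
    size_t n;

    assert(worker != NULL && src != NULL && dst != NULL);
    n = strlen(src);
    dst[n] = '\0';
    worker(src, dst, n);
}
```

Both workers produce the same result; the difference is only in how much stack each one may use on long inputs.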
When a running program is interrupted, usually upon reaching an error or breakpoint, its state is represented by a stack of suspended function calls, each of which is called a frame. The programmer can learn more about the program's state at the point of interruption by inspecting and modifying these frames.
#t if obj is a calling stack.
start-stack.
This is an alphabetical list of all the procedures and macros in Guile. [[Remind people to look for functions under their Scheme names as well as their C names.]]
This is an alphabetical list of all the important variables and constants in Guile. [[Remind people to look for variables under their Scheme names as well as their C names.]]
This is an alphabetical list of all the important data types defined in the Guile Programmers Manual.
This document was generated on 12 December 1998 using the texi2html translator version 1.52.