3 Module System
The module system is a simple, static, one-file-per-module system most closely resembling the Bigloo module system [Ser]. Before going into details we will show a simple example to give you an idea of where we're coming from and what problem Common-Scheme is attempting to solve.
Suppose you want to provide the Fibonacci function in a portable library. You decide conceptually that this function fits within the hierarchy of ``math'' modules, and so distribute this in a file fibonacci.scm under the folder math, with the following contents:
(common-module (math fibonacci)
  ((export fib))
  (define (fib n)
    (if (< n 2)
        1
        (+ (fib (- n 1)) (fib (- n 2))))))
You also provide a program print-fib.scm which imports this module and uses it to print the Fibonacci value of a number supplied on the command-line:
(common-module ()
  ((import-extension (math fibonacci))
   (entry-point main))
  (define (main args)
    ;; args is (program-name argument ...), so the number is the second element
    (if (pair? (cdr args))
        (display (fib (string->number (cadr args))))
        (display "usage: print-fib <number>"))
    (newline)))
Here we have gained something previously impossible with Scheme: portable libraries. The fractured Scheme community can begin to share code freely between implementations.
3.1 Syntax
Each file[1] should contain exactly one common-module form at the top-level and nothing else.[2] This should be considered a special form, and cannot be expanded from a macro or nested inside a begin. The name common-module was chosen to avoid conflicts with any existing module systems. The syntax is as follows:
(common-module <module-name>
  (<module-declaration> ...)
  <body-expr> ...)
where
<module-name> ::= (<module-symbol> ...)
<module-declaration> ::= (entry-point <identifier>)
    / (export <export-clause> ...)
    / (export-syntax <export-clause> ...)
    / (import-extension <module-list> ...)
    / (import-syntax <module-list> ...)
    / (import-lazy <module-name> <identifier> ...)
    / (import-rename <module-name> <identifier> <identifier>)
    / (import-prefix <module-name> <symbol>)
    / (inherit <module-list> ...)
    / (keywords <suffix-keyword> ...)
    / (declare <declaration> ...)
<module-list> ::= (<module-symbol> ...)
    / (<module-symbol> ... <module-list> ...)
<declaration> ::= (<identifier> <data> ...)
<export-clause> ::= <identifier>
    / (<identifier> <data> ...)
<module-symbol> ::= <symbol> / <number>
<identifier> ::= <symbol>
<suffix-keyword> ::= <symbol>:
<body-expr> ::= <sexp>
<data> ::= <sexp>
entry-point declares the given identifier to be the entry point for a program.[3] This is generally called main, as in SRFI-22, and has the same semantics. When the program is started, main will be called on a list of strings, the first element of which is the program name, and the remainder of which are the command-line arguments as strings. On Unix systems, if main returns an integer, that integer may be used as the process's exit code. However, since the same program is intended to be usable on non-Unix systems, the program should not return an error status if main returns something other than an integer.
export makes the given identifiers visible to other modules which import the current module. The second form, where the identifier is the head of a list followed by arbitrary data, is reserved for future extension, but implementations should support it now by simply ignoring the data.
export-syntax is the same as export but declares the exported values to be syntax.
import-extension imports the exported identifiers for each module in the list into the current module. This means the identifiers are visible in the current module's scope. Mutating those identifiers with set! affects only a local lexical copy of the variable, not the imported module. Shared mutable module variables can be achieved with parameters or other boxing techniques.
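As an illustration of the point about shared state, the following sketch keeps a counter private to its module and shares it through exported procedures rather than through the variable itself. The module name (util counter) and its exports are hypothetical, chosen only for this example.

```scheme
;; Hypothetical module (util counter): importers never set! the variable;
;; they go through procedures that close over the module's own binding.
(common-module (util counter)
  ((export counter-value counter-increment!))
  (define count 0)
  (define (counter-value) count)
  (define (counter-increment!)
    (set! count (+ count 1))
    count))

;; A hypothetical program using it. Every importer that calls these
;; procedures observes the same count.
(common-module ()
  ((import-extension (util counter))
   (entry-point main))
  (define (main args)
    (counter-increment!)
    (display (counter-value))
    (newline)))
```

Had the module exported count directly, a set! on it in the importing program would have changed only that program's local copy.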
import-extension is also responsible for loading any module that is not already loaded. Loading of all imported modules occurs prior to initialization of the current module, in an unspecified order (but limited by the fact that each imported module's imports will be loaded before the imported module). Circular module references are not permitted. Initialization means binding of the current module's variables and execution of its <body-expr> forms, optionally followed by execution of a main procedure specified by entry-point.
import-syntax is exactly the same as import-extension except that the imported modules include syntax, which should be loaded at compile-time in addition to run-time and used in the macro expansion performed on the current module.
import-lazy, unlike the previous two declarations, imports only a single module, followed by a list of identifiers, which must be bound to procedure values in the imported module. It is a hint to the implementation that the imported module may not always be needed, and can be loaded on demand to save time and memory, which can be crucial in very large systems such as Emacs. The implementation is free to load the library only when one of the imported procedures is called (or optionally referenced). This is only an optimization hint, and the above form may legally be treated as an import-extension of one module.
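One way an implementation might realize this hint is to bind each lazily imported identifier to a stub that loads the module on first call. The following is only a sketch in plain Scheme, not part of Common-Scheme: make-lazy-procedure and its two thunk arguments are hypothetical names, and the "module" here is simulated by a local variable.

```scheme
;; Returns a stub procedure. On its first call the stub runs LOAD-MODULE!
;; (a thunk that loads the module), fetches the real procedure with LOOKUP
;; (a thunk), and from then on forwards every call to the real procedure.
(define (make-lazy-procedure load-module! lookup)
  (let ((real #f))
    (lambda args
      (if (not real)
          (begin
            (load-module!)
            (set! real (lookup))))
      (apply real args))))

;; Simulated module: "loading" just defines square-impl and counts loads.
(define load-count 0)
(define square-impl #f)

(define lazy-square
  (make-lazy-procedure
   (lambda ()
     (set! load-count (+ load-count 1))
     (set! square-impl (lambda (x) (* x x))))
   (lambda () square-impl)))

(lazy-square 4)   ; first call triggers the load, then returns 16
(lazy-square 5)   ; no second load; returns 25
```

This also suggests why the imported identifiers must name procedures: a stub can stand in for a procedure, but not for an arbitrary value.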
import-rename also imports a single module and a single identifier from that module, giving it a new local name. For example,
(import-rename (net http) download http-fetch)
imports http-fetch from the (net http) module, giving it the name download.
import-prefix is similar but imports all bindings from the module, prefixing every identifier with the specified prefix symbol.
inherit lets you extend or override one or more other modules. What it does is import the listed modules while at the same time exporting all of the bindings they export, in addition to whatever else you manually export. This can be used in a number of ways:
To override specific bindings in one module, by inheriting the module and redefining the bindings in the child module.
To group together multiple modules in a single convenience module, by creating an empty module that does nothing but inherit the other modules. This can be a nice convenience in development, and can also provide a way to group a single framework used by all modules in a given application.
To create an abstract module, by exporting but only defining stubs for a given interface of bindings. This abstract module can then be inherited by many different implementations.
Equally powerful mechanisms can be found in other module systems; however, this approach uses fewer new concepts, and the techniques involved are direct analogs of their equivalents in OOP, so they should be familiar to many users.
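The second use listed above, a convenience module, might look like the following sketch. The module name (myapp prelude) is hypothetical; the inherited modules are the SRFI modules used as examples elsewhere in this section, written with the <module-list> abbreviation.

```scheme
;; Hypothetical convenience module: it has no body of its own and simply
;; re-exports everything exported by the modules it inherits.
(common-module (myapp prelude)
  ((inherit (org schemers srfi (1 13 14)))))

;; Another module in the same application then needs only one import
;; to see the bindings of all three inherited modules.
(common-module (myapp report)
  ((import-extension (myapp prelude))
   (export make-report))
  (define (make-report items)
    items))
```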
keywords is a way to declare colon-suffixed DSSSL-style keywords. Basically, a keyword is just a symbol ending in a colon; however, for convenience people often want them to be self-evaluating. Some Schemes already provide this functionality, but to use them seamlessly in other Schemes all you need to do is declare which keywords you want to use with a keywords declaration, and the identifiers of the same name will be bound and exported as the symbol itself. For example,
(keywords key:)
would ensure that key: evaluates to 'key: in all implementations.
See let-keywords* below for a convenient way to define procedures which take keyword arguments. Currently no standard modules make use of keywords.
declare specifies compiler options and general system behavior that may be optional in implementations. Any identifier in a declare clause which is not known should be ignored. In fact, it is perfectly legal to ignore the entire declare clause. If an identifier is known to be only partially supported in some sense, it should be supported as best as possible rather than signalling an error.
At present this is fairly loosely defined, but the currently registered identifiers are as follows (all with no parameters):
fixnum - only fixnum arithmetic needed (a compiler hint)
number-tower - full number tower
unicode - use full Unicode characters and strings
3.2 Module Names
Many modern module systems support hierarchical module names for the same reason that we have hierarchical filesystems (or because of it?). With the number of modules named ``html'' or ``utils'' it's important to be able to avoid conflicts. Therefore the module names take the form of a hierarchical list of symbols or numbers. For a single module (the <module-name> syntax) this is a flat list. Examples:
(sdl) - single top-level module
(net smtp) - the ``smtp'' module within the ``net'' hierarchy
(org schemers srfi 1) - a Java-style module name
In a hierarchical file system the above names would translate to something like
sdl.scm
net/smtp.scm
org/schemers/srfi/1.scm
and this is how they should be distributed.
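The mapping from module names to file names shown above can be sketched as a small procedure. This is an illustration only: module-name->path and its helper are hypothetical names, and the sketch assumes a non-empty module name, ``/'' as the directory separator, and the .scm extension used in the examples.

```scheme
;; A module component is either a symbol or a number.
(define (module-symbol->string x)
  (if (number? x)
      (number->string x)
      (symbol->string x)))

;; Join the components with "/" and append ".scm".
;; Assumes NAME is a non-empty flat list (a <module-name>).
(define (module-name->path name)
  (let loop ((rest (cdr name))
             (path (module-symbol->string (car name))))
    (if (null? rest)
        (string-append path ".scm")
        (loop (cdr rest)
              (string-append path "/" (module-symbol->string (car rest)))))))

(module-name->path '(sdl))                  ; => "sdl.scm"
(module-name->path '(net smtp))             ; => "net/smtp.scm"
(module-name->path '(org schemers srfi 1))  ; => "org/schemers/srfi/1.scm"
```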
As a special case, for programs you may specify an empty module name of () as the first parameter to common-module, and this will act as an anonymous module.
When importing modules, you often want to import a number of modules from within the same hierarchy. For instance, a program making heavy use of SRFIs with the above syntax might write something like:
(import-extension (org schemers srfi 1)
                  (org schemers srfi 2)
                  (org schemers srfi 13)
                  (org schemers srfi 14))
This is verbose and tedious, so we introduce a special syntax for <module-list> forms. If one of the module components is a list, then all subcomponents of the list are expanded into place as separate options. So for example, the above could be written:
(import-extension (org schemers srfi (1 2 13 14)))
Sublists within the list are taken as new hierarchies. Naturally, if they contain a third level of lists then those are expanded and spliced recursively, and so on. For example,
(import-extension (com myscheme ((text format) (net (smtp imap)))))
expands into
(import-extension (com myscheme text format)
                  (com myscheme net smtp)
                  (com myscheme net imap))
This is also optimally concise in that you never need to write the same module component twice. If you don't like the abbreviations you can always write the names out in full.
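The expansion rule can be sketched in plain Scheme as follows. This is one reading of the rule, inferred from the two examples above, and not a normative definition: a list component is treated as a set of alternatives, and each alternative is either a single component or a nested hierarchy that is expanded recursively. The procedure names are hypothetical.

```scheme
;; Apply F to each element of LS and append the resulting lists.
(define (append-map* f ls)
  (if (null? ls)
      '()
      (append (f (car ls)) (append-map* f (cdr ls)))))

;; An alternative inside a list component: an atom stands for itself,
;; a sublist is a nested hierarchy that is expanded recursively.
(define (expand-alternative alt)
  (if (pair? alt)
      (expand-module-list alt)
      (list (list alt))))

;; Expand one <module-list> into a list of flat module names.
(define (expand-module-list spec)
  (if (null? spec)
      '(())
      (let ((heads (if (pair? (car spec))
                       (append-map* expand-alternative (car spec))
                       (list (list (car spec)))))
            (tails (expand-module-list (cdr spec))))
        ;; Combine every possible head with every possible tail.
        (append-map* (lambda (head)
                       (map (lambda (tail) (append head tail)) tails))
                     heads))))

(expand-module-list '(org schemers srfi (1 2 13 14)))
;; => ((org schemers srfi 1) (org schemers srfi 2)
;;     (org schemers srfi 13) (org schemers srfi 14))

(expand-module-list '(com myscheme ((text format) (net (smtp imap)))))
;; => ((com myscheme text format) (com myscheme net smtp)
;;     (com myscheme net imap))
```

A name written out in full contains no list components, so it expands to itself.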
3.2.1 Choosing Names
Java uses a very rigid naming system using an organization's domain name to ensure uniqueness of module names. Perl uses a first-come, first-served naming scheme which is inconsistent and misleading as to what the preferred module is. We should try to strike a balance in the middle, such as (<person-or-project-name> <module-name>) for personal projects.
3.2.2 SRFI Aliases
SRFI numbers are constantly growing, can be difficult to remember for many people, and moreover importing many SRFIs by number makes Scheme source code look overly cryptic and intimidating to outsiders. In the interests of proselytization and our own sanity we provide aliases usable in place of the SRFI numbers in the module import forms. The current list of aliases follows.
SRFI-0: cond-expand
SRFI-1: list-lib
SRFI-2: and-let*
SRFI-4: numeric-vectors
SRFI-5: signature-let
SRFI-6: string-ports
SRFI-7: configuration-language
SRFI-8: receive
SRFI-9: define-record-type
SRFI-10: external-form-syntax
SRFI-11: multiple-value-bind
SRFI-13: string-lib
SRFI-14: char-set-lib
SRFI-16: case-lambda
SRFI-17: generalized-set!
SRFI-18: multithreading
SRFI-19: time-lib
SRFI-21: real-time-multithreading
SRFI-22: unix-scheme-scripts
SRFI-23: error
SRFI-25: array-lib
SRFI-26: cut
SRFI-27: random-numbers
SRFI-28: basic-format
SRFI-29: localization
SRFI-30: multi-line-comment-syntax
SRFI-31: rec
SRFI-32: sort-lib
SRFI-33: bitwise-lib
SRFI-34: exceptions
SRFI-35: conditions
SRFI-36: i/o-conditions
SRFI-37: args-fold
SRFI-38: shared-structure-read/write
SRFI-39: parameters
SRFI-40: streams-lib
SRFI-42: comprehensions
SRFI-43: vector-lib
SRFI-44: collections
SRFI-45: lazy-eval
SRFI-46: syntax-rules-choose-ellipse
SRFI-47: homogeneous-array-lib
SRFI-48: intermediate-format
SRFI-55: require-extension
SRFI-57: advanced-records
SRFI-58: array-notation
SRFI-59: vicinity
SRFI-60: bitwise-operations
SRFI-62: sexpr-comments
SRFI-63: homogeneous-and-heterogeneous-arrays
SRFI-64: testing
If future SRFIs suggest a reasonable name in the document, that name will be used as the Common-Scheme alias. Note that for SRFIs 1, 13, 14, 32 and 33 the names here are those already suggested in the SRFIs themselves.
3.3 Additional Syntax
Two convenience forms are provided for optional and keyword arguments.
(let-optionals* ls ((name default) ...) body ...)
This is Olin Shivers' let-optionals*. It binds each variable name to the corresponding element of the list ls, or to default if there are fewer elements in ls than there are optional variables. A typical usage might be
(define (sort ls . o)
  (let-optionals* o ((less string<?) (key identity))
    ...))
(sort '("dog" "mouse" "cat") string<?)  ; key is identity
(let-keywords* ls ((name default) ...) body ...)
Instead of using the positional order of options you can use keyword-based arguments. This makes sense when there are many options or when you want to leave room for forward compatibility. It is slower, and frequent use may indicate you aren't factoring your procedures well; however, since there are valid uses, we provide it.
(define (sort ls . o)
  (let-keywords* o ((less string<?) (key identity))
    ...))
(sort '("dog" "MOUSE" "Cat") key: string-downcase)  ; less is string<?
3.4 Bootstrapping
We made a slight simplification in the example files at the start of this section, and assumed that Scheme implementations already support the common-module syntax. Currently this isn't the case for any implementation, and this is probably a good thing while Common-Scheme is a work in progress. We must therefore bootstrap the system by first loading the common-module syntax. One approach, used by early versions of Common-Scheme, is to use SRFI-0's cond-expand form to conditionally load the needed syntax for each implementation. This is fairly ugly, and has to be updated with each implementation you add support for. More recently, SRFI-55 provided a portable way to load other SRFIs. Common-Scheme is not a SRFI, but for purposes of leveraging SRFI-55 it ``borrows'' the SRFI number 10,000 (ten thousand) and installs accordingly, so that it may be loaded with:
(require-extension (srfi 10000))
The number is chosen partly to be large enough not to conflict with official SRFIs[4], and partly to borrow the Asian connotation of ten thousand as meaning ``myriads,'' the number of modules we hope to have soon.
This require-extension form can only appear at the top-level prior to the common-module form, and can only include (srfi 10000). Implementations which support Common-Scheme natively should simply ignore this form.
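Putting this together, the print-fib.scm program from the start of this section might be distributed as in the following sketch, with the bootstrap form as the only thing preceding the module form. It assumes an implementation where Common-Scheme has been installed so that SRFI-55 can find it under the number 10000.

```scheme
;; print-fib.scm
;; Bootstrap: load the common-module syntax via SRFI-55.
;; Implementations with native Common-Scheme support ignore this form.
(require-extension (srfi 10000))

(common-module ()
  ((import-extension (math fibonacci))
   (entry-point main))
  (define (main args)
    (if (pair? (cdr args))
        (display (fib (string->number (cadr args))))
        (display "usage: print-fib <number>"))
    (newline)))
```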
[1] We assume for the time being that code is at least distributed over the net as files, even if an individual implementation does not make use of a conventional file system.
[2] With one exception discussed later.
[3] Usually when we say program we mean a native process that can be run directly from the host OS. In a SchemeOS or an implementation that behaves as an OS this may be extended to the general idea of a user command which is passed arguments in the form of strings.
[4] If the number of SRFIs ever reaches 10,000 then we can assume Common-Scheme will either no longer be in use, or at least be deserving of an official SRFI number by that point.