From Gambit wiki
Revision as of 07:36, 15 November 2009 by Pclouds (Linker)
People who want to contribute to Gambit development will need to learn how the Gambit-C runtime and compiler are organized. Source code documentation is meant to live in the source itself (currently there is very little), while descriptions of program design or of algorithms used in the runtime and compiler can be collected here.
The manual lists procedures such as continuation-return but doesn't describe them. The REPL debugger, and possibly other things, use them. See Marc Feeley's paper A Better API for First-Class Continuations.
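As a rough illustration of what these procedures do, the sketch below exits a traversal early. It assumes a Gambit REPL (gsi) and uses continuation-capture together with continuation-return as described in Feeley's paper; continuation-capture is not named in the text above, and find-first is a hypothetical helper written only for this example.

```scheme
;; Sketch only: early exit from a traversal using first-class continuations.
;; continuation-capture calls its receiver with the current continuation as
;; an opaque object; continuation-return resumes that continuation with a value.
(define (find-first pred lst)            ; hypothetical helper
  (continuation-capture
   (lambda (k)
     (for-each
      (lambda (x)
        (if (pred x)
            (continuation-return k x)))  ; leave the loop immediately
      lst)
     #f)))                               ; no element satisfied pred

(pp (find-first even? '(1 3 4 5 6)))
```

Unlike call/cc, the captured continuation here is an object rather than a procedure, so it has to be resumed explicitly with continuation-return.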
The REPL has some fairly interesting functions and variables, especially for hackers.
- Should the REPL give relative or absolute pathnames? Note: when using Emacs with Gambit, it is useful to set it to #f, especially if you change the current-directory.
- where x is a REPL command letter (typed after a comma at the REPL). Executes that command as if it had been entered at the REPL. For instance, ##cmd-b displays a backtrace.
define-type is based on SRFI-9, but its extensions are not documented. This email provides the best explanation.
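For orientation, here is a minimal sketch of the basic, SRFI-9-like use of define-type, assuming a Gambit REPL. The record type point and its fields are made up for the example, and none of the undocumented extensions are shown.

```scheme
;; Sketch only: define-type generates a constructor, a predicate,
;; and an accessor and mutator for each field.
(define-type point x y)        ; hypothetical record type

(define p (make-point 1 2))    ; constructor
(point-x-set! p 10)            ; field mutator
(pp (list (point? p) (point-x p) (point-y p)))
```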
To get list of interned symbols:
(define (symbol-table->list st)
  (define (symbol-chain s syms)
    (let loop ((s s) (syms syms))
      (if (symbol? s)
          (loop (##vector-ref s 2) (cons s syms))
          syms)))
  (let loop ((lst (vector->list st)) (syms '()))
    (if (pair? lst)
        (loop (cdr lst) (symbol-chain (car lst) syms))
        (reverse syms))))

(define (interned-symbols)
  (symbol-table->list (##symbol-table)))

(pp (length (interned-symbols)))
(From Gambit ML 2009-03-22)
The script igsc.scm in the gsc directory starts a REPL for the compiler, so you can inspect its internals.
The frontend entry point is cf. The main compilation function is compile-parsed-program, which generates GVM instructions. The frontend also performs some optimization via the function normalize-program.
TODO: Optimizations, program tree representation.
The closest thing to a description of the Gambit Virtual Machine is probably the paper A Parallel Virtual Machine for Efficient Scheme Compilation.
There are 6 types of operands, described in _gvmadt.scm. Each operand is encoded as a number. The following list is extracted from _gvmadt.scm:
reg(n)        n*8 + 0
stk(n)        n*8 + 1
lbl(n)        n*8 + 2
glo(name)     index_in_operand_table*8 + 3
clo(opnd,n)   index_in_operand_table*8 + 4
obj(x)        index_in_operand_table*8 + 5
Global variables (glo), closed variables (clo) and objects are saved in *opnd-table*. Reg, stk, lbl are respectively abbreviations of register, stack and label. All these operands can be created by make-X, where X is the abbreviation.
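The encoding can be sketched as follows. This is not the code from _gvmadt.scm: every name below is a hypothetical stand-in, only the three operand kinds that do not need the operand table are covered, and n is assumed to be non-negative.

```scheme
;; Sketch only: the tag is the value modulo 8, the payload is the quotient.
(define tag-reg 0)
(define tag-stk 1)
(define tag-lbl 2)

(define (encode-opnd tag n) (+ (* n 8) tag))   ; hypothetical helper
(define (opnd-tag opnd) (modulo opnd 8))       ; which kind of operand
(define (opnd-payload opnd) (quotient opnd 8)) ; the n that was encoded

(define (sketch-make-reg n) (encode-opnd tag-reg n))
(define (sketch-make-stk n) (encode-opnd tag-stk n))
(define (sketch-make-lbl n) (encode-opnd tag-lbl n))

(define (sketch-stk? opnd) (= (opnd-tag opnd) tag-stk))

(display (sketch-make-stk 3))                  ; 3*8 + 1
(newline)
```

Because the tag lives in the remainder modulo 8, the operand kind and its payload can both be recovered from the single number with integer arithmetic alone.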
In .gvm output, registers are prefixed by "+", stack slots by "-", objects by a single quote, and labels by "#". Global variables are displayed by variable name, and closed variables are enclosed in brackets.
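A small sketch of that display convention for the three simplest operand kinds is shown below. The helper opnd->string is hypothetical and not taken from the compiler, and it assumes the prefix is followed directly by the operand number; globals, closed variables and objects are left out because they need the operand table.

```scheme
;; Sketch only: render an encoded reg/stk/lbl operand in .gvm style.
(define (opnd->string opnd)              ; hypothetical helper
  (let ((tag (modulo opnd 8))
        (n (quotient opnd 8)))
    (cond ((= tag 0) (string-append "+" (number->string n)))  ; register
          ((= tag 1) (string-append "-" (number->string n)))  ; stack slot
          ((= tag 2) (string-append "#" (number->string n)))  ; label
          (else "?"))))                  ; kinds not covered by this sketch

(display (opnd->string 17))              ; 17 = 2*8 + 1, a stack slot
(newline)
```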
GVM instructions include apply, copy, close, ifjump, switch, jump, comment and label. See the "Virtual machine instruction representation" section of _gvm.scm.
Instructions in .gvm output are represented by their name. If it's a poll jump, the instruction will be followed by a star. If it's a safe jump, it is followed by a dollar.
After GVM generation, dead code is removed by bbs-purify!.
The backend is selected by target-select!. All backend functions start with "target.". The only supported backend is C (hence the "C" in "Gambit-C"); it resides in _t-c-[1-3].scm.
The initial state of a module includes global variables, symbols, keywords, subtypes, and the number of used labels and procedures. All of these are maintained during compilation (see "Object management" in _t-c-1.scm) and dumped out as C declarations. The entry point that does this is targ-heap-dump. The C output is written entirely in C macros; gambit.h expands them into C code suitable for each architecture.
Each module contains its own global variables, symbols and keywords, so these must be combined into single tables. Linking works by reading the linker info in the generated C files. Linker info is an s-expression embedded in the C file (the C compiler never reads it because it is protected by #ifdef). Its structure can be found in targ-read-linker-info; it is a list of:
- Compiler version
- Module name
- List of symbols
- List of keywords
- List of supplied and demanded globals
- List of supplied and not demanded globals
- List of not supplied globals
- Script line
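To make the combining step concrete, here is a sketch that merges the symbol lists of two modules into one duplicate-free table. It assumes each linker info is a plain list whose fields follow the order listed above; the real layout is whatever targ-read-linker-info reads. The accessors, helpers and sample data are all hypothetical.

```scheme
;; Sketch only: field positions follow the order of the list above.
(define (info-module-name info) (list-ref info 1))   ; hypothetical accessors
(define (info-symbols info) (list-ref info 2))
(define (info-keywords info) (list-ref info 3))

;; Append the items of lst to acc, skipping those already present.
(define (add-unique lst acc)
  (let loop ((lst lst) (acc acc))
    (cond ((null? lst) acc)
          ((memq (car lst) acc) (loop (cdr lst) acc))
          (else (loop (cdr lst) (append acc (list (car lst))))))))

;; Combine one field of every module into a single table.
(define (merge-field field infos)
  (let loop ((infos infos) (acc '()))
    (if (null? infos)
        acc
        (loop (cdr infos) (add-unique (field (car infos)) acc)))))

;; Made-up data: version, module name, symbols, keywords, three lists
;; of globals, script line.
(define info-a '(0 "a" (car foo) (k1) () () () #f))
(define info-b '(0 "b" (foo bar) (k1 k2) () () () #f))

(display (merge-field info-symbols (list info-a info-b)))
(newline)
```

The same merge-field call with info-keywords would build the combined keyword table.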
Linking is started by targ-linker. When an incremental link is requested, INCREMENTAL_LINKFILE is defined in the generated C file; gambit.h handles the rest.