Difference between revisions of "Internal Documentation"
From Gambit wiki
Revision as of 14:12, 16 November 2009
People who want to contribute to Gambit development will need to learn something about how the Gambit-C runtime and compiler are organized. While we intend source code documentation to live in the source itself (currently there is very little), descriptions of program design and of the algorithms used in the runtime and compiler can be collected here.
Namespace handling
See Namespaces.
Runtime Library
Memory Management
General notes on internal object storage and memory consumption are on the Debugging page. Also see Notes on Memory Management.
Thread System
I/O System
Arithmetic implementation
Eval
Continuation manipulation
The manual lists continuation-graft, continuation-capture, and continuation-return but doesn't describe them. The REPL debugger, and possibly other things, use them. See Marc Feeley's paper A Better API for First-Class Continuations.
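As a rough illustration of how these three procedures fit together, here is a minimal sketch based on the API described in Feeley's paper. The procedure names add-one-via-return and add-one-via-graft are hypothetical and exist only for this example; the code assumes a Gambit REPL where the three continuation procedures are available.

```scheme
;; continuation-capture calls its argument with the current
;; continuation reified as a continuation object.
;; continuation-return delivers a value to that continuation.
(define (add-one-via-return)
  (+ 1 (continuation-capture
         (lambda (k)
           (continuation-return k 41)))))

;; continuation-graft calls a thunk with the given continuation,
;; so the thunk's result is delivered to the captured point.
(define (add-one-via-graft)
  (+ 1 (continuation-capture
         (lambda (k)
           (continuation-graft k (lambda () 41))))))
```

In both procedures the captured continuation is "add 1 to the value and return", so the value 41 passed to it ends up inside the addition.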
REPL
The REPL has some fairly interesting functions and variables, especially for hackers.
Variables
##repl-location-relative
- Whether the REPL prints relative or absolute pathnames. Note: when using Emacs with Gambit, it is useful to set it to #f, especially if you change the current-directory.
Functions
##cmd-x
- where x is a REPL command letter (typed after a comma in the REPL). Executes that command as if it had been entered in the REPL. For instance, ##cmd-b displays a backtrace.
Record system
That is, define-type. It is based on SRFI-9, but its extensions are not documented. This email provides the best explanation [1]
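For orientation, the basic form of define-type can be sketched as follows. The type name point and its fields are hypothetical, and only the basic form is shown; the undocumented extensions mentioned above are not covered here.

```scheme
;; Declare a record type with two fields. In its basic form,
;; define-type derives the constructor, predicate, accessors
;; and mutators from the type and field names.
(define-type point x y)

(define p (make-point 1 2))   ; constructor

(point? p)                    ; type predicate
(point-x p)                   ; field accessor
(point-y-set! p 5)            ; field mutator
(point-y p)                   ; reads back the updated field
```

Unlike SRFI-9's define-record-type, none of the generated names have to be spelled out by hand in this basic form.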
Introspection
Symbol introspection
To get list of interned symbols:
 (define (symbol-table->list st)
   (define (symbol-chain s syms)
     (let loop ((s s) (syms syms))
       (if (symbol? s)
           (loop (##vector-ref s 2) (cons s syms))
           syms)))
   (let loop ((lst (vector->list st)) (syms '()))
     (if (pair? lst)
         (loop (cdr lst) (symbol-chain (car lst) syms))
         (reverse syms))))

 (define (interned-symbols)
   (symbol-table->list (##symbol-table)))

 (pp (length (interned-symbols)))
(From Gambit ML 2009-03-22)
Program startup
- The entry point function (main(), winmain(), ...) is generated in the linker file and calls one of ___main(), ___main_UCS_2 or ___winmain.
- These functions do very basic initialization (setting up ___base_mod and ___program_startup_info), then pass control to ___main() in main.c.
- This function in turn calls ___setup(), which:
  - Sets up the VM
  - Calls the linker
  - Initializes tables (symbols, keywords, global variables, primitives)
  - Kicks off the kernel
Compiler
The script igsc.scm in the gsc directory can be used to get a REPL of the compiler, so that you can inspect its internals.
Frontend
The frontend entry point is cf. The main function that performs compilation is compile-parsed-program, which generates GVM instructions. Some optimization is done by the frontend in the function normalize-program.
TODO: Optimizations, program tree representation.
Intermediate representation
The closest thing to a description of the Gambit Virtual Machine is probably A Parallel Virtual Machine for Efficient Scheme Compilation.
Operands
There are 6 types of operands, described in _gvmadt.scm. Every operand is encoded as a number. The following list is extracted from _gvmadt.scm:
 reg(n)        n*8 + 0
 stk(n)        n*8 + 1
 lbl(n)        n*8 + 2
 glo(name)     index_in_operand_table*8 + 3
 clo(opnd,n)   index_in_operand_table*8 + 4
 obj(x)        index_in_operand_table*8 + 5
Global variables (glo), closed variables (clo) and objects (obj) are stored in *opnd-table*. Reg, stk and lbl are abbreviations of register, stack and label, respectively. All of these operands can be created with make-X, where X is the abbreviation.
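The encoding listed above packs the operand kind into the low three bits of the number. The following sketch illustrates the arithmetic only; the names operand-tags, encode-operand, operand-kind and operand-index are hypothetical and are not the definitions found in _gvmadt.scm.

```scheme
;; Tag values taken from the list above (hypothetical table name).
(define operand-tags
  '((reg . 0) (stk . 1) (lbl . 2) (glo . 3) (clo . 4) (obj . 5)))

;; Encode an operand of the given kind. For reg, stk and lbl, n is
;; the register, stack slot or label number; for glo, clo and obj
;; it is the index into the operand table.
(define (encode-operand kind n)
  (+ (* n 8) (cdr (assq kind operand-tags))))

;; Recover the kind from an encoded operand (#f for an unused tag).
(define (operand-kind code)
  (let loop ((tags operand-tags))
    (cond ((null? tags) #f)
          ((= (cdr (car tags)) (modulo code 8)) (car (car tags)))
          (else (loop (cdr tags))))))

;; Recover the number or table index from an encoded operand.
(define (operand-index code)
  (quotient code 8))
```

For example, under this sketch stk(3) encodes to 3*8 + 1 = 25, and decoding 25 gives back the kind stk and the index 3.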
In .gvm output, registers are prefixed by "+", stack slots by "-", objects by a single quote, and labels by "#". Global variables are displayed by variable name, and closed variables are enclosed in brackets.
Instructions
GVM instructions include apply, copy, close, ifjump, switch, jump, comment and label. They are defined in _gvm.scm, in the "Virtual machine instruction representation" section.
Instructions in .gvm output are represented by their name. If it's a poll jump, the instruction will be followed by a star. If it's a safe jump, it is followed by a dollar.
Optimization
After GVM generation, dead code is removed by bbs-purify!.
Backend
The backend is selected by target-select!. All backend functions start with "target.". The only supported backend is C (which explains the "C" part in "Gambit-C"); it resides in _t-c-[1-3].scm.
The initial state of a module includes global variables, symbols, keywords, subtypes, and the number of used labels and procedures. All are maintained during compilation (see "Object management" in _t-c-1.scm) and dumped out as C declarations. The entry point that does this is targ-heap-dump. The C output is written entirely in terms of C macros; gambit.h expands them into C code suitable for each architecture.
Linking
Because each module contains its own global variables, symbols and keywords, all of them must be combined to produce a single table for each. Linking works by reading the linker info in the generated C files. The linker info is actually an s-expression embedded in the C file (it is never read by the C compiler, because it is protected by #ifdef). The structure of the linker info can be found in targ-read-linker-info; it is a list of
- Compiler version
- Module name
- Modules
- List of symbols
- List of keywords
- List of supplied and demanded globals
- List of supplied and not demanded globals
- List of not supplied globals
- Script line
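To make the idea of combining per-module tables concrete, here is a small sketch that labels the nine entries listed above and merges the symbol lists of several modules. It is an illustration under assumptions: the field labels, the helper names and the shape of the sample data are all hypothetical, and the real format should be checked in targ-read-linker-info.

```scheme
;; Hypothetical labels for the nine entries, in the order listed above.
(define linker-info-field-names
  '(compiler-version module-name modules symbols keywords
    supplied-and-demanded-globals supplied-not-demanded-globals
    not-supplied-globals script-line))

;; Pair each entry of a linker info list with its label.
(define (label-linker-info info)
  (map cons linker-info-field-names info))

;; Look up one labeled entry.
(define (linker-info-field labeled name)
  (cdr (assq name labeled)))

;; Add the symbols not already present in acc (acc is in reverse order).
(define (add-new syms acc)
  (cond ((null? syms) acc)
        ((memq (car syms) acc) (add-new (cdr syms) acc))
        (else (add-new (cdr syms) (cons (car syms) acc)))))

;; Build a single symbol list from the labeled infos of all modules.
(define (merge-symbols labeled-infos)
  (let loop ((infos labeled-infos) (acc '()))
    (if (null? infos)
        (reverse acc)
        (loop (cdr infos)
              (add-new (linker-info-field (car infos) 'symbols) acc)))))

;; Made-up sample data; only the symbols entry matters here.
(define sample-infos
  (list (label-linker-info
          (list 'version-placeholder "mod1" '() '(a b) '() '() '() '() 'none))
        (label-linker-info
          (list 'version-placeholder "mod2" '() '(b c) '() '() '() '() 'none))))
```

With the sample data, (merge-symbols sample-infos) yields a single list in which the shared symbol b appears only once. The same approach would apply to keywords and globals.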
Linking is started by targ-linker. When an incremental link is requested, INCREMENTAL_LINKFILE is defined in the generated C file, and gambit.h handles the rest.