Inside FVM
Forth-like for the TI MSP430

Currently under heavy development.

Unstable and Unfinished
Software and Documentation

Other Documents

Introduction and Concepts

Memory Map

Indirect Threaded Code implementation

BUILD> DOES> example

Licensing

Other Documents

FVM - Forth-like for the TI MSP430

Definitions Glossary

Introduction and Concepts

This document presents some internals of our implementation. We assume that you have already visited the main FVM web page. Also, remember that Froth is close to the Forth language, but it is not Forth.

Froth is stack-based rather than register-based. It uses the following stacks and registers:

- a Parameter Stack: used to exchange data between definitions (a definition is roughly comparable to a procedure or a function, but variables and constants are also definitions in Froth).

- a Return Stack: mainly used to save and restore call contexts.

- an Address Register (AREG), which references memory.

- a pointer (IP) into the code stream, known as the “Interpreter Pointer”.

The FVM cross-compiler generates lists of addresses, which are interpreted by an optimized address interpreter. This common Forth model is known as Indirect Threaded Code.

The code for the core definitions of FVM is defined here (MSP430) and also here (Linux x86).

Memory Map

This is the memory map of one specific FVM target, the MSP430F1121A.

Addresses (hex)	Contents	Notes
FFFF-FFE0	Interrupt vectors	
F800	Forth-like ‘kernel’: FVM core definitions	These definitions are kept after a WARM reset: their Flash segments are not erased.
F000	User code (Flash)	The definitions stored in these Flash segments are erased after a WARM reset. To minimize the number of Flash erase cycles, the compiler can be asked to optimize its use of the Flash.

10FF-1000	Flash Information Memory	Available for user code or user data.
0FFF-0C00	1 KB Boot ROM	Not used by FVM (currently).

02FF-0200	256 bytes of RAM	See detailed usage below.
01FF-0100	16-bit peripherals	
00FF-0010	8-bit peripherals	
000F-0000	Special function registers (SFR)	
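As a quick illustration of how to read this map, the sketch below classifies an address by region. The ranges are copied from the table; the upper ends of the two regions that the table lists with a single address (F800 and F000) are assumptions, taken to be the byte just before the next region starts. The function name `region_of` is hypothetical.

```python
# Address ranges copied from the memory map (MSP430F1121A).
# Assumption: regions listed with a single address (F800, F000)
# extend up to the start of the next region.
REGIONS = [
    (0x0000, 0x000F, "SFR"),
    (0x0010, 0x00FF, "8-bit peripherals"),
    (0x0100, 0x01FF, "16-bit peripherals"),
    (0x0200, 0x02FF, "RAM"),
    (0x0C00, 0x0FFF, "Boot ROM"),
    (0x1000, 0x10FF, "Flash Information Memory"),
    (0xF000, 0xF7FF, "User code (Flash)"),
    (0xF800, 0xFFDF, "FVM core definitions"),
    (0xFFE0, 0xFFFF, "Interrupt vectors"),
]


def region_of(address):
    """Return the name of the region holding `address`, or None for a hole."""
    for start, end, name in REGIONS:
        if start <= address <= end:
            return name
    return None


print(region_of(0xF016))  # an address in user code
print(region_of(0xF8B4))  # an address in the kernel
print(region_of(0x0270))  # an address in RAM
```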

Here is how the RAM is used (again, this is just an example for the MSP430F1121A; the configuration differs somewhat depending on the CPU type):

Address (hex)	Variable	Initial value	Size (16-bit cells)	Growth	Usage
200	vdepth		1		Internal use
202	vareg		1		Internal use
204	vdp / Flash addr.		1		Internal use
206	Flash value		1		Internal use
208-217	anodef0	208	8	toward higher addresses	Anonymous definition buffer
218-256	RP0	256	31	toward lower addresses	Return Stack
257-26F	SP0	270	13	toward lower addresses	Parameter Stack
270-2FF	DP0	270	72	toward higher addresses	Allocated at compilation time (static data) and at run-time (automatic variables)
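The sizes in this table can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not part of FVM: it assumes that each region ends where the next one starts, and that the boundary between the Return Stack and the Parameter Stack is 0x256. The names `BOUNDARIES` and `region_sizes` are hypothetical.

```python
CELL = 2  # bytes per 16-bit cell

# Start address of each RAM region, taken from the table.
# Assumption: each region ends where the next one starts, and the
# Return Stack / Parameter Stack boundary is 0x256.
BOUNDARIES = [
    ("internal variables", 0x200),
    ("anodef0", 0x208),
    ("Return Stack", 0x218),
    ("Parameter Stack", 0x256),
    ("data space (DP0)", 0x270),
    ("end of RAM", 0x300),
]


def region_sizes(boundaries):
    """Return {name: size in cells} for each region between two boundaries."""
    sizes = {}
    for (name, start), (_, following) in zip(boundaries, boundaries[1:]):
        sizes[name] = (following - start) // CELL
    return sizes


sizes = region_sizes(BOUNDARIES)
for name, cells in sizes.items():
    print(f"{name:20s} {cells:3d} cells")
print("total bytes:", sum(sizes.values()) * CELL)
```

Under these assumptions the computed sizes (4, 8, 31, 13 and 72 cells) match the table, and they add up to the 256 bytes of RAM.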

Indirect Threaded Code implementation

We have chosen a classic model of Forth implementation: ITC (Indirect Threaded Code). Basically, the compiler generates a list of addresses, which are interpreted at run-time by an ‘Address Interpreter’, also called an ‘Inner Interpreter’. There are several well-known techniques for implementing a Forth-based system: see this link at Forth, Inc.

In addition to the Parameter Stack, which is part of every Forth system whatever the chosen implementation (direct code, tokens, etc.), a Forth ITC implementation specifically needs:

- an Interpreter Pointer (IP), which points to a cell in memory containing the address of a definition.

- a Return Stack, separate from the Parameter Stack, to save and restore the values of IP.

We need three small pieces of code to implement the ITC model:

- NEXT

- DOCOL

- EXIT

Here is the implementation of NEXT in pseudo-code:

W = (IP)           read the content of the cell pointed to by the Interpreter Pointer (IP)

IP = IP + 1 cell   advance IP to point to the next cell

X = (W)            read the content of the first cell of the definition: the address of its machine code

jump X             jump to this code

NEXT alone is not enough: two other routines are needed to handle Forth-defined definitions and nesting:

- DOCOL is used for all Forth-defined (as opposed to machine-code-defined) classes of definitions:

PUSH IP            save the Interpreter Pointer (IP) on the Return Stack

IP = W + 1 cell    point IP to the cell after the first cell of the definition: the beginning of its code

jump NEXT          jump to the Inner Interpreter

- EXIT does the exact opposite of DOCOL:

POP IP             restore the Interpreter Pointer (IP) from the Return Stack

jump NEXT          jump to the Inner Interpreter

NEXT, DOCOL and EXIT must be coded in machine code.

As an example, here is how we have implemented these three core routines in MSP430 assembler
(please check the TI documentation for the MSP430 CPU model):

next:                         ; R7 always points to this address (next:)
     mov.w @R4+,R15          ; W = (IP), IP += 2: load W (R15) with the CFA (Code Field Address) of the definition
     mov.w @R15+,R0          ; PC = (W), W += 2: the code field holds DOCOL, DOVAR, or $+2
;                              for definitions written in assembler (CODE..ENDCODE)

//   (:)                  ( --- )
-code (:)
     .set-pstack-protocol0 0
     .set-rstack-protocol0 0
     .set-cells-protocol 0
docol:
     push.w R4              ; Save IP on the return stack
     mov.w   R15,R4          ; Load IP with the next 'instruction' to execute
     mov.w   R7,R0           ; NEXT: the indirect threaded code inner interpreter
endcode

//   (;)                  ( --- )
code (;)
     .set-pstack-protocol0 0
     .set-rstack-protocol0 0
     .set-cells-protocol 0
     pop.w   R4          ; Restore IP from the Return stack
     mov.w   R7,R0       ; NEXT
endcode

Let’s take a simple example:

(0)
(0) list
103 words
. one              start=F000 end=F00A pstack_effect=1, 0 rstack_effect=0, 0
. two              start=F00A end=F016 pstack_effect=1, 0 rstack_effect=0, 0
. three            start=F016 end=F026 pstack_effect=0,+1 rstack_effect=0, 0
. led              start=F026 end=F034 pstack_effect=0, 0 rstack_effect=0, 0
(0) decomp one
Code: start=F000 end=F00A
stack_effect=1, 0 rstack_effect=0, 0
F000: F8B4 (:)
F002: F8C0 (lit)              10
F006: FAB2 +
F008: F8BA (;)
(0) decomp two
Code: start=F00A end=F016
stack_effect=1, 0 rstack_effect=0, 0
F00A: F8B4 (:)
F00C: F8C0 (lit)              100
F010: FAB2 +
F012: F000 one
F014: F8BA (;)
(0) decomp three
Code: start=F016 end=F026
stack_effect=0,+1 rstack_effect=0, 0
F016: F8B4 (:)
F018: F8C0 (lit)              1000
F01C: F00A two
F01E: F8C0 (lit)              30
F022: FACE -
F024: F8BA (;)
(0) three ??
Target: 103 words
anodef0=0208 anodef=020A - cp0=F000 cp=F034 - dp=0270 areg=0270
Stack empty
Current stack effect: parameters:0,+1 return:0, 0
0208: F016 three
(0)

We have this in memory:

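Address	Content	Notes
F000	(:)	start of one; this cell holds the address of DOCOL
F002	(lit)	
F004	10	
F006	+	
F008	(;)	EXIT
F00A	(:)	start of two; this cell holds the address of DOCOL
F00C	(lit)	
F00E	100	
F010	+	
F012	one	
F014	(;)	EXIT
F016	(:)	start of three; this cell holds the address of DOCOL
F018	(lit)	
F01A	1000	
F01C	two	
F01E	(lit)	
F020	30	
F022	-	
F024	(;)	EXIT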

The code of NEXT, DOCOL and EXIT is defined here (MSP430) and also here (Linux x86).
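To make the interplay of NEXT, DOCOL and EXIT concrete, here is a minimal Python sketch that runs the definitions one, two and three from the session above. It is a model, not the FVM code. Python functions stand in for machine code, the literals are assumed to be decimal, each machine-code definition is assumed to start with a code field pointing just past itself ($+2), and `HALT` (with its address) is a hypothetical primitive added only to stop the simulation.

```python
CELL = 2  # bytes per 16-bit cell


class ITC:
    def __init__(self):
        self.mem = {}       # address -> cell content
        self.code = {}      # address -> Python function standing in for machine code
        self.pstack = []    # Parameter Stack
        self.rstack = []    # Return Stack
        self.ip = 0         # Interpreter Pointer
        self.w = 0          # W: address of the definition being executed
        self.running = False

    def store(self, addr, cells):
        """Lay out consecutive cells starting at addr."""
        for i, cell in enumerate(cells):
            self.mem[addr + i * CELL] = cell

    def primitive(self, addr, fn):
        """Machine-code definition: its first cell points just past itself ($+2)."""
        self.mem[addr] = addr + CELL
        self.code[addr + CELL] = fn

    def next(self):
        self.w = self.mem[self.ip]   # W = (IP)
        self.ip += CELL              # advance IP to the next cell
        x = self.mem[self.w]         # X = (W)
        self.code[x]()               # jump X

    def docol(self):
        self.rstack.append(self.ip)  # PUSH IP
        self.ip = self.w + CELL      # IP = cell after the code field

    def exit(self):
        self.ip = self.rstack.pop()  # POP IP

    def lit(self):
        self.pstack.append(self.mem[self.ip])  # the literal follows (lit) inline
        self.ip += CELL

    def binop(self, fn):
        def run():
            b = self.pstack.pop()
            a = self.pstack.pop()
            self.pstack.append(fn(a, b) & 0xFFFF)  # keep results on 16 bits
        return run

    def halt(self):
        self.running = False

    def run(self, start):
        self.ip = start
        self.running = True
        while self.running:
            self.next()              # every routine "jumps back" to NEXT
        return self.pstack


# Addresses taken from the decomp listings; HALT is hypothetical.
DOCOL, EXIT, LIT, PLUS, MINUS = 0xF8B4, 0xF8BA, 0xF8C0, 0xFAB2, 0xFACE
HALT = 0xFB00
ONE, TWO, THREE = 0xF000, 0xF00A, 0xF016
ANODEF0 = 0x0208


def build():
    vm = ITC()
    vm.code[DOCOL] = vm.docol        # (:) is entered directly as machine code
    vm.primitive(EXIT, vm.exit)
    vm.primitive(LIT, vm.lit)
    vm.primitive(PLUS, vm.binop(lambda a, b: a + b))
    vm.primitive(MINUS, vm.binop(lambda a, b: a - b))
    vm.primitive(HALT, vm.halt)
    vm.store(ONE, [DOCOL, LIT, 10, PLUS, EXIT])
    vm.store(TWO, [DOCOL, LIT, 100, PLUS, ONE, EXIT])
    vm.store(THREE, [DOCOL, LIT, 1000, TWO, LIT, 30, MINUS, EXIT])
    vm.store(ANODEF0, [THREE, HALT])  # anonymous definition: run three, then stop
    return vm


vm = build()
result = vm.run(ANODEF0)
print(result)
```

With the literals read as decimal, the run leaves a single value on the Parameter Stack, 1080 (1000 + 100 + 10 - 30), and an empty Return Stack. This agrees with the stack effect 0,+1 reported for three.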

BUILD> DOES> example

The next example shows the code generated for BUILD> DOES>. It covers two cases:

- one uses static memory allocation (the standard Forth model)

- the other uses dynamic memory allocation (automatic variables are an FVM extension).

(0) list
106 words
. led              start=F000 end=F00E pstack_effect=0, 0 rstack_effect=0, 0
D const            start=F00E end=F01A pstack_effect=0, 0 rstack_effect=0, 0
. (const)          start=F01A end=F01E pstack_effect=0,+1 rstack_effect=0, 0
. ten              start=F01E end=F024 pstack_effect=0,+1 rstack_effect=0, 0
. vingt            start=F024 end=F02A pstack_effect=0,+1 rstack_effect=0, 0
. test1            start=F02A end=F034 pstack_effect=0,+1 rstack_effect=0, 0
. test2            start=F034 end=F062 pstack_effect=0,+1 rstack_effect=0, 0
(0) decomp const
Code: start=F00E end=F01A
stack_effect=0, 0 rstack_effect=0, 0
F00E: F8B4 (:)
F010: F8C0 (lit)              2
F014: FA70 (allot)
F016: F9DC !
F018: FAA0 (end-build)
(0) decomp (const)
Code: start=F01A end=F01E
stack_effect=0,+1 rstack_effect=0, 0
F01A: F9D0 @
F01C: F8BA (;)
(0) decomp ten
Code: start=F01E end=F024
stack_effect=0,+1 rstack_effect=0, 0
F01E: FA3A (does-static)      dp=0270 function=(const)
(0) decomp vingt
Code: start=F024 end=F02A
stack_effect=0,+1 rstack_effect=0, 0
F024: FA3A (does-static)      dp=0272 function=(const)
(0) decomp test1
Code: start=F02A end=F034
stack_effect=0,+1 rstack_effect=0, 0
F02A: F8B4 (:)
F02C: F01E ten
F02E: F024 vingt
F030: FAB2 +
F032: F8BA (;)
(0) decomp test2
Code: start=F034 end=F062
stack_effect=0,+1 rstack_effect=0, 0
F034: F8B4 (:)
F036: F8C0 (lit)              11
F03A: FA1C (alloc-auto-vars) nb=2
F03E: FA32 (build-auto)       frame index=1 function=const
F044: F8C0 (lit)              100
F048: FA32 (build-auto)       frame index=0 function=const
F04E: FA42 (does-auto)        frame index=1 function=(const)
F054: FA42 (does-auto)        frame index=0 function=(const)
F05A: FAB2 +
F05C: FA22 (free-auto-vars)   nb=2
F060: F8BA (;)
(0)
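The static case in this listing (const, ten, vingt, test1) can be modeled in a few lines of Python. This is only a sketch of the idea: the BUILD> part runs once when the variable is defined, and the DOES> part runs at each use with the data address. The values 10 and 20 are assumptions suggested by the names ten and vingt, since the listing does not show them. `DataSpace`, `const_build` and `const_does` are hypothetical names.

```python
CELL = 2  # bytes per 16-bit cell


class DataSpace:
    """Static data area, starting at DP0 = 0x270 as in the memory map."""

    def __init__(self, dp0=0x270):
        self.dp = dp0
        self.mem = {}   # address -> cell content

    def allot(self, nbytes):
        addr = self.dp
        self.dp += nbytes
        return addr


def const_build(space, value):
    """BUILD> part of const: 2 (allot) !"""
    addr = space.allot(CELL)
    space.mem[addr] = value
    return addr


def const_does(space, addr):
    """DOES> part, shown as (const) in the listing: @"""
    return space.mem[addr]


def const(space, value):
    """Define a constant: run the BUILD> part now, the DOES> part on each use."""
    addr = const_build(space, value)

    def word():
        # Plays the role of (does-static) dp=addr function=(const)
        return const_does(space, addr)

    word.dp = addr
    return word


space = DataSpace()
ten = const(space, 10)      # assumed value
vingt = const(space, 20)    # assumed value


def test1():
    return ten() + vingt()


print(hex(ten.dp), hex(vingt.dp), test1())
```

The two data addresses come out as 0x270 and 0x272, matching the dp values shown for ten and vingt in the listing.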

Notes:

- (alloc-auto-vars) reserves space for two automatic variables. It does not allocate the data memory of the variables themselves. Instead, it reserves cells on the Return Stack that will describe this data memory once each variable is created with (build-auto). Two cells are needed on the stack:

    - one cell holds the address of the data memory

    - one cell holds the length of this data memory block (it is needed by (free-auto-vars) to free the memory).

- (build-auto) allocates and initializes the variable (i.e. it creates it). The next two cells hold:

    - the address of the code to execute on variable creation (the part between BUILD> and DOES>)

    - the index of the automatic variable (0, 1, .. N-1)

- (does-auto) is used each time a variable is referenced. The next two cells hold:

    - the address of the code to execute on variable use (the part after DOES>)

    - the index of the automatic variable (0, 1, .. N-1)

- (free-auto-vars) frees the space used by the variables (both on the heap and on the Return Stack).

FVM supports variable scoping in logical blocks (such as IF..ELSE..ENDIF, BEGIN..UNTIL, etc.). If there are any automatic variables within a logical block, the compiler compiles (alloc-auto-vars) at the beginning of the block and (free-auto-vars) at the end of the block. FVM does not allow shadowing a variable (reusing the same variable name in a deeper logical block).

Automatic variables are a nice, elegant and powerful concept in FVM. The obvious drawbacks are:

- they use more memory (for example, each use of a variable costs 3 cells of code memory)

- they use more CPU cycles.

For this reason, FVM supports plain constants and variables directly as automatic variables, with more optimized code than if they were defined using BUILD> .. DOES>.

Note that FVM does NOT (currently) need a garbage collector: automatic variables are allocated on the heap in LIFO order, so they are freed in the reverse order of their allocation.
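The reason LIFO allocation makes a garbage collector unnecessary can be shown with a simple bump allocator. The sketch below is a simplification: it restores a saved heap pointer at the end of each block, whereas FVM records the length of each data block on the Return Stack. The names `Heap` and `block` are hypothetical; the limits 0x270 and 0x300 come from the memory map.

```python
from contextlib import contextmanager


class Heap:
    def __init__(self, dp0=0x270, limit=0x300):
        self.dp = dp0         # next free address
        self.limit = limit    # first address past the end of RAM

    def allot(self, nbytes):
        if self.dp + nbytes > self.limit:
            raise MemoryError("data space exhausted")
        addr = self.dp
        self.dp += nbytes
        return addr

    @contextmanager
    def block(self):
        """A logical block: everything allotted inside is released at its end."""
        mark = self.dp
        try:
            yield self
        finally:
            self.dp = mark    # LIFO release: just move the pointer back


heap = Heap()
with heap.block():
    a = heap.allot(2)
    with heap.block():
        b = heap.allot(4)
        inner_dp = heap.dp
    after_inner = heap.dp
after_outer = heap.dp
print(hex(a), hex(b), hex(inner_dp), hex(after_inner), hex(after_outer))
```

Because an inner block always ends before the block that encloses it, releasing memory never leaves a hole in the heap. There is nothing for a collector to find or compact.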

The code for BUILD> DOES> and friends is defined here (MSP430) and also here (Linux x86).

Licensing

This software is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

Please take a look at the GNU Web Pages for a copy of the GPL license: http://www.gnu.org/licenses/gpl

	Pages created: December 27th, 2002
Olivier Singla	Pages last revised: March 13th, 2006
