Inside FVM
 Forth-like for the TI MSP430

Currently under heavy development.

Unstable and Unfinished
Software and Documentation

 

 

Introduction and Concepts

Memory Map

Indirect Threaded Code implementation

BUILD> DOES> example

Licensing

Other Documents

FVM - Forth-like for the TI MSP430

Definitions Glossary

Introduction and Concepts

This document presents some internal details of our implementation. We assume that you have already visited the main web page for FVM. Also, remember that Froth is close to the Forth language, but it is not Forth.

Froth is stack-based rather than register-based, and uses the following stacks and registers:

-         a Parameter Stack: used to exchange data between definitions (a definition is roughly comparable to a procedure or a function, but a variable or a constant is also a definition in Froth).

-         a Return Stack: mainly used to save/restore call contexts.

-         an Address Register (AREG), referencing memory.

-         a pointer (IP) into the code stream, known as the “Interpreter Pointer”.

The FVM cross-compiler generates lists of addresses, which are interpreted by an optimized address interpreter. This common Forth model is known as Indirect Threaded Code.

The code for the core-definitions of FVM is defined here (MSP430) and also here (linux x86).

Memory Map

This is the memory map for one specific FVM implementation, on the MSP430F1121A.

 

Address (hex)   Region                             Notes
FFE0-FFFF       Interrupt vectors
F800 (start)    Forth-like ‘kernel’:               These definitions are kept after a WARM reset:
                FVM core definitions               the Flash segments are not erased.
F000 (start)    User code (Flash)                  These definitions stored in these Flash segments are
                                                   erased after a WARM reset. In order to minimize the
                                                   number of erase-cycles of the Flash, it is possible to
                                                   ask the compiler to optimize the use of the Flash.
...
1000-10FF       Flash Information Memory           Available for user-code or user-data.
0C00-0FFF       1 KB Boot ROM                      Not used by FVM (currently)
...
0200-02FF       256 bytes RAM                      See detailed usage below.
0100-01FF       16-bit peripherals
0010-00FF       8-bit peripherals
0000-000F       SFR (special function registers)

 

 

Here is how the RAM is used (again, this is just an example for the MSP430F1121A; the configuration differs somewhat depending on the type of CPU):

 

Address (hex)  Variable            Initial value  Size (16-bit cells)  Grows toward      Usage
200            vdepth                             1                                      Internal use
202            vareg                              1                                      Internal use
204            vdp / Flash addr.                  1                                      Internal use
206            Flash value                        1                                      Internal use
208-217        anodef0             208            8                    higher addresses  Anonymous definition buffer
218-256        RP0                 256            31                   lower addresses   Return Stack
257-26F        SP0                 270            13                   lower addresses   Parameter Stack
270-2FF        DP0                 270            72                   higher addresses  Allocated at compilation time (static data) and at run-time (automatic variables)
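The cell counts in this table can be cross-checked from the region boundaries alone. The following is only an illustrative Python sketch, not part of FVM. It assumes that a cell is 2 bytes, that each region runs from its start address up to (but not including) the start of the next one, and that the Parameter Stack region begins at 0x256 (the initial value of RP0) rather than at the 257 shown in the table. The region names are descriptive labels, not FVM identifiers.

```python
CELL_BYTES = 2  # one 16-bit cell

# Start address of each RAM region, read from the table above.
# Assumption: the Parameter Stack region starts at 0x256 (the value of RP0).
# 0x300 is one past the last RAM byte (0x2FF).
BOUNDARIES = [
    ("internal variables", 0x200),
    ("anonymous definition buffer", 0x208),
    ("return stack", 0x218),
    ("parameter stack", 0x256),
    ("data area", 0x270),
    ("end of RAM", 0x300),
]


def region_sizes(boundaries):
    """Return {region name: size in cells} for consecutive boundaries."""
    sizes = {}
    for (name, start), (_, next_start) in zip(boundaries, boundaries[1:]):
        sizes[name] = (next_start - start) // CELL_BYTES
    return sizes


sizes = region_sizes(BOUNDARIES)
for name, cells in sizes.items():
    print(f"{name:30s} {cells:3d} cells")
```

Under these assumptions the computed sizes are 4, 8, 31, 13 and 72 cells, which matches the Size column and adds up to the 128 cells (256 bytes) of RAM.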

 

Indirect Threaded Code implementation

We have chosen this classic model of Forth implementation: ITC (Indirect Threaded Code). Basically, the compiler generates a list of addresses, which are then interpreted at run-time by an ‘Address Interpreter’, also referred to as an ‘Inner Interpreter’. There are several well-known techniques to implement a Forth-based system: see this link at Forth, Inc.

In addition to the Parameter Stack, which is always part of a Forth system whatever the chosen implementation (direct code, tokens, etc.), a Forth ITC implementation specifically needs:

-         an Interpreter Pointer (IP): it points to a cell in memory which contains the address of a definition.

-         a Return Stack, separate from the Parameter Stack, to save/restore the values of IP.

We’ll need 3 different small pieces of code to implement the ITC model:

-         NEXT

-         DOCOL

-         EXIT

Here is the implementation of NEXT in pseudo-code:

    W = (IP)            read the content of the cell pointed to by the Interpreter Pointer (IP)
    IP = IP + 1 cell    advance IP to point to the next cell
    X = (W)             read the content of the 1st cell of the definition: the address of its machine code
    jump X              jump to this code

This single routine needs two other routines in order to handle Forth-defined definitions and nesting:

-         DOCOL is used for all Forth-defined (as opposed to machine-code-defined) classes of definitions:

    PUSH IP             save the Interpreter Pointer (IP) on the Return Stack
    IP = W + 1 cell     point IP to the cell after the code field: the beginning of the code
    jump NEXT           jump to the Inner Interpreter

-         EXIT does the exact opposite of DOCOL:

    POP IP              restore the Interpreter Pointer (IP) from the Return Stack
    jump NEXT           jump to the Inner Interpreter

NEXT, DOCOL and EXIT must be coded in machine code.

As an example, here is how we have implemented these 3 core functions in MSP430 assembler (please check the TI documentation for the MSP430 CPU model):

next:                         ; R7 always points to this address (next:)
     mov.w  @R4+,R15          ; Load W with the CFA (Code Field Address), then advance IP
;                               The CFA holds DOCOL, DOVAR, or $+2 for definitions written in assembler (CODE..ENDCODE)
     mov.w  @R15+,R0          ; Load PC with the address of the routine; W now points past the CFA

 

//   (:)                  ( --- )
-code (:)
     .set-pstack-protocol0 0
     .set-rstack-protocol0 0
     .set-cells-protocol  0
docol:    
     push.w  R4              ; Save IP on the return stack
     mov.w   R15,R4          ; Load IP with the next 'instruction' to execute
     mov.w   R7,R0           ; NEXT: the indirect threaded code inner interpreter
endcode

 

//   (;)                  ( --- )
code (;)
     .set-pstack-protocol0 0
     .set-rstack-protocol0 0
     .set-cells-protocol  0
     pop.w   R4          ; Restore IP from the Return stack
     mov.w   R7,R0       ; NEXT
endcode

 

Let’s take a simple example:

 

(0)
(0) list
103 words
. one              start=F000 end=F00A pstack_effect=1, 0 rstack_effect=0, 0
. two              start=F00A end=F016 pstack_effect=1, 0 rstack_effect=0, 0
. three            start=F016 end=F026 pstack_effect=0,+1 rstack_effect=0, 0
. led              start=F026 end=F034 pstack_effect=0, 0 rstack_effect=0, 0
(0) decomp one
Code: start=F000 end=F00A
stack_effect=1, 0 rstack_effect=0, 0
F000: F8B4 (:)
F002: F8C0 (lit)              10
F006: FAB2 +
F008: F8BA (;)
(0) decomp two
Code: start=F00A end=F016
stack_effect=1, 0 rstack_effect=0, 0
F00A: F8B4 (:)
F00C: F8C0 (lit)              100
F010: FAB2 +
F012: F000 one
F014: F8BA (;)
(0) decomp three
Code: start=F016 end=F026
stack_effect=0,+1 rstack_effect=0, 0
F016: F8B4 (:)
F018: F8C0 (lit)              1000
F01C: F00A two
F01E: F8C0 (lit)              30
F022: FACE -
F024: F8BA (;)
(0) three ??
Target: 103 words
anodef0=0208 anodef=020A - cp0=F000 cp=F034 - dp=0270 areg=0270
Stack empty
Current stack effect: parameters:0,+1  return:0, 0
0208: F016 three
(0)

 

We have this in memory:

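Address  Content  Definition
F000     (:)      one
F002     (lit)
F004     10
F006     +
F008     (;)
F00A     (:)      two
F00C     (lit)
F00E     100
F010     +
F012     one
F014     (;)
F016     (:)      three
F018     (lit)
F01A     1000
F01C     two
F01E     (lit)
F020     30
F022     -
F024     (;)

In each definition, the first cell, (:), points to DOCOL, and the last cell, (;), is EXIT.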

 

 

 

 

 

 

 

 

 

The code of NEXT, DOCOL and EXIT is defined here (MSP430) and also here (linux x86).
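To make the walk-through above concrete, here is a small Python simulation of the NEXT / DOCOL / EXIT pseudo-code running the one, two and three definitions. It is only a sketch for illustration, not FVM code. The definition addresses and the kernel addresses (F8B4, F8BA, F8C0, FAB2, FACE) are copied from the decompiler listing; everything else is an assumption of the sketch: the Machine class and its helpers are hypothetical names, machine-code definitions are assumed to hold $+2 in their code field, and HALT is a made-up primitive that only exists to stop the simulation.

```python
CELL = 2  # bytes per 16-bit cell

# Kernel addresses copied from the decompiler listing above.
DOCOL, EXIT, LIT, PLUS, MINUS = 0xF8B4, 0xF8BA, 0xF8C0, 0xFAB2, 0xFACE
HALT = 0xFFC0  # hypothetical primitive, only used to stop this simulation


class Machine:
    def __init__(self):
        self.mem = {}      # address -> cell content
        self.code = {}     # machine-code address -> Python function
        self.pstack = []   # Parameter Stack
        self.rstack = []   # Return Stack
        self.ip = 0        # Interpreter Pointer
        self.w = 0         # W register
        self.running = False

    def primitive(self, cfa, fn):
        # Assumed layout of a machine-code definition: its code field holds $+2.
        self.mem[cfa] = cfa + CELL
        self.code[cfa + CELL] = fn

    def colon(self, addr, cells):
        # Store the cells of a Forth-defined definition starting at addr.
        for i, cell in enumerate(cells):
            self.mem[addr + i * CELL] = cell

    def run(self, start):
        self.ip, self.running = start, True
        while self.running:               # NEXT
            self.w = self.mem[self.ip]    # W = (IP)
            self.ip += CELL               # advance IP to the next cell
            x = self.mem[self.w]          # X = (W)
            self.code[x](self)            # jump to the machine code


def docol(m):
    m.rstack.append(m.ip)                 # PUSH IP
    m.ip = m.w + CELL                     # IP = first cell after the code field

def exit_(m):
    m.ip = m.rstack.pop()                 # POP IP

def lit(m):
    m.pstack.append(m.mem[m.ip])          # push the inline value
    m.ip += CELL                          # and skip over it

def plus(m):
    b, a = m.pstack.pop(), m.pstack.pop()
    m.pstack.append((a + b) & 0xFFFF)

def minus(m):
    b, a = m.pstack.pop(), m.pstack.pop()
    m.pstack.append((a - b) & 0xFFFF)

def halt(m):
    m.running = False


m = Machine()
m.code[DOCOL] = docol
for cfa, fn in [(EXIT, exit_), (LIT, lit), (PLUS, plus), (MINUS, minus), (HALT, halt)]:
    m.primitive(cfa, fn)

m.colon(0xF000, [DOCOL, LIT, 10, PLUS, EXIT])                      # one
m.colon(0xF00A, [DOCOL, LIT, 100, PLUS, 0xF000, EXIT])             # two
m.colon(0xF016, [DOCOL, LIT, 1000, 0xF00A, LIT, 30, MINUS, EXIT])  # three
m.colon(0x0208, [0xF016, HALT])                                    # run three, then stop

m.run(0x0208)
print(m.pstack)
```

Running it leaves a single value, 1080 (1000 + 100 + 10 - 30), on the Parameter Stack and an empty Return Stack, which matches the 0,+1 stack effect reported for three.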

BUILD> DOES> example

The next listing shows the code generated for BUILD> DOES>. It contains two examples:

-         one uses static memory allocation (that’s the standard Forth model)

-         the other uses dynamic memory allocation (automatic variables are an FVM extension).

 

(0) list
106 words
. led              start=F000 end=F00E pstack_effect=0, 0 rstack_effect=0, 0
D const            start=F00E end=F01A pstack_effect=0, 0 rstack_effect=0, 0
. (const)          start=F01A end=F01E pstack_effect=0,+1 rstack_effect=0, 0
. ten              start=F01E end=F024 pstack_effect=0,+1 rstack_effect=0, 0
. vingt            start=F024 end=F02A pstack_effect=0,+1 rstack_effect=0, 0
. test1            start=F02A end=F034 pstack_effect=0,+1 rstack_effect=0, 0
. test2            start=F034 end=F062 pstack_effect=0,+1 rstack_effect=0, 0
(0) decomp const
Code: start=F00E end=F01A
stack_effect=0, 0 rstack_effect=0, 0
F00E: F8B4 (:)
F010: F8C0 (lit)              2
F014: FA70 (allot)
F016: F9DC !
F018: FAA0 (end-build)
(0) decomp (const)
Code: start=F01A end=F01E
stack_effect=0,+1 rstack_effect=0, 0
F01A: F9D0 @
F01C: F8BA (;)
(0) decomp ten
Code: start=F01E end=F024
stack_effect=0,+1 rstack_effect=0, 0
F01E: FA3A (does-static)      dp=0270 function=(const)
(0) decomp vingt
Code: start=F024 end=F02A
stack_effect=0,+1 rstack_effect=0, 0
F024: FA3A (does-static)      dp=0272 function=(const)
(0) decomp test1
Code: start=F02A end=F034
stack_effect=0,+1 rstack_effect=0, 0
F02A: F8B4 (:)
F02C: F01E ten
F02E: F024 vingt
F030: FAB2 +
F032: F8BA (;)
(0) decomp test2
Code: start=F034 end=F062
stack_effect=0,+1 rstack_effect=0, 0
F034: F8B4 (:)
F036: F8C0 (lit)              11
F03A: FA1C (alloc-auto-vars)  nb=2
F03E: FA32 (build-auto)       frame index=1  function=const
F044: F8C0 (lit)              100
F048: FA32 (build-auto)       frame index=0  function=const
F04E: FA42 (does-auto)        frame index=1  function=(const)
F054: FA42 (does-auto)        frame index=0  function=(const)
F05A: FAB2 +
F05C: FA22 (free-auto-vars)   nb=2
F060: F8BA (;)
(0)

Notes:

(1)   (alloc-auto-vars) allocates space for two automatic variables. Actually, (alloc-auto-vars) does not allocate the data memory of the variables themselves; it reserves cells on the Return Stack that describe this data memory once each variable is later allocated with (build-auto). Two (2) cells are needed on the stack:

-         one cell holds the data memory address

-         one cell stores the length of this data memory block (this is needed and used by (free-auto-vars) to free the memory).

(2)   (build-auto) is used to allocate and to initialize the variable (i.e. to create it). The next two (2) cells hold:

-         the address of the code to execute on variable creation (the part between BUILD>..DOES>)

-         the index of the automatic variable (0, 1, .. N-1)

(3)   (does-auto) is used on each use of a variable. The next two (2) cells hold:

-         the address of the code to execute on variable use (the part after the DOES>)

-         the index of the automatic variable (0, 1, .. N-1)

(4)   (free-auto-vars) is used to free the space used by the variables (both in the heap and in the Return Stack).
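The notes above can be followed step by step with a small Python model that replays test2. This is a sketch under stated assumptions, not the FVM implementation: the AutoVarModel class and its method names are hypothetical, each Return Stack entry stands for the two cells (data address, length) described above, index 0 is assumed to be the frame nearest the top of the Return Stack, and the BUILD> and DOES> parts of const are hard-coded as "allot 2 bytes and store" and "fetch". The starting data address 0270 is taken from the listing.

```python
class AutoVarModel:
    def __init__(self, dp0=0x0270):
        self.mem = {}      # data memory: address -> cell content
        self.dp = dp0      # next free address of the data area (the "heap")
        self.pstack = []   # Parameter Stack
        self.rstack = []   # Return Stack; each entry stands for two cells

    def _frame(self, index):
        # Assumption: index 0 is the frame nearest the top of the Return Stack.
        return self.rstack[-1 - index]

    def alloc_auto_vars(self, nb):
        # Reserve the descriptor cells only; no data memory is allocated yet.
        for _ in range(nb):
            self.rstack.append([None, 0])   # [data address, length]

    def build_auto(self, index, nbytes=2):
        # Stands for the BUILD> part of const: "2 (allot) !"
        addr = self.dp
        self.dp += nbytes
        self.mem[addr] = self.pstack.pop()
        self._frame(index)[:] = [addr, nbytes]

    def does_auto(self, index):
        # Stands for the DOES> part of const: "@"
        addr, _ = self._frame(index)
        self.pstack.append(self.mem[addr])

    def free_auto_vars(self, nb):
        # Release descriptors and data blocks, most recent block first.
        for _ in range(nb):
            addr, length = self.rstack.pop()
            if length:
                assert addr + length == self.dp, "blocks must be freed in LIFO order"
                del self.mem[addr]
                self.dp = addr


# Replay of test2 from the listing above.
vm = AutoVarModel()
vm.pstack.append(11)        # (lit) 11
vm.alloc_auto_vars(2)       # (alloc-auto-vars) nb=2
vm.build_auto(1)            # (build-auto) frame index=1
vm.pstack.append(100)       # (lit) 100
vm.build_auto(0)            # (build-auto) frame index=0
vm.does_auto(1)             # (does-auto) frame index=1
vm.does_auto(0)             # (does-auto) frame index=0
vm.pstack.append(vm.pstack.pop() + vm.pstack.pop())   # +
vm.free_auto_vars(2)        # (free-auto-vars) nb=2
print(vm.pstack, hex(vm.dp))
```

In this model test2 leaves 111 on the Parameter Stack, and both the data pointer and the Return Stack are back to their initial state after (free-auto-vars).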

 

FVM supports variable scoping in logical blocks (such as IF..ELSE..ENDIF, BEGIN..UNTIL, etc.). If there are any automatic variables within a logical block, the compiler will compile (alloc-auto-vars) at the beginning of the block and (free-auto-vars) at the end of the block. FVM does not allow shadowing a variable (using the same variable name in a deeper logical block).
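The scoping rule can be pictured with a toy compile-time model in Python. This is a hypothetical sketch, not the FVM compiler: the BlockScopes class, its method names and the string tokens are all invented for illustration. It only shows the two behaviours described above: a block that declares automatic variables gets (alloc-auto-vars) at its beginning and (free-auto-vars) at its end, and a name that is already visible in an enclosing block is rejected. Reusing a name in a sibling block that has already been closed is allowed here, which is an assumption.

```python
class BlockScopes:
    def __init__(self):
        self.open_blocks = []   # one entry per open logical block
        self.out = []           # compiled output (illustrative tokens)

    def begin_block(self):
        # Remember where the block starts in the compiled output.
        self.open_blocks.append({"start": len(self.out), "names": []})

    def declare(self, name):
        # Reject a name that is already visible in any open block.
        for block in self.open_blocks:
            if name in block["names"]:
                raise NameError(f"automatic variable {name!r} is already visible")
        self.open_blocks[-1]["names"].append(name)

    def emit(self, token):
        self.out.append(token)

    def end_block(self):
        # Wrap the block with alloc/free only if it declared variables.
        block = self.open_blocks.pop()
        nb = len(block["names"])
        if nb:
            self.out.insert(block["start"], ("(alloc-auto-vars)", nb))
            self.out.append(("(free-auto-vars)", nb))


c = BlockScopes()
c.begin_block()             # outer logical block
c.declare("x")
c.emit("use-x")
c.begin_block()             # deeper block without variables of its own
c.emit("dup")
try:
    c.declare("x")          # same name in a deeper block
    shadowing_rejected = False
except NameError:
    shadowing_rejected = True
c.end_block()
c.end_block()
print(c.out, shadowing_rejected)
```

Only the outer block is wrapped, because the inner block declares nothing, and the attempt to redeclare x in the deeper block is refused.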

 

Automatic variables are a nice, elegant and powerful concept in FVM. The obvious drawbacks are:

-         they use more memory (for example, each use of a variable costs 3 cells of code memory)

-         they use more CPU cycles.

 

For this reason, FVM supports constant and variable directly as automatic variables, with code that is more optimized than if they were defined using BUILD> .. DOES>.

 

It is very important to understand that FVM does NOT (currently) need a memory garbage collector. Automatic variables are allocated on the heap in LIFO order: the most recently allocated block is always the first one freed.

 

The code for BUILD> DOES> and friends is defined here (MSP430) and also here (linux x86).

 

Licensing

This software is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

 

This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.

 

Please take a look at the GNU Web Pages for a copy of the GPL license: http://www.gnu.org/licenses/gpl

Pages created:
December 27th 2002

Olivier Singla

Pages last revised:
March 13th 2006

 
