Crash course: How to program 'Avise 5':

This text is not intended as an introduction to Forth programming, but the most important differences from other programming languages are outlined in section 3.

1. Installation, Startup: This crash course assumes that you have completely assembled hardware and a microcontroller programmed with the 'Avise5' firmware.

Start a terminal program with the following settings:
The terminal baud rate is currently 57600 baud for all CPU and system clock versions, 8 data bits, no parity, one stop bit (8N1); no handshake; local echo OFF; 1:1 direct data transfer (i.e. no transfer protocol, no call procedure).
To send data from the terminal to 'Avise', a new line should be entered with 'Carriage Return' (hexD) only. 'Line Feed' (hexA) is ignored by 'Avise'. To display received characters properly, however, most terminal software must add a 'Line Feed' (hexA).
As a standard solution I recommend the special terminal program 'DTerm', which is available for download at this website.

Connect your 'Avise' hardware to the selected COM port of your PC and finally switch on the power supply.

Now you should see the start prompt on your display: "Avise 5.9 ATmega328 1.84MHz ok" (or similar; an ATmega32U4 may not show the text due to the USB enumeration delay). Then type some arbitrary characters and finish the line with <RETURN>. Every typed character should reappear immediately on the PC display. After <RETURN> a text should appear which contains "ok" or some kind of error message. You may then assume that the hardware and firmware installation works correctly.
If strange characters appear on the screen, check the baud rate. If nothing appears on the PC screen, first check the COM port and cable with a loopback test, and check the supply voltage. If all of that works, the problem is probably the 'Avise' module: defective hardware or faulty programming.

'Avise' does not distinguish between uppercase and lowercase letters; internally everything is stored and handled in uppercase. The following examples are typed in uppercase letters in line with the 'official' notation. Typing lowercase letters is more comfortable, of course. If startup messages of 'Avise' appear unexpectedly during program execution, the program has probably hung and 'Avise' was reset by the watchdog timer of the microcontroller. Another reason may be a very unstable or noisy power supply which triggers the brownout detector.
If the program hangs without sending the reset message, control can be re-established at any time by pressing the 'ESC' key (= hex1B, decimal 27) on the terminal keyboard. A "warm" start is then executed: all control parameters are reset to their default values and all stack entries are deleted. The 'ESC' character is caught and evaluated at a very low system level by the USART interrupt handler. As a consequence, the 'ESC' character cannot be used in raw binary data transfers. This effect can be disabled with the special command NOBAK (only available on ATmega devices with USART1).

2. The Operators ('Words', Commands, Functions, Subroutines) of 'Avise5':

When working with 'Avise', you can enter simple commands directly from the keyboard. But you can also write and compile complex structured programs.
Due to its simple stack oriented parameter transfer, 'Avise' technically does not distinguish between 'operators', 'commands', 'functions', 'procedures' or traditional Forth 'words'. As a naming convention in this text, commands built into the kernel are called 'Kernel Operators' or simply 'operators', and commands programmed by the user are called 'User Functions' or simply 'functions'.
To display a list of all currently available "operators" on your terminal screen, type "OPS <RETURN>". You will find a detailed description of all operators built into the 'Avise' kernel in the Glossary of Kernel Operators. To display a list of all currently available 'User Functions' on your terminal screen, type "USER <RETURN>".

Commands are entered in the terminal program as text lines or command lines. 'Avise' echoes every character, but execution of the command line does not start before the line has been entered completely and finished with <RETURN>. A command line may contain max. 80 characters.

A closer look inside:
In programming languages generally, a 'token' is an elementary unit of meaning. At user level a 'text token' is a word, i.e. a piece of text which is enclosed by space characters or, more generally, by 'delimiters'. At runtime a 'token' is a specific byte code or cluster of bytes which identifies a specific operation. As will be shown below, the sequence of tokens actually executed may differ considerably from the sequence of text tokens in the source code! Avise5 is "subroutine threaded", which means that any compiled token (Kernel Operator or User Function) is represented by a call to a subroutine which contains the specific runtime code of that operator or function. Because the code thread is a mixture of ATmega "call" and "rcall" instructions, it is hard to decipher "manually".

'Avise', like any Forth interpreter, can operate in two different states: "direct execution" (interpreting) or "compiling". In the state of direct execution every text token is processed immediately. In the compiling state the corresponding sequence of token codes is generated from the source code and integrated into the system as a new User Function, which has its own token code and its own unique name as a text token. From then on this token can be used as a building block in new User Functions of higher complexity, and so on. The maximum depth of nested calls is limited only by the size of the AVR processor return stack. As a simple rule, a depth of up to 20 nested calls is safe in most cases. Note, however, that every FOR...LOOP structure takes 6 additional entries on the return stack, which may reduce the maximum depth of nested calls considerably.

A significant difference exists between the operators which are built into the 'Avise' kernel (as fast-running assembler routines) and the user programmed functions: each of the latter is a threaded sequence of subroutine calls and needs more CPU time for execution.

3. The data stack and Forth typical programming style:

The most elementary storage mechanism of all Forth interpreters, 'Avise' included, is the 'data stack' memory.
The classical textbook example for a "stack" is a stack of plates: the plate put on last gets removed first.

The number put last on the stack is called "top of stack" (shortened TOS). The second element of the stack is called "next of stack" (shortened NOS). When a number, a variable or a constant is called, its actual numeric value is put on the TOS, this way the former TOS becomes NOS.

In contrast to the commonly used way of writing mathematical expressions, Forth interpreters, 'Avise' included, use "Postfix Notation".
A standard addition task like "SUM = 2 + 3" is written in Postfix Notation as:
2 3 + SUM W
This means: 2 is put on TOS, 3 is put on TOS (2 is NOS now), both are added and 5 is TOS now, the address of VARiable SUM is put on TOS (5 is NOS now), and 5 is written to this address, which removes both items from the data stack.
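The sequence above can be traced step by step with a small model of the data stack. The following Python sketch is not Avise code and says nothing about how the kernel is implemented; the names `stack`, `memory`, `push`, `add`, `write` and the address 0x0100 for SUM are assumptions made only for this illustration.

```python
# Minimal model of the data stack, for illustration only.
stack = []        # the last list element is the TOS
memory = {}       # address -> 16 bit value
SUM = 0x0100      # hypothetical SRAM address returned by VAR SUM

def push(value):
    """Put a number on the TOS (the former TOS becomes NOS)."""
    stack.append(value & 0xFFFF)

def add():
    """Model of + : combine NOS and TOS, leave the result on the TOS."""
    tos = stack.pop()
    nos = stack.pop()
    push(nos + tos)

def write():
    """Model of W : value is NOS, address is TOS, both are removed."""
    address = stack.pop()
    value = stack.pop()
    memory[address] = value

# 2 3 + SUM W
push(2)       # stack: 2
push(3)       # stack: 2 3
add()         # stack: 5
push(SUM)     # stack: 5 <address of SUM>
write()       # stack: empty, memory cell of SUM holds 5
```

After the last step the model stack is empty again, which matches the "2 items are deleted" remark in the description.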

For example, at runtime of a compiled 'C' program the computer internally still works in a postfix manner: it loads data from variables into registers, combines them and stores the results back into variables, very similar to the way this is programmed in Forth. You can check this in disassembled 'C' code. The commonly used source code style only adds overhead for the compiler. In this sense, 'Avise' and every Forth interpreter work very close to the roots of machine computing.

Once accepted, Postfix Notation simplifies interactive programming, especially the handling of I/O on embedded microcontrollers, compared with C- or JavaScript-style interactive input. (To be honest: writing a Forth interpreter is much easier for me and takes far fewer microcontroller resources than writing a CPU-based C or JavaScript interpreter.)

DUP and DROP are the most elementary stack operations: DUP copies the TOS and puts the copy on top, so the former TOS is then NOS. DROP deletes the current TOS, so the former NOS is then TOS.
In most arithmetic and logical operations, the interpreter engine combines the TOS with the NOS, deletes the NOS and returns the result on the TOS. (The stack shrinks; see the example above.)

Other very basic Forth operators are SWAP (exchanges NOS and TOS) and OVER (copies NOS onto the top of the stack, so the previous TOS is NOS now).
There is also a set of operators which only manipulate the TOS, e.g. 1+, 1-, ABS, +/-, NOT.
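The effect of the four stack operators described above can be written down in a few lines. This is a sketch in Python, not Avise code; the function names are chosen for this illustration and each function returns a new list whose last element is the TOS.

```python
# Illustrative model of DUP, DROP, SWAP and OVER.
# A stack is a Python list; the last element is the TOS.

def dup(s):
    """DUP: copy the TOS and put the copy on top."""
    return s + [s[-1]]

def drop(s):
    """DROP: delete the TOS, the former NOS becomes TOS."""
    return s[:-1]

def swap(s):
    """SWAP: exchange NOS and TOS."""
    return s[:-2] + [s[-1], s[-2]]

def over(s):
    """OVER: copy the NOS on top, the previous TOS becomes NOS."""
    return s + [s[-2]]

start = [1, 2]                 # 1 is NOS, 2 is TOS
after_dup = dup(start)         # [1, 2, 2]
after_drop = drop(start)       # [1]
after_swap = swap(start)       # [2, 1]
after_over = over(start)       # [1, 2, 1]
```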

4. Numbers, Constants and Variables

The (virtual) numeric processor of 'Avise' operates exclusively with 16 bit integer numbers. The range of (signed) numbers is -32768 (hex8000) to +32767 (hex7FFF). -1 is represented as hexFFFF. This cyclic representation of numeric values may be confusing at first glance, but it is common practice in computers. In some cases numbers are regarded as unsigned; then the 16 bit range is 0 to 65535 (hexFFFF).
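The cyclic representation can be checked with a short calculation. The following Python sketch only models the arithmetic of 16 bit values described above; the helper names `to_unsigned` and `to_signed` are chosen for this illustration and are not Avise operators.

```python
# 16 bit cyclic number representation, illustration only.

def to_unsigned(n):
    """Interpret a number as an unsigned 16 bit value (0 .. 65535)."""
    return n & 0xFFFF

def to_signed(n):
    """Interpret a number as a signed 16 bit value (-32768 .. 32767)."""
    n &= 0xFFFF
    return n - 0x10000 if n & 0x8000 else n

minus_one = to_unsigned(-1)        # 0xFFFF
wrapped = to_signed(32767 + 1)     # overflow wraps around to -32768
```

The same bit pattern hexFFFF therefore stands for -1 when read as signed and for 65535 when read as unsigned.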

By default, at power-on 'Avise' operates in decimal number base. Because I/O operations and memory handling are much easier with hex numbers, the default number base can be changed as follows: enter via terminal: 0 0xD WREEP <return>. This takes effect after reset and at the next power-on. To change back to decimal, enter via terminal: 0xFF 0xD WREEP <return>. During a programming session, the base can be changed temporarily with HX or DZ; for details see below. WREEP must be handled with care: wrong input may destroy other system parameters!

Independent of the active number base, hex numbers can be entered with a leading 0x and decimal numbers with a leading & (no space!). Numbers that match the active number base can be entered without a prefix.
To change the number base during an operating/programming session: the Kernel Operator 'DZ' switches to decimal number base and decimal number I/O. The Kernel Operator 'HX' switches back to hex number base. The setting stays in effect until it is changed by another 'DZ' or 'HX' or by a system reset.
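The prefix rules can be summarized in a small model. This Python sketch is only an illustration of the rules stated above, not the parser used by 'Avise'; the function name `parse_number` is hypothetical, and negative numbers and error handling are deliberately left out.

```python
# Illustration of the number prefix rules: 0x = hex, & = decimal,
# no prefix = active number base. Not the actual Avise parser.
HEX_DIGITS = "0123456789ABCDEF"
DEC_DIGITS = "0123456789"

def parse_number(token, active_base):
    """Return the 16 bit value of a text token, or None if it is no number."""
    text = token.upper()                  # input is handled in uppercase
    if text.startswith("0X"):
        digits, allowed, base = text[2:], HEX_DIGITS, 16
    elif text.startswith("&"):
        digits, allowed, base = text[1:], DEC_DIGITS, 10
    else:
        allowed = HEX_DIGITS if active_base == 16 else DEC_DIGITS
        digits, base = text, active_base
    if not digits or any(ch not in allowed for ch in digits):
        return None
    return int(digits, base) & 0xFFFF

in_decimal = parse_number("0x1F", 10)     # hex input while decimal is active
in_hex = parse_number("&100", 16)         # decimal input while hex is active
plain_hex = parse_number("100", 16)       # no prefix: active base is used
```

In this model a token such as AFFE is a number when hex is the active base and an ordinary word when decimal is active, which is why such words are unsuitable as names.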

When 'DZ' or 'HX' are used as Kernel Operators during compilation, a question arises: are these operators effective at compile time or at runtime? 'Avise' handles them as follows: when 'DZ' or 'HX' are used in definitions of User Functions, they have no influence at compile time. They only switch the number base at runtime. During compile time the previously active number base generally remains active. If the number base is to be changed during compile time, I suggest using 0x or & as number prefix where appropriate. When a User Function terminates, the number base is set back to the general system level.

Constant numbers, which are used more than one time in a program, should be called by symbolic names. In 'Avise', you can declare a CONSTant:
  2413 CONST CONNY
The declaration must be done BEFORE the first use. The value of the CONSTant has to be entered first. It can even be the result of a calculation. CONST values are stored in the CPU EEPROM, so they are unchanged after a power cycle. After a CONSTant is declared, it returns its value on the stack when called by name.

The name must not be identical with another operator or another user defined function or with a numeric expression. (Attention, words like AFFE, DEC, FACE, DEAD are valid hexadecimal numbers!). The same applies to all names of variables and functions!

A special feature to change CONSTant values at runtime is the Kernel Operator 'RECON', syntax example: "0xABDC RECON CONNY". This is not recommended for fast and frequent changes of parameters, but the new value is still active after a system restart (it is stored in EEProm).

To store intermediate results, VARiables (16 bit integers) are used, which are declared as follows:
  VAR VALLY
VARiable values are stored in SRAM. Directly after declaration and after system (cold-)start, every variable is initialized with value 0. 'Avise' knows only one type of variable: 16 bit integer.

Variables declared this way are 'global', i.e. they can be called from any function. This is simple and compact but has some disadvantages, for instance when functions call themselves recursively. In complex programs you may forget that you have already used the same 'unimportant' variable, perhaps as loop counter, in another function. Parameter handling via the data stack is less clear but avoids these problems.

After a variable is declared, it returns the ADDRESS !! of its data memory when called by name. See below "Read and Write random Memory Cells".

Though VARiables are generally user defined, there are three variables VA, VB, VC implemented as Kernel Operators, which may be changed at runtime of a User Function by a special background process. For details see section 12 below. Otherwise the runtime behaviour of VA, VB, VC is the same as that of user defined variables.
A special kind of global variables is the ARRAY Kernel Operator, details see glossary.

5. Arithmetics, Logical Operations and Comparisons

Arithmetic operators combine 16 bit integer numbers to resulting 16 bit numbers. Addition and subtraction show cyclic overflow behaviour, i.e. are applicable to signed and unsigned numbers. Multiplication and division are processed "signed".

Bitwise logical operations are executed with 16 bit binary numbers. Every bit of the left operand is combined with the respective bit of the right operand. No carry or borrow between the bits. Compare operations have the result 1 when the result of the comparison is TRUE and the result is 0, when the result is NOT true (i.e. FALSE).
    Example:
   2 3 <   returns 'TRUE'=1, but
   4 4 <   returns 'FALSE'=0 to the TOS. All comparisons are done with signed numbers except U> and U<, which take operands as unsigned.
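The difference between signed and unsigned comparison only shows up when bit 15 is set. The following Python sketch models the behaviour described above for < and U<; the function names are chosen for this illustration, and the operands are written as NOS and TOS in the order in which they are entered.

```python
# Signed versus unsigned "less than" on 16 bit values, illustration only.

def to_signed(n):
    """Interpret a 16 bit pattern as a signed number."""
    n &= 0xFFFF
    return n - 0x10000 if n & 0x8000 else n

def less_signed(nos, tos):
    """Model of < : result 1 (TRUE) or 0 (FALSE), operands taken as signed."""
    return 1 if to_signed(nos) < to_signed(tos) else 0

def less_unsigned(nos, tos):
    """Model of U< : result 1 (TRUE) or 0 (FALSE), operands taken as unsigned."""
    return 1 if (nos & 0xFFFF) < (tos & 0xFFFF) else 0

signed_result = less_signed(0xFFFF, 1)      # -1 < 1      gives 1
unsigned_result = less_unsigned(0xFFFF, 1)  # 65535 < 1   gives 0
```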

6. Read and Write random memory cells:

Differing from previous versions of 'Avise', we have a set of BYTE and WORD read and write operators now.

Any of these write operations is used as follows:
First enter the value to be written (becomes NOS), then enter the address to write to (becomes TOS). Both stack entries are deleted during the write operation.
    Example:
  VAR VALLY
  374 VALLY W

Any of these read operations is used as follows:
Enter the address to read from (it is TOS). Execute the read operator. Afterwards the address on the TOS has been exchanged for the value read; the stack level does not change.
    Example:
  VAR VALLY
  VALLY R .

7. Structuring Techniques:

Structuring techniques can only be applied when compiling, not during direct execution of a command line. But user programmed functions which contain structured code sequences can always be called directly by name.

Conditional execution of program parts:

IF... ELSE ... THEN
The ELSE alternative is optional.

First in this construction, IF checks the TOS.
If it is 'TRUE', i.e. not equal to 0, the code between IF and THEN (or between IF and ELSE) is executed.
If the TOS is 'FALSE', i.e. exactly equal to 0, the code between ELSE and THEN is executed (or nothing is executed when the ELSE part is not present).
Example:
VAR STATUS
: TEST STATUS R IF ." TRUE" ELSE ." FALSE" THEN ;

<start> <stop> <variable> FOR .. your code .. LOOP (counted loop):

<variable> is the name of a previously declared variable, which serves as loop counter. <start> is a number, variable, constant or even a function which supplies the start index of the loop; <stop> accordingly supplies the index of the last loop turn, after which the loop is finished. A signed comparison of the start and stop index determines automatically whether the loop runs with an ascending or descending loop counter. For a more detailed description of how this is managed at machine code level, see the glossary.
After each loop turn, LOOP adds 1 to (or subtracts 1 from) the counter variable. Other loop steps can be created by changing the loop counter variable within the loop.
When the Kernel Op BREAK is called within the loop, the loop is left immediately. If several loops are nested, BREAK always terminates the innermost loop.
When the Kernel Op CONTI is called within the loop, the loop index is updated and the next loop turn is started immediately (or the loop is terminated when the index count is finished).
Example:
VAR LUPO
: TEST
   1 100 LUPO FOR
      LUPO R DUP . 28 = IF BREAK THEN
   LOOP
;
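The counting rule of FOR ... LOOP and the effect of BREAK in the TEST example can be modelled as follows. This is a Python sketch under stated assumptions, not a description of the kernel: the name `for_indices` is hypothetical, the loop counter is not changed inside the loop, and a loop with identical start and stop index is assumed to run exactly once (see the glossary for the real behaviour).

```python
# Model of the FOR ... LOOP counter, illustration only.

def for_indices(start, stop):
    """Yield the loop counter from start to stop, both included.

    The direction follows from comparing start and stop.
    """
    step = 1 if start <= stop else -1
    index = start
    while True:
        yield index
        if index == stop:
            break
        index += step

# Model of:  1 100 LUPO FOR  LUPO R DUP . 28 = IF BREAK THEN  LOOP
printed = []
for lupo in for_indices(1, 100):
    printed.append(lupo)      # DUP .
    if lupo == 28:            # 28 = IF BREAK THEN
        break

descending = list(for_indices(3, 1))   # start > stop: counter runs downwards
```

In this model the TEST example prints the numbers 1 to 28 and then leaves the loop.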

DO ... WHILE     --WHILE terminates the loop if it reads TOS==FALSE, TOS gets deleted anyway
DO ... UNTIL     --UNTIL terminates the loop if it reads TOS==TRUE, TOS gets deleted anyway
DO ... AGAIN     --AGAIN never terminates the loop, TOS is not checked nor modified
Kernel OPs BREAK and CONTI are useful with these loop types, too. CONTI immediately initiates an unconditional loop repetition (like AGAIN, but from inside the loop).
Example:
VAR COUNT
: FOREVER
   0 COUNT W
   DO
      COUNT R 1+ DUP . DUP COUNT W
   20 == UNTIL
;

These types of loop are very useful when the stop criterion is evaluated within the loop. In a worst case, the loop can be terminated with 'ESC'.
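For readers more familiar with other languages, the DO ... UNTIL example FOREVER above corresponds roughly to a loop with the exit test at the end. The following Python sketch models only the control flow; `printed` stands in for the terminal output and is an assumption of this illustration.

```python
# Model of the FOREVER example: DO ... UNTIL tests at the end of each turn.
count = 0                     # 0 COUNT W
printed = []                  # stands in for the terminal output

while True:                   # DO
    count = count + 1         # COUNT R 1+  ...  DUP COUNT W
    printed.append(count)     # DUP .
    if count == 20:           # 20 == UNTIL : TRUE terminates the loop
        break
```

The body always runs at least once, because the condition is only evaluated by UNTIL at the end of the turn.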

The sequences of token codes which are constructed by structuring techniques differ considerably from the source code. Structuring operators are not compiled directly. Instead, alternative tokens are compiled which perform the corresponding action at runtime. These include the runtime primitives doIF, doELSE, doDO, doUNTIL, doWHILE, doAGAIN, doFOR, doLOOP, which contain the Flash target address where the jump goes to. Kernel Ops RS-1UP or RS-4UP are compiled automatically to adjust the CPU return stack. This can easily be checked with the command SEE <name>. THEN doesn't leave any visible trace in the compiled code, but delivers the corresponding target address at compile time.

8. Timed operations:

If necessary, a very short delay in a User Function can be inserted with the sequence DUP DROP without any effect on stacks and other system functions.
For longer delays, MS provides a blocking delay. The delay parameter of MS is entered in milliseconds (not absolutely precise). The blocking state can be released at any time with terminal input CTRL_R(0x12,dec18).
Example:
1000 MS

Another option is a non-blocking delay, started with TIX. The delay parameter is given in units of 1/10 second here, in order to create a simple "timeline" for media installations, e.g. up to 6553 seconds. The current countdown value is read with TIME without disturbing the countdown procedure. A new value of TIX can be entered at any time to restart or prolong the TIME countdown. Apart from the countdown, nothing else happens inside the Avise interpreter.
Example:
5000 TIX
Then type TIME . and the remaining number of 1/10 second units will be displayed.
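The relation between seconds and TIX units is a simple factor of 10, limited by the 16 bit number range. The following Python sketch shows the conversion; the helper names are hypothetical and the unsigned 16 bit limit is inferred from the "up to 6553 seconds" remark above.

```python
# Conversion between seconds and TIX units (1/10 second), illustration only.
TIX_UNITS_PER_SECOND = 10
MAX_TIX = 0xFFFF              # assumed limit: unsigned 16 bit parameter

def seconds_to_tix(seconds):
    """Return the TIX parameter for a delay given in seconds."""
    units = round(seconds * TIX_UNITS_PER_SECOND)
    if not 0 <= units <= MAX_TIX:
        raise ValueError("delay does not fit into a 16 bit TIX value")
    return units

def tix_to_seconds(units):
    """Return the delay in seconds for a TIX or TIME value."""
    return units / TIX_UNITS_PER_SECOND

example_units = seconds_to_tix(500)        # the value used in "5000 TIX"
longest_delay = tix_to_seconds(MAX_TIX)    # 6553.5 seconds
```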

9. The 'Avise' mini Operating System and Compiler:

A great advantage of Forth interpreters like 'Avise' is that the user can program their own User Functions more or less interactively via terminal, building on and expanding existing user code. These may call not only the Kernel Operators but also previously written User Functions. Every User Function has its own name.

To make best use of ATmega internal memory, the program code is separated into different sections:

The Kernel Symbol table contains the following elements for every Kernel Operator:

Because in 'Avise' the User Symbol table is built into the CPU-internal EEProm, it is strongly advised to choose short names: the maximum number of user definitions depends on the limited EEProm space. Unlike Flash code, EEProm is organized bytewise. Every entry in the symbol table takes (name length + 3) bytes. So about 125 User Function definitions will fit into 1 kilobyte of EEProm when the average name length is 5 characters. The maximum name length is 12 characters.
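The estimate can be reproduced with a short calculation. The following Python sketch only applies the (name length + 3) rule stated above; the function names are hypothetical, and the result is an upper bound that assumes the whole EEProm area is available for the symbol table.

```python
# Estimate of the User Symbol table capacity, illustration only.
ENTRY_OVERHEAD = 3            # bytes per entry in addition to the name
MAX_NAME_LENGTH = 12

def entry_size(name):
    """Return the EEProm bytes taken by one symbol table entry."""
    if not 1 <= len(name) <= MAX_NAME_LENGTH:
        raise ValueError("name length must be 1 to 12 characters")
    return len(name) + ENTRY_OVERHEAD

def max_entries(eeprom_bytes, average_name_length):
    """Return an upper bound for the number of entries that fit."""
    return eeprom_bytes // (average_name_length + ENTRY_OVERHEAD)

bytes_for_test = entry_size("TEST")        # 4 + 3 = 7 bytes
upper_bound = max_entries(1024, 5)         # 1024 // 8 = 128 entries
```

The upper bound of 128 is slightly above the "about 125" quoted in the text; the difference is plausible if a few EEProm bytes are used for other purposes, such as system parameters, but that is an assumption here.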

The colon operator :   - separated with a space from the following - starts the compiler.
The first word in the source code will be the name of this new User Function.
Subsequent text words (numbers, constants, variables, Kernel Operator and User Function names) are not executed directly but are restructured and compiled as token code.

The semicolon operator ; must always be the last word of a User Function because it finishes the compilation process and switches back to the mode of directly interpreting command execution. The programmed user code is kept stored in Flash even when the CPU power is switched off.
The semicolon operator ; always compiles the byte sequence 8 95, which is helpful for deciphering the subroutine threaded code. At runtime this code token returns execution to the next higher subroutine level. The highest level is the return to terminal input, i.e. interpreting command execution.

First, the code is built temporarily in the microcontroller SRAM. When compilation has terminated successfully, the complete code block is burnt into the microcontroller flash memory. Note: Flash memory can be re-programmed only approx. 10,000 times.

To create a user project, 'incremental' compilation following the "bottom up" method is recommended. This means: elementary routines like hardware handlers are each developed and tested first. These are then assembled into progressively more complex sequences. The final result is the "main program".

Only Kernel Operators and older User Functions (incl. VAR, CONST) can be used in new User Functions. No 'forward references' are possible. When a name is defined twice, subsequent calls of this name refer to the youngest definition or declaration. But references compiled earlier remain valid: User Functions which are older than the new definition still call the older one.

When a new firmware version is burnt into the ATmega CPU, all user programmed functions are lost.
For this reason it is recommended to write the source text first in an ASCII text editor and upload it via terminal. Incremental program development is done by "commenting out" the finished parts of the source code.

10. Debug Features of 'Avise':

11. Warmstart, delete User Functions:

12. Hotkeys and Background features used by 'Avise 5'

In general, 'Avise' accepts only 'printable characters'. These are: ASCII codes between hex20 and hex7E, furthermore 'carriage return' = hexD as <RETURN> key and 'backspace' = 8, but no control characters and no country specific special characters like German 'Umlaute'.
Nevertheless, the following ASCII codes have a special meaning for 'Avise' - they are already filtered out in the terminal input handler of the serial interface:


* State of information November 2023.
* Right of technical modifications reserved. Provided 'as is' - without any warranty. Any responsibility is excluded.
* This description is for information only. No product specifications are assured in juridical sense.
* Trademarks and product names cited in this text are property of their respective owners.