/ charliterals /

Character literals

description:

The ANSI/ISO Forth standard defines the following syntax for
character literals:
[CHAR] c
within colon definitions and
CHAR c
outside colon definitions.
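
A minimal sketch of both standard forms, shown in HEX to match the
sessions below. STAR is a name chosen here only for illustration.

```forth
HEX
\ Interpretation state: CHAR parses the next word and leaves
\ the code of its first character on the stack.
CHAR a .                        \ prints 61

\ Compilation state: [CHAR] parses at compile time and
\ compiles the character code as a literal.
: STAR  ( -- )  [CHAR] * EMIT ;
STAR                            \ prints *
DECIMAL
```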

Many Forth systems support the following syntax as well:
'c'


Supported in:

Win32Forth, T32Forth

HEX ok
'a' . 61 ok
'a' EMIT a ok


Not supported in:

MPE VFX Forth 3.40 14 May 2001,
SwiftForth 2.2.2 09Mar2001,
SP-Forth 4.006


Differently supported in:

bigForth and gForth
HEX ok
'a' . 6127 ok
'A' . 4127 ok
where 61 is the code of 'a' and 27 is the code of tick.
To obtain the code of 'a', you have to write: 'a

'a . 61 ok
'a EMIT a ok

The logic behind this strange syntax is that ' (tick)
is actually a number conversion prefix. (Do not ask
me why or how, I am not a proponent of this syntax -- mlg)

In addition, I (mlg) would like to cite the following message:

Marcel Hendrix writes (Message-ID: <9r1i29$juf$1@news.IAEhv.nl>):
>
> An amusing bug in a metacompiler that allows forward references:
>
> 'REGMASK @ IF ....
>
> Because my old NUMBER? allowed 'A to mean [CHAR] A the reference to the
> forward label 'REGMASK was compiled as [CHAR] R . I have now changed NUMBER?
> so that a quoted character constant should have a length of 3 and end in a
> "'" too.
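
The stricter rule described in the message (exactly three characters,
with a tick at both ends) could be sketched as below. This is only an
illustration, not Marcel Hendrix's actual NUMBER? code; the name
CHAR-LITERAL? is hypothetical.

```forth
\ Accept a token as a character literal only if it has the
\ form 'c' : length 3, first and last characters are ticks.
: CHAR-LITERAL?  ( c-addr u -- char true | false )
   3 = IF
      DUP C@ [CHAR] ' =           \ first character is a tick?
      OVER 2 + C@ [CHAR] ' =      \ third character is a tick?
      AND IF  1+ C@ TRUE EXIT  THEN
   THEN
   DROP FALSE ;
```

With such a check, the token 'a' is accepted, while 'REGMASK is
rejected and can be treated as an ordinary name.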


summary:

Which syntax is better, 'c' or 'c , is a matter of debate.

If you write for portability, you cannot rely on either form.
On the other hand, a human reader usually has no trouble
understanding what such code means.


note:

With the gForth syntax, 'abcd leaves 'a' in the most significant
byte and 'd' in the least significant byte of the resulting number.
Now define VARIABLE X . Which character will be in the byte at
address X after executing 'abcd X ! ?
The answer depends on the platform's byte order: a little-endian
machine stores 'd' at X, whereas a big-endian machine stores the
most significant byte of the cell there.
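
The effect can be checked without relying on the tick prefix by
storing the equivalent hex number. This is a sketch that assumes
cells of at least 32 bits.

```forth
VARIABLE X
HEX
61626364 X !    \ the value that 'abcd yields under the gForth syntax
X C@ .          \ prints 64 (the code of 'd') on a little-endian machine
DECIMAL
```

On a big-endian machine the same byte holds the most significant byte
of the cell, so the result also depends on the cell size.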


one more approach:

ASCII c

ASCII is a state-smart word that leaves the character code
on the data stack when invoked in the interpretation state
and compiles that code as a literal when invoked in the
compilation state.

Origin: Forth-83. Widely adopted since.
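
On a system that lacks ASCII , one possible definition in terms of
the standard words is sketched below. It is an assumption about how
such a word may be built, not the source of any particular system.

```forth
\ State-smart: leaves the code when interpreting,
\ compiles it as a literal when compiling.
: ASCII  ( "c" -- char | )
   CHAR  STATE @ IF  POSTPONE LITERAL  THEN ; IMMEDIATE

ASCII a EMIT                    \ interpretation state: prints a
: DASH  ( -- )  ASCII - EMIT ;  \ compilation state (DASH is illustrative)
DASH                            \ prints -
```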


one more approach:


: CH> ( "c" -- c )
    SOURCE >IN @ TUCK > >R + C@ R> 0= OR 1 >IN +!
;

CH> "c-h-from"

( "c" -- c )

Obtain the next character from the input stream.
Advance >IN by one character.
Return -1 (the value with all bits set) in the case of end-of-line.

The word CH> is useful inside user definitions and in the interpretation
state. (It does not compile a char as a literal in compilation state.)
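
If a compiling variant is wanted, it can be layered on top of CH> .
The sketch below repeats the definition of CH> so that it stands
alone. The names [CH>] and QUESTION are hypothetical, and the example
assumes that the text interpreter leaves >IN just past the single
delimiter that follows the parsed word.

```forth
: CH>  ( "c" -- c )
    SOURCE >IN @ TUCK > >R + C@ R> 0= OR 1 >IN +!
;

\ Immediate variant: fetch the character at compile time
\ and compile it as a literal.
: [CH>]  ( "c" -- )  CH> POSTPONE LITERAL ; IMMEDIATE

: QUESTION  ( -- char )  [CH>] ? ;
QUESTION EMIT                   \ prints ?
```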
