Character literals

description:: The ANSI/ISO Forth standard provides two words for character literals: [CHAR] c inside colon definitions and CHAR c outside them. Many Forth systems also accept the syntax 'c'.
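The two standard words can be tried with a short sketch like the one below. The name LETTER-A is hypothetical and exists only for illustration.

```forth
DECIMAL

\ Interpretation state: CHAR parses the next word and pushes the code
\ of its first character.
CHAR a .                                \ prints 97

\ Compilation state: [CHAR] parses at compile time and compiles the
\ code as a literal.
: LETTER-A  ( -- char )  [CHAR] a ;     \ hypothetical name
LETTER-A EMIT                           \ prints a
```

Only the first character of the parsed word is used, so CHAR abc also leaves the code of a.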
Supported in:: Win32Forth, T32Forth
    HEX  ok
    'a' .  61 ok
    'a' EMIT  a ok
Not supported in:: MPE VFX Forth 3.40 14 May 2001, SwiftForth 2.2.2 09Mar2001, SP-Forth 4.006
Differently supported in:: bigForth and gForth
    HEX  ok
    'a' .  6127 ok
    'A' .  4127 ok
where 61 is the code of 'a' and 27 is the code of tick. To obtain the code of 'a', you have to write 'a instead:
    'a .  61 ok
    'a EMIT  a ok
The logic behind this strange syntax is that ' (tick) is actually a number conversion prefix. (Do not ask me why or how; I am not a proponent of this syntax. -- mlg)
In addition, I (mlg) would like to cite the following message. Marcel Hendrix writes (Message-ID: <9r1i29$juf$1@news.IAEhv.nl>):
> An amusing bug in a metacompiler that allows forward references:
>
> 'REGMASK @ IF ....
>
> Because my old NUMBER? allowed 'A to mean [CHAR] A the reference to the
> forward label 'REGMASK was compiled as [CHAR] R . I have now changed NUMBER?
> so that a quoted character constant should have a length of 3 and end in a
> "'" too.
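The stricter rule described in the quoted message (exactly three characters, with a tick at both ends) might be checked along the following lines. This is a sketch only, not Marcel Hendrix's code: the word QUOTED-CHAR? and its stack effect are assumptions made for illustration.

```forth
\ Hypothetical helper: recognise a string of the form 'c'.
: QUOTED-CHAR?  ( c-addr u -- char true | false )
  3 <> IF  DROP FALSE EXIT  THEN                       \ must be exactly 3 chars
  DUP C@ [CHAR] ' <> IF  DROP FALSE EXIT  THEN         \ must start with a tick
  DUP 2 CHARS + C@ [CHAR] ' <> IF  DROP FALSE EXIT  THEN  \ must end with a tick
  CHAR+ C@ TRUE ;                                      \ the middle character
```

With such a test a forward label like 'REGMASK is rejected as a character constant, because it is neither three characters long nor terminated by a tick.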
summary:: Whether 'c' or 'c is the better syntax is a matter of debate. If you write portable code, you cannot rely on either form. On the other hand, a human reader usually has no trouble understanding what such code means.
note:: With the gForth syntax, 'abcd leaves 'a' in the most significant byte and 'd' in the least significant byte. Let us define VARIABLE X . Which character will be in the byte at address X after execution of 'abcd X ! ? The answer depends on the byte order of the platform: on a little-endian machine it is 'd', on a big-endian machine with 32-bit cells it is 'a'.
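The byte-order dependence can be observed without the non-portable prefix syntax. The sketch below stores, as an ordinary hex number, the value that 'abcd is described as leaving, then fetches the byte at the lowest address. It assumes cells of at least 32 bits.

```forth
VARIABLE X
HEX
61626364 X !    \ codes of a b c d, with 'a' in the most significant position
X C@ .          \ 64 ('d') on a little-endian machine,
                \ 61 ('a') on a big-endian machine with 32-bit cells
DECIMAL
```

On a big-endian machine with 64-bit cells the byte at X would be 0, because the four character codes occupy the low-order half of the cell.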
one more approach:: ASCII c
ASCII is a state-smart word that leaves the character code on the data stack when invoked in interpretation state and compiles that code as a literal when invoked in compilation state. Origin: Forth-83. Widely adopted since.
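On a system that lacks ASCII, a state-smart definition can be sketched in terms of the standard words CHAR, STATE and LITERAL. This is one common way to write it, not necessarily how any particular system implements it. STAR-CODE is a hypothetical name used only to show the compiled case.

```forth
DECIMAL

: ASCII  ( "name" -- char | )
  CHAR                               \ parse the next word, take its first character
  STATE @ IF  POSTPONE LITERAL  THEN \ when compiling, compile the code as a literal
; IMMEDIATE

ASCII a .                            \ interpretation state: prints 97
: STAR-CODE  ( -- char )  ASCII * ;  \ compilation state: 42 is compiled as a literal
STAR-CODE EMIT                       \ prints *
```

Like every state-smart word, this definition misbehaves when it is ticked or postponed, which is one reason the standard has the separate words CHAR and [CHAR].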
one more approach::
    : CH> ( "c" -- c ) SOURCE >IN @ TUCK > >R + C@ R> 0= OR 1 >IN +! ;
CH> "c-h-from" ( "c" -- c )
Obtain the next character from the input stream. Advance >IN by one character. Return -1 (the value with all bits set) in the case of end-of-line. The word CH> is useful inside user definitions and in the interpretation state. (It does not compile a char as a literal in compilation state.)
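The same definition is repeated below with a stack comment on each step, followed by a small usage sketch. The helper .CODE is hypothetical. The example assumes a single space between the word and the character, since CH> takes whatever character >IN points at.

```forth
DECIMAL

: CH>  ( "c" -- c )
  SOURCE >IN @        \ c-addr u in
  TUCK >  >R          \ c-addr in      R: flag, true while characters remain
  + C@                \ char at the current parse position
  R> 0= OR            \ char, or -1 at end of line
  1 >IN +! ;          \ step past the character just read

CH> a .               \ interpretation state: prints 97

\ Hypothetical helper: print the code of the character that follows it.
: .CODE  ( "c" -- )  CH> . ;
.CODE *               \ prints 42
```

Because CH> does not skip leading blanks, CH> followed by two spaces returns 32, the code of the second space.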