Concatenative Languages: Forth, PostScript, Factor

a side-by-side reference sheet

stack operators | arithmetic and logic | heap | strings | containers | functions | execution control | environment and i/o
libraries and modules | reflection and hooks | graphics

forth (1973) postscript (1982) factor (2003)
version used Gforth 0.70 Preview 5.0.2 0.92
get version $ gforth --version select About Preview in Preview menu note version when downloading factor
command line repl $ gforth none $ factor
interpreter $ gforth file -e bye none $ factor file
shebang none none #! /usr/bin/env factor
exit interpreter or repl bye quit USE: system
0 exit
documentation \ tan help
words case sensitive? no yes yes
assignment variable a
3 a !
7 a @ +
/a 3 def
7 currentdict /a get add
SYMBOL: a
3 a set
7 a get +
increment and decrement 7 1+
7 1-
to-end-of-line comment 1 1 + \ addition 1 1 add % addition 1 1 + ! addition
embedded-in-line comment 1 1 ( addition ) + none none
null none null null
null test none a get null =
undefined word
stacks data
floating point
return
locals
operand
dictionary
execution
graphics state
clipping path
data
stack underflow
stack operators
display stack .s
displays float stack:
f.s
none
clear stack clears data stack:
clearstack
clears data and float stack:
clearstacks
clear clear
count stack depth
fdepth
count
copy stack count copy
(a -- ) drop
displays item being dropped:
.
operate on float stack:
fdrop
f.
pop drop
(a -- a a) dup
fdup
dup dup
(a b -- b a) swap
fswap
exch swap
(a b -- b) nip
fnip
exch pop nip
(a b -- a b a) over
fover
1 index over
(a b -- b a b) tuck
ftuck
dup 3 1 roll USE: shuffle
tuck
(a b c -- b c a) rot
frot
3 -1 roll rot
(a b c -- a b c a) 2 pick
2 fpick
2 index pick
(a b c d -- b c d a) 3 roll
3 froll
4 -1 roll USE: shuffle
roll
arithmetic and logic
forth postscript factor
true and false true false
-1 0
true false t f
falsehoods false 0 false f
logical operators and or invert xor and or not xor and or not xor
convert from string, to string 7 (12) cvi add
73.9 (.037) cvr add
USE: math.parser
7 "12" string>number +
73.9 ".037" string>number +
17 number>string
17.1 number>string
comparison operators = <> > < >= <=
f= f<> f> f< f>= f<=
eq ne gt lt ge le = none < > >= <=
float literal must contain an e:
3.14e0
3.14
3.14e0
must contain decimal point or e:
10.0
1e1
arithmetic operators + - * ?? / mod ??
float stack operators:
f+ f- f* f/ f**
add sub mul div idiv mod exp USE: math
+ - * / /i mod ^
negation negate
fnegate
neg neg
arithmetic functions fsqrt fexp fln fsin fcos ftan fasin facos fatan fatan2
also flog for base 10 logarithm
sqrt exp ln sin cos none none none none atan
also log for base 10 logarithm
USE: math.functions
sqrt exp log sin cos tan asin acos atan ??
aslo log10 for base 10 logarithm
arithmetic truncation return floats
fround ?? floor ??
round truncate floor ceiling USE: math.functions
round truncate floor ceiling
arithmetic decomposition abs abs arg
numerator denominator
real-part imaginary-part
closure of integers under division integers floats rationals
largest integer none
float overflow 1/0.
float limits USE: math.constants
single-epsilon largest-float smallest-float
1/0 Division by zero exception
sqrt(-2) USE: math.functions
-2 sqrt
C{ 0.0 1.414213562373095 }
bit operators and or xor invert
logical shift by 3:
7 3 lshift
7 3 rshift
arithmetic shift by 1:
7 2*
7 2/
and or xor not
logical shift by 3:
7 3 bitshift
7 -3 bitshift
bitand bitor bitxor bitnot
arithmetic shift by 3:
7 3 shift
7 -3 shift
setting seed 13 srand
result of not seeding always given seed of 1060806853 when Preview starts
integer rand generates random integer from 0 to 231-1:
rand 100 mod
USE: random
100 random
uniform rand 2 31 exp div USE: random
0 1 uniform-random-float
normal USE: random
0 1 normal-random-float
sample w/o replacement USE: random
{ 3 7 5 12 19 8 } 3 sample
heap
allocate region on heap pushes address and 0 onto stack:
10 allocate
free region on heap free
resize region on heap
write
read
strings
forth postscript factor
string literals allocates memory for string and pushes address and length onto stack:
s\" don't say \"no\""
(don't say "no")
parens and backslashes are escaped with backslashes. Balanced parens do not need to be escaped.
"don't say \"no\""
string escapes \a \b \e \f \n \r \t \v \" \\ \, \ooo \xxx \b \f \n \r \t \\ \( \) \ooo \e \n \r \s \t \\ \" \0 \uxxxxxx
newline in literal no, use \n no, use \n yes
character access (hello) 0 get 0 "hello" nth
length (hello) length "hello" length
concatenate "foo" "bar" append
split USE: splitting
"foo bar baz" " " split
join { "foo" "bar" "baz" } " " join
trim
pad
case manipulation
sprintf USE: formatting
"Spain" 7 13.1 "%s: %d %f" sprintf
containers
forth postscript factor
array literal [ 1 2 3 ] { 1 2 3 }
must arrays be homogeneous no no
array element access [ 1 2 3 ] 0 get 0 { 1 2 3 } nth
out-of-bounds access rangecheck error out of bounds error
array length [ 1 2 3 ] length { 1 2 3 } length
array concatenation { 1 2 } { 3 4 } append
map { 1 2 3 } [ dup * ] map
filter { 1 2 3 } [ 2 > ] filter
reduce { 1 2 3 } 0 [ + ] reduce
array iteration [ 50 100 150 ] { 100 moveto (foo) show } forall
dictionary literal << 1 (one) 2 (two) 3 (three) >> association list:
{ { 1 "one" } { 2 "two" } { 3 "three" } }
hashtable:
H{ { 1 "one" } { 2 "two" } { 3 "three" } }
dictionary length << 1 (one) 2 (two) >> length { { 1 "one" } { 2 "two" } } assoc-size
dictionary element access pushes (one) onto stack:
<< 1 (one) 2 (two) >> 1 get
pushes "one" and t onto stack:
1 { { 1 "one" } { 2 "two" } } at*
element not found behavior undefined error pushes f twice onto stack:
4 { { 1 "one" } } at*
set dictionary element /h << 1 (one) 2 (two) >> def
currentdict /h get 3 (three) put
SYMBOL: h
H{ { 1 "one" } { 2 "two" } } h set
"three" 3 h get set-at
range USE: math.ranges
1 100 [a,b]
functions
forth postscript factor
definition : add3 + + ; /add3 { add add } def : add3 ( a1 a2 a3 -- s ) + + ;
invocation 3 7 5 add3 3 7 5 add3 3 7 5 add3
function value { 2 add }
3 { 2 add } exec
[ 2 + ]
3 [ 2 + ] call
execution control
forth postscript factor
notes on conditionals conditionals can only be used in word definitions.
endif can be used in place of then
ifelse : ifelse rot if drop else nip then ;
x @ 0 > x @ x @ negate ifelse
x 0 gt { x } { x neg } ifelse x get 0 > x get x get neg ?
if bool
if
  if true
then
bool { if true } if bool [ if true ] when
if else bool
if
  if true
else
  if false
then
bool { if true } { if false } ifelse bool [ if true ] [ if false ] if
notes on loops loops can only be used in word definitions.
leave can be used with loop to terminate early
exit terminates innermost for, forall, loop, or repeat
conditional loop begin
  bool
while
  code
repeat
{
  bool { exit } if
  code
} loop
[ bool ] [ code ] while
infinite loop begin
  code
again
{ code } loop [ code t ] loop
loop 10 times 10 0 ?do code loop 10 { code } repeat
for loop 10 0 ?do i code loop iterator value on stack
0 20 200 { 72 exch moveto (foo) show } for
for loop step 2 10 0 +do i code 2 +loop
notes on exceptions throw raises an exception if the top of the data stack is nonzero. exception puts a nonzero value on the stack. If the exception is not handled the string will be displayed by the interpreter on exit. catch can only be used in a word definition.
raise exception s" bam!" exception throw $error /errorname (foo) put stop "bam!" throw
handle exception ['] risky word catch if handle error then {
  risky code
} stopped { handle error }
[ risky code ] [ handle error ] recover
environment and i/o
forth postscript factor
write to stdout ." hello" cr "hello" write
redirect to file
libraries and modules
forth postscript factor
interpret file foo.fs include
or
s" foo.fs" included
load library require foo.fs
or
s" foo.fs" required
reflection and hooks
forth postscript factor
query data type none 3 type USE: classes
3 class
pretty print pprint
dump location and stack ~~
show code see word
graphics
forth postscript factor
default location of origin lower left corner of page
default scale 72 units per inch
text 72 72 moveto (hello) show
render page and start a new one showpage
set current font /Helvetica 20 selectfont
built-in fonts /Times-Roman /Times-Italic
/Times-Bold /Times-BoldItalic
/Helvetica /Helvetica-Oblique
/Helvetica-Bold /Helvetica-BoldOblique
/Courier /Courier-Bold
/Courier-Oblique /Courier-BoldOblique
closed polygon creates outline of triangle:
newpath
72 72 moveto
144 144 lineto
144 72 lineto
closepath
stroke
set current line width default value is 1:
2 setlinewidth
arc of circle creates half-circular arc centered at (144,144) with radius 40:
newpath
144 144 40 90 270 arc
stroke
bezier curve control points are (144,144), (288,144), (288,288), and (144,288):
newpath
144 144 moveto
288 144 288 288 144 288 curveto
stroke
filled polygon creates solid triangle:
newpath
72 72 moveto
144 144 lineto
144 72 lineto
closepath
fill
set color default color is black. Sets current color to red:
1.0 0.0 0.0 setrgbcolor
translate coordinates move origin 72 units up and to the right:
72 72 translate
scale coordinates increase size of unit 72-fold:
72 72 scale
rotate coordinates rotate coordinates 90 degrees counterclockwise:
90 rotate
items in graphics state currentpoint
currentrgbcolor
currentfont
currentmatrix
currentlinewidth
save graphics state gsave
restore graphics state grestore
__________________________________________________________ __________________________________________________________ __________________________________________________________

repl

factor:

The Factor REPL is called a listener. The factor installation includes a clickable application which provides a GUI listener.

assignment

factor:

The following code will get the value for a variable and push it onto the stack:

a get

Stack Operators

Arithmetic and Logic

operators

factor:

The power operator ^ requires loading the math.functions library.

Factor's division operator / will return a rational if used on integers that don't divide evenly. /f always returns a float when used on integers.

Heap

Strings

Containers

Functions

function value

How to store a function in a variable or pass it as an argument

factor:

The square brackets in factor create a quotation. It delays the execution of the code inside the brackets and pushes that code onto the stack as a single value. If a quotation is on the top of the stack, it can be popped and executed with call.

Execution Control

Environment and I/O

Libraries and Modules

Reflection and Hooks

Graphics

Forth

Gforth User Manual
A Beginner's Guide to Forth Noble
Forth in Lisp
ANS Forth 1994

Forth is an easy language to implement. It is surprisingly expressive, considering its simplicity. A Forth interpreter can be implemented with a small memory footprint and it can be run on hardware as a shell without an operating system. As a result Forth is often the first software implemented on newly designed hardware, and it has been used as a bootstrap language by Sun on RISC architectures and Apple on the PowerPC.

The data stack used by Forth procedures to pass operands consists of cells of machine words. These will be interpreted as signed integers, unsigned integers, or memory addresses, depending upon the operator invoked on them. A Forth interpreter does not perform any checks to ensure type safety.

To work with strings and arrays, a Forth programmer must allocate and free space on the heap with the allocate and free commands. Strings are then represented on the data stack with two cells: the address of the string and the length of the string. To make working with these string representations easier, there are operators which manipulate the stack two cells at a time: 2drop, 2dup, 2swap.

Most Forth implementations support floating point numbers. Floats are kept on a separate stack, and many data stack operators have a floating point stack analogue. Floating point operators have an "f" prefix: fdrop, fdup, fswap, f+, f*, f<.

PostScript

PostScript Language Reference (pdf)
PDF Reference (pdf)

PostScript is a commonly used page description language. It has programming language features one might not expect to find in such a special purpose language such as looping and branching commands, arrays and dictionaries, and even commands which read from standard input and write to standard out.

Like Forth, PostScript is a language with built-in operators that add value to or remove values from an operand stack. Unlike Forth, PostScript keeps track of the type of each value in the operand stack, and it will raise an error if an operator is called on an operand of the wrong type. PostScript also offers string, array, and dictionary data types so that direct access to memory is not needed or even supported.

A PostScript document consists of a sequence of pages, one for each execution of the showpage operator. PostScript provides a selection of operators which paint marks such as letters, shapes, or lines onto the page. The painted marks are collected in a buffer. When a showpage operator is invoked, the marks are rendered on the page and the buffer is cleared. If two marks overlap, the mark which was painted last will be completely visible, and the earlier mark will be partially or completely obscured as if it had been painted over.

The coordinate system used for determining the location of a painting mark places the origin in the lower left corner of the pa by default. The scale is 72 units per inch, with the x coordinate increasing in a rightward direction and the y coordinate increasing in an upward direction. The program may scale, rotate, or translate the coordinate system as pleased as long as the transformation is linear. The size of the page depends on the output device. Although the origin is by default the lower left corner of the physical page, some printers cannot print on the margin of the page; such printers cannot place a mark at or near the origin. The portions of marks which are outside the renderable window of the output device are ignored.

Some of the behavior of the painting operands is governed by the current graphics state. This reduces the number of operands that must be pushed onto the stack when calling a painting operator. The most commonly used parameters in the current graphics state are the current position, path, color, font, line width, coordinate transformation matrix. The gsave and grestore will push the current graphics state onto a stack or pop and restore the top value of the graphics state from the same stack.

PostScript Development

Ghostscript
PSAlter
Preview

One can test PostScript code by sending it to a printer, but it is better to have an application which can render the PostScript on a screen. Ghostscript is free and runs on Unix and Linux. PSAlter can be purchased for Windows.

Mac OS X includes an application called Preview which renders PDF documents on the screen. It can also be used to render PostScript which it accomplishes by automatically converting the PostScript to PDF. Preview doesn't provide informative error messages. The following PostScript shows how to use the stopped operator to catch an error and display the error name and the command that caused the error:

%!PS
/showobject { () exch 40 string cvs show } bind readonly def
/showerror {
    /saved_errorname $error /errorname get def
    /saved_command $error /command get def
    initgraphics
    /Helvetica 20 selectfont
    72 644 moveto (this error occurred: ) show
    144 572 moveto currentdict /saved_errorname get showobject
    72 504 moveto (operator causing the error: ) show
    144 432 moveto currentdict /saved_command  get showobject
    $error /newerror false put
    showpage } bind readonly def

{
    /Helvetica-Bold 20 selectfont
    72 576 moveto
    /h << 1 (one) 2 (two) >> def
    currentdict /h get 3 get show showpage

} stopped { showerror } if

Factor

Factor Documentation

History

The Development of Forth

Forth: The Early Years Moore 1991
The Evolution of Forth: Development and Dissemination
Forth-79 (pdf)
Forth-83
ANSI Forth 1994

In 1958 Charles Moore was an undergraduate at MIT with a part time job at the Smithsonian Astrophysical Observatory (SAO). His job involved calculating orbits with the aid of an IBM 704 computer and the Fortran II programming language. He implemented an interpreter in Fortran that recognized four words: WORD, NUMBER, INTERPRET, ABORT. Like modern Forth, the interpreter expected input consisting of words separated by spaces.

In 1961 Moore was studying mathematics at Stanford, where he used a Burroughs B5500 computer. He worked as a programmer for the Stanford Linear Accelerator Center (SLAC). He wrote software to "optimize beam steering" for the accelerator. While at SLAC Moore implemented an interpreter in Algol called Curve. Curve was a clear predecessor to Forth, and in particular it used a stack in the same manner as Forth to pass arguments. Here are some of the words which the Curve recognized, and their Forth equivalents:

Curve Forth
+ +
- -
* *
MINUS NEGATE
IF ELSE THEN IF ELSE THEN
< !
DUP DUP
; DROP
. SWAP
DEFINE word code END : word code ;
DECLARE VARIABLE
COMMENT text END ( text )
SIN SIN
ATAN ATAN
EXP EXP
LOG LOG

Moore worked as a free-lance programmer in a variety of languages between 1965 and 1968, and during this time Moore ported his interpreter to other architectures. Teletype terminals became available, and Moore introduced the convention of having the interpreter echo 'ok' after a line of text is processed.

In 1968 Moore began work at Mohasco Industries, a home furnishings company which leased an IBM 1130 with an IBM 2250 graphics display. Moore decided to name his language. He wanted to name it Fourth, but due to a 5 character limitation of the 1130 operating system, he used Forth instead. He implemented his interpreter in COBOL on the 1130, in Fortran on the 2250, and in 1970 when then company ordered a Univac 1108, he implemented it in Univac assembler. Moore used Forth on the 2250 to render animated 3-D images, a feat that IBM itself was apparently incapable of. On the Univac 1108 Moore intended to use Forth interfacing with Cobol to build an order entry system that ran on leased lines, but Mahasco canceled the project and gave up the Univac 1108.

In 1971 Moore began work at the National Radio Astronomy Observatory (NRAO) in Tuscon. Moore was hired by Greg Conant, who knew of Moore's work at SAO. Moore's first project was to cross-compile Forth to the Honeywell 316 from an IBM 360/50. He then used Forth to control "a new filter-bank for the 36' millimeter telescope". Conant was not happy with the result because NRAO was a Fortran shop, but the system was otherwise well received.

At NRAO, Moore began using indirect-threaded code. He found that he could write code more quickly that was more efficient, more reliable, and more portable than code written in alternative languages. Bess Rather became the first Forth programmer other than Moore himself and she documented the language.

Moore felt that Forth had become a mature product at NRAO. In 1973 he and Rather founded Forth, Inc. The company ported Forth to a variety of architectures. The developed a variant of the language called microFORTH in 1976 which they sold off the shelf. Versions of microFORTH were available for the Intel 8086, the Zilog Z80, and the Motorola 6800.

Standards for Forth were published in 1979, 1983, and by ANSI in 1994.

HP Calculators and Reverse Polish Notation

Friden EC-130 Calculator
HP 9100

In 1924 the Polish logician Jan Lukasiewicz discovered that if the customary binary operators of arithmetic were written using prefix notation then the use of parentheses became unnecessary. Such notation became known as Polish notation, and the corresponding postfix notation which also obviates the need for parentheses became known as reverse Polish notation (RPN).

Since the 19th century mechanical calculators had been using a postfix method of entry. The first fully electronic calculator was introduced in 1961, and this was followed by the first fully transistorized calculator, the Friden EC-130, in 1963. The EC-130 was notable because it used RPN and it had a stack of 4 registers. It lacked stack manipulation operators such as swap or roll, making it at times difficult to use.

The stack as a data structure was first described by the German computer scientist Friedrich Bauer in 1955.

In 1968 Hewlett-Packard introduced a programmable calculator, the HP 9100. It used RPN and had a stack of three registers, labelled X, Y, and Z. When HP began selling calculators with a stack of four registers in 1972 the fourth register was given the name T. The HP 9100 could hold programs with up to 196 instructions and each line in the program was labeled with two hex digits. The HP 9100 language could loop and branch.

The original HP 9100 was a desktop device which weighed 40 lbs. In 1972 the company introduced the HP-35, a pocket calculator with transcendental functions which used RPN. In 1974 they introduced the HP-65, a programmable RPN handheld calculator. It could hold a program with up to 100 lines.

Early HP calculators were implemented entirely in the machine language of the underlying hardware. The HP-18C, which appeared in 1986, was the first to be partially implemented in a higher language. It was a new language developed for the purpose which HP called Reverse Polish Lisp (RPL). The HP-28C, released the following year, exposed the language to the user. The HP-28C had a stack size limited only by memory. The stacks of previous calculators could only hold real numbers, but the HP-28C stack could also hold complex numbers, lists, arrays (including vectors and matrices), strings, and programs. On the HP-28C, programs were no longer stored in a separate region of memory. A program could be entered by hitting the @<@@ key, typing in the program, and hitting the @@>@ key. This would push the program onto the stack. Hitting the EVAL key would pop the program from the stack and run it. To keep the program for reuse the program can be stored in a variable like any other type of data.

PostScript

In 1976 John Warnock developed a Forth-like language called Design System Language while he was an employee of Evans & Sutherland. It was used to store a three dimensional image of New York Harbor. In 1978 Warnock joined Xerox PARC. He developed a language called JaM with Martin Newell which was like Design System Language and which was used for VLSI. Xerox PARC had developed the first laster printer in 1969, and JaM evolved into InterPress, a language suitable for describing printer documents. Warnock and Charles Geschke founded Adobe Systems in 1982 where they designed from scratch a language like InterPress called PostScript. The LaserWriter, introduced by Apple in 1985, was the first PostScript enabled printer.

Adobe developed a variant called Display PostScript for on screen graphics. Sun workstations began to use it in 1986 though Sun would later discard it for X11. It was also used by the NeXT workstation starting in 1987.

In 1993 Adobe introduced the Portable Document Format (PDF). Although PDF is built on a subset of PostScript, it does not support branching or looping and hence is not considered a programming language.

Concatenative Languages

Mathematical Foundations of Joy von Thun

The languages described on this page share in common the use of postfix notation and one or more stacks or dictionaries to pass arguments between operators.

In 2000 Manfred von Thun started work on Joy. Programs in Joy are built up from smaller programs using concatenation and quotation. The notation for concatenation was simply to place the two programs to be concatenated next to each other but separated by whitespace. The notation for quotation was to put the program inside square brackets [ ]. As for semantics, running the concatenation of two programs is the same as running the first program and then the second program. Quoting a program does not run the program but instead delays execution in some manner. Presumably there is a primitive operator in the language which can be used to run it at a later time.

Programming languages which have concatenation and quotation are called concatenative languages. Concatenative languages have a relationship to combinatory logic similar to the relationship Lisp has to lambda calculus. PostScript and Factor are concatenative languages. Factor uses the same notation as Joy. PostScript uses curly braces { } instead of square brackets [ ] for quotation.

Forth falls short of being a concatenative language. It has concatenation. Defining a word in Forth can be regarded as quotation. Unlike PostScript and Factor which push the quoted program onto the stack, Forth in effect uses a dictionary of word names to store quotations. In Forth the ' word can be used to push the program associated with a word onto the stack as an execution token, and the execute word can be used to run the execution token.

The critical difference is that quotations in PostScript and Factor can be nested, whereas as far as I can tell word definitions in Forth cannot. In order to be mathematically general the quotation operator needs to operate on arbitrary programs. The closest approximation to a quotation operator in Forth cannot operate on a program that itself has the quotation operator in it.

Introduce term "higher order concatenative language" for a language with both concatenation and quotation?

Factor

Factor appeared in 2003, originally on the JVM.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License