Introduction to the Factor Programming Language


Author: Zackery L. Arnold

CPS 499-03: Emerging Languages, Spring 2017


Key language concepts in Factor

  • Factor = Object-Oriented + Functional + Stack Balancing
  • Factor is an object-oriented language that combines a functional programming style with stack-based data flow to create powerful and expressive programs. [FDSL]
  • Developed by Slava Pestov, a Russian-American programmer who now works at Apple developing the Swift programming language.
  • Heavily influenced by Forth, the progenitor concatenative and stack-based programming language [ABIF].
  • Factor's name comes from the belief that 'factoring out' common basic functions is necessary and useful for expressive concatenative programs.
  • In the concatenative tradition, functions are idiomatically called words, and collections of words are called vocabularies.
  • Higher order words called combinators are essential to minimizing code definitions.
  • All values are objects and objects share generic words that serve a role similar to template functions in other languages.
  • Types are generally enforced only by whether or not a given generic word is defined for the given object's class.
  • The combination of simple syntax and strict rules for managing the stack encourages the programmer to write terse and specific code.
  • Overall, Factor demands that the programmer take responsibility for errors with a 'let it crash' philosophy.
  • Factor is a general-purpose language that is deeply tied to its development environment and features a wide range of mature vocabularies for a variety of use cases.


Core Factor

  • Short and expressive words are always the goal
  • Combinators are encouraged where applicable
  • Stack shuffling is useful but should be minimized
  • Stack effect is required for documentation and practical reasons
  • Objects do not own their own methods, but can have member variables
  • Derived classes are easily obtained through simple set theory operations
  • Generic words encourage safe and effective code reuse, even across classes
  • Lazy evaluation is present; for example, the [a,b] word produces a range object from a to b whose elements are computed on demand rather than stored up front


Concatenative Programming

A concatenative programming language operates by concatenating smaller programs (or functions) together to create a new program. [CPLW]
  • Similar in concept to filter pipelining in Linux/Unix.
  • Evaluations occur by concatenating several functions that all operate on a single piece of data that is transformed along the way.
  • To fit this paradigm, functions in Factor are called words, and the space character is not-so-jokingly called the concatenation operator.
For example, consider the following program, which computes the factorial of 10:
10 [1,b] 1 [ * ] reduce

--- Data stack:
3628800
The number 10 is taken from the stack and used to generate the range 1 to 10, which is then reduced to a single value by multiplying all of its elements together, starting from 1.
This form of programming resembles the filter pipelining heavily used in Linux/Unix. Consider the following program that collects the capital letters from an array of one-character strings:
{ "T" "E" "s" "T" } [ upper? ] filter

--- Data stack:
{ "T" "E" "T" }
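The pipeline idea can be sketched a little further. The example below is only an illustration, and the name sum-of-even-squares is hypothetical: three words are concatenated so that each one consumes the output of the previous one, and the same sequence of words is then factored out into a named word.

```factor
USING: kernel math prettyprint sequences ;
IN: scratchpad

! Each word consumes the result of the word before it, left to right.
{ 1 2 3 4 5 6 } [ even? ] filter [ sq ] map sum .   ! prints 56

! The same pipeline, factored out into a named word (hypothetical name).
: sum-of-even-squares ( seq -- n ) [ even? ] filter [ sq ] map sum ;

{ 1 2 3 4 5 6 } sum-of-even-squares .               ! prints 56
```

Nothing about the pipeline changes when it is given a name; the body of the definition is the original program text.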


Stack-Based Programming

A stack-based programming language loads operands onto a data stack. Words that use the operands pop operands from the stack and push back the results.
  • This is separate from the stack referred to in the context of memory allocation.
  • The data stack is a 'global' region, but scope is effectively introduced through the concept of stack effect.
  • The data stack may be manipulated using certain words, such as dup that duplicates the item on the top of the stack, or swap which swaps the top and second-from-top items.
For example, consider the following Factor sequence that evaluates (1 - (2 * (3 + 4))):
IN: scratchpad 1 2 3

--- Data stack:
1
2
3
IN: scratchpad 4 + * -

--- Data stack:
-13
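As a small additional sketch, the two manipulation words mentioned above can be combined with arithmetic. The values are arbitrary and chosen only for illustration.

```factor
USING: kernel math prettyprint ;

! dup copies the 7, then * multiplies the two copies.
7 dup * .       ! prints 49

! swap reorders 3 10 into 10 3, then - computes 10 - 3.
3 10 swap - .   ! prints 7
```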


Stack Effects

A stack effect is a part of a word declaration in Factor that serves for both documentation and very basic argument pattern matching.
  • Stack effects are declared using the '( input ... -- output ... )' notation.
  • All words defined with : must include an explicit stack effect declaration.
  • Stack effects are checked during compilation, and a definition whose body does not match its declared effect produces an error.
Consider the following formal definition of a factorial function:
: fact ( n -- n! ) [1,b] 1 [ * ] reduce ;
The stack effect '( n -- n! )' above notes that an item 'n' at the top of the stack will be consumed and replaced by an item 'n!'.
(Remember that tokens are separated by spaces and individual characters have no special meaning inside a token, so 'n!' is simply a name.)
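A minimal sketch of two more declarations follows. The word names squared and sum-and-diff are hypothetical and exist only to show how the inputs and outputs named in the effect line up with what the body does.

```factor
USING: kernel math prettyprint ;
IN: scratchpad

! ( x -- y ): one item consumed, one item produced.
: squared ( x -- y ) dup * ;

! ( a b -- sum diff ): two items consumed, two items produced.
: sum-and-diff ( a b -- sum diff ) [ + ] [ - ] 2bi ;

5 squared .             ! prints 25
10 3 sum-and-diff . .   ! prints 7, then 13 (top of the stack first)
```

The names inside the effect are documentation only; what the compiler compares against the body is the number of inputs and outputs.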


Stack Checker (Error Detection)

The stack checker is a tool built into the Factor Listener that additionally checks for and enforces consistency in the execution of words.
Specifically, the stack checker ensures that words with branching control flow have the same general stack effect across all branches.
  • In practice, this means that a word will always leave the stack at the same height after execution.
  • Pestov and the other authors argue that stack checking is necessary to reduce the chance of hard-to-detect bugs caused by inconsistent stack effects.
  • Additionally, without stack checking it is easy for the programmer to miss edge-case conditions during unit testing.
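The branch rule can be sketched with a small conditional word. The word abs-diff is a hypothetical example; both of its branches consume two numbers and leave one, so the declared effect holds on every path.

```factor
USING: kernel math prettyprint ;
IN: scratchpad

! 2dup > leaves a b and a boolean; each branch then turns a b into one number.
: abs-diff ( a b -- n ) 2dup > [ - ] [ swap - ] if ;

3 10 abs-diff .   ! prints 7
10 3 abs-diff .   ! prints 7

! By contrast, a definition like the one below is expected to be rejected,
! because one branch leaves two values and the other leaves one:
! : unbalanced ( x -- y ) 0 > [ 1 2 ] [ 3 ] if ;
```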


Stack Shuffling

Data stacks have a unique challenge when it comes to visualizing the hierarchy of data throughout a program.
In some situations the programmer will find that they need to reorganize items on the stack in order to accomplish some task.
For instance, consider the following word, which determines whether b is a multiple of a:
: multiple? ( a b -- ? ) swap divisor? ; inline
This word takes advantage of the divisor? word, which has the stack effect ( m n -- ? ) and tests whether n divides m. After all, if b is a multiple of a then a must be a divisor of b.
To apply this straightforward observation, the word 'swap' exchanges the two items on the top of the stack so that divisor? receives its arguments in the order it expects.
Some common stack shuffling words and their stack effects are:
  • swap ( x y -- y x ) simply swaps the two topmost items on the stack.
  • dup ( x -- x x ) duplicates the topmost item on the stack.
  • drop ( x -- ) removes the item on the top of the stack.
  • nip ( x y -- y ) removes the item just below the topmost stack item.
  • over ( x y -- x y x ) duplicates the second item from the top and places it on the top.
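The effects listed above can be observed directly. In this sketch the results are collected into arrays with 2array and 3array only so that the order of the stack items is visible when printed.

```factor
USING: arrays kernel prettyprint ;

1 2 swap 2array .   ! prints { 2 1 }
1 dup 2array .      ! prints { 1 1 }
1 2 drop .          ! prints 1
1 2 nip .           ! prints 2
1 2 over 3array .   ! prints { 1 2 1 }
```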


Combinators

Combinators are higher-order words that take other words as arguments and apply them to the data stack in a particular way.
Some common combinators include:
  • dip takes a word as an argument, saves the topmost item of the stack, applies the word to the remaining stack, and then returns the saved item to the top of the stack.
  • bi takes an item off the stack and two additional words. It then applies each word to the item, resulting in two separate results that are placed on the stack.
  • tri extends bi by applying three words to a given item, resulting in three items on the stack.
  • cleave is the ultimate extension of the previous words, taking a sequence of any number of words and leaving one result for each.
  • reduce takes a list, an initial accumulator value, and a word, and applies the word to the accumulator and each element of the list in turn, with the result of each step becoming the new accumulator.
  • map takes a list and a word and produces a new list obtained by applying the word to each element of the original list.
  • filter takes a list and a predicate and returns the items of the list that successfully match the predicate.
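Each combinator in the list above can be tried with a one-line sketch. The words passed in are written as quotations (covered in the next section), the values are arbitrary, and multiple results are gathered into arrays purely for display.

```factor
USING: arrays combinators kernel math prettyprint sequences ;

! dip: 2 is set aside while [ 10 + ] runs on the 1 beneath it.
1 2 [ 10 + ] dip 2array .                              ! prints { 11 2 }

! bi, tri, cleave: several quotations applied to the same item.
5 [ 1 + ] [ 2 * ] bi 2array .                          ! prints { 6 10 }
5 [ 1 + ] [ 2 * ] [ sq ] tri 3array .                  ! prints { 6 10 25 }
5 { [ 1 + ] [ 2 * ] [ sq ] [ neg ] } cleave 4array .   ! prints { 6 10 25 -5 }

! reduce, map, filter: combinators over sequences.
{ 1 2 3 4 } 0 [ + ] reduce .                           ! prints 10
{ 1 2 3 } [ sq ] map .                                 ! prints { 1 4 9 }
{ 1 2 3 4 } [ even? ] filter .                         ! prints { 2 4 }
```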


Quotations

Quotations are anonymous functions in Factor that can be used as arguments for various combinators.
  • Quotations are denoted through the use of the [ and ] parsing words.
  • Like defined words, quotations must have consistent stack effects for successful compilation.
  • They may be partially applied through a specific syntax. For example, 10 '[ _ = ] produces the quotation [ 10 = ].
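A short sketch of quotations in use follows. It assumes the '[ syntax is available from the fry vocabulary, and it uses the call word to apply a quotation to the current stack.

```factor
USING: fry kernel math prettyprint sequences ;

! call applies a quotation to the current stack.
2 [ 3 + ] call .              ! prints 5

! '[ _ ... ] fills the _ slot with a value taken from the stack.
10 '[ _ = ] .                 ! prints [ 10 = ]

! The partially applied quotation can be handed to a combinator.
{ 1 2 3 } 10 '[ _ + ] map .   ! prints { 11 12 13 }
```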


Vocabularies

Vocabularies are collections of words, similar to libraries and namespaces in other languages.
  • Vocabularies made by the user are traditionally placed in the work subdirectory of the Factor installation.
  • Each vocabulary must explicitly reference any other vocabularies it relies on in its header.
  • A vocabulary may include words that are labeled private, but the user can still use these words by asking for the private word collection explicitly.
  • With explicit referencing, Factor vocabularies are only included and compiled when needed, allowing for exceptionally compact deployed programs compared to some other languages.
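The points above can be sketched as a single source file. The vocabulary name greeter, its file path, and both word names are hypothetical; the layout assumes the usual convention that a vocabulary named greeter lives in work/greeter/greeter.factor.

```factor
! work/greeter/greeter.factor (hypothetical vocabulary)
USING: io kernel sequences ;
IN: greeter

<PRIVATE

! Helper word; it ends up in the greeter.private vocabulary.
: greeting ( name -- str ) "Hello, " prepend ;

PRIVATE>

! Public word: builds the greeting and prints it on its own line.
: greet ( name -- ) greeting print ;
```

With the file in place, USE: greeter followed by "Factor" greet should print Hello, Factor. The helper stays out of the public vocabulary unless a user asks for greeter.private explicitly.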


Classes

Factor is an object-oriented language where every value is an object, that is, an instance of some class. Factor has three types of classes:
  • Primitive classes are used for basic data, such as numbers, strings, and others. These classes may not be subclassed.
  • Tuple classes are more complicated classes that support instance variables.
  • Derived classes are classes defined in terms of existing classes. They come in several kinds, such as predicate, union, and intersection classes.
A unique result of Factor's object-oriented implementation is that classes do not own any intrinsic words.
This means that objects are fully separated from the words that operate upon them and their variables.
Instead, generic words are used. Generic words are defined to operate on a wide number of classes.
New definitions for a new class are provided by the programmer. Generic words are thus comparable to template class functions in other languages like C++.
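The separation between classes and words can be sketched as follows. All class, slot, and word names here are hypothetical: two tuple classes are defined, a union class is derived from them, and a generic word area receives one method per class.

```factor
USING: accessors kernel math prettyprint ;
IN: scratchpad

! Tuple classes with instance variables (slots).
TUPLE: rect width height ;
TUPLE: right-triangle base height ;

! C: defines a constructor that fills the slots from the stack in order.
C: <rect> rect
C: <right-triangle> right-triangle

! A derived (union) class built from the two tuple classes.
UNION: shape rect right-triangle ;

! A generic word with one method per class; the classes own no words.
GENERIC: area ( shape -- n )
M: rect area [ width>> ] [ height>> ] bi * ;
M: right-triangle area [ base>> ] [ height>> ] bi * 2 / ;

3 4 <rect> area .             ! prints 12
6 8 <right-triangle> area .   ! prints 24
3 4 <rect> shape? .           ! prints t
```

Supporting a further class would only require another M: definition for area; the existing classes and methods would not need to change.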


Development Workflow

Factor programs are written and evaluated through an interactive development environment called the Factor Listener.
  • Source code and vocabularies are compiled into a program image.
  • The running state of Factor can be saved into an image file and restored for easy access at a later time.
  • This gives the language the performance benefits of compilation as well as some of the flexibility of an interpreted language.
  • The scaffold vocabulary is available for easy generation of source code files.
  • The documentation of Factor is built into the interface through a button on the Listener.
  • Basic tools also exist for supplying new documentation for custom words, vocabularies, and classes from within the Listener.
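A few of these steps can be sketched as Listener input. This assumes the scaffold-work word from tools.scaffold, the help word from the help vocabulary, and the save word from the memory vocabulary; the vocabulary name homework is only an example.

```factor
USING: help kernel memory tools.scaffold ;

! Generate skeleton source files for a new vocabulary in the work directory.
"homework" scaffold-work

! Look up the built-in documentation for a word.
\ dup help

! Save the current state of the session to the image file.
save
```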


Useful Links & Resources


Exercises

The following are some programming exercises that incorporate some essential Factor concepts:
  • Define a word caesar in a vocabulary homework that takes a string of alphabetical characters and an integer offset and shifts each character in the string by that offset. Your solution must handle both positive and negative offset values. Only include uppercase and lowercase alphabetical characters in the output. For example, ABCD with an offset of 4 should produce EFGH. Factor your solution into a primary word caesar and set any helper words as private. This problem can be solved in fewer than 10 lines of code.

    Examples:
    > "SEESPOTRUN" 26 caesar .
    "seespotrun"
    
    > "SEESPOTRUN" -26 caesar .
    "seespotrun"
    
    > "ABCDEFG" 7 caesar .
    "HIJKLMN"
    
  • Define a new tuple class novel that represents a fictional literary work. Include member variables that correspond to the novel's title, author, genre, publisher, year of publication, and an identification number. Use strings for the first four variables and integers for the last pair. Include a constructor that takes values for each variable as arguments and sets them automatically. Also include a word book-print that takes a novel as an argument and prints the novel's details in the following simple format:
    Class: Novel
    Title: < title >
    Author: < author >
    Genre: < genre >
    Publisher: < publisher >
    Year: < year >
    ID: < number >
    
    Examples:
    > "The Lion, the Witch, and the Wardrobe" "C. S. Lewis" "Fantasy" "Geoffrey Bles" 1950 1 <novel>
    
    --- Data stack:
    T{ novel f "The Lion, the Witch, and the Wardrobe" "C. S. Lewis" "Fantasy" "Geoffrey Bles" 1950...
    
    > book-print
    Class: Novel
    Title: The Lion, the Witch, and the Wardrobe
    Author: C. S. Lewis
    Genre: Fantasy
    Publisher: Geoffrey Bles
    Year: 1950
    ID: 1
    
  • Extend the solution to problem 2 to two new classes of books - textbook and article. For textbooks, change the 'genre' field to 'subject'. For article, change the 'genre' field to 'discipline' and add 'journal' and 'volume' fields. Then, with the three classes, define a mixin class called library that will represent the union of different types of books. Adjust the book-print word from before to be generic with templates for each type of book. Publish the library and each class in a library vocabulary.

    Examples:
    > USE: library
    > "The C Programming Language" "Ritchie, D. and Kernighan, B." "Computer Science" "Prentice Hall" 1988 2 <textbook>
    
    --- Data stack:
    T{ textbook f "The C Programming Language"...
    
    > book-print
    Class: Novel
    Title: The C Programming Language
    Author: Ritchie, D. and Kernighan, B.
    Subject: Computer Science
    Publisher: Prentice Hall
    Year: 1988
    ID: 2
    
    > "Factor: A Dynamic Stack-based Programming Language" "Pestov, S." "Computer Science" "ACM SIGPLAN Notices" 45 "ACM Press" 2010 3 <article>
    
    --- Data stack:
    T{ article f...
    
    > book-print
    Class: Novel
    Title: Factor: A Dynamic Stack-based Programming Language
    Author: Pestov, S.
    Discipline: Computer Science
    Journal: ACM SIGPLAN Notices
    Volume: 45
    Publisher: ACM Press
    Year: 2010
    ID: 3


References

