CalicoScheme

From IPRE Wiki
Jump to: navigation, search

Pyjama Scheme is a new implementation of a Scheme-based language for .NET/Mono. It implements many core Scheme functions, but also adds some functionality to bring it into line with the other modern languages like Python and Ruby. The goal is to have a complete environment for teaching, called Pyjama.

Background

Pyjama Scheme is actually written in Scheme. It is written in a special form (continuation-passing style) and then automatically converted through a series of transformations, into C# so that it can be run on .NET/Mono. The generated language could be any language, but for this project we chose Scheme (because we need a Scheme for Pyjama).

Pyjama Scheme is currently under development. Plans:

  1. get a basic scheme working in .NET/Mono [done]
  2. use the Dynamic Language Runtime (DLR) for environments (more on that below)
  3. integrate it with the rest of the DLR

Pyjama Scheme is not a regular scheme and is not meant to be compatible with others. It is designed for education and research.

Extensions

  1. Exceptions - implements try/catch/finally
  2. Modules and namespaces - implements namespaces for imports and foreign objects
  3. DLLs - import libraries written for the CLR
  4. Interop - ways for Scheme to interop with other Pyjama languages

Exceptions

(raise <exp>)
(try <exp>...)
(try <exp>... (catch <sym> <exp>...))
(try <exp>... (finally <exp>...))
(try <exp>... (catch <sym> <exp>...) (finally <exp>...))

There are four main forms of try:

  1. try alone: has no effect---doesn't catch exceptions
  2. try with a catch-clause: if exception in try-body, catch-clause will catch and provide return value, otherwise body of try provides. If exception in catch-clause will simple raise it.
  3. try with a finally-clause: finally-clause will run if exception in try-body or not. If there is an exception in the finally, it will raise it.
  4. try with a catch-clause and finally-clause: if an exception in try-body, the catch-clause will run, followed by the finally-clause. If an exception is raised in the catch-clause too, then finally-clause will run, and then raise the catch-exception; if there is an exception raised in the

finally-clause, then it will raise.

The <sym> of the catch-clause will be bound to the expression raised.

Examples

==> 

> (try x)
(uncaught exception: "unbound variable x")
==> "(try x (catch e e))"
"unbound variable x"
==> "(try x (catch e (raise e)))"
(uncaught exception: "unbound variable x")
==> "(try x (finally 'hi))"
hi 
(uncaught exception: "unbound variable x")
==> "(try x (catch e 1 2 3 4 (finally 'hi))"
hi 
4

Modules

Modules provide an easy method for structuring hierarchies of libraries and code.

Import

(import <string-exp>)
(import <string-exp> '<symbol-exp>)

Examples

==> (import "my-file.ss")

my-file.ss is a Scheme program file, which itself could have imports.

==> (import "my-file.ss" 'my-stuff)

Loads the file, and puts it in the namespace "my-stuff" accessible through the lookup interface below.

Lookup

module.name
module.module.name
==> (import "my-file.ss" 'my-stuff)
==> my-stuff.x
5

Directory Listing

==> (dir)
(- * / + < = > and append apply caaaar caaadr caaar caadar caaddr caadr caar 
cadaar cadadr cadar caddar cadddr caddr cadr call/cc call-with-current-continuation 
car case cdaaar cdaadr cdaar cdadar cdaddr cdadr cdar cddaar cddadr cddar cdddar 
cddddr cdddr cddr cdr cond cons current-time debug dir display env eq? equal? eval 
exit float for-each format get group help import int iter? length let let* letrec 
list list? list->vector list-head list-tail load map member memq newline not null? 
or pair? parse print printf procedure? property range record-case reverse set-car! 
set-cdr! sort sqrt string? string<? string->list string->symbol symbol symbol->string 
typeof using vector vector? vector->list vector-ref vector-set! vector-set!)
==> (dir my-stuff)
(x y z)


DLLs

Use a DLL.

scheme> (using "DLLName")
scheme> (DLLName.Class arg1 arg2)

Interop

Use the define! to put a variable in the global Pyjama namespace.

scheme> (define! x 8)
python> x
8

Wrap a function for use by other Pyjama languages:

scheme> (func (lambda (a b) (+ a b)))

Combine for cross-language interoperation:

 scheme> (define fact (lambda (n) (if (= n 1) 1 (* n (fact (- n 1))))))
 scheme> (define! factorial (func fact))
 python> factorial(5)
 120

Be careful not to wrap the func around the part that is called recursively, or you will destroy the tail-call optimization.


Iterators

Strings, vectors, and lists all work with map and for-each.

==> (map display "123")
123(void void void)
==> (for-each (lambda (v) (printf "~a\n")) (vector 1 2 3))
1
2
3

Help/Doc System

When you define a variable, you can optionally add a doc-string:

==> (define x "This variable holds the sum" 0)
==> (define y 0)
==> (define function "This computes the polynomial..." (lambda (x) ...))

You can lookup the doc-string with:

==> (help 'x)
This variable holds the sum

Command-line

You can run scheme code on the command line:

pyjama --exec examples.ss file2.ss

Misc

get will lookup a symbol in the environment:

==> (get 'get)
#<procedure>

typeof will give you the .NET type of a value:

==> (typeof 1)
System.Int32
==> (typeof 238762372632732736)
Microsoft.Scripting.Math.BigInteger
==> (typeof 1/5)                
Rational

Bugs

  1. Defining a vector returns an uncaught exception and if we call the defined functions name, it returns the first element of vector. FIXED
  2. Bug in (map - '(1 2 3)). Error was in Subtract with one arg. FIXED
  3. (/ 5 1) should give an integer, but gives rational with 1 in denominator. FIXED
  4. (/ 5) should give 1/5 but gives an error. FIXED
  5. Append doesn't work for empty lists, and no error checking. FIXED
  6. Not sure whether is a bug: if (display '(don't try this at home)) returns (uncaught exception: "unexpected character ' encountered") (petite: (don (quote t) try this at home)) Error in reader-cps.ss FIXED
  7. Environment.cs -- is this file necessary? Why is it written in Scheme but labeled .cs? FIXED
  8. Bug in mapping printf in C#; example: (map printf '("hello")). FIXED
  9. No Error for incorrect number of arguments.
  10. PJScheme parser doesn't allow #3(1 2 3) only #(1 2 3)
  11. C# PJScheme doesn't work with #(1 2 3)

Features

  1. Semantics of dotted-names are complex:
    1. First we look up the name ("math.x.y.z") if value found, return it
    2. Next, we lookup "math", then "math.x", "math.x.y", and "math.x.y.z"
    3. If none of those are modules, then it is an error
    4. This makes it impossible to mix dotted names with modules. OK
  2. Changed the dir command to include macros (special forms from define-syntax). So, it now lists "and", "or", "let" etc. Idea by Julia Kelly.

Getting Started

To make changes, you need only edit the Scheme programs ending in -cps.ss. Running make will rebuild the system.

Under the Hood

Pyjama Scheme is written in Scheme in these files:

  • environments-cps.ss - environment
  • interpreter-cps.ss - interpreter
  • parser-cps.ss - parser
  • reader-cps.ss - scanner/lexer
  • unifier-cps.ss - pattern matching (for define-datatype)

These are written in Continuation-Passing Style (CPS).

These files are transformed into C# through a series of stages:

  1. grouped into a single file, pjscheme-cps.ss
  2. converted from continuation-passing style TO data structures, pjscheme-ds.ss
  3. converted from data structures TO register machine, pjscheme-rm.ss
  4. register machine TO C# code, pjscheme.cs

So pjscheme-cps.ss is converted to pjscheme-ds.ss, then pjscheme-ds.ss is converted to pjscheme-rm.ss, and finally pjscheme-rm.ss is converted to pjscheme.cs.

Support for low-level Scheme functions are defined in Scheme.cs.

Testing Scheme in Scheme

As we are writing Scheme in Scheme, you might want to first test your changes by running the Scheme directly in Petite. To do that:

$ petite interpreter-cps.ss
> (start)
===> 

That is, run petite interpreter-cps.ss at the shell prompt, and then (start) at the petite prompt. You'll get the PJ prompt, with PJ running in Petite.

pjscheme.cs

Scheme commands, such as string->list, are primitives, so there isn't any Scheme code that implements it... it is at a lower level. Petite Scheme is probably written in C, so string->list is written in C. So, that means we have to write functions like this manually in C#.

You'll find all of the Pyjama Scheme primitives written so far in Scheme.cs. But, C# can't have hyphens and greater-thans in symbol names. So, there are some common character combinations that are automatically replaced when going from Scheme to C#:

  • -> becomes _to_
  • - becomes _
  •  ? becomes _q
  •  ! becomes _b

So, string->list is a primitive defined in Scheme.cs as string_to_list.

In the Scheme code, k (or k2) represents a continuation, and handler is an error handler (for taking care of exceptions and errors).

There is a very basic trace mechanism in pjscheme.cs. You can set the level of detail with:

==> (debug N)

where N is a number between 0 and 10, inclusive. You can also see the debug level with:

==> (debug)
0

Zero means off (no tracing) and 10 is maximum tracing. The indent level shows the level of trace, which is approximately the low-levelness of the code. We should come up with a better means than this!

Advanced Issues

If you are writing code in CPS fashion, then you should use "define*" to define the function. Otherwise, it is just a regular function, and just use "define".

If you are using a lambda for something other than a function definition, use:

  • lambda-cont: indicates it is a continuation that receives a single value
  • lambda-cont2: indicates it is a continuation that receives 2 values
  • lambda-macro: indicates that this is a macro
  • lambda-handler: indicates that this is an error handler

If you are writing code that maps a new function (one that you have defined in Scheme) you need to add create what is called a "delegate". This is a bit of a hassle for now. If you had a new function named, say, fixstring, and you were mapping that to a list, you need to add:

 public static Proc fixstring_proc =
        new Proc(\"fixstring\", (Procedure1) fixstring, 1, 1);

to the string variable named *code* in the file scheme-to-csharp.ss. The first name should be the name of the function with "_proc" appended. Thus: "fixstring_proc".

NOTE: This is only required for support code (e.g., code that implements the language). It is not necessary for functions in Pyjama Scheme.

Example Extensions

The following examples show you what is necessary to make changes to Pyjama Scheme. After any changes, you should be able to:

make pjscheme

and it should rebuild. If something doesn't seem to rebuild correctly, you can issue a make clean followed by make pjscheme.

To make sure you have the latest, up to date code, do this in the Scheme directory:

svn update
make clean

You may want to do that each session before beginning work.

To make a "patch" to submit, do this:

svn diff > descriptive-name.patch

The "diff" will put all of the differences between the current version and yours (ie, your changes). Someone can then apply your changes to the server's code.

To apply a patch to your own code:

patch -p0 < descriptive-name.patch

That will make all of the changes in the patch file to your copy of files in the directory.

Example Extension: Vector

Example extension #1: Adding a new function to Pyjama Scheme

The first example is adding a new function to the Pyjama Scheme environment in interpreter-cps.ss.

  1. Add the keyword to the keywords
  2. Add the function code to list of functions

For this example, let's add a new function vector that takes a number of arguments and puts them into a vector (eg, an array). It will work like:

==> (vector 'a 'b 'c)
(vector a b c)

This is not a special form (doesn't have special syntax) but is just a new function that does something to a list of evaluated arguments. The first step is to add the keyword vector to the list of words in the environment. This is done in the interpreter-cps.ss file, near line 255:

    (make-initial-environment
     (list 'exit 'eval 'parse 'apply 'sqrt 'print 'display 'newline 'load 'null? 'cons 'car 'cdr
          'list '+ '- '* '/ '< '> '= 'equal? 'eq? 'memq 'range 'set-car! 'set-cdr!
          'import 'get 'call-with-current-continuation 'call/cc
          'reverse 'append 'list->vector 'dir 'current-time 'map 'for-each 'env
          'using 'not 'printf 'vector)

Next, we add the function to the list of functions. Again, in interpreter-cps.ss near line 331:

        ...
	;; printf
        (lambda-proc (args env2 handler k2) (begin (apply printf-prim args) (k2 '<void>)))
        ;; vector
        (lambda-proc (args env2 handler k2) (k2 (apply make-vector args)))
        ...

NOTE: the order of words added in step 1 must match the order of functions here!

In this example, we need merely call make-vector on the arguments and pass it to the continuation (k2).

If you are defining functions that are used by the system, then you need one last step. You will need to add those to scheme-to-csharp.cs in the *code* list, like:

  public static Proc make_vector_proc = new Proc(\"list->vector\", (Procedure1) make_vector, 1, 1);

If you didn't define these explicitly (perhaps they are low-level scheme functions) then you need to implement them. You can write them in C# in Scheme.cs, or write them in Scheme in interpreter-cps.ss and let them get translated. If they are low-level Scheme functions, such as car, you'll need to write them in C#.

Above, the Proc() constructor takes three parameters: the name of the function (this isn't used, except to give feedback to the user in times of error or tracing), the name of the C# function to call, the type of arguments it takes, and the type of return value.

Arguments it takes:

  • -1, means variable length list (one arg, interpreted as a list)
  • 1, means exactly one (interpreted as an object)
  • 2, means exactly two (interpreted as objects)
  • 3, means exactly three (interpreted as objects)

What it returns:

  • 0, means return void (nothing returned)
  • 1, means return object
  • 2, means return boolean

Finally, the cast must match the incoming parameter count and type:

  • Procedure0, takes no arguments, returns object
  • Procedure0Bool, takes no arguments, returns boolean
  • Procedure1, if it takes one argument (variable or not), returns Object
  • Procedure1Bool, takes one argument, returns boolean
  • etc. up to 3

Adding Primitives

There are a large number of missing Scheme primitives (such as round inexact? modulo integer? char->integer make-string vector-length string-set! output-port? current-output-port current-input-port read-char eof-object?). This section describes what is necessary to add them.

1. Working in Calico/languages/Scheme/Scheme

cd Calico/languages/Scheme/Scheme

2. Edit interpreter-cps.ss

This is where Calico Scheme is defined. CPS stands for "continuation-passing style". You can read more about that in "Essentials of Programming Languages" and other places.

3. Define function-prim, where "function" is replaced with the function you are defining:

The basic form is:

(define function-prim
  (lambda-proc (args env2 info handler fail k2)
      (k2 ... fail)))

Where k2 is given the result of the function (and the fail).

We also would like to add some checks to help the user use the function:

(define function-prim
  (lambda-proc (args env2 info handler fail k2)
    (cond
      ((not (length-one? args))
       (runtime-error "incorrect number of arguments to string-length" info handler fail))
      ((not (string? (car args)))
       (runtime-error "string-length called on non-string argument" info handler fail))
      (else (k2 ... fail)))))

Possible functions used in testing: string? number? length-one? length-two?

Any code that you call should not be written in a recursive manner, as we want to eliminate the possibility of using up the stack space in C#.

4. Replace the ... with primitive operations that exist in both C# and Scheme, or with a call to a Scheme function that doesn't exist in C#. You will need to write it in C# and defined in Scheme.cs.

5. Add the symbol for the function and the function-prim in make-toplevel-env

6. Run "make"

This can take some time.

You will need to have petite Chez Scheme installed (and may need to have a link to a library added). Example install:

$ wget http://www.scheme.com/download/pcsv8.4-i3le.tar.gz
$ tar xfz pcsv8.4-i3le.tar.gz
$ cd csv8.4/custom
$ ./configure
$ make
$ sudo make install

May also need:

$ sudo apt-get install indent
$ sudo apt-get install libncurses5-dev
$ ln -s /lib/x86_64-linux-gnu/libncurses.so.5.9 libtinfo.so.5

or

$ sudo ln -s /lib/libncurses.so.5.7 libtinfo.so.5

7. Test (preferably in code that can be re-run later for regression testing)

You should test in petite Scheme and with Calico.

$ petite interpreter-cps.ss
Petite Chez Scheme Version 8.4
Copyright (c) 1985-2011 Cadence Research Systems

Loaded EOPL init file
> (start)
==> "(round 4)"
4
==> "(exit)"
(exiting the interpreter)
goodbye
> 
$ ../../../StartCalico
...

Example of Adding Primitives

Consider adding char->integer and integer->char to Calico Scheme.

First, to interpreter-cps.ss, add the primitive function wrappers. These end with "-prim" and typically just apply the actual primitive to the list containing the single arguments. They test to make sure the arguments are correct (maybe right type or count).

Add to interpreter-cps.ss:

;; char->integer
(define char->integer-prim
  (lambda-proc (args env2 info handler fail k2)
    (cond
      ((not (length-one? args))
       (runtime-error "incorrect number of arguments to char->integer" info handler fail))
      (else (k2 (apply char->integer args) fail)))))

;; integer->char
(define integer->char-prim
  (lambda-proc (args env2 info handler fail k2)
    (cond
      ((not (length-one? args))
       (runtime-error "incorrect number of arguments to integer->char" info handler fail))
      (else (k2 (apply integer->char args) fail)))))

Then, we need to add the -prim functions to the toplevel environment:

(define make-toplevel-env
  (lambda ()
    (let ((primitives 
	   (list
...
	    (list 'char->integer char->integer-prim)
	    (list 'integer->char integer->char-prim)
...

which is a list containing the name we want to call it in Calico Scheme, and the actual primitive function name.

In the C# code, we only need to define the functions. We note that Scheme function names must be changed slightly to be valid C# names. So, we turn "-" to "_", "->" to "_to_", "?" to "_q", "!", "_b", etc. See scheme-to-csharp.ss for additional conversions.

Add to Scheme.cs:

    public static object char_to_integer(object thing) {
	if (thing is char) {
	    return Convert.ToInt32(thing);
	} else {
	    throw new Exception(String.Format("char->integer: invalid character: '{0}'", thing));
	}
    }

    public static object integer_to_char(object thing) {
	if (thing is int) {
	    return Convert.ToChar(thing);
	} else {
	    throw new Exception(String.Format("integer->char: invalid integer: '{0}'", thing));
	}
    }

That is it, except for one complication. Because we applied the functions in our implementation (rather than just calling them), then we need to define a Proc construct that allows C# functions to be used as higher-order functions.

So, we need to add to Scheme.cs:

   public static Proc char_to_integer_proc = new Proc("char->integer", (Procedure1) char_to_integer, 1, 1);
   public static Proc integer_to_char_proc = new Proc("integer->char", (Procedure1) integer_to_char, 1, 1);

The following casts are available:

  • Procedure0 - function is receiving zero arguments, returns an object
  • Procedure1 - function is receiving one argument, returns an object
  • Procedure2 - functions is receiving two arguments, returns an object
  • ProcedureN - funtion receives a variable number of arguments, returns an object
  • Procedure0Bool - function is receiving zero arguments, returns a bool
  • Procedure1Bool - function is receiving one argument, returns a bool
  • Procedure2Bool - functions is receiving two arguments, returns a bool
  • Procedure0Void - function is receiving zero arguments, no return
  • Procedure1Void - function is receiving one argument, no return
  • Procedure2Void - functions is receiving two arguments, no return

This information is also duplicated in the arguments to Proc(). The 3rd parameter is the parameter count (-1 = all, 1 = 1, 2 = 2, etc), and the 4th parameter is the return type (0 = void, 1 = object, 2 = bool).

See source code for more examples.

Trouble Shooting

I have added a function in the -cps Scheme code, but when I try to "make pjscheme" I get the error that "not all code paths return a value". What am I doing wrong?

This can happen when you add a function such as:

(define help
  (lambda (args env handler k)
        (lookup-value  args env handler k)))

to interpreter-cps.ss. This produces the following C# code:

   new public static object help (object args, object env, object handler, object k2) {
      k_reg = k;
      handler_reg = handler;
      env_reg = env;
      variable_reg = args;
      pc = (Function) lookup_value;
   }

If you look closely at the C# function, it doesn't return anything. In fact, the code is just sets some values, as that is what the transformations do: they unroll the recursions and function calls so there is no stack, and no return values. All that help does is call lookup_value. The pc (program counter) will call that function, which gets its arguments from the registers set in here in help.

However, the "public static object help" says that you will return an "object". This is a guess that the transformation makes, but it is wrong. To fix this, you need to manually tell the Scheme-to-CSharp system that the function help does not return anything. You can do that by adding an entry in the list *system-function-signatures* in the file scheme-to-csharp.ss. Like so:

  ...
  (help "void" ())
  ...

This says: "use void as the return type for the function help." Void means that this doesn't return anything. This signature doesn't specify the types of the four arguments to pass in to help (which would be given in the list after "void") because they are all "object", which is the default.

Future Work

Ideas for projects:

  • Make Pyjama Scheme behave more like Python:
    • add iterators and support
      • Try Pyjama Scheme command (range 10)
      • add ways to make other items iterators
      • Try Pyjama Scheme command (for-each function list) ie, (for-each display '(1 2 3)).
  • Hook into the Dynamic Language Runtime (DLR)
    • use the DLR environments
    • use it to load DLLs
      • see the support for modules in Pyjama Scheme (above)
  • Add better debugging support
    • scheme code should know file, line, col of where they are defined
    • scheme functions should know their name (if they have one)
  • Add a help system
  • Add a step function
  • Add some Scheme primitives
    • what's missing?

References

Pyjama

  1. PyjamaDevelopment - plans and details for Pyjama development

Dynamic Language Runtime - DLR

  1. http://compilerlab.members.winisp.net/dlr-spec-hosting.pdf - DLR Hosting Specification
  2. http://blogs.msdn.com/mmaly/archive/2008/01/14/building-a-dlr-language-trees.aspx - Overview of ToyScript
  3. http://blogs.msdn.com/mmaly/archive/tags/Dynamic+Language+Runtime/default.aspx - Aspects of the DLR (read from bottom to top)

Abstract Syntax Tree - AST

  1. http://www.bitwisemag.com/2/DLR-Build-Your-Own-Language
  2. http://devhawk.net/2008/01/29/Practical+F+Parsing+Recursion+And+Predicate+Functions.aspx
  3. http://www.knowing.net/PermaLink,guid,526e4ccd-39e8-4932-b9a4-d15e6c82bfbf.aspx
  4. http://blogs.gotdotnet.com/mmaly/archive/2008/01/14/building-a-dlr-language-trees.aspx