CYTHON DOCUMENTATION

(created 00-04-06)
(last update 00-10-23)







What is Cython?

Cython and Pyrl are fancy preprocessors that make C, C++ or Perl look more Pythonish. They let indentation control the flow and form the blocks. This means no curly brackets and no semicolons. It also means you have to be very careful not to miss a space or a tab somewhere. This sounds harder (does it sound hard at all?) than it is. With a good editor, no problems will arise.
Cytoc is the 'compiler' that's used for C++ and C files.
Apart from the syntax there are some small added features too.

Why?

Because I find it tedious to type semicolons and curly brackets, and I find code without them much prettier on screen. Especially in Perl, which I like for quickly throwing things together, I find these annoying 'unnecessary characters' to be a nuisance.

What do they do?

Cython translates .cy, .cyp and .cyh files, passed via the commandline or listed in the project file, to .c, .cpp and .h files. After translating each file it calls the compiler (if one is stated in the project file) with the .c/.cpp file. After all files are compiled (to C/C++ and to .o files) the linker is called (again, if one is specified in the project file).
Code can be generated line for line so that error messages give the right line in the .cy(p) source. If you prefer prettier C/C++ code, the output from the compiler is "translated" to give the right line number in the Cython source (if the compiler is gcc, lcc, or something that spits out warnings/errors in a similar manner...).
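
The translation of compiler output could look roughly like the Python sketch below. It is not taken from cytoc itself (which is written in Perl); the function name and the line_map table are hypothetical, and it assumes gcc-style "file:line: message" diagnostics.

```python
import re

def translate_diagnostic(line, c_name, cy_name, line_map):
    """Rewrite 'file.c:LINE: message' so it points at the Cython source.

    line_map is a hypothetical table that maps a generated C line number
    to the Cython line it came from.
    """
    match = re.match(r"^(?P<file>[^:\s]+):(?P<line>\d+):(?P<rest>.*)$", line)
    if not match or match.group("file") != c_name:
        return line  # not a diagnostic for this file; pass it through
    c_line = int(match.group("line"))
    cy_line = line_map.get(c_line, c_line)  # fall back to the C line number
    return f"{cy_name}:{cy_line}:{match.group('rest')}"
```

Lines that do not look like a diagnostic for the generated file are passed through unchanged.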

HELP! I need help figuring out how to catch output from compilers in Windows/DOS. Anyone know how?
Also: There's a bug; if run in Windows, cytoc thinks that the files have changed every time, and thus recompiles all files every time cytoc is called. I guess stat() doesn't work the same way in Windows Perl and Linux Perl?


Commandline options

(Following pretty much cut n paste from --help output)
Everything inside [ and ] is optional; all parameters can thus be shortened to one or two chars.

cytoc [-v[ersion]] [-h[elp]] [-t[ab]=n] [-f[orce]] [-k[eepfiles]] [-p[roject]] [-pr[etty]] [-tr[anslate][=0|1]] [-de[bug][=0|1]] [-d[ontStopAtErrors]] [filename.cy[ph]]

-version spits out the version number.
-help writes this text.
-tab=n sets tabsize to n.
-force forces a compile even if a file is uptodate.
-keepfiles makes cytoc leave the generated .c or .cpp file after 'compilation', useful when using gdb or when the C/C++ files are needed for any other reason.
-project makes cytoc use the projectfile.y (although it will anyway).
-pretty generates 'pretty' C/C++ code. If chosen, lines in the C/C++ source will not correspond to the same lines in Cython. This can be helped with:
-translate=n turns on or off translation of compiler output, so that errors/warnings refer to the .cy(p) file and line numbers are correct even in pretty mode.
-debug=n n being 1 or 0, turns on or off debugging output in your application where you've used _DBG or DBG.
-dontStopAtErrors makes cytoc continue compiling files even after coming upon a compile error in one of the files.
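
How the shortened option names might be resolved can be sketched in Python. The minimum prefixes come from the synopsis above; the matching rule itself (prefer the option with the longest required prefix) is an assumption, not something read out of cytoc.

```python
# (minimum prefix, full name) pairs taken from the synopsis above.
OPTIONS = [
    ("v", "version"), ("h", "help"), ("t", "tab"), ("f", "force"),
    ("k", "keepfiles"), ("p", "project"), ("pr", "pretty"),
    ("tr", "translate"), ("de", "debug"), ("d", "dontStopAtErrors"),
]

def resolve_option(arg):
    """Return (full_name, value) for an argument such as '-t=4' or '-pr'."""
    name, _, value = arg.lstrip("-").partition("=")
    candidates = [
        (len(prefix), full)
        for prefix, full in OPTIONS
        if name.startswith(prefix) and full.startswith(name)
    ]
    if not candidates:
        raise ValueError(f"unknown option: {arg}")
    # Prefer the option whose required prefix is longest, so '-pr' means
    # 'pretty' rather than 'project'.
    return max(candidates)[1], (value or None)
```

With this rule "-p" and "-pro" select project, while "-pr" and "-pre" select pretty.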

Cython syntax:

It's simple! It's just like C or C++, with the difference that there are no "{" or "}" brackets and no ";" (semicolons). (To be completely honest, you can enter them anyway if you want, to clarify some section in the code for instance.)
Blocks are determined by indentation alone, much like in Python (which is why it's called Cython... C & Python). I'm not making a claim that it's very Pythonish; the only similarity is that indentation decides blocks.
Important: you should use an editor you can trust, one that doesn't throw in spaces or tabs at will. A single missing space can completely change how the program executes. Use tabs all the way, or soft tabs all the way, and you should be safe. Also, if the lines look like they line up, they probably do, provided the tabstop setting is set to the right number of spaces.


Comparison, plain C(++) vs. Cython:

In C:
void blabla (int b,  char *bla) {
	if (bla) {
		cout << "Dada: " << bla << endl;
		a = b+5;
	}
	printf("The good ol' printf function.\n");
}

And in Cython:
void blabla (int b,  char *bla)
	if (bla)
		cout << "Dada: " << bla << endl
		a = b+5
	printf("The good ol' printf function.\n")
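
The idea behind this translation can be sketched in a few lines of Python. This is only an illustration, not how cytoc is implemented: the function name is made up, and it assumes tab-only indentation with no continuation lines, switch statements, comments or preprocessor lines.

```python
def to_c(source):
    """Translate tab-indented, brace-free code into braces and semicolons."""
    lines = [ln for ln in source.splitlines() if ln.strip()]
    depths = [len(ln) - len(ln.lstrip("\t")) for ln in lines]
    out = []
    for i, (line, depth) in enumerate(zip(lines, depths)):
        next_depth = depths[i + 1] if i + 1 < len(lines) else 0
        text = line.strip()
        if next_depth > depth:
            # The next line is indented deeper, so this line opens a block.
            out.append("\t" * depth + text + " {")
        else:
            out.append("\t" * depth + text + ";")
        # Close every block that ends after this line.
        for d in range(depth - 1, next_depth - 1, -1):
            out.append("\t" * d + "}")
    return "\n".join(out)
```

A line opens a block when the next line is indented deeper; otherwise it is an ordinary statement and gets a semicolon.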

If you like to split up a statement over several lines you have to put a "\" at the end of the lines.
ex in c:
if (ladidadi && veeeeeeerrryyylongvariable &&
	(bladibla || dobidida) &&
	(evenorestuff != otherstuff)
	) {
	blabla();
}
in cython:
if (ladidadi && veeeeeeerrryyylongvariable && \
	(bladibla || dobidida) && \
	(evenorestuff != otherstuff) \
	)
	blabla()
Bad example I guess, but anyway..
One important thing is that case: statements have to be lined up with the switch statement; the rest has to be indented one step.
example:
switch (a)
case 1:
        doThis()
	break
case 2:
        b = c * d
	break


Further control of indent and blocks

By using a colon (:) alone on a row, you can set indent "without leaving a trace". It can be used to make it clearer where a block ends for example, so that the code is easier to read if there are deep indents.

By writing a colon followed by a less-than or greater-than symbol (<, >) you can 'fool' cytoc into thinking that the lines after it are indented either more or less than they really are. The number of </>'s decides the change in indent value. Any number of these 'foolers' can be entered, in any order.

Example:

sub void mcbTargettingMissile (Obj &o)
:>
switch(ObjAction)
case OACTION:
:>
switch(o.state)
case OUTOFFUEL:
        blablabla
        break
case SEEKING:
	yadayada
        break
:<
        break
case OINIT:
        o.speedx = 3.0
        o.speedy = 0
	o.state = SEEKING
        break
:<
instead of
sub void mcbTargettingMissile (Obj &o)
	switch(ObjAction)
	case OACTION:
		switch(o.state)
		case OUTOFFUEL:
		        blablabla
		        break
		case SEEKING:
			yadayada
		        break
	        break
	case OINIT:
	        o.speedx = 3.0
	        o.speedy = 0
		o.state = SEEKING
	        break
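
The effect of the 'foolers' can be pictured as a running shift added to each line's real indent. The Python sketch below is an illustration only: the function name is hypothetical, it assumes one tab per indent level, and it leaves out the lone ":" marker.

```python
import re

def effective_depths(lines):
    """Yield (depth, text) pairs, applying ':>' / ':<' indent shifts."""
    shift = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if re.fullmatch(r":[<>]+", text):
            # Each '>' pretends one more level, each '<' one less.
            shift += text.count(">") - text.count("<")
            continue
        depth = len(line) - len(line.lstrip("\t"))
        yield depth + shift, text
```

Feeding it the first lines of the example above would report switch(ObjAction) at depth 1, even though it is written at the left margin.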


"Extended features"

_DBG string/vars

To ease debugging via printf statements (if you still use that ancient method [I do ;) ] ):

_DBG string with %dvariables% baked in, very %snice%
translates to:

fprintf (stderr, "string with %d baked in, very %s", variables, nice);
It should be easy to see how it works..
To print a "%" simply type "%%" (like usual).

Whether this reaches the final code depends on the -debug flag.
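
The rewrite from _DBG to fprintf can be sketched with a regular expression. This Python version is an illustration, not cytoc's own code; it assumes every conversion is a single letter (d, s, f, ...), since how longer specifiers such as %ld are split is not described here.

```python
import re

# '%%' is a literal percent; '%<letter><expression>%' is an embedded value.
_FIELD = re.compile(r"%%|%([A-Za-z])([^%]+)%")

def translate_dbg(line):
    """Turn '_DBG text %dvar% ...' into an fprintf (stderr, ...) call."""
    indent = line[: len(line) - len(line.lstrip())]
    text = line.strip()[len("_DBG"):].strip()
    args = []

    def replace(match):
        if match.group(0) == "%%":
            return "%%"  # literal percent stays escaped for fprintf
        args.append(match.group(2).strip())
        return "%" + match.group(1)

    fmt = _FIELD.sub(replace, text)
    tail = "".join(", " + arg for arg in args)
    return f'{indent}fprintf (stderr, "{fmt}"{tail});'
```

The expressions between the percent signs are collected in order and appended as arguments after the format string.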

DBG line

Another feature is that if you write DBG at the very beginning of a line, that line will be controlled by the -debug flag.
example:
DBG	if(!memPtr) BREAKPOINT()
would become
	if(!memPtr) BREAKPOINT()
if debug output is turned on (-debug=1 or -debug in the project file or on the commandline, or "#debug on" in a source file..), otherwise (if -debug=0) it would be rendered
//DBG	if(!memPtr) BREAKPOINT()
No extra space should be entered after the DBG keyword, just the ones that should be there for the indentation to be correct. DBG itself should be put at the immediate start of the line.
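
In Python the rule for DBG lines could be written as below. The function name is made up, and the check that whitespace follows the keyword is an assumption.

```python
def apply_dbg(line, debug):
    """Keep or comment out a line that starts with the DBG keyword."""
    if not (line.startswith("DBG") and line[3:4].isspace()):
        return line  # not a DBG line; leave it alone
    if debug:
        return line[len("DBG"):]  # drop the keyword, keep the indentation
    return "//" + line
```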

sub, export

Now, one really cool feature:
If you type sub before a function (it has to be at the immediate beginning of a line) or export in front of a global variable, those will be 'fetched' and inserted into a global interface header.
example:
export char *stringToPrint;

sub void hello (void) {
	cout << stringToPrint << endl;
}
To turn this feature on you have to add "globalHeader: filename" to the projectfile.y (read more in the project section).
This way you'll have all functions exported automatically, just include the global header. No more forgetting to change the prototype declaration after changing the datatypes of parameters and returnvalues!
It might be a good idea to make another global header which includes all headers with classes, structs etc. that the functions in the global header use, and then include the global header after them in 'your own' global header..
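
Collecting the marked lines into a header might look like this Python sketch. It is an illustration under assumptions: the function name is hypothetical, and emitting variables as "extern" declarations is a guess at what the generated header contains.

```python
def collect_interface(source):
    """Collect declarations for lines marked with 'sub' or 'export'."""
    declarations = []
    for line in source.splitlines():
        if line.startswith("sub "):
            # A function definition: keep the signature, end it with ';'.
            signature = line[len("sub "):].rstrip().rstrip("{").rstrip()
            declarations.append(signature + ";")
        elif line.startswith("export "):
            # A global variable: declare it 'extern', without initializer.
            variable = line[len("export "):].split("=")[0].strip().rstrip(";")
            declarations.append("extern " + variable.strip() + ";")
    return "\n".join(declarations)
```

Only lines that start with the keyword are picked up, which matches the rule that sub has to be at the immediate beginning of a line.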

Directives


In the first three lines you have the option to write:
#tab 4
to change the tabstop value for the file. This overrides the tab value set in the project.y file or passed via commandline; it only changes it for this file, not the others. Of course the value can be anything, not just 4 or 8.

#debug on
to turn debug on ("#debug on") or off ("#debug off") for this file, this overrides the value in project.y and the one passed via commandline, only for this file.

Some small features are that #include can be written as #i and #define as #d.
Also, you don't have to enclose the string in quotes when using the #i statement. If you want to use <> however, you have to enter them.

Example:
 #i <iostream.h>
 #i blabla.h
 #d APE 53
which would become
#include <iostream.h>
#include "blabla.h"
#define APE 53
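
The shorthand expansion is simple enough to sketch in Python. The function name is made up, and dropping the leading whitespace is an assumption based on the example above.

```python
def expand_directive(line):
    """Expand the '#i' and '#d' shorthands into '#include' and '#define'."""
    parts = line.strip().split(None, 1)
    keyword = parts[0] if parts else ""
    rest = parts[1].strip() if len(parts) > 1 else ""
    if keyword == "#i":
        if not rest.startswith(("<", '"')):
            rest = '"' + rest + '"'  # bare names become quoted includes
        return "#include " + rest
    if keyword == "#d":
        return "#define " + rest
    return line  # anything else is left untouched
```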


Possible future features

I'm thinking about adding some vector template as an "extension" so that one can use:
foreach (a, vectorlist)
	do something with a
which would equal something like
vectorlist.resetposition();
while (vectorlist.each(&a) == true) {
	do something with a;
}
I guess there is some vector template in STL, but I've never used anything of that, and I guess people would think it's stupid of me to include my own Vector template.. It would of course be even better to not use a template/class, but instead only structs and C functions so it can be used in C too.
Of course, if you don't use it, you don't have to care what it produces..

If anyone has any thoughts about this,
mail me.


Cython: project.y file format:

Some of the examples below are taken from a GameBoy Colour game I coded hastily in Cython, so don't get confused by the strange examples; you should easily see how they apply to gcc or any other compiler...

A '#' followed by a space, first in a row, means that what follows is a list of args to cytoc. The syntax for the arg list is just the same as if you had passed them directly to cytoc. Any args passed to cytoc directly on the commandline override the settings here.

Example:
# -t=4 -debug=0
Here tabsize would be set to 4, debug to off (that is; any rows in cython code beginning with DBG and any _DBG statements would be commented away before compilation.)


compiler: compiler string, %f = file, %o = file with .o ending.
example:
compiler: gcc -Wall -c -o %o %f
globalHeader: filename.h, if specified, generates a "global interface header", with the name you entered here, containing all functions and global variables that have been marked with "sub" and "export", in all source files in this compile block. NOTE: I'm going to change this to be applied to the 'link block', instead of just one 'compile block', i.e. all "files:" definitions up to the next "linker:"
example:
globalHeader: globalInterface.h
files: list of files
example:
files: main.cyp npc.cyp graphics.cy blit.s
Here one simply adds the files that should be compiled with the above compiler statement. .cy, .cyp, .c, .cpp, .s, .whatever, etc. files can be mixed freely.
If another compiler statement is needed for some files, simply add another "compiler:" line, and another "files:" after. There can be any amount of repetitions of "compiler:", "files:", and also, "compiler:", "files:", "linker:", ... (See 'complex' example further down)

linker: linker string
%o = all the object files (in this subsection.. more compiler/files statement can follow this, but they will be handled by the next linker line..)
%f = inserts all files in filesection with their original endings, so you can compile and link on one row if you want.
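
How the placeholders expand can be sketched in Python. This is an illustration, not cytoc's code: the helper names are made up, it assumes that %f in a compiler string means the generated .c/.cpp file for Cython sources, and it only handles %o in the linker string.

```python
import os

# Endings of Cython sources and the files they are translated to.
TRANSLATED = {".cy": ".c", ".cyp": ".cpp"}

def object_name(filename):
    """Return the file name with its ending replaced by .o."""
    return os.path.splitext(filename)[0] + ".o"

def compiled_name(filename):
    """Return the name handed to the compiler for this file."""
    base, ending = os.path.splitext(filename)
    return base + TRANSLATED.get(ending, ending)

def compile_commands(compiler, files):
    """Expand %o and %f once per file in a 'compiler:' string."""
    return [
        compiler.replace("%o", object_name(f)).replace("%f", compiled_name(f))
        for f in files
    ]

def link_command(linker, files):
    """Expand %o to all object files in a 'linker:' string."""
    objects = " ".join(object_name(f) for f in files)
    return linker.replace("%o", objects)
```

Files with other endings, such as .s, are passed to the compiler under their own name.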

run: statement to be run, IF all files compiled and linked without errors, ex:
run: vgb-debug -scale 2 -sync 60 -verbose 32 program.gb
or simply
run: ./thisNiceFreshlyCompiledProgram
or just anything you can run.
You can have several run: commands.
Simple example:
compiler: lcc -Wa-l -c -o %o %f
files: nibbles.cy gfx.cy asmtest.s
linker: lcc -Wl-m -Wl-yt2 -Wl-yo4 -Wl-yp0x143=0cy0 -o nibbles.gb %o

More complex:

#tab 8
// sets default for all files that don't override with their
// own #tab.. (8 is already the default so this line is really
// unnecessary in this case..)

compiler: /usr/lib/gbdk/bin/lcc -Wa-l -c -o %o %f
files: nibbles.cy \
	gfx.cy \
	asmtest.s

compiler: /usr/lib/gbdk/bin/lcc -Wa-l -Wf-bo1 -c -o %o %f
files: maps1.cy

compiler: lcc -Wa-l -Wf-bo2 -c -o %o %f
files: maps2.cy

linker: lcc -Wl-m -Wl-yt2 -Wl-yo4 -Wl-yp0x143=0cy0 -o nibbles.gb %o

run: vgb-debug -scale 2 -sync 60 -uperiod 2 -verbose 32  nibbles.gb


Several programs in one project:
compiler: gcc -Wall -c -o %o %f
files: test.cy util.cy
linker: gcc -o test %o

compiler: gcc -Wall -O2 -c -o %o %f
files: blabla.cyp
linker: gcc -o blabla %o


You can shorten the names of the project directives by using only their initial letter, ex:
c: gcc -Wall -c -o %o %f
f: test.cy util.cy
l: gcc -o test %o

c: gcc -Wall -O2 -c -o %o %f
f: blabla.cyp
l: gcc -o blabla %o
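
Reading the directives, long or shortened, could be sketched in Python like this. The function name is hypothetical. Only c, f and l appear in the example above, so accepting r and g for run and globalHeader is an assumption. Option lines, comments and backslash continuations are left out.

```python
DIRECTIVES = ["compiler", "files", "linker", "run", "globalHeader"]

def parse_project(text):
    """Return a list of (directive, value) pairs from project-file text."""
    entries = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if not sep or not key:
            continue  # blank line or no 'name:' prefix
        for name in DIRECTIVES:
            # Accept the full name or just its initial letter.
            if key == name or key == name[0]:
                entries.append((name, value.strip()))
                break
    return entries
```

Keeping the entries in file order preserves the grouping, since each linker line closes the compiler/files statements before it.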


Reasonable future additions to the project file would be the possibility of defining compilestrings, like:
CC = gcc -Wall -O2 -m486
c: CC -c -o %o %f
...
c: CC -someMoreFlags -c -o %o %f
...





Visit the Cython/Pyrl webpage: www.algonet.se/~jsjogren/oscar/cython/
Mail Oscar Campbell: oscar@linux.nu
Copyright (c) 2000, Oscar Campbell