Introduction to Ly (as of 2002-03-04)

Ly is available at http://sourceforge.net/projects/lyterate.

This is a quick-and-dirty introduction to Ly, the "lyterate programming thingy." Ly is an engine for Literate Programming; if you don't know what that is, please have a look at http://www.literateprogramming.com (I won't explain it here).

So why would anybody use Ly instead of the established Literate Programming tools like WEB or noweb? I cannot speak for anybody else, but here are the main reasons why I've decided to roll my own tool:

Elegance. I do not consider the source code of WEB or noweb to be particularly nice. I especially wanted a syntax that allows me to easily write a short paragraph, then two or three lines of code, then another short paragraph. Ly accomplishes this by distinguishing code from text through indentation.
HTML. WEB and CWEB are targeted at outputting TeX. This is a valid choice, but not mine: I want to create documents with WWW links, and I want to re-create documents each time I change something. Neither goes well with a format suited to printout, as TeX is.
(Of course you can create HTML from LaTeX, but if my output format is HTML, I want to be able to write HTML directly.)
Python. I originally created Ly for a Python-based project, and therefore I wanted it to run everywhere Python runs. WEB and noweb are mostly targeted at UNIX-like platforms.

Ly is a one-pass weaver and tangler, meaning that you only run a single script to both tangle and weave your literate program. One important principle of Ly is that there is one .ly source code file for each .html output file created. However, a literate program can consist of many .ly files. All of them have to be processed at once by the tangler so that the chunk definitions from all the different .ly files are known.

To make things easier, you can invoke Ly on a whole directory at once. For example, this command weaves and tangles Ly's source (when run in the ly/ directory):

python ly.py src/

Instead of src/, you can substitute any files or directories you'd like Ly to process together.

`.ly` files

The headers

Let's now have a look at the structure of a .ly file. One of the odd things about .ly files is that they start with an RFC 2822-style list of headers. (This may be the only thing that won't become customizable in the future, because the headers are precisely what will tell Ly how to interpret the body of the file.) If you do not know how to write RFC 2822-style headers, please do a web search on RFC 822 or RFC 2822. (RFC 2822 is an updated version of RFC 822.) They are the same kind of headers as in Internet mail, if you're familiar with those.

Now, the header you'll use most is the Title: header. It is the only header that is currently required in each .ly file (the requirement to have a Title: header is expected to go away in the future). The value of that header is used for the <title> tag of the HTML page, which means it becomes the HTML page's title. So usually, your files will start with a line of the form "Title: A great page", followed by a blank line (to mark the end of the headers, as usual).

For the moment, you should also always put in an Easy-tags: Off header. Ly contains a facility called 'easy tags' that makes writing HTML more comfortable (in my opinion). For example, instead of writing <code>foo</code> to format foo in a typewriter font, you can simply put vertical bars around it: |foo|. However, the complete set of formatting instructions is currently undocumented (to be changed in release 0.0.5). Thus, it could bite you by changing something in your HTML you do not want to be changed. The Easy-tags: Off header ensures that Ly does not pre-process your HTML.

The third header which is already useful is the optional Ly-Version: header. If you use it, include the version number of the Ly release you're using. Then, when weaving and tangling, Ly will check whether it believes it understands source files created for the given version of Ly; if not, it will reject the input file. For example, if you use Ly version 0.0.4, you may want to include the header Ly-Version: 0.0.4 in your Ly files.
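To make the header format concrete, here is a small sketch that reads the three headers discussed above with the email module from Python's standard library, which parses RFC 2822-style headers. This is only an illustration of the format, not a claim about how Ly reads its headers, and the sample file content is made up.

```python
from email import message_from_string

# Hypothetical start of a .ly file: headers, a blank line, then the body.
source = (
    "Title: A great page\n"
    "Easy-tags: Off\n"
    "Ly-Version: 0.0.4\n"
    "\n"
    "This is the first documentation chunk.\n"
)

# The blank line ends the headers; everything after it is the payload.
message = message_from_string(source)
headers = dict(message.items())
body = message.get_payload()

print(headers["Title"])
print(headers["Easy-tags"])
print(headers["Ly-Version"])
```

Everything after the first blank line ends up in body, which is where the chunks described in the next section would live.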

The chunks

As I mentioned above, one reason for me to write Ly was that it makes it easy to have very small chunks. (A chunk in a .ly file corresponds to one paragraph in the woven .html file.) This, of course, requires that chunk boundaries are very easy to enter, and the solution I found for this was to make any blank line or sequence of blank lines in a .ly input file a chunk boundary. This means that to enter two paragraphs in a .ly file, you simply enter the first paragraph, leave a blank line, and enter the second paragraph. Ly automatically inserts a <p> tag to separate the two.
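The blank-line rule can be sketched in a few lines of Python. The function name split_chunks is hypothetical, and treating whitespace-only lines as blank is an assumption of this sketch; it is not taken from Ly's source.

```python
def split_chunks(text):
    """Split text into chunks; any run of blank lines is one boundary."""
    chunks = []
    current = []
    for line in text.splitlines():
        line = line.rstrip()  # assumption: whitespace-only lines count as blank
        if line:
            current.append(line)
        elif current:
            # First blank line after some content closes the current chunk.
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


body = "First paragraph,\nstill the first.\n\n\nSecond paragraph.\n"
chunks = split_chunks(body)
print(chunks)
```

Each chunk is kept as a list of lines, so two blank lines in a row produce the same result as one.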

Of course, as in any Literate Programming engine, there are two kinds of chunks: code chunks and documentation chunks. Documentation chunks are HTML and appear only in the woven output. Code chunks are Python or Java or whatever, and appear both in the woven and in the tangled output. Code chunks are named, documentation chunks are not.

This document was itself created from a .ly file (which can be found here). So far, all the paragraphs in it have been documentation chunks, i.e. ordinary HTML paragraphs (including HTML headlines). Now, however, we will look at a code chunk.

-- Say hello:
	print "Hello, world!"

This introduces a new code chunk with the name Say hello and one line of code in it. Ly automatically encloses it in <pre> tags in the woven HTML output. In the input file, it appears exactly as you see it now. This is one important principle of Ly: the output looks just like the .ly source, only better (normal text is set in a variable-width font etc.).

So how does Ly recognize the chunk above as a code chunk? The double dash at the beginning of the line does the trick. When starting a new named code chunk, you have to use two dashes (--), one space, the chunk name, and a colon (:). After that, only whitespace may follow on the same line. (Whitespace at the end of any line is completely ignored and automatically removed by Ly.)
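The rule for the first line of a named code chunk can be written down as a regular expression. This is a sketch of the rule as described in the paragraph above, not Ly's actual implementation; CHUNK_HEADER and chunk_name are names invented for the example.

```python
import re

# Two dashes, one space, the chunk name, a colon, then only whitespace.
CHUNK_HEADER = re.compile(r"^-- (?P<name>.+):\s*$")


def chunk_name(line):
    """Return the chunk name if line starts a named code chunk, else None."""
    match = CHUNK_HEADER.match(line)
    return match.group("name") if match else None


print(chunk_name("-- Say hello:"))
print(chunk_name("-- Say hello."))
```

A line ending in a dot instead of a colon does not match, so it is not mistaken for the start of a new chunk.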

After that follows the code itself. It has to be indented. Here is one thing you have to remember well: Ly looks at how much the first line after the chunk name is indented, and it strips this amount of whitespace (in characters) from every line in the code chunk. This means that you cannot indent some line in a code chunk less than you indented the first code line of that chunk!
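The stripping rule might look roughly like this in Python. The function name strip_indent is hypothetical, and raising an error for a line that is indented less than the first one is an assumption of this sketch; the text above only says that such a line is not allowed.

```python
def strip_indent(lines):
    """Remove the first line's leading whitespace (counted in characters)
    from every line of a code chunk."""
    first = lines[0]
    width = len(first) - len(first.lstrip())
    stripped = []
    for line in lines:
        # Assumption: reject lines that have code inside the stripped prefix.
        if line[:width].strip():
            raise ValueError("indented less than the first code line: %r" % line)
        stripped.append(line[width:])
    return stripped


print(strip_indent(["\tfor i in range(4):", "\t\tprint i"]))
```

Because the width is counted in characters, a tab and a space each count as one, which is why mixing them between lines does not work.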

Also, you cannot use spaces in the first code line and tabs in the following ones (or vice versa). This is because Ly has no intelligence about how to treat tabs.

(This behavior is buggy and will be fixed when I have figured out how to do so. For the moment, you'll just have to work around it.)

Okay. So above we have printed out "Hello, world!" Now, let's say we want this to be followed by a blank line. We just add another statement to the code chunk above:

	print

Now what has happened here? This code chunk (paragraph) is not itself given a name through the two-dashes syntax introduced above.

What happens is that this chunk is assumed to have the same name as the last named code chunk, i.e. it is appended to the "Say hello" chunk. The computer can tell that this is a code chunk and not a documentation chunk because it is indented: if the first line of a chunk is indented, it is taken to be a code chunk.
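The continuation rule can be sketched as follows. The function collect_code is a hypothetical name, the input is assumed to be a list of chunks that have already been split at blank lines (each a list of lines), and what happens to an indented chunk that appears before any named chunk is not covered by the text, so this sketch simply ignores it. Indentation is left untouched here.

```python
import re

CHUNK_HEADER = re.compile(r"^-- (?P<name>.+):\s*$")


def collect_code(chunks):
    """Map chunk names to their code lines. An indented chunk without a
    header line is appended to the most recently named code chunk."""
    code = {}
    last_name = None
    for lines in chunks:
        header = CHUNK_HEADER.match(lines[0])
        if header:
            last_name = header.group("name")
            code.setdefault(last_name, []).extend(lines[1:])
        elif lines[0][:1] in (" ", "\t") and last_name is not None:
            code[last_name].extend(lines)
        # Anything else is a documentation chunk and is not tangled.
    return code


chunks = [
    ["Some documentation."],
    ["-- Say hello:", '\tprint "Hello, world!"'],
    ["More documentation."],
    ["\tprint"],
]
code = collect_code(chunks)
print(code)
```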

Chunk references

Now we have created one chunk-- let's create another chunk that references the first one. Let's make a chunk that says "Hello world" four times.

-- Say hello four times:
	for i in range(4):
		-- Say hello.

As you may know, for i in range(4) is the Python idiom for executing some command or block four times. The interesting part is the inclusion of the Say hello chunk into this chunk. The inclusion starts with two dashes, followed by a space, followed by the chunk name, followed by a dot (not a colon). The inclusion itself is indented, i.e. it is entered as a code line. Note that the two lines of the Say hello chunk will be indented just as much as the inclusion.
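Here is a minimal sketch of how such an inclusion could be expanded, including the indentation of the included lines. The names REFERENCE and tangle are hypothetical and this is not Ly's code; the chunk contents are Python 2 statements as in this document, but they are only handled as strings and never executed.

```python
import re

# Leading whitespace, two dashes, one space, the chunk name, and a dot.
REFERENCE = re.compile(r"^(?P<indent>\s*)-- (?P<name>.+)\.\s*$")


def tangle(name, chunks):
    """Expand the chunk called name into a list of lines, replacing each
    reference by the referenced chunk, indented like the reference."""
    result = []
    for line in chunks[name]:
        ref = REFERENCE.match(line)
        if ref:
            for inner in tangle(ref.group("name"), chunks):
                result.append(ref.group("indent") + inner)
        else:
            result.append(line)
    return result


chunks = {
    "Say hello": ['print "Hello, world!"', "print"],
    "Say hello four times": ["for i in range(4):", "\t-- Say hello."],
}
lines = tangle("Say hello four times", chunks)
print("\n".join(lines))
```

Both lines of Say hello come out one tab deeper than they were defined, so they end up inside the for loop.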

Currently, code chunk lines starting with double dashes cannot be escaped. This means that if a code chunk line starts with two dashes, it is always interpreted as an inclusion of another chunk-- there's no way around that.

This can be unfortunate in languages that use the double dash to introduce comments; if you want to use Ly for such a language, please join the lyterate-info mailing list and tell me about it and we'll figure something out.

C, C++ and Java use double dashes for the decrement operator. However, you can always write the operator after the variable instead of before it (the value of the expression hardly matters if it's the first thing on a line).

Special chunk names

`file` chunks

So far, we have created a number of chunks, but these are not tangled into any output file. To make the Ly tangler create an output file, we have to create a code chunk with a special name. The following chunk creates a file called hello.py.

-- file "hello.py":
	print "Starting hello.py"

	-- Say hello four times.

	print "Finished with hello.py"

Note how we have included the Say hello four times chunk. What the tangler looks for when creating the output files are chunk names of the following form: the word file (in lower case), one space, double quotes, the name of the file, double quotes again. The file name is relative to the location of the .ly file, i.e. if we used "../hello.py", the hello.py file would be put into the parent directory of the intro.ly file. (Now it's put in the same directory as intro.ly.)
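The naming rule and the relative path can be sketched like this. FILE_CHUNK and output_path are hypothetical names, and doc/intro.ly is a made-up location for the .ly file; the sketch only computes the path and writes nothing.

```python
import os.path
import re

# The word "file" in lower case, one space, and the file name in double quotes.
FILE_CHUNK = re.compile(r'^file "(?P<path>[^"]+)"$')


def output_path(chunk_name, ly_path):
    """Return where a file chunk would be written, relative to the
    location of the .ly file, or None if this is not a file chunk."""
    match = FILE_CHUNK.match(chunk_name)
    if match is None:
        return None
    directory = os.path.dirname(ly_path)
    return os.path.normpath(os.path.join(directory, match.group("path")))


ly_file = os.path.join("doc", "intro.ly")
print(output_path('file "hello.py"', ly_file))
print(output_path('file "../hello.py"', ly_file))
```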

You can look at the hello.py file now.

`__runPython__` chunks

Sometimes, you may want to do some preprocessing on chunks before tangling them. Ly provides support for that in the form of __runPython__ chunks. These chunks contain Python code that returns code to be written in a tangled file. Let us look at an example.

-- __runPython__ @ Say hello with numbers:
	import string
	chunk = self.tangleChunk('Say hello')

This introduces a __runPython__ chunk that produces the content of the Say hello with numbers chunk. You are not allowed to also define a Say hello with numbers chunk in the normal way, at least not currently. (We import the string module because we'll need it shortly.)

With self.tangleChunk(), we call a method of the LiterateProgram class in the Ly source. tangleChunk() is probably LiterateProgram's most important method for you if you are writing __runPython__ chunks.

As you probably have guessed, self.tangleChunk('Say hello') tangles the Say hello chunk we have introduced above. What is important to note is that the format returned is not a single string containing the chunk, but a list of strings, one string for each line. This is also the format the __runPython__ chunk must return.

(This is because of the indentation tricks we have to do during tangling-- if the reference to a chunk is indented by two tabs, we have to indent each line in that chunk by two tabs. That's easier if we operate on lists of lines instead of strings containing newlines.)

So now, the chunk variable contains a list of strings. We'll also introduce the result list of strings. Initially it's empty, but we'll append to it.

	result = []

Now, we'll include the Say hello chunk ten times. But every time we see a print statement, we make it report which pass of the loop we are in. For example, on the pass where the counter i is 5 (the sixth inclusion, because range(10) counts from 0), the print "Hello, world!" statement becomes print "5 " "Hello, world!".

	for i in range(10):

We need to go through all of the lines (strings) in the chunk each time.

		for line in chunk:

Now, we replace each occurrence of the string print in that line by print "i " (where i stands for the current value of the counter). For this, we use the Python idiom string.join(string.split(str, replaceWhat), byWhat).
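If the split-and-join idiom looks unfamiliar, this standalone sketch shows what it does to one line. It uses string methods rather than the functions of the string module, because string.split and string.join no longer exist in Python 3; the value of i is picked arbitrarily for the example.

```python
line = 'print "Hello, world!"'
i = 4

replacement = 'print "' + str(i) + ' "'

# Split at every occurrence of 'print', then glue the pieces back
# together with the replacement in between.
via_split_join = replacement.join(line.split('print'))

# The same result with the built-in replace method.
via_replace = line.replace('print', replacement)

print(via_split_join)
```

Both variants replace every occurrence in the line, not just the first one.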

			line = string.join(string.split(line, 'print'),
				'print "'+str(i)+' "')

And now we append line to result.

			result.append(line)

Finally, we return result.

	return result
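To see what this chunk would return, here is a standalone sketch of the same loop outside of Ly. The function name say_hello_with_numbers is hypothetical, and the input list stands in for what self.tangleChunk('Say hello') is assumed to return, namely the two lines of the Say hello chunk. It is written for Python 3 and only manipulates the lines as strings.

```python
def say_hello_with_numbers(chunk, times=10):
    """Repeat the chunk's lines, tagging each print with the pass number."""
    result = []
    for i in range(times):
        for line in chunk:
            result.append(line.replace('print', 'print "' + str(i) + ' "'))
    return result


# Assumed stand-in for self.tangleChunk('Say hello').
chunk = ['print "Hello, world!"', 'print']
result = say_hello_with_numbers(chunk)

for line in result[:4]:
    print(line)
```

Note that the counter starts at 0, and that the bare print line is rewritten as well, so it no longer prints an empty line.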

Now we need to create a file chunk containing the Say hello with numbers chunk which we have just written.

-- file "hellonumbers.py":
	print "Starting hellonumbers.py"
	print
	-- Say hello with numbers.
	print
	print "Finished with hellonumbers.py"

You can look at the resulting file here.

HTML formatting

Ly has some special ways of entering HTML tags. However, they are currently undocumented. If you want to learn them now, you can do so by looking at .ly source files (such as the source of this document); if you aren't sure, it is better to include the Easy-tags: Off header in all your files (it turns the HTML formatting off). I will replace this note with real documentation of the HTML formatting in the next release (0.0.5).

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".