Title: Introduction to Ly (2002-03-04)

Introduction to Ly (as of 2002-03-04)

Written by [Benja Fallenstein -> mailto:b.fallenstein@gmx.de]. Ly is available at [http://sourceforge.net/projects/lyterate -> http://sourceforge.net/projects/lyterate].

This is a quick-and-dirty introduction to Ly, the "lyterate programming thingy." Ly is an engine for Literate Programming; if you don't know what that is, please have a look at http://www.literateprogramming.com (I won't explain it here).

So why would anybody use Ly instead of the established Literate Programming tools like WEB or noweb? I cannot speak for anybody else, but here are the main reasons why I've decided to roll my own tool:

Ly is a one-pass weaver and tangler, meaning that you only run a single script to both tangle and weave your literate program.

One important principle of Ly is that there is one |.ly| source code file for each |.html| output file created. However, a literate program can consist of many |.ly| files. All of them have to be processed at once by the tangler so that the chunk definitions from all the different |.ly| files are known.

To make things easier, you can invoke Ly on a whole directory at once. For example, this command weaves and tangles Ly's source (when run in the |ly/| directory):
python ly.py src/
Instead of |src/|, you can substitute any files or directories you'd like Ly to process together.
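How Ly gathers its inputs is not shown in this introduction, but the idea of passing any mix of files and directories can be sketched roughly as follows. This is present-day Python~3 and not Ly's own code; the function name |collect_ly_files| and the choice to search directories recursively are assumptions made only for this example.

```python
import os


def collect_ly_files(paths):
    """Return a sorted list of .ly files named directly or found in directories.

    Hypothetical helper, not part of Ly.
    """
    found = []
    for path in paths:
        if os.path.isdir(path):
            # Assumption: directories are searched recursively.
            for root, _dirs, files in os.walk(path):
                for name in files:
                    if name.endswith(".ly"):
                        found.append(os.path.join(root, name))
        elif path.endswith(".ly"):
            found.append(path)
    return sorted(found)


# Lists whatever .ly files exist under src/ (an empty list if there are none).
print(collect_ly_files(["src/"]))
```

Collecting every file first matters because the tangler needs the chunk definitions from all |.ly| files before it can resolve any reference.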

<h2>|.ly| files</h2>

<h3>The headers</h3>

Let's now have a look at the structure of a |.ly| file. One of the odd things about |.ly| files is that they start with an RFC~2822-style list of headers. (This may be the only thing that won't become customizable in the future, because the headers are precisely what will tell Ly how to understand the body of the file.)

If you do not know how to write RFC~2822-style headers, please do a web search on RFC~822 or RFC~2822. (RFC~2822 is an updated version of RFC~822.) They are the same as Internet mail headers, if you're familiar with those.

Now, the header you'll use most is the |Title:| header. It is the only header that is currently required in each |.ly| file (this requirement is expected to go away in the future). The value of that header is used for the |<<title>>| tag of the HTML page, which means it becomes the HTML page's title. So usually, your files will start with a line of the form "|Title: A great page|", followed by a blank line (to mark the end of the headers, as usual).

For the moment, you should also always put in an |Easy-tags:~Off| header. Ly contains a facility called 'easy tags' that makes writing HTML more comfortable (in my opinion). For example, instead of writing |<<code>>foo<</code>>| to format |foo| in a typewriter font, you can simply put vertical bars around it: <code>\|foo\|</code>. However, the complete set of formatting instructions is currently undocumented (to be changed in release 0.0.5), so it could bite you by changing something in your HTML that you did not want changed. The |Easy-tags:~Off| header ensures that Ly does not pre-process your HTML.

The third header that is already useful is the optional |Ly-Version:| header. If you use it, include the version number of the Ly release you're using. Then, when weaving and tangling, Ly will check whether it believes it understands source files created for the given version of Ly; if not, it will reject the input file.
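Because the header block uses the same syntax as Internet mail headers, one rough way to experiment with it outside Ly is Python's standard |email| module. The sketch below is present-day Python~3 and is not taken from Ly's source; the helper name |read_ly_headers| and the sample text are made up for illustration.

```python
from email import message_from_string


def read_ly_headers(text):
    """Split .ly-style text into (headers, body).

    Hypothetical helper: it leans on the stdlib mail parser, which stops
    reading headers at the first blank line.
    """
    message = message_from_string(text)
    # Header names are case-insensitive, so store them in lower case.
    headers = {name.lower(): value for name, value in message.items()}
    return headers, message.get_payload()


sample = (
    "Title: A great page\n"
    "Easy-tags: Off\n"
    "Ly-Version: 0.0.4\n"
    "\n"
    "First paragraph of the body.\n"
)

headers, body = read_ly_headers(sample)
print(headers["title"])
print(headers["easy-tags"])
print(body)
```

How Ly itself reads and validates these headers may well differ; the sketch only shows that the blank line is what separates headers from body.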
For example, if you use Ly version~0.0.4, you may want to include the header |Ly-Version: 0.0.4| in your Ly~files.

<h3>The chunks</h3>

As I mentioned above, one reason for me to write Ly was that it makes it easy to have very small chunks. (A chunk in a |.ly| file corresponds to one paragraph in the woven |.html| file.) This, of course, requires that chunk boundaries are very easy to enter, and the solution I found for this was to make any blank line or sequence of blank lines in a |.ly| input file a chunk boundary.

This means that to enter two paragraphs in a |.ly| file, you simply enter the first paragraph, leave a blank line, and enter the second paragraph. Ly automatically inserts a |<<p>>| tag to separate the two.

Of course, as in any Literate Programming engine, there are two kinds of chunks: code chunks and documentation chunks. Documentation chunks are HTML and appear only in the woven output. Code chunks are Python or Java or whatever, and appear both in the woven and in the tangled output. Code chunks are named, documentation chunks are not.

This document was itself created from a |.ly| file (which can be found <a href="intro.ly">here</a>). So far, all the paragraphs in it have been documentation chunks, i.e. ordinary HTML paragraphs (including HTML headlines). Now, however, we will look at a code chunk.

-- Say hello:
    print "Hello, world!"

This introduces a new code chunk with the name |Say hello| and one line of code in it. Ly automatically encloses it in |<<pre>>| tags in the woven HTML output. In the input file, it appears _exactly_ as you see it now. This is one important principle of Ly: the output looks just like the |.ly| source, only better (normal text is set in a variable-width font etc.).

So how does Ly recognize the chunk above as a code chunk? The double~dash at the beginning of the line does the trick. When starting a new named code chunk, you have to use two dashes~(|--|), one space, the chunk name, and a colon~(|:|).
After that, only whitespace may follow on the same line. (Whitespace at the end of any line is completely ignored and automatically removed by Ly.)

After that follows the code itself. It has to be indented. Here is one thing you have to remember well: Ly looks at how much the first line after the chunk name is indented, and it strips this amount of whitespace (in characters) from _every line_ in the code chunk. This means that you cannot indent any line in a code chunk less than you indented the first code line of that chunk! Also, you cannot use spaces in the first code line and tabs in the following ones (or vice versa). This is because Ly has no intelligence about how to treat tabs. (This behavior is buggy and will be fixed when I have figured out how to do so. For the moment, you'll just have to work around it.)

Okay. So above we have printed out "Hello, world!" Now, let's say we want this to be followed by a blank line. We just add another statement to the code chunk above:

    print

Now what has happened here? This code chunk (paragraph) is not itself given a name through the two-dashes syntax introduced above. Instead, it is taken to have the same name as the last named code chunk, i.e. it is added to the "|Say hello|" chunk. And the computer can tell this is a code chunk and not a documentation chunk because it's indented-- if the first line of a chunk is indented, it's taken to be a code chunk.

<h3>Chunk references</h3>

Now we have created one chunk-- let's create another chunk that references the first one. Let's make a chunk that says "Hello, world!" four times.

-- Say hello four times:
    for i in range(4):
        -- Say hello.

As you may know, |for i in range(4)| is the Python idiom for executing some command or block four times. The interesting part is the inclusion of the |Say hello| chunk into this chunk. The inclusion starts with two dashes, followed by a space, followed by the chunk name, followed by a dot (_not_ a colon).
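The rules given so far (a definition line ends in a colon, an inclusion ends in a dot, and the first code line decides how much indentation is stripped) can be sketched in a few lines of present-day Python~3. This is only an illustration of those stated rules, not Ly's actual code; the names |DEFINITION|, |REFERENCE| and |strip_chunk_indent| are invented here.

```python
import re

# Assumed patterns, derived only from the prose description:
# "-- name:" starts a named chunk, "-- name." includes another chunk.
DEFINITION = re.compile(r"^-- (?P<name>.+):\s*$")
REFERENCE = re.compile(r"^(?P<indent>\s*)-- (?P<name>.+)\.\s*$")


def strip_chunk_indent(code_lines):
    """Remove the first line's leading whitespace from every line."""
    first = code_lines[0]
    width = len(first) - len(first.lstrip())
    prefix = first[:width]
    stripped = []
    for line in code_lines:
        line = line.rstrip()  # trailing whitespace is ignored
        if not line.startswith(prefix):
            # Less indentation, or tabs where the first line used spaces.
            raise ValueError("line is indented differently: %r" % line)
        stripped.append(line[width:])
    return stripped


source = [
    "-- Say hello four times:",
    "    for i in range(4):",
    "        -- Say hello.",
]

name = DEFINITION.match(source[0]).group("name")
body = strip_chunk_indent(source[1:])
ref = REFERENCE.match(source[2])
print(name)
print(body)
print(repr(ref.group("indent")), ref.group("name"))
```

Comparing against the exact leading characters of the first line, rather than counting columns, is what makes mixed tabs and spaces fail in this sketch, which mirrors the limitation described above.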
The inclusion itself is indented, i.e. it is entered as a code line. Note that the two lines of the |Say hello| chunk will be indented just as much as the inclusion.

Currently, code chunk lines starting with double~dashes cannot be escaped. This means that if a code chunk line starts with two dashes, it is always interpreted as an inclusion of another chunk-- there's no way around that. This can be unfortunate in languages that use the double dash to introduce comments; if you want to use Ly for such a language, please join <a href="http://lists.sourceforge.net/lists/listinfo/lyterate-list">the |lyterate-info| mailing list</a> and tell me about it and we'll figure something out.

C, C++ and Java use double dashes for the decrement operator. However, you can always write the operator after the variable instead of before it (the value of the expression hardly matters when it is the first thing on a line).

<h2>Special chunk names</h2>

<h3>|file| chunks</h3>

So far, we have created a number of chunks, but these are not tangled into any output file. To make the Ly tangler create an output file, we have to create a code chunk with a special name. The following chunk creates a file called |hello.py|.

-- file "hello.py":
    print "Starting hello.py"
    -- Say hello four times.
    print "Finished with hello.py"

Note how we have included the |Say hello four times| chunk. What the tangler looks for when creating the output files are chunk names of the following form: the word |file| (in lower case), one space, double quotes, the name of the file, double quotes again.

The file name is relative to the location of the |.ly| file, i.e. if we used "|../hello.py|", the |hello.py| file would be put into the parent directory of the |intro.ly| file. (As written, it is put in the same directory as |intro.ly|.) You can <a href="hello.py">look at the |hello.py| file</a> now.

<h3>|\_\_runPython\_\_| chunks</h3>

Sometimes, you may want to do some preprocessing on chunks before tangling them.
Ly provides support for that in the form of |\_\_runPython\_\_| chunks. These chunks contain Python code that returns code to be written in a tangled file. Let us look at an example.

-- __runPython__ @ Say hello with numbers:
    import string
    chunk = self.tangleChunk('Say hello')

This introduces a |\_\_runPython\_\_| chunk that produces the content of the |Say hello with numbers| chunk. You are not allowed to also define a |Say hello with numbers| chunk in the normal way, at least not currently. (We import the |string| module because we'll need it shortly.)

With |self.tangleChunk()|, we call a method of the |LiterateProgram| class in the Ly source. |tangleChunk()| is probably |LiterateProgram|'s most important method for you if you are writing |\_\_runPython\_\_| chunks. As you probably have guessed, |self.tangleChunk('Say~hello')| tangles the |Say~hello| chunk we have introduced above.

What is important to note is that the format returned is not a single string containing the chunk, but a list of strings, one string for each line. This is also the format the |\_\_runPython\_\_| chunk must return. (This is because of the indentation tricks we have to do during tangling-- if the reference to a chunk is indented by two tabs, we have to indent each line in that chunk by two tabs. That's easier if we operate on lists of lines instead of strings containing newlines.)

So now, the |chunk| variable contains a list of strings. We'll also introduce the |result| list of strings. Initially it's empty, but we'll append to it.

    result = []

Now, we'll include the |Say~hello| chunk ten times. But every time we see a |print| statement, we make it report which repetition this is. For example, when we include it for the fifth time (|i| is then 4, because |range(10)| counts from 0) and encounter the |print~"Hello,~world!"| statement, we change it to |print~"4 "~"Hello,~world!"|.

    for i in range(10):

We need to go through all of the lines (strings) in the chunk each time.

        for line in chunk:

Now, we replace each occurrence of the string |print| in that line by |print| followed by the current value of |i| and a space in double quotes (for example |print~"4 "|). For this, we use the Python idiom |string.join(string.split(str,~replaceWhat),~byWhat)|.

            line = string.join(string.split(line, 'print'), 'print "'+str(i)+' "')

And now we append |line| to |result|.

            result.append(line)

Finally, we return |result|.

    return result

Now we need to create a |file| chunk containing the |Say hello with numbers| chunk which we have just written.

-- file "hellonumbers.py":
    print "Starting hellonumbers.py"
    print
    -- Say hello with numbers.
    print
    print "Finished with hellonumbers.py"

You can look at the resulting file <a href="hellonumbers.py">here</a>.

<h2>HTML formatting</h2>

Ly has some special ways for entering HTML tags. However, they are currently undocumented. You can learn them now by looking at |.ly| source files (such as <a href="intro.ly">the source of this document</a>), but if you aren't sure, it'll be better to include the |Easy-tags:~Off| header in all your files (it turns the HTML formatting off). I will replace this note by real documentation of the HTML formatting in the next release (0.0.5).

<hr>

Copyright (c) 2002 by Benja Fallenstein

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license is included in <a href="fdl.txt">the section entitled "GNU Free Documentation License"</a>.