Secure TeX
From Mathtran
Why secure TeX?
To obtain much improved performance, we run TeX as a daemon. For this to work, TeX must be returned to a standard state after each formula is typeset. TeX grouping, left to itself, will do most of this work, but unless precautions are taken, a careless or malicious user could circumvent it.
What should be stopped
The primitive TeX command '\end', when executed, causes the TeX program to terminate. Clearly, we don't want users executing this command.
Similarly, the '\openin' and '\openout' commands allow TeX to read and write files. This is usually harmless when the user is processing her own document on her own machine. But MathTran runs on a server, and does not know its users. Therefore, these commands are also on the forbidden list.
TeX also has primitive commands '\def' and '\let' for changing the meaning of control sequences. If the user writes (at top level)
\let\alpha\beta
then a subsequent user who uses '\alpha' to request an α will instead get a β. This too should be stopped.
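The difference between a redefinition that grouping undoes and one that escapes it can be sketched in plain TeX. This is only an illustration of standard TeX scoping; it assumes nothing about how MathTran actually wraps each formula.

```tex
% Plain TeX sketch; not MathTran's actual wrapper code.
\begingroup
  \let\alpha\beta          % local: \alpha means \beta only inside this group
\endgroup
$\alpha$                   % typesets an alpha again, the group restored it

\begingroup
  \global\let\alpha\beta   % global: the change outlives the group
\endgroup
$\alpha$                   % now typesets a beta
\bye
```

A '\let' issued at top level, outside any group, persists in the same way as the '\global' form.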
Some examples
Here are some examples of how secure TeX works. If you give MathTran the input
\let\alpha\beta
it will respond by typesetting what remains once the undefined '\let' is skipped, and by producing the TeX log message
! Undefined control sequence.
l.6021 $ \let
             \alpha\beta $
By the way, MathTran feeds TeX one long virtual file (via a Unix FIFO), and 'l.6021' is where TeX thinks it is in this file. Because this file may contain multiplexed input from several users, the line number is not a count of any one user's input.
Secure TeX attaches the primitive command '\let' to the control sequence '\_let'. However, if one tries
\_let\alpha\beta
then MathTran responds with the typeset output and the log message
OK
Do you see what has happened? '\_let' has been read as '\_' (which produces an underscore) followed by 'let'.
Types of security problems
Here we list some classes of security problems, with the most serious first:
- Unauthorised file access: This may compromise the whole server, and any data on it.
- Corruption: The TeX daemon appears to be working correctly, but it produces β instead of α, for example.
- Denial of Service: This means causing the TeX daemon to die, say by giving it an '\end' command.
Secure TeX
TeX uses something called category codes to limit access to control sequences. The usual category codes allow users to form control sequences whose name is either a sequence of letters (both upper- and lower-case are permitted) or a single character (of any type).
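As a small check of the rule above, plain TeX can be asked to report the category codes it is using. The snippet assumes the standard plain TeX format, where letters have category 11 and the underscore has category 8 (subscript).

```tex
% Plain TeX: print two category codes to the terminal and log.
\message{letter a: \the\catcode`a; underscore: \the\catcode`\_}
% Under plain TeX this reports 11 for the letter and 8 for the underscore.
\bye
```

Because the underscore is not a letter, the input '\_let' cannot be read as one control sequence; it is read as the single-character control sequence '\_' followed by the letters 'let'.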
Secure TeX moves all primitive commands out of the user-accessible area. The primitive command that was stored as '\def' is moved to '\_def', and '\def' is made undefined. The same goes for all other primitive commands. This is done in the file secmove.sty. (There is one exception, namely '\par', which we can ignore.)
Secure TeX then moves into the user-accessible area only those commands that are safe.
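The two steps can be sketched for a handful of primitives as follows. This is a hedged illustration, not the contents of secmove.sty: the control sequence '\_undefined' is a hypothetical name that is deliberately never defined, and the choice of '\over' as a "safe" primitive is an assumption made only for the example.

```tex
% Sketch only; the real secmove.sty handles all primitives.
\catcode`\_=11            % temporarily treat _ as a letter
\let\_let\let             % keep each primitive under a hidden name
\let\_def\def
\let\_over\over
\_let\let\_undefined      % \_undefined has no meaning, so \let becomes undefined
\_let\def\_undefined
\_let\over\_undefined
\_let\over\_over          % move back a primitive judged safe (illustrative choice)
\catcode`\_=8             % restore: \_let can no longer be typed as one name
```

Once the category code of the underscore is restored, the hidden names remain in TeX's memory but cannot be reached from ordinary user input.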
Active TeX
It is not so easy to write TeX macros in an environment where one has to write '\_def' instead of the customary '\def', and so on for almost all other control sequences. Secure TeX therefore next loads the file seccode.sty, which makes all characters active in order to define a programming environment in which one can write 'def' instead of the customary '\def', and get the desired '\_def' as a result.
Secure plain TeX
Don Knuth wrote plain TeX, which is a basic TeX format described in The TeXbook. Secure plain TeX is a secure variant of plain TeX. It places in the user-accessible area just the primitive commands and macros of plain TeX that are needed for typesetting. Any other commands, such as those that are needed 'behind the scenes', are hidden from the user. This is done in the file secplain.sty, which in addition gives '\frac' the same meaning as it does in LaTeX.
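For readers unfamiliar with LaTeX's '\frac', an equivalent definition can be written in ordinary plain TeX as below. This mirrors the LaTeX kernel definition; how secplain.sty itself spells the definition (presumably through the hidden primitives) is not shown here and is an assumption.

```tex
% Plain TeX sketch of a LaTeX-style \frac; not taken from secplain.sty.
\def\frac#1#2{{\begingroup#1\endgroup\over#2}}
$$\frac{1}{2}$$
\bye
```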
