/cvs/thttpd/thttpd.8
Revision: 1.1.4.2
Committed: Mon Jun 18 22:10:09 2001 UTC (23 years ago) by root
Branch: connpatch
CVS Tags: cp_j
Changes since 1.1.4.1: +1 -1 lines

File Contents

# User Rev Content
1 root 1.1 .TH thttpd 8 "29 February 2000"
2     .SH NAME
3     thttpd - tiny/turbo/throttling HTTP server
4     .SH SYNOPSIS
5     .B thttpd
6     .RB [ -C
7     .IR configfile ]
8     .RB [ -p
9     .IR port ]
10     .RB [ -d
11     .IR dir ]
12     .RB [ -r | -nor ]
13     .RB [ -s | -nos ]
14     .RB [ -v | -nov ]
15     .RB [ -g | -nog ]
16     .RB [ -u
17     .IR user ]
18     .RB [ -c
19     .IR cgipat ]
20     .RB [ -t
21     .IR throttles ]
22 root 1.1.4.1 .RB [ -n
23     .IR connections ]
24     .RB [ -o
25     .IR timeout ]
26 root 1.1 .RB [ -h
27     .IR host ]
28     .RB [ -l
29     .IR logfile ]
30     .RB [ -i
31     .IR pidfile ]
32     .RB [ -T
33     .IR charset ]
34     .RB [ -V ]
35     .RB [ -D ]
36     .SH DESCRIPTION
37     .PP
38     .I thttpd
39     is a simple, small, fast, and secure HTTP server.
40     It doesn't have a lot of special features, but it suffices for most uses of
41     the web, it's about as fast as the best full-featured servers (Apache, NCSA,
42     Netscape),
43     and it has one extremely useful feature (URL-traffic-based throttling)
44     that no other server currently has.
45     .SH OPTIONS
46     .TP
47     .B -C
48     Specifies a config-file to read.
49     All options can be set either by command-line flags or in the config file.
50     See below for details.
51     .TP
52     .B -p
53     Specifies an alternate port number to listen on.
54     The default is 80.
55     The config-file option name for this flag is "port",
56     and the config.h option is DEFAULT_PORT.
57     .TP
58     .B -d
59     Specifies a directory to chdir() to at startup.
60     This is merely a convenience - you could just as easily
61     do a cd in the shell script that invokes the program.
62     The config-file option name for this flag is "dir",
63     and the config.h options are WEBDIR, USE_USER_DIR.
64     .TP
65     .B -r
66     Do a chroot() at initialization time, restricting file access
67     to the program's current directory.
68     If -r is the compiled-in default, then -nor disables it.
69     See below for details.
70     The config-file option names for this flag are "chroot" and "nochroot",
71     and the config.h option is ALWAYS_CHROOT.
72     .TP
73     .B -nos
74     Don't do explicit symbolic link checking.
75     Normally, thttpd explicitly expands any symbolic links in filenames,
76     to check that the resulting path stays within the original document tree.
77     If you want to turn off this check and save some CPU time, you can use
78     the -nos flag, but this is not recommended.
79     Note, though, that if you are using the chroot option, the symlink
80     checking is unnecessary and is turned off, so the safe way to save
81     those CPU cycles is to use chroot.
82     The config-file option names for this flag are "symlink" and "nosymlink".
83     .TP
84     .B -v
85     Do el-cheapo virtual hosting.
86     If -v is the compiled-in default, then -nov disables it.
87     See below for details.
88     The config-file option names for this flag are "vhost" and "novhost",
89     and the config.h option is ALWAYS_VHOST.
90     .TP
91     .B -g
92     Use a global passwd file.
93     This means that every file in the entire document tree is protected by
94     the single .htpasswd file at the top of the tree.
95     Otherwise the semantics of the .htpasswd file are the same.
96     If this option is set but there is no .htpasswd file in
97     the top-level directory, then thttpd proceeds as if the option was
98     not set - first looking for a local .htpasswd file, and if that doesn't
99     exist either then serving the file without any password.
100     If -g is the compiled-in default, then -nog disables it.
101     The config-file option names for this flag are "globalpasswd" and
102     "noglobalpasswd",
103     and the config.h option is ALWAYS_GLOBAL_PASSWD.
104     .TP
105     .B -u
106     Specifies what user to switch to after initialization when started as root.
107     The default is "nobody".
108     The config-file option name for this flag is "user",
109     and the config.h option is DEFAULT_USER.
110     .TP
111     .B -c
112     Specifies a wildcard pattern for CGI programs, for instance "**.cgi"
113     or "/cgi-bin/*".
114     See below for details.
115     The config-file option name for this flag is "cgipat",
116     and the config.h option is CGI_PATTERN.
117     .TP
118     .B -t
119     Specifies a file of throttle settings.
120     See below for details.
121     The config-file option name for this flag is "throttles".
122 root 1.1.4.1 .TP
123     .B -n
124     Specifies the number of connections one IP address may have at one time.
125     If a client exceeds this limit, the request gets a 403 response and the client's IP address is blocked for the time specified with -o.
126     The config-file option name for this flag is "connections". It defaults to 0, which means no limit.
127     A blocked request receives "err403blocked.html" if that file exists.
128     .TP
129     .B -o
130 root 1.1.4.2 Specifies the time in seconds for which a host that was blocked for having too many connections is refused all access to the server.
131 root 1.1.4.1 The config-file option name for this flag is "blocktime",
132     and the config.h option is DEFAULT_BLOCKTIME.
133 root 1.1 .TP
134     .B -h
135     Specifies a hostname to bind to, for multihoming.
136     The default is to bind to all hostnames supported on the local machine.
137     See below for details.
138     The config-file option name for this flag is "host",
139     and the config.h option is SERVER_NAME.
140     .TP
141     .B -l
142     Specifies a file for logging.
143     If no -l argument is specified, thttpd logs via syslog().
144     If "-l /dev/null" is specified, thttpd doesn't log at all.
145     The config-file option name for this flag is "logfile".
146     .TP
147     .B -i
148     Specifies a file to write the process-id to.
149     If no file is specified, no process-id is written.
150     You can use this file to send signals to thttpd.
151     See below for details.
152     The config-file option name for this flag is "pidfile".
153     .TP
154     .B -T
155     Specifies the character set to use with text MIME types.
156     The default is iso-8859-1.
157     The config-file option name for this flag is "charset",
158     and the config.h option is DEFAULT_CHARSET.
159     .TP
160     .B -V
161     Shows the current version info.
162     .TP
163     .B -D
164     This was originally just a debugging flag, but it's worth mentioning
165     because one of the things it does is prevent thttpd from making itself
166     a background daemon.
167     Instead it runs in the foreground like a regular program.
168     This is necessary when you want to run thttpd wrapped in a little shell
169     script that restarts it if it exits.
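.PP
A minimal sketch of such a wrapper follows.
The function name respawn, the restart limit, and the delay are assumptions made for illustration and are not part of thttpd.
The limit keeps a server that dies at startup from being restarted forever.

```shell
#!/bin/sh
# respawn LIMIT DELAY COMMAND...
# Hypothetical helper: runs COMMAND up to LIMIT times, sleeping DELAY
# seconds after each exit. COMMAND must stay in the foreground, which
# is why thttpd needs the -D flag here.
respawn() {
    limit=$1
    delay=$2
    shift 2
    count=0
    while [ "$count" -lt "$limit" ]; do
        status=0
        "$@" || status=$?
        echo "command exited with status $status" >&2
        count=$((count + 1))
        sleep "$delay"
    done
}

# Example invocation (the config path is an assumption):
#   respawn 100 5 thttpd -D -C /etc/thttpd.conf
```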
170     .SH "CONFIG-FILE"
171     .PP
172     All the command-line options can also be set in a config file.
173     One advantage of using a config file is that the file can be changed,
174     and thttpd will pick up the changes with a restart.
175     .PP
176     The syntax of the config file is simple: a series of "option" or
177     "option=value" separated by whitespace.
178     The option names are listed above with their corresponding command-line flags.
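.PP
As an illustration of that syntax, the sketch below writes a small config file into a scratch directory.
The option names are the ones listed with the flags above; every value, and the file name thttpd.conf, is an assumption chosen for the example.

```shell
#!/bin/sh
# Write an example config file. Each line is either "option" or
# "option=value"; the values below are placeholders, not recommendations.
conf_dir=$(mktemp -d)
cat > "$conf_dir/thttpd.conf" <<'EOF'
port=8080
dir=/usr/www
chroot
user=nobody
cgipat=/cgi-bin/*
connections=10
blocktime=300
EOF

# The server would then be started with the -C flag, for example:
echo "thttpd -C $conf_dir/thttpd.conf"
```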
179     .SH "CHROOT"
180     .PP
181     chroot() is a system call that restricts the program's view
182     of the filesystem to the current directory and directories
183     below it.
184     It becomes impossible for remote users to access any file
185     outside of the initial directory.
186     The restriction is inherited by child processes, so CGI programs get it too.
187     This is a very strong security measure, and is recommended.
188     The only downside is that only root can call chroot(), so this means
189     the program must be started as root.
190     However, the last thing it does during initialization is to
191     give up root access by becoming another user, so this is safe.
192     .PP
193     The program can also be compile-time configured to always
194     do a chroot(), without needing the -r flag.
195     .PP
196     Note that with some other web servers, such as NCSA httpd, setting
197     up a directory tree for use with chroot() is complicated, involving
198     creating a bunch of special directories and copying in various files.
199     With thttpd it's a lot easier: all you have to do is make sure
200     any shells, utilities, and config files used by your CGI programs and
201     scripts are available.
202     If you have CGI disabled, or if you make a policy that all CGI programs
203     must be written in a compiled language such as C and statically linked,
204     then you probably don't have to do any setup at all.
205     .PP
206     Relevant config.h option: ALWAYS_CHROOT.
207     .SH "CGI"
208     .PP
209     thttpd supports the CGI 1.1 spec.
210     .PP
211     In order for a CGI program to be run, its name must match the pattern
212     specified either at compile time or on the command line with the -c flag.
213     This is a simple shell-style filename pattern.
214     You can use * to match any string not including a slash,
215     or ** to match any string including slashes,
216     or ? to match any single character.
217     You can also use multiple such patterns separated by |.
218     The patterns get checked against the filename
219     part of the incoming URL.
220     Don't forget to quote any wildcard characters so that the shell doesn't
221     mess with them.
222     .PP
223     Restricting CGI programs to a single directory lets the site administrator
224     review them for security holes, and is strongly recommended.
225     If there are individual users that you trust, you can enable their
226     directories too.
227     .PP
228     If no CGI pattern is specified, either here or at compile time,
229     then CGI programs cannot be run at all.
230     If you want to disable CGI as a security measure, that's how you do it: just
231     comment out the patterns in the config file and don't run with the -c flag.
232     .PP
233     Note: the current working directory when a CGI program gets run is
234     the directory that the CGI program lives in.
235     This isn't in the CGI 1.1 spec, but it's what most other HTTP servers do.
236     .PP
237     Relevant config.h options: CGI_PATTERN, CGI_TIMELIMIT, CGI_NICE, CGI_PATH, CGI_LD_LIBRARY_PATH, CGIBINDIR.
238     .SH "BASIC AUTHENTICATION"
239     .PP
240     Basic Authentication is available as an option at compile time.
241     If enabled, it uses a password file in the directory to be protected,
242     called .htpasswd by default.
243     This file is formatted as the familiar colon-separated
244     username/encrypted-password pair, records delimited by newlines.
245     The protection does not carry over to subdirectories.
246     The utility program htpasswd(1) is included to help create and
247     modify .htpasswd files.
248     .PP
249     Relevant config.h option: AUTH_FILE
250     .SH "THROTTLING"
251     .PP
252     The throttle file lets you set maximum byte rates on URLs or URL groups.
253     There is no provision for setting a maximum request rate throttle,
254     because throttling a request uses as much cpu as handling it, so
255     there would be no point.
256     .PP
257     The format of the throttle file is very simple.
258     A # starts a comment, and the rest of the line is ignored.
259     Blank lines are ignored.
260     The rest of the lines should consist of a pattern, whitespace, and a number.
261     The pattern is a simple shell-style filename pattern, using ?/**/*, or
262     multiple such patterns separated by |.
263     .PP
264     The numbers in the file are byte rates, specified in units of bytes per second.
265     For comparison, a v.32b/v.42b modem gives about 1500/2000 B/s
266     depending on compression, a double-B-channel ISDN line about
267     12800 B/s, and a T1 line is about 150000 B/s.
268     .PP
269     Example:
270     .nf
271     # throttle file for www.acme.com
272    
273     ** 100000 # limit total web usage to 2/3 of our T1
274     **.jpg|**.gif 50000 # limit images to 1/3 of our T1
275     **.mpg 20000 # and movies to even less
276     jef/** 20000 # jef's pages are too popular
277     .fi
278     .PP
279     Throttling is implemented by checking each incoming URL filename against all
280     of the patterns in the throttle file.
281     The server accumulates statistics on how much bandwidth each pattern
282     has accounted for recently (via a rolling average).
283     If a URL matches a pattern that has been exceeding its specified limit,
284     then the data returned is actually slowed down, with
285     pauses between each block.
286     If that's not possible (e.g. for CGI programs), then
287     the server returns a special code saying 'try again later'.
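.PP
Because the file format is this simple, a throttle file can be sanity-checked before it is handed to the server.
The following is a sketch only: check_throttles is a hypothetical helper, not something shipped with thttpd, and it checks only the line layout described above (pattern, whitespace, number), not whether thttpd itself would accept the patterns.

```shell
#!/bin/sh
# check_throttles FILE
# Hypothetical helper: strips "#" comments, skips blank lines, and
# reports any remaining line that is not "pattern number".
# Returns non-zero if at least one bad line was found.
check_throttles() {
    awk '
        { sub(/#.*/, "") }                    # drop comment text
        NF == 0 { next }                      # ignore blank lines
        NF != 2 || $2 !~ /^[0-9]+$/ {
            printf "line %d: expected pattern and byte rate\n", NR
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```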
288     .SH "MULTIHOMING"
289     .PP
290     Multihoming means using one machine to serve multiple hostnames.
291     For instance, if you're an internet provider and you want to let
292     all of your customers have customized web addresses, you might
293     have www.joe.acme.com, www.jane.acme.com, and your own www.acme.com,
294     all running on the same physical hardware.
295     This feature is also known as "virtual hosts".
296     There are three steps to setting this up.
297     .PP
298     One, make DNS entries for all of the hostnames.
299     The current way to do this, allowed by HTTP/1.1, is to use CNAME aliases,
300     like so:
301     .nf
302     www.acme.com IN A 192.100.66.1
303     www.joe.acme.com IN CNAME www.acme.com
304     www.jane.acme.com IN CNAME www.acme.com
305     .fi
306     However, this is incompatible with older HTTP/1.0 browsers.
307     If you want to stay compatible, there's a different way - use A records
308     instead, each with a different IP address, like so:
309     .nf
310     www.acme.com IN A 192.100.66.1
311     www.joe.acme.com IN A 192.100.66.200
312     www.jane.acme.com IN A 192.100.66.201
313     .fi
314     This is bad because it uses extra IP addresses, a somewhat scarce resource.
315     But if you want people with older browsers to be able to visit your
316     sites, you still have to do it this way.
317     .PP
318     Step two.
319     If you're using the modern CNAME method of multihoming, then you can
320     skip this step.
321     Otherwise, using the older multiple-IP-address method you
322     must set up IP aliases or multiple interfaces for the extra addresses.
323     You can use ifconfig(8)'s alias command to tell the machine to answer to
324     all of the different IP addresses.
325     Example:
326     .nf
327     ifconfig le0 www.acme.com
328     ifconfig le0 www.joe.acme.com alias
329     ifconfig le0 www.jane.acme.com alias
330     .fi
331     If your OS's version of ifconfig doesn't have an alias command, you're
332     probably out of luck (but see http://www.acme.com/software/thttpd/notes.html).
333     .PP
334     Third and last, you must set up thttpd to handle the multiple hosts.
335     The easiest way is with the -v flag, or the ALWAYS_VHOST config.h option.
336     This works with either CNAME multihosting or multiple-IP multihosting.
337     What it does is send each incoming request to a subdirectory based on the
338     hostname it's intended for.
339     All you have to do in order to set things up is to create those subdirectories
340     in the directory where thttpd will run.
341     With the example above, you would run:
342     .nf
343     mkdir www.acme.com www.joe.acme.com www.jane.acme.com
344     .fi
345     If you're using old-style multiple-IP multihosting, you should also create
346     symbolic links from the numeric addresses to the names, like so:
347     .nf
348     ln -s www.acme.com 192.100.66.1
349     ln -s www.joe.acme.com 192.100.66.200
350     ln -s www.jane.acme.com 192.100.66.201
351     .fi
352     This lets the older HTTP/1.0 browsers find the right subdirectory.
353     .PP
354     There's an optional alternate step three if you're using multiple-IP
355     multihosting: run a separate thttpd process for each hostname, using
356     the -h flag to specify which one is which.
357     This gives you more flexibility, since you can run each of these processes
358     in separate directories, with different throttle files, etc.
359     Example:
360     .nf
361     thttpd -r -d /usr/www -h www.acme.com
362     thttpd -r -d /usr/www/joe -u joe -h www.joe.acme.com
363     thttpd -r -d /usr/www/jane -u jane -h www.jane.acme.com
364     .fi
365     But remember, this multiple-process method does not work with CNAME
366     multihosting - for that, you must use a single thttpd process with
367     the -v flag.
368     .SH "CUSTOM ERRORS"
369     .PP
370     thttpd lets you define your own custom error pages for the various
371     HTTP errors.
372     There's a separate file for each error number, all stored in one
373     special directory.
374     The directory name is "errors", at the top of the web directory tree.
375     The error files should be named "errNNN.html", where NNN is the error number.
376     So for example, to make a custom error page for the authentication failure
377     error, which is number 401, you would put your HTML into the file
378     "errors/err401.html".
379     If no custom error file is found for a given error number, then the
380     usual built-in error page is generated.
381     .PP
382     If you're using the virtual hosts option, you can also have different
383     custom error pages for each different virtual host.
384     In this case you put another "errors" directory in the top of that
385     virtual host's web tree.
386     thttpd will look first in the virtual host errors directory, and
387     then in the server-wide errors directory, and if neither of those
388     has an appropriate error file then it will generate the built-in error.
389     .SH "NON-LOCAL REFERERS"
390     .PP
391     Sometimes another site on the net will embed your image files in their
392     HTML files, which basically means they're stealing your bandwidth.
393     You can prevent them from doing this by using non-local referer filtering.
394     With this option, certain files can only be fetched via a local referer.
395     The files have to be referenced by a local web page.
396     If a web page on some other site references the files, that fetch will
397     be blocked.
398     There are three config-file variables for this feature:
399     .TP
400     .B urlpat
401     A wildcard pattern for the URLs that should require a local referer.
402     This is typically just image files, sound files, and so on.
403     For example:
404     .nf
405     urlpat=**.jpg|**.gif|**.au|**.wav
406     .fi
407     For most sites, that one setting is all you need to enable referer filtering.
408     .TP
409     .B noemptyreferers
410     By default, requests with no referer at all, or a null referer, or a
411     referer with no apparent hostname, are allowed.
412     With this variable set, such requests are disallowed.
413     .TP
414     .B localpat
415     A wildcard pattern that specifies the local host or hosts.
416     This is used to determine if the host in the referer is local or not.
417     If not specified it defaults to the actual local hostname.
418     .SH SYMLINKS
419     .PP
420     thttpd is very picky about symbolic links.
421     Before delivering any file, it first checks each element in the path
422     to see if it's a symbolic link, and expands them all out to get the final
423     actual filename.
424     Along the way it checks for things like links with ".." that go above
425     the server's directory, and absolute symlinks (ones that start with a /).
426     These are prohibited as security holes, so the server returns an
427     error page for them.
428     This means you can't set up your web directory with a bunch of symlinks
429     pointing to individual users' home web directories.
430     Instead you do it the other way around - the user web directories are
431     real subdirs of the main web directory, and in each user's home
432     dir there's a symlink pointing to their actual web dir.
433     .PP
434     The CGI pattern is also affected - it gets matched against the fully-expanded
435     filename. So, if you have a single CGI directory but then put a symbolic
436     link in it pointing somewhere else, that won't work. The CGI program will be
437     treated as a regular file and returned to the client, instead of getting run.
438     This could be confusing.
439     .SH PERMISSIONS
440     .PP
441     thttpd is also picky about file permissions.
442     It wants data files (HTML, images) to be world readable.
443     Readable by the group that the thttpd process runs as is not enough - thttpd
444     checks explicitly for the world-readable bit.
445     This is so that no one ever gets surprised by a file that's not set
446     world-readable and yet somehow is readable by the HTTP server and
447     therefore the *whole* world.
448     .PP
449     The same logic applies to directories.
450     As with the standard Unix "ls" program, thttpd will only let you
451     look at the contents of a directory if its read bit is on; but
452     as with data files, this must be the world-read bit, not just the
453     group-read bit.
454     .PP
455     thttpd also wants the execute bit to be *off* for data files.
456     A file that is marked executable but doesn't match the CGI pattern
457     might be a script or program that got accidentally left in the
458     wrong directory.
459     Allowing people to fetch the contents of the file might be a security breach,
460     so this is prohibited.
461     Of course if an executable file *does* match the CGI pattern, then it
462     just gets run as a CGI.
463     .PP
464     In summary, data files should be mode 644 (rw-r--r--),
465     directories should be 755 (rwxr-xr-x) if you want to allow indexing and
466     711 (rwx--x--x) to disallow it, and CGI programs should be mode
467     755 (rwxr-xr-x) or 711 (rwx--x--x).
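.PP
The summary above can be tried out on a scratch tree.
In this sketch the directory layout and file names are assumptions made for illustration; only the modes come from the text.

```shell
#!/bin/sh
# Build a throwaway web tree and apply the modes from the summary.
demo=$(mktemp -d)
mkdir -p "$demo/images" "$demo/cgi-bin"
: > "$demo/index.html"
printf '#!/bin/sh\necho "Content-type: text/plain"\necho\necho hello\n' \
    > "$demo/cgi-bin/hello.cgi"

chmod 644 "$demo/index.html"         # data file: world-readable, not executable
chmod 755 "$demo" "$demo/images"     # directories that may be indexed
chmod 711 "$demo/cgi-bin"            # directory that may not be indexed
chmod 755 "$demo/cgi-bin/hello.cgi"  # CGI program

ls -lR "$demo"
```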
468     .SH LOGS
469     .PP
470     By default, thttpd does all of its logging via syslog(3).
471     The facility it uses is configurable.
472     Aside from error messages, there are only a few log entry types of interest,
473     all fairly similar to CERN Common Log Format:
474     .nf
475     Aug 6 15:40:34 acme thttpd[583]: 165.113.207.103 - - "GET /file" 200 357
476     Aug 6 15:40:43 acme thttpd[583]: 165.113.207.103 - - "HEAD /file" 200 0
477     Aug 6 15:41:16 acme thttpd[583]: referer http://www.acme.com/ -> /dir
478     Aug 6 15:41:16 acme thttpd[583]: user-agent Mozilla/1.1N
479     .fi
480     The package includes a script for translating these log entries into
481     CERN-compatible files.
482     Note that thttpd does not translate numeric IP addresses into domain names.
483     This is both to save time and as a minor security measure (the numeric
484     address is harder to spoof).
485     .PP
486     Relevant config.h option: LOG_FACILITY.
487     .PP
488     If you'd rather log directly to a file, you can use the -l command-line
489     flag. But note that error messages still go to syslog.
490     .SH SIGNALS
491     .PP
492     thttpd handles a few signals, which you can send via the
493     standard Unix kill(1) command:
494     .TP
495     .B INT,TERM
496     These signals tell thttpd to shut down immediately.
497     Any requests in progress get aborted.
498     .TP
499     .B USR1
500     This signal tells thttpd to shut down as soon as it's done servicing
501     all current requests.
502     In addition, the network socket it uses to accept new connections gets
503     closed immediately, which means a fresh thttpd can be started up
504     immediately.
505     .TP
506     .B HUP
507     This signal tells thttpd to close and re-open its (non-syslog) log file,
508     for instance if you rotated the logs and want thttpd to start using the
509     new one.
510     However, this feature isn't actually that useful at the moment.
511     The problem is that thttpd will generally be started as root, so that
512     it can bind to port 80; then it gives up the root uid as soon as it can,
513     for security reasons.
514     But if you later send it a HUP, it will try to re-open the log file
515     without root access and will generally fail.
516     Also, if you're running inside a chroot tree, as you should be,
517     the log file won't even be accessible.
518     Currently the best alternative for log rotation is to send a USR1 signal,
519     shutting down thttpd altogether, and then restart it.
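.PP
The USR1-based rotation described above might be scripted roughly as follows.
This is a sketch under assumptions: rotate_and_restart is a hypothetical helper, the ".old" suffix is arbitrary, and it presumes the server was started with -i so that a pid file exists.
The log is renamed before the signal is sent, so the old process finishes its requests into the renamed file while the new one opens a fresh log.

```shell
#!/bin/sh
# rotate_and_restart PIDFILE LOGFILE COMMAND...
# Hypothetical helper: renames LOGFILE, sends USR1 to the process named
# in PIDFILE, then runs COMMAND to start a replacement server.
rotate_and_restart() {
    pidfile=$1
    logfile=$2
    shift 2
    pid=$(cat "$pidfile") || return 1
    if [ -f "$logfile" ]; then
        mv "$logfile" "$logfile.old"
    fi
    # USR1 closes the listening socket at once, so the replacement
    # can be started without waiting for the old process to exit.
    kill -USR1 "$pid" || return 1
    "$@"
}

# Example invocation (all paths are assumptions):
#   rotate_and_restart /var/run/thttpd.pid /var/log/thttpd.log \
#       thttpd -C /etc/thttpd.conf
```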
520     .SH "SEE ALSO"
521     redirect(8), ssi(8), makeweb(1), htpasswd(1), syslogtocern(8), weblog_parse(1), http_get(1)
522     .SH THANKS
523     .PP
524     Many thanks to contributors, reviewers, testers:
525     John LoVerso, Jordan Hayes, Chris Torek, Jim Thompson, Barton Schaffer,
526     Geoff Adams, Dan Kegel, John Hascall, Bennett Todd, KIKUCHI Takahiro,
527     Catalin Ionescu.
528     Special thanks to Craig Leres for substantial debugging and development,
529     and for not complaining about my coding style very much.
530     .SH AUTHOR
531     Copyright © 1995,1998,1999,2000 by Jef Poskanzer <jef@acme.com>.
532     All rights reserved.
533     .\" Redistribution and use in source and binary forms, with or without
534     .\" modification, are permitted provided that the following conditions
535     .\" are met:
536     .\" 1. Redistributions of source code must retain the above copyright
537     .\" notice, this list of conditions and the following disclaimer.
538     .\" 2. Redistributions in binary form must reproduce the above copyright
539     .\" notice, this list of conditions and the following disclaimer in the
540     .\" documentation and/or other materials provided with the distribution.
541     .\"
542     .\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
543     .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
544     .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
545     .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
546     .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
547     .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
548     .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
549     .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
550     .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
551     .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
552     .\" SUCH DAMAGE.