Verbose mode. This increases the verbosity of the program. Using more than 2
is probably only useful for debugging purposes. The default verbose mode (using
only one -v) gives a nice progress report while digging.
Name
htnotify - sends email notifications about out-dated web pages discovered by
htmerge
Synopsis
htnotify [options]
Description
Htnotify scans the document database created by htmerge and sends an email
message for every page that is out of date. Please have a look at the ht://Dig
notification manual for instructions on how to set up this service.
Options
-b database
Specifies a database to use instead of the one named in the configuration
file.
-c configfile
Use the specified configfile instead of the default.
-v
Verbose mode. This increases the verbosity of the program. Used once, it
displays a log of the email messages that were sent. Used more than once, it
displays information about each document that has email notification set.
Files
/etc/htdig/htdig.conf
The default configuration file.
Name
htload - reads in an ASCII-text version of the document database
Synopsis
htload [options]
Description
Htload reads in an ASCII-text version of the document database, in the same
format as that produced by the -t option of htdig and by htdump. Note that this
overwrites data in your databases, so it should be used with great care.
Options
-a
Use alternate work files. Tells htload to append .work to database files,
allowing it to operate on a second set of databases.
-c configfile
Use the specified configfile instead of the default.
-i
Initial. Do not use any old databases. This is accomplished by first erasing
the databases.
-v
Verbose mode. This doesn't have much effect.
File Formats
Document Database
Each line in the file starts with the document id, followed by a tab-separated
list of fieldname : value pairs. The fields always appear in the order listed
below:
u
URL
t
Title
a
State (0 = normal, 1 = not found, 2 = not indexed, 3 = obsolete)
m
Last modification time as reported by the server
s
Size in bytes
H
Excerpt
h
Meta description
l
Time of last retrieval
L
Count of the links in the document (outgoing links)
b
Count of the links to the document (incoming links or backlinks)
c
HopCount of this document
g
Signature of the document used for duplicate-detection
e
E-mail address to use for a notification message from htnotify
n
Date to send out a notification e-mail message
S
Subject for a notification e-mail message
d
The text of links pointing to this document. (e.g. <a
href="docURL">description</a>)
A
Anchors in the document (i.e. <A NAME=...)
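The line layout described above can be read with a short script. The following
is a minimal Python sketch, not part of ht://Dig: the function name
parse_doc_line and the long field names are hypothetical, and it assumes that
each field is written as a one-letter code, a colon, and the value, with
optional spaces around the colon.

```python
from typing import Dict, Tuple

# One-letter field codes from the list above, mapped to hypothetical long names.
DOC_FIELDS = {
    "u": "url", "t": "title", "a": "state", "m": "modified", "s": "size",
    "H": "excerpt", "h": "meta_description", "l": "last_retrieved",
    "L": "outgoing_links", "b": "backlinks", "c": "hop_count",
    "g": "signature", "e": "notify_email", "n": "notify_date",
    "S": "notify_subject", "d": "link_text", "A": "anchors",
}

# State values as documented for the "a" field.
STATES = {0: "normal", 1: "not found", 2: "not indexed", 3: "obsolete"}


def parse_doc_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Split one document database line into (document id, fields)."""
    parts = line.rstrip("\n").split("\t")
    doc_id = parts[0]
    fields: Dict[str, str] = {}
    for part in parts[1:]:
        # Split on the first colon only, so URLs keep their own colons.
        code, sep, value = part.partition(":")
        code = code.strip()
        if not sep or code not in DOC_FIELDS:
            continue  # skip anything that is not a known field
        # Assumption: if a field code repeats, the last value wins.
        fields[DOC_FIELDS[code]] = value.strip()
    return doc_id, fields


if __name__ == "__main__":
    sample = "7\tu:http://example.com/\tt:Home\ta:0\ts:1234\n"
    doc_id, fields = parse_doc_line(sample)
    print(doc_id, fields["url"], STATES[int(fields["state"])])
```

All values are kept as strings; convert the numeric fields (state, size,
counts) only where needed.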
Word Database
While htdump and htload don't deal with the word database directly, it's
worth mentioning it here because you need to deal with it when copying the
ASCII databases from one system to another. The initial word database produced
by htdig is already in ASCII format, and a binary version of it is produced by
htmerge, for use by htsearch. So, when you copy over the ASCII version of the
document database produced by htdump, you need to copy over the wordlist as
well, then run htload to make the binary document database on the target
system, followed by running htmerge to make the word index.
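The order of the steps on the target system can be sketched as follows. This
is only an illustration in Python: the function name restore_commands is
hypothetical, the -c option for htload is the one documented above, and
passing -c to htmerge is an assumption that it accepts the same option. The
function only builds the command lines; it does not run them.

```python
from typing import List


def restore_commands(config_file: str) -> List[List[str]]:
    """Return the commands to run, in order, after both ASCII files
    (document database and wordlist) have been copied to the target."""
    return [
        # Step 1: rebuild the binary document database from the ASCII dump.
        ["htload", "-c", config_file],
        # Step 2: build the word index from the ASCII wordlist.
        # Assumption: htmerge takes -c in the same way as htload.
        ["htmerge", "-c", config_file],
    ]


if __name__ == "__main__":
    for argv in restore_commands("/etc/htdig/htdig.conf"):
        print(" ".join(argv))
```

To execute the steps, each list could be passed to subprocess.run with
check=True, so that htmerge does not run if htload fails.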
Each line in the word list file starts with the word, followed by a
tab-separated list of fieldname : value pairs. The fields always appear in the
order listed below, with the last two being optional:
i
Document ID
l
Location of word in document (1 to 1000)
w
Weight of word based on scoring factors
c
Count of word's appearances in document, if more than 1
a
Anchor number if word occurred after a named anchor
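A word list line can be read in the same way. The sketch below is a
hypothetical helper, not part of ht://Dig. It assumes that every value is an
integer written after a one-letter code and a colon, that a missing count
means the word appears once, and that a missing anchor means the word did not
follow a named anchor.

```python
from typing import Dict, Optional, Union

# One-letter codes from the list above, mapped to hypothetical long names.
WORD_FIELDS = {
    "i": "doc_id", "l": "location", "w": "weight",
    "c": "count", "a": "anchor",
}

WordRecord = Dict[str, Union[str, int, None]]


def parse_word_line(line: str) -> WordRecord:
    """Parse one word list line into a dictionary."""
    parts = line.rstrip("\n").split("\t")
    # Defaults for the two optional trailing fields (assumptions, see above).
    default_anchor: Optional[int] = None
    record: WordRecord = {"word": parts[0], "count": 1, "anchor": default_anchor}
    for part in parts[1:]:
        code, sep, value = part.partition(":")
        code = code.strip()
        if not sep or code not in WORD_FIELDS:
            continue  # skip anything that is not a known field
        record[WORD_FIELDS[code]] = int(value.strip())
    return record


if __name__ == "__main__":
    print(parse_word_line("search\ti:7\tl:120\tw:3"))
    print(parse_word_line("search\ti:9\tl:15\tw:5\tc:2\ta:1"))
```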
Files
/etc/htdig/htdig.conf
The default configuration file.
/var/lib/htdig/db.docs
The default ASCII document database file.
/var/lib/htdig/db.wordlist
The default ASCII word database file.