N/A
<MIMEFILTERS>
filter-specification
...
</MIMEFILTERS>
N/A
The MIMEFILTERS resource is used to hook user-specified filters into MHonArc. The MIMEFILTERS resource can only be set via the MIMEFILTERS element.
The syntax for each line of the MIMEFILTERS element is as follows:
content-type; routine-name; file-of-routine
The definition of each semicolon-separated value is as follows:
content-type | The MIME content-type the filter processes. An explicit content-type (base/subtype) or a base content-type (base/*) can be specified. |
routine-name | The actual routine name of the filter. The name should be fully qualified by the package it is defined in (e.g. "mypackage::filter"). |
file-of-routine | The name of the file that defines routine-name. If the file is not a full pathname, MHonArc finds the file by looking in the standard include paths of Perl and in the paths specified by the PERLINC resource. |
NOTE | For backwards compatibility, the values of a filter specification can be separated with a colon, ":". However, if you use a colon, package qualification of a function must use Perl 4 syntax (e.g. "mypackage'filter"). |
Whitespace is stripped out for each filter specification.
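As an illustration of the syntax above, the following element registers one filter for a single content-type and another for a whole base type. The content-type application/x-mydata, the package my_filters, and the file myfilter.pl are hypothetical names used only for this sketch; m2h_external::filter and mhexternal.pl are the names used by the default settings.

```
<MIMEFILTERS>
application/x-mydata; my_filters::filter;   myfilter.pl
image/*;              m2h_external::filter; mhexternal.pl
</MIMEFILTERS>
```

Because whitespace is stripped from each specification, the values can be aligned in columns for readability.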
If you want to write your own filter for use in MHonArc, you need to know the Perl programming language. The following information assumes you know Perl.
NOTE | For historical reasons, the filter model follows Perl 4 syntax conventions and constructs. The implementation of a filter can still use Perl 5 syntax and features, where applicable, if MHonArc is running under Perl 5. |
NOTE | The default filters provided by MHonArc are described in the Default Settings section. |
MHonArc interfaces with MIME filters by calling a routine with a specific set of arguments. The prototype of the interface routine is as follows:
    sub filter {
        local($head, *fields, $data, $decoded, $argstring) = @_;

        # Filter code here

        # The last statement should be the return value, unless an
        # explicit return is done.  See the following for the format of the
        # return value.
    }
$head | The header text of the message (or of the body part, if called for a multipart message). |
*fields | A pointer (typeglob) to an associative array holding the header broken down into its fields. If a field occurs more than once in a header, MHonArc separates the field values in the associative array by a separator value. |
$data | A copy of the body of the message (or of the body part, if called for a multipart message). |
$decoded | This flag is set to 1 if MHonArc has already decoded the data. MHonArc decodes data that was encoded as 7-Bit, 8-Bit, Binary, Quoted-Printable, Base64, or Uuencode (x-uuencode, uuencode, x-uue, uue). |
$argstring | An optional argument string that may be used to modify the behavior of the filter. The format of this string is determined by the filter itself. The value of the string is set by the MIMEARGS resource. |
The return value is treated as a list. The first item of the list is a string representing the HTML markup to insert in the HTMLized message. An empty string may be returned to tell MHonArc that the routine was unable to filter data.
Any other list items are treated as the names of files that were generated by the filter. MHonArc needs to keep track of any extra files that a filter generates so that it can delete those files if the message is removed from the archive.
NOTE | If the filter creates a subdirectory with files, the filter only needs to return the subdirectory in the return list. If the message gets removed, MHonArc will delete the entire directory. |
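The calling convention and return value described above can be sketched with a very small filter. This is only an illustration, not one of the filters shipped with MHonArc: the package name my_text_demo is hypothetical, and the filter simply escapes the body and wraps it in a PRE element.

```perl
package my_text_demo;   # hypothetical package name

# Minimal filter sketch: escape HTML special characters in the body
# and wrap the result in a PRE element.
sub filter {
    local($head, *fields, $data, $decoded, $argstring) = @_;

    # An empty string tells MHonArc the routine could not filter the data.
    return ('') unless defined($data) && $data =~ /\S/;

    my $html = $data;
    $html =~ s/&/&amp;/g;
    $html =~ s/</&lt;/g;
    $html =~ s/>/&gt;/g;

    # First list item: the HTML markup.  No derived files were created,
    # so the list has no further items.
    return ("<PRE>\n$html</PRE>\n");
}

1;
```

The second argument must be a typeglob of a package (not lexical) hash, for example `my_text_demo::filter($head, *hdr, $body, 1, '')` where `%hdr` is a package variable.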
The following recommendations/tips are given to help you write filters:
Qualify your filter in its own package. This eliminates possible variable/routine conflicts with MHonArc.
If the filter creates derived files (like the image filters), you may use the variable $mhonarc::OUTDIR to determine the location of the mail archive.
NOTE | Do not include
|
Look at the default filters contained in the distribution of MHonArc. You can use these as templates for writing your own.
Make sure your Perl source file ends with a true statement (like "1;"). MHonArc just performs a require on the file, and if the file does not return true, Perl will abort execution.
Test your filter before production use.
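Several of the tips above can be combined in one sketch: a filter in its own package that writes a derived file under $mhonarc::OUTDIR, returns the file's name after the HTML markup, and ends its source file with a true statement. The package name my_save_demo and the file naming scheme are hypothetical, and returning the name relative to the archive directory is an assumption of this sketch. The code uses Perl 5 lexical filehandles.

```perl
package my_save_demo;   # hypothetical package name

# Sketch: save the body as a derived file and return a link to it.
sub filter {
    local($head, *fields, $data, $decoded, $argstring) = @_;

    # Hypothetical naming scheme; a real filter must guarantee uniqueness.
    my $fname = 'demo-' . time() . "-$$.bin";
    my $path  = "$mhonarc::OUTDIR/$fname";

    # Signal failure with an empty string if the file cannot be written.
    open(my $out, '>', $path) or return ('');
    binmode($out);
    print $out $data;
    close($out);

    # First item: HTML markup.  Second item: the derived file, so that
    # MHonArc can delete it if the message is removed (name assumed to
    # be relative to the archive directory).
    return (qq{<A HREF="$fname">Attached data</A>\n}, $fname);
}

1;   # the file must end with a true statement
```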
If a MIME filter needs to use a C program, or another non-Perl executable, a Perl wrapper must be written for the program in order to interface with MHonArc. The wrapper must follow the conventions described in Writing Filters.
<MIMEFilters>
application/*; m2h_external::filter; mhexternal.pl
application/x-patch; m2h_text_plain::filter; mhtxtplain.pl
audio/*; m2h_external::filter; mhexternal.pl
chemical/*; m2h_external::filter; mhexternal.pl
model/*; m2h_external::filter; mhexternal.pl
image/*; m2h_external::filter; mhexternal.pl
message/delivery-status; m2h_text_plain::filter; mhtxtplain.pl
message/partial; m2h_text_plain::filter; mhtxtplain.pl
text/*; m2h_text_plain::filter; mhtxtplain.pl
text/enriched; m2h_text_enriched::filter; mhtxtenrich.pl
text/html; m2h_text_html::filter; mhtxthtml.pl
text/plain; m2h_text_plain::filter; mhtxtplain.pl
text/richtext; m2h_text_enriched::filter; mhtxtenrich.pl
text/setext; m2h_text_setext::filter; mhtxtsetext.pl
text/tab-separated-values; m2h_text_tsv::filter; mhtxttsv.pl
text/x-html; m2h_text_html::filter; mhtxthtml.pl
text/x-setext; m2h_text_setext::filter; mhtxtsetext.pl
video/*; m2h_external::filter; mhexternal.pl
x-sun-attachment; m2h_text_plain::filter; mhtxtplain.pl
</MIMEFilters>
The following describes the behavior of each filter.
m2h_external::filter extracts the data into a separate file and puts a hyperlink to the file into the HTMLized message.
By default, the filter ignores any filename specification given in the message when writing the data to disk. A unique filename with an extension based upon the sub-type is generated.
m2h_external::filter can take the following arguments:
iconurl="url" | Use "url" as the URL of the icon to display if the useicon option is set. This option overrides any setting defined by the ICONS resource. The double quotes are required. |
inline | Inline image data by default if content-disposition is not defined. |
ext=ext | Use ext as the filename extension. The filter already has a large list of extensions for various content-types. Use this argument if you process a content-type not recognized by the filter. |
type="description" | Use "description" as type description of the data. The double quotes are required. The filter already has a large list of descriptions for various content-types. Use this argument if you process a content-type not recognized by the filter. |
subdir | Place derived file in a subdirectory of the archive. The subdirectory will be called "msgMSGNUM.dir". This option may be useful if usename is specified to avoid security and name conflict problems. |
target=name | Set the TARGET attribute of the anchor link to the file. The default value is undefined (i.e. no TARGET attribute will be written). |
useicon | Include a content-type icon with the hyperlink to the derived file. The icon used is the value of the iconurl option or the icon defined by the ICONS resource. |
usename | Use (file)name attribute for determining name of derived file. Use this option with caution since it can lead to filename conflicts and security problems (however, see the subdir option). |
usenameext | Use (file)name attribute for determining the extension of derived file. Use this option with caution since it can lead to security problems (however, see the subdir option). |
All arguments should be separated by at least one space.
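For filter authors, an argument string in this style (bare flags, name=value pairs, and name="quoted value" pairs separated by spaces) can be split with a short routine. The following is a sketch only; the package name my_args_demo and the routine parse_args are hypothetical and are not how m2h_external::filter itself is implemented.

```perl
package my_args_demo;   # hypothetical package name

# Parse a space-separated argument string into a hash.
# Bare words become flags with value 1; name=value and
# name="quoted value" become key/value pairs.
sub parse_args {
    my ($argstring) = @_;
    my %opt;
    $argstring = '' unless defined $argstring;

    while ($argstring =~ /\G\s*(\w+)(?:=(?:"([^"]*)"|(\S+)))?/gc) {
        my ($name, $quoted, $bare) = ($1, $2, $3);
        $opt{lc $name} = defined($quoted) ? $quoted
                       : defined($bare)   ? $bare
                       :                    1;
    }
    return %opt;
}

1;
```

A filter could then call `my %opt = my_args_demo::parse_args($argstring);` and test, for example, `$opt{usename}` or read `$opt{ext}`.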
The following table lists the content-types that m2h_external::filter recognizes by default, with the filename extension and short description used for each:
Content-type | Extension | Description |
---|---|---|
application/astound | asd | Astound presentation |
application/envoy | evy | Envoy file |
application/fastman | lcc | fastman file |
application/fractals | fif | Fractal Image Format |
application/iges | iges | IGES file |
application/mac-binhex40 | hqx | Mac BinHex archive |
application/mathematica | ma | Mathematica Notebook document |
application/mbedlet | mbd | mbedlet file |
application/msword | doc | MS-Word document |
application/octet-stream | bin | Binary data |
application/oda | oda | ODA file |
application/pdf | pdf | Adobe PDF document |
application/pgp | pgp | PGP message |
application/pgp-signature | pgp | PGP signature |
application/postscript | ps | PostScript document |
application/rtf | rtf | RTF file |
application/sgml | sgml | SGML document |
application/studiom | smp | Studio M file |
application/timbuktu | tbt | timbuktu file |
application/vis5d | v5d | Vis5D dataset |
application/vnd.framemaker | fm | FrameMaker document |
application/vnd.hp-hpgl | hpg | HPGL file |
application/vnd.mif | mif | Frame MIF document |
application/vnd.ms-excel | xls | MS-Excel spreadsheet |
application/vnd.ms-powerpoint | ppt | MS-Powerpoint presentation |
application/vnd.ms-project | mpp | MS-Project file |
application/winhlp | hlp | WinHelp document |
application/wordperfect5.1 | wp | WordPerfect 5.1 document |
application/x-asap | asp | asap file |
application/x-bcpio | bcpio | BCPIO file |
application/x-compress | Z | Unix compressed data |
application/x-cpio | cpio | CPIO file |
application/x-csh | csh | C-Shell script |
application/x-dot | dot | dot file |
application/x-dvi | dvi | TeX dvi file |
application/x-earthtime | etc | Earthtime file |
application/x-envoy | evy | Envoy file |
application/x-excel | xls | MS-Excel spreadsheet |
application/x-gtar | gtar | GNU Unix tar archive |
application/x-gzip | gz | GNU Zip compressed data |
application/x-hdf | hdf | HDF file |
application/x-javascript | js | JavaScript source |
application/x-ksh | ksh | Korn Shell script |
application/x-latex | latex | LaTeX document |
application/x-maker | fm | FrameMaker document |
application/x-mif | mif | Frame MIF document |
application/x-mocha | moc | mocha file |
application/x-msaccess | mdb | MS-Access database |
application/x-mscardfile | crd | MS-CardFile |
application/x-msclip | clp | MS-Clip file |
application/x-msmediaview | m14 | MS-Media View file |
application/x-msmetafile | wmf | MS-Metafile |
application/x-msmoney | mny | MS-Money file |
application/x-mspublisher | pub | MS-Publisher document |
application/x-msschedule | scd | MS-Schedule file |
application/x-msterminal | trm | MS-Terminal |
application/x-mswrite | wri | MS-Write document |
application/x-net-install | ins | Net Install file |
application/x-netcdf | cdf | Cdf file |
application/x-ns-proxy-autoconfig | proxy | Netscape Proxy Auto Config |
application/x-patch | patch | Source code patch |
application/x-perl | pl | Perl program |
application/x-pointplus | css | pointplus file |
application/x-salsa | slc | salsa file |
application/x-script | script | A script file |
application/x-sh | sh | Bourne shell script |
application/x-shar | shar | Unix shell archive |
application/x-sprite | spr | sprite file |
application/x-stuffit | sit | Macintosh archive |
application/x-sv4cpio | sv4cpio | SV4Cpio file |
application/x-sv4crc | sv4crc | SV4Crc file |
application/x-tar | tar | Unix tar archive |
application/x-tcl | tcl | Tcl script |
application/x-tex | tex | TeX document |
application/x-texinfo | texinfo | TeXInfo document |
application/x-timbuktu | tbp | timbuktu file |
application/x-tkined | tki | tkined file |
application/x-troff | roff | Troff document |
application/x-troff-man | man | Unix manual page |
application/x-troff-me | me | Troff ME-macros document |
application/x-troff-ms | ms | Troff MS-macros document |
application/x-ustar | ustar | UStar file |
application/x-wais-source | src | WAIS Source |
application/x-zip-compressed | zip | Zip compressed data |
application/zip | zip | Zip archive |
audio/basic | snd | Basic audio |
audio/echospeech | es | Echospeech audio |
audio/microsoft-wav | wav | Wave audio |
audio/midi | midi | MIDI audio |
audio/wav | wav | Wave audio |
audio/x-aiff | aif | AIF audio |
audio/x-epac | pae | epac audio |
audio/x-midi | midi | MIDI audio |
audio/x-mpeg | mp2 | MPEG audio |
audio/x-pac | pac | pac audio |
audio/x-pn-realaudio | ra | PN Realaudio |
audio/x-wav | wav | Wave audio |
chemical/chem3d | c3d | Chem3d chemical test |
chemical/chemdraw | chm | Chemdraw chemical test |
chemical/cif | cif | CIF chemical test |
chemical/cml | cml | CML chemical test |
chemical/cxf | cxf | Chemical Exchange Format file |
chemical/daylight-smiles | smi | SMILES format file |
chemical/embl-dl-nucleotide | emb | EMBL nucleotide format file |
chemical/gaussian-input | gau | Gaussian chemical test |
chemical/gcg8-sequence | gcg | GCG format file |
chemical/genbank | gen | GENbank data |
chemical/jcamp-dx | jdx | Jcamp chemical spectra test |
chemical/kinemage | kin | Kinemage chemical test |
chemical/macromodel-input | mmd | Macromodel chemical test |
chemical/mdl-molfile | mol | MOL mdl chemical test |
chemical/mdl-rdf | rdf | RDF chemical test |
chemical/mdl-rxn | rxn | RXN chemical test |
chemical/mdl-sdf | sdf | SDF chemical test |
chemical/mdl-tgf | tgf | TGF chemical test |
chemical/mif | mif | MIF chemical test |
chemical/mopac-input | mop | MOPAC data |
chemical/ncbi-asn1 | asn | NCBI data |
chemical/pdb | pdb | PDB chemical test |
chemical/rosdal | ros | Rosdal data |
image/bmp | bmp | Windows bitmap |
image/cgm | cgm | Computer Graphics Metafile |
image/fif | fif | Fractal Image Format image |
image/g3fax | g3f | Group III FAX image |
image/gif | gif | GIF image |
image/ief | ief | IEF image |
image/ifs | ifs | IFS image |
image/jpeg | jpg | JPEG image |
image/png | png | PNG image |
image/tiff | tif | TIFF image |
image/vnd | dwg | VND image |
image/wavelet | wi | Wavelet image |
image/x-cmu-raster | ras | CMU raster |
image/x-pbm | pbm | Portable bitmap |
image/x-pcx | pcx | PCX image |
image/x-pgm | pgm | Portable graymap |
image/x-pict | pict | Mac PICT image |
image/x-pnm | pnm | Portable anymap |
image/x-portable-anymap | pnm | Portable anymap |
image/x-portable-bitmap | pbm | Portable bitmap |
image/x-portable-graymap | pgm | Portable graymap |
image/x-portable-pixmap | ppm | Portable pixmap |
image/x-ppm | ppm | Portable pixmap |
image/x-rgb | rgb | RGB image |
image/x-xbitmap | xbm | X bitmap |
image/x-xbm | xbm | X bitmap |
image/x-xpixmap | xpm | X pixmap |
image/x-xpm | xpm | X pixmap |
image/x-xwd | xwd | X window dump |
image/x-xwindowdump | xwd | X window dump |
model/iges | iges | IGES model |
model/mesh | mesh | Mesh model |
model/vrml | wrl | VRML model |
text/enriched | rtx | Text-enriched document |
text/html | html | HTML document |
text/plain | txt | Text document |
text/richtext | rtx | Richtext document |
text/setext | stx | Setext document |
text/sgml | sgml | SGML document |
text/tab-separated-values | tsv | Tab separated values |
text/x-speech | talk | Speech document |
video/isivideo | fvi | isi video |
video/mpeg | mpg | MPEG movie |
video/msvideo | avi | MS Video |
video/quicktime | mov | QuickTime movie |
video/vivo | viv | vivo video |
video/wavelet | wv | Wavelet video |
video/x-sgi-movie | movie | SGI movie |
m2h_text_enriched::filter processes text/enriched, or text/richtext, data. The following table summarizes the translation of text/enriched commands to HTML tags:
Text/Enriched Command | HTML Translation |
---|---|
<Bold> | <B> |
<Italic> | <I> |
<Underline> | <U> |
<Fixed> | <TT> |
<Smaller> | <SMALL> |
<Bigger> | <BIG> |
<FontFamily><Param>family</Param> | <FONT face="family"> |
<Color><Param>color</Param> | <FONT color="color"> |
<Center> | <P align="center"> |
<FlushLeft> | <P align="left"> |
<FlushRight> | <P align="right"> |
<FlushBoth> | <P align="both"> (not supported in HTML) |
<ParaIndent> | <BLOCKQUOTE> |
<Excerpt> | <BLOCKQUOTE> |
<Lang> | Stripped |
If the text/enriched data contains non-ASCII characters, the filter converts the characters to the appropriate entity references.
NOTE | Only the ISO-8859-[1-10] character sets are recognized. |
m2h_text_html::filter processes text/html or text/x-html data. The filter modifies HTML documents so they can be included in the message pages without producing invalid markup. The following modifications are made to HTML documents when processed by MHonArc:
The HEAD element is removed. Since some elements within the HEAD element may be relevant to the rest of the document, the following is done when removing the HEAD element:
cid: URLs are resolved, if possible. Therefore, if image data related to the HTML document is included with the message, the URLs will be modified to the filenames of the images that were decoded.
m2h_text_plain::filter processes text/plain messages and messages with no MIME information. The filter is also used to process text messages of an unknown subtype.
The default behavior of the filter is to wrap the data in the HTML PRE element and escape special characters. It also converts text that looks like a URL into a hyperlink. If the data contains non-ASCII characters, the filter converts the characters to the appropriate entity references.
NOTE | Only the ISO-8859-[1-10] and ISO-2022-JP character sets are recognized. For ISO-2022-JP data, a Web client with ISO-2022-JP support is required to read the data. |
m2h_text_plain::filter can take the following arguments:
asis=set1:... | Colon-separated list of charsets to leave as-is. Only HTML special characters will be converted into entities. |
default=charset | Character set to use as the default if no character set is defined for the message. If this option is not specified, "us-ascii" is used. |
keepspace | Preserve all spaces if the nonfixed option is specified. All spaces and tabs will be translated to the equivalent number of &nbsp; entity references. |
maxwidth=# | Force the maximum width of lines to be # characters in length. Any lines longer than # characters will be wrapped. |
nonfixed | Do not wrap message text in the HTML PRE element. This will cause text to be rendered in the default font (which is normally proportionally spaced). Each line of the message will have a <BR> appended in order to preserve the line representation of the message. |
nourl | Do not hyperlink URLs. |
quote | Italicize quoted message text. |
target=name | Set the TARGET attribute of the anchor links generated from hyperlinking URLs. |
All arguments should be separated by at least one space.
m2h_text_setext::filter converts text/setext and text/x-setext messages to HTML.
m2h_text_tsv::filter converts text/tab-separated-values data to HTML. The tabular data is converted into an HTML table.
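To give a feel for what a tab-separated-values conversion involves, here is a minimal sketch of a filter that turns each line into a table row. It is not the code of the filter shipped with MHonArc; the package name my_tsv_demo and the exact markup it emits are assumptions made for illustration.

```perl
package my_tsv_demo;    # hypothetical package name

# Escape the characters that are special in HTML.
sub escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Convert tab-separated lines into the rows of an HTML table.
sub filter {
    local($head, *fields, $data, $decoded, $argstring) = @_;

    my $html = "<TABLE BORDER=1>\n";
    foreach my $line (split(/\r?\n/, $data)) {
        next unless length($line);
        # A limit of -1 keeps trailing empty cells.
        my @cells = map { escape($_) } split(/\t/, $line, -1);
        $html .= '<TR><TD>' . join('</TD><TD>', @cells) . "</TD></TR>\n";
    }
    return ($html . "</TABLE>\n");
}

1;
```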
N/A
The following code is an example filter for HTML message data (note, this example has a subset of the functionality of the HTML filter used by MHonArc):
    ##---------------------------------------------------------------------------##
    ##  Copyright (C) 1995-1998  Earl Hood, earlhood@usa.net
    ##---------------------------------------------------------------------------##

    package m2h_text_html;

    $Url = '(\w+://|\w+:)';   # Beginning of URL match expression

    ##---------------------------------------------------------------------------
    ##  The filter must modify HTML content parts for merging into the
    ##  final filtered HTML messages.  Modification is needed so the
    ##  resulting filtered message is valid HTML.
    ##
    sub filter {
        local($header, *fields, $data, $isdecode, $args) = @_;
        local($base, $title, $tmp);

        $base  = '';
        $title = '';
        $tmp   = '';

        ## Get/remove title
        if ($data =~ s%<title\s*>([^<]*)</title\s*>%%i) {
            $title = "<ADDRESS>Title: <STRONG>$1</STRONG></ADDRESS>\n";
        }

        ## Get/remove BASE url
        if ($data =~ s%(<base\s[^>]*>)%%i) {
            $tmp = $1;
            if ($tmp =~ m|href\s*=\s*['"]([^'"]+)['"]|i) {
                $base = $1;
            } elsif ($tmp =~ m|href\s*=\s*([^\s>]+)|i) {
                $base = $1;
            }
        } elsif ((defined($tmp = $fields{'content-base'}) ||
                  defined($tmp = $fields{'content-location'})) &&
                 ($tmp =~ m%/%)) {
            ($base = $tmp) =~ s/['"\s]//g;
        }
        $base =~ s%(.*/).*%$1%;

        ## Strip out certain elements/tags
        $data =~ s%<!doctype\s[^>]*>%%i;
        $data =~ s%</?html[^>]*>%%ig;
        $data =~ s%</?body[^>]*>%%ig;
        $data =~ s%<head\s*>[\s\S]*</head\s*>%%i;

        ## Modify relative urls to absolute using BASE
        if ($base =~ /\S/) {
            $data =~ s%(href\s*=\s*['"])([^'"]+)(['"])%
                       &addbase($base,$1,$2,$3)%gei;
            $data =~ s%(src\s*=\s*['"])([^'"]+)(['"])%
                       &addbase($base,$1,$2,$3)%gei;
        }

        ($title . $data);
    }

    ##---------------------------------------------------------------------------
    sub addbase {
        local($b, $pre, $u, $suf) = @_;
        local($ret);

        $u =~ s/^\s+//;
        if ($u =~ m%^$Url%o) {
            # Non-relative URL, do nothing
            $ret = $pre . $u . $suf;
        } else {
            # Relative URL
            if ($u =~ m%^/%) {
                # Check for "/..."
                $b =~ s%^(${Url}[^/]*)/.*%$1%o;   # Get hostname:port number
            }
            $ret = $pre . $b . $u . $suf;
        }
        $ret;
    }

    ##---------------------------------------------------------------------------
    1;
1.0
CHARSETCONVERTERS, MIMEARGS, PERLINC