Converting LaTeX documents to HTML

LaTeX is a descriptive language for writing texts. It is especially common in scientific fields, thanks to its good text layout and its extensive support for special and unusual notation, such as mathematical formulas. Even though its philosophy is very close to that of HTML, namely describing how a text should be laid out, I ran into a lot of difficulties when converting a text written in LaTeX into HTML. I therefore decided to write this short guide to save time for people in the same situation.

Converting LaTeX to HTML is not difficult in itself: there are plenty of programs on the internet, both command-line and graphical. The problem is the mathematical formulas: HTML does not have enough commands to display them correctly. There are two possible solutions:
  1. Try to represent formulas with the available HTML tags. This is the solution chosen by almost all LaTeX-to-HTML converters. The results are awful.
  2. Convert formulas into images and insert them into the HTML document. This is the most practical solution, and it is the one implemented by latex2html, the main available converter.
In this guide we will consider latex2html. This open source converter is written in Perl and relies on external programs to convert images; unfortunately, it is no longer maintained. As a result, various compatibility problems arise with current Perl interpreters. In this guide I'll try to show how to solve them.

What we need
  • Microsoft Windows: it's an obligatory choice for this guide, since on Ubuntu (Linux) I didn't manage to make latex2html work, despite the wide availability of packages. This guide was written using Windows Vista.
  • MiKTeX: the main LaTeX distribution for Windows. This guide assumes it is installed in C:\MiKTEX. MiKTeX 2.7.3224 (downloaded in June 2009) was used. Note that MiKTeX is a big package whose job is to compile LaTeX documents; as an editor, I suggest TeXnicCenter.
  • Ghostscript: free software for creating PS and PDF files. Ghostscript 8.63 was used.
  • netpbm for Windows: a package of small tools for modifying images from the command line. netpbm 10.27 (downloaded in March 2010) was used.
  • LaTeX2HTML: the original software was written in Perl and was last updated in 2001. However, there are different implementations, including ones for Windows. In this guide we will use latex2html-2008.tar.gz, but I suggest looking in the directory http://saftsack.fs.uni-bayreuth.de/~latex2ht/current/ to check whether there is a more recent version.
  • ActivePerl: Windows has no native Perl implementation, so we have to install one. ActivePerl 5.10.1.1007 was used.
Beware!
  1. There are various incompatibilities among different versions of latex2html, netpbm, Perl, etc. It's important to use exactly the Perl, netpbm and latex2html versions listed above.
  2. It's important to install all the programs in folders without spaces in their names. The directory C:\Program Files is not OK: you'll get strange errors. I suggest C:\MiKTEX for MiKTeX and C:\texutils for all the other programs (latex2html, netpbm, Ghostscript). From now on we'll assume that all the software is installed in these directories.
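
The no-spaces rule is easy to check before installing. The following is a minimal Python sketch, not part of latex2html; the function name paths_with_spaces is hypothetical, and the directory list simply repeats the folders suggested in this guide.

```python
from pathlib import PureWindowsPath

# Directories suggested in this guide; adjust them to your own setup.
INSTALL_DIRS = [
    r"C:\MiKTEX",
    r"C:\texutils\gs",
    r"C:\texutils\netpbm",
    r"C:\texutils\l2h_inst",
    r"C:\texutils\l2h",
]


def paths_with_spaces(paths):
    """Return the paths that contain a space in any of their components."""
    return [
        p for p in paths
        if any(" " in part for part in PureWindowsPath(p).parts)
    ]


if __name__ == "__main__":
    # The extra C:\Program Files entry is only there to show a failing case.
    for bad in paths_with_spaces(INSTALL_DIRS + [r"C:\Program Files\gs"]):
        print("Path contains spaces:", bad)
```

PureWindowsPath only parses the strings, so the check can be run on any machine, even before the folders exist.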


Initial configuration
Install all the software as described above. At the end, we should be in this situation:
  • MiKTeX installed in C:\MiKTEX. We should perform a complete install.
  • Ghostscript installed in C:\texutils\gs.
  • netpbm installed in C:\texutils\netpbm. We should use the executable setup file as the installer, not the archive with the binary files, which would not install the libraries.
  • LaTeX2HTML extracted into C:\texutils\l2h_inst.
  • ActivePerl installed in the default setup directory. In the last installation phase, remember to associate the Perl interpreter with Perl files and to add its path to the system PATH. This way, typing perl at a command prompt launches the interpreter.
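
A quick way to confirm that the interpreter is reachable from the command prompt is sketched below in Python. It is only an illustration: locate_tools is a hypothetical helper, and the assumption that latex is also on the PATH depends on how MiKTeX was installed.

```python
import shutil

# perl must be on the PATH according to this guide.
# latex being on the PATH is an assumption about the MiKTeX installer.
TOOLS = ["perl", "latex"]


def locate_tools(names, which=shutil.which):
    """Map each tool name to its full path, or None if it is not on the PATH."""
    return {name: which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools(TOOLS).items():
        print(f"{name:8} {path or 'NOT FOUND'}")
```

Ghostscript and netpbm are left out on purpose, because latex2html finds them through its own configuration rather than through the system PATH.
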
latex2html configuration
  1. Open the latex2html configuration file, C:\texutils\l2h_inst\prefs.pm. Starting at line 125 we find this code:

    # Specify any additional search paths here, use `:' or `;´ as delimiter
    $prefs{'EXTRAPATH'} = '';

    # This is where the installation will take place. On UNIXish systems
    # $prefs{'PREFIX'} = '/usr/local';
    # is preferred. On DOS/Win, you might say
    # $prefs{'PREFIX'} = 'C:\\progs\\latex2html';
    $prefs{'PREFIX'} = '';

    Change this code to:

    # Specify any additional search paths here, use `:' or `;´ as delimiter
    $prefs{'EXTRAPATH'} = 'C:\\texutils\\gs\\bin;C:\\texutils\\netpbm\\bin';

    # This is where the installation will take place. On UNIXish systems
    # $prefs{'PREFIX'} = '/usr/local';
    # is preferred. On DOS/Win, you might say
    # $prefs{'PREFIX'} = 'C:\\progs\\latex2html';
    $prefs{'PREFIX'} = 'C:\\texutils\\l2h';

    That is, we are adding the paths for Ghostscript and netpbm, and specifying the latex2html installation directory. Note that the backslashes in the paths must be doubled!
  2. Open a command prompt, go to C:\texutils\l2h_inst and launch config.bat. This file configures latex2html. Ignore the errors from the Perl interpreter: the configuration will succeed anyway. As an example, here is my config.bat output:

    Starting Configuration...

    config.pl, Release 2008 (Revision 1.49)
    Accompanies LaTeX2HTML, (C) 1999 GNU Public License.

    checking for old config file (cfgcache.pm)... not found (ok)
    checking for platform... MSWin32 (Windows 32 bit)
    checking for C:\Perl\bin\perl.exe... C:\Perl\bin\perl.exe
    checking perl version... 5.010001
    checking if perl supports some dbm... yes
    checking if perl globbing works... yes
    checking for tex... C:\MikTex\miktex\bin\tex.exe
    checking for latex... C:\MikTex\miktex\bin\latex.exe
    checking for initex... C:\MikTex\miktex\bin\initex.exe
    checking for kpsewhich... C:\MikTex\miktex\bin\kpsewhich.exe
    checking for kpsewhich syntax... ok (style=1)
    checking for TeX include path... /MikTex/tex\latex\latex2html
    checking for mktexlsr... C:\MikTex\miktex\bin\mktexlsr.exe
    checking for dvips... C:\MikTex\miktex\bin\dvips.exe
    checking dvips version... 5.96d
    checking if dvips supports the combination of -E and -i -S 1... yes
    checking for html4-check... no
    checking for gswin32c... \texutils\gs\bin\gswin32c.exe
    checking for ghostscript version... 8.63
    checking for ghostscript portable bitmap device... pnmraw
    checking for full color device for anti-aliasing... ppmraw
    checking for ghostscript library and font paths... built-in paths are correct
    checking for pnmcrop... \texutils\netpbm\bin\pnmcrop.exe

    \texutils\netpbm\bin\pnmcrop.exe -verbose yes
    checking for pnmflip... no
    Warning: You may need to rely on LaTeX to generate images with  effects.
    checking for ppmquant... no
    checking for pnmfile... \texutils\netpbm\bin\pnmfile.exe
    checking for pnmcat... \texutils\netpbm\bin\pnmcat.exe
    checking for pbmmake... \texutils\netpbm\bin\pbmmake.exe
    checking for ppmtogif... \texutils\netpbm\bin\ppmtogif.exe
    yes
    checking if ppmtogif can make interlaced GIFs... yes
    checking for pnmtopng... \texutils\netpbm\bin\pnmtopng.exe
    checking for ppmtojpeg... \texutils\netpbm\bin\ppmtojpeg.exe
    checking for pnmcut... \texutils\netpbm\bin\pnmcut.exe
    checking for pnmpad... \texutils\netpbm\bin\pnmpad.exe
    checking for pnmrotate... \texutils\netpbm\bin\pnmrotate.exe
    checking for pnmscale... \texutils\netpbm\bin\pnmscale.exe
    checking for giftopnm... \texutils\netpbm\bin\giftopnm.exe
    checking for jpegtopnm... \texutils\netpbm\bin\jpegtopnm.exe
    checking for pngtopnm... C:\MikTex\miktex\bin\pngtopnm.exe
    checking for tifftopnm... C:\MikTex\miktex\bin\tifftopnm.exe
    checking for picttoppm... no
    Warning: You cannot directly translate/modify graphics of  format.
    checking for anytopnm... no
    Warning: You cannot directly translate/modify graphics of  format.
    checking for bmptoppm... \texutils\netpbm\bin\bmptoppm.exe
    checking for pcxtoppm... C:\MikTex\miktex\bin\pcxtoppm.exe
    checking for sgitopnm... \texutils\netpbm\bin\sgitopnm.exe
    checking for xbmtopbm... \texutils\netpbm\bin\xbmtopbm.exe
    checking for xwdtopnm... \texutils\netpbm\bin\xwdtopnm.exe
    checking if multiple pipes work... no
    Unfortunately multiple pipes are not reliable on this OS.
    checking for temporary disk space... C:\Users\Isacco\AppData\Local\Temp
    creating cfgcache.pm
    creating test.bat
    creating install.bat
    Note: Will install...
          ... executables to   : C:\texutils\l2h\pippo\bin
          ... shared library items to : C:\texutils\l2h\pippo
          ... unshared library items to : C:\texutils\l2h\pippo
    Starting build...
    ... building latex2html
    build.pl (Revision 1.6)
    Building "latex2html.bat" from "latex2html.pin"
    ... building pstoimg
    build.pl (Revision 1.6)
    Building "pstoimg.bat" from "pstoimg.pin"
    ... building texexpand
    build.pl (Revision 1.6)
    Building "texexpand.bat" from "texexpand.pin"
    ... building configuration module
    build.pl (Revision 1.6)
    Building "l2hconf.pm" from "l2hconf.pin"
    Configuration procedure finished
     

  3. Open C:\texutils\l2h_inst\pstoimg.bat. This script converts the PS files commonly used in LaTeX documents into PNG files, suitable for the web. It relies on netpbm to perform the conversions, but since netpbm has evolved in the meantime, there is a small incompatibility in how the PNG transparency color is specified on the command line. Change line 1253 from:

    $trans_color = $TRANSPARENT_COLOR||'gray85';

    to:

    $trans_color = $TRANSPARENT_COLOR||'#d9d9d9';

    Now latex2html should be ready.
  4. Launch C:\texutils\l2h_inst\test.bat to perform a test. If everything is OK, the folder C:\texutils\l2h_inst\tests\l2h\test will contain a converted HTML document. We can verify that tables, characters, images and formulas were converted correctly in this document.
  5. If everything went fine, we can finally install latex2html into the directory C:\texutils\l2h with the command C:\texutils\l2h_inst\install.bat.
  6. To convert files, use latex2html.bat in the directory C:\texutils\l2h\bin\. To avoid path problems, it is better to copy all the LaTeX files, including images, into this directory.
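
The two manual edits from steps 1 and 3 can also be expressed as small Python helpers. This is only a sketch under the assumptions of this guide (the paths above and the 'gray85' line in pstoimg.bat); the function names are hypothetical and nothing here is part of latex2html.

```python
import re


def perl_single_quoted(paths, sep=";"):
    """Join Windows paths and double every backslash, as done in prefs.pm."""
    return "'" + sep.join(paths).replace("\\", "\\\\") + "'"


def patch_transparent_color(source, new_color="#d9d9d9"):
    """Replace the 'gray85' fallback in the $trans_color assignment."""
    pattern = r"(\$trans_color\s*=\s*\$TRANSPARENT_COLOR\s*\|\|\s*)'gray85'"
    # A function is used as the replacement so that new_color is inserted
    # literally, without any regex escape processing.
    return re.sub(pattern, lambda m: m.group(1) + "'" + new_color + "'", source)


if __name__ == "__main__":
    extrapath = perl_single_quoted(
        [r"C:\texutils\gs\bin", r"C:\texutils\netpbm\bin"]
    )
    print("$prefs{'EXTRAPATH'} = " + extrapath + ";")

    line = "$trans_color = $TRANSPARENT_COLOR||'gray85';"
    print(patch_transparent_color(line))
```

To patch the real file, one would read pstoimg.bat as text, pass its contents through patch_transparent_color and write the result back, keeping a backup copy first.
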
Finally, don't forget that latex2html, like all command-line programs, has a lot of options. To learn about them, you can read its website, or type

latex2html -h

A usage example, where nome_file_tex stands for the name of your .tex file:

latex2html nome_file_tex -info 0 -image_type gif -antialias -antialias_text -long_titles 6 -short_extn -top_navigation -contents_in_navigation -index_in_navigation -split +2 -show_section_numbers
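
Since the option list is long, it can help to assemble it programmatically. A minimal Python sketch follows; build_command and mydoc.tex are hypothetical names, the executable path is the one assumed in this guide, and only the options from the example above are used.

```python
# Path from this guide; adjust it if latex2html was installed elsewhere.
LATEX2HTML = r"C:\texutils\l2h\bin\latex2html.bat"


def build_command(tex_file, image_type="gif", split="+2"):
    """Return the argument list matching the usage example in this guide."""
    return [
        LATEX2HTML, tex_file,
        "-info", "0",
        "-image_type", image_type,
        "-antialias", "-antialias_text",
        "-long_titles", "6",
        "-short_extn",
        "-top_navigation",
        "-contents_in_navigation", "-index_in_navigation",
        "-split", split,
        "-show_section_numbers",
    ]


if __name__ == "__main__":
    # mydoc.tex is a placeholder for your own document.
    print(" ".join(build_command("mydoc.tex")))
```

The printed line can be pasted into a command prompt opened in C:\texutils\l2h\bin\.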

