A Python interface to XSL-FO libraries (conversion of HTML to PDF, RTF, DOCX, WML and ODT)
=======================================
A Python interface to XSL-FO libraries.
=======================================
The zopyx.convert package helps you to convert HTML to PDF, RTF, ODT, DOCX and
WML using XSL-FO technology.
Requirements
============
- Java 1.5.0 or higher (FOP 0.94 requires Java 1.6 or higher)
- `csstoxslfo`__ (included)
__ http://www.re.be/css2xslfo
- `XFC-4.0`__ (XMLMind) for ODT, RTF, DOCX and WML support (if needed)
__ http://www.xmlmind.com/foconverter
- `XINC 2.0`__ (Lunasil) for PDF support (commercial)
__ http://www.lunasil.com/products.html
- or `FOP 0.94`__ (Apache project) for PDF support (free)
__ http://xmlgraphics.apache.org/fop/download.html#dist-type
- `BeautifulSoup`__ (will be installed automatically through easy_install. See Installation.)
__ http://www.crummy.com/software/BeautifulSoup/
- `ElementTree`__ (will be installed automatically through easy_install. See Installation.)
__ http://effbot.org/zone/element-index.html
Installation
============
- install **zopyx.convert** either using ``easy_install`` or by downloading the sources from the Python Cheeseshop.
  This automatically installs the BeautifulSoup and ElementTree modules if necessary.
- the environment variable *$XFC_DIR* must be set and point to the root of your XFC installation directory
- the environment variable *$XINC_HOME* must be set and point to the root of your XINC installation directory
- the environment variable *$FOP_HOME* must be set and point to the root of your FOP installation directory
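
The environment variables above can be checked before running a conversion. The following is a minimal sketch, not part of zopyx.convert: the helper name ``missing_backends`` and the ``BACKEND_ENV_VARS`` table are hypothetical, and only the three variable names come from this document.

```python
import os

# Environment variables named in the installation notes, keyed by backend.
# The table and the helper below are illustrative, not zopyx.convert API.
BACKEND_ENV_VARS = {
    "XFC": "XFC_DIR",
    "XINC": "XINC_HOME",
    "FOP": "FOP_HOME",
}


def missing_backends(environ=None):
    """Return the backends whose variable is unset or is not a directory."""
    environ = os.environ if environ is None else environ
    missing = []
    for backend, var in BACKEND_ENV_VARS.items():
        path = environ.get(var)
        if not path or not os.path.isdir(path):
            missing.append(backend)
    return missing


if __name__ == "__main__":
    for backend in missing_backends():
        print("%s: %s is not set or does not point to a directory"
              % (backend, BACKEND_ENV_VARS[backend]))
```

Only the backends you actually use need to be present, so a non-empty result is not necessarily an error.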
Supported platforms
===================
Windows, Unix
Subversion repository
=====================
- http://svn-public.zopyx.com/viewvc/python-projects/zopyx.convert/trunk/
Usage
=====
Some examples from the Python command-line::

    from zopyx.convert import Converter
    C = Converter('/path/to/some/file.html')
    pdf_filename = C('pdf')      # using XINC
    pdf2_filename = C('pdf2')    # using FOP
    rtf_filename = C('rtf')
    odt_filename = C('odt')
    wml_filename = C('wml')
    docx_filename = C('docx')

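
To produce several formats in one go, the calls above can be wrapped in a loop. This is a sketch under assumptions: ``convert_all`` is a hypothetical helper, and since the exceptions raised by ``Converter`` are not documented here, it catches ``Exception`` broadly.

```python
def convert_all(converter, formats=("pdf", "rtf", "odt", "wml", "docx")):
    """Call converter(fmt) for each format.

    Returns two dicts: format -> output filename for successes and
    format -> error message for failures (for example when the backend
    for that format is not installed).
    """
    results, errors = {}, {}
    for fmt in formats:
        try:
            results[fmt] = converter(fmt)
        except Exception as exc:  # assumption: error types are not documented
            errors[fmt] = str(exc)
    return results, errors


# Usage with the real package (requires zopyx.convert and its backends):
#   from zopyx.convert import Converter
#   results, errors = convert_all(Converter('/path/to/some/file.html'))
```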
A very simple command-line converter is also available::

    xslfo-convert --format rtf --output foo.rtf sample.html

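
The command-line converter can also be driven from a script. The sketch below uses only the ``--format`` and ``--output`` options shown above; the function names are hypothetical, and it assumes ``xslfo-convert`` is on the ``PATH``.

```python
import subprocess


def xslfo_convert_command(html_path, fmt, output_path):
    """Build the argument list for the xslfo-convert call shown above."""
    return ["xslfo-convert", "--format", fmt, "--output", output_path, html_path]


def run_xslfo_convert(html_path, fmt, output_path):
    """Run xslfo-convert; raises CalledProcessError on a non-zero exit code."""
    subprocess.check_call(xslfo_convert_command(html_path, fmt, output_path))
    return output_path
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.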
``xslfo-convert`` has a ``--test`` option that converts some
sample HTML. If everything is OK, you should see output like this::

    >xslfo-convert --test
    Entering testmode
    pdf: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.pdf
    rtf: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.rtf
    docx: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.docx
    odt: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.odt
    wml: /tmp/tmpuOb37m.html -> /tmp/tmpuOb37m.wml
    pdf: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.pdf
    rtf: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.rtf
    docx: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.docx
    odt: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.odt
    wml: /tmp/tmpZ6PGo9.html -> /tmp/tmpZ6PGo9.wml

How zopyx.convert works internally
==================================
- the source HTML file is converted to XHTML (using BeautifulSoup since release 0.5.0, mxTidy before that)
- the XHTML file is converted to FO using the great "csstoxslfo" converter
  written by Werner Donne
- the FO file is passed to an external converter (XINC or FOP for PDF, XFC for
  the other formats) to generate the desired output format
- all converters are based on Java technology, which makes the conversion solution
  highly portable across operating systems (including Windows)
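
The last step depends on the requested output format. The table below is a sketch inferred from the Requirements and Usage sections of this document, not taken from the package source; ``FORMAT_BACKENDS`` and ``backend_for`` are hypothetical names.

```python
# Output format -> (external converter, environment variable that locates it).
# 'pdf' uses XINC and 'pdf2' uses FOP, as in the usage examples; the other
# formats are the ones XFC is listed for in the requirements.
FORMAT_BACKENDS = {
    "pdf": ("XINC", "XINC_HOME"),
    "pdf2": ("FOP", "FOP_HOME"),
    "rtf": ("XFC", "XFC_DIR"),
    "odt": ("XFC", "XFC_DIR"),
    "docx": ("XFC", "XFC_DIR"),
    "wml": ("XFC", "XFC_DIR"),
}


def backend_for(fmt):
    """Return (converter name, environment variable) for an output format."""
    try:
        return FORMAT_BACKENDS[fmt]
    except KeyError:
        raise ValueError("unsupported format: %r" % (fmt,))
```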
Known issues
============
- If you are using zopyx.convert together with FOP: use the latest FOP 0.94
  only. Don't use any packaged FOP version like the one from MacPorts, which is
  known to be broken.
- Ensure that you have read the ``csstoxslfo`` documentation. ``csstoxslfo`` has
  several requirements about the HTML markup. Don't expect it to be the ultimate
  HTML converter. The necessary markup is documented in the
  ``csstoxslfo`` documentation; questions about it will not be answered.
Author
======
**zopyx.convert** was written by Andreas Jung for ZOPYX Ltd. & Co. KG, Tuebingen, Germany.
License
=======
**zopyx.convert** is published under the Zope Public License 2.1 (ZPL).
See LICENSE.txt.
Contact
=======
| ZOPYX Ltd. & Co. KG
| c/o Andreas Jung,
| Charlottenstr. 37/1
| D-72070 Tuebingen, Germany
| E-mail: info at zopyx dot com
| Web: http://www.zopyx.com
Changes:
========
1.1.11 (07.06.2009)
-------------------
- moved code repository to svn.zope.org
- changed license to ZPL
1.1.10 (29.05.2009)
-------------------
- support for USE_OS_SYSTEM environment variable (as workaround
  for hanging Java processes)
1.1.9 (04.01.2009)
------------------
- fixed packaging issue
1.1.8 (26.06.2008)
------------------
- changed logging levels
- reorganized files
1.1.7 (20.06.2008)
------------------
- better support for csstoxslfo commandline options
1.1.6 (19.04.2008)
------------------
- call 'fop' using bash
- better logger configuration
- minor code cleanup
1.1.5 (01.03.2008)
------------------
- updated documentation
1.1.4 (05.02.2008)
------------------
- remove duplicate ID attributes
1.1.3 (31.01.2008)
------------------
- clarified Java requirements for FOP
1.1.2 (22.01.2008)
------------------
- removed some nasty debugging code
1.1.1 (22.01.2008)
------------------
- supporting FOP on Windows
1.1.0 (20.01.2008)
------------------
- support for free FOP PDF converter
1.0.6 (14.10.2007)
------------------
- html2fo: added workaround for generated FO code for
  PRE tags
1.0.5 (05.10.2007)
------------------
- minor bugfixes
1.0.4 (05.10.2007)
------------------
- Windows support added
1.0.3 (04.10.2007)
------------------
- passing -Duser.language=en to java in order to
  prevent corrupted FO code caused by locales
1.0.2 (03.10.2007)
------------------
- bugfix
1.0.1 (03.10.2007)
------------------
- added --test option to command-line frontend
1.0.0 (30.09.2007)
------------------
- update to css2xslfo V 1.5.0
- official 1.0.0 release
0.5.0 (09.09.2007)
------------------
- replaced mxTidy related code with the BeautifulSoup
  module (no longer requires any compiling)
- html2fo checks the existence of images
0.4.9 (25.07.2007)
------------------
- support for utidy lib (which is the preferred tidy library).
  Using mx.Tidy only as fallback
0.4.8 (unreleased)
------------------
- unreleased
0.4.7 (08.07.2007)
------------------
- reSTified documentation
0.4.6 (08.07.2007)
------------------
- fixes in availableFormats()
0.4.5 (07.07.2007)
------------------
- various FO fixes
0.4.4 (06.07.2007)
------------------
- using logging module
0.4.3 (05.07.2007)
------------------
- html2fo: using ElementTree for most FO modifications
0.4.2 (30.06.2007)
------------------
- converting page-break-after: always back into break-after: page
0.4.1 (24.06.2007)
------------------
- various fixes
0.4.0 (24.06.2007)
------------------
- added zope interfaces
- converters are now classes
- added unittests
0.3.1 (18.06.2007)
------------------
- html2fo() and the converter constructor got a new 'encoding'
  parameter in order to specify the input encoding of the
  HTML file. This parameter will be passed down to Tidy in order
  to perform a proper conversion of non-ascii characters.
0.3.0 (unreleased)
------------------
- using subprocess module of Python
- new Convert() class for high-level XSLFO access
- logger added
- better checks for XINC, XFC
- updated documentation
0.2.0 (16.06.2007)
------------------
- PDF support added
- command line interface added
- mxTidy integration
0.1.0 (16.06.2007)
------------------
- initial release