44

Can I do this conversion with any programming language or library?

Sridhar Ratnakumar
Juanjo Conti

7 Answers

60

The short answer is yes, it can be done in any programming language.

Basic steps:

  1. Convert your HTML to XHTML (+ CSS). This can be done in your program or through an XSLT file.
  2. Copy your files (XHTML, CSS, any images and fonts) into a directory structure that follows the EPUB format.
  3. Zip the directory structure up and name the archive with a ".epub" extension.
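Step 3 has one detail that is easy to miss, so here is a minimal Python sketch of it. It assumes the directory from step 2 is already prepared, with `META-INF/container.xml` and your package file and content in place. The function name `zip_epub` is made up for illustration. The EPUB container format expects the `mimetype` entry to be the first entry in the archive and to be stored uncompressed, which is why the sketch writes it separately. The result is not guaranteed to pass epubcheck; that depends on what is in the directory.

```python
import zipfile
from pathlib import Path


def zip_epub(source_dir, epub_path):
    """Zip a prepared EPUB directory tree into epub_path (hypothetical helper)."""
    source = Path(source_dir)
    with zipfile.ZipFile(epub_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr(
            "mimetype",
            "application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        # Add every other file, keeping paths relative to the source directory.
        for path in sorted(source.rglob("*")):
            if path.is_file() and path.name != "mimetype":
                archive.write(path, path.relative_to(source).as_posix())
```

Calling `zip_epub("book", "book.epub")` would package a directory named `book`; both names are placeholders.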

Some web sites to help you get started:

June 2015 note: The epubcheck validator has moved from Google Code to GitHub; note the new URL.

eb1
16

Calibre supports a wide variety of input formats, including HTML, and a wide variety of output formats, including EPUB, but it's not "a programming language or library". Are there specific reasons you desire a programming-based approach rather than a free-standing tool? If so, maybe Python and ebookmaker.py, for example, could help you.
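If a programmatic approach is needed anyway, Calibre can still be driven from a script through its `ebook-convert` command-line tool. The following Python sketch assumes Calibre is installed and `ebook-convert` is on the PATH; the function names are hypothetical, and no conversion options are passed.

```python
import subprocess


def ebook_convert_command(html_path, epub_path):
    """Build the argument list for Calibre's ebook-convert CLI."""
    return ["ebook-convert", str(html_path), str(epub_path)]


def html_to_epub(html_path, epub_path):
    """Run the conversion; raises CalledProcessError if ebook-convert fails."""
    subprocess.run(ebook_convert_command(html_path, epub_path), check=True)
```

The output format is chosen from the extension of the output file name, so an `.epub` suffix is what selects EPUB here.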

Ry-
Alex Martelli
3

A late reply, but I found the Python 3-based ebookmaker to be of value, at least after I contributed a pull request to remove a UTF-8 BOM. One problem with it appears to be that it uses brittle regular expressions to parse HTML, but I guess I'll have to report it there.

Community
Shlomi Fish
1

I just started to implement such a tool in Java (OpenJDK compatible): html2epub. To avoid editing the config file by hand, I'll probably start a separate tool that generates the config file from any given directory. It would still be necessary to determine the order of the XHTML files in the EPUB, though: for non-programmatic use, a GUI helper tool could be considered, but I haven't come up with an idea for a fully flexible programmatic solution yet. Before that, I implemented shell-script-based converters for custom XML input (the hag2epub tools). If you're interested, I would probably port them to XHTML input, with the EPUB metadata coming either from a config file or from the topmost index.html of a directory, if one exists.

skreutzer
  • If you licensed it under Apache 2.0, it would be the way to go for many people; as it's under the AGPL, I cannot use it. It's a pity :( – To Kra Oct 14 '15 at 07:00
  • Could you please tell me how the AGPL could possibly be blocking you from using it? – skreutzer Oct 19 '15 at 22:58
  • I cannot use AGPL code in a commercial product. – To Kra Oct 21 '15 at 08:26
  • That's simply not true. On the contrary: being usable in a commercial context or as a commercial product is an important right that free software grants its users, and the AGPL grants it too. – skreutzer Oct 22 '15 at 20:31
1

Here's a PDF to EPUB converter. I know that's not what you're after, but it's a start.

The Calibre package may have what you want.

cofiem
1

I am using the following library from Aspose - http://www.aspose.com/categories/.net-components/aspose.words-for-.net/default.aspx

In just two lines of code I am able to do HTML to EPUB conversions. I am currently using this in a production system.

using Aspose.Words;

// Load the source HTML file, then save it in EPUB format.
Document doc = new Document(_sourceFilePath);
doc.Save(_destinationFilePath, SaveFormat.Epub);

Brian Singh
0

I had the same issue previously, because I wanted to read some web page content offline on my iPad. I had no idea how to do it, and I am not computer savvy. There are Calibre, Stanza, and so on.

But for me they are just format converters, and I needed an ePub book creator that would let me combine many documents into one book to read. Then I found a bookish HTML to ePub converter: I save the HTML page from the web and then convert it with that tool. It works quite well for me now.

user81718