Programming thoughts and other randomities: Koboish ebooks

Thursday, July 19, 2012

Koboish ebooks

I really like the Kobo Touch, but I noticed that the device treats side-loaded epubs differently from epubs downloaded from the Kobo store. What I liked most about the store books was the page numbering, which corresponds to the actual screens required to reach the end of a chapter. After noticing this I started combing the net and discovered the infamous kepub format. I also found that changing the extension of the files from .epub to .kepub.epub should give me access to some of those features, so I did just that.
Results were mixed:

Hooray, we've got page numbering
In addition to page numbering we can also control the text appearance using the fonts menu
hmmm, where did the covers go? (actually I did read about this happening)
The third thing is more subtle, and I only noticed it after reading about 30 pages of a chapter, closing the book and then trying to pick up where I had left off: while the chapter was correct, it always loaded the first page. The underlying problem is that I could no longer add annotations to the book.

Since I really, really wanted that page numbering but don't have the time to read a book in one go, I started to poke around a little.
A warning first:

This works on the Kobo Touch with the 2.0 firmware
While I've managed to get covers and annotations working, I don't know if anything else broke
FOLLOW THIS GUIDE AT YOUR OWN RISK

The first thing you need to do is to change the extension of your epub to .kepub.epub and upload it to your kobo (disconnect the device to trigger it to detect the book). The rest I'll split into three parts:
Give me my cover back
Recover annotations
Get lazy with scripting
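
The renaming step can also be scripted. Here is a minimal sketch in Python; the function names and the device folder argument are made up for illustration and are not part of any Kobo tool:

```python
import os
import shutil


def kepub_name(filename):
    """Return filename with '.kepub.epub' in place of a plain '.epub' suffix."""
    if filename.lower().endswith(".kepub.epub"):
        return filename  # already renamed, leave it alone
    root, ext = os.path.splitext(filename)
    if ext.lower() != ".epub":
        raise ValueError("not an epub file: %r" % filename)
    return root + ".kepub.epub"


def copy_as_kepub(source_path, device_dir):
    """Copy source_path into device_dir (hypothetical mount point) under its new name."""
    target = os.path.join(device_dir, kepub_name(os.path.basename(source_path)))
    shutil.copyfile(source_path, target)
    return target
```

Copying instead of renaming in place keeps the original epub untouched on the computer.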

Give me my cover back

The first thing I wanted to do was get rid of the ugly kobo generated covers. Even if the epub has a cover set, after changing the extension to .kepub.epub that cover is no longer used and is replaced with a generated one.
In order to fix this you need to do the following:

Assuming your cover is a jpeg file named mywonderfullcover.jpg
you will need your cover in four sizes (you can also convert it to grayscale to reduce the size). Each one of these files needs to have a special name and the extension changed from .jpg to .parsed:

"mywonderfullcover - N3_LIBRARY_FULL.parsed" - size: 355 x 530 pixels
"mywonderfullcover - N3_LIBRARY_GRID.parsed" - size: 149 x 223 pixels
"mywonderfullcover - N3_LIBRARY_LIST.parsed" - size: 60 x 90 pixels
"mywonderfullcover - N3_LIBRARY_SHELF.parsed" - size: 40 x 60 pixels

Connect your kobo and copy the files you created to the .kobo/images folder
Next you'll need to update the sqlite database. The file can be found at .kobo/KoboReader.sqlite. I suggest making a backup of it first. Open the database (I use SQLiteSpy, but you can use any sqlite browser you like). Open up the content table and look for the row that contains your book name in the ContentID column and has the ContentType column set to 6. The ImageId column for this row should be empty. Set it to "mywonderfullcover".
Save the changes and disconnect the reader
In order for the cover to be detected you have to restart the reader (if anyone knows of another way of refreshing the data from the database, please let me know)
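
The four file names and sizes from the list above are easy to get wrong by hand, so here is a small sketch that only builds the target names. The helper name and the default folder argument are assumptions for illustration; the suffixes and sizes are the ones listed above:

```python
import os

# Suffixes and pixel sizes as listed above (Kobo Touch, firmware 2.0).
COVER_VARIANTS = [
    ("N3_LIBRARY_FULL", (355, 530)),
    ("N3_LIBRARY_GRID", (149, 223)),
    ("N3_LIBRARY_LIST", (60, 90)),
    ("N3_LIBRARY_SHELF", (40, 60)),
]


def cover_targets(image_id, images_dir=os.path.join(".kobo", "images")):
    """Return (path, (width, height)) pairs for the four .parsed files."""
    return [
        (os.path.join(images_dir, "%s - %s.parsed" % (image_id, suffix)), size)
        for suffix, size in COVER_VARIANTS
    ]


for path, (width, height) in cover_targets("mywonderfullcover"):
    print("%s: %d x %d" % (path, width, height))
```

The resizing itself still has to be done with an image tool of your choice; this only tells you what to name the results.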

Now you should have your cover back, plus the additional benefits already listed, but you still don't have any annotations.
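
The database edit described above can also be done without a sqlite browser. This is a rough sketch using Python's built-in sqlite3 module. The table and column names (content, ContentID, ContentType, ImageId) are the ones mentioned above; the function name is made up, and I am assuming the book name appears as a substring of ContentID. Work on a backup copy first.

```python
import sqlite3


def set_cover_image_id(db_path, book_name, image_id):
    """Set ImageId on book rows (ContentType 6) whose ContentID contains book_name.

    Only rows with an empty ImageId are touched. Returns the number of rows updated.
    Note that % and _ inside book_name act as LIKE wildcards.
    """
    connection = sqlite3.connect(db_path)
    try:
        with connection:  # commits on success, rolls back on error
            cursor = connection.execute(
                "UPDATE content SET ImageId = ? "
                "WHERE CAST(ContentType AS INTEGER) = 6 "
                "AND ContentID LIKE ? "
                "AND (ImageId IS NULL OR ImageId = '')",
                (image_id, "%" + book_name + "%"),
            )
            return cursor.rowcount
    finally:
        connection.close()
```

A call would look like set_cover_image_id(".kobo/KoboReader.sqlite", "mybook", "mywonderfullcover"), run while the reader is connected. If the return value is not 1, check the book name before restarting the reader.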

Recover annotations

At first I thought that the annotations were kept somewhere in the database...I was wrong.
I also tried putting the file in the .kobo/kepub folder and adding the required entries manually in the database, but still there were no annotations, although I did learn some stuff about the database structure while doing this.
After doing a little more reading on the forums I found some opinions that there are some special javascripts involved, so I opened a free book that I got from the kobo site. This also proved wrong but it did lead me in the correct direction.
There doesn't seem to be any javascript involved but, looking at the html files, it looks like the html files inside the kepubs have some special tags inserted that allow annotations to be retrieved.
So what I did was open each html file and wrap the content of each top-level h and p tag in a span tag with the id set to "kobo.[incremental_count].1", where incremental_count starts at 1 and goes as high as you need it to.
To illustrate this let's suppose you have the following inside the body tag of your html:

<h1>Chapter</h1>
<h2>SubChapter</h2>
<p>some paragraph</p>
<p>some other paragraph</p>
<p>the last paragraph</p>

This would be changed to:

<h1><span id="kobo.1.1">Chapter</span></h1>
<h2><span id="kobo.2.1">SubChapter</span></h2>
<p><span id="kobo.3.1">some paragraph</span></p>
<p><span id="kobo.4.1">some other paragraph</span></p>
<p><span id="kobo.5.1">the last paragraph</span></p>

Upload your .kepub.epub back on the device and enjoy the freshly recovered annotations.

Get lazy with scripting

Automatic renaming using calibre

If you use calibre you can configure it to rename your file when uploading it to the reader by going to "Preferences->Sending books to device" and adding ".kepub" at the end of the title template.

Automatic Photoshop script for updating images

If you use photoshop you can use the script below to automatically create and rename your cover files (save the code in a file with the extension set to .jsx). Just open your jpeg file in photoshop and go to File->Scripts->Browse and select the file where you saved the script code. Your original file will be left unchanged.

    #target photoshop
    main();

    // Create the four Kobo cover files (grayscale JPEG data with a .parsed
    // extension) in the folder of the currently open document.
    function main() {
        if (!documents.length) return;
        var startRulerUnits = app.preferences.rulerUnits;
        app.preferences.rulerUnits = Units.PIXELS;
        var quality = 10; // JPEG quality, 1-12
        var doc = app.activeDocument;
        var Name = doc.name.replace(/\.[^\.]+$/, ''); // file name without extension
        var Path = decodeURI(doc.path);
        var outFolder = Folder(Path);
        if (!outFolder.exists) outFolder.create();

        doc.changeMode(ChangeMode.GRAYSCALE);
        doc.bitsPerChannel = BitsPerChannelType.EIGHT;
        // Snapshot the grayscale original so every resize starts from it.
        createNamedSnapshot("Snap 1");

        doc.resizeImage(355, 530, 96, ResampleMethod.BICUBIC);
        var saveFile = File(outFolder + "/" + Name + " - N3_LIBRARY_FULL.parsed");
        SaveJPEG(saveFile, quality);
        // Also keep a .jpeg copy of the full-size cover.
        saveFile = File(outFolder + "/" + Name + " - N3_LIBRARY_FULL.jpeg");
        SaveJPEG(saveFile, quality);
        revertNamedSnapshot("Snap 1");

        doc.resizeImage(149, 223, 96, ResampleMethod.BICUBIC);
        saveFile = File(outFolder + "/" + Name + " - N3_LIBRARY_GRID.parsed");
        SaveJPEG(saveFile, quality);
        revertNamedSnapshot("Snap 1");

        doc.resizeImage(60, 90, 96, ResampleMethod.BICUBIC);
        saveFile = File(outFolder + "/" + Name + " - N3_LIBRARY_LIST.parsed");
        SaveJPEG(saveFile, quality);
        revertNamedSnapshot("Snap 1");

        doc.resizeImage(40, 60, 96, ResampleMethod.BICUBIC);
        saveFile = File(outFolder + "/" + Name + " - N3_LIBRARY_SHELF.parsed");
        SaveJPEG(saveFile, quality);

        // Close without saving so the original file is left unchanged.
        app.activeDocument.close(SaveOptions.DONOTSAVECHANGES);
        app.preferences.rulerUnits = startRulerUnits;
    }

    // Save a copy of the active document as a baseline JPEG.
    function SaveJPEG(saveFile, jpegQuality) {
        var jpgSaveOptions = new JPEGSaveOptions();
        jpgSaveOptions.embedColorProfile = true;
        jpgSaveOptions.formatOptions = FormatOptions.STANDARDBASELINE;
        jpgSaveOptions.matte = MatteType.NONE;
        jpgSaveOptions.quality = jpegQuality; // 1-12
        activeDocument.saveAs(saveFile, jpgSaveOptions, true, Extension.LOWERCASE);
    }

    function createNamedSnapshot(name) {
        var desc = new ActionDescriptor();
            var ref = new ActionReference();
            ref.putClass( charIDToTypeID('SnpS') );
        desc.putReference( charIDToTypeID('null'), ref );
            var ref1 = new ActionReference();
            ref1.putProperty( charIDToTypeID('HstS'), charIDToTypeID('CrnH') );
        desc.putReference( charIDToTypeID('From'), ref1 );
        desc.putString( charIDToTypeID('Nm  '), name );
        desc.putEnumerated( charIDToTypeID('Usng'), charIDToTypeID('HstS'), charIDToTypeID('FllD') );
        executeAction( charIDToTypeID('Mk  '), desc, DialogModes.NO );
    }

    function revertNamedSnapshot(name) {
        var desc = new ActionDescriptor();
            var ref = new ActionReference();
            ref.putName( charIDToTypeID('SnpS'), name );
        desc.putReference( charIDToTypeID('null'), ref );
        executeAction( charIDToTypeID('slct'), desc, DialogModes.NO );
    }

Automatic kobofying

Now all that is fine, but what if you don't want to spend twice as much time as is required to read a book just to kobofy it?
No worries: since I'm a little lazy myself I created the small python script below. In order to run it you need python 2.7 and BeautifulSoup installed.
In order to use it, unpack your files to a folder, create a file called something.py inside that folder containing the code below and run it by calling "python something.py".
Warning: this script was hastily put together so it has a few restrictions:

It always adds the spans to h and p tags, so don't run it multiple times on the same file
It only works on top-level h and p tags so, for example, if your h and p tags are wrapped in a div or another tag it won't work.

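import os
import re
import sys

from bs4 import BeautifulSoup

# Match only p and h1..h6; the looser "^(p|h)" would also catch tags such as pre or hr.
HEADING_OR_PARAGRAPH = re.compile(r"^(p|h[1-6])$")


def altertags(soup):
    """Wrap the contents of each top-level h/p tag in a kobo.N.1 span."""
    counter = 1
    for tag in soup.body.find_all(HEADING_OR_PARAGRAPH, recursive=False):
        new_tag = soup.new_tag("span", id="kobo." + str(counter) + ".1")
        counter = counter + 1
        tag.wrap(new_tag)   # <span><p>text</p></span>
        tag.unwrap()        # <span>text</span>
        new_tag.wrap(tag)   # <p><span>text</span></p>


def parsefile(infile, outfile):
    # Read everything before writing, since infile and outfile may be the same file.
    with open(infile, "rb") as source:
        # html.parser ships with Python; pass "lxml" instead if you have it installed.
        soup = BeautifulSoup(source, "html.parser")
    altertags(soup)
    with open(outfile, "wb") as output:
        output.write(soup.encode("utf-8"))


if len(sys.argv) > 2:
    # python something.py infile outfile
    parsefile(sys.argv[1], sys.argv[2])
elif len(sys.argv) == 2:
    # python something.py file (modified in place)
    parsefile(sys.argv[1], sys.argv[1])
else:
    # no arguments: process every html file in the current folder, in place
    for filename in os.listdir("."):
        if filename.endswith("html"):
            parsefile(filename, filename)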
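
As for the two restrictions listed above, here is one possible way around them. This is only a sketch, not something I have run against real books: it assumes Python 3, uses the standard library's ElementTree instead of BeautifulSoup, and expects well-formed XHTML. The function names are made up. It looks at h and p tags at any depth and skips tags whose content is already a single kobo span, so running it twice should be harmless.

```python
import re
import xml.etree.ElementTree as ET

HEADING_OR_PARAGRAPH = re.compile(r"^(p|h[1-6])$")


def split_tag(tag):
    """Split '{namespace}name' into ('{namespace}', 'name')."""
    if tag.startswith("{"):
        namespace, name = tag[1:].split("}", 1)
        return "{%s}" % namespace, name
    return "", tag


def already_wrapped(element):
    """True if the element's only content is a single kobo.* span."""
    children = list(element)
    if len(children) != 1 or (element.text or "").strip():
        return False
    child = children[0]
    return (split_tag(child.tag)[1] == "span"
            and child.get("id", "").startswith("kobo.")
            and not (child.tail or "").strip())


def add_kobo_spans(body):
    """Wrap the content of every h1-h6/p under body; return how many were seen."""
    counter = 0
    for element in list(body.iter()):  # snapshot, because the tree is modified
        namespace, name = split_tag(element.tag)
        if not HEADING_OR_PARAGRAPH.match(name):
            continue
        counter += 1
        if already_wrapped(element):
            continue
        span = ET.Element(namespace + "span", {"id": "kobo.%d.1" % counter})
        span.text, element.text = element.text, None
        for child in list(element):
            element.remove(child)
            span.append(child)
        element.append(span)
    return counter


demo_body = ET.fromstring(
    "<body><div><h1>Chapter</h1><p>some <em>paragraph</em></p></div></body>")
add_kobo_spans(demo_body)
add_kobo_spans(demo_body)  # the second run finds nothing left to wrap
print(ET.tostring(demo_body, encoding="unicode"))
```

ElementTree refuses markup that is not well-formed, so badly converted epubs would need cleaning up first.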

If someone can improve this, please share.
I don't actually use this, as it requires you to unpack the book, repack it and so on, and that's too much work for me, so I just modified kiwidude's Modify Epub plugin for calibre to do this for me.
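
For anyone who does want to go the manual route, the unpack and repack steps can be scripted with Python's zipfile module. A minimal sketch with made-up function names follows. The one thing to watch is that an epub expects its mimetype file to be the first entry in the archive and stored uncompressed, so write the output file somewhere outside the unpacked folder and let the function add mimetype first.

```python
import os
import zipfile


def unpack_epub(epub_path, target_dir):
    """Extract every file of the epub into target_dir."""
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(target_dir)


def pack_epub(source_dir, epub_path):
    """Zip an unpacked book folder into epub_path, mimetype entry first."""
    with zipfile.ZipFile(epub_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry goes in first and is stored without compression.
        archive.write(os.path.join(source_dir, "mimetype"), "mimetype",
                      compress_type=zipfile.ZIP_STORED)
        for folder, _subfolders, files in os.walk(source_dir):
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname != "mimetype":
                    archive.write(full_path, arcname)
```

So the whole round trip would be unpack_epub, run the script on the html files, then pack_epub with a name ending in .kepub.epub.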

Well...that's about it. If anyone reads this and finds it useful feedback is always appreciated.

33 comments:

Max, July 31, 2012 at 12:17 AM
Hi, may I ask if it's possible to get a copy of your modified "Modify Epub" plugin?

Michael, July 31, 2012 at 12:34 PM
Hi dsandrei
Thanks so much for the uploaded my modified Modify Epub.
Can not believe on 2 requests as it changes it to Kepub!
Perhaps too busy complaining about V2. I bought mine only 3 weeks ago so I am learning. Knew nothing about epubs. If it was VBA for Access I would be OK.
Just a bit of feed back and request for help.
1st found had to add .kepub to the "Configure the device" for it to work. As putting it in Preference>Output Options had no effect.
2nd download your zip file and used Preferences>Load Plugin from file. (I already had Modify Epub installed and did not remove it as I could not find it). Seemed to update as I now have "Add Kobo paragraph markers". Highlighted epub > tick "Add Kobo paragraph markers"> OK. Get calibre, version 0.8.61
ERROR: Modify ePub failed: No ePub files were updated

Modify ePubs
Logfile for book ID 133 (Master and Commander / Patrick O'Brian)
133
Modifying: C:\Users\Michael\AppData\Local\Temp\calibre_0.8.61_tmp_iq9uxm\uokqnn_modify_epub\133.epub
Trying to add kobo page markers
ePub not changed after 0.99 seconds
Any thoughts on what I am doing wrong?
Many thanks for your hard work.
Michael

Michael, July 31, 2012 at 4:00 PM
Hi 1 epub I purchased from Amazon converted ok.
Looked into another and found "eagerness in his step than the mere Lieutenant Aubrey would have shown./p
div class="calibre1">/div
div class="mbppagebreak" id="calibre_pb_1"/div
/div" for example.
Don't understand DIV tag any suggestions on what I can do?
Many thanks
Michael
ps had to remove <> as did not allow tags

Michael, August 5, 2012 at 11:30 AM
Hi Daniel
Thanks for pointers and have had success in using "Search & Replace" in calibre convert to remove a DIV etc from HTML's in one easy click.
Now with your jsx PS file I have the covers as well. Really please as my ebooks now work well. No really interested in Shelves as I read a book and the delete it.
Calibre works no problem with V 2. Just go the CalibrePreferences "dialog" look for the Advanced section at the bottom, then Plugins, expand the line that reads Device Interface Plugins, select Kobo Reader Device Interface and then Customize, and finally, on that screen select the option Attempt to support new...
In our libraries lend epubs so that where I hope to get my books from.
Thanks again and any tips you have come across I would be interested in as I don't think Kobo will do a lot with epubs as they wish to sell Kepubs.
Michael

Michael, August 8, 2012 at 4:42 AM
Hi Daniel
As I said "Search & Replace" in calibre convert to remove a DIV etc from HTML's in one easy click. Some CSS margins can be wrong and this can be change on convert by removing with "Look and Feel".
Find to make sure it rememberer's where it is must goto sleep mode first. Although is set sleep 15mins off 60 or never works fine.
You said you are not sure about converting to Kepubs. Have you found some problems? Be interested.
Thanks Michael

Michael, August 14, 2012 at 6:55 AM
Hi Daniel hope the holiday was good.
Don't seem to notice any major problem in use except sleep mode first to remember position. A Kobofied epub is better in that making the bottom menu appear I see the little stripes indicating new chapter starts and how much of the book read by the thick black line from the left. Kobo books do not have this and I just read a Kobo book (purchased) which was in 5 parts with about 22 chapters in each, so really missed the little stripes in bottom menu to see how far I had read.
I'm interest too if automation can be improved but it seems very few are interested!
One change I make is to change "font-size: 1em;" to "font-size: 1.1em;" in stylesheet.css. I find that the font size is then the same as Kobo books and I can change between type without altering font size or margins

Unknown, August 14, 2012 at 7:11 AM
I usually read one book at a time so it's not difficult changing the font size on the device. I just strip out the font formatting from the epubs.

Michael, September 1, 2012 at 2:09 AM
Hi Daniel
Downloaded 2.1.1 seems to work great with Kobofied books.
I am the same as you just keep books I am reading on my KT. If I want a lib keep it on PC.

Unknown, September 1, 2012 at 2:40 AM
Thanks for the news. Didn't know they released a new version. Any improvements?

Michael, September 1, 2012 at 11:51 PM
It's a Japan release that people are trying. Forum reports are mixed but most have no problems. Clock is back and speed in epubs which I don't use now are improved. Still like to see, as you do, if adding to KoboReader.sqlite could be automated. Might try another forum post to see it it generates more interest?

Joel Goguen, January 12, 2013 at 11:39 AM
I'm working on an extension to the Calibre Kobo Touch driver to auto-magically handle most of this when sending an ePub file to a Kobo Touch/Glo/Mini. I only have a Glo to test with though, but it looks like everything should be the same from the database side. I'm using lxml directly, I found that using BeautifulSoup caused more issues than it was worth (mostly around poorly-converted ePubs or some ePubs from the Kobo store that have intermingled file encodings) and lxml handled them much better.

Here's what the driver currently does if the uploaded file ends in '.epub':
1. For all h[1-6] and p tags in every HTML file, wrap the contents of the tag in a span tag with id attribute "kobo.X.1", where X is an incrementing counter starting at 1, reset with each file. The driver checks for the existence of these span tags and skips this processing on a per-tag basis. There's no validation that existing tags aren't duplicated elsewhere so it would be a bad idea to pass a half-processed file through this. It will still work on the device, but annotations and last read position might get wonky.
2. Ensure the file name on the device ends with the required '.kepub.epub', so adding this to preferences somewhere is not required.
3. Generate the database entries for each ePub file. There's an entry for the book itself, entries for anything in toc.ncx, and an entry for each individual HTML file defined in content.opf. The database entries I add include series information and the proper Image ID, but this doesn't stop the Kobo device from processing the new books after unplugging the device, which strips out series and image ID data. Still working on how to prevent the device from automatically processing these files.
4. Every time metadata is synced to the device, re-set the Image ID field for every book entry with a NULL ImageId field.

The images are already uploaded and properly sized by Calibre so there isn't any need to deal with converting/sizing images. They (and series information, if you're setting it) won't display the first time you eject the device after copying the books but plug it back in after the Kobo does its processing, let calibre update the metadata, and it'll be ready to go.

I'm not sure it's ready for publishing yet (I'm still testing some things around ePubs, haven't tried non-ePub files yet) but the source is on Github at https://github.com/jgoguen/calibre-kobo-driver for anyone that wants to try it out. The usual disclaimers (at your own risk, your mileage may vary, back up your data before you start, not my fault if anything breaks or explodes or ignites or if Zerg start crawling out of your computer) all apply.

Alarin, January 18, 2013 at 12:30 PM
Regarding Joel's plugin, and the problems encountered by peppe and jarruda:

I got the same problems, and I found a solution: in calibre, you have to set the metadata management to "automatic", then copy the books to your kobo, unplug, plug back in, and calibre will send a new stream of metadata; unplug again, and you'll have your covers ;)
To set the metadata management to automatic go to Preferences | Import/Export | Sending books to devices and change it.

Hope this will help ;)

Tested with kobo glow 2.1.5