How I didn't make an ebook
Jul. 5th, 2009 04:41 pm
[Note: One should read this realizing that I know nothing about photography. For example, I understand the megapixel myth only because my day job involves semiconductors.]
For the first time in my life, I own a camera. Ok, it's the camera in my iPhone. However, it has auto-focus and a macro mode. Surely, I ought to be able to make ebooks by taking pictures of book pages and doing OCR then. The answer turns out to be yes in theory, but not yet in practice.
(Yes, I know. I can do this with an actual scanner. The whole point was to see if I could make an ebook with what I already have on hand and what I can download from the internet. I have a digital camera on hand. I don't have a scanner. Also, the standard process of scanning a book is to cut away the spine. The digital camera scenario leaves the book intact.)
The whole process ought to be simple. Take the picture. Load it into the OCR software. It figures everything out and generates a PDF, LIT file or whatever. The details are what turn it into several protracted days of wacky hijinks.
It took me more than a few tries before I managed to take a picture with the iPhone that matched what I saw through its viewfinder. My problem was that I'd tap the button to take the picture. In the process, I pushed the phone down, so the camera was closer to the page than I'd intended. Eventually, I realized that it takes a picture not when you tap the screen, but when you lift your finger off the button. (Ok, I ended up googling for this info.) This gets me, more or less, a properly framed picture. Of course, what I really need is a tripod for my phone. Because I can't hold the phone absolutely steady, it turns out to be next to impossible to get a picture where the text isn't slightly blurred. This makes it hard to get accurate OCR.
Another obstacle is that, because of the spine, book pages are curved. OmniPage implies they take care of this when they do OCR. They talk a lot about 3D-correction technology. Since they don't have a free trial though, I couldn't test that. Snapter also corrects for page curvature and has a free trial, but it doesn't do OCR. I used a free trial of ABBYY FineReader to do the OCR on the output of Snapter. FineReader will correct for page skew and line skew, but it doesn't appear to deal with curvature.
The good news is that the resolution of my iPhone camera is perhaps just good enough. ABBYY FineReader recognized the text that was in focus. (It did throw a lot of warnings about the resolution not being good enough though.) The bad news is that a lot had to go right to get to that point and things don't always go right. Despite following precisely all the instructions on how to compose a photo that Snapter can process, I didn't always end up with such a photo. Finding the page boundary is how it works out how much curve to correct for. It needs to be able to see the entire page over an empty background. It turns out that this isn't enough for it to find the page boundary.
Not having a tripod hurt. Also, I kept throwing shadow onto the page. That confused both Snapter and the OCR software. My best effort resulted in a page of mostly correct text, but not good enough that I didn't end up reading the entire page in the process. (In that case, it'd be quicker just to read the book instead.)
(If the resolution really isn't good enough, taking a bunch of pictures then stitching a panorama together might be an option. Again, to try this, I need a tripod to keep the distance to the page constant. Otherwise, the panorama may not be very useful.)
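To put rough numbers on the resolution question, here is a back-of-the-envelope sketch. Everything specific in it is my assumption rather than something I measured: a 2048 x 1536 pixel photo, a page about 9 inches tall that fills the long edge of the frame, a 300 dpi target of the sort OCR software tends to ask for, and a 20% overlap between panorama shots. The function names are made up for illustration.

```python
import math


def effective_dpi(pixels_along_edge: int, page_edge_inches: float) -> float:
    """Pixels per inch on the page when the page edge fills the frame edge."""
    return pixels_along_edge / page_edge_inches


def shots_needed(pixels_along_edge: int, page_edge_inches: float,
                 target_dpi: float, overlap: float = 0.2) -> int:
    """Shots needed along one page edge to reach target_dpi.

    Each shot covers pixels_along_edge / target_dpi inches. Every shot
    after the first shares `overlap` (a fraction of its width) with the
    previous one so a stitcher has something to align on.
    """
    coverage = pixels_along_edge / target_dpi       # inches per shot
    if coverage >= page_edge_inches:
        return 1
    step = coverage * (1.0 - overlap)               # new inches per extra shot
    remaining = page_edge_inches - coverage
    return 1 + math.ceil(remaining / step)


if __name__ == "__main__":
    # Assumed values: 2048 px long edge, 9 inch page, 300 dpi target.
    print(round(effective_dpi(2048, 9.0)))    # dpi from a single shot
    print(shots_needed(2048, 9.0, 300))       # shots along the long edge
```

Under those assumptions a single photo lands somewhat short of 300 dpi, which would be consistent with text getting recognized while the software still complains about resolution. It also suggests a panorama wouldn't need many shots per page, provided the distance to the page stays constant.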
What I got out of this is that I might be able to make ebooks with my iPhone but I have to resolve a bunch of ifs. First, I need a tripod or some sort of harness to keep the phone in one place. Second, I need something to keep the book in one place and the pages still. (Of course, all this needs to be adjustable in order to make the pages fill as much of the photo as possible.) I may not want to deal with page curvature. That way, I don't have to worry about whether Snapter will accept the photo or not. Then I need a V-cradle for the book and a more complicated harness for the phone.
By the time I'm done, I'd have something like BookSnap by Atiz. Well, it might be cheaper, but it would also lack BookSnap's fit and finish.
Why do I want to make ebooks? Right now, it's mostly so that I can read various science fiction magazines more easily. The dead tree subscription is actually cheaper than the electronic subscription. In any case, my current subscriptions aren't up for a long while, due to a slight paperwork mix-up on my part. (I renewed once too often by accident.) No, I don't have any real qualms about cutting away their spines. However, I'm not buying a scanner just for this. Also, if the digital camera process were quick and easy enough, I could see myself making ebooks of my backlog of Books I Haven't Gotten To Yet. (i.e., like iTunes, but for books.)
Now, is it actually worth the trouble? For me, probably not. Maybe I can sort out the camera stabilization and lighting issues. There's still the cost of all that software for which I have no other use. (OCR software is decidedly expensive but, if I'm going to make ebooks, I want searchable text.) I could save up for it, and turning those books into ebooks might be worth the money. The main obstacle is that my jury-rigged process is anything but turnkey. I suspect that even if everything went swimmingly, I could read the science fiction magazine in the time I'd spend turning it into an ebook. Some of this may be my unfamiliarity with the software. (I didn't look into how batch operations work.)
BTW, I'm not talking about something like BookSnap. Even though a human being has to turn the pages, it apparently works very quickly and generates photos that OCR easily. If BookSnap were cheaper and didn't require that I own two Canon Powershots that I otherwise have no use for, I'd be seriously tempted. I really like the idea of iTunes for books.
Anyway, it was an interesting experiment. At some point, I may build a tripod to see if that improves things any. I think it ought to. I don't know if that would improve things enough though.
(What surprises me is that I couldn't find on the web the account of anyone taking pictures of book pages and feeding them to OmniPage. I mean, Nuance actually touts OmniPage's ability to deal with photos taken by iPhone. They tout its 3D-correction technology. They don't actually say that it will do highly accurate OCR of iPhone photos of curved book pages, but it's not an unreasonable inference from the sum of their claims. Certainly, not knowing makes it hard to drop $150 or $500 on their software. They're also a bit coy about what the differences between the $150 and $500 versions of their software are. If I were to buy OmniPage, I'd have no idea if the $150 version would be good enough or not.)