Ebooks and overlays, backfired

So I celebrated Ebooks and overlays earlier. I didn’t want to mention my underlying fear.

It’s happened.

I have 160 records from vendor B that just overlaid records from vendor A. They have the same OCLC record numbers.

Nuts. I just knew this was coming. OCLC has multiple ebook vendors on the same records, which leads directly to this situation. And it’s not going to improve in the future, I expect.
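A quick way to see how many records are exposed to this kind of collision is to compare the OCLC numbers in two vendor files before loading either one. The following is only a rough sketch: it assumes the 001 values have already been pulled out of each file into plain lists, and the sample numbers are made up.

```python
def shared_oclc_numbers(batch_a, batch_b):
    """Return the OCLC numbers that appear in both vendor batches, sorted."""
    return sorted(set(batch_a) & set(batch_b))


# Hypothetical 001 values from two vendor files (not real record numbers).
vendor_a = ["ocn100000001", "ocn100000002", "ocn100000003"]
vendor_b = ["ocn100000003", "ocn100000004"]

collisions = shared_oclc_numbers(vendor_a, vendor_b)
print(collisions)
```

Anything in that list would overlay if both files were loaded with the 001 as the match point.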

Now, I could try to keep both on the same record, but all too often one vendor removes records while the other does not. (At present, vendor B is guilty of this more often, but that could change.) Trying to deal with that could end up being more complex and prone to error.

Now, if I duplicate the records and edit them to be exclusive of each other, at some point in the future, one vendor will want to overlay to delete, which means the catalog cannot decide which to overlay, resulting in a new record being created. I can usually pick these out by looking for new records created during the overlay of deletion records.

But first, I have 160 records to separate. <sigh!>


Ebooks and Overlays

So finally vendor p..q…. has moved to using OCLC record numbers in the 001 fields of their records, instead of their own numbering system.

That means that I can now download records and — for example — overlay records with a notation to delete them. Huzzah! Couldn’t do that without some complex and expensive shenanigans with importing records before, when the numbering system was non-standard.

So, I take the academic collection as a basis and download that with NO overlays – new records must be inserted for all of them.

Then I take the public library collection (and later one for education, when it becomes available) and overlay it, so I don’t have a bunch of duplicate records. Then I remove duplicate fields, such as the 856 links, and get over 65,000 records (so far).

One little glitch. Not all the records used by this vendor are unique — a few (more than a dozen) are the same numbers as those used by a couple of other vendors, so those records are overlaid. Now I have multiple vendor links in the same records.

So, I have to find those records and make copies, and tweak until I have unique copies of the records with separate vendor links. That way, I can remove records from vendor A without losing records from vendor B.
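The copy-and-tweak step might look something like the sketch below. It is not how any particular catalog system does it. The record is modeled as a plain dict with a "001" string and a list of "856" URLs, the vendor is guessed from the URL's host name, and suffixing the 001 is just one hypothetical way to keep the copies from matching each other. All of those are assumptions for illustration.

```python
import copy
from urllib.parse import urlparse


def vendor_of(url):
    """Assumption: the host name of the link is enough to identify the vendor."""
    return urlparse(url).hostname


def split_by_vendor(record):
    """Return one copy of `record` per vendor, each keeping only that vendor's links."""
    links_by_vendor = {}
    for url in record["856"]:
        links_by_vendor.setdefault(vendor_of(url), []).append(url)

    copies = []
    for vendor, urls in links_by_vendor.items():
        clone = copy.deepcopy(record)
        clone["856"] = urls
        # Hypothetical convention: add the vendor to the 001 so the copies stay distinct.
        clone["001"] = f'{record["001"]} {vendor}'
        copies.append(clone)
    return copies


# Made-up record with links from two vendors.
combined = {
    "001": "ocn100000003",
    "245": "An example title",
    "856": ["https://vendor-a.example/book/1", "https://vendor-b.example/book/1"],
}
separated = split_by_vendor(combined)
```

The original record is left untouched, so it can be checked against the copies before it is deleted.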

It would have been nice if p..q…. had taken the original record, copied it, and changed it with their own link, and saved it under another OCLC record number in every case. It would have been. They seem to have done that for the most part, but at this time, there are a few little exceptions — and I have no idea why.

Close, people, really close. 4 out of 5 stars.


Ebooks and the stats

“Given the proliferation of e-Books and the discussions about the future of print books, the results demonstrate that only 4% of Americans are ‘e-Book only’ readers. Only 28% of Americans have ever read an e-Book and e-Book readers also read print books. Additionally, as the Survey results indicate, people prefer the two formats in different circumstances. People traveling prefer e-Books because they are portable and baggage restrictions don’t apply to them. Print books are ideal for reading to children and for sharing with friends and family. So, the two formats tend to be more complementary than competitive.

The contest between offering free e-Books through libraries and publishers’ fear of losing sales has been brought to rest by another surprising revelation. Active library users also tend to be the most active book buyers, print or e-Book. So, free e-Books in libraries do not actually drive down sale figures.”

from http://www.webjunction.org/news/webjunction/pew-research-center-7-surprising-facts-about-libraries.html

Just this weekend I purchased some e-books in a series which I like. I got epub format so they would go on my phone in the Bluefire app, and give me something to read during odd moments, usually while waiting for something. At the same time, I am reading print format books from the public library, part of the time while digitizing my phonograph records for music in my vehicle (since operating a phonograph-based stereo system in a moving vehicle is, if anything, even more unsafe than texting). All of that is consistent with the Pew results above (excepting the digitizing stuff).

And there is the argument that some have been making for a while now that e-books will take over and print will/has become obsolete. Thomas A. Edison did not predict that, but he predicted that the phonograph record would replace books (you know, those little wax cylinders — oh, wait, that changed to flat discs), and then motion pictures, and poor old Tom still didn’t have it right when we moved to DVDs of movies, and then streaming videos. The fact is, we have multiple formats because we have multiple uses, and multiple contents, and changing technologies, and personal preferences. Even the inventors cannot predict how long a new format will last. That’s why I’m digitizing my old phonograph records.

While the big publishers are reporting that e-book sales have been falling in 2015, or at least flattening (which is disputed elsewhere as only referring to those publishers’ sales), it seems that multiple formats will remain in the future. Which formats are available may shift, but enough users/buyers want that flexibility that versatility continues to be the preference. Print, however, continues to endure and even thrive.

I already have several series that I have read various volumes in print and in e-book format. It’s a little harder to keep track of, but it allows me to use the format that is convenient/cost-effective at the moment.

And so it goes…

Ebooks and Overlays

When you buy ebooks, you have the records in the catalog and that’s pretty much it.  It’s a steady state deal, and if you weed, you do it by hiding or deleting the records.

Subscriptions such as “such and such Collection” are another thing entirely.  I didn’t realize until we’d been doing it a while how much change would occur as records were added and deleted.

I needed to overlay, but I didn’t realize that until too late.  The catches (multiple!) included:

  • Some ebooks came from more than one vendor, but the OCLC record might be the same.  If you overlay the same record with multiple links, how do you deal with having to remove several thousand from one vendor as that subscription changes but the other vendor still (at least for a while) continues to include them?
  • How do you deal with having purchased an ebook specifically, but it also shows up in the subscription, so overlaying the record means you may have to remove the subscription link but not the purchased one?
  • What do you cue on for overlaying records?  ISBNs turned out to be a bad idea, but not all ebooks come with an 001 field for OCLC number — some put that in the 035 field, or need to use the 001 field for a custom number (which then won’t load into our system since it is not recognized as a proper OCLC number).
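On that last question, one possible approach is to pull a normalized OCLC number out of whichever field has it and match on that. The sketch below is only an illustration. It assumes a record is a plain dict with a "001" string and a list of "035" subfield-a strings, and it assumes the common ocm/ocn/on prefixes and the (OCoLC) label. A vendor's custom number that happens to be all digits would be misread as an OCLC number, so this is not a safe test on its own.

```python
import re

# Optional "(OCoLC)" label, optional ocm/ocn/on prefix, leading zeros dropped.
OCLC_PATTERN = re.compile(r"^(?:\(OCoLC\))?\s*(?:ocm|ocn|on)?0*(\d+)$", re.IGNORECASE)


def oclc_number(value):
    """Return the bare OCLC number from a 001 or 035 value, or None if it does not look like one."""
    match = OCLC_PATTERN.match(value.strip())
    return match.group(1) if match else None


def match_key(record):
    """Prefer the 001; fall back to the first 035 that holds an OCLC number."""
    candidates = [record.get("001", "")] + record.get("035", [])
    for value in candidates:
        number = oclc_number(value)
        if number:
            return number
    return None


# Made-up record: vendor number in the 001, OCLC number tucked into an 035.
example = {"001": "ebr10012345", "035": ["(Vendor)999", "(OCoLC)000456"]}
key = match_key(example)
```

A record where `match_key` comes back as `None` has no usable OCLC number and would need some other match point.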

Combining that with several vendors, and changing major ebook subscriptions from ebrary to ebscohost and THEN getting ebrary from a statewide subscription on top of that…

So, our ebscohost ebooks were a mess.  Lots of duplicate records, and that meant that more and more of those couldn’t be overlaid (which one?) so yet another was added instead, making it worse.

During spring break, I’ve been delete-proofing the ebscohost ebooks specifically purchased (put PURCHASED after the OCLC number), and then deleting the rest — something like 180,000 or so. A number of these were actually duplicates (don’t ask how many — I have no way to tell except when I run across them).

Then I began reloading the ebscohost from scratch.  First batch: 50,000 records.

Wish that had been simpler.  Taught me the complications, however.  Many overlaid our older (owned) netlibrary ones, for example.  Yes, ebscohost owns netlibrary now, but many of the old links still work, and I need to keep those purchased ones separate.  So, fixed that and removed the ebscohost links.  Have to remove all the ebscohost other than that as well, just to be sure.

Found the credo and cambridge and so on that were overlaid and fixed those.  And changed the 001 numbers to add the vendor name after the number to ALL the non-ebscohost ebooks, to try to prevent overlayment.  Vendors other than ebscohost and ebrary, these days, come in small enough amounts that I can deal with them without overlaying records.
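The delete-proofing trick amounts to changing the 001 so an incoming record with the bare number no longer matches. A minimal sketch, assuming the label is simply appended after a space (the exact format a given system accepts is an assumption here, and the numbers are made up):

```python
def protect_001(field_001, label):
    """Append `label` to a 001 value so a bare incoming number will not match it."""
    value = field_001.strip()
    if value.endswith(label):
        return value  # already protected; do not append twice
    return f"{value} {label}"


# Hypothetical 001 values paired with the vendor or status label to add.
to_protect = [("ocn100000001", "PURCHASED"), ("ocn100000002", "credo")]
protected = [protect_001(number, label) for number, label in to_protect]
print(protected)
```

Running it a second time over the same records leaves them unchanged, which matters when the same batch gets processed more than once.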

Having one OCLC record (ideally) for each ebook vendor, and a separate record for the print editions (even if the ISBNs appear on both), allows the ebooks to be handled separately.  I tried, in early days, combining on the same record.  Big mistake — couldn’t keep up with all the changes.  So I’ve separated them again.  Lots of work both times.

And I mentioned that I have ebrary records again, from a statewide subscription, which fortunately use a different 001 numbering and an ebr prefix.  Unfortunately, it won’t download with my load table, so overlayment is not going to work with them, at present.  Another librarian (thanks!) suggested I put that number in the 903 field, and have Innovative add the 903 field to the o index (like the 001 field) so I can overlay additions/deletions to ebrary using the 903.  I added the ebrary ahead of that change, so we’d have some ebooks during the process of working out the ebscohost situation.

So, once I find all (one hopes) the possible wrong overlays, I load the ebscohost from scratch all over yet again.

New idea — since I don’t really need Cutter (092 subfield b) to keep ebooks in order on the shelf, I can change that to the vendor name for ebooks.  That makes it easy to distinguish quickly when I find it in Sierra, such as using CTRL-g to check the index for other copies of a title.

Ended up with a lot fewer duplicates. Still, we now have over 235,000 ebooks and pdf files.

Now I have to add the 690 PROGRAM fields to these all over again. Job security!


Amazon Kindle for Windows 8


The Kindle app software on Windows 8 comes out a bit of a mystery.  Where the heck are the controls?  Nice they don’t intrude, but I need them.

Google search, and I discover that I have to right-click to get them.  That’s apparently press-and-hold in Windows 8, or I can unfold and use the keyboard’s touch pad, or just use a mouse.  Maybe an explanation on that first time I used the app would have been nice.

Upper right corner, and a title bar appears so I can close or minimize, thanks to a Microsoft update on apps.

Display mode

Pages show up in two columns in landscape mode, but make a very nice full page in portrait mode.  I’m very happy with that so far.

I prefer to use the white on black setting to cut down on glare.  Also, the less strong light directed at my eyes late at night may help me to get to sleep more easily, or so some sources claim.

Library display

The library display is, by default, all the covers.  Nice, but there are advantages to a simple list with thumbnail cover displays.  You don’t get that option (or any other) on the Windows 8 Kindle from all I can discover.  Hasn’t Amazon even seen Adobe Digital Editions?  It’s not perfect, either, but it’s still ahead of Kindle on this.

Suggestions for Amazon

Amazon is, IMHO, not doing themselves any favors by not making it easier to manipulate their collections.

I need to be able to change to an alternate list format, see what I have already read (VERY important once you get more than a dozen or so books), sort by author or title in each list, group books on both my online and tablet… maybe this is possible on a Kindle device, but limiting it on the Kindle software is making it increasingly awkward to handle my growing collection.  I’m starting to wonder if I should keep my collection here growing, Amazon.  Think about it:

  • Change to list only view, perhaps with smaller thumbnails as one alternative view
  • Be able to sort by authors and series in series order (if I bought the set, which do I read next?)
  • Be able to separate out the ones I’ve read and have those marked (hey, Kindle is supposed to sync all this stuff, so make this possible)
  • Move a few of my collection onto a device (which has limited storage), read them, have it noted that I read them, and then let me move them out to the cloud again.  Right now, the only way to get them out of the way in the cloud is to delete them entirely from my cloud.  For those with limited space on a device (tablet or phone, for example), you may only need a few titles downloaded on the device.  Oh, and they need to be marked as read in the cloud so I don’t bring them back down unless I want to read again.

Kindle is not really functional for heavy readers — the very people it wants to attract.  Kindle has been around long enough that it should have been working this out well before now.



Comparing ebook collections

We more than doubled the size of our collection by subscribing to a (if you’ll pardon the term) “bulk” collection from an ebook vendor a couple of years ago.  It’s a cost-effective way to get access for a lot of material which will be available even when the building is closed or some distance from the user.  Over 70,000 records to start; by the time of comparison, up to 88,907.  Let’s call that Collection A.

These “bulk” purchases come in various packages of differing sizes, and can either specialize in subject areas, or be — as we selected — an “Academic” collection, which may or may not hold some or all of the specialized collections.

We’re not above looking at alternatives, however, so we’ve done some comparisons with an offering from another vendor, where the Academic collection is presently shown as 119,757 (based on a spreadsheet download of titles at the time of comparison).  Let’s call that Collection (wait for it…) “B.”

So, how do we evaluate such a massive amount of records? And, how do we do it as fairly as possible?  Given the sizes, numbers are going to have to suffice for most purposes, which is not my preferred standard, but you work with what you can manage.


I started with date ranges.  Bring them into spreadsheets and sort by publication date.  This isn’t as accurate for comparisons as it might be, for several reasons.  For one thing, I have updates on Collection A received over the months since we got it.  Is it fair to compare it exactly for the current year (part way through 2013) when the B spreadsheet may not include any titles updated since the spreadsheet was created for marketing?  I’d prefer to see 2013 but not weight it as much as I might.

Also — and this is much harder, probably impossible, to compensate for — a number of ebooks tend to be shown with the ebook publication date, even though the original print date is years earlier, such as a classic work that has just become an ebook.  This tends to flex the ebook records to appear more recent than the contents actually are.  Add to that the fact that some publishers don’t come out with the ebook until after the print, and the dates are skewed further.  I can’t really do much about this, given the large numbers I’m dealing with.

Dates are also tricky in that many records have odd punctuation, “c” for copyright, brackets, and quite a few multiple dates, but this shows up only in Collection A.  Does that mean Collection B has more accurate dates, or just more of them updated to the ebook edition regardless of the original print edition?  I suspect the latter, given a few of the titles which I can tell are older but show 2013.
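For the messy dates, something like the following could pull a usable year out of each cell before sorting. It is a sketch, not what I actually ran in the spreadsheet. Taking the earliest year when a cell holds several dates is a judgment call, made on the assumption that the earlier date is closer to the original publication.

```python
import re

# Any four-digit year from 1500 through 2099.
YEAR = re.compile(r"(1[5-9]\d\d|20\d\d)")


def publication_year(raw):
    """Return the earliest four-digit year found in a raw date string, or None."""
    years = [int(y) for y in YEAR.findall(raw)]
    return min(years) if years else None


# Made-up examples of the kinds of date strings described above.
samples = ["c2010.", "[2011]", "2013, c1998", "n.d."]
cleaned = [publication_year(s) for s in samples]
print(cleaned)
```

Cells that come back as `None` have no recognizable year and would need to be checked by hand or left out of the counts.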

I do go back several years, to 2008, and compare the numbers.  Since I’ve got two different base sizes, however, I then figure by percentages, to keep it a bit more fair.

Collection A consists of 4.8% titles from 2012.  Collection B consists of 3.4%.

Collection A consists of 7.15% titles from 2011, Collection B has 7.12%.

Collection A consists of 10.6% titles from 2010, Collection B has 9.44%.

Collection A consists of 8.95% titles from 2009, Collection B has 9.04%.

Collection A consists of 8.29% titles from 2008, Collection B has 7.63%.

Okay, let’s jump to a range:

Collection A consists of 88.33% titles from 2000 or later, Collection B has 77%.
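The percentages above are just counts per year divided by the size of each collection. A small sketch of that arithmetic, assuming the publication years have already been cleaned up into a list of integers (the sample list is made up and far smaller than either collection):

```python
from collections import Counter


def share_by_year(years):
    """Return {year: percent of the collection}, rounded to two decimals."""
    if not years:
        return {}
    counts = Counter(years)
    total = len(years)
    return {year: round(100 * n / total, 2) for year, n in counts.items()}


def share_since(years, cutoff):
    """Return the percent of titles published in `cutoff` or later."""
    if not years:
        return 0.0
    recent = sum(1 for y in years if y >= cutoff)
    return round(100 * recent / len(years), 2)


# Made-up sample: ten titles.
sample_years = [2012] * 2 + [2011] * 3 + [1999] * 5
by_year = share_by_year(sample_years)
since_2000 = share_since(sample_years, 2000)
```

Working in percentages rather than raw counts is what keeps the comparison reasonably fair between collections of different sizes.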

So, Collection B has more titles, but year by year the two are close as percentages of each collection, until we get to 2000 or later, where B falls behind.  How significant is that, given that B’s percentage is from a much larger collection?  In raw counts, 77% of 119,757 is roughly 92,000 titles, against roughly 78,500 for A.  Maybe that 11-point difference is a smaller factor than it may appear, but it should be kept in mind, at least until some more comparisons can be made.  Going by the other percentages, however, the collections aren’t far apart in recent years.

Now, a really in depth comparison could take the collections and do this again by sorting first by call number, and then by date, to see how the collections stack up by subject area.  However, this is (a) more time-consuming, and (b) not necessarily as useful as might be expected.

Publishers are weird about ebooks.  You can have key publishers in certain subjects refuse to be part of third-party collections because (a) they don’t do ebooks, or (b) they do ebooks on their own, or (c) they have ebooks through this vendor but only individually, not in the collections, or (d) it’s just the wrong POTM (Phase Of The Moon).  So collections which look skimpy in a particular subject might just be caught by some of these factors, and/or by when the publisher(s) signed the contract (as in, before or after the new owners came in and made changes not already limited by that contract).  So, I’m going to pass on trying to do anything such as that.

Choice Titles

Now, instead of just calculations, comes the Choice Outstanding Academic Titles lists.  My director thought this would be a good standard to apply, and it’s a very convenient one at that.

I used the three previous years: 2010, 2011, and 2012.  I sorted by title (to randomize the subjects covered a bit), and downloaded the files to spreadsheets (and I’m very glad they set up that feature on Choice).

I have to play with the titles a bit — remove the quotes, leading articles, like that.  The download is NOT alphabetized by title at this point, so next I do that.  Then I remove the unneeded columns, and add two columns on the far left for the A and B collections.

Now I go through the first 100 titles of each year (total 300 titles) and compare them to the holdings in the two collections.  This is time-consuming, and that might not seem like a large sample, but the results were so consistent that I quickly decided it was sufficient.

The holdings for each year were identical for both collections.  Same titles held, same number of titles in each 100.
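The title clean-up and the sample check could be sketched along these lines. This is an illustration rather than the spreadsheet work itself. It assumes English leading articles only, exact matching on the cleaned-up title, and made-up titles in the example.

```python
import re

LEADING_ARTICLES = ("the ", "a ", "an ")
# Straight and curly quotation marks and apostrophes, removed outright.
QUOTES = str.maketrans("", "", "\"'“”‘’")


def normalize_title(title):
    """Lowercase, strip quotes and punctuation, and drop one leading article."""
    text = title.lower().translate(QUOTES)
    text = re.sub(r"[^\w\s]", " ", text)      # other punctuation becomes a space
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    for article in LEADING_ARTICLES:
        if text.startswith(article):
            return text[len(article):]
    return text


def held_counts(sample, holdings_a, holdings_b):
    """Return how many sample titles are held in A and in B, as a pair."""
    in_a = {normalize_title(t) for t in holdings_a}
    in_b = {normalize_title(t) for t in holdings_b}
    keys = [normalize_title(t) for t in sample]
    return sum(k in in_a for k in keys), sum(k in in_b for k in keys)


# Made-up titles standing in for a Choice sample and two holdings lists.
choice_sample = ['"The Art of War"', "Darwin's Finches", "Unheld Title"]
counts = held_counts(
    choice_sample,
    ["Art of War", "Darwin’s finches"],
    ["The art of war."],
)
```

Exact matching will miss titles that differ in subtitle or edition statement, so a low count from a script like this would still deserve a manual spot check.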

Again, this is probably largely to do with publishers.  Some of them don’t want to have ebooks or want to keep them away from the large vendors; some might sell through these vendors but the licensing or pricing make the titles cost-inefficient to put into bulk collections such as these.  So, both vendors are working from pretty much the same pool of titles, I suspect (for academic titles, anyway).

There’s also the fact that both vendors almost certainly have access to Choice and expect librarians to use that as a standard, so they are likely to focus on getting what they can of those OAT titles for their Academic collections.

User Survey

Now, I did a little informal survey of students who were going through our promotional event in Fall 2013.  I had them look at a screen shot of the same title in both A and B versions, and asked which they preferred, if they had a preference.

Caveat: one of the reference librarians preferred A, and I leaned to B, on this one.  We’re both flexible, however, so left it up to the students.

After the results were tallied, B came out ahead by a considerable margin with students, although A still had some adherents.  There were details such as having controls that did, or did not, scroll off the page, and the use of non-obvious controls for tasks.  Ebooks are so new that controls are not all that standardized, but some actions need to be obvious, and stay on the screen if the entire page is not visible.

Factors, therefore, come out fairly close, although B might have more older materials which give it a larger count.

We use Adobe Digital Editions for Windows/Macs, and suggest Bluefire for use with Android and iOS since the apps are pretty much the same (and therefore easy to help people with).

Internal Factors

Okay, let’s consider internal factors such as upkeep and price.

Collection B is from the same vendor who handles our Discovery service.  A, however, is not, and therefore we have to do more upkeep on sending files to the Discovery service vendor.  There’s staff time on that, including changing settings on the deleted ebooks before sending those records, which varies from month to month.  I have to download records to the catalog in either case, but B relieves the chores of updating the Discovery service.

Price: B is cheaper, by enough to buy at least one other database.  (We have the A invoice and a quote for collection B.)  These days, that’s a humongous factor, although it can be outweighed or at least balanced with enough on the other side.


As of spring 2014, the final decision by the director, considering all these factors, is to switch to Collection B.

This is not without consequences.  Due to Collection A expiring at the end of March, we’re changing in mid-semester.  Students who thought they had an ebook for their paper may find it missing later, for example.

I’ll have to take all the old ebooks from Collection A out of the catalog, and update the Discovery service, as well as OCLC WorldShare, on our holdings.


Also, I do a modification of bibliographic records in our catalog for our Programs page, which allows faculty to find all the materials in their subjects, as well as allowing me to keep track of materials for accreditation purposes.  (I’ve discussed this in another post.)  Now I’ll have to start from scratch with a huge batch of new records, adding 690 fields.  But, I knew the job was dangerous when I took it, Fred (as Super Chicken used to say).

[Please note that this is not to be considered an endorsement for any purposes, and no reimbursement was received in any form, including discounts, for this post.  That’s why I didn’t name vendors.  YMMV — Your Mileage May Vary]

unCrowned ebooks

I had to remove over 1300 ebooks from access, as of August 22.


They were part of our “bulk” Academic collection from ebrary.  And they are pretty much all part of the Random House/Crown group of publishers and imprints.  Knopf, Random House, Doubleday, Bantam, etc.

I can’t help but wonder if the deal signed with Ingram’s MyiLibrary earlier this year led to this.  I don’t know, mind you, but I suspect.  Maybe it just seemed like a better deal and therefore no longer worth the existing contract with ebrary.

Some of these titles we also had in print already.  They can’t take those back.  But ebooks are always vulnerable to changes in the ebook winds: changes in policies, ownership, lawsuits, and so on.  I see now that Disney is blocking movies from people who bought them, during the season(s) they want to show them on TV — librarians are now wondering when that might spread to books.  Imagine having books blocked each time the TV show is scheduled to air.

We’ve seen lawsuits and other court actions lead to recalls of individual titles, but those were rare things.  Now, with one decision, entire chunks of materials can be pulled out and moved away for a perceived commercial advantage.  It’s one thing to say “from now on” all titles will be elsewhere, but another to do it retroactively with all the titles currently available on a platform.

Certainly, they have the right to do so.  That’s not disputed.  But like a lot of librarians, I’m thinking this may not be as wise a course of action as some publishers might think.

I realize that a lot of decisions are being made by people who look only at the bottom line for each year, period.  Bottom lines are important; they keep the publishers operating.  Loyalty has never been part of that equation.  Authors might be loyal to a publisher (or more often, to an editor, and jump ships with the editor to another publisher).  But not readers, in my experience.  Readers tend to be more loyal to an author, or a series, and usually have little or no idea who publishes them.  Oddly enough, some publishers still seem to think readers should be loyal to a specific publisher’s offerings, but loyalty to making materials accessible to readers through a consistent avenue is not necessarily a factor in making their decisions.

And folks, ebooks are about accessibility, first and foremost, not about who provides it.  Make it too hard or expensive to get something, and people either cheat or go find something else easier or cheaper to deal with (which is pretty much the way library ebooks have been treated by publishers).  Experiences with music, and with ebooks (the Baen free ebooks, for example) indicate that it’s not so much about free as it is about easy.

But nowadays, when an ebook vendor (ebrary, Ebsco, Ingram or whoever) comes to libraries and says, these publishers are part of what we’re offering you, does it occur to anyone that we’re going to respond “but probably not offering for long, given the way that particular publishing group deals with their ebook titles”?

Libraries can expect to catch all the flak for offering titles that could be taken away at any time, if we purchase them as ebooks.  So, for some publishers, libraries are going to quickly prefer to stick with print editions, because we can’t trust them to keep ebook editions available, and libraries (any more than readers) don’t have an infinite amount of money to keep switching platforms seeking specific publishers.  As of now, I obviously can’t count on any of the Random House group staying put; how many other publishers will keep their commitments?  And if I subscribed to their current outlet, how long would that last?

But then, the publishers making these kinds of decisions don’t really seem to like libraries much anyway.  They’ve always been obsessed with the same delusion that music publishers had (still have, many of them): that there is an infinite amount of money out there waiting to be eagerly spent on their products, and if they can just close off all the other avenues to obtain those products, people will have to spend that infinite amount of money buying the products.  Recession or not, other forms of competing entertainment or not, independent/self-publishing alternatives or not.

That’s why some publishers are deciding to take advantage of ebook editions to exercise more control over their materials, in the hope of making more money.  And I suspect that they’re thinking, let’s hope everyone moves to ebooks, so we can stop publishing print editions that leave our control.  Let’s hope everything moves to ebooks so we retain control perpetually, so no title can be kept without our continuing contractual permission, so no title can be sold used, so we make an infinite amount of money with one-user/one-book/one-sale-or-limited-license.  They figure they’re gonna get rich(er), Wheeee!

And the readers — and libraries — will simply turn elsewhere.  We’re not all that loyal, remember?  And authors (who are also readers, remember), even the ones not offended by these publisher actions, are likely to see their revenues and readers being limited by this hot-swapping control, and take their works to less volatile publishers.  Given the increasing trend to independent publishing, and self-publishing, having access to those big printing presses may not be much use in the digital world.

There are a lot more authors able to publish ebooks without having to justify themselves to publishers with high sales figures, and the cost is minimal compared to printing and distributing paper books.  I’m wondering how long before the ebook vendors start picking up these little niche titles to fill out their catalogs, when big publishing is yanking titles out.  Of course, that works more for fiction than nonfiction (having somebody reputable check the facts and do the editing still makes a difference), but mechanisms can evolve for nonfiction too, as the demand increases, I expect.