Monthly Archive: December 2014

Organizing Your Comic Collection – Part 2

Last week, I presented a simple schema for organizing a comic collection.  At the heart of the schema is the idea of a many-to-many relationship between publications and stories.  That is to say that a given story can appear in many publications and a given publication may contain many stories.  This many-to-many relationship is mapped into three tables that are ‘linked’ together in a relational database.  The first table contains all the information about each publication.  The second contains all the information about each story.  The third table contains all of the relationships between the two.

In this post, I am going to present my particular implementation of this schema, and talk about some of its strengths and weaknesses.  While the tables I’ll be presenting will have a lot more information in them than the basic ones presented in the last post, the basic approach remains the same.  The additional information is provided to make the database much more useful and rich in the kinds of queries that can be made.

The database (technically a relational database management system or DBMS) that I use is Microsoft Access, which usually comes with the professional MS Office Suite.  There are free alternatives to Access, including MySQL and LibreOffice Base, which work just as well if not better; in my experience, moving the tables between tools is almost painless.

Let’s start with the Publication Table. In my implementation, each publication has 10 attributes corresponding to the 10 columns in the table, which, in the language of relational databases, are called fields.  Each field has a name and a data type (integer, string, etc.) that specifies what kind of data it is. A record is a row in the table where a field takes on a particular value.  The names, data types, and explanation of the fields in the Publication Table are:

Field Name | Type | Meaning
--- | --- | ---
Publication | Integer | A unique publication id label.  No two Publication values can be the same, and the specific value is assigned by the DBMS unless overridden.
Publisher | String | Identifies the publishing house (Dark Horse, DC, Marvel, etc.)
Title | String | The title of the publication, usually as it appears in the indicia.
Number | Integer | The issue number as it appears in the indicia, or the value 1 if none is given.
Vol | Integer | The publication volume as it appears in the indicia, or the value 1 if none is given.
Year | Integer | The publication year as it appears in the indicia.
Type | String | Lists the type of publication.  Usual values are ‘Comic’, ‘Trade’, ‘Trade Paperback’, ‘Trade/B&W’, or ‘Glossy’, but they can be any string desired.
Condition | String | Lists the condition.  I use only ‘Good’, ‘Fair’, and ‘Coverless’, but any string including CCG labels can be used.
Box | Integer | Identifies in which box the publication can be found.  A value of 0 means on a book shelf.
copies | Integer | Number of copies.  Occasionally I bought more than one by accident and so I needed this field.


The following figure shows approximately the first 30 rows of the Publication Table.


Next is the Story Table.  A story has 13 fields, and the names, data types, and explanations of these fields are:

Field Name | Type | Meaning
--- | --- | ---
Story | Integer | A unique story id label.  No two Story values can be the same, and the specific value is assigned by the DBMS unless overridden.
Title | String | The title of the story as it appears in the original story.
Year | Integer | The year that the story was originally published.
Genre | String | Lists the key characters, as I see them, in a story.  For example, a story with both Batman and Spider-man may have the string Batman & Spider-man as an entry.
Author | String | Lists the names given author credit, separated by a ‘,’ or an ‘&’ for multiple entries.  No distinction is made between plot and dialog.
Illustrator | String | Lists the names given artist credit, separated by a ‘,’ or an ‘&’ for multiple entries.
Inker | String | Lists the names given inker or finisher credits, separated by a ‘,’ or an ‘&’ for multiple entries.
Original Comic Name | String | Lists the publication title in which the original story was found.
Original Comic Number | Integer | Lists the publication number in which the original story was found.
Original Comic Vol | Integer | Lists the publication volume in which the original story was found.
Read | Boolean/Checkbox | Yes or No value to track whether the comic has been read.
Cataloged | Boolean/Checkbox | Yes or No value to track whether the comic has been completely cataloged.
Variant | String | Special comment field to track changes in the story when reprinted.


The following figure shows approximately the first 30 rows of the Story Table.


The final table is the Relationship Table which tracks the many-to-many relationships that exist between the Publication and Story tables.  It has 3 fields, and the names, data types, and explanations of these fields are:

Field Name | Type | Meaning
--- | --- | ---
Entry | Integer | A unique relationship id label.  No two relationship values can be the same, and the specific value is assigned by the DBMS unless overridden.
Publication | Integer | The publication id corresponding to the desired publication in the Publication table.
Story | Integer | The story id corresponding to the desired story in the Story table.
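
For readers who would rather start from code than from the Access table designer, the three tables above could be sketched roughly as follows.  This is only a minimal sketch under some assumptions of mine: it uses Python's built-in sqlite3 module (so SQLite rather than Access), it borrows the table names publications, stories, and pub_story_relation from the query shown later in this post, and the function name create_collection_db is made up for the illustration.  The field names follow the tables above; the Yes/No fields are stored as 0 or 1 because SQLite has no checkbox type.

```python
import sqlite3

# Field names follow the three tables described above.  The table names come
# from the post's query; the SQLite column types are an assumed mapping.
SCHEMA = """
CREATE TABLE publications (
    "Publication" INTEGER PRIMARY KEY,  -- unique id, assigned by SQLite if omitted
    "Publisher"   TEXT,
    "Title"       TEXT,
    "Number"      INTEGER,
    "Vol"         INTEGER,
    "Year"        INTEGER,
    "Type"        TEXT,
    "Condition"   TEXT,
    "Box"         INTEGER,              -- 0 means "on a book shelf"
    "copies"      INTEGER
);
CREATE TABLE stories (
    "Story"                 INTEGER PRIMARY KEY,
    "Title"                 TEXT,
    "Year"                  INTEGER,
    "Genre"                 TEXT,       -- key characters, e.g. 'Doctor Strange'
    "Author"                TEXT,
    "Illustrator"           TEXT,
    "Inker"                 TEXT,
    "Original Comic Name"   TEXT,
    "Original Comic Number" INTEGER,
    "Original Comic Vol"    INTEGER,
    "Read"                  INTEGER,    -- Yes/No stored as 1/0
    "Cataloged"             INTEGER,    -- Yes/No stored as 1/0
    "Variant"               TEXT
);
CREATE TABLE pub_story_relation (
    "Entry"       INTEGER PRIMARY KEY,
    "Publication" INTEGER NOT NULL REFERENCES publications("Publication"),
    "Story"       INTEGER NOT NULL REFERENCES stories("Story")
);
"""


def create_collection_db(path=":memory:"):
    """Create the three empty tables and return the open connection."""
    conn = sqlite3.connect(path)
    # Ask SQLite to enforce the REFERENCES clauses (off by default).
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

Passing a file name instead of the default ":memory:" would keep the tables on disk between sessions.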



Data entry into the tables can be done in a variety of ways, but the easiest is to use a spreadsheet to get a portion of each table (say 100 publications or stories) just right, and then do a bulk cut & paste to give the records in the Access table the same content.  There are other ways to do bulk data entry, but there are good web tutorials that discuss them, so I won’t say any more here.  The one table where the data entry had to be done entirely by hand is the Relationship Table, but Access has some nice features that allow the user, when editing any of the three tables, to see what the relationships are to the others.  These data can be accessed by clicking on the “+” on the far left of each record.

Once the tables are filled, you are ready to mine the tables for all sorts of nuggets.  Mining data from the tables is done through a query where the user asks for all the data meeting a specified set of criteria.  The query is done using what is called an inner join on the tables to get the desired effects, and is perhaps the hardest piece to get right, as it involves learning some of the database language known as SQL.

Putting in this much structure requires some time and dedication, but when it is all done you can make a list of all the comics in your collection featuring, say, Doctor Strange.  The figure below shows a partial list from such a query, and you can see the multiple occurrences of the “Beyond the Purple Veil!” story.


The SQL that made this happen is:

SELECT [stories].[Cataloged], [stories].[Title], [stories].[Year], [stories].[Genre],
       [stories].[Author], [stories].[Illustrator], [stories].[Inker],
       [stories].[Original Comic Name], [stories].[Original Comic Number], [stories].[Original Comic Vol],
       [publications].[Title], [publications].[Number], [publications].[Vol], [publications].[Publication],
       [stories].[Story], [stories].[Variant], [publications].[Box]
FROM stories INNER JOIN (publications INNER JOIN pub_story_relation ON [publications].[Publication]=[pub_story_relation].[Publication]) ON [stories].[Story]=[pub_story_relation].[Story]
WHERE ((([stories].[Genre]) Like "*Doctor Strange*") And (([publications].[Publisher])="Marvel"))
ORDER BY [stories].[Year], [stories].[Original Comic Name], [stories].[Original Comic Vol], [stories].[Original Comic Number];

Despite its length, it is actually relatively easy to understand.  The first piece is the ‘SELECT’ clause, which tells the query to extract data from the fields listed.  Each field is specified by its table name and then its field name using [<table name>].[<field name>].  For this query, 17 different fields are extracted and placed in the order specified, from left to right (if you count the columns in the above figure, you will find 17 of them).  The second piece is the ‘FROM’ clause, which acts like a switchboard: it says that the data in a given row should come from a story and a publication that have a relationship with each other, as determined by the Relationship Table.  The third piece is the ‘WHERE’ clause, which tells the query to return only those records whose Genre field contains the string ‘Doctor Strange’ and whose publication was published by Marvel.  Finally, the fourth piece is the ‘ORDER BY’ clause, which sorts the records first by the year the story was published, then by the name, volume, and number of the original publication.
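
To make the four pieces concrete, here is a cut-down version of the same query that can be run without Access.  Treat it as a sketch under stated assumptions rather than my actual setup: it uses Python's built-in sqlite3 module, only a handful of the fields, made-up id and Box values, and a placeholder title for the Human Torch lead story.  It also assumes the Pocket Books paperback is recorded with ‘Pocket Books’ as its Publisher.  Two dialect differences are worth noting: SQLite's LIKE uses % where Access uses *, and the two joins are written one after the other instead of nested in parentheses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publications (Publication INTEGER PRIMARY KEY, Publisher TEXT,
                           Title TEXT, Number INTEGER, Vol INTEGER, Box INTEGER);
CREATE TABLE stories (Story INTEGER PRIMARY KEY, Title TEXT, Year INTEGER, Genre TEXT,
                      "Original Comic Name" TEXT, "Original Comic Number" INTEGER,
                      "Original Comic Vol" INTEGER);
CREATE TABLE pub_story_relation (Entry INTEGER PRIMARY KEY,
                                 Publication INTEGER, Story INTEGER);
""")

# Illustrative rows only: ids and Box values are hypothetical.
conn.executemany("INSERT INTO publications VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "Marvel", "Strange Tales", 119, 1, 1),
    (2, "Marvel", "Giant-Size Defenders", 2, 1, 3),
    (3, "Pocket Books", "Doctor Strange, Master of the Mystic Arts", 1, 1, 0),
])
conn.executemany("INSERT INTO stories VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, "Beyond the Purple Veil!", 1964, "Doctor Strange", "Strange Tales", 119, 1),
    (2, "The House of Shadows!", 1964, "Doctor Strange", "Strange Tales", 120, 1),
    (3, "(Human Torch lead story, title omitted)", 1964, "Human Torch", "Strange Tales", 119, 1),
])
conn.executemany("INSERT INTO pub_story_relation VALUES (?, ?, ?)", [
    (1, 1, 1), (2, 1, 3),   # Strange Tales #119 holds stories 1 and 3
    (3, 2, 1),              # Giant-Size Defenders #2 reprints story 1
    (4, 3, 1), (5, 3, 2),   # the paperback reprints stories 1 and 2
])

QUERY = """
SELECT s.Title, s.Year, p.Title, p.Number, p.Box
FROM stories AS s
INNER JOIN pub_story_relation AS r ON s.Story = r.Story
INNER JOIN publications AS p ON p.Publication = r.Publication
WHERE s.Genre LIKE ? AND p.Publisher = ?
ORDER BY s.Year, s."Original Comic Name", s."Original Comic Vol", s."Original Comic Number"
"""


def find_stories(conn, character, publisher):
    """Return one row per (story, publication) pair matching character and publisher."""
    return conn.execute(QUERY, (f"%{character}%", publisher)).fetchall()


rows = find_stories(conn, "Doctor Strange", "Marvel")
for row in rows:
    print(row)
```

With these sample rows, the story “Beyond the Purple Veil!” comes back once for each Marvel publication that contains it, which is the same repetition visible in the figure above.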

Most DBMSs don’t actually require you to write SQL at all; the above set of commands can be constructed visually in Access using the Design View.  It takes some practice, but an afternoon of experimentation and some web searches should do the trick.

Okay, what about stories that are only partially reprinted in one volume?  My strategy is to list the story in each of the publications.  An example of this is the Warlock Special Edition (1982) #1, which contains the stories “Who is Adam Warlock?”, “Death Ship!”, and the first half of “Judgment!” from Strange Tales #178, #179, and #180, respectively.  Running a query with Like "*Doctor Strange*" replaced with Like "*Warlock*" (note the wild cards) gives:


This approach works, but it is a bit awkward since it isn’t clear how much of “Judgment!” is included in one versus the other.  I don’t think that this is particularly a problem since this situation happens relatively infrequently, but there is a case to be made that the Publication Table should have a field added, maybe called ‘Comments’, where these little notes can be kept.

That brings us to the ‘Variant’ field in the Story Table.  That field was actually added well after the table was constructed, to address the following issue.  In 1986, Marvel started reprinting the Uncanny X-Men run that started with Giant-Size X-Men #1 in 1975.  Due to page count differences, the reprints appearing in 1986 had a variety of new pages added, and occasionally some panels or pages removed.  Since each issue was different, the Variant field is used to track the differences.  For example, consider the story “The Doomsmith Scenario!”, originally printed in X-Men #94 and then reprinted with modifications in Classic X-Men #2.  The two versions have separate entries in the Story Table; the original has the Story id value of 5506 and the variant a value of 8802.  They have identical entries for all the other fields except for the Variant field, where the description of the modification is placed.


Additional queries that I’ve run determine which publications are in each box, or how many of my comics were written by Steve Englehart or drawn by Jim Starlin.  Basically, once the data is in the database and the relationships are established, any mix-and-match scenario that one can imagine can be the basis of a query.
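
Those two kinds of query might look something like the sketch below.  As before, this is an illustration under assumptions, not my Access setup: it uses Python's built-in sqlite3 module, the function names publications_in_box and count_publications_credited are invented for the example, and the ids and Box numbers are made up.  The sample rows are the Warlock stories mentioned earlier, which Jim Starlin both wrote and drew.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publications (Publication INTEGER PRIMARY KEY, Title TEXT,
                           Number INTEGER, Box INTEGER);
CREATE TABLE stories (Story INTEGER PRIMARY KEY, Title TEXT,
                      Author TEXT, Illustrator TEXT);
CREATE TABLE pub_story_relation (Entry INTEGER PRIMARY KEY,
                                 Publication INTEGER, Story INTEGER);
""")

# Illustrative rows only: ids and Box values are hypothetical.
conn.executemany("INSERT INTO publications VALUES (?, ?, ?, ?)", [
    (1, "Strange Tales", 178, 4),
    (2, "Strange Tales", 179, 4),
    (3, "Warlock Special Edition", 1, 7),
])
conn.executemany("INSERT INTO stories VALUES (?, ?, ?, ?)", [
    (1, "Who is Adam Warlock?", "Jim Starlin", "Jim Starlin"),
    (2, "Death Ship!", "Jim Starlin", "Jim Starlin"),
    (3, "Judgment!", "Jim Starlin", "Jim Starlin"),
])
conn.executemany("INSERT INTO pub_story_relation VALUES (?, ?, ?)", [
    (1, 1, 1), (2, 2, 2), (3, 3, 1), (4, 3, 2), (5, 3, 3),
])

# Only these column names may be spliced into the SQL text below.
CREDIT_FIELDS = {"Author", "Illustrator"}


def publications_in_box(conn, box):
    """List (Title, Number) for every publication stored in the given box."""
    sql = "SELECT Title, Number FROM publications WHERE Box = ? ORDER BY Title, Number"
    return conn.execute(sql, (box,)).fetchall()


def count_publications_credited(conn, credit_field, name):
    """Count distinct publications holding at least one story crediting `name`."""
    if credit_field not in CREDIT_FIELDS:
        raise ValueError(f"unknown credit field: {credit_field}")
    sql = f"""
        SELECT COUNT(DISTINCT r.Publication)
        FROM stories AS s
        INNER JOIN pub_story_relation AS r ON s.Story = r.Story
        WHERE s.{credit_field} LIKE ?
    """
    return conn.execute(sql, (f"%{name}%",)).fetchone()[0]


print(publications_in_box(conn, 4))
print(count_publications_credited(conn, "Illustrator", "Jim Starlin"))
```

The wildcard match on the credit field matters because a single field can hold several names separated by ‘,’ or ‘&’.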

I will end with a few additional comments on the weaknesses of my implementation.  First and foremost are the names of the fields.  It would have been better to give the label fields used as identifiers names containing the abbreviation ‘Id’ (e.g. Publication Id instead of Publication).  This naming convention is not only a lot more precise than the one I implemented; it also prevents confusion when writing or designing a query.  Second, I might have included a series field in the Publication Table.  The last 10 or so years have seen publishers recycling names without incrementing the volume number, and many sites now track a comic by name, series, and year.  Currently I address this by adding a qualifier in the title field itself if needed.  Finally, it is easy to screw up the Genre field (which, incidentally, might have been better named Key Characters) by using similar but not identical terms – for example, Dr. Strange versus Doctor Strange.  There is no really good answer to this dilemma.  One possible solution is to create an approved list of character names in another table, which would then have a relationship to the Story Table, but this is a very difficult undertaking.  The better choice seems to be a judicious use of the wild card ‘*’ and some perseverance when making a query.
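
A lighter-weight middle ground between a full approved-names table and wild cards alone would be to normalize the Genre string before it is entered.  The sketch below is purely hypothetical and not something I have implemented: the ALIASES table and the canonical_characters function are invented for the illustration, and the alias list would have to be maintained by hand.

```python
import re

# Hypothetical alias list: lower-cased spelling -> approved spelling.
ALIASES = {
    "dr. strange": "Doctor Strange",
    "dr strange": "Doctor Strange",
    "doctor strange": "Doctor Strange",
    "spider-man": "Spider-Man",
    "spiderman": "Spider-Man",
}


def canonical_characters(genre):
    """Split a Genre string on '&' or ',' and map each name to its approved spelling.

    Names that are not in ALIASES are passed through unchanged.
    """
    names = [part.strip() for part in re.split(r"[&,]", genre) if part.strip()]
    return [ALIASES.get(name.lower(), name) for name in names]


print(canonical_characters("Dr. Strange & Spider-man"))
```

Joining the result back together with ‘ & ’ would give a consistent string to store in the Genre field, so that a single wild card search finds every spelling.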

How to Organize Your Comic Collection – Part 1

Whether you are a long-time comic collector or you are just starting out, comic book collecting can be both fun and frustrating.  The fun part should be obvious (eye-popping art, mind-blowing concepts, compelling drama), but where is the frustrating part, you may ask?  Well, if you own a lot of comics, organizing and listing your collection can be a daunting experience.  In this column, I’ll be sharing how I used some simple concepts and a relational database to organize my collection.

Before diving in, let me say that, for the beginning comic reader today, the task of organizing a collection is much easier than when I was a beginner.  I bought my first comic book in 1975 at a Kroger’s grocery store about 2 miles from my house.  There were no direct sales markets and no internet.  You got what you got, and often I would have gaps in the collection that would make understanding the story arc nearly impossible.

As I got older, I would visit comic shops and conventions or mail away (yes, using snail mail – as I said, there wasn’t an internet) for back issues.  Marvel comics were the staple back then, and I arrived at comic collecting about 13 years too late to capture the original start of the Fantastic Four, Spider-Man, the Hulk, Thor, the Avengers, and the X-Men.  Nonetheless, Marvel kept my dream alive that I might one day be able to read the original stories through a variety of ongoing reprint series.  For example, Marvel Tales helped me collect reprints of earlier Spider-Man stories, and Marvel’s Greatest Comics did the same for the Fantastic Four.

There was a downside to this, though.  Often these reprint titles would neglect to include a blurb saying where the original material came from, and they usually would start in the middle of the series, so that the issue number was different from the original.  Occasionally, a story that was originally in one issue was split over several of the reprint titles due to differences in page count.  Sometimes a reprint title would bundle multiple stories from different comics into one.  Case in point: Marvel Tales #3 reprints stories from four separate comics.  And, to top all this off, when they were desperate to make a deadline, Marvel would take a portion of an old story and wrap a few pages of new material around the front and back to frame it.  A classic example of this technique, which I refer to as a cameo, is Giant-Size Defenders #1, which has 9 pages of new stuff interleaved with 25 pages of reprints from The Incredible Hulk, the Golden Age Sub-Mariner, and Strange Tales featuring Doctor Strange.  This issue also included an additional backup reprint featuring the Silver Surfer from a Fantastic Four Annual.  Confusing, isn’t it?

Today the situation is certainly better, but not ideal.  Publishers in the last 10 or 20 years have tried to shape their story lines so that they fit nicely into collections, for those readers who want the stories without the bother of buying monthly or who missed the original run.  They also offer a dazzling array of comprehensive reprints of classic material in publications with various price points, ranging from inexpensive newsprint (DC Showcase & Marvel Essentials) to trade paperbacks to glossy high-end, hard-cover publications (DC Archives & Marvel Masterworks).

So, if you are a fan of the stories, there are certainly a lot of choices to find what you may have missed in the first go-around, but there is also a lot of opportunity for confusion in figuring out what you have and what you need.  Of course, you can make lists on paper or in spreadsheets, but sorting and cross-referencing is a real hassle, and how do you make annotations for cameos and split stories?  Well, the answer is a well-thought-out database schema and a relational database.

For those unfamiliar with these terms, let me give brief informal definitions and then apply them to the art of comic collecting.  A database is any method of storing and retrieving data associated with some object being described.  Common examples of a database include lists, spreadsheets, card catalogs, phone books, and the like.  A schema is a model for how the database is laid out, usually in the form of one or more tables.  For example, a phone book schema typically consists of one large table that lists the name of a household or business followed by its address and phone number.  A relational database is a sophisticated software application that relates data in multiple input tables to produce new output tables that answer specific questions.

For the rest of this post, I will focus on developing a good schema for comic collecting.  I’ll do this by considering a classic story that demonstrates most of the nuances encountered in organizing a collection, and show how a fairly simple schema can tame the complexity.  In the next post, I’ll talk about applying this schema in a relational database, adding some finishing touches, and show the results.

The story is “Beyond the Purple Veil!” featuring Doctor Strange.  Published in April of 1964, “Beyond the Purple Veil!” was an 8-page backup feature in Strange Tales 119, which was headlined by the Human Torch.


The next time this story appears is in Giant-Size Defenders #2 (1974) as one of three reprint stories backing up a 30-page new Defenders feature.

“Beyond the Purple Veil!” appears next in a compilation entitled ‘Doctor Strange, Master of the Mystic Arts’, published in 1978 by Pocket Books.  The book, which measures about 7 by 4 inches, contains reprints of the Doctor Strange backup stories from Strange Tales 110-111, 114-129, and 146, with only issue 110 identified.


It also appears in Essential Dr. Strange, a black-and-white compilation (2001), and three separate versions of Marvel Masterworks (1987, 2003, and 2010), the first and the third being hardcover and the second a trade paperback.

While it is possible to catalog each appearance of the story in a single list as separate entries as shown in the figure below, consider what happens if you notice that the title is missing the exclamation point at the end.  You now have 7 entries that have to be edited, each one time-consuming and prone to error, even with a search and replace.


In addition, how should the other stories, contained in each of these publications, be tracked?

The simplest way to accomplish all of these goals is to have three separate lists.  In the first list are all of the stories in the collection.  In the second list are all the publications in the collection.  The two key features of the comic collection are that multiple stories can appear in a given publication and that a given story can appear in multiple publications.  This many-to-many relationship is tracked in the third list where each entry is a single relationship that indicates which story is in which publication. Taken together, the three lists constitute the schema.

The figure below shows a partial realization of the schema for the seven publications discussed above, and for the two Doctor Strange stories “Beyond the Purple Veil!” and the one that followed it, “The House of Shadows!”, published in Strange Tales #120.

The first table catalogs the publications, giving each a unique Publication Id, which is called a primary key in the language of relational databases.  The second does the equivalent job for the stories.  In both tables, the order of entry is not important and any shuffling of either list is workable.  The third table, the relationship table, is the crucial ingredient.  Each entry represents a relationship between a publication and a story and is denoted by a unique Relationship Id.  Note that the relationship table clearly reflects seven appearances of the story “Beyond the Purple Veil!” in the seven publications discussed above.  Entries 3 and 8 show that the publication ‘Doctor Strange, Master of the Mystic Arts’ contains both the stories “Beyond the Purple Veil!” and “The House of Shadows!”.
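
For those who think more easily in code, the three lists can be mimicked with a few lines of Python before any database is involved.  This is just a sketch: the id numbers are illustrative (chosen so that entries 3 and 8 match the description above), the helper names publications_containing and stories_in are my own inventions, and only the appearances mentioned in this post are included.

```python
# List 1: publications, keyed by an illustrative Publication Id.
publications = {
    1: "Strange Tales #119 (1964)",
    2: "Giant-Size Defenders #2 (1974)",
    3: "Doctor Strange, Master of the Mystic Arts (1978)",
    4: "Marvel Masterworks, hardcover (1987)",
    5: "Essential Dr. Strange (2001)",
    6: "Marvel Masterworks, trade paperback (2003)",
    7: "Marvel Masterworks, hardcover (2010)",
}

# List 2: stories, keyed by Story Id.  The first title is deliberately
# missing its exclamation point, as in the example above.
stories = {
    1: "Beyond the Purple Veil",
    2: "The House of Shadows!",
}

# List 3: relationships, keyed by Relationship Id -> (Publication Id, Story Id).
relations = {
    1: (1, 1), 2: (2, 1), 3: (3, 1), 4: (4, 1),
    5: (5, 1), 6: (6, 1), 7: (7, 1),
    8: (3, 2),
}


def publications_containing(story_id):
    """Return the publications in which the given story appears."""
    return [publications[pub] for pub, story in relations.values() if story == story_id]


def stories_in(publication_id):
    """Return the stories found in the given publication."""
    return [stories[story] for pub, story in relations.values() if pub == publication_id]


# One edit fixes the title everywhere, because it is stored only once.
stories[1] = "Beyond the Purple Veil!"

for publication in publications_containing(1):
    print(stories[1], "appears in", publication)
print(stories_in(3))
```

The title is corrected in exactly one place, yet all seven appearances pick up the change, which is the whole point of splitting the single list into three.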

That is all there is to constructing a basic schema for organizing a comics collection.  Next week I’ll go into how to handle a story that gets split into more than one publication, or one that is only partially reprinted. I’ll also present how you can put some more columns in each table to track additional items, and how to mine the data to make very nice lists showing different aspects of your collection.

Is Anybody Up There Listening?

Very few commodities are as hot in the mainstream comics right now as is the purple-pussed, multi-chinned, mad Titan Thanos.  So, one would expect that the Marvel machine would run on his popularity for as long as possible.  And, indeed, we have been treated to a spate of appearances in various forms over the past several years.  Some have been good, some not so good, but none are as bad as the installment ‘Thanos: A God Up There Listening’ (TAGUTL).

TAGUTL is a four-issue limited run that follows the last of the mad Titan’s living children, Thane, as he tries to grapple with his new-found abilities and the fact that his father is probably the biggest mass murderer the universe has ever seen.  It builds idiocy upon the already highly unbelievable edifice erected during Infinity.

For those who missed the events of the Infinity series in the summer of 2013, Thane is half Inhuman and half Eternal (the Titans on Saturn’s moon of the same name are Eternals – a post for another day) who lived in the Inhuman city of Orollan as a healer and as a relatively normal person.  Having not been exposed to the Terrigen mists that cause the transformations in the Inhumans, he exhibited no special powers to speak of (although compassion and a desire to heal others should be regarded as a super-power all on its own).

Things changed when Thanos, desiring to wipe all his progeny from existence as yet another demonstration of his fealty to Mistress Death, showed up on Earth to find his last remaining whelp.  Seeking the location of his son, Thanos confronts Black Bolt, king of the Inhumans.  A violent and spectacular battle ensues, in which Thanos has the clear upper hand.  But, even as Black Bolt falls, he activates a Terrigen bomb that spreads the mist over the whole globe, sparking a birth of potentially thousands of Inhumans outside of the control of the imperial house.

Thane falls victim to the Terrigen mist with horrific results.  The Deviant strain inherited from his father comes to the fore and, in the blink of an eye, he wipes out all other life in Orollan.


Almost immediately after the massacre, the Ebony Maw, one of Thanos’s little helpers, shows up bearing gifts.  The Maw has brought a set of armor purported to help control Thane’s new abilities and to protect the people around him.  I suppose that it is possible that Thane, under the shock of all the death he caused, could agree without asking any details.  But more realistically, shouldn’t Thane have been on his guard when a stranger shows up out of nowhere, freely announcing his heritage?  Instead, Thane is quite happy to accept the ‘gift’.


Well, the good times don’t last, as at almost the exact instant the last stitch is donned, Ebony Maw surrounds Thane with a containment field, and then fawningly turns him over to Thanos.  We are treated to a touching Father-Son moment.


Eventually, Thane frees himself, and embraces his big daddy with his ‘right hand of justice’, freezing Thanos in almost the exact pose he occupied at the end of Marvel Two-in-One Annual #2 when Adam Warlock pulled the same trick.


At this point, one might expect Thane to fall into the once-bitten-twice-shy category, but no, Thanos: A God Up There Listening finds him quite willingly accompanying the Ebony Maw off planet to learn more about his dad.  They soon arrive at Planet Malady, where, according to the narrative, the atmosphere acts as an extremely strong alcoholic gas.  Stopping in Sclerosis Syd’s fine pub, they soon find the being they are looking for in the form of Trynka.

It seems that Trynka was an eye-witness to the one time when Thanos decided to take on Ego, the Living Planet, and that Thane, by telepathically linking with him, can experience all the events as Trynka did.

And how is this telepathic link mediated?  It seems by an elegant technique, where Trynka projects his eyeballs out of his head and onto Thane’s – the optic nerve serving as the hard-wire link between the two.


Immediately, Thane finds himself aboard the bridge as Thanos orders his fleet to assault Ego with a missile barrage.  The missiles strike Ego cleanly and, as the surface of the living planet burns, Thanos turns for a talk and a caress with Mistress Death, who may or may not really be there, as no one else can see her.  Thane, watching the tableau from the vantage of Trynka’s memories, immediately concludes “Oh God…He’s Insane.”  Here, then, is the idea that was first put forth in the Thanos Rising series: that Thanos may have imagined all his interactions with Mistress Death, since she remains unobserved by all but him, and has thus earned the adjective ‘Mad’ in his description as the Mad Titan.

The character of Thanos has fascinated me for a long time, a very long time. I’ve followed his evolution from a shadowy villain, to a profoundly ambitious threat to universal existence, to an almost philosophical but severely flawed intriguer under the pen of Jim Starlin.  In all these incarnations, there are two constant components of Starlin’s canon.  First, Mistress Death is a real entity who interacts with Thanos and can be seen by others in the storyline.  Second, despite his raw power, Thanos is a clever manipulator who uses direct force as a final resort.

The current ideas that were introduced in Thanos Rising and further explored in TAGUTL throw the Starlin canon completely away.  Here we have a truly stupid madman suddenly deciding that his sexually promiscuous past must be erased in a tribute to an entity that only he can see, and that he will go about it in the most stupid way imaginable.

Basic physics seems to elude our purple fiend so that he thinks a frontal assault with a simple missile barrage is all that is needed to kill a living planet.  Assuming that Ego is the size of Pluto (a dwarf planet), that gives the Living Planet an advantage in mass of over 3 billion, billion times, not to mention the Galactus engine it possesses as a power source and mode of transportation.  How then does this stupid version of Thanos think he can win?

The TAGUTL narrative gets even more stupid as it progresses, insulting the reader’s intelligence with a variety of unbelievable plot points.  Without adding too much in the way of spoilers, we learn that:

  • Thanos and his crew can be surprised when Ego retaliates
  • they all may or may not have died
  • Ego can eat people, bringing them into his core
  • Thanos likes lava
  • Thane can’t remember which of his arms kills and which brings living death
  • Thane is also mad, in that he may have left Earth on his own and not with the Ebony Maw

and so on.

I suspect that the editors have a similar assessment to mine.  I base this conclusion on the fact that the series was released all at once, with all four issues coming out at the same time and all without the usual Marvel bonus digital edition.

If you haven’t read it, I recommend against it.  And if you work at Marvel, I have to wonder, is anybody up there listening?  If so, just let Starlin write Thanos.  All of us will be a lot better off.