More On Tagging Blends

Posted on April 3, 2008
Filed Under Technical

My previous post on better searching and discovery of blends has generated a good discussion on the possibility of tagging. I thought it deserved its own thread here, so I am moving the discussion of tagging over to this post so we can keep it on its own.

Now, to respond to the comments so far. As a bit of background, this is an aspect of Information Science that I have a keen interest in, so I think a small primer is in order:

What we are talking about here is metadata: data about our data. There are two possibilities for organizing or generating metadata: Controlled Taxonomies (or Vocabularies) and what are now called “folksonomies”, previously known as free-tagging.

Controlled Taxonomies are, without a doubt, the most accurate and useful types of classification schemes for metadata. A different way to look at them is that Controlled Taxonomies are Prescriptive Vocabularies and folksonomies are Emergent Vocabularies. Each has its place in indexing information, and the discussion at hand is which is more relevant to our needs here at Tobacco Reviews.

I love well-designed metadata and Controlled Taxonomies. I labored in love over the ones that are already here on the site, and the even-better ones on the new version of the site, but I do not believe that they are always appropriate. If I thought that it was even remotely possible to create such a vocabulary for “Blend Styles”, I would be the first to jump in and start work on it. I do not, however, think it possible.

There are too many permutations of styles of Blends extant right now for one person (me) to get a handle on them without stopping all other work completely for a couple of years. There is a serious lack of information and definition about Blends that would have to be overcome before any serious attempt to classify them could be made (what constitutes a “casing” or “topping” or “flavoring”, do we have reliable information on constituent tobaccos in a blend, etc.).

Without this sort of information, or at least good approximations, we cannot create a Controlled Taxonomy. We could pretend to by sanctioning a certain set of terms to be used for tagging Blends, but that wouldn’t really help at all - it would just lend us an air of authority without anything backing it up. If that is to be the case, I would prefer a simple free-tagging system, perhaps with some sort of “garbage collection” system that automatically dropped old unused terms after a while, so that the useful terms (gauged by actual use) would rise to the top and the rest would sink out of sight.
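
A minimal sketch of what that kind of garbage collection might look like, in Python. The function names, the one-year idle window, and the data shapes are all assumptions made for illustration; nothing here describes code that Tobacco Reviews actually runs.

```python
from datetime import datetime, timedelta


def prune_stale_tags(last_used, now, max_idle_days=365):
    """Keep only the tags that were applied within the idle window.

    last_used maps a tag to the datetime it was most recently applied.
    The 365-day default is an arbitrary, hypothetical choice.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return {tag: ts for tag, ts in last_used.items() if ts >= cutoff}


def rank_by_use(use_counts):
    """Order tags from most used to least used; ties sort alphabetically."""
    return sorted(use_counts, key=lambda tag: (-use_counts[tag], tag))
```

Pruning on "last applied" rather than "first created" means a tag survives for as long as people keep reaching for it, which is the whole point of letting use decide.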

However, if there is the will amongst users to work on developing a serious Taxonomy of tobacco blends for the site, one that we can show to the world and defend the reasoning behind, I am all for it! I will provide as much help and support as I can for such a project and will be an enthusiastic participant. I will also, when such a project produces results, apply them to Tobacco Reviews and rigorously enforce their use for all new Blends added to the site.

How we would go about creating such a thing is a matter for much discussion. First, let’s see if anyone is interested in the project and then maybe figure out how to do it. Any takers?

p.s. — I use “we” above not in the royal sense to refer to myself, but to mean us, all the people using Tobacco Reviews and deciding its future direction.

Comments

10 Comments »

Comment by Illinois_Hick
2008-04-03 09:08:12

As I noted in the last thread, I think the anarchy that would develop from free-form tagging would result in a classification system that is less useful to users, and more work for the host. In addition to the problem of combining simple typos such as “Burley” and “Burly”, you must also wade through different methods of shorthand, such as “Vaper”, “Va/Per”, “Va Per”, etc.

Free-form metadata would also increase, not reduce, the editorial input required from the host, if the tags are to be at all useful. For instance, free-form tags for Carter Hall could include “Burley,” “Classic American,” “Old School,” “Drug Store,” “Traditional American,” etc. Are the reviewers all referring to the same idea in different verbiage? Sure, all the reviewers are referring to Carter Hall, but do they all mean the same thing by these descriptors? Should the tags be lumped together or stand alone? If every tag stands alone, then all you have is a search engine for different free-form descriptors, with overly narrow results. Likewise, if tags stand alone, the reader is also left to discern what the reviewer might have had in mind when he came up with his creative tag. There would be no definitions mediating the understanding of the reviewer and the understanding of the user.

What does the host do with editorial comments that are part of the tag? Should “Crappy English” be separate from “Great English” and from “Classic English”, or should they all be combined under “English”?

In short, I think free-form tagging requires either enormous editorial judgement from the host (far more than defined taxonomies) along with tedious housekeeping, or (barring such editorial judgement) becomes a simple search engine with unlimited possibilities. Count me in as a vote for defined taxonomies. It should surely be done with input from reviewers, and reviewers should do the tagging (and multiple tagging if you like), but there should be some useful definition to the universe, or many tags will be lost in obscurity, while many blends will lack meaningful tags. There should also be something of a common understanding of what tags like “American” mean, even if not all reviewers like the definition, or the metadata will crumble like the Tower of Babel.

 
2008-04-03 10:50:18

I have to agree with Illinois_Hick, without repeating the detail. The task could become far too difficult for the person in charge while being confusing to the end user.

Sometimes the simpler, the better. Burley is Burley - Butternut Burley is very different from Prince Albert but in the end, they are both Burley.

Part of the delight in smoking a pipe is trying new/different tobaccos. If I know I like Odyssey, for example, but I haven’t got a clue what kind of tobacco it is blended from, all I really need to know is what other blend out there would be close enough that I might also like it.

While there are always improvements that can be made, the existing database does allow me to find that information.

 
Comment by Tobold
2008-04-03 10:56:22

Jon,

With all due respect, I have to disagree with you about this metadata and I pretty much completely agree with Illinois_Hick. I think that I can see that you have a lot of the same obsessions with data as I do: Do it right, do it extremely accurately, and do it comprehensively to the nth degree! This is what leads me into situations where I almost have to give up on that particular project.

It’s even tough for me, but we all have to realize that NO ONE will EVER come up with a classification scheme for pipe tobacco that will be accurate. That’s the problem: accurate according to who?? (Whom??) We need to keep this in broad strokes without the hair splitting that will lead us down an endless road.

Here’s how I would approach it: (Your mileage may vary)

1. Use at least some of the categories on the Tobacco Aging FAQ site (under Touchstones), located at http://agingfaq.nocturne.org/touchstones.php to at least get started.

2. List four or five Blends that typify that category. The goal here is NOT to list the BEST Blends in that category, but to show what this category IS. F’instance: If you want to show that this category might allow a Burley component, then have at least one of the examples be one with Burley in it.

3. Post “your” list of categories and Blends on this blog and ask for opinions, additions, and subtractions. **Realize that there will NEVER be complete agreement!** But this step would be to correct obvious oversights (at least obvious in hindsight).

4. Use this “revised” list to build your tags. I would post a link on the home page to this list along with the examples so that people can see what “classification” this site uses. If I were wondering why I didn’t see a category I expected to see, and I find that the Blend that I would call the signature blend for my category is in your “English” category (for example), then I’d say, “OK, if I search for more like this, I’ll need to search this site’s English category.”

I wouldn’t get hung up on the difference between “casing”, “flavoring”, or “topping”, or whether or not any of these automatically dumps a blend into “Aromatic”, etc., etc. Start with some touchstones and let the site users do the rest. But I would use a Controlled Taxonomy.

Footnote: The history of Botany is one of “free-tagging”. This led to some serious problems and now a governing board has restructured plant taxonomy into a definite Controlled Taxonomy. And I’ll bet there are any number of plant scientists that rail against the current classification system. @:^)

– Doug Pearson

 
Comment by madmarv
2008-04-03 12:42:08

Agree here with Tobold. Perhaps I missed it, but will you allow the use of multiple tags by a single reviewer? If you do, that could help especially with those blends that don’t seem to fit any classic category.

Post your suggested tags in a separate thread. You don’t need that many - the Tobacco Aging FAQ would be more than enough for a start. Let users suggest other categories if they think the list isn’t complete enough, and use a poll to get a sense of whether it would be useful to add or not. I expect things will actually settle down pretty quickly.

 
Comment by Jon Tillman
2008-04-03 16:29:16

Let me ask a few pointed questions here before we go full-bore into designing something to scratch this itch:

0) Are tags the right way to go about doing this? The excitement about site users tagging blends with the taxonomy that has been suggested is identical to site users “Updating” a Blend to have more correct Blend Contents or “Cut” or whatever.

1) What is this taxonomy supposed to achieve? I am not being flippant, but I think actually knowing and stating the purpose of the additional metadata will point us in the correct direction.

2) What would be the point of tagging a blend “Vaper” instead of searching for blends containing Virginia & Perique, which is already possible with the Controlled Taxonomy of Blend Contents? Is there some granular definition of “Vaper” that is not correlative to Virginia/Perique? Is there a certain balance that must be achieved? Likewise, with reference to the Aging FAQ, what would be the point of tagging something “VA Flake” when you can already search for Flake style with Virginia contents? I won’t even touch the silliness that is “light” and “full” as those are entirely subjective and personal descriptors and have no place in any taxonomy at all. Latakia heavy = Balkan Style is not necessarily true either, so that is out, and also easily found via Blend Contents. Oriental, also covered by the Blend Contents.

3) What is the line between “Burley Blends” and “Traditional American”? It sounds a lot to me like they should be called “Tobaccos with Burley I Smoke” and “Tobaccos with Burley I Don’t Smoke”.

4) Lakeland Style - Here we have an actual blending style. Anyone care to take a stab at a real, actual definition of it? The best I can come up with is “A steamed, medium strength Virginia-based blend from the Lakeland district in England”.

5) Rope and Plug - easily done via the existing Cut taxonomy.

6) So far, everything I have seen put forth as potential tags or uses for tags is just a recapitulation of already existing metadata taxonomies on Tobacco Reviews. If that is the point of the project, then what we need is a better designed, easier to use Advanced Search, perhaps with a few “recipes” for finding certain blend profiles.
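
To make the “recipes” idea concrete, here is a rough Python sketch of saved searches built on top of the existing Blend Contents and Cut metadata. The `Blend` record, the `RECIPES` table, and the recipe definitions themselves are hypothetical, written only to show the shape of the thing; in particular, whether “Vaper” really means nothing more than “contains Virginia and Perique” is exactly the open question raised in point 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blend:
    """Hypothetical record carrying the two existing taxonomies."""
    name: str
    contents: frozenset  # e.g. frozenset({"Virginia", "Perique"})
    cut: str             # e.g. "Flake"


# Each recipe is a named predicate over the controlled metadata.
# These definitions are assumptions, not agreed-upon category rules.
RECIPES = {
    "VaPer": lambda b: {"Virginia", "Perique"} <= b.contents,
    "VA Flake": lambda b: "Virginia" in b.contents and b.cut == "Flake",
}


def run_recipe(recipe_name, blends):
    """Return the names of the blends that satisfy the named recipe."""
    matches = RECIPES[recipe_name]
    return [b.name for b in blends if matches(b)]
```

The appeal of this design is that a recipe adds no new metadata at all: it is only a stored query, so changing what “VaPer” means later requires editing one line rather than retagging blends.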

Let me throw out another tagging possibility, one that has been used to great effect in places like LibraryThing - personal tagging. You get to tag everything with whatever you’d like, but no one else has to look at your silly tags. Not only does a tag like “Vaper” then have a definite meaning to the only person who will use it (me), but I can also tag things with “TO-TRY”, “TO-CELLAR”, “WORST-TOBACCO-EVAR”, etc…
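
A small Python sketch of how personal tagging could be stored, with every tag scoped to the user who applied it. The `PersonalTags` class and its method names are invented for this example, and the upper-casing of tags is just one possible way to keep a single user's own spellings consistent.

```python
from collections import defaultdict


class PersonalTags:
    """Hypothetical per-user tag store: nobody sees anyone else's tags."""

    def __init__(self):
        # user -> normalized tag -> set of blend names
        self._tags = defaultdict(lambda: defaultdict(set))

    def tag(self, user, blend, tag):
        """Attach a tag to a blend on behalf of one user."""
        self._tags[user][tag.strip().upper()].add(blend)

    def blends_for(self, user, tag):
        """List, alphabetically, the blends this user gave this tag."""
        per_user = self._tags.get(user, {})
        return sorted(per_user.get(tag.strip().upper(), set()))
```

Because lookups always go through a user first, two people can use “TO-TRY” for entirely different lists without ever colliding.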

 
Comment by madmarv
2008-04-03 18:23:22

“…then what we need is a better designed, easier to use Advanced Search, perhaps with a few “recipes” for finding certain blend profiles.”

I think you hit it here. The only use I saw for tags was to create an easy to use quicksearch for generally accepted blend categories. I’m not interested in whether a particular blend is an English or a Balkan, as long as it shows up when I hit ‘find similar’ from a blend that IS similar.

To put it another way, I’m not really interested in precisely naming components or categorizing blends. I’m looking for a fairly broad listing for blends I might like, based on one I know I like, and preferably ranked from ‘most similar’ down. That hasn’t worked very well so far, since two blends with the same components can taste very different.

Comment by Jon Tillman
2008-04-04 08:31:45

Yeah, the searching needs to get nailed down. I’ve always been disappointed in how it came out. Maybe now I can get it better organized.

As to the broad-categories idea, I would still like to play around with possible ways to tag blends, but the traditional “English”, “Aromatic”, “Balkan” way of dividing them up doesn’t really have any relevance these days.

What I’d really like to see is an honest attempt to cluster tobaccos together in a new way. Towards that end, see my response below…

 
 
Comment by Hemlock
2008-04-03 21:51:02

Not sure about the science behind metadata and the like, but it would be helpful if we could get a report on each blend that showed what other blends reviewers who “recommended” or “highly recommended” it also “recommend” or “highly recommend”. This way similar tastes can be shared, without classifying by the old pigeon holes.

Another suggestion is to be able to sort by similar ingredients in a blend, e.g. leaf type and/or fragrance and cut. This way grouping could be done by a number of characteristics without classifying a blend according to outdated or ambiguous terminology.

Comment by Jon Tillman
2008-04-04 08:33:27

This is a hugely needed feature, and the first thing I’m going to work on once the site is moved to the new code.

I am convinced that there is a way to discover what the real, actual groupings of tobacco blends are through such a feature. I’ll post more as I flesh out the idea.
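
As a starting point, the report Hemlock describes could be sketched in Python roughly like this. The function name and the input shape (each reviewer mapped to the set of blends they rated “recommended” or “highly recommended”) are assumptions for illustration, not the site's actual data model.

```python
from collections import Counter


def also_recommended(blend, recommendations):
    """Count what else the fans of one blend recommend.

    recommendations maps a reviewer to the set of blends that reviewer
    rated "recommended" or "highly recommended" (a hypothetical shape).
    Returns (other_blend, reviewer_count) pairs, most shared first,
    with ties broken alphabetically.
    """
    counts = Counter()
    for liked in recommendations.values():
        if blend in liked:
            counts.update(liked - {blend})
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Raw counts like these will favor blends that simply have many reviews, so a real version would probably need some normalization before the groupings it produces mean much.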

 
 
Comment by christian hagen
2008-04-04 05:16:09

many years ago, when i started smoking a pipe, the danish tobacco-lingo allowed for 3 types of tobacco: VIRGINIAS (loose/flake/plug/rope), including those blended with a smaller portion of perique, kentucky or burley, scented as well as unscented. BURLEYS (cube cut/flake), more or less sweetened/cased. and finally, “MIXTURES”, covering everything with noticeable oriental leaf in it, be that smoked or unsmoked.

now, if the categories should consistently signify the types of tobacco predominant (taste wise) in a blend, the above mentioned “mixtures” should be called “orientals”. that, of course, leaves us with the problem that “orientals” to most of us nowadays means “orientals-other-than-latakia”. i don’t know how to solve that, but i do feel that a subdivision of blends should try for consistency. no “english” or “scottish”, please ;-) (but perhaps “danish”/“german”/“dutch”/“english”/“american”?).

 