Wednesday, May 26, 2010

The Trouble With Satchmo

Conversations on the web about Satchmo, the most popular open source Django e-commerce platform, tend to be between two types of developers. The first kind encounters some problems with installing Satchmo, or hits a snag in deploying one of the features. Naturally, they pose the question: does it have to be this hard? Are there any alternatives?

At this point, the second type of developer pipes up and says something like, "If you think it's too hard, then you're not trying hard enough. Learn more and it'll be easier for you. Satchmo is just fine, so quit your whining."

I can see where both sides of these kinds of debates are coming from. I've gone through the work of getting Satchmo running on Linux machines and deploying it to servers, and I've even gotten it running on Windows boxes. And it is really a pain in the tush. For all of its features, there are a ton of dependencies that have to be installed and properly configured on your system before you can get it up and running. So I can empathize with the first kind of developer.

When you get right down to it, though, the second type of developer mentioned above is also correct. As arrogant as they can be, if you spend some time learning how to set up Django development environments, how to configure your PYTHONPATH, and so on, then it becomes much easier to install and configure Satchmo. You still run into problems, but you know how to handle them, and you know how to reduce the amount of repetitious work that needs to be done.

Satchmo is an e-commerce solution that was created by programmers almost exclusively for use by other programmers. If you're an experienced Django programmer, and your spouse wants to set up an online store to sell toys for dogs, then Satchmo is the logical choice. All of Satchmo's installation headaches, catalog management difficulties through the admin interface, and documentation shortcomings can be dealt with if you know Django or you've got a Django programmer helping you out. (Maybe.)

Don't get me wrong; I like Satchmo just fine. It's got a very large community of developers behind it, and it's certainly better than anything that one individual could create by themselves. But it's definitely not a great way to start learning Django.

The kind of developer who looks at Satchmo and thinks, "Man, can't we make this easier?" recognizes that while programmers can learn programming solutions in order to make their jobs easier, not everyone who wants to do e-commerce online is a programmer capable of learning and applying those solutions.

That's not to say that we can get to the point where programmers are unnecessary. Consider the PHP e-commerce community: for many years, PHP developers wrestled with the decision of creating their own codebase, or going with one of the many open source solutions that were available. Many of them were fraught with the same kinds of installation and maintenance headaches that plague Satchmo, most of which could be alleviated if you became more experienced with PHP.

But that wasn't quite good enough for the people who developed Magento, which, while still not perfect, is an extremely popular e-commerce framework among both developers and people in business. One of the main reasons? It's easy to install and set up. Which brings up a critical observation: I don't think ease of use is appealing only to non-programmers.

Here's the important thing to note: Magento might be open source, but like MySQL, its makers manage to make money from the software using a freemium business model. "Open source" does not equate to a lack of business sense or profit opportunities.

Personally, I'm not completely satisfied with Satchmo. But if you like it and want to join the community that is helping develop it, don't let me stop you. Those developers are doing great work, and they deserve all the help they can get.

In the Django community, Satchmo is almost certainly the elephant in the room, but I don't think that should stop anyone who doesn't like it from creating their own e-commerce platform. If Satchmo is too much of a pain for you, why not try creating something of your own? And that was part of my motivation for writing my book about Django e-commerce: for those developers for whom Satchmo just isn't cutting it. I wanted to empower the readers of my book (who are probably way smarter than me, and better programmers) to scratch their own itch if they so desired, to run off and start their own projects. If nothing else, I hope it gives programmers the ability to get a handle on doing e-commerce with Django so that they have an easier time with Satchmo.

But I think the opportunities are many, and the potential for innovation is huge, in the Django e-commerce realm. I really can't wait to see in a few months' time what awesome projects Django programmers are creating in someone's basement right now.

Monday, May 24, 2010

Misspellings In Search

Ever heard of a word ladder? It's a game invented by Lewis Carroll, the author of Alice in Wonderland. Here's how it works: you're given two words, and you start with one of them. Change one letter in the word at a time, or add or remove a letter, to arrive at the second word. At each step along the way, the changes you make have to result in valid words.

Here's an example: how do you turn "LEAD" into "GOLD"? Try it yourself right now if you're interested in solving the puzzle.

Here's the solution I devised back when I tackled this problem in middle school, changing one letter at a time:

LEAD
HEAD
HELD
HOLD
GOLD

The key here is that I changed the beginning "L" to an "H", since "LELD" and "LOLD" are not real words. It's an extra step that's needed to conform to the rules of the game. But as you can see, it takes four "steps" on the word ladder in order to get from the starting word to the ending one.
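If you'd like to check a ladder mechanically, here is a minimal sketch of the rules described above. The function names and the tiny WORDS set are made up for this illustration; a real checker would need a proper dictionary.

```python
def one_edit_apart(a, b):
    """True if b results from changing, adding, or removing exactly one letter of a."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ.
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Lengths differ by one: dropping one letter from the longer word
    # must give the shorter word.
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))


def is_valid_ladder(ladder, dictionary):
    """Every rung must be a known word, and each step must be a single edit."""
    if any(word not in dictionary for word in ladder):
        return False
    return all(one_edit_apart(a, b) for a, b in zip(ladder, ladder[1:]))


# Hypothetical word list, just big enough for this puzzle.
WORDS = {"LEAD", "HEAD", "HELD", "HOLD", "GOLD"}

print(is_valid_ladder(["LEAD", "HEAD", "HELD", "HOLD", "GOLD"], WORDS))  # True
print(is_valid_ladder(["LEAD", "LELD", "LOLD", "GOLD"], WORDS))          # False
```

The second ladder fails because "LELD" and "LOLD" are not in the word list, which is exactly why the extra step through "HEAD" is needed.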

Counting the steps from the first word to the second turns out to be useful when you want to offer spelling corrections to users who are doing internal searches on your site. You know how Google offers you polite suggestions when you misspell a word? (e.g. "Did you mean: 'electric guitar'")

The minimum number of character insertions, deletions, and substitutions required to convert one string to another is known as the Levenshtein distance. Unlike word ladders, the Levenshtein distance doesn't care if each step along the way results in an actual word. So, the steps between "stretch" and "straight" would be as follows:

stretch
stretcht
stratcht
straicht
straight

The Levenshtein distance between the two strings is 4. The closer two words are together, the lower the distance score between the two of them will be.

What does this mean for misspelled search terms? If someone enters some search text on your site, and there are zero matching results returned, you might consider sweeping through your database and looking for strings or phrases with a very low Levenshtein distance, in an effort to catch misspelled words.

If someone enters "accoustic guitar", they might get no matching results. But, if your product data contains the phrase "acoustic guitar", you might be able to catch that and display a hyperlink offering the alternative "Did you mean...?" text that will repeat the search with the new word if clicked.

So how do you calculate the Levenshtein distance in Python? There are a few ready-to-use Python functions listed on this Levenshtein Distance wiki page. I've been using the first one listed in a couple of my projects and it seems to work very well.
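For reference, here is a minimal sketch of one common way to compute it, using a two-row dynamic programming table. It is not necessarily the same function as the one on the wiki page, and the suggest() helper, its max_distance parameter, and the sample product names are assumptions made up for this illustration.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the rows as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            insert = current[j - 1] + 1
            delete = previous[j] + 1
            substitute = previous[j - 1] + (char_a != char_b)
            current.append(min(insert, delete, substitute))
        previous = current
    return previous[-1]


def suggest(term, candidates, max_distance=2):
    """Return the candidate closest to term, or None if nothing is
    within max_distance edits (hypothetical helper)."""
    best, best_distance = None, max_distance + 1
    for candidate in candidates:
        # The distance can never be smaller than the difference in length,
        # so hopeless candidates are skipped without doing the full computation.
        if abs(len(candidate) - len(term)) >= best_distance:
            continue
        distance = levenshtein(term, candidate)
        if distance < best_distance:
            best, best_distance = candidate, distance
    return best


print(levenshtein("stretch", "straight"))  # 4
print(levenshtein("LEAD", "GOLD"))         # 3

products = ["acoustic guitar", "electric guitar", "bass guitar"]
print(suggest("accoustic guitar", products))  # acoustic guitar
```

Note that "LEAD" to "GOLD" scores 3 here, one less than the four rungs of the word ladder, because the distance doesn't require the intermediate strings to be real words.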

Naturally, this is not a substitute for a complete full-text search package like Django Sphinx or your actual search algorithm; it's just something you might consider falling back on. Also, you want to be very careful about how you implement this on a live site where you're searching through thousands of records. Computing the Levenshtein distance between some search text and the words and phrases in a TextField (like, for example, in a product description field) would be a very slow, inefficient way to search for possible spelling corrections. If you do implement something, make sure you test its performance before releasing it into the wild.

However, if you have a small site in development, and want to try your hand at helping guide the user if they've possibly misspelled a search term, I've found that the Levenshtein distance is a great quick-and-dirty way to get started.

Thursday, January 28, 2010

Django and Ajax Form Submissions

I'm writing this post, in part, to correct a mistake. In Chapter 10 of the book, I cover how to add Ajax functionality to our Django e-commerce site by using the jQuery library. In the interest of keeping the chapter short and easy to understand, I kept the coverage in that chapter very rudimentary.

In hindsight, it was, I believe, a little too rudimentary, and I don't think the code is nearly as good as it could be. It's not very DRY (since we repeat ourselves), it's not degradable (if the user has JavaScript turned off, it doesn't work), and for larger forms with dozens of fields, it's likely to become a maintenance headache and possibly hurt performance.

This isn't errata, but it's worth taking another look at. Let's revisit how to do a simple Ajax form submission in a Django project with jQuery, using the submission of product reviews as an example.

(I'm assuming, if you're reading this, that you have the book handy and can follow along with what I'm doing. If you don't, I'll try to explain the concepts well enough so that you don't need the book. Just keep in mind there's a ProductReview model and associated form class being referenced here that I haven't included.)

Let's start with the URL entry that links to the view that accepts product review data, on page 219:

urlpatterns = patterns('ecomstore.catalog.views',
  ( r'^review/product/add/$', 'add_review' ),
)

One of the main problems is that this URL is defined here and then hard-coded a second time in our JavaScript code, in perfect violation of the DRY principle. We're going to fix this. The first step is to give this URL entry a name parameter:

urlpatterns = patterns('ecomstore.catalog.views',
  ( r'^review/product/add/$', 'add_review', {}, 'product_add_review' ),
)

That means that, in our templates, we can use the URL tag and refer to 'product_add_review' to reference this URL. If it's in the template, we can make sure it's in the DOM someplace, and if it's in the DOM, then we can make our JavaScript code aware of it. That's exactly what we're going to do next.

In the book, on page 214, the fields are just text inputs on the page, and aren't actually contained in a form element. Instead, they sit in a div element with an id attribute of 'review_form'. This is, at best, a missed opportunity.

Here's the new template code. Again, if you don't have the book, just know that 'review_form' is an instance of the review form class and that 'p' refers to the product instance on the current product page:

<div id="review_form">
<form id="review" action="{% url product_add_review %}" method="post">
  <div id="review_errors"></div>
  <table>
    {{ review_form.as_table }}
    <tbody><tr><td colspan="2">
      <input id="id_slug" name="slug" value="{{ p.slug }}" type="hidden">
      <input id="submit_review" value="Submit" type="submit">
      <a href="javascript:void(0);" id="cancel_review">Cancel</a>
    </td></tr>
  </tbody></table>
</form>
</div>

With this, the DOM has all the semantic information it needs for posting a review form, and it can pass that along to our JavaScript. Also notice that I changed the "Cancel" button from a submit input to an anchor element.

Now, the changes we're going to make to our JavaScript are more than superficial. If you're sitting there with the book in front of you, you won't be able to just change the addProductReview() function and have it work, because we're going to change the events that are fired and how we're attaching those events.

First, we're going to attach the addProductReview() function to the form element. Inside of the prepareDocument() function, the contents of which are listed in the middle of page 216, remove this line:

jQuery("#submit_review").click(addProductReview);

and replace it with this:

jQuery("form#review").submit(function(e){
  addProductReview(e);
});

Now, the addProductReview() function will fire whenever someone submits the form, which is done by clicking the "Submit" button.

Let's turn our attention to the addProductReview() function itself and give it an extreme makeover. In the book, we use the jQuery.post() function to submit our Ajax request, but here, I'm going to use the jQuery.ajax() function, which allows for some more fine-grained control over the submission process. Here is the reworked version, which I will explain, line-by-line, in a moment:

function addProductReview(e){
  e.preventDefault();
  var review_form = jQuery(e.target);
  jQuery.ajax({
    url: review_form.attr('action'),
    type: review_form.attr('method'),
    data: review_form.serialize(),
    dataType: 'json',
    success: function(json){
      // code to update DOM here
    },
    error: function(xhr, ajaxOptions, thrownError){
      // log ajax errors?
    }
  });
};

This new addProductReview() function takes a single argument, 'e', which refers to the form submission event itself, passed down from on high when we attached the function to the form's submit event. 'e' allows us to do a couple of interesting things.

First, we call preventDefault() on the event. That means that the JavaScript will halt the normal form submission, so the browser won't reload the page. Naturally, if the user has JavaScript disabled in their browser, e.preventDefault() won't fire, and the form will post to the "Add Review" URL as usual, the non-ajaxy way. This is the first step to ensure that our form degrades gracefully. (We'll revisit this in a moment, when we look at the view function.)

Next, we can reference the form element itself by wrapping e.target in a jQuery selector:

var review_form = jQuery(e.target);

'review_form' now refers to the 'form#review' element on the page, which includes all of its attributes and child elements, including the form inputs themselves. We're going to use this for the values we submit with our Ajax request.

The jQuery.ajax() function takes a few parameters. Here are the ones that I've used, along with a quick definition of each:

1. url - The URL path to which we should submit the request.
2. type - the HTTP method, either "get" or "post".
3. data - the form values, as a set of name-value pairs encoded in URL format. (e.g. "name=john&content=good+book!")
4. dataType - the type of data we expect as a response. (in this case, "json")
5. success - a function that handles the response after a successful request.
6. error - a function to handle unsuccessful requests. (optional)

The first three items, we get right from the form contents itself. The first two, we obtain by using the attr() function to get the values of the attributes of the form element:

url: review_form.attr('action'),
type: review_form.attr('method'),

The third one uses a new function that allows us to get the values of the form inputs as a set of name-value pairs, as a single URL string:

data: review_form.serialize()

That's much easier than having to spell out a selector and create a new variable for each input on the form, as was done in the book. This is especially true if you have a form with dozens of fields. The serialize() function is a handy shortcut. Also, DOM selection is an expensive operation for JavaScript, so doing a single selection of the form element and serializing it is much quicker than selecting the elements one-by-one, from a performance perspective.
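To make the format concrete, here is a small sketch of what a serialized form body looks like from the receiving end, using only Python's standard library. The field names "name" and "content" are just the ones from the example string, not necessarily the real fields on the review form. Django does this decoding for you and hands you the result as request.POST.

```python
from urllib.parse import parse_qs

# A body like the one serialize() builds for a two-field form:
# pairs joined with "&", spaces encoded as "+".
body = "name=john&content=good+book!"

# parse_qs maps each name to a list of values, since a name
# can legally appear more than once in a form submission.
fields = parse_qs(body)

print(fields["name"][0])     # john
print(fields["content"][0])  # good book!
```

One thing worth remembering: only inputs that have a name attribute end up in the serialized string, so every value the view expects needs a named input in the form.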

For the last few items: we expect the dataType the server returns to be a JSON object. Inside of the success function, you can define the same code that was in the book, since the DOM update operations will be largely the same. Lastly, while we didn't create an architecture to handle Ajax errors in the book, I included the block where you would put code to handle any errors your Ajax request encounters. This could be useful for logging purposes.

Now that this is done, we just need to make a couple of small changes to our view function. First of all, requests that come into this URL from form submissions might be Ajax requests, or they could be coming from users who have JavaScript turned off and have just submitted a form the traditional, non-Ajax way. Our view function needs to check this and respond accordingly.

Here is the new view function in catalog/views.py:

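from django.contrib.auth.decorators import login_required
from django.http import HttpResponse, HttpResponseRedirect
from django.template.loader import render_to_string
from django.utils import simplejson
# Product and ProductReviewForm are the catalog app's model and form (not shown here)

@login_required
def add_review(request):
    if request.method == 'POST':
        form = ProductReviewForm(request.POST)
        slug = request.POST.get('slug')
        product = Product.active.get(slug=slug)

        if form.is_valid():
            # attach the user and product before saving the review
            review = form.save(commit=False)
            review.user = request.user
            review.product = product
            review.save()

            template = "catalog/product_review.html"
            html = render_to_string(template, {'review': review })
            response = simplejson.dumps({'success': 'True', 'html': html})

        else:
            html = form.errors.as_ul()
            response = simplejson.dumps({'success':'False', 'html':html})

        # Ajax callers get JSON back; everyone else returns to the product page
        if request.is_ajax():
            return HttpResponse(response,
                content_type="application/javascript")
        else:
            return HttpResponseRedirect(product.get_absolute_url())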

Again, I haven't defined the Review model or form in this case, but you should get the idea. We still handle the validation and saving of the new review instance in the same fashion. Towards the end, we just conditionalize the type of response we return depending on the nature of the request. If it's an Ajax request, we return the JSON object to the JavaScript function that we assume called it. Otherwise, it's a non-Ajax request coming from someone who has JavaScript disabled, and we just reload the current product page, with their new review posted.

The key here is in using request.is_ajax() to check the nature of the request. jQuery, as well as most other major JavaScript libraries, adds an X-Requested-With header to the request with a value of "XMLHttpRequest", which Django exposes as HTTP_X_REQUESTED_WITH. If Django finds this header in the request with that value, request.is_ajax() returns True.
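The check itself is tiny. Here is a minimal sketch of the same test written against a plain dictionary standing in for the request's META data; the looks_like_ajax() name is made up for this illustration and is not part of Django.

```python
def looks_like_ajax(meta):
    """Mimic the header check on a dict shaped like request.META."""
    return meta.get("HTTP_X_REQUESTED_WITH") == "XMLHttpRequest"


# A request sent by jQuery.ajax() carries the header...
print(looks_like_ajax({"HTTP_X_REQUESTED_WITH": "XMLHttpRequest"}))  # True

# ...while a plain form post from a browser without JavaScript does not.
print(looks_like_ajax({}))  # False
```

Later Django releases deprecated and then removed request.is_ajax(), so on a newer version a hand-written check along these lines is what you would reach for instead.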

There's one small problem with this: if a user has JavaScript turned off, and they submit a product review that doesn't validate, we don't communicate this to the user. They just get the page reloaded without their product review and no indication of what went wrong. In the case of our site, we already display a warning to people with JavaScript disabled that stuff might not work as expected. (Depending on your requirements, that might be unacceptable, but I leave it up to you, and the architecture of your own site, to determine how you could fix that particular bug.)

To recap, we've made the following improvements, which I encourage you to do elsewhere in your Django projects:

1. Give the Ajax request URL a name in your urls.py file, and use the {% url %} tag in the DOM of your page so jQuery can access it. (If the action attribute should be the current path, you can use {{ request.path }} in most cases in your templates to spell that out explicitly.)

2. Structure the elements of your page so they function all right even if the user has JavaScript disabled in their browser.

3. For forms, define the method for submission in the 'method' attribute of the form and use jQuery's attr() function to get it.

4. You can serialize the contents of a form for Ajax requests using the serialize() function on the form element.

5. In the view function, use the request.is_ajax() method to determine if the request came in via an Ajax request and, based on the origin, send the appropriate response.

As a side note, a lot of readers have complained about problems with the djangodblog app. For one, the installation instructions in the book don't work with current versions, but more importantly, it seems that if your database uses the UTF-8 character set encoding, django-db-log is not compatible with it. That's a bit disappointing, and I'm trying to figure out what to do about it.

Sunday, January 24, 2010

Money Isn't Everything In Branding

It really isn't. If it were, then you could launch any product and, given unlimited money, buy tons of TV ad time and you'd have a successful product on your hands. But that theory just doesn't make sense. In my household, when the television commercials come on, one of two things happens:

1. Everyone tunes them out and goes back to pecking on their laptops, or

2. We start going MST3K on them, ridiculing them and interjecting our own snide and cynical comments into the inane little stories and dialogue.

Just about everyone I know is pretty good at ignoring commercials. My generation has evolved with a small part of the brain that knows how to ignore media when it's not relevant to us. There are a few that stand out amongst the dross, but most are just clutter and don't get stuck in my head. That means just buying TV ads and running them doesn't work anymore, which comes as no surprise.

Case in point: Dr. Pepper and Coca-Cola came out around the same time (around 1885), but were created by different companies. In 1972, the Coca-Cola company decided they would release a Dr. Pepper-esque clone, called Mr. Pibb, to try and take some of the market share away from Dr. Pepper. I don't have exact figures, but I rarely see Mr. Pibb in convenience stores and it's not very prominent in most grocery stores.

If money were the only thing that mattered, you'd think that a company like Coca-Cola, which has an annual advertising budget that could probably acquire an entire country, would be able to pose a significant threat to the Dr. Pepper brand. But we don't see Mr. Pibb many places other than in restaurants and diners on soda machines that the Coca-Cola company has dominated with its own line of beverages. Sure, it's still around, but Dr. Pepper clearly continues winning the battles in the marketplace.

When someone wants a Dr. Pepper, they immediately think of the name "Dr. Pepper". It's next to impossible for Coca-Cola to dislodge that name from people's minds to replace it with their own. It helps to get there first.

A more relevant, techno-geek example: take the new Bing search engine. Without question, at the time I'm writing this, Google is the largest and most frequently used search engine on the web. Its name has become synonymous with search. When people need to search on the web, they think "Google" and that's what they use. It's lodged in their brains.

Microsoft introduced Bing in mid-2009, a move clearly aimed to take some market share away from Google and Yahoo! In their ad campaigns, they show people who have used Google extensively for search babbling incoherent clanging like schizophrenic patients. They go on to suggest that "search overload" is something we should all be terrified of, and to avoid a similar fate, we should switch to using Bing.

This message seems a little odd to me, since there isn't any truth in it that resonates. It reminds me of the episode of the television show "Arrested Development" where the character Gob Bluth opens a banana stand right next to his brother's banana stand to compete with it, and his slogan is "A frozen banana that won't make you sick and kill you". It's mudslinging without any truth to the message, and it's easy for the mind to ignore what it can easily discredit.

Moreover, Microsoft has pushed hard to brand Bing as a "decision engine" instead of a mere "search engine". I suppose this is their means of differentiating Bing from Google, but that's like trying to sell a "pop machine" to compete with a "soda machine". When you go to the Bing homepage, it seems just like Google, with a few small differences. It hardly seems like enough to convince people that they should switch over to using Bing instead of Google.

That's not to say Bing will wither and die completely. People are using it, but it's probably never going to handle the majority of web searches. If that's Microsoft's goal, then they're out of their minds. Google is the Dr. Pepper and they're Mr. Pibb.

An important corollary of that is the fact that Google is currently working to develop its own Chrome operating system, which will be open source and available as an alternative to Windows. Spokespeople for Google have stated that they have no intention of trying to use Chrome to replace Windows as the most common OS.

Positioning tactics are everywhere. The other day I saw a bar of chocolate that said "Chocolate for wine lovers." People drinking wine can easily buy a Hershey's bar, but the company is trying to sneak their own chocolate into the minds of wine drinkers. Does chocolate even go with wine? Who knows? Maybe they'll put the idea in people's heads by putting it out there. (Problem is, I don't remember the name of the brand of chocolate. Fail.)

There's a very good book on the subject of positioning, called (aptly enough) Positioning, that covers this topic very well. It was written over thirty years ago, but the book is a very good overview of modern marketing, full of examples, and is very well written. If you're accustomed to reading long and dense computer books, you can probably finish the book in a few hours. If you're planning to start your own Internet company and don't know anything about branding, I'd say reading the book is well worth your time and money.

The important takeaway here is that money is definitely not everything. If you're a small business owner or just planning to start your own niche e-commerce site, that's great news, because it means there's a good chance you can achieve success without spending a whole lot of money on advertising. If you're selling a new product in a new market space, you'll probably have an easy time getting customers, provided you're selling something that people want to buy and your site seems trustworthy in appearance.

If you're entering an existing market with established competition, you'll have a much easier time if you position yourself relative to your competitors. I did it when I wrote my book on Django. I wanted to write an introductory book that covered the framework, but of course, there are plenty of "Introduction to Django" titles. Choosing to go the e-commerce route was my way of finding an untapped niche and distinguishing my book from others.

In following the "Modern Musician" example in the book, imagine that you want to sell musical instruments and other accessories online. I would first look at who the big players are. In terms of online merchants catering to musicians, Guitar Center and Musician's Friend are two big ones that show up on Google. They're also probably first in line in the mind of our prospects (customers) when they feel the urge to go buy guitar strings, capos, etc.

Now, Guitar Center and the other big players seem to make a strong effort to cater to everyone, no matter what they might need. For Modern Musician, that's probably not a good strategy. Let's start much smaller. These days, with GarageBand on Macs and lots of other PC software recording programs, lots of musicians are doing home recording. Perhaps it would make sense to try and brand Modern Musician as an online merchant that supplies instruments to those musicians who have studios in their basements and garages. The slogan could be "The go-to place for DIY home studio recording."

By itself, of course, that's not a great idea, the slogan's bad, and I'm sure it's being done by someone at the moment. But the hope is that you can find a hole in the minds of your customers, and that your brand can fill it. Give people a reason to think of your name when they are thinking about playing, writing, or recording music. Help them feel your store is helping them self-actualize their aspirations and dreams of being a musician. Remind them of the thrill of performing live or connecting with a raving fan of their music.

Naturally, whatever you promise, just make sure you can follow through on it and that it resonates with the truth. Otherwise, it's no better than the Bing commercials and, unlike Microsoft, you probably can't afford to fail.

Of course, if you happen to have a massive advertising budget and want to spend it on television commercials and billboards, by all means, do that. But I think that no matter how much money you have, creative strategy and careful planning are the most important assets in the marketing department. If you don't have those, there's a good chance that when your commercial airs on TV, me and everyone else will be ignoring it.