Monday, June 8, 2009

Item-Based And User-Based Analysis For Recommendation Engines

There are two main approaches to building recommendation systems, based on whether the system searches for related items or related users.

In item-based analysis, when a user likes a particular item, items related to that item are recommended.

As shown in this figure:



If items A and C are highly similar and a user likes item A, then item C is recommended to the user.

There are two approaches to finding similar items. The first is content-based analysis, which uses the term vector associated with the content. The second is collaborative filtering, which uses user actions such as rating and bookmarking to find similar items.
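
To make the collaborative filtering variant concrete, here is a minimal sketch that treats each item as a vector of the ratings it received and compares items with cosine similarity. The ratings data and the function names are made up for illustration, and cosine similarity is just one reasonable choice of measure, not the only one.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}. Made-up data for illustration.
ratings = {
    "alice": {"A": 5, "B": 1, "C": 4},
    "bob":   {"A": 4, "C": 5},
    "carol": {"B": 5, "C": 1},
}

def item_vector(item):
    """Collect the ratings an item received, keyed by user."""
    return {user: r[item] for user, r in ratings.items() if item in r}

def cosine(v1, v2):
    """Cosine similarity between two sparse vectors stored as dicts."""
    common = set(v1) & set(v2)
    dot = sum(v1[k] * v2[k] for k in common)
    norm1 = sqrt(sum(x * x for x in v1.values()))
    norm2 = sqrt(sum(x * x for x in v2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0

sim_ac = cosine(item_vector("A"), item_vector("C"))
sim_ab = cosine(item_vector("A"), item_vector("B"))
print(f"A~C: {sim_ac:.3f}, A~B: {sim_ab:.3f}")
```

With this data, items A and C come out far more similar than A and B, so a user who likes A would be shown C first.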

In user-based analysis, users similar to the target user are determined first. As shown in the figure below:



If a user likes item A, then the same item can be recommended to other users who are similar to that user.

Similar users can be found using profile-based information about the user, for example by clustering users on attributes such as age, gender, geographic location, and net worth.

Alternatively, you can find similar users using a collaborative-based approach by analyzing the users’ actions.
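
As a rough sketch of the collaborative approach, the example below measures how similar two users are by the overlap of the items they liked (Jaccard similarity), then recommends items that the most similar users liked. The data, the `recommend` function, and the choice of Jaccard similarity are all assumptions made for illustration.

```python
# Hypothetical data: user -> set of items the user liked.
likes = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"D"},
}

def jaccard(s1, s2):
    """Share of items two users have in common, between 0 and 1."""
    union = s1 | s2
    return len(s1 & s2) / len(union) if union else 0.0

def recommend(user, k=1):
    """Recommend items liked by the k most similar users but not yet by `user`."""
    others = [(jaccard(likes[user], likes[o]), o) for o in likes if o != user]
    others.sort(reverse=True)
    scores = {}
    for sim, other in others[:k]:
        if sim == 0:
            continue  # users with nothing in common tell us nothing
        for item in likes[other] - likes[user]:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; item name breaks ties.
    return sorted(scores, key=lambda i: (-scores[i], i))

recs = recommend("u1")
print(recs)
```

Here u2 is the closest match for u1, so the one item u2 liked that u1 has not seen is recommended. A user with no overlap with anyone, such as u3, gets an empty list.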



Here are some tips that may help you decide which approach is most suitable for your application:

■ If your item list doesn’t change much, it’s useful to create an item-to-item correlation table using item-based analysis. This table can then be used in the recommendation engine.

■ If your item list changes frequently, for example for news-related items, it may be useful to find related users for recommendations.

■ If the recommended item is a user, there’s no option but to find related users.

■ The dimensionality of the item and user space can be helpful in deciding which approach may be easier to implement. For example, if you have millions of users and an order of magnitude fewer items, it may be easier to do item-based analysis. Whenever users are considered, you’ll deal with sparse matrices. For example, a typical user may have bought only a handful of items from the thousands or millions of items that are available in an application.

■ If there are only a small number of users, it may be worthwhile to bootstrap your application using item-based analysis. Furthermore, there’s no reason (other than perhaps time to implement and performance) why these two approaches can’t be combined.

■ It has been shown empirically that item-based algorithms are computationally faster than user-based algorithms and provide comparable or better results.
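
The item-to-item correlation table mentioned in the first tip could be precomputed along these lines. This is only a sketch: the purchase data, the `build_item_table` name, and the similarity measure (cosine over binary bought/not-bought vectors) are assumptions, not something prescribed by the book.

```python
from itertools import combinations
from math import sqrt

# Hypothetical purchase histories: user -> set of items bought.
purchases = {
    "u1": {"A", "C"},
    "u2": {"A", "C", "B"},
    "u3": {"B"},
    "u4": {"A", "C"},
}

def build_item_table(purchases, top_n=2):
    """Precompute, for each item, its top_n most similar items."""
    # Invert the data: item -> set of users who bought it.
    buyers = {}
    for user, items in purchases.items():
        for item in items:
            buyers.setdefault(item, set()).add(user)
    table = {item: [] for item in buyers}
    for a, b in combinations(sorted(buyers), 2):
        overlap = len(buyers[a] & buyers[b])
        if overlap == 0:
            continue
        # Cosine similarity for binary vectors.
        sim = overlap / sqrt(len(buyers[a]) * len(buyers[b]))
        table[a].append((sim, b))
        table[b].append((sim, a))
    return {item: [other for _, other in sorted(pairs, reverse=True)[:top_n]]
            for item, pairs in table.items()}

item_table = build_item_table(purchases)
print(item_table["A"])
```

Because the table is built offline, the recommendation engine only needs a dictionary lookup at request time, which is why this works best when the item list is fairly stable.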

References:
Satnam Alag, “Collective Intelligence In Action”, Manning Publications Co., first edition, 2009.

Sunday, June 7, 2009

Introducing The Recommendation Engine

A recommendation engine takes the following four inputs to make a recommendation to a user:
■ The user’s profile: age, gender, geographical location, net worth, and so on
■ Information about the various items available: content associated with the item
■ The interactions of the users: ratings, tagging, bookmarking, saving, emailing, browsing content
■ The context of where the items will be shown: the subcategory of items that are to be considered



While promoting top products is useful, what we really want is to create a personalized list of recommendations for users. Recommendation engines can help build the following types of features in your application:
■ Users who acted on this item also took action on these other items, where the action could be watching, purchasing, viewing, saving, emailing, bookmarking, adding to favorites, sharing, creating, and so on
■ Other users you may be interested in
■ Items related to this item
■ Recommended items

Here are some concrete examples of these use cases:
■ Users who watched this video also watched these other videos
■ New items related to this particular article
■ Users who are similar to you
■ Products that you may be interested in
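
As an illustration of the first use case, a very simple "users who watched this video also watched" feature can be sketched by counting co-occurrences in viewing histories. The data and the `also_watched` function are hypothetical, and a real system would likely normalize the counts so that very popular videos don't dominate.

```python
from collections import Counter

# Hypothetical viewing histories: user -> set of videos watched.
watched = {
    "u1": {"v1", "v2"},
    "u2": {"v1", "v2", "v3"},
    "u3": {"v1", "v3"},
    "u4": {"v4"},
}

def also_watched(video, histories, n=2):
    """Count the other videos seen by users who watched `video`."""
    counts = Counter()
    for videos in histories.values():
        if video in videos:
            counts.update(videos - {video})
    # Sort by count, then by name, so ties are resolved deterministically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

result = also_watched("v1", watched)
print(result)
```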

In recommendation systems, there’s always a conflict between exploitation and exploration.

Exploitation is the process of recommending items that fall into the user’s sweet spot, based on things you already know about the user.

Exploration is the process of presenting items that don’t fall into the user’s sweet spot, with the aim of finding a new sweet spot that can be exploited later.

Greedy recommenders, with little exploration, will recommend items that are similar to the ones that the user has rated in the past. In essence, the user will never be presented with items that are outside their current sweet spot.

A common approach to facilitating exploration is to not necessarily recommend just the top n items, but to add a few items selected at random from candidate items. It’s desirable to build in some diversity in the recommendation set provided to the user.
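
A minimal sketch of this idea is shown below. It assumes the candidates are already sorted from best to worst, keeps the top few, and fills the remaining slots with random picks from the rest. The function name and the `n` and `n_explore` parameters are made up for illustration.

```python
import random

def recommend_with_exploration(ranked, n=5, n_explore=2, rng=None):
    """Return up to n items: top-ranked picks plus a few random ones from the rest."""
    rng = rng or random.Random()
    n_exploit = max(n - n_explore, 0)
    top = ranked[:n_exploit]    # exploitation: the best-scoring items
    rest = ranked[n_exploit:]   # everything not already chosen
    # Exploration: sample without replacement from the remaining candidates.
    explore = rng.sample(rest, min(n_explore, len(rest)))
    return top + explore

# Hypothetical candidate list, already sorted best first.
candidates = [f"item{i}" for i in range(1, 21)]
picks = recommend_with_exploration(candidates, n=5, n_explore=2,
                                   rng=random.Random(42))
print(picks)
```

Raising `n_explore` trades short-term relevance for more diversity. Setting it to zero gives back the purely greedy recommender described above.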

References:
Satnam Alag, “Collective Intelligence In Action”, Manning Publications Co., first edition, 2009.

Saturday, June 6, 2009

Understanding Collective Intelligence


Newer web applications trust their users, invite them to interact, connect them with others, gain early feedback from them, and then use the collected information to constantly improve the application.

Users are expressing themselves. This expression may be in the form of sharing their opinions on a product or a service through reviews or comments; through sharing and tagging content; through participation in an online community; or by contributing new content.

This increased user interaction and participation gives rise to data that can be converted into intelligence in your application. The use of collective intelligence to personalize a site for a user, to aid him in searching and making decisions, and to make the application more sticky are cherished goals that web applications try to fulfill.

More formally, collective intelligence (CI) simply and concisely means effectively using the information provided by others to improve one’s application.

What is collective intelligence?
When a group of individuals collaborate or compete with each other, intelligence or behavior that otherwise didn’t exist suddenly emerges; this is commonly known as collective intelligence. The actions or influence of a few individuals slowly spread across the community until the actions become the norm for the community.

Example: The Hundredth Monkey Theory
In his book The Hundredth Monkey, Ken Keyes recounts an interesting story about how change is propagated in groups. In 1952, on the isolated Japanese island of Koshima, scientists observed a group of monkeys. They offered them sweet potatoes; the monkeys liked the sweet potatoes but found the taste of dirt and sand on the potatoes unpleasant.

One day, an 18-month-old monkey found a solution to the problem by washing the potato in a nearby stream of water. She taught this trick to her mother. Her playmates also learned the trick and taught it to their mothers.

Initially, only adults who imitated their children learned the new trick, while the others continued eating the old way. In the autumn of 1958, a number of monkeys were washing their potatoes before eating. The exact number is unknown, but let’s say that out of 1,000, there were 99 monkeys who washed their potatoes before eating.

Early one sunny morning, a 100th monkey decided to wash his potato. Then, incredibly, by evening all monkeys were washing their potatoes. The 100th monkey was that tipping point that caused others to change their habits for the better. Soon it was observed that monkeys on other islands were also washing their potatoes before eating them.

As users interact on the web and express their opinions, they influence others. Their initial circle of influence is the group of individuals they interact with most. Because the web is a highly connected network of sites, this circle of influence grows rapidly and may shape the thoughts of everybody in the community.




Example: YouTube
In October 2006, Google bought YouTube for $1.65 billion. In its 20 months of existence, YouTube had grown to be one of the busiest sites on the Internet, serving 100 million video views a day (as of September 2006). It ramped from zero to more than 20 million unique user visits a day, mainly through viral marketing, spreading from person to person.

In YouTube’s case, each time a user uploaded a new video, she was easily able to invite others to view this video. As those others viewed this video, other related videos popped up as recommendations, keeping the user further engaged. Ultimately, many of these viewers also became submitters and uploaded their own videos as well. As the number of videos increased, the site became more and more attractive for new users to visit.

Harnessing information from users improves the perceived value of the application to both current and prospective users. This improved value will not only encourage current users to interact more, but will also attract new users to the application. The value of the application further improves as new users interact with it and contribute more content. This forms a self-reinforcing feedback loop, commonly known as a network effect, which enables wider adoption of the service.

References:
Satnam Alag, “Collective Intelligence In Action”, Manning Publications Co., first edition, 2009.

Tuesday, June 2, 2009

Add Watermark Using Photoshop

This video tutorial is the best one I have found illustrating how to add a watermark to your images and photos using Photoshop.

It uses an old version of Adobe Photoshop (something earlier than CS2), yet it covers the basic idea.