Difference between revisions of "Recommender Systems"

From IPRE Wiki

Revision as of 17:33, 17 March 2009

aka Natasha's Thesis


This page is for my thesis: links to articles I have found, notes on what I've found, insights, questions, etc. My thesis is on Recommendation Systems. To give you a general idea, recommendation systems are computer programs (often on the world wide web) that recommend to users items that the user might like. Some examples are Pandora (it recommends music) or Amazon.com (it recommends books and other items). There are a number of different kinds of recommendation systems, as well as a number of issues (security, privacy) that surround such systems. There is a wide range of applications for these systems, and a number of applications within other fields of computer science (interacting multi-agent systems, market economy and trust modeling for computers). In my thesis I will be exploring some of these issues and aspects of recommendation systems.

A note on abbreviations: I may at times use RS as an abbreviation for Recommendation System within this document. "-NE" denotes an idea or question I have.

About Natasha

Natasha Eilbert is a Bryn Mawr College Computer Science major and Mathematics minor. For her senior year thesis project she is studying recommendation systems under Professor Deepak Kumar (http://cs.brynmawr.edu/~dkumar/). Natasha is particularly excited about learning about Pandora's Music Genome Project because in her personal study (ie musical enjoyment) she both has found it effective for "discovering" new music and loves listening to it! She also really wants to learn about recommendation system algorithms that incorporate artificial intelligence / machine learning to continually improve the system to make more accurate predictions.

She is currently taking three computer science classes and one mathematics class:

  • Developmental Robotics, where we get to learn about AI, emergence, search, and more!
  • Gender & Technology -- check out the class blog at http://gandt.blogs.brynmawr.edu/ -- we've got some fascinating dialogue going on which the general public is encouraged to take part in
  • Senior Conference, where I'll be presenting and discussing updates on my lovely thesis and hearing about others' theses
  • Real Analysis -- get to refresh my proof-writing and Mathematica skills.

You can contact her at neilbert@brynmawr.edu.

My Papers and Documents


File:Info.pdf: Info 2/09/09 (sample file)

this doesn't work (why not?): Info 2/09/09 (sample file)

What I'm Doing

week of 3/16 - 3/21

  • finally got consensus to work!
  • understood data struct
  • made sample data struct
To Do
  • write first third
  • decide on plan:
    • researching -- companies, historical
    • work on prog -- my own, consensus
  • test data struct -- hand calculate similarities...
  • try out new data? -- netflix? or classmates?

week of 2/13 - 2/20

  • reading re: RS
  • synthesizing info
  • prepared mini-present of material on content-based RS
  • began writing RS code
  • looked into Python module consensus more
  • more specifics in terms of companies that use RS & how they use it
  • more work with code
  • more reading!
  • Python consensus: "shelving" scrobbler.pickle seems to create empty list
  • consensus main webpage couldn't access (was able to 1-2 weeks ago)
  • Toward the Next Generation of Recommender Systems... (Adomavicius & Tuzhilin):
    • how does "graph-theoretic approach to collaborative filtering...determine the nearest neighbors of x without computing Sxy for all users y"? (p. 738)

past few weeks...

  • readings!
  • talked to Prof Doug Turnbull
  • how to upload a file onto wiki
  • started learning latex & bibtex
  • historic perspective
  • domain-based perspective
  • writing notes on wiki vs. latex
    • latex looks nice
    • latex better for bibliography
    • latex annoying/slow for note-taking

week 1


Things to Change/Add to this Page

source/wiki issues...

  • how to present table?
  • how to add easily to table?
  • what info need for each source?



Here is a table of my sources so far.

Number | Title | Author | URL | Company | Date created | Date viewed | Company online OR reference page
1 | The race to create a 'smart' Google | Jeffrey O'Brien | http://money.cnn.com/magazines/fortune/fortune_archive/2006/11/27/8394347/ | Fortune Magazine (and CNNMoney.com) | November 20 2006 | 26-Jan-09 | Cable News Network
2 | Application of Dimensionality Reduction in Recommender System: A Case Study | Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl | http://glaros.dtc.umn.edu/gkhome/node/122 | WebKDD-2000 Workshop | 2000 | 26-Jan-09 | George Karypis 2006-2008
3 | Call for Papers: Special issue on Recommenders on the Web | | http://tweb.acm.org/RecSysSpecialIssue.html | | | |
4 | A Guide to Recommender Systems | Richard MacManus | http://www.readwriteweb.com/archives/recommender_systems.php | | 26-Jan-09 | 27-Jan-09 |
5 | Rethinking Recommendation Engines | Alex Iskold | http://www.readwriteweb.com/archives/rethinking_recommendation_engines.php | | 25-Feb-08 | 27-Jan-09 |
6 | Which Movie to Watch? An Overview of Recommendation Systems | Joshua Porter | http://bokardo.com/archives/quick-overview-of-recommendation-systems/ | | 14-Sep-05 | 03-Feb-09 |
7 | Recommending And Evaluating Choices In A Virtual Community Of Use | Will Hill, Larry Stead, Mark Rosenstein and George Furnas | http://delivery.acm.org/10.1145/230000/223929/p194-hill.html?key1=223929&key2=6220763321&coll=GUIDE&dl=GUIDE&CFID=20246797&CFTOKEN=91193289 | | c 1995 | 03-Feb-09 | http://portal.acm.org/citation.cfm?id=223929
8 | Using collaborative filtering to weave an information tapestry | David Goldberg, David Nichols, Brian M. Oki, Douglas Terry | http://delivery.acm.org/10.1145/140000/138867/p61-goldberg.pdf?key1=138867&key2=9630763321&coll=GUIDE&dl=GUIDE&CFID=20246994&CFTOKEN=34197207 | | c Dec 1992 | 03-Feb-09 | http://portal.acm.org/citation.cfm?id=138859.138867
9 | Intelligent information-sharing systems | Thomas W. Malone, Kenneth R. Grant, Franklyn A. Turbak, Stephen A. Brobst, and Michael D. Cohen | http://delivery.acm.org/10.1145/30000/22903/p390-malone.pdf?key1=22903&key2=4270763321&coll=GUIDE&dl=&CFID=20247527&CFTOKEN=32717136 | | c 1987 | 03-Feb-09 | http://portal.acm.org/citation.cfm?id=22903&dl=

Example Companies

From The race to create a 'smart' Google (source 1)

  • IMDb
  • eBay
  • Flickr
  • MyStrands
  • StumbleUpon
  • Yahoo
  • Sun
  • Amazon (collaborative filtering / bases recommendations for given user on actions of past users)
  • Netflix (has $1 mil contest for 10% improvement of their RS)
  • Pandora (Music Genome Project)
  • Slide
  • CleverSet (related to Google)
  • ChoiceStream (related to Google)
  • Whattorent.com

From Richard MacManus's A Guide to Recommender Systems (source 4)

  • Aggregate Knowledge
  • Google
  • Strands
  • Pandora
  • Walmart
  • Blockbuster

From Joshua Porter's Which Movie to Watch? An Overview of Recommendation Systems (source 6)

  • iTunes' "Top Songs"
  • Amazon “people who bought this also bought…”
  • Bloglines “similar blogs”
  • Del.icio.us “most popular” bookmarks
  • NYTimes “most emailed articles”


On Recommending And Evaluating Choices In A Virtual Community Of Use (Will Hill, Larry Stead, Mark Rosenstein and George Furnas) - source 7

  • create virtual community to share info
    • want benefits of info sharing but not costs
      • costs include time to talk with others, invasions of privacy
    • virtual community is group of ppl sharing info w/o interacting
      • NOT interacting (virtual reality)
      • NOT using computer intelligence (intelligent agents)
  • types of recommenders:
    • cognitive (content/item-based RS)
    • economic (weigh user cost & usefulness stats)
    • social (ratings based on inter-personal communication - social RS)
  • trade-off between usefulness and costs to user (eg time)
    • more useful for user: but user must do complex tasks ("user annotations", "query specifications")
    • easier for user: but less useful ("automatic history", "enhanced graphics")
  • use of HCI (human-computer interaction) in RS -- systems discussed:
    • Goldberg's email filter (LOOK AT SOURCE 3!)
    • Allen -- found HCI methods ineffective
  • MIX personal relationships and more automatically synthesized relationships
    • Resnick: Netnews -- uses personal recommendations as filters (LOOK AT SOURCE 9!)


On Joshua Porter's Which Movie to Watch? An Overview of Recommendation Systems (source 6)

Prioritization could be based on:

  • "newness... [it's a recent creation]
  • time-sensitivity... [it's important to do soon]
  • popularity... [a lot of people like it]
  • personal relevance... [it relates to something of your own interest]
  • social network relevance... [people you know like it]
  • authority-based... [people you trust like it]
  • collaborative... [people who are like you like it]"

On Alex Iskold's Rethinking Recommendation Engines (source 5)

gene analogy (as approach for RS): gene leads to expression/behavior -- RS creators must find aspect of item leading to user rating/preference

psychology -- Gavin Potter realizes user ratings relate to the user's recently made ratings (so user ratings are not only in relation to how much they "absolutely" like an item -NE)

recommendation vs. filter: in some sense making recommendations as to what a person will like and filtering out what a person won't will have the same underlying approach, but people have a different perspective on them (different psychology goes into human interaction with them)

  • ppl react better to false negatives (say you might not like something but then you do) than false positives (say you will like something but you don't)
  • however, filtering seems less impressive (to me at least) than a recommendation, and fails to explicitly show users items that the user is predicted to like -NE

On Richard MacManus's A Guide to Recommender Systems (source 4)

4 types of rec systems:

  1. personalized RS: based on user's past actions (eg Aggregate Knowledge, Google)
  2. social RS: based on how similar users act (eg Strands); aka collaborative filtering (see SOURCE 5)
  3. item RS: based on underlying characteristics of item (eg Pandora)
  4. hybrid of 1, 2, & 3
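
One way to picture type 4 is as a blend of the scores that the other three types would give to the same item. The sketch below is only an illustration: the weighted average, the function names, the candidate items and every score are assumptions made up for this page, not something described in source 4.

```python
def hybrid_score(personal, social, item, weights=(1.0, 1.0, 1.0)):
    """Weighted average of three component scores, each assumed to lie in [0, 1]."""
    w_p, w_s, w_i = weights
    total = w_p + w_s + w_i
    if total == 0:
        raise ValueError("at least one weight must be non-zero")
    return (w_p * personal + w_s * social + w_i * item) / total


# Hypothetical candidates: name -> (personalized, social, item-based) scores.
candidates = {
    "item_x": (0.9, 0.2, 0.4),
    "item_y": (0.5, 0.8, 0.7),
    "item_z": (0.1, 0.3, 0.2),
}


def rank(cands, weights=(1.0, 1.0, 1.0)):
    """Return candidate names ordered from highest to lowest hybrid score."""
    return sorted(
        cands,
        key=lambda name: hybrid_score(*cands[name], weights=weights),
        reverse=True,
    )
```

Calling rank(candidates) uses equal weights; passing weights=(1, 0, 0) reduces the hybrid to a purely personalized RS, which shows how the first three types fall out as special cases of this particular blend.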

On Application of Dimensionality Reduction in Recommender System: A Case Study

Paper by Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl (source 2)

Note: Information based only on the abstract

issues for rec systems:

  1. quality of rec
  2. speed of rec
  3. quality of rec for small data pools

one type of RS: collaborative filtering RS: bases preferences/recs of one person on similar people's preferences
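
A minimal sketch of that idea, assuming a nested dictionary of ratings and cosine similarity as the measure of how alike two users are. The sample data, the names and the choice of cosine similarity are all assumptions for illustration; the paper's abstract does not specify them, and this is not the dimensionality-reduction method the paper studies.

```python
from math import sqrt

# Hypothetical sample data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 5, "D": 4},
    "cat": {"A": 1, "B": 5, "D": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none shared)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict(user, item):
    """Similarity-weighted average of other users' ratings for item.

    Returns None when no other user has rated the item.
    """
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = cosine_similarity(ratings[user], theirs)
        num += sim * theirs[item]
        den += sim
    return num / den if den else None
```

With this data, predict("ann", "D") lands between bob's and cat's ratings for D, pulled toward bob's because ann's ratings look more like bob's. The numbers are small enough to check by hand.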

On The race to create a 'smart' Google (source 1)

This article begins to answer some of the following questions: What is RS? How (and by whom) is it used? What are techniques (and algorithms?) used? Also, how can psychology/other studies inform RS?

connection between personality & what person likes

recommender system as extension of shopkeeper's perception of clients

examples:

  • IMDb
  • eBay
  • Flickr
  • MyStrands
  • StumbleUpon
  • Yahoo
  • Sun
  • Amazon (collaborative filtering / bases recommendations for given user on actions of past users)
  • Netflix (has $1 mil contest for 10% improvement of their RS)
  • Pandora (Music Genome Project)
  • Slide
  • CleverSet (related to Google)
  • ChoiceStream (related to Google)
  • Whattorent.com

user provides info (eg rating)

rec sys in advanced form "will have constructed the algorithm that is you"

  • does this relate to AI/learning? -NE

search vs. discovery: you looking / searching for some information vs. relevant & unsolicited information arising ("Search is what you do when you're looking for something. Discovery is when something wonderful that you didn't know existed, or didn't know how to ask for, finds you.")

techniques:

  1. many variables (input into machine to create formula -- AI application? -NE)
  2. find underlying reason user likes something
  3. mix of #1 & #2

Pandora's technique: pre-rates songs on many qualities

  • new direction: personality may be linked to a person's musical (& other) tastes
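
To make "pre-rates songs on many qualities" concrete, here is a rough sketch in which each song is a tuple of quality scores and similar songs are found by cosine similarity. The song names, the three qualities and the scores are invented for illustration, and cosine similarity is an assumption; the article does not say how Pandora actually compares songs.

```python
from math import sqrt

# Hypothetical songs scored 0-5 on made-up qualities
# (tempo, vocals, distortion); not real Music Genome Project attributes.
songs = {
    "song_a": (4, 1, 5),
    "song_b": (5, 1, 4),
    "song_c": (1, 5, 0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length score tuples."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def most_similar(liked, catalog):
    """Return the other songs ranked from most to least similar to liked."""
    others = [name for name in catalog if name != liked]
    return sorted(
        others,
        key=lambda name: cosine(catalog[liked], catalog[name]),
        reverse=True,
    )
```

Here most_similar("song_a", songs) ranks song_b ahead of song_c, since song_a and song_b have nearly the same profile. Unlike collaborative filtering, nothing in this sketch depends on what other users rated.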

Jason Rentfrow (University of Cambridge, Britain) & Sam Gosling's (University of Texas) psych study:

  • 74 students self-rated personality & provided their top 10 songs; others then rate the students' personality based on the songs, then compare self- & other- ratings
  • found others were able to accurately (ie had similar ratings as self-ratings? -NE) predict some characteristics but not others based on musical taste
    • incorrectly predicted: "emotional stability, courage, and ambition"
    • correctly predicted:"extroversion, agreeableness, conscientiousness, openness, imagination, ... intellect"
  • www.outofservice.com -- relates music & personality to a person's politics, location, "lifestyle, favorite authors,... movies"

Max Levchin's Slide -- goal is to find info from web that a given person would like

  • currently takes info re: person's likes/dislikes, results of the person's recommendations to others (eg friend A recommends something via Slide to friend B; results are if friend B ends up liking vs disliking the recommendation made by friend A)

ethical issue: RS as "self-expression" & something useful for users vs. a commercial invasion of privacy