Towards Publishing Recommendation Data With Predictive Anonymization

Abstract:

Recommender systems are used to predict user preferences for products or services. In order to seek better prediction techniques, data owners of recommender systems such as Netflix sometimes make their customers' reviews available to the public, which raises serious privacy concerns. With only a small amount of knowledge about individuals and their ratings of some items in a recommender system, an adversary may easily identify the users and breach their privacy. Unfortunately, most of the existing privacy models (e.g., k-anonymity) cannot be directly applied to recommender systems. In this paper, we study the problem of privacy-preserving publishing of recommendation datasets. We represent recommendation data as a bipartite graph, and identify several attacks that can re-identify users and determine their item ratings. To deal with these attacks, we first give formal privacy definitions for recommendation data, and then develop a robust and efficient anonymization algorithm, Predictive Anonymization, to achieve our privacy goals. Our experimental results show that Predictive Anonymization can prevent the attacks with very little impact on prediction accuracy.

Keywords: Anonymization, Sparsity, Prediction, Clustering, Privacy

Authors: Chih-Cheng Chang, Hui (Wendy) Wang, Brian Thompson, Danfeng Yao

Affiliations: Rutgers University, Department of Computer Science, Piscataway, NJ, USA; Stevens Institute of Technology, Department of Computer Science, Hoboken, NJ, USA

Conference type: International conference

Conference name: 5th ACM Symposium on Information, Computer and Communications Security (ASIACCS 2010)

Conference location: Beijing

Conference language: English

Pages: 24-35

Online publication date: 2010-04-13 (date first posted on the Wanfang platform; not the paper's publication date)
