With the rapid development of Internet technology, personalized services have become widely used in television programming, music recommendation, e-commerce, and similar areas, which has made recommendation algorithms an important technology. Over the past decades, the most popular and widely used preference recommendation systems have been based on traditional collaborative filtering algorithms [1-2]. The core idea of traditional collaborative filtering is to calculate the similarity among users from the existing "user-commodity score matrix", predict the missing scores in the matrix, select similar users according to their user-commodity scores, and finally make recommendations. Although collaborative filtering has proved effective in practice, the products a user is interested in are only a small fraction of the millions available, so predicting every missing score in the score matrix takes too much time and effort. In addition, in the early use of a system based on collaborative filtering, the lack of target-user data (cold-start) leads to unsatisfactory predictions, which greatly affect the user experience.
Therefore, this research proposal aims to: 1) use a Bayesian algorithm to forecast the target user's preference sequence in a directional manner, and find the order of a limited number of products in the user's preference space; 2) when target-user data are lacking (cold-start), observe the recent preferences of other users to construct a "mean preference space" (e.g. related user group preferences, current hotspots, popular goods, and other related information) as the user's initial preference feature.
1. Related works
To predict the user's preference sequence, a pairwise algorithm is needed. Rendle et al. [3] designed a non-uniform sampler to oversample the pairs, which can overcome the problem of non-uniformly sampled data (tailed distribution). This proposal follows this idea and oversamples the data to compose the sample set. Yang et al. [4] and Varshney et al. [5] proposed Bayesian-based recommendation methods that refer to the content ratings among linked and unlinked users. Inspired by these methods, this proposal attempts to construct a "mean preference space" for the related user group to solve the cold-start problem to some extent.
2. Proposed method
1) Constructing the "mean preference space"
Assume that we have obtained a set of users , where indicates a known user group, . Since the user group to which belongs is known, we can perform an operation similar to collaborative filtering to calculate the preference representation from the given group set . According to the Bayesian algorithm, the preference model of group for a commodity set can be obtained by .
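As an illustration only, the "mean preference space" of a known user group can be sketched as a per-item average of the scores observed within that group. This is a minimal sketch under stated assumptions, not the proposal's Bayesian group model: the function name `mean_preference_space`, the use of NaN for missing scores, and the fallback to the group's overall mean are all hypothetical choices.

```python
import numpy as np

def mean_preference_space(scores, group_members):
    """Average the observed scores of a user group, item by item.

    scores: 2-D array (users x items); np.nan marks a missing score.
    group_members: row indices of the users in the known group.
    Items that no group member has scored fall back to the group's
    overall mean score (an assumption made for this sketch).
    """
    group = np.asarray(scores, dtype=float)[np.asarray(group_members)]
    observed = ~np.isnan(group)
    counts = observed.sum(axis=0)                      # scores available per item
    sums = np.where(observed, group, 0.0).sum(axis=0)  # missing entries count as 0
    fallback = sums.sum() / max(counts.sum(), 1)
    return np.where(counts > 0, sums / np.maximum(counts, 1), fallback)

# Toy score matrix: 3 users x 3 items, with missing entries.
scores = np.array([
    [5.0, np.nan, 1.0],
    [3.0, 4.0, np.nan],
    [np.nan, 2.0, np.nan],
])
# Initial preference feature for a cold-start user assigned to the group {0, 1}.
initial_preference = mean_preference_space(scores, [0, 1])
```

The resulting vector can serve as the initial preference feature of a user who has no data yet; hotspot or popularity signals could be blended in the same way if they are expressed as per-item scores.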
2) Predicting the preference sequence and recommendation sequence
Constructing the "mean preference space" gives the initial recommendation direction. Based on this, we can start to predict the preference sequence and recommendation sequence of the products in the N-product set I obtained from the target user's operations. For convenience, we define as a two-gram set. Here indicates that the user prefers than . For the N-product set I, the sample set will oversample N pairs of .
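The construction of the sample set can be sketched as follows. This is a hedged example: it draws ordered pairs uniformly with replacement, which is a simplification and not the non-uniform sampler of Rendle et al. [3]. The function name `sample_preference_pairs` and the dictionary of observed scores are hypothetical.

```python
import random

def sample_preference_pairs(item_scores, n_pairs, seed=0):
    """Draw (preferred, less_preferred) item pairs with replacement.

    item_scores: dict mapping an item id to the user's observed score.
    Pairs with tied scores carry no ordering information and are skipped.
    """
    rng = random.Random(seed)
    items = sorted(item_scores)
    candidates = [(i, j) for i in items for j in items
                  if item_scores[i] > item_scores[j]]
    if not candidates:
        return []
    # Sampling with replacement allows the same pair to be drawn repeatedly,
    # which is how this sketch "oversamples" a small set of observations.
    return [rng.choice(candidates) for _ in range(n_pairs)]

observed = {"a": 3, "b": 1, "c": 2}
pairs = sample_preference_pairs(observed, n_pairs=5)
```

Each returned tuple reads "the user prefers the first item to the second"; a weighted sampler could replace `rng.choice` if some pairs should be drawn more often than others.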
Define as the user's preference matrix for the product set, and as the commodity (degree of preference) matrix; the objective function can then be defined as . We can then maximize the posterior with the Bayesian algorithm to optimize the parameters, concretely, . Applying this formula to each set and optimizing the parameters , we only need to optimize by Maximum Likelihood Estimation, namely, , where indicates that the user prefers than , and is a mapping function that maps the difference of preference () to the target user's preference space, maximizing the probability . The mapping function can be constructed with a neural network.
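A minimal sketch of such a pairwise optimization is given below. It rests on several assumptions that are not taken from the proposal: a linear mapping stands in for the neural network, the probability that one item is preferred to another is modeled with a logistic function of the mapped difference (the form commonly used in Bayesian personalized ranking), and a zero-mean Gaussian prior appears as an L2 penalty. The names `fit_pairwise_preferences`, `reg`, `lr`, and `epochs` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_pairwise_preferences(features, pairs, reg=0.01, lr=0.1, epochs=200):
    """Gradient ascent on sum(log sigmoid(w . (x_i - x_j))) - reg * ||w||^2.

    features: 2-D array (items x feature dimensions).
    pairs: list of (i, j) index pairs meaning "item i is preferred to item j".
    Returns the weight vector w of the (assumed linear) mapping function.
    """
    features = np.asarray(features, dtype=float)
    diffs = np.array([features[i] - features[j] for i, j in pairs])
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        p = sigmoid(diffs @ w)                        # P(i preferred to j) per pair
        grad = diffs.T @ (1.0 - p) - 2.0 * reg * w    # log-likelihood + prior term
        w += lr * grad / len(pairs)                   # step scaled by sample size
    return w

# Toy data: three items described by two features each.
features = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
w = fit_pairwise_preferences(features, [(0, 1), (0, 2), (2, 1)])
scores = features @ w   # higher score = more preferred under the fitted mapping
```

Replacing the linear map with a small neural network would change only how the score difference is computed and differentiated; the pairwise objective itself stays the same.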
After obtaining optimized and , we can convert the set of products that need to be evaluated into , and calculate the user's preference matrix through the proposed Bayesian model. Then we can rank the items in according to the computed . In practice and during optimization, the number of product sets is not limited to two but can be any finite number. Each preference vector in the obtained preference matrix L corresponds to one commodity.
In summary, the proposed method borrows from collaborative filtering the idea of calculating the similarity among user groups to obtain the "mean preference space". Once the initial preference direction/space is obtained and the user's preference data are given, the user's preference matrix can be calculated with the Bayesian algorithm to predict the recommendation sequence.
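The final ranking step can be sketched as a sort over the computed preference values. This is an illustrative helper only: `rank_items`, `top_k`, and the assumption that each item has a single scalar preference score are hypothetical simplifications of the preference matrix described above.

```python
import numpy as np

def rank_items(item_ids, preference_scores, top_k=None):
    """Return item ids ordered from most to least preferred.

    A stable sort keeps the original order among items with equal scores.
    """
    order = np.argsort(-np.asarray(preference_scores, dtype=float), kind="stable")
    ranked = [item_ids[k] for k in order]
    return ranked if top_k is None else ranked[:top_k]

recommendation = rank_items(["a", "b", "c"], [0.2, 0.9, 0.5])
```

Passing `top_k` limits the output to a recommendation sequence of fixed length.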
3. Expected results
1) A method to alleviate the problem of cold-start.
2) A Bayesian-based model to predict users’ preferences and make recommendation sequences.