Recommendation systems based on statistical implicative measures
The thesis is organized as follows.
Chapter 1: An overview of statistical implicative measures
and recommendation systems.
Chapter 2: Recommendation based on statistical implicative
measures and association rules.
Chapter 3: Recommendation based on users implicative rating
measure.
Chapter 4: Recommendation based on items implicative
rating measure.
Appendices include: (1) Interestingness tool and DKHP
dataset; (2) Algorithms used for developing and evaluating the
proposed recommendation models; and (3) Some additional
experiment scenarios.
CHAPTER 1. AN OVERVIEW
1.1. Statistical implicative measures
1.1.1. Definition
Statistical implicative measures (SIM) are measures proposed
by the statistical implicative analysis method. SIMs are used to
detect trends in a binary attribute set or non-binary attribute set.
SIMs are asymmetric, probability based and non-linear measures.
1.1.2. Statistical implicative measures for binary data
1.1.3. Statistical implicative measures for non-binary data
1.2. Statistical implicative ratings
Statistical implicative rating measures are proposed by the
thesis on the basis of some existing SIMs, and they can themselves
be regarded as SIMs. They are used to predict the rating of a
user for an item, thereby contributing to solving
recommendation problems.
1.3. Recommendation based on statistical implicative
analysis
1.3.1. Recommendation systems and research directions
1.3.2. Collaborative filtering technique
1.3.2.1. Memory based methods
1.3.2.2. Model based methods
1.3.3. Evaluating recommendation systems
1.3.3.1. K-fold cross validation method
1.3.3.2. Classification accuracy metrics
1.3.3.3. Predictive accuracy metrics
1.3.3.4. Rank accuracy metrics
1.3.4. Statistical implicative analysis based recommendation
1.3.4.1. Existing recommendation methods
1.3.4.2. Recommendation based on statistical implicative
measures
1.4. Conclusion
Chapter 1 reviews SIMs, recommendation systems (RSs) and the
accuracy metrics used for evaluating RSs. The thesis summarizes
SIMs (such as implicative intensity, the entropic version of
implicative intensity, cohesion and contribution) and identifies
which measures should be used by RSs to improve the accuracy of
the recommendation results. Chapter 1 also covers the
collaborative filtering technique and the accuracy metrics used
for building and evaluating recommendation models. Finally, it
presents the research directions on RSs and the existing research
on RSs based on statistical implicative analysis, then identifies
the scope of the study and sketches the proposal.
CHAPTER 2. RECOMMENDATION BASED ON
STATISTICAL IMPLICATIVE MEASURES AND
ASSOCIATION RULES
Unlike the existing recommendation models based on statistical
implicative analysis (SIA) and association rules, the model
proposed in this chapter can be applied to both binary and
non-binary data; provides more SIMs (such as implicative
intensity, the entropic version of implicative intensity, and
cohesion) for making recommendations; and allows one of these
measures to be combined with the contribution measure to improve
the accuracy of RSs.
2.1. Statistical implicative rules based model - SIR
The statistical implicative rules based model SIR is developed
on SIMs and association rules. The proposed model SIR is shown
in Figure 2.1. This model consists of:
- A finite set of users $U = \{u_1, u_2, \ldots, u_n\}$.
- A finite set of items (e.g. products, movies, etc.)
$I = \{i_1, i_2, \ldots, i_m\}$.
- A rating matrix $R = (r_{jk})_{n \times m}$, where $j = 1..n$ and
$k = 1..m$, used for storing the feedback (ratings) of users on
items. In binary form, $r_{jk} = 1$ if user $u_j$ likes the item
$i_k$ and $r_{jk} = 0$ (or $NA$) if $u_j$ does not like/know $i_k$.
In non-binary form, $r_{jk} \in [0,1]$ if $u_j$ rates $i_k$ and
$r_{jk} = NA$ if $u_j$ does not rate/know $i_k$.
- A vector $R_{u_a}$ storing the known ratings of the user $u_a$ who
needs the recommendation: $R_{u_a} = \{r_{u_a k}\}$ where
$k = 1, \ldots, m$, in which $r_{u_a k} = NA$ if $u_a$ does not
rate $i_k$.
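As an illustration of the components listed above, the following
minimal Python sketch stores a small non-binary rating matrix and
extracts the known ratings of one user. The data values and the
helper names `known_ratings` and `unrated_items` are hypothetical,
chosen only for illustration; `None` stands in for NA.

```python
NA = None  # stands in for an unknown rating

# Hypothetical non-binary rating matrix R (3 users x 4 items), values in [0, 1].
R = [
    [1.0, NA, 0.5, NA],    # u1
    [NA, 0.75, NA, 0.25],  # u2
    [0.5, 1.0, NA, NA],    # u3
]

def known_ratings(row):
    """Map item index k -> rating for the items this user has rated."""
    return {k: r for k, r in enumerate(row) if r is not NA}

def unrated_items(row):
    """Item indices with no rating yet, i.e. candidates for recommendation."""
    return [k for k, r in enumerate(row) if r is NA]

R_ua = R[0]  # treat u1 as the user u_a who needs the recommendation
print(known_ratings(R_ua))
print(unrated_items(R_ua))
```

Keeping NA distinct from 0 matters in the non-binary form, where 0
is a valid rating while NA means that no rating is known.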
Figure 2.1: The statistical implicative rules based model.
[Figure 2.1 (diagram). Components shown: the inputs
$(u_a, I, R_{u_a})$ and $(U, I, R)$; the support threshold $s$, the
confidence threshold $c$ and the maximum length of a rule $l$; the
ruleset $\{a \to b \mid a \in I^k, b \in I, k = 1, \ldots, l-1\}$;
the ruleset presented by the statistical implicative analysis
method, $\{a \to b\} = \{n, n_a, n_b, n_{a\bar{b}}\}$; the rule
values $\{a \to b\} = \{v_{a,b}\}$ given by the implicative
intensity, the entropic version of implicative intensity or the
cohesion measure; the contribution measure; and the list of good
items to be recommended to $u_a$. Improved model: combining these
steps simultaneously.]
To reduce the recommendation time, the SIR model in Figure 2.1
is improved by carrying out the following steps simultaneously
(directly): generating association rules, presenting those rules
by the set of four values $\{n, n_a, n_b, n_{a\bar{b}}\}$, and
calculating the implicative value of those rules according to a
specific SIM. This can be done by using and modifying the rchic
package.
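To make the four-value representation of a rule concrete, the
sketch below computes the confidence and the implicative intensity
of a rule a -> b from (n, n_a, n_b, n_ab_bar), where n_ab_bar is
the number of counter-examples. It assumes the Poisson model of
classical statistical implicative analysis, in which the intensity
is 1 - P(N <= n_ab_bar) with N ~ Poisson(n_a * (n - n_b) / n). This
summary does not state which model the thesis or the rchic package
uses, so the formula, the function names and the numbers are
illustrative assumptions.

```python
import math

def confidence(n_a, n_ab_bar):
    """Confidence of a -> b: share of the occurrences of a that are examples."""
    return (n_a - n_ab_bar) / n_a

def implicative_intensity(n, n_a, n_b, n_ab_bar):
    """1 - P(N <= n_ab_bar), with N ~ Poisson(n_a * (n - n_b) / n).

    Assumption: Poisson model of classical SIA, not taken from the thesis.
    """
    lam = n_a * (n - n_b) / n
    if lam == 0:
        return 0.0  # n_a = 0 or n_b = n: no counter-example can be expected
    term = math.exp(-lam)  # P(N = 0)
    cdf = term
    for s in range(1, n_ab_bar + 1):
        term *= lam / s    # P(N = s) obtained from P(N = s - 1)
        cdf += term
    return 1.0 - cdf

# Hypothetical rule: 100 users, 20 like a, 40 like b, 4 like a but not b.
rule = dict(n=100, n_a=20, n_b=40, n_ab_bar=4)
print(confidence(rule["n_a"], rule["n_ab_bar"]))
print(implicative_intensity(**rule))
```

Because only the four counts are needed, a ruleset stored in this
form can be re-scored with a different measure without rescanning
the rating matrix.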
2.2. Operation of the statistical implicative rules based
model
The operation of the SIR model includes two stages: building the
filtered ruleset presented according to the SIA method, and
performing the recommendation, as shown in Figure 2.2. To
reduce the recommendation time, the learning model can be
pre-built offline.
Figure 2.2: The operational diagram of the SIR model.
[Figure 2.2 (flow diagram). Inputs: the rating matrix (users
$u_1..u_n$ by items $i_1..i_m$, with NA for unknown ratings) and
the ratings of the user $u_a$ who requires the recommendation.
Building the model (online/offline): pre-processing data,
generating rules, filtering rules, presenting rules according to
SIA. Making the recommendation (online): recommending the items
with the highest implicative values. Output: the list of top N
items for $u_a$.]
2.3. Experiment
2.3.1. Data and tool
Three datasets are used for the experiment: MSWeb, MovieLens
and DKHP (course registration). MSWeb and DKHP are binary
datasets, and MovieLens is a non-binary dataset.
We developed the Interestingnesslab tool to run the
experimental scenarios. In addition, some recommendation models
of the recommenderlab package are used for comparison with the
SIR model: the association rule based model (AR), the item based
collaborative filtering model (IBCF) using the Jaccard measure,
and the popular model (POPULAR).
The experimental scenarios are run on computers with the
following configurations: (1) Windows 8 OS, 16 GB RAM, and an
Intel Pentium G630 2.7GHz processor; and (2) Windows 10 OS,
8 GB RAM, and an Intel Core i5-6200U 2.5GHz processor.
2.3.2. Evaluating the SIR model on binary data
The accuracy of the SIR model is compared with that of some
existing models using 5-fold cross validation and the
classification accuracy metrics (the Precision - Recall curve,
the ROC curve and the F1 measure, which combines precision and
recall). The experimental results show that:
- The simultaneous combination of steps at the learning stage
(in the improved SIR model) reduces the recommendation time.
- The accuracy of the SIR model is the highest when the entropic
version of implicative intensity and the contribution measure are
combined to make the recommendation.
- The accuracy of the SIR model combining the entropic
version of implicative intensity and the contribution measure is
higher than that of the compared recommendation models (AR,
POPULAR, IBCF), especially when the user requiring the
recommendation is not a new user (i.e. the number of items
rated by that user, the number of known ratings, is not too
low).
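The classification accuracy metrics mentioned above can be
sketched for a single user as follows. This is a minimal Python
illustration of precision, recall and F1 for one top-N list, not
the evaluation code of the thesis; the function name and the item
identifiers are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 of one recommendation list.

    recommended: items returned by the model (top-N list)
    relevant:    held-out items the user actually liked
    """
    rec, rel = set(recommended), set(relevant)
    hits = len(rec & rel)
    precision = hits / len(rec) if rec else 0.0
    recall = hits / len(rel) if rel else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

p, r, f1 = precision_recall_f1(["i1", "i3", "i5", "i7"], ["i1", "i2", "i3"])
print(p, r, f1)
```

In a k-fold setting these values would be averaged over the test
users of each fold and then over the folds; repeating the
computation for several list lengths N gives the points of a
Precision - Recall curve.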
2.3.3. Evaluating the SIR model on non-binary data
- The accuracy of the SIR model is the highest when the
entropic version of implicative intensity and the contribution
measure are combined and the user requires many
recommended items. In reality, the user will be confused by a
long list of recommended items.
- The accuracy of the SIR model is higher than that of POPULAR,
a recommendation model based on the most popular items.
2.4. Conclusion
Chapter 2 proposes the statistical implicative rules based
model SIR, applicable to both binary and non-binary data, and
improves the proposed model to reduce the recommendation time.
The ruleset represented by a set of four values can be pre-built
offline and used online when someone needs a recommendation.
The SIR model provides many SIMs and can be extended with
other objective interestingness measures. The SIR model is
implemented and integrated in the Interestingnesslab tool. Its
accuracy is evaluated by the classification accuracy metrics
(ROC curve, Precision - Recall curve and F1 measure); on two
types of data, binary (MSWeb, DKHP) and non-binary
(MovieLens); and according to two groups of scenarios,
internal comparison (the same SIR model with different SIMs)
and external comparison (the SIR model against some existing
recommendation models: AR, POPULAR and IBCF). The
experimental results show that the SIR model should (1) combine
the entropic version of implicative intensity with the
contribution measure to make the recommendation, and (2) be used
to build RSs because its accuracy is higher than that of the
compared models.
CHAPTER 3. RECOMMENDATION BASED ON USERS
IMPLICATIVE RATING MEASURE
The SIR model of Chapter 2 uses association rules and
SIMs to recommend a list of good items to users. When the
number of rules is too large, the SIR model and the existing
models that are also based on SIA and association rules face
some disadvantages: the recommendation time may be long
if the learning stage is performed online, and the computer may
be overloaded. Therefore, the thesis focuses on rules
of length 2 to overcome those disadvantages. In addition, the
rating given by $u_a$ (the user who requires the recommendation)
to the item $i$ may be similar to the ratings given to $i$ by the
nearest users (neighbors) $u_j$ of $u_a$. Moreover, each item
contributes to the relationship between $u_a$ and his/her nearest
user $u_j$. The thesis combines the above characteristics to
improve the accuracy of recommendation.
3.1. KnnUIR Definition
The k nearest neighbors (i.e. users) based implicative rating
measure $KnnUIR$ is proposed to predict the rating given by a
user $u_a$ for an item $i \in I$. The purpose of this proposal is
to increase the recommendation accuracy. $KnnUIR$, defined by
(3.1), is based on: (1) the number of nearest users of $u_a$,
$knn$ (the nearest neighbors $u_j$ are identified by the
implicative intensities of $u_a$ and $u_j$); (2) the ratings given
to item $i$ by those neighbors, $r_{u_j i}$; and (3) the typicality
of $i$ contributing to the relationship of $u_a$ and $u_j$,
$\gamma(i, u_a \to u_j)$. The value of $KnnUIR(u_a, i)$ has to be
transformed to the range [0, 1], the same scale as the elements
of the rating matrix.
$$KnnUIR(u_a, i) = \sum_{j=1}^{knn} r_{u_j i} * \gamma(i, u_a \to u_j) \qquad (3.1)$$
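Formula (3.1) can be sketched in Python as below. The typicality
values are taken as given inputs, since this summary does not show
how they are computed. Two points are assumptions made only for
this sketch: neighbors that have not rated the item (NA, here
`None`) contribute nothing to the sum, and the transformation to
[0, 1] is a min-max rescaling over the candidate items. All names
and numbers are hypothetical.

```python
def knn_uir(neighbor_ratings, typicality):
    """Raw KnnUIR(u_a, i): sum over the knn neighbors u_j of
    r_{u_j i} * gamma(i, u_a -> u_j).

    neighbor_ratings: ratings of item i by each neighbor (None = NA)
    typicality:       gamma(i, u_a -> u_j) for the same neighbors
    """
    return sum(r * g for r, g in zip(neighbor_ratings, typicality)
               if r is not None)  # assumption: NA ratings are skipped

def rescale(scores):
    """Min-max rescaling of {item: raw score} to [0, 1] (an assumption)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 0.0 for item in scores}
    return {item: (s - lo) / (hi - lo) for item, s in scores.items()}

# Three neighbors of u_a; the third one has not rated item i.
raw = knn_uir([1.0, 0.5, None], [0.8, 0.4, 0.9])
print(raw)
print(rescale({"i1": raw, "i2": 0.25, "i3": 0.5}))
```

Sorting the rescaled scores in decreasing order and keeping the
first N items would give the top-N list of the model.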
3.2. Users implicative rating based model - UIR
The users implicative rating based model UIR is developed
by using the proposed KnnUIR measure and the user based
collaborative filtering method. The UIR model shown in Figure
3.1 has the same components as the SIR model. However, this
UIR model not only predicts the rating given by a user to an item
but also recommends the list of top items to a user.
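Figure 3.1: The users implicative rating based model.
[Figure 3.1 (diagram). Components shown: the inputs
$(u_a, I, R_{u_a})$ and $(U, I, R)$; the implicative intensity
computed on $u_a \times U$, giving
$\{\varphi(u_a, u_j), j = 1, \ldots, knn\}$; the K nearest
neighbors/users based implicative rating measure (KnnUIR)
computed on $u_a \times I$, giving the predicted ratings
$R'_{u_a}$; and the output
$Reclist = \{i \mid i \in I, r'_{u_a i} \in TopN\}$.]
3.3. Operation of the users implicative rating based model
The operational diagram of the UIR model is presented in
Figure 3.2.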
Figure 3.2: The operational diagram of the UIR model.
[Figure 3.2 (flow diagram). Inputs: the rating matrix (users
$u_1..u_n$ by items $i_1..i_m$, with NA for unknown ratings) and
the ratings of the user $u_a$ who requires the recommendation.
Preparing for calculating the KnnUIR value: pre-processing data;
presenting the relationship of $u_a$ and $u_j$, where $u_j \in U$,
according to SIA and calculating the implicative intensity of
$(u_a, u_j)$; finding the k nearest neighbors of $u_a$;
calculating the typicality of $i$ contributing to the
relationship $(u_a, u_j)$. Recommending: predicting the rating
given by $u_a$ for $i \in I$ using KnnUIR; if a recommendation is
requested ("Recommend?" is Yes), recommending the items with the
highest predicted ratings to $u_a$. Outputs: the predicted
ratings of $u_a$ and the list of top N items.]
3.4. Experiment
3.4.1. Data and tool
The experiment of this chapter also uses the Interestingnesslab
tool with the proposed UIR model; the MSWeb, DKHP and MovieLens
datasets; the recommenderlab package with existing models
(POPULAR, IBCF, AR, UBCF, ALS_Implicit and SVD); and the
computers described in Section 2.3.1.
3.4.2. Evaluating the UIR model using the classification
accuracy metrics
- The accuracy of the proposed UIR model (via Precision -
Recall curve, ROC curve and the F1 measure) is higher than that
of the AR, IBCF and POPULAR models but not much higher
than that of the UBCF model.
- The accuracy of the UIR model is lower than that of the SIR
model (Chapter 2) if the user requiring the recommendation is a
new user (given = 1) and the number of nearest users and the
number of good items to be recommended are low.
3.4.3. Evaluating the UIR model using the predictive accuracy
metrics
- The contribution of an item to the relationship of two users
increases the recommendation accuracy.
- The accuracy of the proposed UIR model is higher than that
of the UBCF model (i.e. its mean absolute error MAE and
root mean squared error RMSE are lower) if the user requiring
the recommendation is not a new user. In the opposite case, the
accuracy of the UIR model is still higher than that of the UBCF
model if the number of nearest neighbors used for predicting
ratings is high.
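For reference, the two predictive accuracy metrics named above can
be computed as in the following minimal Python sketch. The
predicted and actual ratings are hypothetical values on the [0, 1]
scale of the rating matrix; lower values of both metrics mean
higher accuracy.

```python
import math

def mae(predicted, actual):
    """Mean absolute error between predicted and held-out ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean squared error; penalizes large errors more than MAE."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

predicted = [0.5, 1.0, 0.0, 0.5]
actual = [1.0, 1.0, 0.5, 0.5]
print(mae(predicted, actual), rmse(predicted, actual))
```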
3.4.4. Evaluating the UIR model using the rank accuracy metrics
The experiment is conducted for the case where the active
user has rated a few items and requires a few recommended
items. The experimental result shows that the accuracy of the
proposed UIR model (via the nDCG metric) is higher than that
of the UBCF, ALS_Implicit and SVD models if knn >= 30.
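The nDCG metric used above rewards lists that place relevant items
near the top. The sketch below uses the common log2 discount and
computes the ideal ordering from the same list of relevances; the
exact variant used by the evaluation tool of the thesis is not
stated in this summary, so this is an assumption.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevances given in ranked order."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances))

def ndcg(relevances):
    """DCG divided by the DCG of the best possible ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Relevance of the recommended items, in the order they were ranked.
print(ndcg([1, 1, 0]))  # relevant items ranked first
print(ndcg([0, 1]))     # the relevant item is ranked second
```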
3.5. Conclusion
Chapter 3 proposes a new measure, called KnnUIR, that
predicts a user's rating for an item. KnnUIR is developed from
two SIMs, the typicality and the implicative intensity. It
incorporates several factors affecting the predicted ratings:
the nearest neighbors, the ratings given by those
neighbors, and the contribution of an item to the relationship
between the user requiring the recommendation and his/her
nearest neighbors. Chapter 3 also proposes a new recommendation
model, named UIR, using KnnUIR and the user based collaborative
filtering method. The accuracy of the proposed UIR model is
evaluated by the classification accuracy metrics (for binary
data), the predictive accuracy metrics (for non-binary data) and
the rank accuracy metrics (for both binary and non-binary data),
in a group of internal comparison scenarios (UIR and SIR) and
a group of external comparison scenarios (UIR and the existing
models: AR, IBCF, POPULAR, ALS_Implicit, UBCF, SVD).
Experimental results show that the accuracy of the UIR model
(1) is higher when the contribution of items to the
relationship between a user and his/her neighbor is considered; and
(2) is higher than that of the compared existing models when the
number of known ratings of the user who needs the recommendation
is not too low (i.e. that user is not a new user). The
experimental results also show that the accuracy of the UIR model
is lower than that of the proposed SIR model in the case of new
users.
CHAPTER 4. RECOMMENDATION BASED ON ITEMS
IMPLICATIVE RATING MEASURE
When predicting the rating given by the user $u_a$ to the item
$i$, we consider the items that were rated by $u_a$ to be the
potential nearest neighbors of $i$. Each nearest neighbor $i_j$
has a different effect on $i$. This effect can be measured by the
interestingness of the relationship $(i_j, i)$. The confidence
measure is used to calculate the strength of the relationship
using the examples $n_{i_j i}$, whereas the implicative intensity
is used for calculating the surprisingness of the relationship
using the counter-examples $n_{i_j \bar{i}}$. If two relationships
$(i_{j1}, i)$ and $(i_{j2}, i)$ have the same confidence value, the
surprisingness value distinguishes them, and vice versa.
Therefore, these two measures can be combined to clearly
distinguish the effect of each neighbor $i_j$ on $i$. Chapter 4
uses nearest neighbors as Chapter 3 does, but its neighbors are
items; it is based on items as Chapter 2 is, but it considers only
the relationship between two items instead of that between a set
of items and one item.
4.1. KnnIIR Definition
The k nearest neighbors (i.e. items) based implicative rating
measure $KnnIIR$ is proposed to predict the rating given by a user
$u_a$ for an item $i \in I$, thereby increasing the recommendation
accuracy. $KnnIIR$ is built from the ratings of $u_a$ for items
$i_j$ ($i_j$ can be seen as one of the potential nearest neighbors
of $i$) and the strength of the relationship between each neighbor
$i_j$ and the item $i$, using the confidence value $c(i_j, i)$ and
one of the SIM values: the implicative intensity
$\varphi(i_j, i)$, the cohesion value $coh(i_j, i)$, or the
entropic version of implicative intensity $\phi(i_j, i)$.
As a result, $KnnIIR$ considers not only the examples $n_{i_j i}$
of the relationship $(i_j, i)$ but also its counter-examples
$n_{i_j \bar{i}}$.
$$KnnIIR(u_a, i) = \sum_{j=1}^{knn} r_{u_a i_j} * v_{i_j i} \qquad (4.1)$$
$$v_{i_j i} = \left[ \begin{array}{l} \varphi(i_j, i) * c(i_j, i) \\ coh(i_j, i) * c(i_j, i) \\ \phi(i_j, i) * c(i_j, i) \end{array} \right. \qquad (4.2)$$
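Formulas (4.1) and (4.2) can be sketched in Python as follows. The
values $v_{i_j i}$ are taken as given (any of the three products in
(4.2)). How the knn neighbors are chosen among the items rated by
$u_a$ is not detailed in this summary; the sketch assumes that the
neighbors with the largest $v_{i_j i}$ are kept. The function name,
item identifiers and numbers are hypothetical.

```python
def knn_iir(user_ratings, v_to_target, knn):
    """Raw KnnIIR(u_a, i): sum over the knn neighbor items i_j of
    r_{u_a i_j} * v_{i_j i}.

    user_ratings: {item: rating given by u_a} for the items u_a has rated
    v_to_target:  {item i_j: v_{i_j i}} for the target item i
    knn:          number of neighbor items to use
    """
    # Potential neighbors are the items rated by u_a with a known v value.
    candidates = [j for j in user_ratings if j in v_to_target]
    # Assumption: keep the knn candidates most strongly related to i.
    candidates.sort(key=lambda j: v_to_target[j], reverse=True)
    return sum(user_ratings[j] * v_to_target[j] for j in candidates[:knn])

ratings_ua = {"i1": 1.0, "i2": 0.5, "i3": 1.0}
v = {"i1": 0.9, "i2": 0.4, "i3": 0.1}  # e.g. implicative intensity * confidence
print(knn_iir(ratings_ua, v, knn=2))
print(knn_iir(ratings_ua, v, knn=3))
```

As with KnnUIR, the raw sum would still have to be mapped to the
scale of the rating matrix before being compared with known
ratings.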
4.2. Items implicative rating based model - IIR
The items implicative rating based model IIR is shown in
Figure 4.1.
Figure 4.1: The items implicative rating based model.
[Figure 4.1 (diagram). Components shown: the inputs
$(u_a, I, R_{u_a})$ and $(U, I, R)$; the confidence measure, the
implicative intensity, the entropic version of implicative
intensity and the cohesion measure computed on $I \times I$,
giving $V = \{v_{jk} \mid j, k = 1, \ldots, knn\}$; the K nearest
neighbors/items based implicative rating measure (KnnIIR)
computed on $u_a \times I$, giving the predicted ratings
$R'_{u_a}$; and the output
$Reclist = \{i \mid i \in I, r'_{u_a i} \in TopN\}$.]
Similar to the models of Chapter 2 and Chapter 3, the
proposed IIR model has a finite user set, a finite item set, a
rating matrix, a vector with the ratings already given by the user
requiring the recommendation, and a vector with the predicted
ratings. Unlike the models of the previous chapters, the
IIR model uses the item matrix V to store the values $v_{jk}$
needed to carry out the recommendation. Matrix V can be built
directly or indirectly. In the indirect form, we generate a set
of rules (similar to Chapter 2) but only consider rules of
length 2, with the support and confidence thresholds set to 0,
and then convert this ruleset to the item matrix. However,
compared to the direct method, this approach can increase the
recommendation time and depends on the tools used for
generating rules. In addition, the V matrix can be built online
or offline. When the number of items and the size of the dataset
are large, the recommendation time can be shortened if we
pre-build the V matrix (offline) and store it in a file.
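The direct construction of the item matrix can be sketched as
follows for binary data: for every ordered pair of items, the four
counts are taken straight from the rating matrix and passed to a
measure, with no rule-generation tool involved. This is a minimal
illustration, not the implementation of the thesis; the function
names are hypothetical, and confidence alone is used as the measure
to keep the example short (formula (4.2) multiplies it by a SIM
value).

```python
def pair_counts(R, j, k):
    """Return (n, n_a, n_b, n_ab_bar) for the relationship (i_j, i_k)."""
    n = len(R)
    n_a = sum(1 for row in R if row[j] == 1)
    n_b = sum(1 for row in R if row[k] == 1)
    # Counter-examples: users who like i_j but not i_k.
    n_ab_bar = sum(1 for row in R if row[j] == 1 and row[k] != 1)
    return n, n_a, n_b, n_ab_bar

def build_item_matrix(R, measure):
    """Item-by-item matrix V with V[j][k] = measure of (i_j, i_k)."""
    m = len(R[0])
    V = [[None] * m for _ in range(m)]  # diagonal stays NA (None)
    for j in range(m):
        for k in range(m):
            if j != k:
                V[j][k] = measure(*pair_counts(R, j, k))
    return V

def confidence(n, n_a, n_b, n_ab_bar):
    """Share of the users who like i_j that also like i_k."""
    return (n_a - n_ab_bar) / n_a if n_a else 0.0

# Hypothetical binary rating matrix: 4 users x 3 items.
R = [[1, 1, 0], [1, 0, 1], [1, 1, 0], [0, 1, 0]]
V = build_item_matrix(R, confidence)
print(V[0][1], V[2][0])
```

Since V depends only on the rating matrix and the chosen measure,
it can be computed once offline and reloaded when a recommendation
is requested.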
4.3. Operation of the items implicative rating based model
The operational diagram of the IIR model is depicted in
Figure 4.2.
4.4. Experiment
4.4.1. Data and tool
Chapter 4 also uses the datasets and tool used by the SIR and
UIR models.
4.4.2. Evaluating the IIR model using the classification
accuracy metrics
- Building the item matrix directly can reduce the
recommendation time and does not depend on the tools used for
generating rules.
- The accuracy of the IIR model (via Precision - Recall curve,
ROC curve and the F1 measure) is the highest when the
implicative intensity is used for building the item matrix and knn
is the number of items of the dataset.
- The accuracy of the IIR model is higher than that of the
compared recommendation models (AR, POPULAR, IBCF, SIR)
when the user requiring the recommendation is not a new user.
Figure 4.2: The operational diagram of the IIR model.
[Figure 4.2 (flow diagram). Inputs: the rating matrix (users
$u_1..u_n$ by items $i_1..i_m$, with NA for unknown ratings) and
the ratings of the user $u_a$ who requires the recommendation.
Building the item matrix with knn neighbors: pre-processing data;
building the item matrix (items $i_1..i_m$ by items $i_1..i_m$,
with NA on the diagonal); filtering the matrix to obtain knn
neighbors. Making the recommendation: predicting ratings using
KnnIIR; if a recommendation is requested ("Recommend?" is Yes),
recommending the items with the highest predicted ratings.
Outputs: the predicted ratings of $u_a$ and the list of top N
items.]
4.4.3. Evaluating the IIR model using the predictive accuracy
metrics
- The accuracy of the IIR model (via MAE and RMSE) is the
highest when knn is the number of items of the dataset, and when
the entropic version of implicative intensity is used for
building the item matrix if a user has rated only a few items,
and the cohesion measure otherwise.
- The accuracy of the IIR model is higher than that of the
IBCF model if a user requiring the recommendation already rated
many items.
4.4.4. Evaluating the IIR model using the rank accuracy
metrics
The accuracy of the IIR model (via nDCG) is higher than that of
the IBCF and ALS_Implicit models if the active user has rated a
few items and requires a few recommended items.
4.5. Comparing the proposed models
If the dataset is in binary form, the SIR model is suitable for
the case in which the active user has rated a few items, whereas
the IIR model fits the other cases. If the recommendation
time is taken into account, the UIR model can be used instead of
the SIR model. If the data is in non-binary form, the accuracy of
the UIR model is higher than that of the IIR model.
4.6. Conclusion
Chapter 4 proposes a new measure (named KnnIIR),
developed from the relationship of two items, to predict ratings;
and the IIR model, which uses the proposed measure to recommend a
list of good items to a user or to predict the rating given by a
user to an item. The proposed IIR model is improved by building
the item matrix directly. This reduces the recommendation time
and avoids the reliance on the tool used for generating rules. The
accuracy of the IIR model is evaluated on both binary and
non-binary data, according to the classification accuracy
metrics, the predictive accuracy metrics and the rank accuracy
metric. The experimental results show that the IIR model should
(1) use the implicative intensity if the data is in binary form,
or the combination of the entropic version and the cohesion
measure if the data is in non-binary form, to build the item
matrix; and (2) be used to build RSs because of its high accuracy.
In addition, the experimental results show that (1) the
combination of the confidence value and the implicative value of
two items improves the recommendation result; and (2) the
accuracy of the IIR model is lower than that of the SIR model in
the case of a new user.
CONCLUSION AND FUTURE WORKS
Results of the study
- Identifying the statistical implicative measures to be used for
RSs; then proposing and improving the recommendation model
based on SIMs and association rules to recommend the good
items to users.
- Proposing a new measure KnnUIR based on the nearest
users and some SIMs, and then proposing a new recommendation
model UIR using this measure. The proposed model can predict
the ratings given by a user to items and recommend the good
items to users.
- Proposing a new measure KnnIIR based on the nearest items
and some SIMs, and then proposing a new recommendation
model IIR using the proposed measure.
- Developing the Interestingness tool in R language used for
the experiment.
- Co