Mining decision laws on the data block

e desire to contribute of the thesis.

2. The objective of the thesis

The thesis focuses on solving three problems:

- To find decision laws on the data block and on the block's slice.
- To find decision laws between object groups on a block whose index attribute values change, particularly when attribute values are smoothed or roughened.
- To find decision laws between object groups on the block when elements of the block are added or removed.

3. Layout of the thesis

The thesis consists of the introduction, three chapters of content, the conclusions and the references.

Chapter 1 presents the basic concepts of the data block, data mining, mining decision laws and equivalence relations.

Chapter 2 presents two research results. The first is the MDLB algorithm for finding decision laws on the block and on the block's slice. The second is the MDLB_VAC algorithm for finding decision laws on the block when attribute values change. The chapter also gives theoretical results on block mining, analyses the complexity of the proposed algorithms and tests them.

Chapter 3 builds a model for enlarging or reducing the object set of a decision block, proposes two incremental algorithms, MDLB_OSC1 and MDLB_OSC2, for finding decision laws when the block's object set changes, and tests them.

CHAPTER 1. SOME BASIC KNOWLEDGE

1.1. Data mining

1.1.1. Definition of data mining

Data mining is the main stage in the process of discovering knowledge in a database. The output of this process is the knowledge latent in the data, which supports forecasting and decision making in business, management, production activities, ...

1.1.2. Some data mining techniques

- Classification.
- Prediction.
- Association Rule.
- Clustering.

1.2. Mining decision laws

1.2.1. Information System

Definition 1.1 (Information system) An information system is a quadruple $S = (U, A, V, f)$, where $U$ is a finite, non-empty set of objects (also called the universe), $A$ is a finite, non-empty set of attributes, $V = \bigcup_{a \in A} V_a$ is the set of values, with $V_a$ the value set of attribute $a \in A$, and $f: U \times A \to V$ is the information function, with $f(u, a) \in V_a$ for all $a \in A$, $u \in U$.

1.2.2. Indiscernibility Relation

Given the information system $S = (U, A, V, f)$, for each attribute subset $P \subseteq A$ there exists a binary relation on $U$, denoted $IND(P)$, defined as follows:

$$IND(P) = \{(u, v) \in U \times U \mid u(a) = v(a),\ \forall a \in P\}$$

$IND(P)$ is called an indiscernibility relation.

1.2.3. Decision table

A decision table is a special information system in which the attribute set $A$ is divided into two disjoint non-empty sets $C$ and $D$ ($A = C \cup D$, $C \cap D = \emptyset$), called the conditional attribute set $C$ and the decision attribute set $D$ respectively. The decision table is denoted by $DS = (U, C \cup D, V, f)$ or simply $DS = (U, C \cup D)$.

1.2.4. Decision law

Definition 1.4 (Decision law) Given the decision table $DS = (U, C \cup D)$, suppose $U/C = \{C_1, C_2, \dots, C_m\}$ and $U/D = \{D_1, D_2, \dots, D_n\}$ are the partitions generated by $C$ and $D$. For $C_i \in U/C$, $D_j \in U/D$, a decision law is written as $C_i \to D_j$, $i = 1..m$, $j = 1..n$.

1.3. Model of data block

1.3.1. The block

Definition 1.8 Let $R = (id; A_1, A_2, \dots, A_n)$ be a finite set of elements, where $id$ is a non-empty finite index set and $A_i$ ($i = 1..n$) are the attributes. Each attribute $A_i$ ($i = 1..n$) has a corresponding value domain $dom(A_i)$. A block $r$ on $R$, denoted $r(R)$, consists of a finite number of elements, each of which is a family of mappings from the index set $id$ to the value domains of the attributes $A_i$ ($i = 1..n$):

$$t \in r(R) \Leftrightarrow t = \{t^i : id \to dom(A_i)\}_{i=1..n}.$$

The block is denoted by $r(R)$ or $r(id; A_1, A_2, \dots, A_n)$; where there is no risk of confusion we simply write $r$.

1.3.2. The block's slice

Let $R = (id; A_1, A_2, \dots, A_n)$ and let $r(R)$ be a block over $R$.
For each $x \in id$ we denote by $r(R_x)$ the block with $R_x = (\{x\}; A_1, A_2, \dots, A_n)$ such that:

$$t_x \in r(R_x) \Leftrightarrow t_x = \{t^i_x = t^i|_x\}_{i=1..n},\quad t \in r(R),\ t = \{t^i : id \to dom(A_i)\}_{i=1..n},$$

where $t^i_x(x) = t^i(x)$, $i = 1..n$. Then $r(R_x)$ is called the slice of the block $r(R)$ at point $x$, sometimes denoted $r_x$.

For simplicity we use the symbols $x^{(i)} = (x; A_i)$ and $id^{(i)} = \{x^{(i)} \mid x \in id\}$. We call $x^{(i)}$ ($x \in id$, $i = 1..n$) the index attributes of the block scheme $R = (id; A_1, A_2, \dots, A_n)$.

1.3.3. Relational algebra on the block

- Union
- Intersection
- Difference
- Cartesian product
- Cartesian product with index set
- Projection
- Selection
- Join
- Division

1.4. Conclusion of chapter 1

Chapter 1 of the thesis presents an overview of data mining, data mining techniques, and the background of mining decision laws, equivalence classes, ... The last part of the chapter presents the basic concepts of the data block model: blocks, block slices, and relational algebra on blocks. This knowledge is the basis for the issues presented in the next chapters.

CHAPTER 2. MINING DECISION LAWS ON A DATA BLOCK WITH VARIABLE ATTRIBUTE VALUES

2.1 Some concepts built on the block

2.1.1 Information block

Definition 2.1 Let $R = (id; A_1, A_2, \dots, A_n)$ be a block scheme and $r$ a block over $R$. The information block is a quadruple $IB = (U, A, V, f)$, where $U$ is the set of objects of $r$, called the space of objects, $A = \bigcup_{i=1}^{n} id^{(i)}$ is the set of index attributes of the objects, $V = \bigcup_{x^{(i)} \in A} V_{x^{(i)}}$, with $V_{x^{(i)}}$ the set of values of the objects on the index attribute $x^{(i)}$, and $f: U \times A \to V$ is the information function satisfying: $\forall u \in U$, $\forall x^{(i)} \in A$ we have $f(u, x^{(i)}) \in V_{x^{(i)}}$.

2.1.2 Indiscernibility Relation on Block

Definition 2.3 Let $IB = (U, A, V, f)$ be an information block.
Then for each index attribute set $P \subseteq A$ we define an equivalence relation, denoted $IND(P)$ and called the indiscernibility relation, as follows:

$$IND(P) = \{(u, v) \in U \times U \mid \forall x^{(i)} \in P: f(u, x^{(i)}) = f(v, x^{(i)})\}.$$

2.1.3 Decision block

Definition 2.5 Let $IB = (U, A, V, f)$ be an information block, where $U$ is the space of objects and $A = \bigcup_{i=1}^{n} id^{(i)}$. Suppose $A$ is divided into two sets $C$ and $D$ such that

$$C = \bigcup_{i=1,\,x \in id}^{k} x^{(i)}, \qquad D = \bigcup_{i=k+1,\,x \in id}^{n} x^{(i)}.$$

Then the information block $IB$ is called a decision block and denoted by $DB = (U, C \cup D, V, f)$, where $C$ is the set of conditional index attributes and $D$ is the set of decision index attributes.

2.1.4 Decision laws on the block and slice

Definition 2.7 Let $DB = (U, C \cup D)$ be a decision block, where $U$ is the space of objects, $C = \bigcup_{i=1,\,x \in id}^{k} x^{(i)}$, $D = \bigcup_{i=k+1,\,x \in id}^{n} x^{(i)}$, and $C_x = \bigcup_{i=1}^{k} x^{(i)}$, $D_x = \bigcup_{i=k+1}^{n} x^{(i)}$, $x \in id$. Then $U/C = \{C_1, C_2, \dots, C_m\}$, $U/C_x = \{C_{x_1}, C_{x_2}, \dots, C_{x_{t_x}}\}$, $U/D = \{D_1, D_2, \dots, D_k\}$ and $U/D_x = \{D_{x_1}, D_{x_2}, \dots, D_{x_{h_x}}\}$ are the partitions generated by $C$, $C_x$, $D$, $D_x$ respectively. A decision law on the block is denoted by $C_i \to D_j$, $i = 1..m$, $j = 1..k$, and a decision law on the slice at point $x$ is denoted by $C_{x_i} \to D_{x_j}$, $i = 1..t_x$, $j = 1..h_x$.

Definition 2.8 Let $DB = (U, C \cup D)$ be a decision block, $C_i \in U/C$, $D_j \in U/D$, $C_{x_p} \in U/C_x$, $D_{x_q} \in U/D_x$, $i = 1..m$, $j = 1..k$, $p \in \{1, 2, \dots, t_x\}$, $q \in \{1, 2, \dots, h_x\}$, $x \in id$. Then the support, accuracy and coverage of the decision law $C_i \to D_j$ on the block are:

- Support: $Sup(C_i, D_j) = |C_i \cap D_j|$,
- Accuracy: $Acc(C_i, D_j) = \dfrac{|C_i \cap D_j|}{|C_i|}$,
- Coverage: $Cov(C_i, D_j) = \dfrac{|C_i \cap D_j|}{|D_j|}$.

Definition 2.9 Let $DB = (U, C \cup D)$ be a decision block, let $C_i \in U/C$ and $D_j \in U/D$ be a conditional equivalence class and a decision equivalence class generated by $C$ and $D$ respectively, and let $C_i \to D_j$ be a decision law on the block $DB$, $i = 1..m$, $j = 1..k$.

- If $Acc(C_i, D_j) = 1$ then $C_i \to D_j$ is called a certain decision law.
- If $0 < Acc(C_i, D_j) < 1$ then $C_i \to D_j$ is called an uncertain decision law.
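Definitions 2.8 and 2.9 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy patient data, the function names `partition`, `law_measures` and `kind`, and the label "no law" for a zero accuracy are all assumptions made for illustration. Index attributes are written as `(x, attribute)` pairs, and exact fractions are used so that the values can be compared with the definitions directly.

```python
from collections import defaultdict
from fractions import Fraction


def partition(table, attrs):
    """Equivalence classes of IND(attrs): group objects by their values on attrs."""
    groups = defaultdict(set)
    for obj, row in table.items():
        groups[tuple(row[a] for a in attrs)].add(obj)
    return [frozenset(g) for g in groups.values()]


def law_measures(table, cond, dec):
    """Map each pair (Ci, Dj) to (Sup, Acc, Cov) as in Definition 2.8."""
    measures = {}
    for ci in partition(table, cond):
        for dj in partition(table, dec):
            sup = len(ci & dj)
            measures[(ci, dj)] = (sup, Fraction(sup, len(ci)), Fraction(sup, len(dj)))
    return measures


def kind(acc):
    """Classify a law by its accuracy as in Definition 2.9."""
    if acc == 1:
        return "certain"
    if 0 < acc < 1:
        return "uncertain"
    return "no law"  # assumption: label for Acc = 0, i.e. the classes do not intersect


# Hypothetical toy block with a single index point x1.
table = {
    "u1": {("x1", "fever"): "high", ("x1", "cough"): "yes", ("x1", "regimen"): "A"},
    "u2": {("x1", "fever"): "high", ("x1", "cough"): "yes", ("x1", "regimen"): "A"},
    "u3": {("x1", "fever"): "high", ("x1", "cough"): "yes", ("x1", "regimen"): "B"},
    "u4": {("x1", "fever"): "low", ("x1", "cough"): "no", ("x1", "regimen"): "B"},
}
cond = [("x1", "fever"), ("x1", "cough")]
dec = [("x1", "regimen")]

for (ci, dj), (sup, acc, cov) in law_measures(table, cond, dec).items():
    print(sorted(ci), "->", sorted(dj), sup, acc, cov, kind(acc))
```

In this toy block the class {u4} maps entirely into the regimen-B class, so that law is certain, while the class {u1, u2, u3} is split between both regimens and yields two uncertain laws.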
Definition 2.10 Let $DB = (U, C \cup D)$ be a decision block, let $C_i \in U/C$ and $D_j \in U/D$, $i = 1..m$, $j = 1..k$, be a conditional equivalence class and a decision equivalence class generated by $C$ and $D$ respectively, and let $\alpha$, $\beta$ be two given thresholds ($\alpha, \beta \in (0, 1)$). If $Acc(C_i, D_j) \ge \alpha$ and $Cov(C_i, D_j) \ge \beta$ then we call $C_i \to D_j$ a meaningful decision law.

2.2 Algorithm for mining decision laws on the data block and the block's slice (MDLB)

The MDLB algorithm consists of the following steps:

- Step 1: Determine the conditional and decision equivalence classes on the block (on the slice).
- Step 2: Calculate the support matrix on the block (on the slice).
- Step 3: Calculate the accuracy matrix and the coverage matrix.
- Step 4: Find the decision laws on the block.

2.3. Mining decision laws on the block when index attribute values change

Definition 2.11 (Smoothing an index attribute value on the block) Let $DB = (U, C \cup D, V, f)$ be a decision block, where $U$ is the space of objects, $a \in C \cup D$, and $V_a$ is the set of existing values of the index attribute $a$. Suppose $Z = \{x_s \in U \mid f(x_s, a) = z\}$ is the set of objects whose value on the index attribute $a$ is $z$. If $Z$ is partitioned into two sets $W$ and $Y$ such that $Z = W \cup Y$, $W \cap Y = \emptyset$, with $W = \{x_p \in U \mid f(x_p, a) = w,\ w \notin V_a\}$ and $Y = \{x_q \in U \mid f(x_q, a) = y,\ y \notin V_a\}$, then we say the value $z$ of the index attribute $a$ is smoothed to two new values $w$ and $y$.

Definition 2.12 (Roughening index attribute values on the block) Let $DB = (U, C \cup D, V, f)$ be a decision block, where $U$ is the space of objects, $a \in C \cup D$, and $V_a$ is the set of existing values of the index attribute $a$. Suppose $f(x_p, a) = w$ and $f(x_q, a) = y$ are the values of $x_p$ and $x_q$ on the index attribute $a$ ($p \ne q$). If at some time we have $f(x_p, a) = f(x_q, a) = z$ ($z \notin V_a$) then we say the two values $w$, $y$ of $a$ are roughened to the new value $z$.

Theorem 2.1 Let $DB = (U, C \cup D, V, f)$ be a decision block, where $U$ is the space of objects, $a \in C \cup D$, and $V_a$ is the set of existing values of the index attribute $a$.
Then two equivalence classes $E_p$, $E_q$ ($E_p, E_q \in U/E$, $E \in \{C, D\}$) are roughened into a new equivalence class $E_s$ if and only if $\forall a_j \ne a$: $f(E_p, a_j) = f(E_q, a_j)$.

Theorem 2.2 Let $DB = (U, C \cup D, V, f)$ be a decision block, where $U$ is the space of objects, $a \in C \cup D$, and $V_a$ is the set of existing values of the index attribute $a$. Then an equivalence class $E_s$ ($E_s \in U/E$, $E \in \{C, D\}$) is smoothed into two new equivalence classes $E_p$, $E_q$ if and only if we can put $f(E_p, a) = w$, $f(E_q, a) = y$ and $E_p \cup E_q = E_s$, with $w, y \notin V_a$, $w \ne y$.

Theorem 2.3 Let $DB = (U, C \cup D, V, f)$ be a decision block and let $\alpha$, $\beta$ be two given thresholds ($\alpha, \beta \in (0, 1)$). If $C_i \to D_j$ is a meaningful decision law on the decision block, then it is also a meaningful decision law on any slice of the decision block at $x \in id$.

2.3.1 Smoothing and roughening the conditional equivalence classes on the decision block and on the slice

Proposition 2.3 Let $DB = (U, C \cup D, V, f)$ be a decision block, $a = x^{(i)} \in C$, and $V_a$ the set of existing values of the conditional index attribute $a$. Suppose the value $z$ of $a$ is smoothed to two new values $w$ and $y$. If the conditional equivalence class $C_s \in U/C$ ($f(C_s, a) = z$) is smoothed into two new conditional equivalence classes $C_p$, $C_q$ ($f(C_p, a) = w$, $f(C_q, a) = y$, with $w, y \notin V_a$), then on the slice $r_x$ there exists an equivalence class $C_{x_i}$ satisfying $C_s \subseteq C_{x_i}$ that is also smoothed into two new conditional equivalence classes $C_{x_i'}$ and $C_{x_i''}$ satisfying $C_p \subseteq C_{x_i'}$, $C_q \subseteq C_{x_i''}$ ($f(C_{x_i'}, a) = w$, $f(C_{x_i''}, a) = y$).

Proposition 2.5 Let $DB = (U, C \cup D, V, f)$ be a decision block, $a = x^{(i)} \in C$, and $V_a$ the set of existing values of the conditional index attribute $a$. Suppose the values $w$ and $y$ of $a$ are roughened to the new value $z$. If two conditional equivalence classes $C_p, C_q \in U/C$ ($f(C_p, a) = w$, $f(C_q, a) = y$) are roughened into a new conditional equivalence class $C_s \in U/C$ ($f(C_s, a) = z$), then on the slice $r_x$ there exist two conditional equivalence classes $C_{x_i}$, $C_{x_j}$ satisfying $C_p \subseteq C_{x_i}$, $C_q \subseteq C_{x_j}$ that are also roughened into a new conditional equivalence class $C_{x_k}$ satisfying $C_s \subseteq C_{x_k}$.
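The effect of roughening on conditional equivalence classes, on the block and on a slice, can be checked on a small example. The following Python sketch is only an illustration under stated assumptions: the four-object block, the attribute names `A1`, `A2`, the values `w`, `y`, `v`, `z`, and the helper names `partition` and `roughen` are hypothetical and not taken from the thesis.

```python
from collections import defaultdict


def partition(table, attrs):
    """Equivalence classes of IND(attrs), returned as a set of frozensets."""
    groups = defaultdict(set)
    for obj, row in table.items():
        groups[tuple(row[a] for a in attrs)].add(obj)
    return {frozenset(g) for g in groups.values()}


def roughen(table, attr, old_values, new_value):
    """Copy of table in which every value in old_values on attr becomes new_value."""
    return {
        obj: {a: (new_value if a == attr and v in old_values else v)
              for a, v in row.items()}
        for obj, row in table.items()
    }


# Hypothetical conditional index attributes x^(i) = (x, A_i) over id = {x1, x2}.
a = ("x1", "A1")       # the attribute whose values w, y are roughened to z
b = ("x1", "A2")
c = ("x2", "A1")
block_attrs = [a, b, c]   # C on the whole block
slice_attrs = [a, b]      # C_x on the slice at x1

table = {
    "u1": {a: "w", b: 0, c: 1},
    "u2": {a: "y", b: 0, c: 1},
    "u3": {a: "y", b: 0, c: 2},
    "u4": {a: "v", b: 1, c: 1},
}

rough = roughen(table, a, {"w", "y"}, "z")

print(partition(table, block_attrs))   # four singleton classes
print(partition(rough, block_attrs))   # {u1} and {u2} merge, {u3} does not
print(partition(table, slice_attrs))   # slice classes are coarser than block classes
print(partition(rough, slice_attrs))   # the slice classes containing them merge as well
```

Here `u1` and `u2` agree on every attribute other than `a`, so their classes merge on the block, whereas `u3` differs on `("x2", "A1")` and stays apart. On the slice at `x1` the merged class `{u1, u2, u3}` contains the merged block class `{u1, u2}`, in line with Proposition 2.5.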
2.3.2 Smoothing and roughening the decision equivalence classes on the decision block and on the slice

Proposition 2.7 Let $DB = (U, C \cup D, V, f)$ be a decision block, $a = x^{(i)} \in D$, and $V_a$ the set of existing values of the decision index attribute $a$. Suppose the value $z$ of $a$ is smoothed to two new values $w$ and $y$. If the decision equivalence class $D_s \in U/D$ ($f(D_s, a) = z$) is smoothed into two decision equivalence classes $D_p$, $D_q$ ($f(D_p, a) = w$, $f(D_q, a) = y$, with $w, y \notin V_a$), then on the slice $r_x$ there exists a decision equivalence class $D_{x_i}$ satisfying $D_s \subseteq D_{x_i}$ that is also smoothed into two new decision equivalence classes $D_{x_i'}$ and $D_{x_i''}$ satisfying $D_p \subseteq D_{x_i'}$, $D_q \subseteq D_{x_i''}$ ($f(D_{x_i'}, a) = w$, $f(D_{x_i''}, a) = y$).

Proposition 2.9 Let $DB = (U, C \cup D, V, f)$ be a decision block, $a = x^{(i)} \in D$, and $V_a$ the set of existing values of the decision index attribute $a$. Suppose the values $w$ and $y$ of $a$ are roughened to the new value $z$. If two decision equivalence classes $D_p$, $D_q$ ($f(D_p, a) = w$, $f(D_q, a) = y$) are roughened into a new decision equivalence class $D_s \in U/D$ ($f(D_s, a) = z$), then on the slice $r_x$ there exist two decision equivalence classes $D_{x_i}$, $D_{x_j}$ satisfying $D_p \subseteq D_{x_i}$, $D_q \subseteq D_{x_j}$ that are also roughened into a new decision equivalence class $D_{x_k}$ satisfying $D_s \subseteq D_{x_k}$.

2.3.4 The algorithm for mining decision laws when smoothing or roughening index attribute values on the block and the slice (MDLB_VAC)

The MDLB_VAC algorithm consists of the following steps:

- Step 1: Calculate the support matrix $Sup(C, D)$ of the original block.
- Step 2: Incrementally calculate the support matrix $Sup(C', D')$ on the block after roughening/smoothing the value of the index attribute.
- Step 3: Calculate the accuracy matrix $Acc(C', D')$ and the coverage matrix $Cov(C', D')$ after roughening/smoothing the value of the index attribute from the matrix $Sup(C', D')$.
- Step 4: Find the decision laws on the block.

2.4 Complexity of the Sup matrix algorithms on the block and on the slice

Proposition 2.13: The support matrix algorithms for the decision block and for the slice at $x \in id$ have the same complexity, $O(|U|^2)$.
Proposition 2.14: The support matrix algorithms for the decision block and for the slice at $x \in id$ after roughening the values of the conditional index attribute have the same complexity, $O(|U|^2)$.

Proposition 2.15: The support matrix algorithms for the decision block and for the slice at $x \in id$ after smoothing the values of the conditional index attribute have the same complexity, $O(|U|^2)$.

2.6 Conclusion

This chapter presents the first results of the thesis:

- Building some basic concepts of mining decision laws on the block. On that basis, a number of related properties, propositions and theorems were stated and proved.
- Building the MDLB algorithm to find decision laws on the block and the slice.
- Proposing and proving some results on the relationship between roughening and smoothing the values of a conditional or decision attribute on the block and on the slice, and proposing the MDLB_VAC algorithm to calculate the support matrices on the block and slice and find decision laws when the value of an index attribute changes.

CHAPTER 3. MINING DECISION LAWS ON A BLOCK WHOSE OBJECT SET CHANGES

3.1 Model of adding and removing objects on block and slice

Proposition 3.1 Let $DB = (U, C \cup D, V, f)$ be a decision block, and let $AN$ and $DM$ be the sets of objects added to and removed from the decision block $DB$. Then we have $Acc(C', D') = (Acc(C'_i, D'_j))_{ij}$ with $i = 1..m+p$, $j = 1..h+q$ and

$$
Acc(C'_i, D'_j) =
\begin{cases}
\dfrac{|C_i \cap D_j| + N_{ij} - M_{ij}}{|C_i| + \sum_{j'=1}^{h+q} N_{ij'} - \sum_{j'=1}^{h} M_{ij'}}, & i = 1..m,\ j = 1..h, \\[2ex]
\dfrac{N_{ij}}{|C_i| + \sum_{j'=1}^{h+q} N_{ij'} - \sum_{j'=1}^{h} M_{ij'}}, & i = 1..m,\ j = h+1..h+q, \\[2ex]
\dfrac{N_{ij}}{\sum_{j'=1}^{h+q} N_{ij'}}, & i = m+1..m+p,\ j = 1..h+q.
\end{cases}
$$

Proposition 3.3 Let $DB = (U, C \cup D, V, f)$ be a decision block, and let $AN$ and $DM$ be the sets of objects added to and removed from the decision block $DB$. Then we have $Cov(C', D') = (Cov(C'_i, D'_j))_{ij}$, a matrix of size $(m+p) \times (h+q)$, with $i = 1..m+p$, $j = 1..h+q$ and

$$
Cov(C'_i, D'_j) =
\begin{cases}
\dfrac{|C_i \cap D_j| + N_{ij} - M_{ij}}{|D_j| + \sum_{i'=1}^{m+p} N_{i'j} - \sum_{i'=1}^{m} M_{i'j}}, & i = 1..m,\ j = 1..h, \\[2ex]
\dfrac{N_{ij}}{|D_j| + \sum_{i'=1}^{m+p} N_{i'j} - \sum_{i'=1}^{m} M_{i'j}}, & i = m+1..m+p,\ j = 1..h, \\[2ex]
\dfrac{N_{ij}}{\sum_{i'=1}^{m+p} N_{i'j}}, & i = 1..m+p,\ j = h+1..h+q.
\end{cases}
$$

3.2 Incremental calculation of Acc and Cov when adding and removing objects on the decision block

3.2.1 Adding object x into the decision block

Case 1: A new conditional class and a new decision class are created.

$Acc(C'_{m+1}, D'_{h+1}) = 1$ and $Cov(C'_{m+1}, D'_{h+1}) = 1$.

- $\forall j = 1..h$: $Acc(C'_{m+1}, D'_j) = Cov(C'_{m+1}, D'_j) = 0$.
- $\forall i = 1..m$: $Acc(C'_i, D'_{h+1}) = Cov(C'_i, D'_{h+1}) = 0$.
- Otherwise, $\forall i = 1..m$, $j = 1..h$: $Acc(C'_i, D'_j) = Acc(C_i, D_j)$ and $Cov(C'_i, D'_j) = Cov(C_i, D_j)$.

Case 2: Only a new conditional class is created (x joins the existing decision class $D_{j^*}$).

$Acc(C'_{m+1}, D'_{j^*}) = 1$ and $Cov(C'_{m+1}, D'_{j^*}) = \dfrac{1}{|D_{j^*}| + 1}$.

- If $k \ne j^*$ then: $Acc(C'_{m+1}, D'_k) = Cov(C'_{m+1}, D'_k) = 0$.
- If $i \ne m+1$ then: $Acc(C'_i, D'_{j^*}) = Acc(C_i, D_{j^*})$ and $Cov(C'_i, D'_{j^*}) = \dfrac{|C_i \cap D_{j^*}|}{|D_{j^*}| + 1}$.
- Otherwise, $\forall i \ne m+1$, $j \ne j^*$: $Acc(C'_i, D'_j) = Acc(C_i, D_j)$ and $Cov(C'_i, D'_j) = Cov(C_i, D_j)$.

Case 3: Only a new decision class is created (x joins the existing conditional class $C_{i^*}$).

$Acc(C'_{i^*}, D'_{h+1}) = \dfrac{1}{|C_{i^*}| + 1}$ and $Cov(C'_{i^*}, D'_{h+1}) = 1$.

- If $i \ne i^*$ then: $Acc(C'_i, D'_{h+1}) = Cov(C'_i, D'_{h+1}) = 0$.
- If $k \ne h+1$ then: $Acc(C'_{i^*}, D'_k) = \dfrac{|C_{i^*} \cap D_k|}{|C_{i^*}| + 1}$ and $Cov(C'_{i^*}, D'_k) = Cov(C_{i^*}, D_k)$.
- Otherwise, $\forall i \ne i^*$, $j \ne h+1$: $Acc(C'_i, D'_j) = Acc(C_i, D_j)$ and $Cov(C'_i, D'_j) = Cov(C_i, D_j)$.

Case 4: No new conditional class or decision class is created (x joins the existing classes $C_{i^*}$ and $D_{j^*}$).

$Acc(C'_{i^*}, D'_{j^*}) = \dfrac{|C_{i^*} \cap D_{j^*}| + 1}{|C_{i^*}| + 1}$ and $Cov(C'_{i^*}, D'_{j^*}) = \dfrac{|C_{i^*} \cap D_{j^*}| + 1}{|D_{j^*}| + 1}$.

- If $k \ne j^*$ then: $Acc(C'_{i^*}, D'_k) = \dfrac{|C_{i^*} \cap D_k|}{|C_{i^*}| + 1}$ and $Cov(C'_{i^*}, D'_k) = Cov(C_{i^*}, D_k)$.
- If $u \ne i^*$ then: $Acc(C'_u, D'_{j^*}) = Acc(C_u, D_{j^*})$ and $Cov(C'_u, D'_{j^*}) = \dfrac{|C_u \cap D_{j^*}|}{|D_{j^*}| + 1}$.
- If $i \ne i^*$ and $j \ne j^*$ then: $Acc(C'_i, D'_j) = Acc(C_i, D_j)$ and $Cov(C'_i, D'_j) = Cov(C_i, D_j)$.

3.2.2 Removing object x from the decision block

$Acc(C'_{i^*}, D'_{j^*}) = \dfrac{|C_{i^*} \cap D_{j^*}| - 1}{|C_{i^*}| - 1}$ and $Cov(C'_{i^*}, D'_{j^*}) = \dfrac{|C_{i^*} \cap D_{j^*}| - 1}{|D_{j^*}| - 1}$.

- If $k \ne j^*$ then: $Acc(C'_{i^*}, D'_k) = \dfrac{|C_{i^*} \cap D_k|}{|C_{i^*}| - 1}$ and $Cov(C'_{i^*}, D'_k) = Cov(C_{i^*}, D_k)$.
- If $u \ne i^*$ then: $Acc(C'_u, D'_{j^*}) = Acc(C_u, D_{j^*})$ and $Cov(C'_u, D'_{j^*}) = \dfrac{|C_u \cap D_{j^*}|}{|D_{j^*}| - 1}$.
- If $i \ne i^*$ and $j \ne j^*$ then: $Acc(C'_i, D'_j) = Acc(C_i, D_j)$ and $Cov(C'_i, D'_j) = Cov(C_i, D_j)$.
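The update cases for adding and removing a single object can be checked numerically by keeping only the counts $|C_i \cap D_j|$ and deriving accuracy and coverage from row and column sums. The Python sketch below is an illustration under assumptions, not the thesis's MDLB_OSC1 or MDLB_OSC2 code: the function names, the list-of-lists matrix layout (rows are conditional classes, columns are decision classes) and the starting counts are all hypothetical. Adding an object to existing classes raises $|C_{i^*} \cap D_{j^*}|$, $|C_{i^*}|$ and $|D_{j^*}|$ by one each; an index one past the end opens a new class.

```python
from fractions import Fraction


def acc_cov(sup):
    """Accuracy and coverage matrices derived from a matrix of counts |Ci ∩ Dj|."""
    row = [sum(r) for r in sup]          # |Ci| is the sum of row i
    col = [sum(c) for c in zip(*sup)]    # |Dj| is the sum of column j
    acc = [[Fraction(s, row[i]) for s in r] for i, r in enumerate(sup)]
    cov = [[Fraction(s, col[j]) for j, s in enumerate(r)] for r in sup]
    return acc, cov


def add_object(sup, i, j):
    """Counts after adding one object to conditional class i and decision class j.

    i == len(sup) opens a new conditional class, j == len(sup[0]) a new decision class.
    """
    new = [r[:] for r in sup]
    if j == len(new[0]):                 # new decision class: append a zero column
        for r in new:
            r.append(0)
    if i == len(new):                    # new conditional class: append a zero row
        new.append([0] * len(new[0]))
    new[i][j] += 1
    return new


def remove_object(sup, i, j):
    """Counts after removing one object from classes (i, j); emptied classes are dropped."""
    new = [r[:] for r in sup]
    new[i][j] -= 1
    new = [r for r in new if any(r)]     # drop conditional classes that became empty
    if not new:
        return []
    keep = [k for k in range(len(new[0])) if any(r[k] for r in new)]
    return [[r[k] for k in keep] for r in new]


# Hypothetical starting counts: |C1| = 3, |C2| = 1, |D1| = 2, |D2| = 2.
sup = [[2, 1],
       [0, 1]]

acc, cov = acc_cov(add_object(sup, 0, 0))   # no new class is created
print(acc[0][0], cov[0][0], acc[0][1], cov[0][1])
```

Keeping counts and recomputing the two ratios avoids updating each fraction separately; whether this matches the bookkeeping used in the thesis is not stated in the text.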
3.3 Algorithm for mining decision laws by incrementally calculating the Acc and Cov matrices after adding and removing objects (MDLB_OSC1)

- Step 1: Calculate the accuracy matrix $Acc(C, D)$ and the coverage matrix $Cov(C, D)$ of the block before adding and removing objects.
- Step 2: Incrementally calculate the accuracy matrix $Acc(C', D')$ and the coverage matrix $Cov(C', D')$ after adding and removing objects.
- Step 3: Remove the rows/columns of the matrices $Acc(C', D')$ and $Cov(C', D')$ whose values are 0.
- Step 4: Generate the decision laws on the block.

3.4 Complexity of the algorithm for mining decision laws by incrementally calculating the Acc and Cov matrices after adding and removing objects on the decision block

Proposition 3.5: The complexity of the algorithm for determining Acc and Cov is $O(|U|^2)$.

Proposition 3.6: The complexity of the algorithm for incrementally calculating Acc and Cov when adding $N$ objects is $O(N|U|^2)$.

Proposition 3.7: The complexity of the algorithm for incrementally calculating Acc and Cov when removing $M$ objects is $O(M|U|^2)$.

Proposition 3.8: The complexity of the algorithm for deleting the rows/columns of the Acc and Cov matrices whose values are 0 is $O(|U|^2)$.

3.5 Incremental calculation of Sup when adding and removing objects on the decision block

When adding $N$ objects and removing $M$ objects, we have:

$$Sup(C'_i, D'_j) = Sup(C_i, D_j) + N_{ij} - M_{ij}, \quad i = 1..m+p,\ j = 1..h+q,$$

where $M_{ij} = 0$ and $Sup(C_i, D_j) = 0$ for $i = m+1..m+p$, $j = h+1..h+q$.

3.6 Algorithm for mining decision laws by incrementally calculating the Sup matrix after adding and removing objects (MDLB_OSC2)

- Step 1: Calculate $Sup(C, D)$ of the block before adding and removing objects.
- Step 2: Incrementally calculate the support matrix $Sup(C', D')$ after adding and removing objects.
- Step 3: Delete the rows/columns of $Sup(C', D')$ whose values are 0.
- Step 4: Calculate $Acc(C', D')$ and $Cov(C', D')$ from the values of $Sup(C', D')$.
- Step 5: Generate the decision laws on the block.

3.7 Complexity of the algorithm for mining decision laws by incrementally calculating the Sup matrix after adding and removing objects on the decision block
Proposition 3.9: The complexity of the algorithm for incrementally calculating the Sup matrix when adding $N$ objects is $O(N|U|)$.

Proposition 3.10: The complexity of the algorithm for incrementally calculating the Sup matrix when removing $M$ objects is $O(M|U|)$.

Proposition 3.11: The complexity of the algorithm for incrementally calculating the Sup matrix to find decision laws when adding $N$ objects is $O(|U|^2)$.

Proposition 3.12: The complexity of the algorithm for incrementally calculating the Sup matrix when adding $N$ objects to the block's slice at $x \in id$ is $O(N|U|)$.

Proposition 3.13: The complexity of the algorithm for incrementally calculating the Sup matrix when removing $M$ objects from the block's slice at $x \in id$ is $O(M|U|)$.

3.10 Experimental algorithms

3.10.1 Experimental objectives

(1) Evaluate the execution of the MDLB and MDLB_VAC algorithms.

(2) Evaluate the execution of the MDLB_OSC1 and MDLB_OSC2 algorithms, and compare the running time of the MDLB_OSC1 algorithm with that of the MDLB_OSC2 algorithm.

3.10.2 Experimental data

The experiment was performed on 3 data sets taken from Pediatrics Department A, B of Bach Mai Hospital 2 from March 10, 2020 to March 14, 2020. The data were collected and preprocessed. Each data set includes 3 conditional index attributes, namely the disease symptoms fever, cough and runny nose, and 2 decision index attributes, the treatment regimen and the fever virus level, monitored over 4 days. The number of elements in each data set is:

| Database name | BVBM2KN A | BVBM2KN B | KID PATIENT FEVER VIRUS |
|---|---|---|---|
| Number of objects | 160 | 1360 | 939 |

Table 3.2: Basic information on the experimental data

3.10.3 Experimental tools and environment

The algorithms are implemented in Java. The experimental environment is a PC with an Intel(R) Core™ i5 2.5 GHz processor, 4 GB RAM and Windows 7.

3.10.4. Experimental result

After running the 3 algorithms on the data sets, we obtained the following results:

- Problem 1: find the decision laws on the block and slice.

Figure 3.4: Decision laws found on the block

When min_acc and min_cov are changed, the number of laws obtained also changes:

Figure 3.5: Relationship between the number of decision laws and the thresholds min_acc, min_cov

- Problem 2: find the decision laws on the block and slice when smoothing or roughening index attribute values.

Figure 3.8: Matrices Sup, Acc, Cov calculated before and after smoothing

Figure 3.10: Matrices Sup, Acc, Cov calculated before and after roughening

Figure 3.11: Decision laws found after smoothing and roughening attribute values

- Problem 3: find the decision laws on the block and slice when adding or removing objects.

+ Results of the MDLB_OSC1 algorithm:

+ Results of the MDLB_OSC2 algorithm:

Comment: Given the same source set, the two methods produce the same rule set; they differ only in execution time.

3.11 Conclusion

From the model of adding and removing objects on the decision block and slice, some properties of the Acc and Cov matrices have been demonstrated. Based on that, two algorithms for finding decision laws on the block and slice were proposed:

- Algorithm MDLB_OSC1 incrementally calculates the matrices Acc and Cov to find decision laws on the block and slice.
- Algorithm MDLB_OSC2 incrementally calculates the matrix Sup to find decision laws.

The chapter ends with a comparison of the two proposed algorithms and the experimental settings.

CONCLUSION

1) Main results of the thesis

The thesis focuses on the problem of mining decision laws on the block in some cases, with the following main results:

- Built a model of mining decision laws on the data block with proven definitions, theorems, and propositions.
- Proposed three algorithms to find decision laws on the data block in the following cases: the block data is fixed; the values of index attributes change; and the object set of the data block changes.

2) Future research of the thesis

- Continue to study mining decision laws in some other cases: the attributes of the block change, the data is incomplete, ...
- Mining decision laws on a chain of linked decision blocks (similar to blockchain technology).

NEW FINDINGS OF THE DOCTORAL DISSERTATION

This dissertation has two key contributions:

- Built a model of mining decision laws on the data block with proven definitions, theorems, and propositions.
- Proposed three algorithms to find decision laws on the data block in the following cases: the block data is fixed; the values of index attributes change; and the object set of the data block changes.

LIST OF WORKS OF AUTHOR

1. Thang Trinh Dinh, Tuyen Tran Minh, Lan Anh Do Thi, “Mining decision laws on data block has variable attribute values”, Proceedings of 19th Nati
