International Scientific Conference

...

2. BACKGROUND

2.1. Association Rule Mining Problem: Over the last decade, researchers have found that association rule mining (ARM) is one of the core processes of data mining. ARM discovers the relationships among frequent patterns, and it requires no supervision to do so. It operates on variable-length data and produces comprehensible results. Modern organizations often have a geographically distributed structure. Typically, every location stores its ever-increasing volume of daily data locally. For data organized in this way, centralized data mining cannot feasibly discover useful patterns because it incurs a large network communication cost. This is overcome by using distributed data mining.

2.2. Apriori Algorithm: Apriori is an ARM algorithm developed by IBM's Quest project group for mining association rules in large transaction databases. The group split the ARM problem into two parts.

1. Find all item sets in the data set whose transaction support is greater than the minimum support. These are called frequent item sets.

2. Generate the desired rules from these frequent item sets. Consider this example: if LMNO and LM are frequent item sets, then we can determine whether the rule LM ⇒ NO holds by computing the ratio R = support(LMNO) / support(LM). The rule holds only if R ≥ minimum confidence. Note that the rule will have minimum support because LMNO is frequent. The method is highly adaptable. The procedure of the Apriori method is given below.
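The two parts described above can be sketched in Python. This is a minimal illustration and not the authors' procedure: the function names `apriori` and `confidence`, the sample transactions, and the use of absolute counts with a "greater than or equal to" threshold are all assumptions made for the example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset with count >= min_support.

    min_support is an absolute transaction count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {c for c in joined
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def confidence(frequent, antecedent, consequent):
    """R = support(antecedent + consequent) / support(antecedent)."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Hypothetical transactions over the items L, M, N, O.
transactions = [{"L", "M", "N", "O"}, {"L", "M", "N", "O"},
                {"L", "M"}, {"L", "M", "N"}, {"N", "O"}]
frequent = apriori(transactions, min_support=2)
r = confidence(frequent, "LM", "NO")
print(r)  # the rule LM => NO holds if r >= minimum confidence
```

With these sample transactions, LM occurs in four transactions and LMNO in two, so R for the rule LM ⇒ NO is 2/4 = 0.5.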

2.3. Distributed Association Rule Mining: Distributed association rule mining (DARM) finds rules from different spatial data sets located in a distributed environment. However, a distributed network does not offer communication as fast as a parallel environment, so distributed mining usually aims to limit the communication cost. Researchers proposed the Fast Distributed Mining (FDM) algorithm to mine rules from distributed data sets partitioned among three different sites. At each site, FDM finds the local support counts and prunes all locally infrequent item sets. After completing local pruning, each site broadcasts messages to every other site to request their support counts. It then decides whether the large item sets are globally frequent and generates the candidate item sets from those globally frequent item sets.
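The local pruning and global counting steps can be illustrated with a short single-process simulation. This is a sketch under stated assumptions, not the FDM implementation: the function names, the sample sites, and the use of a relative support threshold (`min_ratio`) are hypothetical. It relies on the fact that an item set that is globally frequent must be locally frequent at one site at least, so item sets that no site reports can be skipped.

```python
def local_count(site, itemset):
    """Number of transactions at one site that contain the itemset."""
    return sum(1 for t in site if itemset <= t)

def globally_frequent(sites, candidates, min_ratio):
    """Return {itemset: global count} for candidates that are globally frequent.

    sites      -- list of transaction lists, one per site (simulated locally)
    min_ratio  -- minimum support as a fraction of transactions (assumption)
    """
    # Local pruning: each site keeps only candidates frequent in its own data.
    locally_large = set()
    for site in sites:
        threshold = min_ratio * len(site)
        locally_large |= {c for c in candidates if local_count(site, c) >= threshold}

    # Count exchange: every site reports its count for the surviving candidates.
    total = sum(len(site) for site in sites)
    result = {}
    for c in locally_large:
        global_count = sum(local_count(site, c) for site in sites)
        if global_count >= min_ratio * total:
            result[c] = global_count
    return result

# Three hypothetical sites with their own transactions.
sites = [
    [{"L", "M"}, {"L", "M", "N"}, {"N"}],
    [{"L", "M"}, {"O"}],
    [{"N", "O"}, {"L"}, {"M"}, {"O"}],
]
candidates = [frozenset("LM"), frozenset("NO"), frozenset("LN")]
print(globally_frequent(sites, candidates, min_ratio=0.3))
```

In this example NO is pruned locally at every site, so no site requests its counts; LN survives local pruning at the first site but fails the global threshold.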

3. ANALYSIS OF DATA IN DISTRIBUTED ENVIRONMENT

Data mining is a technique for retrieving useful information from large databases. It has two main goals: prediction and description. Several data mining algorithms are available for this purpose, such as ARM, clustering, and classification. In this work we use the concept of SNARM in a geographical region, that is, spatial association rule mining, in which data is retrieved from geographical areas. Spatial association rule mining is used to find the relationships between different attributes, subject to threshold values of support and confidence, and to calculate the frequent item sets in the distributed environment.

In this process, we divide the entire region into three regions, each having its own spatial database SDB1, SDB2, ..., SDBn and its own key value SK1, SK2, ..., SKn; more generally, select N regions, each having its own database SDB1, SDB2, ..., SDBn. Every region computes its frequent item sets and their support values. The regions are arranged in a ring architecture and then compute the partial support. Region 1 sends its Partial Support (PS) value to region 2, region 2 sends its value to region 3, and this procedure continues until region n, after which region n sends its value to region 1. Region 1 subtracts all the random number values from the Partial Support value and calculates the actual support. Region 1 then broadcasts the actual support value to all regions in the distributed environment.
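The ring exchange can be simulated in a few lines of Python. This is only a sketch under assumptions: the text does not specify how each region combines its local support with its random value, nor how region 1 learns the random values it subtracts, so the example assumes each region adds its local support count plus its own random key to the running partial support, and that the sum of the keys is available to region 1. The function name `ring_support` and the sample counts are hypothetical.

```python
import random

def ring_support(local_counts, keys):
    """Simulate one pass around the ring.

    local_counts -- local support count of one itemset at regions 1..n
    keys         -- random value of each region (assumed additive masks)
    Returns (messages, actual_support), where messages[i] is the partial
    support that region i+1 forwards to its successor on the ring.
    """
    partial = 0
    messages = []
    for count, key in zip(local_counts, keys):
        # Each region adds its masked local count to the value it received.
        partial += count + key
        messages.append(partial)
    # Region n returns the total to region 1, which removes the random
    # values (assumed known to it here) to obtain the actual support.
    actual = partial - sum(keys)
    return messages, actual

# Three hypothetical regions and their local support counts for one itemset.
local_counts = [12, 7, 9]
keys = [random.randint(1, 1000) for _ in local_counts]
messages, actual = ring_support(local_counts, keys)
print(messages)  # masked partial supports seen on the ring
print(actual)    # sum of the local counts; region 1 would broadcast this
```

Because every forwarded value includes the random keys, a region that sees only the incoming partial support cannot read off its predecessor's local count directly.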
