
The world's population has been increasing steadily. Densely populated countries such as India struggle to supply the basic needs of their people. Food is one of the basic needs that any country has to fulfil. Agriculture is one of the major sectors, on which one third of the Indian population depends. In irrigation-based countries such as India, water is the basic resource that drives plant growth. The main source of irrigation water is rainfall, which is scientifically a liquid form of precipitation. Atmospheric rain clouds are responsible for this precipitation. Prediction of precipitation is necessary, as it has to be considered during the financial planning of a country. The meteorological departments of every country are keen on recording precipitation datasets, which are huge in content. Hence, data mining is an apt tool for extracting the relations between the datasets and their attributes. Supervised Learning In Quest (SLIQ) is one such data mining algorithm; it builds a decision tree that predicts precipitation from historical data. The SLIQ decision tree using gain ratio is a statistical analysis for establishing the relation between the attribute set and precipitation, and it delivers predictions with an accuracy of 77.78%.

Keywords: Data Mining; Decision Tree; Meteorology; Precipitation; Prediction; Rainfall; SLIQ

Introduction

The growth of population is one of the major factors that adversely affect the economic growth of a country. It is essential to ensure adequate infrastructure for providing the basic needs of the growing population. The agricultural sector provides most of the raw materials required for making the products that meet these basic needs. Agricultural productivity evidently depends on water availability, and precipitation is the primary source of water. Precipitation comes from thick layers of clouds in the atmosphere that have attained the melting point [26]. The prediction of precipitation forms a basis for planning the economy with improved accuracy. Hence, there is a need to propose models that improve the accuracy of precipitation prediction.

A mathematical model is an abstract representation of a real-life problem situation. Many mathematical models that represent real-life problem situations are complex. Hence, solving such models involves performing a large number of arithmetic and logical operations on the related data. The invention of the computer improved the accuracy and reduced the time needed to perform those operations. The prediction of precipitation is a complex and uncertain phenomenon that results in complex mathematical models. Most prediction models employ huge amounts of historical data. Here, data mining can be used to predict precipitation more accurately.

Data mining tools that can be employed in the field of prediction include artificial neural networks, genetic algorithms, rule-based induction, the nearest neighbour method, memory-based reasoning, linear discriminant analysis and decision trees. The success rate for precipitation prediction using different data mining tools reported in the literature is 43.6% [29]. Recently, Prasad et al. proposed using the Supervised Learning In Quest (SLIQ) decision tree with the Gini index for precipitation prediction, which resulted in an accuracy of 72.3% [2]. This paper proposes using the SLIQ decision tree with gain ratio, which improves the accuracy from 72.3% to 77.78%.

The rest of the paper is organized as follows. Section II describes relevant work. Section III provides information about decision trees. Section IV gives a brief description of the SLIQ decision tree algorithm. Section V describes the rules for the decision tree. Section VI describes the experimental results. Section VII presents the conclusions, and Section VIII outlines future enhancements.

Relevant Work

Research is a continuous process. Anyone who imagines that research in a field is complete has to reconsider that statement, because research continues beyond that point. In the literature, many research findings have been reported on predicting precipitation as accurately as possible. Some of them use traditional artificial neural network methods, while others include recent developments such as image processing, linear regression and fuzzy logic.

Frank Silvio Marzano, Giancarlo Rivolta, Erika Coppola, Barbara Tomassetti and Marco Verdecchia used a fully neural network approach to rainfall field nowcasting from on-board infrared and microwave passive-sensor imagery [6]. K. Richards and G.D. Sullivan combined the features of a Bayesian scheme for texture analysis of cloud images taken from the ground [7]. C. Jareanpon, W. Pensuwon, R.J. Frank and N. Davey formed a radial basis function neural network with a specially designed genetic algorithm [8]. K. Ochiai, H. Suzuki, S. Suzuki, N. Sonehara and Y. Tokunaga stated that the computational time for learning with an acceleration algorithm can be reduced by about 10 percent by introducing a pruning algorithm [9]. I.F. Grimes, E. Coppola, M. Verdecchia and G. Visconti presented an approach in which cold cloud duration imagery derived from Meteosat thermal infrared imagery is used in conjunction with numerical weather model analysis data as input to an ANN [10]. Thiago N. de Castro, Francisco Souza, Jose M.B. Alves, Ricardo S.T. Ponss, Mosefran B.M. Firmino and Thiago M. de Pereria forecasted seasonal rainfall using a neo-fuzzy neuron model [11]. Tuan Zea Tan, Gary Kee Khoon Lee, Shie-Yui Liong, Tian Kuay Lim, Jiawei Chu and Terence Hung treated the rainfall series as a continuous time series [12]. Jiansheng Wu integrated linear regression with an ANN, where the linear regression extracts the linear characteristics of the rainfall [13]. Hui Qi, Ming Zhang and Roderick A. Scofield developed a Multi-Polynomial High Order Neural Network (M-PHONN) [14]. Wint Thida Zaw and Thinn Thu Naing stated that multi-variable polynomial regression (MPR) is one of the statistical regression methods used to describe complex nonlinear input and output relationships [15]. C. Kidd and V. Levizzani stated that rainfall is spatially and temporally highly variable [16]. Sanjay D. Sawaitul, Prof. K.P. Wagh and Dr. P.N. Chatur used weather parameters such as wind direction, wind speed, humidity, rainfall and temperature for the classification and prediction of future weather using the back propagation algorithm [17]. Soroosh Sorooshian, Kuo-lin Hsu, Bisher Imam and Yang Hong made global precipitation estimates from satellite images using artificial neural networks [18]. Kesheng Lu and Lingzhi Wang used a bagging sampling technique to generate the training sets for a combination model based on support vector machines for rainfall prediction [19]. Grant W. Petty and Witold F. Krajewski discussed methods based on infrared, visible and passive microwave radiation measurements [20].

Decision Tree

A decision tree is an advanced knowledge discovery process with minimal time complexity that is easy to implement. It establishes relationships between various datasets by discovering the hidden patterns in datasets that are huge and complex [3, 4], [26, 27]. As the saying goes, "The only way to get more accuracy is to do more research", which indicates that more and more research has to be done to obtain more accurate results. However, the research should be carried out keeping the cost factor in mind. Hence, scientists have been improving decision tree algorithms. The use of decision trees has grown from ordinary statistical analysis to an effective tool in data mining, text mining, information retrieval, pattern recognition and so on.

The attributes referred to in Table I are humidity, temperature, pressure, wind speed and dew point. Humidity is the amount of water vapour in the air, which is invisible in nature. Temperature is the degree of hotness or coldness of a body or environment, and it is measured in degrees centigrade (°C). Atmospheric pressure is the force per unit area exerted against the ground surface by the weight of the air above it, and it is measured in bars. Wind speed is the velocity at which the wind is flowing; it is measured in metres per second with an anemometer. Pressure gradient, Rossby waves, jet streams and local weather conditions mainly affect the wind speed, which can lead to destruction. Dew point is the temperature at which the air in the atmosphere can no longer hold all of the water vapour mixed with it, so that some of the water vapour must condense into liquid water.

It is an established fact that precipitation generally depends on various attributes such as humidity, temperature, pressure and wind speed. Consider a dataset with the attributes humidity, temperature, pressure, wind speed and dew point, which influence rainfall, together with the class label, as given in Table I. A decision tree constructed for the data in Table I is shown in Fig. 1.

Table I shows 30 days of data on humidity, temperature, pressure, wind speed and dew point, along with the class label. This is a part of the data from the Indian Meteorological Department (IMD) covering 15 years.

The decision tree is an upside-down tree whose root node represents the entire dataset, which is partitioned into various branches. The leaves of the branches represent the class label, as shown in Fig. 1.

TABLE I. Training Dataset

Day | Humidity (H) | Temperature (T) | Pressure (P) | Wind Speed (W) | Dew Point (D) | Class
1 | 97 | 24 | 1005 | 14 | 21 | Rain
2 | 85 | 26 | 1004 | 16 | 21 | No Rain
3 | 91 | 27 | 1004 | 14 | 21 | Rain
4 | 82 | 27 | 1006 | 16 | 20 | Rain
5 | 81 | 26 | 1007 | 18 | 19 | No Rain
6 | 95 | 26 | 1007 | 18 | 20 | Rain
7 | 95 | 26 | 1007 | 16 | 20 | Rain
8 | 93 | 26 | 1008 | 18 | 21 | Rain
9 | 87 | 24 | 1005 | 13 | 21 | Rain
10 | 88 | 24 | 1005 | 11 | 21 | Rain
11 | 80 | 26 | 1005 | 14 | 21 | Rain
12 | 89 | 26 | 1005 | 14 | 21 | Rain
13 | 86 | 27 | 1006 | 14 | 21 | No Rain
14 | 86 | 28 | 1007 | 10 | 22 | Rain
15 | 94 | 27 | 1006 | 14 | 21 | Rain
16 | 88 | 26 | 1004 | 13 | 21 | No Rain
17 | 92 | 27 | 1005 | 13 | 21 | Rain
18 | 86 | 27 | 1007 | 11 | 21 | Rain
19 | 82 | 27 | 1006 | 11 | 21 | Rain
20 | 76 | 27 | 1007 | 14 | 19 | No Rain
21 | 79 | 27 | 1008 | 11 | 20 | No Rain
22 | 75 | 27 | 1008 | 13 | 20 | No Rain
23 | 84 | 27 | 1007 | 13 | 20 | No Rain
24 | 88 | 26 | 1006 | 11 | 21 | Rain
25 | 86 | 25 | 1005 | 16 | 19 | Rain
26 | 78 | 28 | 1006 | 13 | 21 | No Rain
27 | 79 | 27 | 1008 | 13 | 19 | No Rain
28 | 80 | 28 | 1008 | 8 | 20 | No Rain
29 | 84 | 29 | 1009 | 6 | 21 | No Rain
30 | 76 | 27 | 1009 | 6 | 22 | Rain

TABLE II. Notations Used in the SLIQ Algorithm

Symbol | Description
D | Set of training tuples with associated class labels
Dj | The set of data tuples in D satisfying outcome j
|D| | The number of training tuples in D
C | The class label
Entropy(D) | The information needed to classify a tuple in D
Splitinfo(V) | Normalization to information gain
Split point | Midpoint of Vi and Vi+1
V | An attribute list
Vi | Set of values in attribute V
Vi+1 | Changed class value in attribute V
Pi | The probability that a tuple in D belongs to class Ci
Di | Values which are greater than or equal to the split point
Dj | Values which are less than the split point

The criterion for partitioning the dataset at each level is explained in the next section. Decision trees can be used for a dataset whether it is continuous or discrete. The category of the dataset that is taken into account is called the class label. One of the attributes becomes the root node of the decision tree, whereas the class label forms the leaf nodes, as shown in Fig. 1. Knowledge-based mining is not so effective in establishing temporal attribute relationships.

SLIQ Decision Tree Algorithm

The decision tree classifier SLIQ [1] can handle numeric as well as categorical attributes. It employs a pre-sorting technique to reduce the cost of evaluating numeric attributes during the tree-growth phase. Further, SLIQ employs a tree pruning algorithm based on the Minimum Description Length (MDL) principle. It is reported that the SLIQ algorithm is inexpensive and results in compact and accurate trees [1]. SLIQ ensures scalability in classifying large datasets consisting of a large number of classes and attributes.

In the construction of the decision tree, gain ratio is evaluated at every successive midpoint of the attribute values. However, the efficiency of the SLIQ decision tree algorithm can be improved by evaluating gain ratio only at the midpoints of the attribute values where the class information changes. The algorithm for the construction of the SLIQ decision tree for the prediction of precipitation is presented below. The notations used are given in Table II.

Overview of SLIQ decision tree growth and split points

1. Read the dataset into the root node of the SLIQ decision tree.

2. Generate an attribute list for each attribute of the dataset.

3. Sort the attribute lists on attribute value in non-decreasing order.

4. Calculate the entropy for the root node:

$Entropy(D) = -\sum_{i} P_i \log_2 P_i$ (1)

5. Calculate the Info of attribute list V:

$Info(V) = \frac{|D_i|}{|D|} Entropy(D_i) + \frac{|D_j|}{|D|} Entropy(D_j)$ (2)

6. Calculate the Gain for each attribute list:

$Gain(V) = Entropy(D) - Info(V)$ (3)

7. Compute the split information for the sets of attribute values Di and Dj:

$Splitinfo(V) = -\frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|} - \frac{|D_j|}{|D|} \log_2 \frac{|D_j|}{|D|}$ (4)

8. Determine the Gain Ratio for the attribute values in attribute list V:

$GainRatio(V) = Gain(V) / Splitinfo(V)$ (5)

9. Determine the maximum gain ratio among the gain ratios; it becomes the basis for the best split, as shown in Table III:

Best Split = maximum Gain Ratio value of the attribute (6)

10. Partition the root node into leaf nodes based on the best split point.

11. Repeat steps 5 through 10, treating each leaf node as a new root node, until every leaf node contains a single class label.
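The gain ratio computation in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the standard entropy, information gain and split information definitions for a binary split, and the function names `entropy` and `gain_ratio` are hypothetical. The class counts in the example were tallied from Table I for a humidity split at 86.0.

```python
import math

def entropy(counts):
    """Entropy in bits of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def gain_ratio(dj_counts, di_counts):
    """Gain ratio of a binary split.

    dj_counts: class counts of the tuples below the split point (Dj)
    di_counts: class counts of the tuples at or above the split point (Di)
    """
    n_j, n_i = sum(dj_counts), sum(di_counts)
    n = n_j + n_i
    if n_j == 0 or n_i == 0:
        # Assumption: a split that leaves one side empty carries no information.
        return 0.0
    parent = [a + b for a, b in zip(dj_counts, di_counts)]
    info = (n_j / n) * entropy(dj_counts) + (n_i / n) * entropy(di_counts)
    gain = entropy(parent) - info
    split_info = entropy([n_j, n_i])  # entropy of the partition sizes
    return gain / split_info

# Humidity split at 86.0 in Table I, counts given as [Rain, No Rain]:
# below 86 there are 4 Rain and 10 No Rain days, at or above 86 there are 14 and 2.
ratio = gain_ratio([4, 10], [14, 2])
print(round(ratio, 3))
```

Under these assumptions the result is close to the 0.2791 listed for humidity in Step 1 of Table III.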

The primary metric for evaluating the prediction of precipitation is accuracy: the accuracy of a predictor refers to how well it predicts the value of the target attribute for new or previously unseen data.

Accuracy = Correct predictions / Total predictions (7)

The ideal goal is to produce compact, accurate trees in a short time with scalability. The SLIQ decision tree algorithm used for the prediction of precipitation takes N input attributes and N class labels as input and produces the decision tree along with the rules.

The simulated tree shown in Fig. 1 consists of 13 leaf nodes, of which 7 depict rain and the remaining 6 depict no rain. In Fig. 1, NR indicates no rain and R indicates rain.

Fig. 1. Gain Ratio Based Decision Tree

TABLE III. Gain Ratio Based Split Value for Various Attributes

Iteration | Humidity Split Value | Humidity Gain Ratio | Temperature Split Value | Temperature Gain Ratio | Pressure Split Value | Pressure Gain Ratio | Wind Speed Split Value | Wind Speed Gain Ratio | Dew Point Split Value | Dew Point Gain Ratio
Step 1 | 86.0 | 0.2791 | 28.5 | 0.2149 | 1007.5 | 0.1146 | 9.0 | 0.0495 | 20.0 | 0.1029
Step 2 | 83.0 | 0.1602 | 27.5 | 0.1602 | 1007.0 | 0.2050 | 17.0 | 0.0976 | 20.0 | 0.1602
Step 3 | 83.0 | 0.4459 | 27.5 | 0.4459 | 1006.0 | 0.205 | 13.0 | 0.2367 | 20.5 | 0.2367
Step 4 | 83.0 | 1.00 | 27.0 | 0.3112 | 1006.0 | 0.3112 | 15.0 | 0.3112 | 20.5 | 0.1510
Step 5 | 77.0 | 0.2147 | 27.5 | 0.0563 | 1008.5 | 0.3677 | 7.0 | 0.3677 | 20.5 | 0.3677
Step 6 | 83.0 | -1.0 | 27.5 | 1.0 | 1007.5 | 1.0 | 15.0 | 1.0 | 20.5 | -1.0
Step 7 | 88.0 | 0.0176 | 25.5 | 0.0690 | 1007.0 | 0.0817 | 15.0 | 0.0690 | 20.5 | 0.0579
Step 8 | 88.0 | 0.0452 | 25.5 | 0.1425 | 1006.0 | 0.0452 | 13.0 | 0.0859 | 20.5 | 0.0631
Step 9 | 88.0 | 0.5171 | 27.0 | 0.0060 | 1006.0 | 0.0060 | 13.0 | 0.1284 | 20.5 | -1.0
Step 10 | 83.0 | -1.0 | 27.0 | 0.1980 | 1006.0 | 0.1188 | 13.5 | 0.1908 | 20.5 | -1.0
Step 11 | 83.0 | -1.0 | 27.0 | 0.2740 | 1006.0 | 0.2740 | 13.0 | 0.2740 | 20.5 | -1.0
Step 12 | 83.0 | -1.0 | 27.0 | 1.0 | 1007.5 | -1.0 | 15.0 | -1.0 | 20.5 | -1.0

Rules for Decision Tree

Once the decision tree is constructed, it may be too large to understand easily. Hence, rules are generated to simplify the understanding of a large decision tree.

Rule 1: If [ (humidity < 86.0) and (pressure < 1007.0) and (temperature < 27.5) and (humidity < 83.0) ] Then (Prediction = Rain)

Rule 2: If [ (humidity < 86.0) and (pressure < 1007.0) and (temperature < 27.5) and (humidity >= 83.0) ] Then (Prediction = NoRain)

Rule 3: If [ (humidity < 86.0) and (pressure < 1007.0) and (temperature >= 27.5) ] Then (Prediction = NoRain)

Rule 4: If [ (humidity < 86.0) and (pressure >= 1007.0) and (dew-point < 20.5) ] Then (Prediction = NoRain)

Rule 5: If [ (humidity < 86.0) and (pressure >= 1007.0) and (dew-point >= 20.5) and (temperature < 27.5) ] Then (Prediction = Rain)

Rule 6: If [ (humidity < 86.0) and (pressure >= 1007.0) and (dew-point >= 20.5) and (temperature >= 27.5) ] Then (Prediction = NoRain)

Rule 7: If [ (humidity >= 86.0) and (pressure < 1007.0) and (temperature < 25.5) ] Then (Prediction = Rain)

Rule 8: If [ (humidity >= 86.0) and (pressure < 1007.0) and (temperature >= 25.5) and (humidity < 88.0) ] Then (Prediction = NoRain)

Rule 9: If [ (humidity >= 86.0) and (pressure < 1007.0) and (temperature >= 25.5) and (humidity >= 88.0) and (wind-speed < 13.5) and (wind-speed < 13.0) ] Then (Prediction = Rain)

Rule 10: If [ (humidity >= 86.0) and (pressure < 1007.0) and (temperature >= 25.5) and (humidity >= 88.0) and (wind-speed < 13.5) and (wind-speed >= 13.0) and (temperature < 27.0) ] Then (Prediction = NoRain)

Rule 11: If [ (humidity >= 86.0) and (pressure < 1007.0) and (temperature >= 25.5) and (humidity >= 88.0) and (wind-speed < 13.5) and (wind-speed >= 13.0) and (temperature >= 27.0) ] Then (Prediction = Rain)

Rule 12: If [ (humidity >= 86.0) and (pressure < 1007.0) and (temperature >= 25.5) and (humidity >= 88.0) and (wind-speed >= 13.5) ] Then (Prediction = Rain)

Rule 13: If [ (humidity >= 86.0) and (pressure >= 1007.0) ] Then (Prediction = Rain)
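For illustration, the thirteen rules can be written as one nested conditional. The sketch below is an assumption about how the rule set might be encoded, not code from the paper; the function name `predict` and its parameter names are hypothetical, and the labels follow Table I ("Rain", "No Rain").

```python
def predict(humidity, temperature, pressure, wind_speed, dew_point):
    """Apply Rules 1-13 and return 'Rain' or 'No Rain'."""
    if humidity < 86.0:
        if pressure < 1007.0:
            if temperature < 27.5:
                # Rule 1 (humidity < 83.0) or Rule 2 (humidity >= 83.0)
                return "Rain" if humidity < 83.0 else "No Rain"
            return "No Rain"  # Rule 3
        if dew_point < 20.5:
            return "No Rain"  # Rule 4
        # Rule 5 (temperature < 27.5) or Rule 6 (temperature >= 27.5)
        return "Rain" if temperature < 27.5 else "No Rain"
    if pressure >= 1007.0:
        return "Rain"  # Rule 13
    if temperature < 25.5:
        return "Rain"  # Rule 7
    if humidity < 88.0:
        return "No Rain"  # Rule 8
    if wind_speed >= 13.5:
        return "Rain"  # Rule 12
    if wind_speed < 13.0:
        return "Rain"  # Rule 9
    # Rule 10 (temperature < 27.0) or Rule 11 (temperature >= 27.0)
    return "No Rain" if temperature < 27.0 else "Rain"

# Day 1 and Day 29 of Table I
print(predict(97, 24, 1005, 14, 21))
print(predict(84, 29, 1009, 6, 21))
```

Each path through the conditional corresponds to one rule, so the function has 13 exits, matching the 13 leaves of the tree.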

Experimental Results

The data taken for training needs to be sorted during the initial stage of the tree-growth phase of decision tree construction [3]. In the training data, humidity is the first attribute. Take the humidity attribute and its corresponding class label as a pair, and identify a split point wherever the class label changes. The best split point needs to be found to increase the prediction accuracy. For every split point identified, find the midpoint of the two values at which the class label changes, and continue until the end of the data is reached, as shown in Table IV.

From Table IV it is clearly visible that the class label changes for the first time at the 3rd position. Mark it as a split point and take the midpoint of the 2nd and 3rd attribute values, i.e. midpoint(76, 76) = 76. Similarly, the second split point occurs at the 4th position. Mark it as a split point and take the midpoint of the 3rd and 4th attribute values, i.e. midpoint(76, 78) = 77. Continuing in this order, there are nine split points, as the class label changes at nine positions.

Repeat the procedure to find the split points for the attribute temperature shown in Table V, the attribute pressure shown in Table VI, the attribute wind speed shown in Table VII and the attribute dew point shown in Table VIII.
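The split point procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `candidate_split_points` is hypothetical, and ties between equal attribute values are ordered with "No Rain" before "Rain", which is the order the sorted humidity data appears to follow. The humidity values and class labels are taken from Table I.

```python
def candidate_split_points(values, labels):
    """Midpoints of adjacent sorted values where the class label changes."""
    pairs = sorted(zip(values, labels))  # ties are ordered by label text
    points = []
    for (v_prev, c_prev), (v_next, c_next) in zip(pairs, pairs[1:]):
        if c_prev != c_next:
            points.append((v_prev + v_next) / 2)
    return points

# Humidity and class label for days 1-30 of Table I (R = Rain, N = No Rain)
humidity = [97, 85, 91, 82, 81, 95, 95, 93, 87, 88,
            80, 89, 86, 86, 94, 88, 92, 86, 82, 76,
            79, 75, 84, 88, 86, 78, 79, 80, 84, 76]
codes = "RNRRNRRRRRRRNRRNRRRNNNNRRNNNNR"
labels = ["Rain" if c == "R" else "No Rain" for c in codes]

points = candidate_split_points(humidity, labels)
print(points)
```

With this tie ordering the function yields nine candidate split points for humidity, in agreement with the count given in the text.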

TABLE IV. Dataset sorted on humidity

Humidity | Split Point | Class
75 | - | No Rain
76 | - | No Rain
76 | 76 | Rain
78 | 77 | No Rain
79 | - | No Rain
79 | - | No Rain
80 | - | No Rain
80 | 80.0 | Rain
81 | 80.5 | No Rain
82 | 81.5 | Rain
82 | - | Rain
84 | 83.0 | No Rain
84 | - | No Rain
85 | - | No Rain
86 | - | No Rain
86 | 86.0 | Rain
86 | - | Rain
86 | - | Rain
87 | - | Rain
88 | 87.5 | No Rain
88 | 88.0 | Rain
88 | - | Rain
89 | - | Rain
91 | - | Rain
92 | - | Rain
93 | - | Rain
94 | - | Rain
95 | - | Rain
95 | - | Rain
97 | - | Rain

TABLE V. Dataset sorted on temperature

Temperature | Split Point | Class
24 | - | Rain
24 | - | Rain
24 | - | Rain
25 | - | Rain
26 | 25.5 | No Rain
26 | - | No Rain
26 | - | No Rain
26 | 26 | Rain
26 | - | Rain
26 | - | Rain
26 | - | Rain
26 | - | Rain
26 | - | Rain
27 | 26.5 | No Rain
27 | - | No Rain
27 | - | No Rain
27 | - | No Rain
27 | - | No Rain
27 | - | No Rain
27 | 27 | Rain
27 | - | Rain
27 | - | Rain
27 | - | Rain
27 | - | Rain
27 | - | Rain
27 | - | Rain
28 | 27.5 | No Rain
28 | - | No Rain
28 | 28.0 | Rain
29 | 28.5 | No Rain

TABLE VI. Dataset sorted on pressure

Pressure | Split Point | Class
1004 | - | No Rain
1004 | - | No Rain
1004 | 1004 | Rain
1005 | - | Rain
1005 | - | Rain
1005 | - | Rain
1005 | - | Rain
1005 | - | Rain
1005 | - | Rain
1005 | - | Rain
1006 | 1005.5 | No Rain
1006 | - | No Rain
1006 | 1006 | Rain
1006 | - | Rain
1006 | - | Rain
1006 | - | Rain
1007 | 1006.5 | No Rain
1007 | - | No Rain
1007 | - | No Rain
1007 | 1007 | Rain
1007 | - | Rain
1007 | - | Rain
1007 | - | Rain
1008 | 1007.5 | No Rain
1008 | - | No Rain
1008 | - | No Rain
1008 | - | No Rain
1008 | 1008 | Rain
1009 | 1008.5 | No Rain
1009 | 1009 | Rain

TABLE VII. Dataset sorted on wind speed

Wind Speed | Split Point | Class
6 | - | No Rain
6 | 6 | Rain
8 | 7 | No Rain
10 | 9 | Rain
11 | 10.5 | No Rain
11 | 11 | Rain
11 | - | Rain
11 | - | Rain
11 | - | Rain
13 | 12 | No Rain
13 | - | No Rain
13 | - | No Rain
13 | - | No Rain
13 | - | No Rain
13 | 13 | Rain
13 | - | Rain
14 | 13.5 | No Rain
14 | - | No Rain
14 | 14 | Rain
14 | - | Rain
14 | - | Rain
14 | - | Rain
14 | - | Rain
16 | 15 | No Rain
16 | 16 | Rain
16 | - | Rain
16 | - | Rain
18 | 17 | No Rain
18 | 18 | Rain
18 | - | Rain

TABLE VIII. Dataset sorted on dew point

Dew Point | Split Point | Class
19 | - | No Rain
19 | - | No Rain
19 | - | No Rain
19 | 19 | Rain
20 | 19.5 | No Rain
20 | - | No Rain
20 | - | No Rain
20 | - | No Rain
20 | 20 | Rain
20 | - | Rain
20 | - | Rain
21 | 20.5 | No Rain
21 | - | No Rain
21 | - | No Rain
21 | - | No Rain
21 | - | No Rain
21 | 21 | Rain
21 | - | Rain
21 | - | Rain
21 | - | Rain
21 | - | Rain
21 | - | Rain
21 | - | Rain
21 | - | Rain
21 | - | Rain
21 | - | Rain
21 | - | Rain
21 | - | Rain
22 | - | Rain
22 | - | Rain

Now, compare the gain ratio values of all the split points; the maximum value identifies the best split point for that attribute, as shown in Table III. The gain value obtained for the attribute is divided by the split info value to obtain the gain ratio for that attribute, as shown in equation (9).

Gain Ratio (V) = Gain (V) / Split info (V) (9)

Repeat the above procedure for the temperature, pressure, wind speed and finally dew point attributes, each paired with the class label, to obtain the best split points. Choose the maximum gain ratio value; the corresponding attribute becomes the root node. The tree is generated from the threshold value of the root node. Repeat the procedure until each branch terminates in a unique class label.
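The root node selection can be illustrated with a short script over the 30 rows of Table I. This is a sketch under assumptions, not the authors' code: the helper names `entropy`, `gain_ratio` and `best_split` are hypothetical, tuples below the split point go to one partition and tuples at or above it to the other, and a split that leaves one partition empty is given a gain ratio of 0.

```python
import math

# Table I rows: (humidity, temperature, pressure, wind speed, dew point, class)
ROWS = [
    (97, 24, 1005, 14, 21, "Rain"), (85, 26, 1004, 16, 21, "No Rain"), (91, 27, 1004, 14, 21, "Rain"),
    (82, 27, 1006, 16, 20, "Rain"), (81, 26, 1007, 18, 19, "No Rain"), (95, 26, 1007, 18, 20, "Rain"),
    (95, 26, 1007, 16, 20, "Rain"), (93, 26, 1008, 18, 21, "Rain"), (87, 24, 1005, 13, 21, "Rain"),
    (88, 24, 1005, 11, 21, "Rain"), (80, 26, 1005, 14, 21, "Rain"), (89, 26, 1005, 14, 21, "Rain"),
    (86, 27, 1006, 14, 21, "No Rain"), (86, 28, 1007, 10, 22, "Rain"), (94, 27, 1006, 14, 21, "Rain"),
    (88, 26, 1004, 13, 21, "No Rain"), (92, 27, 1005, 13, 21, "Rain"), (86, 27, 1007, 11, 21, "Rain"),
    (82, 27, 1006, 11, 21, "Rain"), (76, 27, 1007, 14, 19, "No Rain"), (79, 27, 1008, 11, 20, "No Rain"),
    (75, 27, 1008, 13, 20, "No Rain"), (84, 27, 1007, 13, 20, "No Rain"), (88, 26, 1006, 11, 21, "Rain"),
    (86, 25, 1005, 16, 19, "Rain"), (78, 28, 1006, 13, 21, "No Rain"), (79, 27, 1008, 13, 19, "No Rain"),
    (80, 28, 1008, 8, 20, "No Rain"), (84, 29, 1009, 6, 21, "No Rain"), (76, 27, 1009, 6, 22, "Rain"),
]
ATTRS = ["humidity", "temperature", "pressure", "wind_speed", "dew_point"]

def entropy(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n) for c in set(labels))

def gain_ratio(rows, col, split):
    """Gain ratio of splitting rows on column col at the given split point."""
    lower = [r[-1] for r in rows if r[col] < split]
    upper = [r[-1] for r in rows if r[col] >= split]
    if not lower or not upper:
        return 0.0  # assumption: an empty partition makes the split useless
    n = len(rows)
    info = len(lower) / n * entropy(lower) + len(upper) / n * entropy(upper)
    split_info = -sum(len(p) / n * math.log2(len(p) / n) for p in (lower, upper))
    return (entropy([r[-1] for r in rows]) - info) / split_info

def best_split(rows, col):
    """Return (gain ratio, split point) of the best candidate for one attribute."""
    ordered = sorted(rows, key=lambda r: (r[col], r[-1]))
    candidates = {(a[col] + b[col]) / 2 for a, b in zip(ordered, ordered[1:]) if a[-1] != b[-1]}
    return max((gain_ratio(rows, col, s), s) for s in candidates)

# The attribute with the highest gain ratio becomes the root node.
root_ratio, root_split, root_name = max(
    (*best_split(ROWS, i), name) for i, name in enumerate(ATTRS)
)
print(root_name, root_split, round(root_ratio, 3))
```

On this sample the script picks humidity at a split point of 86.0, which matches the first condition tested in every rule of Section V.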

The gain ratio is generally used to measure the inequalities among statistical data and their frequencies. So far, its use has been limited to the analysis of the wealth and income of economies. Because of the inequalities present in the probabilities, there may be some error. Regardless of this limitation, it has a wide variety of applications in statistical analysis.

The gain ratio is used here for the construction of the decision tree, in which the roots and sub-roots are classified. The use of gain ratio for rainfall analysis is quite apt because of the irregularities present in the statistical data on precipitation. The precipitation data used does not follow an order, in other words a sequential path. This may be due to the inequalities between the present attribute and the former attribute, which may change greatly or slightly depending on nature.

Some experiments have been conducted on real data to analyse the accuracy of the tree. We used the dataset from accuweather.com of the Indian Meteorological Department. The goal is to predict rainfall. The dataset consists of 15 years of data, from 1998 to 2012, containing 5230 examples. Of the 15 years of data, 9 years are used as the training dataset and the remaining 6 years as the test dataset.

Table IX shows how the prediction success rate and the response time vary with the size of the dataset. The maximum efficiency obtained is 74.1% on the one-year dataset, 77.47% on the two-year dataset, 77.38% on the three-year dataset, 77.17% on the four-year dataset, 77.39% on the five-year dataset and 77.78% on the six-year dataset. The average efficiency has been found to be 77.78%. Though this is a decent success rate, other methods, namely back propagation neural networks [7,8], [12-15], linear discriminant statistical analysis [16] and J48, are analysed to select the best performing method for the prediction of precipitation.

The published results for this dataset are 64.3% accuracy for back propagation, 58% for a linear discriminant and 68.6% for J48. Using the same training and test datasets, the average accuracy of SLIQ with gain ratio is 77.78%, as shown in Fig. 4, so SLIQ using gain ratio can be considered the best performing method for the prediction of precipitation.

TABLE IX. Results Showing the Accuracy and Time of Response

No. of records | Correct predictions | Incorrect predictions | Accuracy (%) | Time (sec)
363 | 269 | 94 | 74.104 | 37
728 | 566 | 162 | 77.471 | 40
995 | 770 | 225 | 76.381 | 42
1262 | 974 | 288 | 77.175 | 44
1619 | 1253 | 366 | 77.392 | 46
1981 | 1541 | 440 | 77.778 | 47

Fig 2. No. of Records Vs Correct Predictions

Fig 3. No. of Records Vs wrong Predictions

Fig 4. No. of Records Vs Accuracy ( % )

Fig 5. No. of Records Vs Time ( Sec )

The variation of correct predictions with the size of the dataset is shown in Fig. 2. It indicates a linear relationship between the number of correct predictions and the number of records in the dataset.

The variation of incorrect predictions with the size of the dataset is shown in Fig. 3. It indicates a nonlinear relationship between the number of incorrect predictions and the number of records in the dataset. From the graph, the number of incorrect predictions follows a decreasing trend up to 600 records and thereafter increases nonlinearly. The variation of accuracy with the dataset is plotted in Fig. 4. The variation of response time with the dataset is plotted in Fig. 5.

Conclusion

The economy of a country depends on agricultural productivity, which is the basis for formulating economic policy. Agricultural productivity depends on the availability of water. Precipitation is the major source of water, and it depends on various attributes such as humidity, pressure, temperature, wind speed and dew point. Hence, the prediction of precipitation is a difficult task, as it has to consider many parameters. Many techniques used for the prediction of precipitation, such as neural networks and artificial intelligence, have lower accuracy. So far, the maximum accuracy reported is 72.3%. This study employed the SLIQ decision tree using gain ratio as the splitting criterion. To evaluate the effectiveness of this model, historical data obtained from the IMD was used. It is found that the method proposed in this paper gives higher accuracy than the other models.

Future Enhancements

In this paper, we highlighted the gain ratio based SLIQ decision tree algorithm, which gives the highest accuracy. As future work, various other decision tree algorithms such as CART, SPRINT, ELEGANT and EC4.5 can be developed with additional parameters. A decision tree should also be developed for dynamic data rather than static data.
