Weather analysis plays a vital role in meteorology and has been one of the most challenging scientific and technological problems worldwide since the last century. This study analyzes historical weather data collected locally in Faisalabad, Pakistan, using data mining techniques to extract useful knowledge. The data cover a ten-year period (2007-2016). The aim was to extract practical knowledge from the weather data through month-by-month historical analysis. The analysis examined changing patterns in four weather parameters: maximum temperature, minimum temperature, wind speed, and rainfall. After data preprocessing and outlier analysis, the K-means clustering algorithm and a decision tree algorithm were applied. K-means clustering generated two clusters, one with the lowest and one with the highest parameter means. For the decision tree, a classifier model was trained on the meteorological data, and trees were generated using 10-fold cross-validation. The tree with the smallest error on the test data set (33%) was selected, and among the rule sets generated from that tree, the one with the minimum error (25%) was selected. The results show that, given a sufficiently large data set, these techniques can be used for weather analysis and climate change studies.
Title = "Big Data Approach and Using Data Mining Techniques in Weather Prediction",
Journal = "International Journal of Computer Applications Technology and Research (IJCATR)",
Volume = "6",
Pages ="467 - 518",
Year = "2017",
Authors ="M Ramesh, S Swarajhyam, B Prathyush"}
Data mining is the process of sorting through large data sets to identify patterns and establish relationships in order to solve problems through data analysis.
Data preprocessing describes any type of processing performed on raw data to prepare it for a subsequent processing procedure.
J48 is a decision tree algorithm (Weka's implementation of C4.5). Its procedure for partitioning the data has been refined over time and is based entirely on concepts from information theory.
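To make the information-theoretic idea behind tree partitioning concrete, the following is a minimal sketch of entropy and information gain for a single numeric split. It is not the study's implementation: the function names, the threshold values, and the toy readings are all assumptions made for illustration. Note also that C4.5, which J48 implements, normalizes this quantity into a gain ratio; the sketch shows only the plain information gain.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting numeric `values` at `threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted


# Hypothetical toy readings (NOT from the study): maximum temperature in
# degrees Celsius and a made-up class label for each record.
max_temp = [19, 21, 24, 38, 40, 41]
rain = ["no", "no", "no", "yes", "yes", "yes"]

gain_at_30 = information_gain(max_temp, rain, 30)  # separates the classes perfectly
gain_at_20 = information_gain(max_temp, rain, 20)  # isolates only one record
print(gain_at_30, gain_at_20)
```

A tree learner of this kind evaluates candidate splits and prefers the one with the larger score, which here is the threshold of 30.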
K-means clustering aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean, which serves as the prototype of that cluster.
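The K-means definition above can be sketched as the usual two-step iteration (assign each observation to its nearest mean, then recompute the means). This is a minimal illustration under stated assumptions, not the procedure used in the study: the `kmeans` function, its first-k-points initialisation, and the monthly records are hypothetical, and the features are left unscaled for brevity.

```python
from math import dist
from statistics import fmean


def kmeans(points, k, iterations=100):
    """Assign each point to its nearest centroid, then recompute centroids."""
    # Assumption for this sketch: the first k points serve as initial centroids,
    # which keeps the example deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(fmean(col) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical monthly records (NOT from the study):
# (max temperature, min temperature, wind speed, rainfall)
months = [
    (19.0, 5.0, 3.0, 12.0),
    (40.0, 27.0, 6.0, 60.0),
    (21.0, 7.0, 3.5, 15.0),
    (38.0, 26.0, 5.5, 80.0),
    (24.0, 10.0, 4.0, 20.0),
    (41.0, 28.0, 6.5, 45.0),
]

labels, centroids = kmeans(months, k=2)
print(labels)
print(centroids)
```

With k = 2, the sketch separates the toy records into a group with lower parameter means and a group with higher ones. In practice, features measured on different scales would normally be standardized first, and the initial centroids would be chosen randomly or with a seeding scheme.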