Preparing and Mining Data
About this book
Data is a fact of life. As time goes by, we collect more and more data, and the sheer volume makes our original reason for collecting it harder to accomplish. We don’t collect data just to waste time or keep busy; we collect it to gain knowledge that can improve the efficiency of our organization, raise profit margins, and more. The problem is that the more data we collect, the harder it becomes to derive this knowledge from it. We are being suffocated by raw data, yet we need to find a way to use it. Organizations around the world realize that analyzing large amounts of data with traditional statistical methods is cumbersome and unmanageable, but what can be done about it?

Enter data mining. As both technology and data mining techniques continue to improve, the capability of data mining products to sort through the raw material and pull out gems of knowledge should make CEOs around the world jump up and clap their hands.

Before we get too far ahead of ourselves, realize that the success of any data mining project lies in the proper execution of specific steps. There is no magic box from which a data mining solution appears. We must work with the raw data and get to know what it contains. What we get out of a data mining solution is only as good as what we put into it. The six steps for a data mining solution are as follows:
- Defining the problem
- Preparing the data
- Building the models
- Validating the models
- Deploying the models
- Managing the metadata associated with transforming and cleaning the data and building and validating the models
Author: Raymond Balint
Raymond Balint swam distance freestyle all year for the Bulldogs. His top finishes included a third-place finish against SCAD in his 1650 freestyle and a win in the 1000 free against King College.