Is my path for learning data mining correct?

Developer https://www.devze.com 2023-01-16 02:15 Source: web
Someone has just told my boss what data mining can do for a company, such as recommendation and predictive modelling. We are basically a website company. I am going on leave for 6 months, so my boss said I can learn some DM techniques so that, when I come back, we can visit small shops or small companies and provide them with predictive data using data mining algorithms.

The shops will only have SQL files or CSV files of customers, or more.

Right now I only know MySQL and have no idea what data mining is or whether it works the way I am imagining above. I mean, if someone has a database of customers and their shopping, is it possible for me to apply a data mining technique to it? That is:

(raw mysql or sql data) or (csv files) ----data mining--> (some useful result)
  • 1) Is the above system correct, or am I wrong?
  • 2) Would the shops or businesses want that, or am I missing something?
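
For illustration only, the arrow in the picture above can be as small as the following Python sketch. It assumes a hypothetical purchases file with customer_id and item columns (both names, and the sample rows, are made up for this example) and counts which items are bought by the same customers. That is a very basic form of recommendation, not a full data mining algorithm.

```python
import csv
import io
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical sample data; a real shop would supply its own CSV or SQL export.
PURCHASES_CSV = """customer_id,item
1,bread
1,butter
1,jam
2,bread
2,butter
3,bread
3,jam
4,butter
4,milk
"""


def load_baskets(csv_text):
    """Group purchased items by customer: {customer_id: {item, ...}}."""
    baskets = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        baskets[row["customer_id"]].add(row["item"])
    return baskets


def co_occurrence(baskets):
    """Count, for each item, how many customers also bought each other item."""
    counts = defaultdict(Counter)
    for items in baskets.values():
        for a, b in permutations(sorted(items), 2):
            counts[a][b] += 1
    return counts


def recommend(counts, item, top_n=2):
    """Return the items most often bought together with `item`."""
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:top_n]]


baskets = load_baskets(PURCHASES_CSV)
counts = co_occurrence(baskets)
print(recommend(counts, "bread"))
```

The same functions would work on rows fetched from a SQL table, as long as each row carries a customer identifier and an item.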

My plan for learning this is in the following order. I am thinking of first getting some SQL Server 2008 certifications, because most companies in my area use Microsoft, so maybe I need to know SQL.

1) MCTS: SQL Server 2008, Implementation and Maintenance
2) MCTS: SQL Server 2008, Database Development
3) MCTS: SQL Server 2008, Business Intelligence Development and Maintenance

(or should I go for Oracle and Oracle data warehousing ... I want to first learn some database properly)

4) Data Mining with Microsoft SQL Server 2008 (2009)
5) Python for Dummies
6) Programming Collective Intelligence: Building Smart Web 2.0 Applications

Is my flow correct, or can I achieve my result in a better way? The reason I am doing the certs is to get some understanding of SQL, and in case I don't get that job after 6 months, I can get a new job related to data mining or BI, or at least SQL Server.

Please help me


OK, this is not a simple yes/no answer. You are doing some things right. This way you will get to know the SQL Server Data Mining tool set, and you will understand which algorithm to use where (how Naive Bayes differs from a decision tree, etc.).

Once you know this stuff, the second thing is getting to know your data and how to make the flat tables that will serve as input. This is the most important part, because this is the data you will use to train your models. You don't need to know the internal mathematics behind the ANN algorithm and so on; you just need to know how to use it. There are data mining add-ins for Excel (2007 onwards) which you can use to play around.
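
To make "flat table" concrete, here is a minimal Python sketch, assuming a hypothetical order export with customer_id, order_id, category and amount columns (the column names and sample rows are invented for this example). It collapses many order lines into one row per customer, which is the shape most mining tools expect as training input. In SQL Server you would more likely do this with a GROUP BY query or a view.

```python
import csv
import io
from collections import defaultdict

# Hypothetical order lines; a real shop would export these from its database.
RAW_CSV = """customer_id,order_id,category,amount
1,100,books,12.50
1,101,music,8.00
2,102,books,30.00
2,102,toys,15.00
3,103,toys,5.00
"""


def build_flat_table(csv_text):
    """Collapse order lines into one record per customer."""
    totals = defaultdict(float)                            # total spent per customer
    orders = defaultdict(set)                              # distinct order ids per customer
    per_category = defaultdict(lambda: defaultdict(float)) # spend per customer and category

    for row in csv.DictReader(io.StringIO(csv_text)):
        cid = row["customer_id"]
        amount = float(row["amount"])
        totals[cid] += amount
        orders[cid].add(row["order_id"])
        per_category[cid][row["category"]] += amount

    # Every customer gets the same columns, with 0.0 where nothing was bought.
    categories = sorted({c for cats in per_category.values() for c in cats})
    flat = []
    for cid in sorted(totals):
        record = {
            "customer_id": cid,
            "n_orders": len(orders[cid]),
            "total_spent": totals[cid],
        }
        for cat in categories:
            record["spent_" + cat] = per_category[cid].get(cat, 0.0)
        flat.append(record)
    return flat


flat_table = build_flat_table(RAW_CSV)
for record in flat_table:
    print(record)
```

Which columns to derive (counts, totals, recency and so on) depends on what you want to predict; the ones here are only placeholders.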

There are some data mining videos on http://channel9.msdn.com by Rafal Lukawiecki. They are good for getting an idea of how to begin.

After this it is a matter of practice: the more you play with new data, make new models and analyze the results, the better you are going to become.

Let me know if you need more info on PPTs, samples, etc.


Uh, to do data mining effectively, you need to know a lot of math. Your path is like "I want to be a surgeon, so I'll learn how to cut with a scalpel". Yes, knowing some SQL is probably necessary (it just depends on how your data is organized), but it is FAR from sufficient.


From what you have written, it is close to data mining but not data scraping.

First of all, the answer by Ngu Soon Hui is diverting you in a completely wrong direction.
What he advised you is called data scraping, not data mining.
You had better understand the difference between data mining and data scraping (aka website/web scraping, aka screen scraping, aka data harvesting):

  • Difference between Data Mining And Screen-Scraping by RITA THOMSON
  • Difference between Data Mining And Screen-Scraping
  • What's the difference between Data Mining and Screen Scraping?
    http://it.toolbox.com/wiki/index.php/What's_the_difference_between_Data_Mining_and_Screen_Scraping%3F
    (note this link is not rendered correctly by SO; you should copy and paste it into your browser)

"(raw mysql or sql data) or (csv files) ----data mining--> (some useful result)"

Just forget about MySQL completely and do not lose your time on it, because there is absolutely no support for data mining in MySQL, only for data scraping. You might be interested in the latter, though. You had better know the difference.

"1) MCTS: SQL Server 2008, Implementation and Maintenance 2) MCTS: SQL Server 2008, Database Development 3) MCTS: SQL Server 2008, Business Intelligence Development and Maintenance"

Why do you need 1) and 2)? Even 3) is only 20% data mining.

"5) Python for Dummies 6) Programming Collective Intelligence: Building Smart Web 2.0 Applications"

Why do you need Python?

  1. is not data mining. It is called data scraping, and it is again a path in a completely wrong direction from DM.