
A basic query about data mining [closed]

开发者 (https://www.devze.com), 2023-03-25 09:52, source: web
Closed. This question is opinion-based. It is not currently accepting answers.

Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.

Closed 7 years ago.


With data mining, we can find useful patterns in a large data set using techniques such as correlation analysis. There must be some open source tools for this; what are some examples?

Is this pull-based or push-based? In other words, do we give the data mining engine both a data set and specific queries, and it returns answers (as in SQL)? Or do we supply only a large data set, and the engine finds patterns on its own (patterns we never knew existed and/or could not have formulated queries for)? In the second case we don't really pull answers to specific queries from it; it pushes the patterns to us.

A quick read of the Wikipedia article didn't clear up my doubts.


For an open source tool, have a look at Weka.

As for the push/pull question, it's a bit of both, but it's not quite that simple. You must still be looking for something. For example, if you are looking for clusters, there are unsupervised algorithms that will give you an answer with minimal guidance.
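To make the distinction concrete, here is a minimal sketch in plain Python (standard library only). The toy `customers` data, the spending threshold, and the `kmeans` helper are all made up for illustration; this is not how Weka or any particular library implements it. The first part is "pull": we state the question ourselves. The second part is closer to "push": we only say we want two clusters and let a simple k-means pass find the groups.

```python
from statistics import mean

# Hypothetical toy data: (age, monthly_spend) for six customers.
customers = [(22, 30), (25, 35), (24, 28), (51, 210), (55, 190), (49, 205)]

# "Pull": we already know the question, so we state it explicitly,
# much like a SQL WHERE clause.
big_spenders = [c for c in customers if c[1] > 100]


def kmeans(points, k, iterations=10):
    """Plain k-means; the first k points serve as the initial centroids."""
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the nearest centroid
            # (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(mean(dim) for dim in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters


# "Push": the only guidance is k=2; the grouping itself comes from the data.
centroids, clusters = kmeans(customers, k=2)

print("query result:", big_spenders)
for centroid, members in zip(centroids, clusters):
    print("cluster around", centroid, "->", members)
```

Even in the "push" case you still chose the kind of pattern (clusters) and the number of groups, which is the "minimal guidance" part.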

In practice, the results are more meaningful if you know the data you are analysing and look for regularities and patterns that make sense.

Playing with Weka will give you a better idea of the range of possibilities.


Python and R are other open source tools that are very popular in the data mining area.


A great tool that I used recently is scikit-learn.
