data-mining
Data mining for significant variables (numerical): Where to start?
I have a trading strategy on the foreign exchange market that I am attempting to improve upon. I have a huge table (100k+ rows) that represents every possible trade in the market, the type of trade (b [Details]
2023-04-08 01:51 Category: Q&A
J48 not working in Weka Explorer
I am not able to use the Weka GUI on Linux (Linux Mint 9). It doesn't allow me to use J48 from the interface, whereas I am able to run it from the command prompt. [Details]
2023-04-06 02:01 Category: Q&A
Searching a text for predefined words
Hi, I have a database table that looks like this: word_id int(10), word varchar(30). I also have a text, and I want to see which of the words in this text are defined in that table. What's the most elega [Details]
2023-04-05 00:49 Category: Q&A
Improve the efficiency of query processing in an improved version of star schema
I have a fact table, a measure table, and dimension tables connected to them. It is just a slight modification of the star schema. But now, as the number of joins is increasing due to in [Details]
2023-04-04 18:54 Category: Q&A
How to classify text when predefined categories are not available
I have a problem and I am not sure which algorithm to apply. I am thinking of applying clustering in case two, but I have no idea about case one: [Details]
2023-04-04 15:47 Category: Q&A
Starting with Data Mining
I have started learning Data Mining and wish to create a small project in C++/Java that allows me to utilize a database, say from Twitter, and then publish a particular set of result [Details]
2023-04-04 13:17 Category: Q&A
Can one Twitter account lead to every other Twitter account?
Well, I'm sure all of you are aware of the Wikipedia 'Easter egg' that enables a user to follow every first embedded link in each article to an eventual link to the /Philosophy page. [Details]
2023-04-04 09:26 Category: Q&A
Text Mining on huge list of strings
I have a list of strings (a pretty big list of IDs and strings scattered across 4-5 big files, around a GB each). These strings are formatted like this: [Details]
2023-04-02 15:55 Category: Q&A
Expectation Maximization in Matlab on Missing Data
I have to use EM to estimate the mean and covariance of the Gaussian distribution for each of the two classes. They have some missing attributes too. [Details]
2023-04-02 12:05 Category: Q&A
RDBMS for extremely large data sets - what are people using?
I have to perform some serious data mining on very large data sets stored in a MySQL db. However, queries that require a bit more than a basic SELECT * FROM X WHERE ... tend to become rather [Details]
2023-04-02 05:42 Category: Q&A