Tag Archives: data mining

Data Mining for the Masses and Correlation Matrices

I’m working through Data Mining for the Masses (yes, at the same time as I’m working through Machine Learning with R). I’ve found that hitting the same topic from multiple angles helps to embed the concepts and lessons much more … Continue reading

Posted in Data Science, Information Technology

Free Online Public Data Sources: An Annotated Bibliography

Free Online Health Data Sources: An Annotated Bibliography. By William Murakami-Brundage. 1st edition, February 2012. There exists a shortage of usable data sets and public health data. Whether your interest is biomedical engineering, health informatics, data mining, or public health … Continue reading

Posted in Digital Libraries, Education, Information Technology, Resource-a-rama, The Cloud | 1 Comment

What Does a Data Analyst Do?

What does a data analyst do? A data analyst typically handles data coming from or going into a data warehouse or business intelligence system. They compile the reports, verify the quality (integrity), and use the data to assist executive- … Continue reading

Posted in Careers And Work, Education, Information Technology | 4 Comments

Data Mining with RapidMiner 5.1, GATE, and Weka 3.6.4

Over the last six months, I have been working intensely with a host of data mining software. Some of it was good, some of it was lousy, and some of it I can only rate as excellent. You will need … Continue reading

Posted in Information Technology, The Cloud, Wordplay and Commentary | 1 Comment

The Deep Web: The Internet’s Obscure Data Mine

Deep Web. Invisible Web. The Hidden Internet. Every knowledge professional, from librarian to data mining specialist, should know that the Deep Web exists. The term “Deep Web” refers to data and resources that typical search engines such as Google do not reach. Examples include university-hosted databases, organizational wikis, topic-specific forums, and other volatile locations that are difficult to index with typical bots and spiders. Notably, some of these massive websites are very information-intensive, yet nigh-impossible to crawl via current public methods. Continue reading

Posted in Resource-a-rama, The Cloud