2012-09-22

A list of short data mining book reviews


Author Title Rating ISBN
Richard O. Duda and Peter E. Hart and David G. Stork Pattern Classification ***** 0471056693
I love this book. This is a classic in the field of classification, and it explains the basics as well as exhaustive details in a wonderful style. Reading this is just like Christmas.
R. Durbin and S. Eddy and A. Krogh and G. Mitchison Biological sequence analysis **** 0521629713
Neil Gershenfeld The Nature of Mathematical Modeling *** 0521570956
This one is way over my head math-wise, so take this review with a grain of salt. A lot of topics are covered, but I think not in enough detail to learn them from this book. It might be enough to remind you how things generally work when you already know them, like notes made for yourself. Also, for my purposes, only the last third is interesting, where it comes to generating models from data. The first two thirds deal with analytic models, mostly based on differential equations, where you can write down the full model, parameters and all, and just have to solve it, and with numerical solutions for the cases where that is not possible.
Claude E. Shannon and Warren Weaver The Mathematical Theory of Communication *** 0252725484
The original formulation of Shannon entropy. The original paper appeared as 'A Mathematical Theory of Communication', published in two parts in the Bell System Technical Journal (BSTJ) in 1948.
Günter Bamberg and Franz Baur Statistik *** 3486255401
A standard German textbook on basic statistics, covering the bases from descriptive statistics through probability up to common tests. It's actually quite readable, without oversimplifying.
Ian H. Witten and Eibe Frank Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations *** 1558605525
This is a nice introduction to data mining; at least the first six chapters are good. The best thing is that there is a free and really useful data mining software package called Weka, which you can use to try out and get a feel for the methods presented here. The less stellar points are that the book contains chapters that double as documentation for this toolkit and as tutorials on how to use it. This is material for online documentation, especially since the toolkit has been developed further and the book is outdated in comparison. Also, some algorithms are covered twice: once superficially in an early chapter, and once in gory detail later on.
