M**L
A Non-Serious Attempt to Teach Data Science & R
For a hobbyist, this book would be sufficient for the study of data science and of R. The theory underlying the techniques is minimal or nonexistent. (For logistic regression, no mathematical equation is given for the model.) This could work as a light introduction to the techniques themselves, but the book relies on R to present the models, and this is the source of my biggest gripe. The R coding is lengthy and inefficient. If you're going to teach R, teach how to code as efficiently as possible. There's a difference between presenting things simply to introduce a newcomer and teaching inefficient use of methods. This flaw seems to be shared by other Manning books. There are genuine textbooks on data science/machine learning/data mining/statistical learning (Murphy 2012, Nov. 2020; Kelleher, Mac Namee, & D'Arcy 2015, 2020; Koller & Friedman 2009; MacKay 2003; Theodoridis 2015, 2020; Barber 2012; Aggarwal 2015; James, Witten, Hastie, & Tibshirani 2013, 2017 and Hastie, Tibshirani, & Friedman 2016; Bishop 2006). Each has its own particular emphasis (e.g., graphical models, Bayesian methods, optimization), and the classification of the activity (ML, data analysis, data mining, statistical learning) varies. With respect to R, there are Knox (2018), Ghatak (2017), Shmueli et al. (2017; a lot of typos, and the data mining is presented at a very low level), and Baumer, Kaplan, & Horton (2017). Andy Field's "Discovering Statistics Using R" (2012) provides excellent walk-throughs of R used to solve basic statistical problem types; James et al. (2017) employs R in the authors' 'statistical learning'; Gelman & Hill (2006) and Fox (2015) both present R used to fit regression models (Fox's text has a workbook using R).
I**L
"The" book to have if you want to be a data scientist and work with R
Hands-on textbook that covers pretty much all aspects of data science (with keen attention to business demands). Importantly, it doesn't shy away from discussing the statistical details behind the most common routines in machine learning, which I really appreciated, as I was tired of typical DS books that take a black-box approach that just shows the "how" without explaining the "why". I think it's worth having a copy of this book irrespective of whether you are a beginner to data science or a veteran. Highly recommended! If you buy a hard copy, I would also recommend having a look at the colorful figures in the companion soft copy of the book.
C**X
Practical Data Science is a fun, interesting book
Practical Data Science is a fun, interesting book. There are parts that lost me, like watching a ball under the three shells kind of thing, but the 70% of the book that I DID get is remarkable. I learned many things that I will put to use. Well worth the price of the book. For example, I loved the lookup vectors to change values, and the very interesting graphs. This book does not waste your time. Hopefully soon I can grasp everything. Good job. I recommend it for all R users. I love that the examples are about real business situations and not plant life.
K**R
Fantastic introductory text for data science
This book is what I was looking for for my new job as a Credit Risk Data Modeler (basically data science applied to credit problems). Its coverage is broad, but deep and applied enough that I have been able to apply its contents in my day-to-day work. I look forward to a second edition which will hopefully rectify the following:
Earlier in the book it seemed the authors took great pains to explain in layman's terms the various statistical elements of the topic they were covering. They provided very clear and meaningful explanations which made a lot of sense of complex topics. But later in the book it seemed that that approach largely went out the window and they started using more technical boilerplate to describe the various statistical tests and procedures. Rather than perhaps give the technical boilerplate (as you'd see it in a textbook) and then elaborate on it with a more human-centric explanation, they would just leave it at the nearly impenetrable technical description and then proceed to explain how to conduct the test/procedure/etc. in R. But without an understanding of what you're trying to accomplish and why, it's hard to write the code to actually do it. Keep in mind that I'm relatively well prepared for this book too, having had as much stats and econometrics as I could fit into my four-year degree. If I found some sections of the book too technical to understand, then it seems likely that the book would benefit from some additional explanation and discussion in those later sections.
Also, I have a good deal of "boots on the ground" experience with this book in my attempts to apply it in my daily work. I've found that it is useful, but could be more useful if there were more discussion of various practical problems. For instance, much of my work is focused on producing a predictive model of the likelihood of charge-off, i.e., if we approve and fund this application, how likely is it to perform or charge off?
The book shares some high-level approaches to finding problems in data (using plots and summaries), fixing those problems using various techniques, selecting variables, and conducting the statistical modeling (logistic in my case). But it fails to really tie those areas together beyond the high level. For instance, what are the assumptions of a logistic regression? How do you resolve issues in your data to ensure that you meet those assumptions and can perform a valid logistic regression? How do you really select variables when you're faced with at least 20 possibilities (and potentially many, many more if you count interaction terms, unfixed variables, and variables which have been fixed in different ways)?
I suppose, for what it was, that it is "mission accomplished." I'd just like to see a lot more. Perhaps there's need for a second volume? Perhaps "Advanced Practical Data Science with R"? Alternatively, this book could have a second edition with a lot more content covering finding data problems; resolving those problems intelligently (for instance, resolving missing data is basically left as "either drop the affected records" or "use the mean as a replacement for the missing value," but there are alternative methods which may be more suitable); what data problems will cause issues in OLS regression, logistic regression, and machine learning; and how to practically select variables and a model. I feel like the book gave me some tools to apply (like a small box of tools you might purchase from a hardware store), but left a lot out. So now I'm in deep water trying to figure out why my logistic regression isn't predictive enough and what I can do about it. Is it the data and how I fixed variables? Is it the variables I've selected? Should I have used automated variable selection techniques? Or just manually tried different variables? How does an experienced practitioner approach these problems?
I know they iterate: explore data, clean data, select variables, select model, test model, look at data, change data, change variables, etc. But practically speaking, what does it look like? In the book they offer a hand-coded basic variable selection script, and mention that one could also use stepwise variable selection. In the real world I'm reasonably sure that this is not actually done, mostly because their selection script does about as well as stepwise at selecting appropriate variables. There are many other, better ways of selecting variables, I've discovered, and I wish that they'd discussed some of those ways (pros and cons), and shown how to conduct them in a meaningful fashion. Same thing with building a model. In my case, I have a whole bunch of variables and limited data (about 2000 records, with the desired outcome only occurring in 120 of those), and the automated tools (various R packages I've discovered and applied) either take a long time to run or yield poor results, or both. But if not automated tools, then what? Manually add variables and ANOVA test the difference between the first and second model?
I'd just like more: more discussion, elaboration, and examples of how practical data science is conducted. This book seems like it does a fantastic job as an introduction to the topic, but you'll quickly find that you'll be in deep water without a clue how to swim, as in my case. You'll be left to your own devices, and find yourself wishing, as I do, that there was more in the book (or another book) that you could study after this one which would help take you from beginner data scientist to intermediate.
Overall, I'm very glad I bought and read the book.
K**M
Five Stars
Love this book. Anyone interested in data science should get this hands-on experiential learning book.
M**E
Five Stars
The book has a lot of good examples and gets you jump started on being productive.
J**T
Good if you are getting into Data Science
Good Read
T**R
Good overview of the process, a bit lacking in terms of application
Good book for beginners, even if planning to use a language other than R.
A**N
Great book to learn R programming and Stats
Love this book (as well as other books from this publisher): very useful examples, simple step-by-step explanations, and great content overall! Useful for learning both R programming and stats (they are explained very well).
P**M
This is a great book that artfully bridges the gap of data science ...
This is a great book that artfully bridges the gap between data science as a process and data science as a practice. Really well written, and I refer to it constantly.
D**L
A must-have title.
A book that should be on the shelf of every data scientist who uses R.
C**N
Good book
Very good book
A**R
Four Stars
Delivered as it was described.