Product Details
Data Manipulation with R (Use R)

By Phil Spector

List Price: $54.95
Price: $44.37 & eligible for FREE Super Saver Shipping on orders over $25.

Availability: Usually ships in 24 hours
Ships from and sold by Amazon.com

44 new or used available from $35.62

Product Description

Since its inception, R has become one of the preeminent programs for statistical computing and data analysis. The ready availability of the program, along with a wide variety of packages and the supportive R community, makes R an excellent choice for almost any kind of computing task related to statistics. However, many users, especially those with experience in other languages, do not take advantage of the full power of R. Because of the nature of R, solutions that make sense in other languages may not be very efficient in R. This book presents a wide array of methods applicable for reading data into R, and efficiently manipulating that data.

In addition to the built-in functions, a number of readily available packages from CRAN (the Comprehensive R Archive Network) are also covered. All of the methods presented take advantage of the core features of R: vectorization, efficient use of subscripting, and the proper use of the varied functions in R that are provided for common data management tasks.

Most experienced R users discover that, especially when working with large data sets, it may be helpful to use other programs, notably databases, in conjunction with R. Accordingly, the use of databases in R is covered in detail, along with methods for extracting data from spreadsheets and datasets created by other programs. Character manipulation, while sometimes overlooked within R, is also covered in detail, allowing problems that are traditionally solved by scripting languages to be carried out entirely within R. For users with experience in other languages, guidelines for the effective use of programming constructs like loops are provided. Since many statistical modeling and graphics functions need their data presented in a data frame, techniques for converting the output of commonly used functions to data frames are provided throughout the book.

Using a variety of examples based on data sets included with R, along with easily simulated data sets, the book is recommended to anyone using R who wishes to advance from simple examples to practical real-life data manipulation solutions.


Product Details

  • Amazon Sales Rank: #14661 in Books
  • Published on: 2008-03-19
  • Original language: English
  • Number of items: 1
  • Binding: Paperback
  • 154 pages

Editorial Reviews

Review

From the reviews:

"This comprehensive, compact and concise book provides all R users with a reference and guide to the mundane but terribly important topic of data manipulation in R. … This is a book that should be read and kept close at hand by everyone who uses R regularly."(Douglas M. Bates, International Statistical Reviews, Vol. 76 (2), 2008)

"Presents a wide array of methods applicable for reading statistical data into the R program and efficiently manipulating that data." (Journal of Economic Literature, Vol. 46, no. 3, September 2008)

"R is a programming language particularly suitable for statistical computing and data analysis. … Using a variety of examples based on data sets included with R, along with easily stimulated data sets, the book is recommended to anyone using R who wishes to advance from simple examples to practical real-life data manipulation solutions." (Christina Diakaki, Zentralblatt MATH, Vol. 1154, 2009)

"The book contains much good information regarding the unique way in which R manipulates data objects. lt provides a complement to the many books illustrating statistical applications of R. It is clear that the author is very familiar with R. and the explanations and illustrations are generally helpful. Personally, I found the chapters on reading and writing data and on data aggregation most helpful, because these topics are essential in exploring data." (Jim Albert, The American Statistician, May 2009, Vol. 63, no. 2)


Customer Reviews

a must for statisticians wanting to learn R (rating: 5)
This book, along with Jim Albert's, should be read by every statistician who does a lot of statistical computing. Both books help you learn R quickly and apply it to many important problems in research, both applied and theoretical. Albert emphasizes applications in Bayesian statistics, whereas Spector teaches data manipulation: things like merging and transposing data sets. These techniques can be easy in a language like SAS after a little training, but in other programming languages they can be very difficult.

Great little book (rating: 5)
This concise 150-page book contains a wealth of information, written clearly and with many well-chosen examples. I liked it a lot. It covers reading and writing data in/out of the R workspace, including access to databases. The names of other chapters suggest the topics covered: "Dates", "Factors", "Subscripting", "Character manipulation", "Data aggregation", "Reshaping data".

This book will be helpful to all but the very newest R users, and even the seasoned user will find interesting hints and examples. I cannot recommend it enough.

One minor qualm I have is the absence of references. Some topics (for instance, regular expressions) are fairly complex and well documented elsewhere: a pointer or two would be helpful. The same goes for SQL, which is mentioned and demonstrated only briefly.

Another not-so-minor qualm is price. A book of this size from, for instance, the Dover classics collection, with similar paper quality and covers, is about a third or a quarter of the price. Although this is a new book, I find the $54.95 tag (the Amazon discounted price is about $44.50) fairly high. But this has nothing to do with the quality of the book; rather, it has to do with Springer's pricing policies.

All in all, if you don't mind the price, this is a good buy.

Start here (rating: 5)
All too often, novices wanting to use R for an analysis never get to the analysis because they can't successfully import, clean up and restructure their data for the analysis functions. This book prevents those problems by covering the critical data and file manipulation material that is usually treated briefly (and inadequately) in stat books. It is a short, easy read that will give you the tools to get your data ready to go.

You can see the table of contents and read the other reviews, but areas that really shine include: dealing with categorical (named or ordered) factor variables, recoding numeric data into categorical variables, and making and working with summary tables.

When it comes to data manipulation and clean-up, Spector has the best coverage of any book or web FAQ. This book is very expensive for its size, but it is worth every cent.