Product Details
Regular Expressions Cookbook
By Jan Goyvaerts, Steven Levithan

List Price: $45.00
Price: $29.70 & eligible for FREE Super Saver Shipping on orders over $25.

Availability: Usually ships in 24 hours
Ships from and sold by Amazon.com

40 new or used available from $20.76

Product Description

This O'Reilly Cookbook provides more than a hundred recipes to help programmers use regular expressions to manipulate text and crunch data. Every programmer needs a grasp of regular expressions, but their power doesn't come without problems--even seasoned users often have trouble tackling performance issues. With recipes for popular programming languages such as C#, Java, JavaScript, Perl, PHP, Python, Ruby, and VB.NET, this book offers step-by-step solutions to scores of common tasks involving regular expressions. This cookbook will help you:

  • Understand the basics of regular expressions through a concise tutorial
  • Use regular expressions effectively in several programming and scripting languages
  • Learn how to validate and format input
  • Manage words, lines, special characters, and numerical values
  • Find solutions for using regular expressions in URLs, paths, markup, and data exchange
  • Learn the nuances of more advanced regex features
  • Understand how regular expression APIs differ from language to language
  • Write better regular expressions for custom needs

Whether you're a novice or an experienced user, Regular Expressions Cookbook will help deepen your understanding of this tool. You'll learn powerful new tricks, avoid language-specific gotchas, and save valuable time with this huge library of proven solutions to difficult, real-world problems.


Product Details

  • Amazon Sales Rank: #55022 in Books
  • Published on: 2009-06-04
  • Original language: English
  • Number of items: 1
  • Binding: Paperback
  • 510 pages

Editorial Reviews

Amazon.com Review

Searching and Replacing with Regular Expressions
Search-and-replace is a common job for regular expressions. A search-and-replace function takes a subject string, a regular expression, and a replacement string as input. The output is the subject string with all matches of the regular expression replaced with the replacement text. Although the replacement text is not a regular expression at all, you can use certain special syntax to build dynamic replacement texts. All flavors let you reinsert the text matched by the regular expression or a capturing group into the replacement. Recipes 2.20 and 2.21 explain this. Some flavors also support inserting matched context into the replacement text, as Recipe 2.22 shows. In Chapter 3, Recipe 3.16 teaches you how to generate a different replacement text for each match in code.
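As a rough illustration of the paragraph above, here is a minimal Python sketch of the three ideas it mentions: reinserting capturing groups, reinserting the whole match, and generating a different replacement for each match in code. The subject string, the date pattern, and the `year_only` helper are made up for this example and do not come from the book's recipes.

```python
import re

# Hypothetical subject string and regex, chosen only for illustration.
subject = "Ship on 2009-06-04 and again on 2010-01-15."
date_regex = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# Reinsert capturing groups 1-3 in a new order: YYYY-MM-DD becomes DD/MM/YYYY.
reordered = date_regex.sub(r"\3/\2/\1", subject)

# Reinsert the whole regex match (group 0), wrapped in brackets.
bracketed = date_regex.sub(r"[\g<0>]", subject)

# Build a different replacement for each match by passing a function
# instead of a replacement string.
def year_only(match):
    return "year " + match.group(1)

years = date_regex.sub(year_only, subject)

print(reordered)
print(bracketed)
print(years)
```

The replacement strings here use Python's syntax only. Other flavors spell the same ideas differently, which is the point of the section that follows.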

Many Flavors of Replacement Text
Different ideas by different regular expression software developers have led to a wide range of regular expression flavors, each with different syntax and feature sets. The story for the replacement text is no different. In fact, there are even more replacement text flavors than regular expression flavors. Building a regular expression engine is difficult. Most programmers prefer to reuse an existing one, and bolting a search-and-replace function onto an existing regular expression engine is quite easy. The result is that there are many replacement text flavors for regular expression libraries that do not have built-in search-and-replace features.
Fortunately, all the regular expression flavors in this book have corresponding replacement text flavors, except PCRE. This gap in PCRE complicates life for programmers who use flavors based on it. The open source PCRE library does not include any functions to make replacements. Thus, all applications and programming languages that are based on PCRE need to provide their own search-and-replace function. Most programmers try to copy existing syntax, but never do so in exactly the same way.
This book covers the following replacement text flavors. Refer to “Many Flavors of Regular Expressions” on page 2 for more details on the regular expression flavors that correspond with the replacement text flavors:
Perl
Perl has built-in support for regular expression substitution via the s/regex/replace/ operator. The Perl replacement text flavor corresponds with the Perl regular expression flavor. This book covers Perl 5.6 to Perl 5.10. The latter version adds support for named backreferences in the replacement text, as it adds named capture to the regular expression syntax.
PHP
In this book, the PHP replacement text flavor refers to the preg_replace function in PHP. This function uses the PCRE regular expression flavor and the PHP replacement text flavor.
Other programming languages that use PCRE do not use the same replacement text flavor as PHP. Depending on where the designers of your programming language got their inspiration, the replacement text syntax may be similar to PHP or any of the other replacement text flavors in this book. PHP also has an ereg_replace function. This function uses a different regular expression flavor (POSIX ERE), and a different replacement text flavor, too. PHP’s ereg functions are not discussed in this book.
.NET
The System.Text.RegularExpressions package provides various search-and-replace functions. The .NET replacement text flavor corresponds with the .NET regular expression flavor. All versions of .NET use the same replacement text flavor. The new regular expression features in .NET 2.0 do not affect the replacement text syntax.
Java
The java.util.regex package has built-in search-and-replace functions. This book covers Java 4, 5, and 6. All use the same replacement text syntax.
JavaScript
In this book, we use the term JavaScript to indicate both the replacement text flavor and the regular expression flavor defined in Edition 3 of the ECMA-262 standard.
Python
Python’s re module provides a sub function to search-and-replace. The Python replacement text flavor corresponds with the Python regular expression flavor. This book covers Python 2.4 and 2.5. Python’s regex support has been stable for many years.
Ruby
Ruby’s regular expression support is part of the Ruby language itself, including the search-and-replace function. This book covers Ruby 1.8 and 1.9. A default compilation of Ruby 1.8 uses the regular expression flavor provided directly by the Ruby source code, whereas a default compilation of Ruby 1.9 uses the Oniguruma regular expression library. Ruby 1.8 can be compiled to use Oniguruma, and Ruby 1.9 can be compiled to use the older Ruby regex flavor. In this book, we denote the native Ruby flavor as Ruby 1.8, and the Oniguruma flavor as Ruby 1.9. The replacement text syntax for Ruby 1.8 and 1.9 is the same, except that Ruby 1.9 adds support for named backreferences in the replacement text. Named capture is a new feature in Ruby 1.9 regular expressions.
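To make the list above a little more concrete for one flavor, the following is a small sketch of the Python entry: the `sub` function of the `re` module, together with Python's named groups and named backreferences. The pattern and the sample names string are assumptions made for this example. The syntax shown is specific to Python, and the other flavors listed above write named backreferences differently.

```python
import re

# Hypothetical pattern: "Last, First" with two named capturing groups.
name_regex = re.compile(r"(?P<last>\w+), (?P<first>\w+)")
subject = "Goyvaerts, Jan; Levithan, Steven"

# \g<name> in the replacement text reinserts the named group.
swapped = name_regex.sub(r"\g<first> \g<last>", subject)

# subn works like sub but also reports how many replacements were made.
swapped_again, count = name_regex.subn(r"\g<first> \g<last>", subject)

# The optional count argument limits the number of replacements.
first_only = name_regex.sub(r"\g<first> \g<last>", subject, count=1)

print(swapped)
print(count)
print(first_only)
```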

About the Author
Jan Goyvaerts runs Just Great Software, where he designs and develops some of the most popular regular expression software. His products include RegexBuddy, the world's only regular expression editor that emulates the peculiarities of 15 regular expression flavors, and PowerGREP, the most feature-rich grep tool for Microsoft Windows.

Steven Levithan is a leading JavaScript regular expression expert and runs a popular regular expression centric blog. Expanding his knowledge of the regular expression flavor and library landscape has been one of his hobbies for the last several years.


Customer Reviews

At last a Use Case based RegEx Book (Rating: 4)
As much as I hate to admit it, regular expressions are hard for me. My need to use them is situation specific and I never really took the time to master them conceptually. So, when it comes time to create one, I have to grope around to figure out how to meet the need at hand.

This book is really made for a person like me. The structure is problem-solution based. And, every problem is numbered in outline format. Thus, referencing back is an easy affair.

Want to know how to find bold text in an HTML file? This book will tell you how.

Want to learn how to split a string using a regular expression? This book tells you how.

The book discusses solutions generally and in language specifics. It supports C#, Java, Javascript, Ruby, Python, PHP, Perl, VB.NET.... the entire cast of the usual characters. (No pun intended.)

The writing is clear. You can take things in a bit at a time. And because some of the problems use those 'hard to get' concepts, the topical discussions actually teach you the difficult concepts in a manner that is pretty easy to understand. Sometimes you might have to go over a section a few times to get a full understanding. But the review is not a chore.

This is a good, useful book. It's helping me to become a better engineer. And believe me, I need all the help that I can get! :)

VERY VERY HIGHLY RECOMMENDED!! (Rating: 5)
Do you regularly work with text on a computer? If you do, then this book is for you! Authors Jan Goyvaerts and Steven Levithan have done an outstanding job of writing a book that shows you how you can use regular expressions in situations where people with limited regular expressions experience would normally say it can't be done.

Goyvaerts and Levithan begin by explaining the role of regular expressions and introduce a number of tools that will make it easier to learn, create, and debug them. Next, the authors cover each element and feature of regular expressions, along with important guidelines for effective use. Then, they specify coding techniques and include code listings for using regular expressions in each of the programming languages covered by this book. They continue by focusing on recipes for handling typical user input, such as dates, phone numbers, and postal codes in various countries. Next, the authors explore common text processing tasks, such as checking for lines that contain or fail to contain certain words. Then, they show you how to detect integers, floating-point numbers, and several other formats for this kind of input. The authors continue by showing you how to take apart and manipulate the strings commonly used on the Internet and Windows systems to find things. Finally, the authors cover the manipulation of HTML, XML, comma-separated values (CSV), and INI-style configuration files.

This most excellent book shows you everything you need to know about regular expressions, and then some, regardless of whether you are a programmer. More importantly, if you read this book cover to cover, you'll become a world class chef of regular expressions.

Goes further and deeper than many tutorials on regular expressions (Rating: 5)
This excellent book goes further and deeper than many tutorials on regular expressions. You might be surprised with some of the things you'll learn from reading it.

Unlike many cookbooks, this one doesn't dive into the recipes right away. I thought this was a good call because regular expressions are a specialized topic, and most developers don't work with regular expressions on a daily basis so they probably have to be reminded of the building block concepts and syntax, and get prepared for a discussion of more advanced features. Chapter One provides a list of recommended tools for working with regular expressions. Chapter 2 is a concise but very thorough discussion of building block and more advanced regular expression concepts (e.g., possessive quantifier or atomic grouping, named capturing groups, lookahead and lookbehind, etc.), including a discussion of differences in engine implementations and feature support. Chapter 3 is a hundred-plus page tutorial on how to work with regular expressions using different programming and scripting languages, including potential gotchas and workarounds. Chapters Four through Eight contain the recipes for solving real-world problems, with tips on how to improve an initial solution's readability (e.g., use named capturing groups when possible, etc.) and/or efficiency.

I was initially skeptical about the authors' ambitious goal of covering so many regular expression flavors, thinking the discussions of differences in engine supported features might prove distracting. The book is written and organized so well, however, that my fear did not materialize. In fact, I was pleasantly surprised to learn that, of the covered flavors, Microsoft's DotNet regex engine supports some of the most advanced features.

There's not much to dislike about this book, but if I were asked to suggest one or two things that might add value for readers, I would suggest two. First, make files containing appropriate subject strings for testing the book's various recipes available for download, as a convenience to readers who learn best by doing and want to follow along as they read the recipes. Second, include, for easy reference, a feature-support comparison matrix of the covered flavors, much like the comparison table available on the regular-expressions.info website.