Regular Expressions

In scalar context, m// returns true if the pattern matched and false otherwise, and s/// returns the number of replacements it made. You can either use the number directly, or check the result for truth.
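
As a minimal sketch of using those return values (the sample string and variable names are made up for illustration):

```perl
use strict;
use warnings;

my $text = 'one fish two fish red fish';

# s///g returns how many replacements it made.
my $replaced = ( $text =~ s/fish/cat/g );
print "Replaced $replaced\n";    # prints Replaced 3

# When nothing is replaced the result is false, so it works as a condition.
my $none = ( $text =~ s/dog/wolf/g );
print "No dogs found\n" unless $none;

# A plain match in scalar context is simply true or false.
my $found = ( $text =~ /cat/ ) ? 'yes' : 'no';
print "Found a cat: $found\n";    # prints Found a cat: yes
```

Checking the result for truth is usually enough; keep the count only when you actually need it.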

Don't use capture variables without checking that the match succeeded.

The capture variables, $1, etc., are not valid unless the match succeeded, and a failed match does not clear them, either: they keep whatever the last successful match put there.

    # BAD: Not checked, but at least it "works".
    my $str = 'Perl 101 rocks.';
    $str =~ /(\d+)/;
    print "Number: $1"; # Prints "Number: 101";
 
    # WORSE: Not checked, and the result is not what you'd expect
    $str =~ /(Python|Ruby)/;
    print "Language: $1"; # Prints "Language: 101";

    # GOOD: Check the results
    my $str = 'Perl 101 rocks.';
    if ( $str =~ /(\d+)/ ) {
        print "Number: $1"; # Prints "Number: 101";
    }

    if ( $str =~ /(Python|Ruby)/ ) {
        print "Language: $1"; # Never gets here
    }

In list context, m// returns a list of the captured substrings; with /g, it returns every match.
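
A minimal sketch of matching in list context, using made-up sample strings:

```perl
use strict;
use warnings;

# Captures come back as a list, so they can be assigned directly.
my ( $year, $month, $day ) = '2024-05-17' =~ /^(\d{4})-(\d\d)-(\d\d)$/;
print "$year/$month/$day\n";    # prints 2024/05/17

# With /g, every match is returned.
my @words = 'Perl 101 rocks' =~ /(\w+)/g;
print scalar(@words), " words: @words\n";    # prints 3 words: Perl 101 rocks

# A failed match returns an empty list, which is false in a condition.
my @none = 'no digits here' =~ /(\d+)/;
print "No match\n" unless @none;
```

Because a failed match yields an empty list, the list assignment itself can serve as the success check.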

Common match flags

/i - case insensitive match
/g - match multiple times

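    my $var = "match match match";

    # Scalar context: each iteration finds the next match
    my $count = 0;
    while ($var =~ /match/g) { $count++; }
    print "$count\n"; # prints 3

    # List context: all matches are returned at once
    $count = 0;
    $count++ foreach ($var =~ /match/g);
    print "$count\n"; # prints 3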

/m - ^ and $ change meaning
- Ordinarily, ^ means "start of string" and $ means "end of string" (or just before a final newline)
- /m makes them mean start and end of line, respectively
- Use \A and \z for start and end of string regardless of /m
- \Z is the same as \z except it will ignore a final newline

/s - . also matches newline

    $str = "one\ntwo\nthree\n";
    $str =~ /^(.{8})/s;
    print $1; # prints "one\ntwo\n"
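
The effect of /m on ^, and the difference between \A, \Z and \z, can be sketched with the same three-line string. The variable names here are only for illustration:

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";

# Without /m, ^ only matches at the very start of the string.
my @without_m = $text =~ /^(\w+)/g;
# With /m, ^ also matches after each embedded newline.
my @with_m = $text =~ /^(\w+)/mg;
print "without m: @without_m\n";    # prints without m: one
print "with m: @with_m\n";          # prints with m: one two three

# \A ignores /m: "two" starts a line but not the string.
my $big_a = $text =~ /\Atwo/m ? 'match' : 'no match';
# \Z allows one trailing newline; \z insists on the true end of the string.
my $big_z   = $text =~ /three\Z/ ? 'match' : 'no match';
my $small_z = $text =~ /three\z/ ? 'match' : 'no match';
print "A: $big_a, Z: $big_z, z: $small_z\n";
# prints A: no match, Z: match, z: no match
```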

Sets of capturing parentheses are stored in numeric variables
Parentheses are numbered left to right, in the order of their opening parenthesis:

    my $str = "abc";
    $str =~ /(((a)(b))(c))/;
    print "1: $1 2: $2 3: $3 4: $4 5: $5\n";
    # prints: 1: abc 2: ab 3: a 4: b 5: c

Avoid capture with ?:

If an opening parenthesis is followed by ?:, the group will not be captured
Useful when you need grouping but don't want the match to be saved

    my $str = "abc";
    $str =~ /(?:a(b)c)/;
    print "$1\n"; # prints "b"

Allow easier reading with the /x switch

If you're doing something tricky with a regex, comment it.
You can do this with the /x flag.
An ugly one-line behemoth of a regex is more readable with whitespace and comments, as allowed by the /x flag.

    my ($num) =
        $ARGV[0] =~ m/^ \+?        # An optional plus sign, to be discarded
                    (              # Capture...
                    (?:(?<!\+)-)?  # a negative sign, if there's no plus behind it,
                    (?:\d*\.)?     # optional digits followed by a decimal point,
                    \d+            # then one or more digits.
                    )$/x;
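
For comparison, here is a sketch of the same pattern written on one line without /x, tried against a few made-up inputs. It assumes the decimal point is meant literally, so the dot is escaped as \. in this version.

```perl
use strict;
use warnings;

# The commented pattern, squeezed onto one line (no /x needed).
# Assumption: the dot is a literal decimal point, hence \.
my $compact = qr/^\+?((?:(?<!\+)-)?(?:\d*\.)?\d+)$/;

my @results;
for my $input ( '+42', '-3.14', '.5', '+-5', 'abc' ) {
    # In list context the match returns the captured number, or nothing.
    my ($num) = $input =~ $compact;
    push @results, defined $num ? $num : 'no match';
}
print join( ', ', @results ), "\n";
# prints 42, -3.14, .5, no match, no match
```

Both forms compile to the same regex; the /x version only differs in how easy it is to read and maintain.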

Whitespace and comments are stripped unless escaped.
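
A minimal sketch of what "unless escaped" means in practice, with made-up strings:

```perl
use strict;
use warnings;

my $str = 'a b';

# Under /x the space in the pattern is ignored, so this is really /ab/.
my $bare = $str =~ /a b/x ? 'match' : 'no match';

# A backslash-escaped space, or a space inside a character class, is literal.
my $escaped = $str =~ /a\ b/x  ? 'match' : 'no match';
my $class   = $str =~ /a[ ]b/x ? 'match' : 'no match';

# The same goes for #, which would otherwise start a comment.
my $hash = 'C#' =~ /C \# /x ? 'match' : 'no match';

print "$bare, $escaped, $class, $hash\n";
# prints no match, match, match, match
```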

Automatically quote your regexes with \Q and \E

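\Q automatically escapes regex metacharacters, up to \E or the end of the pattern
It won't escape dollar signs: variables are still interpolated first, and then their contents are escaped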

    my $num = '3.1415';
    print "ok 1\n" if $num =~ /\Q3.14\E/;
    $num = '3X1415';
    print "ok 2\n" if $num =~ /\Q3.14\E/;
    print "ok 3\n" if $num =~ /3.14/;

prints

    ok 1
    ok 3
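
\Q is most useful when the text to match comes from a variable. A minimal sketch, where $needle and the sample strings are made up for illustration:

```perl
use strict;
use warnings;

my $needle = 'a.b*c';    # search text that happens to contain metacharacters

# Quoted: the . and * are matched literally.
my $literal   = 'x a.b*c y'  =~ /\Q$needle\E/ ? 'match' : 'no match';
my $lookalike = 'x aXbbbc y' =~ /\Q$needle\E/ ? 'match' : 'no match';

# Unquoted: . matches any character and * repeats the b.
my $unquoted = 'x aXbbbc y' =~ /$needle/ ? 'match' : 'no match';

print "$literal, $lookalike, $unquoted\n";    # prints match, no match, match

# quotemeta() applies the same escaping outside of a pattern.
my $escaped = quotemeta($needle);
print "$escaped\n";    # prints a\.b\*c
```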

The /e flag on s/// allows arbitrary code to compute the replacement string

Use $1 and friends if necessary
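
A minimal sketch of a substitution whose replacement is computed by code, using the /e flag on s/// together with the capture variable $1. The sample string is made up:

```perl
use strict;
use warnings;

my $prices = 'apple 3, pear 5';

# With /e the replacement is Perl code; $1 holds the digits just matched.
( my $doubled = $prices ) =~ s/(\d+)/$1 * 2/ge;
print "$doubled\n";    # prints apple 6, pear 10

# Any expression works, including function calls and conditionals.
( my $shouted = $prices ) =~ s/(\w+)/length($1) > 1 ? uc($1) : $1/ge;
print "$shouted\n";    # prints APPLE 3, PEAR 5
```

Copying into a new variable before substituting, as done here, leaves the original string untouched.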

Know when to use study

"This is a very long [… 900 characters skipped…] string that I have here, ending at position 1000"

Now, if you are matching this against the regex /Icky/, the matcher has to find the first letter "I" that could start a match. That may mean scanning through the first 900+ characters until it gets there. What study does is build a table of the 256 possible bytes and where they first appear, so that in this case the scanner can jump right to that position and start matching. Note that as of Perl 5.16 study does nothing, so this applies only to older versions of Perl.

Handle multi-line regexes

Use re => debug

    -Mre=debug
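
The same pragma can also be switched on from inside a program. A minimal sketch, assuming a reasonably modern Perl where the pragma is lexically scoped; the trace is written to STDERR and its exact format varies between Perl versions:

```perl
use strict;
use warnings;

my $matched;
{
    # Only regexes compiled inside this block are traced.
    use re 'debug';
    $matched = 'Perl 101 rocks.' =~ /(\d+)/ ? $1 : undef;
}

# Back to normal: nothing after the block produces debug output.
print "Matched: $matched\n";    # prints Matched: 101
```

Limiting the pragma to a small block keeps the output manageable when only one pattern is misbehaving.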
