Overview

Package regexp implements regular expression search.

The syntax of the regular expressions accepted is the same general syntax used
by Perl, Python, and other languages. More precisely, it is the syntax accepted
by RE2, except for \C. For an overview of the syntax, see the documentation of
the regexp/syntax package.

The regexp implementation provided by this package is guaranteed to run in time
linear in the size of the input. (This is a property not guaranteed by most open
source implementations of regular expressions.) For more information about this
property, see

    http://swtch.com/~rsc/regexp/regexp1.html

or any book about automata theory.

All characters are UTF-8-encoded code points.

There are 16 methods of Regexp that match a regular expression and identify the
matched text. Their names are matched by this regular expression:

    Find(All)?(String)?(Submatch)?(Index)?

If ‘All’ is present, the routine matches successive non-overlapping matches of
the entire expression. Empty matches abutting a preceding match are ignored. The
return value is a slice containing the successive return values of the
corresponding non-‘All’ routine. These routines take an extra integer argument,
n; if n >= 0, the function returns at most n matches/submatches.

If ‘String’ is present, the argument is a string; otherwise it is a slice of
bytes; return values are adjusted as appropriate.

If ‘Submatch’ is present, the return value is a slice identifying the successive
submatches of the expression. Submatches are matches of parenthesized
subexpressions (also known as capturing groups) within the regular expression,
numbered from left to right in order of opening parenthesis. Submatch 0 is the
match of the entire expression, submatch 1 the match of the first parenthesized
subexpression, and so on.

If ‘Index’ is present, matches and submatches are identified by byte index pairs
within the input string: result[2*n:2*n+2] identifies the indexes of the nth
submatch. The pair for n==0 identifies the match of the entire expression. If
‘Index’ is not present, the match is identified by the text of the
match/submatch. If an index is negative, it means that subexpression did not
match any string in the input.

There is also a subset of the methods that can be applied to text read from a
RuneReader:

    MatchReader, FindReaderIndex, FindReaderSubmatchIndex

This set may grow. Note that regular expression matches may need to examine text
beyond the text returned by a match, so the methods that match text from a
RuneReader may read arbitrarily far into the input before returning.

(There are a few other methods that do not match this pattern.)


Example:

    // Compile the expression once, usually at init time.
    // Use raw strings to avoid having to quote the backslashes.
    var validID = regexp.MustCompile(`^[a-z]+\[[0-9]+\]$`)
    fmt.Println(validID.MatchString("adam[23]"))
    fmt.Println(validID.MatchString("eve[7]"))
    fmt.Println(validID.MatchString("Job[48]"))
    fmt.Println(validID.MatchString("snakey"))
    // Output:
    // true
    // true
    // false
    // false

Index

Package files

exec.go regexp.go

func Match

    func Match(pattern string, b []byte) (matched bool, err error)

Match checks whether a textual regular expression matches a byte slice. More
complicated queries need to use Compile and the full Regexp interface.

func MatchReader

    func MatchReader(pattern string, r io.RuneReader) (matched bool, err error)

MatchReader checks whether a textual regular expression matches the text read by
the RuneReader. More complicated queries need to use Compile and the full Regexp
interface.

func MatchString

    func MatchString(pattern string, s string) (matched bool, err error)

MatchString checks whether a textual regular expression matches a string. More
complicated queries need to use Compile and the full Regexp interface.


Example:

    matched, err := regexp.MatchString("foo.*", "seafood")
    fmt.Println(matched, err)
    matched, err = regexp.MatchString("bar.*", "seafood")
    fmt.Println(matched, err)
    matched, err = regexp.MatchString("a(b", "seafood")
    fmt.Println(matched, err)
    // Output:
    // true <nil>
    // false <nil>
    // false error parsing regexp: missing closing ): `a(b`

func QuoteMeta

    func QuoteMeta(s string) string

QuoteMeta returns a string that quotes all regular expression metacharacters
inside the argument text; the returned string is a regular expression matching
the literal text. For example, QuoteMeta(`[foo]`) returns `\[foo\]`.

type Regexp

    type Regexp struct {
        // contains filtered or unexported fields
    }

Regexp is the representation of a compiled regular expression. A Regexp is safe
for concurrent use by multiple goroutines, except for configuration methods,
such as Longest.

func Compile

    func Compile(expr string) (*Regexp, error)

Compile parses a regular expression and returns, if successful, a Regexp object
that can be used to match against text.

When matching against text, the regexp returns a match that begins as early as
possible in the input (leftmost), and among those it chooses the one that a
backtracking search would have found first. This so-called leftmost-first
matching is the same semantics that Perl, Python, and other implementations use,
although this package implements it without the expense of backtracking. For
POSIX leftmost-longest matching, see CompilePOSIX.
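When the pattern is only known at run time, the error returned by Compile can be checked instead of letting a panic occur. The sketch below assumes a hypothetical helper, matchUserPattern, and made-up inputs.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchUserPattern (a hypothetical helper) compiles a pattern supplied at run
// time and reports whether it matches s. A malformed pattern is returned as
// an error rather than causing a panic.
func matchUserPattern(pattern, s string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(s), nil
}

func main() {
	fmt.Println(matchUserPattern(`^[a-z]+$`, "gopher")) // true <nil>
	_, err := matchUserPattern(`a(b`, "ab")
	fmt.Println(err != nil) // true
}
```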

func CompilePOSIX

    func CompilePOSIX(expr string) (*Regexp, error)

CompilePOSIX is like Compile but restricts the regular expression to POSIX ERE
(egrep) syntax and changes the match semantics to leftmost-longest.

That is, when matching against text, the regexp returns a match that begins as
early as possible in the input (leftmost), and among those it chooses a match
that is as long as possible. This so-called leftmost-longest matching is the
same semantics that early regular expression implementations used and that POSIX
specifies.

However, there can be multiple leftmost-longest matches, with different submatch
choices, and here this package diverges from POSIX. Among the possible
leftmost-longest matches, this package chooses the one that a backtracking
search would have found first, while POSIX specifies that the match be chosen to
maximize the length of the first subexpression, then the second, and so on from
left to right. The POSIX rule is computationally prohibitive and not even
well-defined.
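The difference between the two match semantics shows up with a pattern whose first alternative is a prefix of the second. This is a small sketch; the helper names leftmostFirst and leftmostLongest are illustrative, not part of the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// leftmostFirst uses Compile semantics: among matches starting at the
// leftmost position, the one a backtracking search would find first wins.
func leftmostFirst(expr, s string) string {
	return regexp.MustCompile(expr).FindString(s)
}

// leftmostLongest uses CompilePOSIX semantics: among matches starting at
// the leftmost position, the longest one wins.
func leftmostLongest(expr, s string) string {
	return regexp.MustCompilePOSIX(expr).FindString(s)
}

func main() {
	fmt.Println(leftmostFirst(`a|ab`, "ab"))   // a
	fmt.Println(leftmostLongest(`a|ab`, "ab")) // ab
}
```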

func MustCompile

    func MustCompile(str string) *Regexp

MustCompile is like Compile but panics if the expression cannot be parsed. It
simplifies safe initialization of global variables holding compiled regular
expressions.

func MustCompilePOSIX

    func MustCompilePOSIX(str string) *Regexp

MustCompilePOSIX is like CompilePOSIX but panics if the expression cannot be
parsed. It simplifies safe initialization of global variables holding compiled
regular expressions.

func (*Regexp) Copy

        func (re *Regexp) Copy() *Regexp

    Copy returns a new Regexp object copied from re.

    When using a Regexp in multiple goroutines, giving each goroutine its own copy
    helps to avoid lock contention.

    func (*Regexp) Expand

        func (re *Regexp) Expand(dst []byte, template []byte, src []byte, match []int) []byte

    Expand appends template to dst and returns the result; during the append, Expand
    replaces variables in the template with corresponding matches drawn from src.
    The match slice should have been returned by FindSubmatchIndex.

    In the template, a variable is denoted by a substring of the form $name or
    ${name}, where name is a non-empty sequence of letters, digits, and underscores.
    A purely numeric name like $1 refers to the submatch with the corresponding
    index; other names refer to capturing parentheses named with the (?P<name>...)
    syntax. A reference to an out of range or unmatched index or a name that is not
    present in the regular expression is replaced with an empty slice.

    In the $name form, name is taken to be as long as possible: $1x is equivalent to
    ${1x}, not ${1}x, and, $10 is equivalent to ${10}, not ${1}0.

    To insert a literal $ in the output, use $$ in the template.

    Example:

        content := []byte(`
        # comment line
        option1: value1
        option2: value2
        # another comment line
        option3: value3
        `)
        // Regex pattern captures "key: value" pair from the content.
        pattern := regexp.MustCompile(`(?m)(?P<key>\w+):\s+(?P<value>\w+)$`)
        // Template to convert "key: value" to "key=value" by
        // referencing the values captured by the regex pattern.
        template := []byte("$key=$value\n")
        result := []byte{}
        // For each match of the regex in the content.
        for _, submatches := range pattern.FindAllSubmatchIndex(content, -1) {
            // Apply the captured submatches to the template and append the output
            // to the result.
            result = pattern.Expand(result, template, content, submatches)
        }
        fmt.Println(string(result))
        // Output:
        // option1=value1
        // option2=value2
        // option3=value3

    func (*Regexp) ExpandString

        func (re *Regexp) ExpandString(dst []byte, template string, src string, match []int) []byte

    ExpandString is like Expand but the template and source are strings. It appends
    to and returns a byte slice in order to give the calling code control over
    allocation.


    Example:

        content := `
        # comment line
        option1: value1
        option2: value2
        # another comment line
        option3: value3
        `
        pattern := regexp.MustCompile(`(?m)(?P<key>\w+):\s+(?P<value>\w+)$`)
        // Template to convert "key: value" to "key=value" by
        // referencing the values captured by the regex pattern.
        template := "$key=$value\n"
        result := []byte{}
        // For each match of the regex in the content.
        for _, submatches := range pattern.FindAllStringSubmatchIndex(content, -1) {
            // Apply the captured submatches to the template and append the output
            // to the result.
            result = pattern.ExpandString(result, template, content, submatches)
        }
        fmt.Println(string(result))
        // Output:
        // option1=value1
        // option2=value2
        // option3=value3

    func (*Regexp) Find

        func (re *Regexp) Find(b []byte) []byte

    Find returns a slice holding the text of the leftmost match in b of the regular
    expression. A return value of nil indicates no match.
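A short sketch of Find on a byte slice, including the nil result for no match. The pattern and the helper firstWord are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var word = regexp.MustCompile(`[a-z]+`)

// firstWord (a hypothetical helper) returns the leftmost run of lowercase
// ASCII letters in b, or nil when there is none.
func firstWord(b []byte) []byte {
	return word.Find(b)
}

func main() {
	fmt.Printf("%q\n", firstWord([]byte("123 abc def"))) // "abc"
	fmt.Println(firstWord([]byte("123")) == nil)         // true
}
```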

    func (*Regexp) FindAll

        func (re *Regexp) FindAll(b []byte, n int) [][]byte

    FindAll is the ‘All’ version of Find; it returns a slice of all successive
    matches of the expression, as defined by the ‘All’ description in the package
    comment. A return value of nil indicates no match.

    func (*Regexp) FindAllIndex

        func (re *Regexp) FindAllIndex(b []byte, n int) [][]int

    FindAllIndex is the ‘All’ version of FindIndex; it returns a slice of all
    successive matches of the expression, as defined by the ‘All’ description in the
    package comment. A return value of nil indicates no match.

    func (*Regexp) FindAllString

        func (re *Regexp) FindAllString(s string, n int) []string

    FindAllString is the ‘All’ version of FindString; it returns a slice of all
    successive matches of the expression, as defined by the ‘All’ description in the
    package comment. A return value of nil indicates no match.


    Example:

        re := regexp.MustCompile("a.")
        fmt.Println(re.FindAllString("paranormal", -1))
        fmt.Println(re.FindAllString("paranormal", 2))
        fmt.Println(re.FindAllString("graal", -1))
        fmt.Println(re.FindAllString("none", -1))
        // Output:
        // [ar an al]
        // [ar an]
        // [aa]
        // []

    func (*Regexp) FindAllStringIndex

        func (re *Regexp) FindAllStringIndex(s string, n int) [][]int

    FindAllStringIndex is the ‘All’ version of FindStringIndex; it returns a slice
    of all successive matches of the expression, as defined by the ‘All’ description
    in the package comment. A return value of nil indicates no match.

    func (*Regexp) FindAllStringSubmatch

        func (re *Regexp) FindAllStringSubmatch(s string, n int) [][]string

    FindAllStringSubmatch is the ‘All’ version of FindStringSubmatch; it returns a
    slice of all successive matches of the expression, as defined by the ‘All’
    description in the package comment. A return value of nil indicates no match.


    Example:

        re := regexp.MustCompile("a(x*)b")
        fmt.Printf("%q\n", re.FindAllStringSubmatch("-ab-", -1))
        fmt.Printf("%q\n", re.FindAllStringSubmatch("-axxb-", -1))
        fmt.Printf("%q\n", re.FindAllStringSubmatch("-ab-axb-", -1))
        fmt.Printf("%q\n", re.FindAllStringSubmatch("-axxb-ab-", -1))
        // Output:
        // [["ab" ""]]
        // [["axxb" "xx"]]
        // [["ab" ""] ["axb" "x"]]
        // [["axxb" "xx"] ["ab" ""]]

    func (*Regexp) FindAllStringSubmatchIndex

        func (re *Regexp) FindAllStringSubmatchIndex(s string, n int) [][]int

    FindAllStringSubmatchIndex is the ‘All’ version of FindStringSubmatchIndex; it
    returns a slice of all successive matches of the expression, as defined by the
    ‘All’ description in the package comment. A return value of nil indicates no
    match.


    Example:

        re := regexp.MustCompile("a(x*)b")
        // Indices:
        // 01234567 012345678
        // -ab-axb- -axxb-ab-
        fmt.Println(re.FindAllStringSubmatchIndex("-ab-", -1))
        fmt.Println(re.FindAllStringSubmatchIndex("-axxb-", -1))
        fmt.Println(re.FindAllStringSubmatchIndex("-ab-axb-", -1))
        fmt.Println(re.FindAllStringSubmatchIndex("-axxb-ab-", -1))
        fmt.Println(re.FindAllStringSubmatchIndex("-foo-", -1))
        // Output:
        // [[1 3 2 2]]
        // [[1 5 2 4]]
        // [[1 3 2 2] [4 7 5 6]]
        // [[1 5 2 4] [6 8 7 7]]
        // []

    func (*Regexp) FindAllSubmatch

        func (re *Regexp) FindAllSubmatch(b []byte, n int) [][][]byte

    FindAllSubmatch is the ‘All’ version of FindSubmatch; it returns a slice of all
    successive matches of the expression, as defined by the ‘All’ description in the
    package comment. A return value of nil indicates no match.

    func (*Regexp) FindAllSubmatchIndex

        func (re *Regexp) FindAllSubmatchIndex(b []byte, n int) [][]int

      FindAllSubmatchIndex is the ‘All’ version of FindSubmatchIndex; it returns a
      slice of all successive matches of the expression, as defined by the ‘All’
      description in the package comment. A return value of nil indicates no match.

      func (*Regexp) FindIndex

          func (re *Regexp) FindIndex(b []byte) (loc []int)

      FindIndex returns a two-element slice of integers defining the location of the
      leftmost match in b of the regular expression. The match itself is at
      b[loc[0]:loc[1]]. A return value of nil indicates no match.
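The returned pair can be used to slice the input around the match. The following is only a sketch; splitAtFirstNumber and its sample input are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

var digits = regexp.MustCompile(`[0-9]+`)

// splitAtFirstNumber (a hypothetical helper) returns the bytes before,
// inside, and after the leftmost run of digits in b. ok is false when b
// contains no digits.
func splitAtFirstNumber(b []byte) (before, number, after []byte, ok bool) {
	loc := digits.FindIndex(b)
	if loc == nil {
		return nil, nil, nil, false
	}
	return b[:loc[0]], b[loc[0]:loc[1]], b[loc[1]:], true
}

func main() {
	before, number, after, ok := splitAtFirstNumber([]byte("build-42-rc"))
	fmt.Printf("%q %q %q %v\n", before, number, after, ok) // "build-" "42" "-rc" true
}
```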

      func (*Regexp) FindReaderIndex

          func (re *Regexp) FindReaderIndex(r io.RuneReader) (loc []int)

      FindReaderIndex returns a two-element slice of integers defining the location of
      the leftmost match of the regular expression in text read from the RuneReader.
      The match text was found in the input stream at byte offset loc[0] through
      loc[1]-1. A return value of nil indicates no match.

      func (*Regexp) FindReaderSubmatchIndex

          func (re *Regexp) FindReaderSubmatchIndex(r io.RuneReader) []int

      FindReaderSubmatchIndex returns a slice holding the index pairs identifying the
      leftmost match of the regular expression in text read by the RuneReader, and the
      matches, if any, of its subexpressions, as defined by the ‘Submatch’ and ‘Index’
      descriptions in the package comment. A return value of nil indicates no match.

      func (*Regexp) FindString

          func (re *Regexp) FindString(s string) string

      FindString returns a string holding the text of the leftmost match in s of the
      regular expression. If there is no match, the return value is an empty string,
      but it will also be empty if the regular expression successfully matches an
      empty string. Use FindStringIndex or FindStringSubmatch if it is necessary to
      distinguish these cases.


      Example:

          re := regexp.MustCompile("foo.?")
          fmt.Printf("%q\n", re.FindString("seafood fool"))
          fmt.Printf("%q\n", re.FindString("meat"))
          // Output:
          // "food"
          // ""

      func (*Regexp) FindStringIndex

          func (re *Regexp) FindStringIndex(s string) (loc []int)

      FindStringIndex returns a two-element slice of integers defining the location of
      the leftmost match in s of the regular expression. The match itself is at
      s[loc[0]:loc[1]]. A return value of nil indicates no match.


      Example:

          re := regexp.MustCompile("ab?")
          fmt.Println(re.FindStringIndex("tablett"))
          fmt.Println(re.FindStringIndex("foo") == nil)
          // Output:
          // [1 3]
          // true

      func (*Regexp) FindStringSubmatch

          func (re *Regexp) FindStringSubmatch(s string) []string

      FindStringSubmatch returns a slice of strings holding the text of the leftmost
      match of the regular expression in s and the matches, if any, of its
      subexpressions, as defined by the ‘Submatch’ description in the package comment.
      A return value of nil indicates no match.


      Example:

          re := regexp.MustCompile("a(x*)b(y|z)c")
          fmt.Printf("%q\n", re.FindStringSubmatch("-axxxbyc-"))
          fmt.Printf("%q\n", re.FindStringSubmatch("-abzc-"))
          // Output:
          // ["axxxbyc" "xxx" "y"]
          // ["abzc" "" "z"]

      func (*Regexp) FindStringSubmatchIndex

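          func (re *Regexp) FindStringSubmatchIndex(s string) []int

      FindStringSubmatchIndex returns a slice holding the index pairs identifying the
      leftmost match of the regular expression in s and the matches, if any, of its
      subexpressions, as defined by the ‘Submatch’ and ‘Index’ descriptions in the
      package comment. A return value of nil indicates no match.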

      func (*Regexp) FindSubmatch

          func (re *Regexp) FindSubmatch(b []byte) [][]byte

      FindSubmatch returns a slice of slices holding the text of the leftmost match of
      the regular expression in b and the matches, if any, of its subexpressions, as
      defined by the ‘Submatch’ description in the package comment. A return value of
      nil indicates no match.

      func (*Regexp) FindSubmatchIndex

          func (re *Regexp) FindSubmatchIndex(b []byte) []int

      FindSubmatchIndex returns a slice holding the index pairs identifying the
      leftmost match of the regular expression in b and the matches, if any, of its
      subexpressions, as defined by the ‘Submatch’ and ‘Index’ descriptions in the
      package comment. A return value of nil indicates no match.

      func (*Regexp) LiteralPrefix

          func (re *Regexp) LiteralPrefix() (prefix string, complete bool)

      LiteralPrefix returns a literal string that must begin any match of the regular
      expression re. It returns the boolean true if the literal string comprises the
      entire regular expression.
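A brief sketch of the two return values, using a hypothetical wrapper named describePrefix: a pattern with a trailing wildcard has a literal prefix that is not the whole expression, while a purely literal pattern is reported as complete.

```go
package main

import (
	"fmt"
	"regexp"
)

// describePrefix (a hypothetical helper) compiles expr and returns its
// literal prefix along with whether that prefix is the entire expression.
func describePrefix(expr string) (string, bool) {
	return regexp.MustCompile(expr).LiteralPrefix()
}

func main() {
	fmt.Println(describePrefix(`foo.*`)) // foo false
	fmt.Println(describePrefix(`foo`))   // foo true
}
```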

      func (*Regexp) Longest

          func (re *Regexp) Longest()

      Longest makes future searches prefer the leftmost-longest match. That is, when
      matching against text, the regexp returns a match that begins as early as
      possible in the input (leftmost), and among those it chooses a match that is as
      long as possible. This method modifies the Regexp and may not be called
      concurrently with any other methods.
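The effect of Longest can be seen with a pattern whose first alternative is empty. This is a minimal sketch; findBoth is a hypothetical helper, and it calls Longest from a single goroutine only.

```go
package main

import (
	"fmt"
	"regexp"
)

// findBoth (a hypothetical helper) returns the match found with the default
// leftmost-first semantics and the match found after calling Longest.
func findBoth(expr, s string) (first, longest string) {
	re := regexp.MustCompile(expr)
	first = re.FindString(s)
	re.Longest() // modifies re; not safe to call concurrently with other methods
	longest = re.FindString(s)
	return first, longest
}

func main() {
	fmt.Println(findBoth(`a(|b)`, "ab")) // a ab
}
```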

      func (*Regexp) Match

          func (re *Regexp) Match(b []byte) bool

      Match reports whether the Regexp matches the byte slice b.

      func (*Regexp) MatchReader

          func (re *Regexp) MatchReader(r io.RuneReader) bool

      MatchReader reports whether the Regexp matches the text read by the RuneReader.

      func (*Regexp) MatchString

          func (re *Regexp) MatchString(s string) bool

      MatchString reports whether the Regexp matches the string s.


      Example:

          re := regexp.MustCompile("(gopher){2}")
          fmt.Println(re.MatchString("gopher"))
          fmt.Println(re.MatchString("gophergopher"))
          fmt.Println(re.MatchString("gophergophergopher"))
          // Output:
          // false
          // true
          // true

      func (*Regexp) NumSubexp

          func (re *Regexp) NumSubexp() int

      NumSubexp returns the number of parenthesized subexpressions in this Regexp.

      func (*Regexp) ReplaceAll

          func (re *Regexp) ReplaceAll(src, repl []byte) []byte

      ReplaceAll returns a copy of src, replacing matches of the Regexp with the
      replacement text repl. Inside repl, $ signs are interpreted as in Expand, so for
      instance $1 represents the text of the first submatch.

      func (*Regexp) ReplaceAllFunc

          func (re *Regexp) ReplaceAllFunc(src []byte, repl func([]byte) []byte) []byte

      ReplaceAllFunc returns a copy of src in which all matches of the Regexp have
      been replaced by the return value of function repl applied to the matched byte
      slice. The replacement returned by repl is substituted directly, without using
      Expand.

      func (*Regexp) ReplaceAllLiteral

          func (re *Regexp) ReplaceAllLiteral(src, repl []byte) []byte

      ReplaceAllLiteral returns a copy of src, replacing matches of the Regexp with
      the replacement bytes repl. The replacement repl is substituted directly,
      without using Expand.

      func (*Regexp) ReplaceAllLiteralString

          func (re *Regexp) ReplaceAllLiteralString(src, repl string) string

      ReplaceAllLiteralString returns a copy of src, replacing matches of the Regexp
      with the replacement string repl. The replacement repl is substituted directly,
      without using Expand.


      Example:

          re := regexp.MustCompile("a(x*)b")
          fmt.Println(re.ReplaceAllLiteralString("-ab-axxb-", "T"))
          fmt.Println(re.ReplaceAllLiteralString("-ab-axxb-", "$1"))
          fmt.Println(re.ReplaceAllLiteralString("-ab-axxb-", "${1}"))
          // Output:
          // -T-T-
          // -$1-$1-
          // -${1}-${1}-

      func (*Regexp) ReplaceAllString

          func (re *Regexp) ReplaceAllString(src, repl string) string

      ReplaceAllString returns a copy of src, replacing matches of the Regexp with the
      replacement string repl. Inside repl, $ signs are interpreted as in Expand, so
      for instance $1 represents the text of the first submatch.


      Example:

          re := regexp.MustCompile("a(x*)b")
          fmt.Println(re.ReplaceAllString("-ab-axxb-", "T"))
          fmt.Println(re.ReplaceAllString("-ab-axxb-", "$1"))
          fmt.Println(re.ReplaceAllString("-ab-axxb-", "$1W"))
          fmt.Println(re.ReplaceAllString("-ab-axxb-", "${1}W"))
          // Output:
          // -T-T-
          // --xx-
          // ---
          // -W-xxW-

      func (*Regexp) ReplaceAllStringFunc

          func (re *Regexp) ReplaceAllStringFunc(src string, repl func(string) string) string

      ReplaceAllStringFunc returns a copy of src in which all matches of the Regexp
      have been replaced by the return value of function repl applied to the matched
      substring. The replacement returned by repl is substituted directly, without
      using Expand.

      func (*Regexp) Split

          func (re *Regexp) Split(s string, n int) []string

      Split slices s into substrings separated by the expression and returns a slice
      of the substrings between those expression matches.

      The slice returned by this method consists of all the substrings of s not
      contained in the slice returned by FindAllString. When called on an expression
      that contains no metacharacters, it is equivalent to strings.SplitN.

      Example:

          s := regexp.MustCompile("a*").Split("abaabaccadaaae", 5)
          // s: ["", "b", "b", "c", "cadaaae"]

      The count determines the number of substrings to return:

          n > 0: at most n substrings; the last substring will be the unsplit remainder.
          n == 0: the result is nil (zero substrings)
          n < 0: all substrings


      Example:

          a := regexp.MustCompile("a")
          fmt.Println(a.Split("banana", -1))
          fmt.Println(a.Split("banana", 0))
          fmt.Println(a.Split("banana", 1))
          fmt.Println(a.Split("banana", 2))
          zp := regexp.MustCompile("z+")
          fmt.Println(zp.Split("pizza", -1))
          fmt.Println(zp.Split("pizza", 0))
          fmt.Println(zp.Split("pizza", 1))
          fmt.Println(zp.Split("pizza", 2))
          // Output:
          // [b n n ]
          // []
          // [banana]
          // [b nana]
          // [pi a]
          // []
          // [pizza]
          // [pi a]

      func (*Regexp) String

          func (re *Regexp) String() string

      String returns the source text used to compile the regular expression.

      func (*Regexp) SubexpNames

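          func (re *Regexp) SubexpNames() []string

      SubexpNames returns the names of the parenthesized subexpressions in this
      Regexp. The name for the first subexpression is names[1], so that if m is a
      match slice, the name for m[i] is SubexpNames()[i]. Since the Regexp as a
      whole cannot be named, names[0] is always the empty string.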


      Example:

          re := regexp.MustCompile("(?P<first>[a-zA-Z]+) (?P<last>[a-zA-Z]+)")
          fmt.Println(re.MatchString("Alan Turing"))
          fmt.Printf("%q\n", re.SubexpNames())
          reversed := fmt.Sprintf("${%s} ${%s}", re.SubexpNames()[2], re.SubexpNames()[1])
          fmt.Println(reversed)
          fmt.Println(re.ReplaceAllString("Alan Turing", reversed))
          // Output:
          // true
          // ["" "first" "last"]
          // ${last} ${first}
          // Turing Alan
