R regexpr Function Examples -- Perschon

R regexpr Function

regexpr returns an integer vector of the same length as text giving the starting position of the first match or -1 if there is none, with attribute "match.length", an integer vector giving the length of the matched text (or -1 for no match). The match positions and lengths are in characters unless useBytes = TRUE is used, when they are in bytes.

regexpr(pattern, text, ignore.case = FALSE, perl = FALSE,
fixed = FALSE, useBytes = FALSE)

• pattern: regular expression, or string for fixed=TRUE
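• text: a character vector in which to search for matches
• ignore.case: logical. If TRUE, matching is case-insensitive; if FALSE (the default), it is case-sensitive
• perl: logical. If TRUE, Perl-compatible regular expressions (PCRE) are used instead of the default extended regular expressions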
• fixed: logical. If TRUE, pattern is a string to be matched as is. Overrides all conflicting arguments
• useBytes: logical. If TRUE the matching is done byte-by-byte rather than character-by-character
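
The effect of the arguments listed above can be sketched on one sample string. The string and the variable names here are made up for illustration; only regexpr and its documented arguments are used.

```r
# Hypothetical sample string, chosen only for illustration
x <- "Total: 3.5 kg (TOTAL weight)"

# Default: case-sensitive regular expression, lowercase "total" is not found
case_sensitive <- as.integer(regexpr("total", x))                        # -1

# ignore.case = TRUE lets "total" match "Total" at position 1
case_insensitive <- as.integer(regexpr("total", x, ignore.case = TRUE))  # 1

# As a regular expression, "." matches any character, so the match is at 1
dot_regex <- as.integer(regexpr(".", x))                                 # 1

# fixed = TRUE treats "." as a literal dot, found at position 9
dot_fixed <- as.integer(regexpr(".", x, fixed = TRUE))                   # 9

# perl = TRUE enables PCRE features such as lookahead:
# digits that are immediately followed by " kg"
lookahead <- as.integer(regexpr("\\d+(?= kg)", x, perl = TRUE))          # 10
```

as.integer() is used only to drop the attributes so that the bare start position is visible.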

> x <- "line 4322: He is now 25 years old, and weights 130lbs"
> y <- regexpr("\\d+",x)
> y

[1] 6
attr(,"match.length")
[1] 4
attr(,"useBytes")
[1] TRUE
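
The start position and the "match.length" attribute are enough to cut the matched text out of the string. A minimal sketch, reusing the x from the example above (the names m, start, len and first_number are illustrative):

```r
x <- "line 4322: He is now 25 years old, and weights 130lbs"
m <- regexpr("\\d+", x)

# Start position and length of the first match
start <- as.integer(m)                 # 6
len <- attr(m, "match.length")         # 4

# Extract the matched text by position
first_number <- substring(x, start, start + len - 1)   # "4322"

# regmatches() performs the same extraction from the match object
same_number <- regmatches(x, m)        # "4322"
```

Note that regexpr only reports the first match in each string, so "25" and "130" are not returned here.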

> x <- "line 4322: He is now 25 years old, and weights 130lbs"
> y <- regexpr("[[:digit:]]",x)
> y

[1] 6
attr(,"match.length")
[1] 1
attr(,"useBytes")
[1] TRUE

> if (y[[1]][1] != -1) print("match")

[1] "match"
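
For a single string, the test above amounts to comparing the first element of the result with -1. When only a yes/no answer is needed, base R's grepl gives it directly. A short sketch (variable names are illustrative):

```r
x <- "line 4322: He is now 25 years old, and weights 130lbs"
y <- regexpr("[[:digit:]]", x)

# y[1] is the start position of the first match; -1 means no match
has_match <- y[1] != -1

# grepl() returns TRUE or FALSE without exposing positions
has_digit <- grepl("[[:digit:]]", x)

# A string without digits yields -1
no_match <- as.integer(regexpr("[[:digit:]]", "no digits here"))

if (has_match) print("match")
```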

Vector match:

> str <- c("Regular", "expression", "examples of R language")
> x <- regexpr("x*ress", str)
> x

[1] -1 4 -1
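
With a vector input, the result has one entry per element, so it can be used to select the elements that matched and to pull out the matched text. A minimal sketch, using the same str vector (the names m, matched, hits and pieces are illustrative):

```r
str <- c("Regular", "expression", "examples of R language")
m <- regexpr("x*ress", str)

# One start position and one match length per input element
starts <- as.integer(m)                # -1  4 -1
lens <- attr(m, "match.length")        # -1  4 -1

# Keep only the elements that contain a match
matched <- starts != -1
hits <- str[matched]                   # "expression"

# regmatches() returns the matched text for the matching elements only
pieces <- regmatches(str, m)           # "ress"
```

Because regmatches drops non-matching elements, its result can be shorter than the input vector.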

Regular Expression Syntax:

Backslashes are written doubled (e.g. \\d) because that is how they are typed inside an R string. The POSIX classes such as [:digit:] are used inside a bracket expression, e.g. [[:digit:]].

Syntax       Description
\\d          Digit: 0, 1, 2 ... 9
\\D          Not a digit
\\s          Whitespace character
\\S          Not a whitespace character
\\w          Word character (letter, digit or underscore)
\\W          Not a word character
\\t          Tab
\\n          New line
^            Beginning of the string
$            End of the string
\            Escape special characters, e.g. \\ is "\", \+ is "+" (double each backslash inside an R string, e.g. "\\+")
|            Alternation match, e.g. (e|d)n matches "en" and "dn"
.            Any single character (with perl = TRUE, any character except a newline)
[ab]         a or b
[^ab]        Any character except a and b
[0-9]        Any digit
[A-Z]        Any uppercase letter A to Z
[a-z]        Any lowercase letter a to z
[A-z]        Uppercase and lowercase letters; in ASCII order this range also covers [ \ ] ^ _ `, so [A-Za-z] is safer for letters only
i+           i at least one time
i*           i zero or more times
i?           i zero or one time
i{n}         i occurs n times in sequence
i{n1,n2}     i occurs n1 to n2 times in sequence
i{n1,n2}?    Non-greedy match: i occurs n1 to n2 times, as few times as possible
i{n,}        i occurs >= n times
[:alnum:]    Alphanumeric characters: [:alpha:] and [:digit:]
[:alpha:]    Alphabetic characters: [:lower:] and [:upper:]
[:blank:]    Blank characters: e.g. space, tab
[:cntrl:]    Control characters
[:digit:]    Digits: 0 1 2 3 4 5 6 7 8 9
[:graph:]    Graphical characters: [:alnum:] and [:punct:]
[:lower:]    Lower-case letters in the current locale
[:print:]    Printable characters: [:alnum:], [:punct:] and space
[:punct:]    Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]    Space characters: tab, newline, vertical tab, form feed, carriage return, space
[:upper:]    Upper-case letters in the current locale
[:xdigit:]   Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
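
A few of the syntax entries above can be tried out with regexpr directly. This is a small sketch on made-up strings, not an exhaustive tour; all variable names are illustrative.

```r
# Anchors: ^ ties the match to the start, $ to the end of the string
start_pos <- as.integer(regexpr("^ab", "abab"))    # 1
end_pos <- as.integer(regexpr("ab$", "abab"))      # 3

# Escaping: "\\+" in an R string is the regex \+, a literal plus sign
plus_pos <- as.integer(regexpr("\\+", "1+2"))      # 2

# Greedy vs non-greedy: adding ? makes the quantifier match as little as possible
tags <- "<a><b>"
greedy <- regmatches(tags, regexpr("<.+>", tags))   # "<a><b>"
lazy <- regmatches(tags, regexpr("<.+?>", tags))    # "<a>"

# Counted repetition: two to three consecutive "a"
run <- regexpr("a{2,3}", "caaaat")
run_start <- as.integer(run)                        # 2
run_len <- attr(run, "match.length")                # 3

# POSIX classes go inside a bracket expression
upper <- regmatches("abcDEF", regexpr("[[:upper:]]+", "abcDEF"))   # "DEF"
space_pos <- as.integer(regexpr("[[:space:]]", "a b"))             # 2
```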
