Definition of Regular Expressions
String Concatenation
This is the product of two words. It is obtained by appending them together to form one long word.
s = abb
t = bab
st = abbbab
ts = bababb
It is non-commutative.
These strings can be defined formally:
\[s=x_1\ldots x_n , t=y_1\ldots y_m\] \[st=x_1\ldots x_ny_1\ldots y_m\]There is also some notation:
- $s^k$ - The string $s$ repeated $k$ times.
- $\vert s\vert$ - The length of the string.
Concatenation of Languages
This operation can be extended to languages:
-
The concatenation of languages $L_1$ and $L_2$ is:
\[L_1L_2=\{st\vert s\in L_1,t\in L_2\}\] -
The $n^{\text{th}}$ power of $L^n$ is:
\[L^n=\{s_1s_2\ldots s_n\vert s_1,s_2,\ldots,s_n\in L\}\] -
The union of $L_1$ and $L_2$ is:
\[L_1\cup L_2=\{s\vert s\in L_1 \text{ or } s\in L_2\}\]
Example
For the following languages:
- $L_1\{0,01\}$
- $L_2\{\epsilon,1,11,111,\ldots\}$
you would get the following results:
- $L_1L_2=\{0,01,011,0111,\ldots\}$
-
$L_1^2=\{00,001,010,0101\}$
This is all the combinations of the strings
0
and01
. - $L_1\cup L_2=\{0,01,\epsilon,1,11,111,\ldots\}$
Operations on Languages
See Set Theory, Kleene Star & DNF for additional examples.
- $L^*=L^0\cup L^1 \cup L^2\cup\ldots$
-
This is all the strings made up of all combinations of $L$, including the empty string.
The number of words in each length of expansion follows the Fibonacci sequence.
-
Combining Languages
We can construct languages by starting with simple ones, like $\{0\},\{1\}$ and combining them. Here are two languages and their equivalent regular expressions:
-
$\{0\}(\{0\}\cup\{1\})^*$ - All strings that start with
0
.0(0 + 1)*
-
$(\{0\}\{1\}^*)\cup(\{1\}\{0\}^*)$ -
0
followed by any number of1
s or1
followed by any number of0
s.01* + 10*
The order of operations is like so: *
,.
,+
.
Regular Expressions
A regular expression over $\Sigma$ is an expression formed using the following rules:
- The symbols $\emptyset$ and $\epsilon$ are regular expressions.
- Ever $a$ in $\Sigma$ is a regular expression.
- If $R$ and $S$ are regular expressions, so are $R+S$, $RS$ and $R^*$.
Any language corresponding to a regular expression is called regular. The set of all of them define the class of regular languages.