4.3.4.4 Counting Occurrences

 We have seen that ? matches zero or one occurrence of a character or a parenthesized subpattern. The multipliers * and + do not give a specific count; they simply mean zero or more and one or more, respectively. It is also possible to ask for an exact count when we perform pattern matching. For example, when we write

 

/Bob{2}/

 

we are looking for Bo followed by exactly two lower-case b’s; the count {2} applies only to the b immediately before it. Therefore, this pattern matches a string containing Bobby or Bobbit, but does not match Bolivar or Bobster.
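As a quick check of this claim, the following sketch tries the pattern on the four names mentioned above. The names @names and %matched are illustrative choices and do not come from the text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The four names discussed in the text; %matched is a hypothetical
# helper hash that records the outcome for each name.
my @names = ("Bobby", "Bobbit", "Bolivar", "Bobster");
my %matched;

foreach my $name (@names) {
    # b{2} requires two consecutive lower-case b's right after "Bo"
    $matched{$name} = ($name =~ /Bob{2}/) ? 1 : 0;
    print "$name: ", ($matched{$name} ? "match" : "no match"), "\n";
}
```

Because the pattern is not anchored, a string with more than two b’s, such as Bobbb, also matches: the pattern simply finds Bobb inside it.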

We can change the last program we presented so that we use a count of two when we are looking for the last two digits of a citation index. In the last program we wrote \d\d toward the end of the pattern. This time, we write \d{2} instead. That is the only difference between the last program and the current one.

 Program 4.10

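#!/usr/bin/perl

while (<>){
    # \{ and \} match the literal braces of \cite{...}; \d{2} is the count.
    # Escaping them is a precaution: newer Perl versions warn about or
    # reject an unescaped literal { in a pattern.
    if (/\\cite\{[A-Z][a-zA-Z]*(19)?\d{2}\}/){
        print $_;
    }
}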
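To see what the pattern in Program 4.10 accepts without preparing an input file, the sketch below applies it to a few made-up citation strings. The helper has_citation and the sample lines are hypothetical, and the literal braces are escaped here as a precaution for newer Perl versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# has_citation is a hypothetical helper wrapping the pattern of Program 4.10.
# The literal braces are written as \{ and \} (an assumption made for
# compatibility with newer Perl versions).
sub has_citation {
    my ($line) = @_;
    return $line =~ /\\cite\{[A-Z][a-zA-Z]*(19)?\d{2}\}/ ? 1 : 0;
}

# Made-up sample lines; single quotes keep the backslash literal.
my @lines = (
    'See \cite{Smith1998} for details.',   # 19 followed by two digits
    'See \cite{Jones87} for details.',     # two digits only
    'See \cite{smith98} for details.',     # key starts with a lower-case letter
    'See \cite{Lee5} for details.',        # only one digit
);

foreach my $line (@lines) {
    my $verdict = has_citation($line) ? "match" : "no match";
    print "$verdict: $line\n";
}
```

The first two lines are reported as matches; the last two are not, because the key must begin with a capital letter and end in two digits.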

Instead of specifying an exact count, we can specify a range as seen below.

 

/b{2,4}/

/b{2,}/

/b{0,5}/

 

In the first case, we are looking for between two and four b’s, in the second case two or more b’s, and in the third case between zero and five b’s. Leaving out the upper limit, as in {2,}, means there is no maximum. The lower limit should always be written, so we put 0 there when we want to say zero or more. Therefore,
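The effect of the three ranges can be observed by measuring how much of a string each pattern consumes. The following is a minimal sketch; match_length is a hypothetical helper introduced only for this illustration, and the test strings are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# match_length (hypothetical helper): length of the first match of $re
# in $text, or -1 if the pattern does not match at all.
sub match_length {
    my ($re, $text) = @_;
    return $text =~ /($re)/ ? length($1) : -1;
}

my $six = "bbbbbb";   # six consecutive b's

# Multipliers take as many characters as their range allows.
my %len = (
    'b{2,4}' => match_length(qr/b{2,4}/, $six),   # stops at four b's
    'b{2,}'  => match_length(qr/b{2,}/,  $six),   # takes all six b's
    'b{0,5}' => match_length(qr/b{0,5}/, $six),   # stops at five b's
);

foreach my $pat (sort keys %len) {
    print "$pat on $six: $len{$pat}\n";
}

# With a single b, b{2,4} fails, but b{0,5} succeeds with an empty match
# at the very start of the string, because zero b's are allowed.
my $none  = match_length(qr/b{2,4}/, "abc");
my $empty = match_length(qr/b{0,5}/, "abc");
print "b{2,4} on abc: $none\n";
print "b{0,5} on abc: $empty\n";
```

The last case is worth noting: a pattern whose lower limit is 0 can succeed without consuming any characters.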

 

/b*/

/b+/

/b?/

 

are exactly the same as

 

/b{0,}/

/b{1,}/

/b{0,1}/

 

respectively. However, for these three special cases, most people prefer to use *, + and ? as multipliers because of tradition as well as brevity.
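The equivalence can be checked empirically. The sketch below compares each short form with its braced counterpart on a handful of made-up strings; the surrounding a and c in the patterns, and the names @pairs, @samples and $differences, are assumptions added so that the number of b’s actually affects the outcome.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each pair holds a short form and the braced form it should equal.
my @pairs = (
    [ qr/ab*c/, qr/ab{0,}c/  ],
    [ qr/ab+c/, qr/ab{1,}c/  ],
    [ qr/ab?c/, qr/ab{0,1}c/ ],
);

# Made-up sample strings with zero, one, two and three b's, plus one
# string that none of the patterns can match.
my @samples = ("ac", "abc", "abbc", "abbbc", "xyz");

my $differences = 0;
foreach my $pair (@pairs) {
    my ($short, $long) = @$pair;
    foreach my $s (@samples) {
        my $r1 = ($s =~ $short) ? 1 : 0;
        my $r2 = ($s =~ $long)  ? 1 : 0;
        $differences++ if $r1 != $r2;
    }
}

print "differences found: $differences\n";
```

No differences are found for these samples, which is consistent with the two notations being interchangeable.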