4.3.4.3 ? : Zero or one
The multiplier ? is used to indicate zero or one occurrence of a character or a subpattern. For example, if we write
/ab.?o/
we are looking for an a followed by a b, followed by zero or one non-\n character, followed by an o. So, this pattern matches the line
It is difficult to do an absolute determination of
as well as the following line.
It was reported that about three quarters
The second line matches because we have an a followed by a b, followed by nothing, followed by o.
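One way to check this behaviour is to run the pattern against both sample lines, together with a line that should fail. The following is only a minimal sketch: the array @lines and its third entry are made up here for illustration and do not come from the text file discussed in this chapter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The first two lines are the examples from the text; the third is a
# made-up line in which more than one character separates "ab" from "o".
my @lines = (
    "It is difficult to do an absolute determination of",
    "It was reported that about three quarters",
    "The cabin door was shut",
);

foreach my $line (@lines) {
    # .? allows at most one non-newline character between "ab" and "o"
    my $verdict = ($line =~ /ab.?o/) ? "match" : "no match";
    print "$verdict: $line\n";
}
```

In the first line the pattern matches abso inside absolute, with .? consuming the s. In the second it matches abo inside about, with .? consuming nothing. The third line has only one ab, and it is followed by i and n, so the pattern fails there.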
The question mark ? can be used as a multiplier after a subpattern grouped using parentheses.
Let us now change our program a bit more and look for citation indices that satisfy a syntax we have imposed on ourselves. Note that this syntax is not enforced by TeX or LaTeX; it is simply a convention to which we want to adhere. We require a citation index to start with an uppercase letter, followed by zero or more letters of either case, followed by two digits giving the year of publication. The next program we write is the following.
Program 4.8
#!/usr/bin/perl
while (<>){
    if (/\\cite{[A-Z][a-zA-Z]*\d\d}/){
        print $_;
    }
}
This program looks for the string \cite followed by a {, followed by an uppercase letter, followed by zero or more uppercase or lowercase letters, followed by two digits and finally a }. It prints every line of the text file that contains a citation of this form, that is, a citation with one author in each citation.
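To see which citation indices this syntax accepts, the pattern can be tried on a few strings directly instead of on a whole file. The sketch below makes two assumptions: the sample strings in @samples are hypothetical, and the braces are written as \{ and \} because some Perl versions warn about an unescaped { in a pattern. The escaped form matches the same text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as Program 4.8, with the braces escaped (an assumption
# made here for portability across Perl versions).
my $cite = qr/\\cite\{[A-Z][a-zA-Z]*\d\d\}/;

# Hypothetical test strings; single quotes keep the backslash literal.
my @samples = (
    '\cite{Kalita93}',          # uppercase letter, letters, two digits
    '\cite{K93}',               # zero letters after the first one
    '\cite{kalita93}',          # starts with a lowercase letter
    '\cite{Kalita1993}',        # four digits instead of two
    '\cite{Kalita93,Smith01}',  # two indices in one citation
);

foreach my $s (@samples) {
    printf "%-26s %s\n", $s, ($s =~ $cite ? "accepted" : "rejected");
}
```

Under these assumptions the first two strings are accepted and the last three are rejected. The fourth fails because the two digits must be followed immediately by }, and the fifth fails because a comma follows the digits.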
Let us now change the syntax of a citation index a little more by making the specification of the year of publication slightly more complex. We want to allow two digits for the year, as we have already done. In addition, we want to allow four digits for the year. However, in that case, we want to make sure that the first two digits are 19, referring to the twentieth century. Therefore, in this new syntax, either one of the following is acceptable: Kalita93 or Kalita1993. However, Kalita2001 is not acceptable. The following program accomplishes this specification.
Program 4.9
#!/usr/bin/perl
while (<>){
    if (/\\cite{[A-Z][a-zA-Z]*(19)?\d\d}/){
        print $_;
    }
}
In this program, the subpattern
(19)?\d\d}
allows us to parse the number specification properly. Here, we group 19 into a subpattern by putting it inside parentheses. Then we put a ? after the closing parenthesis. This makes the subpattern 19 optional.
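The effect of the optional group can be checked on the three indices mentioned above. This is a minimal sketch: the helper name has_valid_citation is hypothetical, and the braces are escaped as \{ and \}, an assumption made here to avoid warnings on some Perl versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $text contains a citation written in
# the syntax of Program 4.9, and 0 otherwise.
sub has_valid_citation {
    my ($text) = @_;
    # (19)? makes the century prefix optional; \d\d is the two-digit year
    return ($text =~ /\\cite\{[A-Z][a-zA-Z]*(19)?\d\d\}/) ? 1 : 0;
}

foreach my $key (qw(Kalita93 Kalita1993 Kalita2001)) {
    # Build the citation with single quotes so the backslash stays literal
    my $text = '\cite{' . $key . '}';
    my $verdict = has_valid_citation($text) ? "acceptable" : "not acceptable";
    print "$key: $verdict\n";
}
```

Kalita2001 is rejected because neither alternative fits: without the optional 19, the two digits 20 would have to be followed by }, and with it, the digits would have to begin with 19.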
