3.1.19 splitting a String, joining a List, grepping from a List
Perl provides us with a straight-forward way to take a string and extract sub-strings that are separated from each other in a consistent way. For example, if we want to extract all the individual words from a line of text, we can do so very easily. We know that each word is separated from the word before it and the word after it by one or more blank spaces; the first word has no word before it, and the last word has none after it.
Program 3.26
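#!/usr/bin/perl
use strict 'vars';
$" = "\n";                       # separator used when a list is interpolated in a string
open(IN, "input.txt");
my $line = <IN>;                 # read the first line of the file
my @words = split /\s+/, $line;  # break the line apart at runs of whitespace
print "@words\n";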
This program reads the first line of text from the file input.txt and breaks it apart into a list of smaller strings based on the pattern \s+. When a line is read from a file, it is read as a single continuous string from beginning to end. split takes two arguments: a pattern that determines where the string is split, and the string being split. We will learn more about patterns and regular expressions later in the book, but for the time being, we should know that the predefined escape sequence \s, when used inside a pattern, stands for a whitespace character, i.e., a blank space, a tab, a newline character, a return character, or a form feed character. \s+ means one or more \s characters. Now, $line still ends with the newline character \n that was read from the file, but because \n matches \s, split treats it as a separator and discards the empty field that follows it. Therefore, @words contains exactly the words in the line read, with no newline attached to the last one. The output of this program looks like the following.
Perl
provides
us
with
a
straight-forward
way
to
take
a
string
and
extract
These are all the words that occur in that line, separated out. Each word is printed on its own line because the program sets the list separator $" to "\n". If we know for sure that the words are separated from each other by one or more blank spaces only, we could have written the line that does the splitting as follows.
my @words = split / +/, $line;
Here, the pattern / +/ consists of a blank space character followed by +, so it matches one or more blank spaces. In Perl, a pattern is usually written between a pair of / delimiters, one in front and one at the end.
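To see how the two patterns differ, the following minimal sketch splits the same string both ways. The sample string is made up for illustration; it contains a tab and ends with a newline, as a line read from a file would.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line: words separated by spaces and one tab,
# ending with the newline that a file read leaves in place.
my $line = "take a\tstring  and extract\n";

# Split on any run of whitespace characters (space, tab, newline, ...).
my @by_whitespace = split /\s+/, $line;

# Split on runs of blank spaces only.
my @by_spaces = split / +/, $line;

print scalar(@by_whitespace), " fields with /\\s+/\n";   # prints: 5 fields with /\s+/
print scalar(@by_spaces), " fields with / +/\n";         # prints: 4 fields with / +/

# With / +/, the tab stays inside a field and the newline stays
# attached to the last field.
print "last field still ends in a newline\n" if $by_spaces[-1] =~ /\n$/;
```

With / +/, "a\tstring" remains a single field because a tab is not a blank space. This is why the blank-space pattern is safe only when we are sure about how the words are separated.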
We can see that the program given above can be easily extended to read a whole file (or, even all the files in a given directory, possibly recursively) and to count and enumerate all the distinct words that occur.
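One possible way to do the counting is sketched below. It assumes the use of a hash and a subroutine, and the helper name count_words is hypothetical. To keep the sketch self-contained, the lines come from a small list rather than from a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# count_words is a hypothetical helper, not a Perl built-in.
# It returns a hash that maps each distinct word to its count.
sub count_words {
    my @lines = @_;
    my %count;
    foreach my $line (@lines) {
        foreach my $word (split /\s+/, $line) {
            # A line that starts with whitespace yields an empty first field.
            next if $word eq '';
            $count{$word}++;
        }
    }
    return %count;
}

# Sample lines standing in for the contents of a file.
my @lines = ("to take a string\n", "  and to extract a word\n");
my %count = count_words(@lines);

# Enumerate the distinct words in sorted order with their counts.
foreach my $word (sort keys %count) {
    print "$word: $count{$word}\n";
}
```

To process a real file, the list of sample lines could be replaced by the lines read from a file handle such as IN.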
The built-in function join simply takes a list of strings and puts them together sequentially into one large string. join takes two arguments: the first is a string that is used as the glue, and the second is a list of strings. For example, if we want to write out all the words that occur in the first line of the file, sorted in ascending order and one per line, in the form of an HTML list, we can write the following.
Program 3.27
#!/usr/bin/perl
use strict 'vars';
$" = "\n";
open(IN, "input.txt");
my $line = <IN>;
my @words = sort (split /\s+/, $line);
my $printedWords = join "\n<li> ", @words;
print "<ul> \n<li> $printedWords\n</ul>\n";
The output of this program, as it appears when rendered as an HTML list, is given below.
- Perl
- a
- a
- and
- extract
- provides
- straight-forward
- string
- take
- to
- us
- way
- with
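The role of the glue can be seen in isolation in the following small sketch. The word list and the comma glue are chosen only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @words = ("way", "to", "take");

# Glue the sorted words together with a comma and a space.
my $joined = join ", ", sort @words;
print "$joined\n";              # prints: take, to, way

# The glue appears only between elements, never at the ends,
# so splitting on the same glue recovers the list.
my @again = split /, /, $joined;
print scalar(@again), "\n";     # prints: 3
```

Because the glue is placed only between elements, Program 3.27 has to write the first list item tag itself in the print statement.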
grep is a built-in function that takes a pattern as the first argument and a list as the second argument, and picks out the elements of the list that match the pattern. The following program prints out the names of those friends that contain the string sson.
Program 3.28
#!/usr/bin/perl
use strict 'vars';
my @friends = ("Hakan Kvarnstrom", "Dawson Leary", "Jonas Olsson",
"Tommi Jokinen",
"Magnus Eriksson", "Brooke Peterson");
my @specialFriends = grep /sson/, @friends;
print join ("\n", sort @specialFriends), "\n";
The output printed by this program is given below.
Jonas Olsson
Magnus Eriksson
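grep can also be written with a block in place of the bare pattern, and in scalar context it returns a count instead of a list. The sketch below reuses the list from Program 3.28. The patterns son$ and son are chosen for illustration, and $ anchors the match to the end of the string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @friends = ("Hakan Kvarnstrom", "Dawson Leary", "Jonas Olsson",
               "Tommi Jokinen",
               "Magnus Eriksson", "Brooke Peterson");

# Block form: $_ holds each element in turn, and the element is
# kept when the block evaluates to a true value.
my @endsInSon = grep { /son$/ } @friends;
print join("\n", @endsInSon), "\n";

# In scalar context, grep returns the number of matching elements.
my $count = grep /son/, @friends;
print "$count names contain son\n";     # prints: 4 names contain son
```

The first grep keeps Jonas Olsson, Magnus Eriksson, and Brooke Peterson, in their original order. The second also counts Dawson Leary, because son occurs in the middle of that name.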
