The Puzzling Quirks of Regular Expressions

  1. Acknowledgments
  2. Rights of (Wo)Man
  3. Credits
  4. Preface
  5. Quantifiers and Special Sub-Patterns
    1. Wildcard Scope
    2. Words and Sequences
    3. Endpoint Classes
    4. A Configuration Format
    5. The Human Genome
  6. Pitfalls and Sand in the Gears
    1. Catastrophic Backtracking
    2. Playing Dominoes
    3. Advanced Dominoes
    4. Sensor Art
  7. Creating Functions using Regexen
    1. Reimplementing str.count()
    2. Reimplementing str.count() (stricter)
    3. Finding a Name for a Function
    4. Playing Poker (Part 1)
    5. Playing Poker (Part 2)
    6. Playing Poker (Part 3)
    7. Playing Poker (Part 4)
    8. Playing Poker (Part 5)
  8. Easy, Difficult, and Impossible Tasks
    1. Identifying Equal Counts
    2. Matching Before Duplicate Words
    3. Testing an IPv4 Address
    4. Matching a Numeric Sequence
    5. Matching the Fibonacci Sequence
    6. Matching the Prime Numbers
    7. Matching Relative Prime Numbers

Support the author!
Lulu Editions
Paypal Donation
Other Publications

Finding a Name for a Function

Suppose you come across some code that a previous employee on your project, long moved on and unavailable, wrote. Their code passes unit tests and integration tests, so it probably does the right thing. But they have not given a useful name or documentation for a certain function:

def is_something(s):
    return not re.match(r'^(.+?)\1+$', s)

For this puzzle, simply provide a good name and a docstring for this function, to be kind to later programmers.

Before you turn the page…

Code is read far more often than it is written.

This puzzle certainly has many possible answers. For all of them, understanding what the regular expression is doing is the crucial element. The short pattern might look odd, and you need to figure it out. Here is a possibility.

def repeated_prefix(s):
    """Look for any prefix string in 's' and match only if
    that prefix is repeated at least once, but it might be
    repeated many times.  No other substring may occur
    between the start and end of the string for a match.
    return not re.match(r'^(.+?)\1+$', s)