GCJ 2009: Alien Language

Aliens are here again, after their numbers it’s the language turn. Googlers must like them very much, this was the first problem for the qualification round:

You can find the original text here (registration may be required)

Here is my solution in python:

#!/usr/bin/python
import sys
import re
import psyco
psyco.full()
rdl = sys.stdin.readline
 
def process(case):
    """precessing case #"""
    pattern = rdl().replace('(','[').replace(')',']')
    count = 0
    reg = re.compile(pattern)
    for w in words:
        if reg.match(w): count += 1
    return str(count)
 
L, D, cases = [int(i) for i in rdl().split()]
words = [rdl() for dumb in xrange(D)]
for case in xrange(1, cases+1):
    print "Case #%d:"%case, process(case)

The problem was about simple pattern matching, so what tool is better than regex? I simply replaced each ( with a [ and each ) with a ], made a list of words and testing all words against each pattern.

Compiling each regex before the for loop make time of execution drop from 9s to 3s for large input files. This one was very very simple to solve this way.

How did/would you solve this?

Tags: , ,

5 Responses to “GCJ 2009: Alien Language”

  1. nigma » Blog Archive » GCJ 2009 Round 1C: All Your Base Says:

    […] I say that maybe Googlers like aliens? Of course they do! Actually they like also to make cool, not so hidden, citations, but […]

  2. misan Says:

    Same here. I did use the same technique. Compiling the regex also improved the running time a lot in Java.

    I felt like cheating :-)

  3. e.nigma Says:

    I felt like that too, but that’s not cheating, it’s using the right tool for the right job ;)

  4. Mathias Grimm Says:

    thats wrong…
    isnt [az] … is (a|z). a OR z, not a to z.

  5. e.nigma Says:

    @Mathias Grimm: it works in both ways.

    (a|z) is a group and means match a OR z.
    [az] is a character class, it means match any of the character in the list “az”, and it’s equivalent to a OR z.

    If you wanted a TO z inside a character class you need to use [a-z] with the “-”.

    Thanks for stopping by :)

Leave a Reply