Advent of Code 2020 - Day 4, in Kotlin - Passport Processing • Todd Ginsberg

In today’s puzzle we’re asked to commit some light passport fraud, because evidently saving Christmas four times in a row is not enough to get out of the North Pole without some bureaucratic hassles. We’ll also look at two different ways to approach validating text - simple string matching and Regular Expressions (they aren’t as bad as you’ve heard they are).

If you’d rather just view code, the GitHub Repository is here.

Problem Input

Because the input file has each passport spread out over several rows, we’re going to have to do some manipulation. Rather than bring in the input file as a List<String> as we normally would, let’s just bring it in as a String and see if we can get it in a format we can use. To do that, let’s add a new helper function to our Resources.kt file:

// In Resources

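// Assumes java.io.File is imported and that Resources.kt already defines
// a String.toURI() extension that resolves a resource name to a URI.
fun resourceAsText(fileName: String): String =
    File(fileName.toURI()).readText()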

Once we have that, we can grab our input as one big String, and turn it into a List<String>:

class Day04(input: String) {

    private val passports: List<String> = input.split("\n\n")

}

Our input does not have a single record per line; a record can be spread out over a few lines, and records are separated from each other by a blank line. By splitting our String on two consecutive newlines, we get a List<String> where each element is one passport record. We don’t care that each passport record may take up a few lines.
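As a quick illustration of that split, here is a minimal sketch using a made-up two-passport sample. The names sampleInput and splitPassports are hypothetical and not part of the solution itself.

```kotlin
// Hypothetical sample: two passports separated by one blank line.
val sampleInput = "ecl:gry pid:860033327\nbyr:1937 iyr:2017\n\niyr:2013 ecl:amb\nhcl:#cfa07d"

// Same idea as the passports property in Day04: split on the blank line between records.
fun splitPassports(input: String): List<String> = input.split("\n\n")

// Each element keeps its internal newlines; only the blank line is consumed.
val sampleRecordCount = splitPassports(sampleInput).size
```

One thing to keep in mind: if the input file were saved with Windows line endings, the separator would be "\r\n\r\n" and this split would not find it.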

⭐ Day 4, Part 1

The puzzle text can be found here.

Before we start solving Part 1, I want to talk about our goal. In an application we might write for a job or a paying customer, we would probably want to do something with this data after validating it. However, in our case we don’t need to do anything because this is a puzzle. Because of this, we’re going to focus just on validation with today’s solutions.

The first step is to get a list of all the field names we are looking for. We will put them in the companion object for Day04.

// In Day04

companion object {
    private val expectedFields = listOf("byr:", "iyr:", "eyr:", "hgt:", "hcl:", "ecl:", "pid:")
}

Because we can ignore cit fields, we won’t bother looking for them.

As you can see, we append : to the end of each of the field names. This is the delimiter that the passport records use. We have it there because if we don’t, we might end up looking for the keys in the values. Meaning, if we get an input like this: byr:iyr, it will match both byr and iyr, when according to the rules it should only match byr.
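To make the difference concrete, here is a small sketch using the byr:iyr record from the paragraph above. The variable names are made up for illustration.

```kotlin
// A record where the *value* of byr happens to look like another key.
val trickyRecord = "byr:iyr"

// Without the delimiter, the value "iyr" is mistaken for the iyr key.
val matchesBareKey = trickyRecord.contains("iyr")

// With the delimiter, only a real key followed by ':' matches.
val matchesKeyWithColon = trickyRecord.contains("iyr:")
```

Here matchesBareKey comes out true and matchesKeyWithColon comes out false, which is why the colon is part of every entry in expectedFields.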

All that’s left is to check each passport to make sure each field we want is present.

// In Day04

fun solvePart1(): Int =
    passports
        .count { passport -> expectedFields.all { passport.contains(it)} }

We can use two functions from the Kotlin standard library to do this. First up is count. This takes a lambda that returns true or false. Every time that lambda produces true, it adds one to the count. Next, we’ll use the all function. It also takes a lambda, and operates over an Iterable. It will return true if and only if every element from the collection (expectedFields) meets our criterion. The criterion is that the passport contains the field (it).
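If count and all are new to you, this sketch shows them working together on a deliberately tiny, made-up data set. The shortened field list and the sample records are assumptions for illustration only.

```kotlin
// Hypothetical, shortened field list and records, just to show count + all.
val wantedFields = listOf("byr:", "iyr:")
val sampleRecords = listOf(
    "byr:1980 iyr:2012",   // has both fields
    "byr:1980",            // missing iyr
    "iyr:2012\nbyr:1999"   // has both, spread over two lines
)

// count tallies the records for which the lambda returns true;
// all returns true only when every wanted field is found in the record.
val validCount = sampleRecords.count { record -> wantedFields.all { record.contains(it) } }
```

Two of the three records contain both fields, so validCount ends up as 2.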

This code does not check for keys we aren’t expecting, but my input did not contain any unexpected keys. Either that, or the records that did were also missing some other field we were expecting.

Something to keep in mind if you ever write a moderately complicated set of validations.
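If we did want to reject unexpected keys, one possible approach is to pull the keys out of each record and compare them as sets. This is only a sketch and not something the puzzle requires; requiredKeys, optionalKeys, keysOf and hasExactlyExpectedKeys are names invented here.

```kotlin
// Required keys, without the trailing colon, plus the optional cid key.
val requiredKeys = setOf("byr", "iyr", "eyr", "hgt", "hcl", "ecl", "pid")
val optionalKeys = setOf("cid")

// Pull the key out of every "key:value" pair in a record.
fun keysOf(passport: String): Set<String> =
    passport.split(" ", "\n")
        .filter { it.isNotBlank() }
        .map { it.substringBefore(":") }
        .toSet()

// Valid only if every required key is present and nothing unexpected shows up.
fun hasExactlyExpectedKeys(passport: String): Boolean {
    val keys = keysOf(passport)
    return keys.containsAll(requiredKeys) && (keys - requiredKeys - optionalKeys).isEmpty()
}
```

With this in place, a record carrying an extra key such as xyz:1 would be rejected even if all seven required fields are present.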

Star earned, onward!

⭐ Day 4, Part 2

The puzzle text can be found here.

Here’s where things get fun. Remember when I said in Part 1 that we were only going to focus strictly on validation? That is especially true now that we’ve seen what Part 2 has in store for us. Again, if this were a real application that we were writing for a customer, we’d probably do something more involved.

We are going to solve Part 2 with a series of Regular Expressions. How’s the joke go? “I decided to use Regular Expressions to solve my problem and now I have two problems”. Well, I’ve got three problems because not only am I using Regular Expressions, I have to explain them in a way that doesn’t overwhelm people!

Think of a Regular Expression as a pattern used for matching text. For example, suppose we wanted to create a pattern described like this - “This text must contain exactly two digits”. We could represent it like this: \d\d, where \d is shorthand for “digit”. Another way to represent that would be [0-9][0-9], where [0-9] means “match one character that is 0 through 9”. Another possible way is \d{2}, where \d still means digit and {2} means “exactly twice”.
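Here is a small sketch of those three spellings in Kotlin, to show that they behave the same way. The helper names are made up for this example.

```kotlin
// Three ways of writing "two digits in a row".
val twoDigitPatterns = listOf("""\d\d""", """[0-9][0-9]""", """\d{2}""").map { it.toRegex() }

// True when every one of the three patterns finds a match somewhere in the text.
fun allFindTwoDigits(text: String): Boolean = twoDigitPatterns.all { it.containsMatchIn(text) }

// matches() requires the whole text to fit the pattern, not just a part of it.
fun isExactlyTwoDigits(text: String): Boolean = """\d{2}""".toRegex().matches(text)
```

Note the difference between the two helpers: containsMatchIn is happy with two digits anywhere in the text (so "123" qualifies), while matches only accepts text that is two digits and nothing else.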

That just barely scratches the surface of what Regular Expressions are. Rather than give a full lesson on Regular Expressions here, I will show you what I wrote and explain each one the best I can, in order of complexity. Regular Expressions might look complicated but even if you only know a couple of tricks, they are a nice tool to have in your programmer toolbox. You never know when you’ll want to do a quick validation on some text. They are worth learning, if only so you can learn to identify the kinds of problems that lend themselves to Regular Expressions. I will admit that one common criticism leveled at Regular Expressions is that they can be hard to read, and I do find that to be the case sometimes.

Let’s dig right in…

// In Day04 companion object

private val fieldPatterns = listOf(
    """\bpid:[0-9]{9}\b""",
    """\bhcl:#[0-9a-f]{6}\b""",
    """\becl:(amb|blu|brn|gry|grn|hzl|oth)\b""",
    """\biyr:(201[0-9]|2020)\b""",
    """\beyr:(202[0-9]|2030)\b""",
    """\bbyr:(19[2-9][0-9]|200[0-2])\b""",
    """\bhgt:((1([5-8][0-9]|9[0-3])cm)|((59|6[0-9]|7[0-6])in))\b"""
).map { it.toRegex() }

Wait! Stop! Come back! It’s not that bad!

At a high level we are taking a bunch of raw strings and mapping them to Regular Expression objects (.toRegex()).

Let’s go through each expression line-by-line and figure out what’s going on. The first thing to note is \b - this means “word boundary”. When we want to match whole words, we use \b to tell the Regular Expression engine not to match text that sits inside other text. A word boundary is the position between a word character (a letter, digit, or underscore) and anything else, such as whitespace or the start or end of the text. We use them in every example here in order to separate the keys and values we are matching, just in case some nested text shows up!
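To see what the word boundary buys us, this sketch compares the pid pattern with and without \b against a made-up pid that has one digit too many. The variable names are hypothetical.

```kotlin
val pidLoose = """pid:[0-9]{9}""".toRegex()
val pidBounded = """\bpid:[0-9]{9}\b""".toRegex()

// A made-up pid with ten digits, one more than allowed.
val tenDigitPid = "pid:0123456789 ecl:grn"

// The loose pattern is satisfied by the first nine digits and ignores the tenth.
val looseAccepts = pidLoose.containsMatchIn(tenDigitPid)

// The bounded pattern fails: the ninth digit is followed by another digit,
// so there is no word boundary at that position.
val boundedAccepts = pidBounded.containsMatchIn(tenDigitPid)
```

Without the trailing \b, a ten-digit pid would slip through as valid, which is exactly the kind of nested text we are guarding against.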

First up: \bpid:[0-9]{9}\b

I think we already know enough to solve this one! We start with a word boundary (\b) and then the literal text “pid:”. So we know which key we’re after now. Then we match exactly 9 digits (0-9). I could have written this using the \d shorthand as well, but I find 0-9 easier to reason about if I ever have to come back and read this. And our expression ends at the word boundary.

Next: \bhcl:#[0-9a-f]{6}\b

Ranges aren’t only useful for numbers; we can use them for letters as well. So a-f in this example means “any lowercase letter between a and f”. Given that, we know enough to decipher this one. Starting with a word boundary, match the literal text “hcl:#”, and then exactly six digits or letters that are a-f. If you ever have to match hexadecimal strings, you will see the [0-9a-f] or [0-9A-F] pattern a lot.

Let’s try this one: \becl:(amb|blu|brn|gry|grn|hzl|oth)\b

We’ll introduce two new concepts here. First, whenever you see a | symbol, that means “or”. We’re using it here to match one of the valid values for eye color by listing them all. Next, anything in the parentheses is called a “capture group”. Suppose you wanted not only to match text with a Regular Expression, but you wanted to cut out parts of it to use later. That’s what a capture group does, but we’re only using it here to make our matching easier when using |.

So now we know enough to figure out this one: starting with a word boundary, match literal text “ecl:”, and then any one of amb, blu, brn, gry, grn, hzl, or oth, followed by a word boundary.
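Although our solution only uses the group to scope the |, here is a sketch of what “cutting out” the captured text could look like in Kotlin. The eyeColorOf helper is invented for this example.

```kotlin
val eyeColorPattern = """\becl:(amb|blu|brn|gry|grn|hzl|oth)\b""".toRegex()

// Returns the captured eye color, or null when the record has no valid ecl field.
// groupValues[0] is the whole match; groupValues[1] is the first capture group.
fun eyeColorOf(passport: String): String? =
    eyeColorPattern.find(passport)?.groupValues?.get(1)
```

Calling eyeColorOf("pid:1 ecl:grn") gives back "grn", while a record with ecl:red gives back null. If you only need the grouping and never the captured text, a non-capturing group written as (?:...) does the same job.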

See? We’re already writing some pretty powerful expressions and we’ve only just started.

Next up: \biyr:(201[0-9]|2020)\b

No new concepts here. Match a word boundary, the literal text “iyr:”, and then either “201” and any digit OR 2020, followed by a word boundary. Why couldn’t we have just done “20[0-2][0-9]”? Because that would also have matched 2000 through 2009 and 2021 through 2029, and we don’t want those years to match.
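The sketch below checks that claim against years just outside the valid window, and also shows one alternative: capture four digits and compare them as a number. The alternative is not what this post uses; it is only an option if the range gets awkward to express as a pattern. All names here are made up.

```kotlin
val issueYearPattern = """\biyr:(201[0-9]|2020)\b""".toRegex()
val tooBroadPattern = """\biyr:20[0-2][0-9]\b""".toRegex()

// Years just outside the valid 2010..2020 window.
val outOfRange = listOf("iyr:2009", "iyr:2021")

// The pattern from the post rejects both; the simpler-looking one accepts both.
val strictRejectsAll = outOfRange.none { issueYearPattern.containsMatchIn(it) }
val broadAcceptsAll = outOfRange.all { tooBroadPattern.containsMatchIn(it) }

// Alternative sketch: capture any four digits and do the range check numerically.
fun issueYearInRange(passport: String): Boolean =
    """\biyr:(\d{4})\b""".toRegex().find(passport)
        ?.groupValues?.get(1)?.toIntOrNull()
        ?.let { it in 2010..2020 } ?: false
```

The numeric version trades a slightly longer function for a range that reads the way the rule is written.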

And then: \beyr:(202[0-9]|2030)\b

This is effectively the same pattern, with different years. Match a word boundary, the literal text “eyr:”, and then either “202” and any digit OR 2030, followed by a word boundary.

Building on that: \bbyr:(19[2-9][0-9]|200[0-2])\b

Match a word boundary, followed by the literal string “byr:” and then either (19 and 2 through 9 and then any digit) OR (200 followed by 0-2), and then a word boundary. Same concept as before, just more complicated as the ranges get harder to express.

Finally: \bhgt:((1([5-8][0-9]|9[0-3])cm)|((59|6[0-9]|7[0-6])in))\b

Frankly, this is the kind of thing that gives Regular Expressions a bad name, but you know everything you need to know in order to figure it out. We are going to start with a word boundary and the literal text “hgt:”. Then, we’ll have one of two options: our next match ends with either cm or in. If it is cm, we need the match to have a 1 followed by either (5 through 8 and then any digit) OR (9 followed by 0 through 3). This is how we represent 150 through 193. If it is in, the work is similar: exactly 59 OR (6 followed by any digit) OR (7 followed by 0 through 6). This is how we represent 59 through 76. And all that followed by a word boundary.
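When a pattern gets this dense, one way to gain some confidence in it is to compare it against the plain numeric rule for every candidate value. This is just a sanity-check sketch with invented names, not part of the solution.

```kotlin
val heightPattern = """\bhgt:((1([5-8][0-9]|9[0-3])cm)|((59|6[0-9]|7[0-6])in))\b""".toRegex()

// For every number 0..999, the pattern should accept "hgt:<n>cm" exactly when
// n is in 150..193, and "hgt:<n>in" exactly when n is in 59..76.
val cmAgrees = (0..999).all { n -> heightPattern.containsMatchIn("hgt:${n}cm") == (n in 150..193) }
val inAgrees = (0..999).all { n -> heightPattern.containsMatchIn("hgt:${n}in") == (n in 59..76) }
```

If either value came out false, we would know the pattern and the rule disagree somewhere in that range.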

I realize that last one was a bit complicated, sorry. Hopefully most of that made sense.

All that’s left to do is to count up how many of the passport rows have a match in every Regular Expression. This code looks very similar to Part 1’s solution, except we’ll use containsMatchIn, which tells us if the Regular Expression in question matches the passport.

// In Day04

fun solvePart2(): Int =
    passports
        .count { passport -> fieldPatterns.all { it.containsMatchIn(passport) } }

Star earned!

Similar to Part 1, this code does not handle the case where keys we aren’t expecting are in the record. I didn’t have this in my input.
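To see the whole thing work end to end, here is a self-contained sketch that runs the same patterns over a small hand-written input. The three records are made up for illustration, and countValid and sketchPatterns are hypothetical names; the patterns are repeated only so the sketch stands alone.

```kotlin
// The same patterns as fieldPatterns, repeated so this sketch is self-contained.
val sketchPatterns = listOf(
    """\bpid:[0-9]{9}\b""",
    """\bhcl:#[0-9a-f]{6}\b""",
    """\becl:(amb|blu|brn|gry|grn|hzl|oth)\b""",
    """\biyr:(201[0-9]|2020)\b""",
    """\beyr:(202[0-9]|2030)\b""",
    """\bbyr:(19[2-9][0-9]|200[0-2])\b""",
    """\bhgt:((1([5-8][0-9]|9[0-3])cm)|((59|6[0-9]|7[0-6])in))\b"""
).map { it.toRegex() }

// Split into records, then count those in which every pattern finds a match.
fun countValid(input: String): Int =
    input.split("\n\n").count { passport -> sketchPatterns.all { it.containsMatchIn(passport) } }

// Made-up input with three records.
val sketchInput = listOf(
    "pid:087499704 hgt:74in ecl:grn iyr:2012 eyr:2030 byr:1980\nhcl:#623a2f", // every field valid
    "pid:087499704 hgt:170 ecl:grn iyr:2012 eyr:2030 byr:1980\nhcl:#623a2f",  // hgt has no unit
    "hgt:74in ecl:grn iyr:2012 eyr:2030 byr:1980 hcl:#623a2f"                 // pid is missing
).joinToString("\n\n")
```

Only the first record passes every pattern, so countValid(sketchInput) returns 1.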

Parting Thoughts

This reminds me of 2017 Day 9, which was solved entirely using Regular Expressions. They really can be handy and I encourage you to not get overwhelmed and try to practice using them a bit. I guarantee once you know the basics you’ll see all sorts of places where you can use them.

Having said that, a lot of times we don’t have to get fancy and can get away with basic string checks like we did in Part 1.

Either way, I hope you had fun and learned something today! See you tomorrow!
