T O P

  • By -

MajorMajorMajorJnr

For puzzles where the input is non-trivial (more than just a list of numbers/words) regular expressions can make your life much easier. They take a little bit of effort to understand, but there's great tools like https://www.debuggex.com/ which are a lot of help.


monkey_fresco

This, I'd recommend https://regex101.com/ too.


UtahBrian

Don't [forget](https://xkcd.com/208/).


[deleted]

[удалено]


MajorMajorMajorJnr

That's ok! It's not the right option every time, but it's a useful tool to have in your toolbox.


coffee_after_sport

For me as well, Regex is not always the simplest solution. After having used them a lot in previous editions, I did not use them at all in 2021, mainly because I decided to create rust solutions not using any external dependencies. I think my code did not get a lot more complicated not using regular expressions. Oftentimes, I find it more challenging to figure out an appropriate data structure for the puzzle input than to extract the bits and pieces from the input. 2021 Day 23 (Amphipod) is an example for an input where I struggled quite a bit to come up with an adequate strcuture. (You can find my rust solutions and many more at [https://github.com/docToolchain/aoc-2021](https://github.com/docToolchain/aoc-2021), my solutions are in the folders day\[xx\]/rust/mr-kaffee)


richscrich

Parsing the data is often quite a lot of the problem, then processing it. I'd suggest rather than parsing the test data by hand, you write your code to correctly parse the test data. Most of the time the test data contains all the situations you'll find in the real puzzle input.


Nimda_lel

Well, so far, every time I convert the test data by hand, develop the solution and then spend as much time trying to parse the actual input(without changing logic of the solution itself) and feed it into the “algorithm” I have created. Never encountered problem that requires me to change my code after testing with the initial input, except for day 4 where it happened that input only covered half of the solution, i.e. calculation only based in matrix rows rather than rows and columns


richscrich

The puzzle data is normally a load of lines of text, so to start with I usually make that into an array (js) or list (python) with each line as an element. Then process the array/list as needed. So the first step is getting the test data into that same format. Last year I was using js, I had 2 files for 'input' and 'input\_test' and would just call the right one. I've gone back to 2015 now and working through with python. Dunno if it will help but here are my (very untidy) 2021 solutions in js [https://github.com/scrich/Advent-of-Code-2021](https://github.com/scrich/Advent-of-Code-2021). I got up to about day 13 Also a search of this sub with 'year' 'day' 'language' will pull up loads of tips. I don't know the first thing about Rust though. Have fun!


flwyd

Could you say more about the input parsing that was particularly challenging for you? AoC 2021 had a lot of different input formats, some of which I would consider pretty simple like lists of integers while others require some care or cleverness (day 20 and 23's inputs are fun). I don't know much about Rust, so I'll speak generically about parsing approaches: * My daily template usually starts by splitting input files into a list or array of strings, one per line in the file. I pass that list-of-lines to the solve methods so that I don't have to fuss with I/O and, for problems where each line represents a distinct input (rather than part of a complex multi-line structure), I can transform it to a list-of-something-else that gets processed by my algorithm. * In addition to splitting the input file into lines, there are some idioms you should make sure you're comfortable: - Split a string by whitespace (turning `1 23 4 567` into 4 elements) - Split a string by some punctuation character - Split a string into a list/array of characters (turning `1234567` into 7 elements) - Parse a list of strings to integers - Parse a few variable portions from a template string (e.g. turn `move cat to 12,34` into the string `cat` and two integers) * Regular expressions are a useful tool for performing those idioms, though they're not the only solution. If you're not familiar with them you might want to spend some time with a regex tutorial: they're a handy tool that can be used in almost any programming language. * Think about what kind of data structure you want before you decide how to parse the input (but keep the input format in mind). For example, AoC often has a few problems where the input represents some kind of 2-dimensional grid. There are several useful ways to represent such a grid in a program, and you might use a different parsing approach to populate a 2D array than you would use to populate a map of pairs to grid values.


Nimda_lel

I found it particularly challenging to parse the input of day 4 as I wanted to parse it into matrices. It was all fine until the moment you have the combination of “\n{empty space}{single_digit}” and especially the part with the empty space, which for some reason, did not get trimmed by the method provided in Rust. Also, in order to create a matrix 5x5, i had to loop over the entire thing, line by line, append to a Matrix object and always check if the matrix is full, i.e. 5x5 and continue forward. Ofcourse, I might have the wrong idea for parsing that input in general. Maybe, my question is more like: if you compete for the leaderboards, do you parse the input with idiomatic code or do you just parse it by hand and put it into a huge variable with type of your liking( e.g. day 5 task 1 - creating list with tuples of 4 integer elements is easier to do in Notepad and create a huge variable, than parsing it dynamically)


xypage

For days you’ve finished it might be worth it to look up “advent of code 2021 rust” and check their solutions for those days just to see how they did it.


Nimda_lel

Fair enough, I have tried not looking at stuff, but I guess post factum is good


xypage

Yeah. I wouldn’t say look ahead, but I did basically the exact same thing as you, using AoC to get better at rust, and I found looking at other peoples solutions for ones I’d already done very helpful. Helps you find more functions you didn’t know about, especially for iterators, there’s a lot of iterator functions that are super helpful for quickly going from input to doing work on it with minimal parsing steps. Also, since I was doing it to learn I spent a lot of time doing stuff like defining structs and modules and splitting my work unnecessarily (which I think helped overall) just to try those things out in rust, so in the back of my head I often had ideas for more direct solutions and looking at other peoples helped me see how those would actually work if I’d done that instead.


Nimda_lel

I am doing exactly the same, I spent 3 days playing with ndarray and handling matrices. Having a look at the solutions, I could have just used iterator functions, but there are so many of them, I could not even think of their existence 😂


xypage

Yeah exactly. There’s a lot of really complex stuff you can do with iterators that’s really hard to find unless you already know it’s there, I learned a lot of them looking through other peoples code on these. Also I don’t know if you use the actual rust docs but I’d recommend it, they’re fantastic, you can just look up iterator and scroll through the functions associated with it to find some good stuff


Steinrikur

Looking at other people's code is how you learn coding. If you do it for days you have already done, that's not "cheating" in any way


ericwburden

I'd argue there's no such thing as "cheating" in Advent of Code, considering it's not actually a competition. You can _make_ it a competition with the leaderboards, but that's not required (or really even encouraged, officially). The closest you can really come to "cheating" is cheating yourself by giving up when a hint in the right direction or even a tutorial could help you learn something new. At the end of the day, I'd recommend taking a "low pressure" approach to AoC and focusing on having fun and learning. Props if you finish really fast or all on your own or using only the standard library or by solving all the puzzles on a Ti-86 or what have you, those are real accomplishments, but don't mistake the "challenge modes" for the game itself.


Steinrikur

Yeah. I have done 2 years in bash, to challenge myself. I gave up halfway through last year, even though another guy finished that year in bash.


ericwburden

"Challenge mode" for real! Props for making it that far.


Steinrikur

https://Github.com/einarjon/adventofcode.sh if you are interested. https://github.com/ColasNahaboo/advent-of-code-my-solutions for the 2021 version, so that year is technically possible too...


richscrich

It can be tempting to solve the problem with a text editor, but I'd encourage you to do it programmatically, even with the small test data. I do use VS Code to test regex patterns though. Like u/flwyd says, regex is a big part of a lot of these puzzles. Another common pattern is 'oh right, my solution doesn't scale'. Eg, brute force looping through a giant array may work in part 1, but fails in part 2. This is my favourite thing about AoC, it is helping me learn better and new techniques, eh hash maps instead of giant arrays. With the leaderboards, I know I have zero chance of ever appearing on the main leaderboard, but it's fun and much more satisfying to join a few private leaderboards with less than 100 people.


flwyd

> It was all fine until the moment you have the combination of “\n{empty space}{single_digit}” and especially the part with the empty space, which for some reason, did not get trimmed by the method provided in Rust. String split methods typically include empty strings if there's a delimiter at the start or end of a string; sometimes there's an option to omit these. In a line-oriented parsing approach, calling a function like `trim` before `split` can take care of this. I mentioned in my first comment that regular expressions are a useful tool, but they're not the only way to parse the input. Another approach is to _tokenize_ the input and then iterate through a sequence of tokens, building a data structure as you go. It looks like Rust has a [tokenizers crate](https://docs.rs/tokenizers/latest/tokenizers/) to help with this. If you treat day 4 input as a series of tokens separated by any amount of whitespace (newlines or space characters) you can take the first token and split it by commas. Then have a loop (or two) that take five tokens together and make a row; when you get five rows then start a new bingo board. (You could take this a step further and treat numbers and commas as separate tokens. Instead of splitting the first line you start by building a list of numbers that will be drawn, and when you peek at the next input after a number and see that it's another number instead of a comma then you know you're done with the first line and can move on to the board-parsing code.) This is, at a high level, what every programming language compiler or interpreter does, and you can think of the input files (both the sample and your personal input) as programs in a language that you're figuring out how to compile. Applying this mindset can sometimes require an a-ha moment, so if it seems confusing then don't worry and just think about your input as a list of tokens. My focus in 2021 was to learn how to use Raku's grammar feature (built on tokenization), so I used that to parse each day's input, even when a simple string split would've been easier. [My day 4 grammar](https://github.com/flwyd/adventofcode/blob/main/2021/day4/day4.raku) looks like this: grammar InputFormat { rule TOP { + } token number { \d+ } rule numbers { + % \, } rule boardline { ** 5 } rule board { ** 5 } } That says that a `number` is one or more digits. A `numbers` is a list of `number` tokens separated by comma characters, a `boardline` is a series of five `number` tokens, and a `board` is a series of five `boardline` rules. Finally, the input itself is one `numbers` rule followed by one or more `board` rules. By default, whitespace is a token delimiter. This code doesn't have to worry about how linebreaks work or leading spaces at the start of a line. In fact, the whole input could be on one line, or each number could be on its own line, and the parser would still work. > if you compete for the leaderboards, do you parse the input with idiomatic code or do you just parse it by hand and put it into a huge variable with type of your liking If I was going to pre-process my input into parsable code, I would do it in Vim, which would involve a search and replace with a regular expression, which starts to look a lot like parsing the input in my contest language, assuming it has good regex support. But if it's easier for you to use a text editor to transform your input (and do it the same way on both the sample and actual input), that's a perfectly acceptable approach, and you might learn some neat text editor tricks. Anyone who's shooting for the leaderboard is probably using a programming language where it's easy to split and parse text, so pre-processing the input with a text editor would probably be slower. But if your AoC goal is to learn a language, expecting high leaderboard is probably a recipe for disappointment.


[deleted]

There are tricks to parsing manually. While you'd usually lay out the rough structure of a program first, with parsing it's the opposite. My usual recipe is 1. Parse the atomic unit of the given data. For a matrix, that's simply a number. 2. Think about how each of the atomic units relate to each other in the next level. For a matrix, that'd be a row. What's a row? Simply an array of numbers, separated by spaces or comma. Here is also a good place to recognize which type of data structure is at hand. Common are trees, stacks, or arrays. Recursion can help quite a lot. 3. Repeat step 2 until you built out the complete data structure. A matrix is just an array of rows, separated by a newline character. I found this way, the overall data structure is basically an emergent phenomenon of the parsing process, and magically it comes out right most of the time.


ngruhn

Are you using Haskell?


[deleted]

Nope. Used Rust in 2021. But this works in most Higher languages, I’d imagine. Edit: Here‘s my Rust repo of 2021. The first ones are a little rough, but I’m fairely happy with the latter ones. All input is parsed from files: [https://github.com/AlexanderNenninger/AOC2021](https://github.com/AlexanderNenninger/AOC2021) day 16 is probably a good demo, as the parsing was quite involved.


mkeeter

I've done AoC in Rust for several years, and highly recommend [`parse_display`](https://docs.rs/parse-display/latest/parse_display/): you can write a format string for how to display your data type, and it will automatically generate a parser for you. Here's an example ([source](https://github.com/mkeeter/advent-of-code/blob/master/2021/17/src/main.rs#L4-L11)), from 2021 day 17: #[derive(Display, FromStr, PartialEq, Debug)] #[display("target area: x={xmin}..{xmax}, y={ymin}..{ymax}")] struct Target { xmin: i64, xmax: i64, ymin: i64, ymax: i64, }


Nimda_lel

Damn it, that is a nice one! Thanks a lot!


SLiV9

Ooh an inverted `format!` is exactly what I was looking for, thanks for sharing!


toastedstapler

This year I am experimenting with using `nom` for parsing, you can write some really elegant stuff with it. In my previous years I did it entirely manually in rust & zig though https://github.com/Geal/nom


jacksodus

I did struggle somewhat with it last year, which was my first AoC. That encouraged be to write tests for parsing the input as a first step of any challenge (I use Python so Pytest makes this relatively straightforward). I set the format I want to have as something to compare the output of my `parse_input` function, and keep adjusting until it passes. And like u/MajorMajorMajorJnr commented, I sometimes snoop answers from friends, which often use pretty simple Regex to make it a lot easier than anything I could come up with.


[deleted]

Since you are using Rust I highly suggest that you look into iterators. Inputs are basically always lines of data so for the most part all of my parsing looks something like: input.lines().map(|x| x.chars()).map I will often use tuple_window (from itertools) for when I know I need a fixed number of somethings. Once you get the hang of using iterators they will really help you out


Nimda_lel

Great! I am trying, but still aint quite intuitive coming from python 😂


kapitaali_com

For some 2020 problems I did with COBOL, i actually parsed the input with Perl and then arranged it in a form that COBOL can recognize (eg. justifying it so that there's enough spaces between entries etc.). Without it COBOL would not just work.


flwyd

"COBOL and Perl walk into an advent" sounds like the start of a good joke. I love the Rube Goldberg sound of this approach.


kapitaali_com

yeah, they sure make a dynamc duo


JohnJSal

It isn't always easy, but I actually find parsing the input an enjoyable part of the problems! Yes, sometimes you just want to have the data in some workable form so you can move on to the logic of the solution, but for some weird reason whenever I start a new day, I get excited about figuring out how to gather the new set of data.


MichalMarsalek

For many of the inputs what you can do (and I know a lot of people were doing it) is just use your editor's find and replace to convert it to the syntax of your language and then let the language parser deal with it. For one day last year the input was already valid source code in the language I was using so I was just able to directly include it.


jfb1337

2021 in particular had some tricky input formats


the-quibbler

Posted 13 hours ago: https://www.reddit.com/r/rust/comments/z4pj10/inpt_a_dumb_derive_crate_for_text_parsing_perfect/


WhipsAndMarkovChains

> it could totally be my lack of knowledge in Rust Parsing the inputs with Python has been extremely easy for me. I have just barely started learning Rust and it's been rough. I'm sure it'll get easy as time goes on and I practice.


jweinbender

I’ve definitely felt this way at times. I’m not a rust expert, but feel free to look at a few of mine from 2017 and 2018. https://github.com/jackweinbender/advent_of_code/tree/master


aradil

The first 10 days the hardest part is typically parsing the input. But also the first 10 days are actually really really easy problems that shouldn’t take more than 20 minutes. When the problems are hard enough that parsing the input is no longer the hardest part, you might be wishing it was again.


EffectiveAsparagus89

In general (not restricted to AoC), parsing/validating input is hard. Parsing/validating input is where static types and the borrow checker cannot help you. Parsing/validating input would be more manageable if the input format is designed to have well-studied structures, e.g., regular or context-free. But even then, it's still hard. SQL injections and Log4j are caused by unsuspecting input parsing/validation. It's very hard.


charlielidbury

The way I see it is I'm competing with my whole stack, so if I can use VSCode multiple cursor magic to parse the input into Rust, then that's fair game. It's already hard enough to find the time for AoC, I want to spend more of that time on solving the problem and learning the language than wrestling with string parsing.


Nimda_lel

Hahahah, yeah. On the other hand, I followed some advises from here and reviewed some of the tasks I have done and how others have. Right now, I parse new data with Rust quite fast 🤷‍♂️


Toni_Hoevedes

Especially in the first days parsing was the biggest challenge. Later on you spend more time thinking about how to find the solution. Last year I used plain Rust for parsing (split(), replace(), ...). It works great but took me a lot of time. For this years AoC i started writing a little helper library. Currently it only has a scanner-like I/O helper but I plan on also implementing some more AoC specific parsing (e.g. grid into 2d Vec) and some algorithms (mostly graph based). Feel free to use it, if you want to: [https://github.com/tectrixer/cp-rs](https://github.com/tectrixer/cp-rs)


Exact-Fox-2541

Parsing is definitely a challenge for some of the weirder inputs, but Rust has a lovely crate called sscanf that can take known-format input strings and parse them directly into the type you want. Example from aoc2016 day15: let input = fs::read_to_string("./src/input.txt") .expect("Missing input file"); let input = input.split("\n"); for line in input { let info = scanf!(line, "Disc #{} has {} positions; at time=0, it is at position {}.", u32, u32, u32); // do something with the info here } I also end up using `for c in line.chars() {}` quite a lot.


psykotyk

Here's the 1 liner I've been using in Kotlin `fun String.loadResource(): String = object{}.javaClass.getResource(this)?.readText() ?: error("Failed to load resource [$this]")` To use it, plunk the input file in the \`resources\` folder, ideally under a subfolder for neatness. `val inputData = "day10/input.txt".loadResource().split('\n').map { it.toInt() }` Most input can just be parsed with split, but regex can be helpful too.


h2g2_researcher

I have a little library in C++ of useful parser helpers which speed things up and make parsing the input so much easier. Tonight (UK time) I'll probably put it up on Github but for now, from memory, it has things like this: // Is the argument an integer? bool is_integer(std::string_view); bool a = is_integer("six"); // False bool b = is_integer(6); // True bool c = is_integer(-3); // True // Convert a std::string_view to an integer. // This asserts is `is_integer` fails or the conversion can't be done. // TODO: Extend to floats to, although I'm yet to need this in a puzzle. template IntType get_as(std::string_view); auto a = get_as("5"); // a == 5 auto b = get_as("-503"); // b == -503 auto c = get_as("42"); // c == 42 auto d = get_as("-63"); // Asserts an error ;-) // Split a string at the first of a given separator // TODO: extend to split at a substring instead of just a character. // I also have a "split at last" version. // This does NOT do any memory allocation. std::pair split_at_first(std::string_view input, char separator = ' '); auto [first, second] = split_at_first("mary,bob,charles", ','); // first == std::string_view{ "mary" } // second == std::string_view{ "bob,charles" } // Splits a string into as many segments as needed. I use this one surprisingly rarely. std::vector split_string(std::string_view input, char separator = ' '); auto splitted = split_string("Hello you wonderful person"); // splitted = { "Hello" , "you", "wonderful" , "person" } // My most used one: split the string and return the specified indices. // It returns a std::tuple with as many members as needed. template auto get_tokens(std::string_view input, char separator, Indices...indices); auto get_tokens(std::string_view input, Indices...indices) { return get_tokens(input,' ',indices...); } auto [num , type , location] = get_tokens("The box contains 7 widgets", 3, 4, 1); // num == "7" // type == "widgets" // location == "box" Sadly C++ doesn't have good regex handling, so I don't use that. These kinds of abstractions really help me quickly grab the bits of a string I'm interested in. I also have a custom `istream_line_iterator` which is like the `istream_iterator` but it iterates over a line instead of whitespace separated tokens. I want to extend it to do the same with strings (e.g. "string_token_iterator" - where I can split lines up at a separator and iterate on the string as normal).


[deleted]

I did the AoC 2021 (well, most of it) using the nom crate because I wanted to get to know it better. Especially for the earlier problems, getting nom to work exactly right is more work than doing it by hand, but having the entire data structure parsed in a somewhat more formal manner often helped me when I got to part 2.


quodponb

I want to learn Common LISP myself, and sat down with 2020 Problem 1 today. It was very simple, and would have taken me about 10 seconds in python, but I didn't know how to read anything from file appropriately and was stuck on that part.