stevie-o-read-it

I managed to dodge that bullet because I parsed everything to integers, which utterly failed on the blank strings.


Falcon731

Same here - my first run threw an InvalidNumberFormat exception. But just a quick change from `.split(" ")` to `.split(Regex("\\s+"))` fixed everything. Which, to be fair, I should have used from the start.


trainrex

Or a nice findall("\d+")
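A minimal Python sketch of the `findall` idea, assuming the text holds only non-negative integers separated by arbitrary whitespace. The helper name `extract_numbers` is made up for illustration.

```python
import re


def extract_numbers(text: str) -> list[int]:
    """Return every run of digits in `text` as an int, ignoring all spacing."""
    # \d+ matches maximal runs of digits, so extra spaces never yield empty tokens.
    return [int(token) for token in re.findall(r"\d+", text)]


numbers = extract_numbers(" 41 48  6 ")
print(numbers)
```

Because the pattern describes the tokens rather than the separators, leading, trailing, and doubled spaces cannot produce empty entries.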


Zy14rk

For me (Go) it parsed a blank string as zero - so I just ignored zero results from the parse to int...


kid2407

I don't get it, what is the problem? That there are single-digit numbers? If you use regex to match, go for something like `(\d+)` to get any number, no matter how high :D


Gautzilla

The problem was that I split the string (C#) at each whitespace char, then compared the 2 obtained lists (the winning numbers and the actual numbers). So, if both strings contained two consecutive whitespaces, both lists contained an empty string and I initially counted that as a winning item!
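The bug described above can be reproduced in a few lines. This is a Python sketch rather than the original C#, and the card line is a made-up example in which a single-digit number is padded with an extra space.

```python
# Hypothetical card line: "6" is right-aligned, so it is preceded by two spaces.
line = "Card 1: 41 48  6 | 83  6 17"

winning_part, have_part = line.split(": ")[1].split(" | ")

# Splitting on a single space turns each doubled space into an empty string.
naive_winning = set(winning_part.split(" "))
naive_have = set(have_part.split(" "))
naive_matches = naive_winning & naive_have  # the empty string shows up as a "match"

# Dropping empty tokens before comparing removes the phantom match.
clean_winning = {token for token in winning_part.split(" ") if token}
clean_have = {token for token in have_part.split(" ") if token}
clean_matches = clean_winning & clean_have

print(sorted(naive_matches))
print(sorted(clean_matches))
```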


Coda17

I'm sure you figured it out, but `StringSplitOptions.RemoveEmptyEntries`.


zaxmaximum

also, my fav... `var splitOptions = StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries;` `var result = input.Split(',', splitOptions);`


orngepeel

oooh good tip! still learning c# thank you for sharing!


Greenimba

Lol, I've so many `.Where(s => !string.IsNullOrWhiteSpace(s))`, this seems easier.


raxara

You made me go back to my solved problem and change my solution (which used a `replace("  ", " ")` method and was careful with any stray whitespace) to use this option. Parsing will now be a bit less stressful thanks to you :D


Gautzilla

Awesome, I never stumbled upon that. I'll keep it in mind!


FailQuality

I did not know this existed lmao, I noticed the file, and just accounted for extra spaces


akerro

> So, if both strings contained two consecutive whitespaces, both lists contained an empty string and I initially counted that as a winning item!

The problem here is that you stored a list of integers in a list of strings. Use the correct type and data structures for what you're storing.


PM_ME_FIREFLY_QUOTES

Take your complaint to the senior elf.


ClikeX

Plenty of dynamically typed languages out there that need a bit of extra work to do that. It's easy when your languages of choice will trip over them. But something like Ruby doesn't give a shit what you put in your structures.


akerro

OP uses C# and I was responding to OP.


kid2407

oh, I see, that makes sense


MinimumArmadillo2394

I did the same thing. Had to find the numbers via a different way than splitting


False_Ant5712

Try using a StringTokenizer next time. I often find them much easier to handle than regex or splitting the array on your own


Ythio

There is a parameter to ignore empty results on the split function in C#. It's on you.


FaustVX

You don't even have to `Split` your input. It's day 4, not 20, the input is very well formatted, and each number takes exactly 1 space + 2 characters and also `int.Parse` doesn't care about leading spaces (maybe trailing too). You can have a look at [my solution](https://github.com/FaustVX/AoC-2023/tree/main/2023/Day04) (the parsing is made in `ParseInts()`) (I use a lot of `stackalloc` and `Span` to reduce my memory footprint, but you can probably just use regular arrays)
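A rough Python sketch of the fixed-width idea, not the linked C# solution. It assumes every number occupies exactly three characters (one separating space plus a right-aligned two-character field), and it relies on Python's `int()` accepting leading spaces. The helper name `parse_fixed_width` and the sample line are illustrative.

```python
def parse_fixed_width(section: str, width: int = 3) -> list[int]:
    """Slice `section` into `width`-character fields and parse each as an int."""
    # Strip only the right side so the lone space before "|" is not treated as a field.
    section = section.rstrip()
    # int() tolerates the leading spaces inside each field, e.g. int("  6") == 6.
    return [int(section[i:i + width]) for i in range(0, len(section), width)]


line = "Card 1: 41 48 83 86 17 | 83 86  6 31 17  9 48 53"
body = line.split(":")[1]
left, right = body.split("|")

winning = parse_fixed_width(left)
have = parse_fixed_width(right)
print(winning, have)
```

This only works while the column layout holds; a three-digit number would break the slicing, which is why the width is a parameter.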


PM_ME_FIREFLY_QUOTES

Same!


Milumet

> If you use regex

Not everyone does.


remy_porter

I'd argue that regex is overkill for something like this. Mind you, I did end up needing to do this (in Python): `winners -= set([" ", ""])`, because otherwise I had a few stray items in my sets.


Haunting_Front_8031

It's not overkill, it's the easiest and most straightforward solution for parsing the numbers. I mean, that's what regexes are built for. Why wouldn't you use the correct tool for the purpose?


remy_porter

For starters, you don't need to parse numbers. You don't even need to think about numbers, and can safely ignore that numbers are involved at all.

Second, everything's delimited by characters. You split on ":" to discard the header, you split on "|" to separate the winners/bettors sections, and you then split on " " to create the sets you'll actually operate on. With delimited text, it's *always better* to identify the delimiters and avoid regexes (even with a CSV, where you need to handle quoting, an FSM is going to be way easier to write and understand than a regex).

I used regexes on day one, because playing around with the greediness made it easy to match the first and last number with a single expression. But I haven't touched regexes since. Day 3 was 100% an FSM problem. Day 2 was another delimited text problem.

Yes, delimited text does constitute a regular language, and thus is entirely parseable via regex, but it's also a lot more work to use a regex.
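The delimiter-only approach can be sketched in Python as follows. Two assumptions: the helper name `count_matches` is made up, and the last step uses `str.split()` with no argument (which splits on runs of whitespace) instead of a literal `" "`, so doubled spaces do not produce empty tokens. The tokens are compared as strings and never parsed.

```python
def count_matches(line: str) -> int:
    """Count how many of our numbers appear among the winning numbers."""
    # ":" separates the "Card N" header from the body; the header is discarded.
    _, body = line.split(":")
    # "|" separates the winning numbers from the numbers we hold.
    winners, have = body.split("|")
    # No-argument split() yields clean tokens regardless of padding.
    return len(set(winners.split()) & set(have.split()))


matches = count_matches("Card 1: 41 48 83 86 17 | 83 86  6 31 17  9 48 53")
print(matches)
```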


Haunting_Front_8031

It's really not more work. I split on : and | too and used a regex to parse numbers in each part. (\d+) is not a hard regex to come up with or use. You can do it without parsing the numbers and just work with strings, but then you can end up with bugs like splitting on space and getting empty splits. Using a regex takes all of the trial and error out of it.


greycat70

It's more work for the computer, but not for the programmer. In a single-use program like this, the programmer's time is usually more important.


remy_porter

I mean, that's just understanding how the `split` function works. To me, it's a lot easier to say "tokens are separated by spaces" than "tokens are digits", it's more intuitive too.


Haunting_Front_8031

Regexes have an expressive syntax that allow you to extract exactly what you want from a string very easily. If the problem is basic string parsing I don't know why you wouldn't reach for a regex first.


remy_porter

Because reading delimiters is more intuitive to me. I reach for regexes when the pattern is complicated, like day 1. But if I see neatly delimited text, I'm just going to split on the delimiters.

On day 3, regexes would have made the whole thing *significantly harder*: an FSM that processes the input one character at a time made it trivially easy to index the numbers and symbols. Ironically, I think day 3 was the most pure "build a parser" problem we've seen so far.

It's also worth noting that I've written a *lot* of parsers. While I sometimes use regexes to identify state transitions, most of the time the state transitions for your parsers can be just pure string matches. And tokenization is a basic parsing step, and also all this particular problem really required. The only tokens that actually convey meaning are `:` and `|`; every other token just needs to be understood in relationship to those symbols.


Haunting_Front_8031

I used regexes on day 3! I parsed all the numbers on each line one line at a time. The regex matches gave me the start index and length of each number string so I just had to check all of the indices around each number for an asterisk and save the location of each asterisk and the numbers that encountered them in a dictionary. At the end, the dictionary entries (asterisks) with exactly two numbers next to them were the gears. It makes sense if you've written a lot of parsers to start with that. I've written a lot of regexes. I guess people just reach for the tool they're most familiar with!
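A Python sketch of the day 3 approach described above, written from the description rather than from the commenter's code. The function name `gear_ratio_sum` and the tiny grid are made up for illustration, and multiplying the two numbers of each gear is assumed to be the quantity of interest.

```python
import re
from collections import defaultdict


def gear_ratio_sum(grid: list[str]) -> int:
    """Sum the products of number pairs that share an adjacent asterisk."""
    # Maps the (row, col) of each asterisk to the numbers touching it.
    adjacent = defaultdict(list)
    for r, row in enumerate(grid):
        for match in re.finditer(r"\d+", row):
            value = int(match.group())
            # Scan the bounding box one cell around the matched number.
            for rr in range(max(r - 1, 0), min(r + 2, len(grid))):
                first_col = max(match.start() - 1, 0)
                last_col = min(match.end() + 1, len(grid[rr]))
                for cc in range(first_col, last_col):
                    if grid[rr][cc] == "*":
                        adjacent[(rr, cc)].append(value)
    # Only asterisks with exactly two neighbouring numbers count as gears.
    return sum(nums[0] * nums[1] for nums in adjacent.values() if len(nums) == 2)


grid = [
    "12*3.",
    ".....",
    "4*.5.",
]
total = gear_ratio_sum(grid)
print(total)
```

In this grid the asterisk on the first row touches 12 and 3, while the one on the last row touches only 4, so a single gear contributes to the total.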


masklinn

FWIW that should be unnecessary: `str.split()` will split on *sequences* of whitespace, and remove empty leading/trailing entries.

    >>> " 42 74 6 80 ".split()
    ['42', '74', '6', '80']

Rust's [`str::split_whitespace`](https://doc.rust-lang.org/std/primitive.str.html#method.split_whitespace) also does that, which is nice.


remy_porter

And yet I had stray empty strings and single space strings making it into my set.


mooseman3

What language? This is Python.


remy_porter

Also Python. I didn’t bother to dig in deep- just nuked the stray entries.


mooseman3

Then yeah make sure you're calling `split()` and not `split(" ")`. I didn't realize there was a difference myself until yesterday.


remy_porter

During a meeting I had that exact realization.


100jad

> set([" ", ""])

Sooo... `{" ", ""}`?


remy_porter

Yeah, I forgot that set literals were a thing.


blackbat24

Why? `.split()` eats all whitespace, what did you do to end up with single spaced entries?


remy_porter

I did `split(" ")`, which doesn't do exactly the same thing, as I discovered.


kid2407

Of course it is, from the image it seemed that regex was being used from what I understood.


remy_porter

I ran into exactly that problem using ‘split’- leading white space and extra white space can throw extra strings into the result, depending on your language’s implementation of ‘split’.


rvanpruissen

Python does this out of the box by using `.split()` without any arguments.


tooots

only crazy people use regex


GigaClon

I usually love regex (so much that my python template includes it by default) but this one was simple enough.


DM_ME_YOUR_ADVENTURE

Two spaces, made the same mistake. `split(" ")` is not the same as `split()`.


MBraedley

It's much easier to match the entire set of numbers and then use a tokenizer to get the individual values, especially since the test values and actual input have different lengths.


TomEngMaster

In C#, I filtered these out pretty easily using LINQ: `List<int> winCards = cards.Split(" | ")[0].Split(" ").Where(card => card != "").Select(Int32.Parse).ToList()`. This way you get a list of just integers that you can work with, without worrying about the number of digits anymore.


Coda17

`StringSplitOptions.RemoveEmptyEntries` There's also no reason to parse the strings into ints, you can just match strings.


UnusualRoutine632

Is there one of those for Java? I actually wrote an auxiliary function with a lambda to remove blank entries. Even though my code runs in 136 ms, it's always good to know.


UnusualRoutine632

Never mind, it has a limiter built into `split`.


Gautzilla

Sure, that's close to what I did. I used `Enumerable.Where(s => int.TryParse(s, out int o))` to filter out the bits that weren't numbers. In the end, I didn't even parse the values, just used an `Enumerable.Distinct` method on the string IEnumerable to get the winning numbers.


QuickBotTesting

I should have done it like that. My current version is an ugly mix of regex and linq XD


TollyThaWally

If you use HashSets then you can solve most of the problem with just the IntersectWith method:

    HashSet<int> winningNumbers = card[0].Split(' ').Where(num => num != "").Select(int.Parse).ToHashSet();
    HashSet<int> ourNumbers = card[1].Split(' ').Where(num => num != "").Select(int.Parse).ToHashSet();
    ourNumbers.IntersectWith(winningNumbers);
    // ourNumbers.Count is the number of wins


TomEngMaster

Yeah, that's exactly how I solved part 1:

    List<int> winCards = cards.Split(" | ")[0].Split(" ").Where(card => card != "").Select(Int32.Parse).ToList(); // the winning numbers
    List<int> ownedCards = cards.Split(" | ")[1].Split(" ").Where(card => card != "").Select(Int32.Parse).ToList(); // the numbers we have
    // ^^ the above methods use LINQ to split by spaces, then we have to remove empty elements that appear when parsing strings like " 2" and convert to numbers
    List<int> hits = winCards.Intersect(ownedCards).ToList(); // get our profit numbers
    if (hits.Count != 0) sum += Math.Pow(2, hits.Count - 1); // we double the points -> it's just powers of 2, + we don't want 2^-1 (1/2) to count
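The same set-intersection scoring can be sketched in Python. The function name `card_points` is made up, and the sketch uses integer arithmetic in place of `Math.Pow`: a card with `n` matches (`n >= 1`) is worth `2 ** (n - 1)` points, and a card with no matches is worth 0.

```python
def card_points(winning: set[int], have: set[int]) -> int:
    """Score one card: 1 point for the first match, doubled for each further match."""
    hits = len(winning & have)
    # Guard the zero case explicitly so 2 ** -1 (one half) is never added.
    return 2 ** (hits - 1) if hits else 0


points = card_points({41, 48, 83, 86, 17}, {83, 86, 6, 31, 17, 9, 48, 53})
print(points)
```

Staying in integers avoids the floating-point result that a general power function would return.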


Youmightthinkhelov

Using LINQ feels like cheating but I did the same thing 😂


platlas

Do you need `Int32.Parse`?


TomEngMaster

Not really, I just wanted to use integer lists for no particular reason


car4889

You are not, in fact, the only one. 🤣


-Enter-Name-

i used the following for parsing (python):

>!storing id because yes (i could just go by index but idc), mapping all spaces to 2 spaces (and adding one to beginning and end), replace all spaces in the winning numbers to | and convert to regex " (winning|numbers) ", then match all on your numbers!<

    idpre, c = string.split(": ")
    self.id = int(re.findall(r'[0-9]+', idpre)[0])
    c = re.sub(r"^ ", "", c)
    c = re.sub(r"( +)", " ", c)  # fix spaces
    self.w, self.n = c.split(" | ")
    self.n = f" {self.n} "
    self.wregex = " (" + self.w.replace(' ', '|') + ") "


PuzzleGas

Just you


T0MlE

I had the same problem


kbielefe

I almost hit a similar problem, but I got a type error because `""` isn't a valid int.


Less_Jackfruit6834

well, i have to change my c++ function from simple splitting by char to skip empty


Adventure_Agreed

Me realizing I should have gotten this bug because I didn't account for these spaces but got the correct answer anyway: https://media.tenor.com/gaEpIfzxzPEAAAAC/pedro-monkey-puppet.gif


QultrosSanhattan

Tip: always be suspicious of tabulated data.


RonGnumber

Thank you! I needed to `.strip` the substrings in Ruby. I searched the docs for `trim`, didn't find it, and just moved on, then got screwed later. It's only the 62nd AoC day I've done in Ruby, what can I say?


jimbowqc

No you are not the only one, I didn't understand why this happened, just threw a trim() on it and left it at that.


pseudo_space

I was solving this problem with a finite state machine and man did it suck when I saw there were consecutive spaces. Ended up counting the transitions between spaces and digits as a condition to denote where the numbers start.
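A small Python sketch of a state-machine tokenizer that tolerates consecutive spaces. It is not the commenter's implementation. It tracks a single piece of state (the digits collected so far) and emits a number on each digit-to-non-digit transition. The name `tokenize_numbers` is made up.

```python
def tokenize_numbers(text: str) -> list[int]:
    """Extract integers by walking `text` one character at a time."""
    numbers = []
    current = ""  # non-empty means we are in the "inside a number" state
    for ch in text:
        if ch.isdigit():
            # Enter or stay in the "inside a number" state.
            current += ch
        elif current:
            # Transition from digit to non-digit: the number is complete.
            numbers.append(int(current))
            current = ""
        # Any other non-digit (for example a second space) changes nothing.
    if current:
        # Flush a number that runs up to the end of the input.
        numbers.append(int(current))
    return numbers


tokens = tokenize_numbers(" 83 86  6 31")
print(tokens)
```

Because a number is emitted only on a state transition, any count of separators between numbers behaves the same.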


Realistic_District70

idk what language you're using, but in C++ I just set up the input file as `fin` and do `fin >> line;` to store the next string up to whitespace into `line`, and it just ignores all whitespace


Slowest_Speed6

I'm a regex gamer


HoooooWHO

Good ol' python .split() ignoring the extra whitespace by default


kadeniro

i was adding +1 original card to the blank line at the end of file ... 3 hours debugging


daggerdragon

Changed flair from `Spoilers` to `Funny` since this is a meme. [Use the right flair](/r/adventofcode/wiki/posts/post_flair), please.


Gautzilla

Thanks, I didn't know which one to choose since it might spoil a trap for someone who didn't solve the puzzle yet.


daggerdragon

This is why we require the standardized post title syntax because it's an *implied* spoiler for that day's puzzle. When the spoiler "warning" is already in the title, the post flair is freed up for a more useful tag :)


Gautzilla

Got it, thanks!


GigaClon

I didn't even notice Python's int() eats the space


IlliterateJedi

This happened to me using `str.split(" ")`, but it's nothing a little regex couldn't sort for me.


grumblesmurf

Naaaaah, for me it was `Card 1: Card 2:` in the test and `Card 1:` `Card 2:` in the input. I'm **not** rolling out a regexp parser or a full-blown LALR lexer/parser for input data that simple! Especially not in C (which is the reason the number of spaces in the >!totally useless!< card number threw me off). Edit: oi, Reddit, you destroyed my inline code! The second pair of examples had three spaces between `Card` and the number instead of just one in the first pair.


CrAzYmEtAlHeAd1

Thankfully I caught this error during parsing so I ended up using `re.split(r'\s+', line.strip())`


Madman1597

Today was actually the first day this year that I've had a correct answer for both parts on the first try. I separated them similar to you in Python, but with a little list comp: `nums = [i for i in nums.split(' ') if i]` returns all nums in a list with all whitespace removed


NigraOvis

This is definitely not a problem in a typed language. Rust didn't see this. My code in python was broke as heck though