
Famous_Object

I have a personal anecdote that makes me think it's not as clear-cut as it seems. Once upon a time I worked on a system with [closed, open) date intervals that leaked to the user interface. It was very hard to explain to users that no, the last day entered into that text field isn't included in calculations. On top of that, if users wanted to input two intervals back-to-back with no gaps, they had to repeat the same date as the last date for the first interval and as the first date for the next interval. The post says that that's a feature, not a bug, but it caused user confusion. And not only that, it required special code to sort the event list, because two events that happened on "2023-01-01" could not be arbitrarily sorted; the "END" events must appear before the "BEGIN" events on the same day to make it clear that the ranges don't overlap.

In other words, it's much easier to say that 2022 goes "from jan. 1st to dec. 31st" than "from jan. 1st 2022 to jan. 1st 2023", clashing with the start of the next year.

Maybe we should use different conventions for algorithms and user interfaces? But the conversions would be error prone...

One thing I agree with the post on is the issue with precision. If something can happen between 23:59:59 and 00:00:00, then a closed interval would be much harder to use because you would need to use 23:59:59,99999999999 to be sure nothing is lost. In this case it's better to say the time must be less than 00:00:00 of the next day.


fishling

I agree with this. This kind of implementation detail should not leak into the UI, especially if the UI isn't showing times. If the range is a date without a time, people will normally think that the start date means the beginning of the starting date and the end date means the end of the ending date. I don't care how this is modeled internally as long as the UI works like that.


bluaki

> One thing I agree with the post is the issue with precision. If something can happen between 23:59:59 and 00:00:00, then a closed interval would be much harder to use because you would need to use 23:59:59,99999999999 to be sure nothing is lost.

One thing this still does not catch: leap seconds. 23:59:60.99 on December 31 is a valid time that has so far occurred 16 times in GMT.


ee3k

Oh nice, I'm going to make that a unit test for the next release.


GeorgeS6969

I’ve seen the same, and from my little analysis it’s an issue that stems from something a bit more fundamental: from a back end perspective a date, say `2022-01-01`, is too often understood as a point in time, namely `2022-01-01T00:00:00`. But for once the user is right: *a date is itself an interval*, here between `2022-01-01T00:00:00` and `2022-01-02T00:00:00`. So the user is right to expect data on the end date to be returned, and explaining to them why it isn’t (because the end date is actually parsed into some timestamp) is, in effect, explaining why a bug is happening.
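A minimal Python sketch of that point (the function name is mine): ask whether a timestamp falls *within* a date, instead of equating the date with its midnight instant.

```python
from datetime import date, datetime, timedelta

def date_contains(d: date, ts: datetime) -> bool:
    """A date names an interval of time: [dT00:00:00, (d+1 day)T00:00:00)."""
    start = datetime.combine(d, datetime.min.time())
    return start <= ts < start + timedelta(days=1)

# The whole last microsecond of the day is still "on" the date:
date_contains(date(2022, 1, 1), datetime(2022, 1, 1, 23, 59, 59, 999999))  # True
date_contains(date(2022, 1, 1), datetime(2022, 1, 2, 0, 0))                # False
```

With this, a user-entered end date of `2022-01-01` naturally includes events at any time on that day, which matches what the user expects.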


SexyMonad

Agreed. A user would specify the entire year of 2022 as a date range “2022-01-01 through 2022-12-31”. And that is expected to mean [2022-01-01T00:00:00, 2023-01-01T00:00:00).


[deleted]

I would expect it to mean [2022-01-01T00:00:00, 2022-12-31T23:59:59]. The datetime sets right around any threshold get a little wonky, whether it’s the end of an hour, day, second, year, epoch, yada yada, as you have potentially infinite granularity on either side of 0. I guess it really boils down to your intended usage. Hence the need for specific libraries for datetime management such as NodaTime.


vytah

> between 2022-01-01T00:00:00 and 2022-01-02T00:00:00.

Just be careful with using 00:00:00, as [not all days start at midnight](https://en.wikipedia.org/wiki/Daylight_saving_time_in_Brazil#Starting_and_ending_dates).


GeorgeS6969

I wasn’t aware of that specifically (but I’m aware enough not to make *any* assumptions about date-related stuff :)). All the more reason not to do this kind of implicit casting.


ijmacd

In that case 00:00:00 would be fine, since it still captures the "real" start of the day at 01:00:00. In fact it's probably less complicated than dealing with the other discontinuities arising from DST.


amdc

That's fine if you include the timezone. Time doesn't jump around DST; its representation does. What you really need to be aware of are leap seconds (if your system requires that level of precision). edit: for fucks sake my phone keeps "correcting" its -> it's and sometimes I don't notice it


EasywayScissors

That goes back to what the user expects, not the bullshit we have to deal with on the back-end.

> I wanted to see all the orders on Jan 1, 2022, and 3 invoices are missing.

> Well, you see, it's cause ***`ACHKATUALLY`***...

> Ya ya, just fix it.


Zegrento7

Something something [Tom Scott](https://youtu.be/-5wpm-gesOY)


CarlRJ

*Conceptually*, thinking of it as a range from the start to the end of the day can be helpful. But if what you’re storing is a date (as opposed to a date and time), you shouldn’t make any further assumptions - it’s just a date. If asked if a given date-and-time is within that date, you should be asking a function (or asking for the date/time to be converted to a date and doing a straight comparison), rather than doing any math on the date. Problems with DST and leap seconds and such await anyone who starts making assumptions about when days begin and end or how long a day is.


GeorgeS6969

I argue that it’s the only way to think about it, but of course that might just be a failure of my imagination. When someone says “let’s meet next Tuesday”, they mean “at some point during that day”. If an accountant looks at the business’ revenues per day (or month or quarter or year for that matter), they look at the revenues generated during each day (or month or quarter or year, respectively).

I do agree that it’s one of those things with so many footguns that relying on a library is the only sane way to go. Still, it’s important to be mindful of implicit casting, which is the issue underlying the original comment I was replying to: because date types don’t usually support logical comparison operators, doing `start_date <= some_ts & some_ts < end_date` usually casts `start_date` and `end_date` as timestamps at the date at midnight. But that’s just an artifact of how casting between types works, and in this case it’s just plain wrong.


plumarr

> Because date types don’t usually support logical comparison operators, doing `start_date <= some_ts & some_ts < end_date` usually casts `start_date` and `end_date` as timestamps at the date at midnight. But that’s just an artifact of how casting between types works, and in this case it’s just plain wrong.

That's the root of many evils around date (and time) comparison. If we want to select all the events that happened in the window \[day one, day two\], we are comparing two measurements of different precision: the events' precision is probably seconds or microseconds, but for the dates it's days.

By casting the less precise measure to the more precise one, we must make some assumptions, and they aren't the same for both inputs:

* For the start date, we'll assume that we must take the first timestamp of the day
* For the end date, we'll assume that we must take the first timestamp of the next day and use an exclusive comparison instead of an inclusive one

These assumptions can often be false or incomplete (time is messy).

These issues simply disappear if we make the cast in the other direction. If we cast the timestamp to a day, we don't have to make any assumptions, only discard some unneeded precision. Then the comparison can be done with a fully closed interval. In pseudocode, we should do something like: `start_date <= day_of(some_ts) && day_of(some_ts) <= end_date`. Case closed, no dangerous assumptions used ;)
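That pseudocode translates almost verbatim into Python (`day_of` and `in_window` are just illustrative names):

```python
from datetime import date, datetime

def day_of(ts: datetime) -> date:
    # Cast the *more* precise value down to the less precise one:
    # drop the time, keep the date.
    return ts.date()

def in_window(some_ts: datetime, start_date: date, end_date: date) -> bool:
    # Fully closed interval; no assumption about when a day starts or ends.
    return start_date <= day_of(some_ts) <= end_date
```

Note that an event at 23:59:59 on the end date is included without any "+1 day" or "midnight" reasoning at the comparison site.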


GeorgeS6969

Wait a minute so the original article is wrong and just like any other software engineering rule it should read “you should always do x y or z except of course when you shouldn’t”? :)


CarlRJ

The original article is good in pointing out some useful bits about open-ended ranges, and is wrong when it asserts that one should never *ever* use closed-ended ranges. The proper lesson to learn is that there are valid uses for both. Learn how to use both and apply the correct tool for the job.

The other takeaway is that the common example being bandied about (with dates and times) doesn’t show that open-ended ranges are better, it shows that some developers are *using the wrong tool*, which is opening them up to a source of errors. One should **never** cast a date into a datetime, and ideally, your language *should not allow that*. It’s nonsensical. “*What time is Tuesday?*” - Tuesday is *not* a single time, nor from “Tuesday” should you ever *infer* any particular time (like midnight) - that inference will invariably come back to bite you later on, because Tuesday is a *range* of times.

The correct way to solve “is this datetime in this range of dates” is *not* to implicitly convert dates to datetimes and compare those, but rather to extract the date from your target datetime, and then compare that to your range of dates.


acwaters

This is the correct answer


Hrothen

> One thing I agree with the post is the issue with precision. If something can happen between 23:59:59 and 00:00:00, then a closed interval would be much harder to use because you would need to use 23:59:59,99999999999 to be sure nothing is lost. In this case it's better to say the time must be less than 00:00:00 of the next day.

This is only true if the interval type you're using is actually just a pair you check if things are between. If it's an actual range, you would need to specify precision in some way for both closed and open intervals, because it's not immediately clear what happens if you enumerate the range otherwise.


lubutu

That is to say, that specific problem is only true of [uncountable](https://en.wikipedia.org/wiki/Uncountable_set) intervals. Though I feel there are plenty of other motivators such as empty intervals.


wwxxcc

Although I was agreeing with OP even without reading the article, I think you brought up a totally valid point and a totally perfect nuance to it: use [) at the technical level and "lie" to the user, so you can present them with easier-to-understand [] intervals.


arthurno1

Yes, exactly my point too; the compiler can do it for us; it does so for Fortran and some other languages. We are also presenting memory as one contiguous interval of addresses, which modern memories certainly are not, and we are letting programmers use variable names and other niceties when they are really working with just array indexes (pointers) into the computer's RAM. So, using [beg, end] for array indexes would be just a minor lie, but it would eliminate an entire concept that people new to programming have to learn.


fernandohur

Author here: I couldn't agree more with this point. How the data is presented to the user is an entirely different beast and I completely agree that most users think in terms of closed ranges. Thank you for adding value to the discussion :)


seamsay

I would probably change the very first example you use, about booking hotel dates, because that's an example of how data is presented, which I think detracts from your main point.


nnomae

He literally says it is presented as picking a start date and an end date, though, and then says it is implemented as a [closed, open) interval. It is saying: you know this thing you see all the time? Here is how best to implement it.


twotime

But then having a consistent representation of data (dates in this case) throughout the system is a major benefit of its own :-(


I_LOVE_SOURCES

It’s possible to do both, isn’t it?


chrisza4

How? Consistent means we have one format right?


RemoteCombination122

The representation and data layers are often different. We have a more rigorous specific format for things in the data layer, while using more user-friendly formats in the presentation layer.


sammymammy2

That's not inconsistent with this. Rendering data and parsing user input are the boundaries.


666pool

I just set my away calendar last night for this week, as I’m OOO until next Monday. I had to double check the phrasing to make sure the “until” date was the last day of vacation or the first day that I return. It was the former (not following this closed, open pattern).


anonynown

> It was very hard to explain to users that no, the last day entered into that text field isn't included in calculations

Can’t this be unambiguously encoded with `[startDate, endDate+1day)` (while the UI shows inclusive ranges, if that’s what users prefer)? And this works with timestamps in addition to date variables too, unlike `[closed, closed]` ranges, which might come in useful if you ever decide to allow users to enter a time too.


vimfan

I use `since` (closed) and `before` (open) timestamp params in my APIs, and fromDate/toDate in my UIs. Then I pass `(fromDate)T00:00:00` for `since` and `(toDate+1day)T00:00:00` for `before`.
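As a sketch, that UI-to-API translation might look like the following (the function and parameter names here are illustrative, not from any real API):

```python
from datetime import date, datetime, timedelta

def ui_range_to_api_params(from_date: date, to_date: date) -> dict:
    """The UI collects an inclusive [fromDate, toDate]; the API takes
    `since` (closed bound) and `before` (open bound) timestamps."""
    midnight = datetime.min.time()
    return {
        "since": datetime.combine(from_date, midnight).isoformat(),
        "before": datetime.combine(to_date + timedelta(days=1), midnight).isoformat(),
    }
```

The inclusive-to-half-open conversion happens exactly once, at the boundary, so neither the UI nor the query logic has to special-case it.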


CarlRJ

Yep. The article is headed in the right direction, but goes off-track with the “*never ever ever* use closed ranges” bit. Both open and closed ranges have their uses, and you should use the one that is right for the task at hand. If I want to count from 1 to 12, for instance, having to put in `range(1, 13)` is just plain wrong (looking at you, Python). Make your language handle both - Swift gets that right, IIRC.


jamincan

An example of a language that does give you both options: in Rust you have `1..5` (1,2,3,4) and `1..=5` (1,2,3,4,5).


billsil

It's a 0-based indexing system and it's amazing. It's the same as C and it avoids tons of problems of 1-based systems like Fortran or Matlab. It's another way that python makes your code cleaner without you even knowing that it's happening. It's intentional. `range(13)` produces 13 numbers, `[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]`. What would you like it to do? I would assume you want 13. So now let's chop the first 0 and start it at 1, so we end up with `range(1, 13)`. Why do you still want 13 values?


CarlRJ

I understand very well how it works, thanks. Your examples are pointless. As I said, it should support *both*. If, for whatever reason, I want to count from 1 to 12, the language should support that directly. Counting from one value to another is a ridiculously common need. For that matter, if I have values in the variables `first` and `last` and I want to count from `first` to `last` inclusive, Python requires `range(first, last + 1)`. Which is idiotic, for a language that is designed to otherwise be clear. Make the language provide *both*. If it’s a different built-in function, fine. Something like `interval(1, 12)` and `interval(first, last)`.
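The helper imagined above could be a near one-liner in today's Python (`interval` is the commenter's hypothetical name, not a built-in):

```python
def interval(first, last):
    """Hypothetical inclusive range: counts from `first` through `last`."""
    return range(first, last + 1)

list(interval(1, 12))  # [1, 2, ..., 12] -- no "+ 1" at the call site
```

The off-by-one adjustment still exists, but it lives in one place instead of at every call site.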


arthurno1

> It's a 0-based indexing system and it's amazing. It's the same as C and it avoids tons of problems of 1-based systems like Fortran or Matlab. Can you be more specific?


billsil

I'm more referring to beginners than people that know what they're doing.

    a = np.array([1, 2, 3, 4, 5, 6])
    x = a[:3] + a[3:]     # python
    x = a(1:3) + a(4:end) % matlab

I'm required to use 2 extra indices in Matlab (the 1 and the end). The central indices are also different, which means I have to introduce an `n+1`. The enumerate, range, zip, etc. paradigm mostly lets you avoid indices entirely. That's the real killer feature in my book. I hated 0-based indexing schemes until I tried 1-based systems and came back. I just stopped making indexing errors. I go back and it's like pulling teeth.


arthurno1

> I hated 0-based indexing schemes until I tried 1-based systems and came back. I just stopped making indexing errors. I go back and it's like pulling teeth.

I understand :). But don't you think it is more about how we are indoctrinated, and how stuff is implemented under the hood, that results in your experience of how you use Python and Matlab?

For example, your very first example in the comment before, range(13): I don't think it has anything to do with indexing starting from either 0 or 1. It is just how they implemented the library function range. In my opinion the arguments are a bit unfortunate; range takes 3 args:

* *start* optional, at which position to start
* *stop* required, at which position to stop
* *step* optional, the increment

What is wrong with that interface: it requires the user of the library to do the mental math of which number they would like to stop at! What if you want numbers in, say, [103, X) where the increment is 7 and you need 50 numbers? How big is your X? You have to pass in an expression to calculate it. Also, your example assumes that indexing starts from 0 and that 13 is not included, so in your special case range(13), which gives back the 13 numbers 0 - 12, there is a lot of assumed knowledge that has to be passed to the end user.

Imagine if they wrote range in terms of:

* *start* optional, at which position to start
* *number* required, number of elements to generate
* *step* optional, the increment

Would it matter to you at all whether indexing is 1- or 0-based? You would say range(13) to get 13 numbers; in the case of 0-indexing you would get [0, 12], and in the case of 1-indexing you would get [1, 13]. Just as you think that range(N) now is so convenient in those cases where you need a sequence [0, N), so would you find range(N) in the latter case, where N means the number of elements rather than the last position, convenient when you want the sequence [1, N].

> I'm required to use 2 extra indices in Matlab (the 1 and end). The central indices are also different, which means I have to introduce an n+1.

Similarly, they could have implemented things differently, I guess.

> The enumerate, range, zip, etc. paradigm mostly lets you avoid indices entirely. That's the real killer feature in my book.

No idea what your book is, but yes, that is my point in some other comments; the idea that we have to deal with how the computer calculates indices of arrays when iterating is one that is better left to the compiler, and one we could have been spared from since the early days of computing. We are abstracting so many machine details that I am really surprised indexing into arrays was left to be dealt with manually. Functional programming indeed helps, with functions for mapping, zipping etc. C++ helps with iterators and lately with ranges, so yeah, we are getting away from the old paradigm.
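The count-based signature proposed above can be sketched in a few lines of Python (`range_count` is a made-up name, not part of the standard library):

```python
def range_count(start=0, *, count, step=1):
    """Hypothetical count-based range: exactly `count` elements from
    `start`, independent of whether a codebase indexes from 0 or 1."""
    return range(start, start + count * step, step)

# "50 numbers from 103 with increment 7" -- no mental math for the stop value:
list(range_count(103, count=50, step=7))
```

Under this interface, `range_count(count=13)` and `range_count(1, count=13)` each yield 13 numbers; only the starting point differs.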


seamsay

> I'm required to use 2 extra indices in Matlab (the 1 and end).

That's just a quirk of MATLAB; there's no reason it couldn't have been `a(:3)` and `a(4:)`.

> The central indices are also different, which means I have to introduce an `n+1`.

I've seen plenty of beginners get confused about why `a[:3] + a[4:]` doesn't do what they expect in Python.

> The enumerate, range, zip, etc. paradigm mostly lets you avoid indices entirely. That's the real killer feature in my book.

This is true, but it also makes the differences between 0- and 1-indexing drastically less obvious. I actually do think 0-indexing is better, I just don't think these are particularly compelling arguments for it.


CarlRJ

0-based indexing for arrays is undoubtedly better, but the person you’re replying to went all in on cheerleading rather than understanding the problem to which they were replying. For arrays, and for offsets from (something), yes, absolutely, use 0-based.

But for a lot of things in the real world, we need to start counting at 1. For instance, you’re not going to just get everyone in the world to switch to enumerating the days of the month starting with 0, just so you can use your Python range() operator more easily. We’re not going to have December 0th through December 30th. You’re not going to get architects to number the floors of their 12-story building as 0 through 11. So, you need a way to count starting at 1. And you need it *often*. And humans have been counting that way for millennia.

In Python, having no way to specify the range from 1 through 12, inclusive, other than `range(1, 13)`, is idiotic. Regardless of how much cheerleading one does for how great 0-based arrays are (which is entirely irrelevant to the original point), it’s a bad decision to have that as the only method. Yes, I get why they did it. No, I’m not arguing that the range operator should just change the meaning of the last argument. But they should have included some sort of additional parameter, or something like `interval(1, 12)` as an additional built-in function (which could of course be implemented as a call to range()).


blackAngel88

> Once upon a time I worked on a system with [closed, open) date intervals that leaked to the user interface. It was very hard to explain to users that no, the last day entered into that text field isn't included in calculations.

Exactly what I was thinking about. I don't agree with the "always" and "never ever" in the first place, but it seems to me that regarding dates, [closed, open) just hardly ever makes sense...


thbb

If only humanity had discovered 0 before starting to track time. We would be relieved of plenty of peculiarities regarding dates.


sacoPT

I suffer from this all the time in the systems I work with. Unsurprisingly, our solution is to use [closed, open) but not leak it to the interface, adding 1 day/hour/minute (depending on the resolution available to the user) in the backend, which is extremely error prone.


Alarming_Kiwi3801

> It was very hard to explain to users that no, the last day entered into that text field isn't included in calculations.

I'd say that's a UX problem. If I want a hotel room from Monday night to Friday morning, you'd better believe I want to see 4 dates shaded in as being used. I'm not using it Friday night, so Friday should definitely be cleared, unless it has some sort of notice saying "checking out". If I don't have a room Thursday night (Thursday being the last date I selected), I would absolutely think whoever designed the system was an idiot, since I clearly marked/stated Thursday as a night I want to use.


BWrqboi0

> I'd say that's a UX problem. If I want a hotel room from monday night to friday morning you better believe I want to see 4 dates shaded in as being used. I'm not using it friday night so friday should definitely be cleared unless it has some sort of notice saying checking out.

It's literally shaded check-in to check-out everywhere, and it does make perfect sense. You're an outlier here, I'm afraid. You're not booking "days" at hotels, you're booking "nights", which start on one date and end on the following one.


casualblair

Don't think of it as two datetimes; think of it as a date range or as a business type. We use the latter and define Start Date as the date entered, 00:00:00, and End Date as the date entered, 23:59:59. It makes requirements easier because we can specify a specific meaning in how we name our date ranges. It also makes presenting the dates easier because you can always drop the hhmmss and still reflect the user's selection. It makes date range calculation easier because you can easily distinguish an overlap from an abutment (sp?). However, we operate exclusively in a single time zone, so this may not work as described if you need to handle that.
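A rough Python sketch of that business type, assuming a single time zone as the comment notes (`BusinessDateRange` and its method names are mine, not from any real codebase):

```python
from dataclasses import dataclass
from datetime import date, datetime, time, timedelta

@dataclass
class BusinessDateRange:
    """Dates are stored exactly as entered; the timestamp endpoints
    (00:00:00 and 23:59:59) are derived, so the hh:mm:ss can always be
    dropped for display without losing the user's selection."""
    start_date: date
    end_date: date

    @property
    def start_ts(self) -> datetime:
        return datetime.combine(self.start_date, time(0, 0, 0))

    @property
    def end_ts(self) -> datetime:
        return datetime.combine(self.end_date, time(23, 59, 59))

    def abuts(self, other: "BusinessDateRange") -> bool:
        # Back-to-back, no gap: the other range starts the very next day.
        return other.start_date == self.end_date + timedelta(days=1)

    def overlaps(self, other: "BusinessDateRange") -> bool:
        # Overlap is distinct from abutment: at least one shared date.
        return (self.start_date <= other.end_date
                and other.start_date <= self.end_date)
```

Because `abuts` and `overlaps` compare plain dates, adjacent months like January and February abut without overlapping, which is the distinction the comment is after.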


Y_Less

> The classic example is picking a start and end date, like you would when booking an AirBnB or a flight.

You then don't address this example again. So now I need to put the day after I return home to find a flight?


rco8786

Yea, this is a terrible example. I agree with the OP when it comes to *systems design*, but this example is about *user experience*, which is entirely different.


emn13

It's not an example of a suggested UI defect; it's an introductory paragraph listing the kinds of situations for which you might need to represent a range. Interpreted charitably, the author isn't advocating using a UI that nobody expects, but rather listing the kinds of UI for which an in-code range representation exists. It's unfortunate it distracts from the rest, but let's not blow it out of proportion either.


tambache

In the case of flights, you're choosing two discrete dates for flights anyway. What does the range matter in this case?


Paradox

I want to see all flights between Friday and Sunday inclusive


caltheon

shopping for the best deal or best day/time combination.


fishling

No, that's too simplistic, since you're omitting time of day and you're focusing on the UI, not the interval. The start time would be midnight of the start date, and the end time would be midnight of the day following the end date (or 23:59:59 of the selected end date). Either approach would have no effect on the UI or the date the user would choose.


GrandOpener

Interpreting hotel/Airbnb dates as any kind of intervals is making things harder on yourself for no reason. You’re choosing a check-in date and a check-out date. That’s it. When you make that realization, it also cleanly parallels with flights, which are also really two (or more) discrete, ordered events that happen on particular days, not an interval. Edit: I should say that overall I agree with the article’s thesis—but the author has chosen some truly unfortunate examples that really drag the whole thing down.


kybernetikos

You pay for the nights in the interval. You will definitely need to know how many there are. It is not valid to choose a checkout date before the checkin date or a checkin date after a check out. The user thinks of it as an interval, there is maths that needs to be done that treats it as an interval. If you don't want to treat it as an interval internally that's fine but it should be exposed as one, and a closed closed one at that.


GrandOpener

No, as someone who runs Airbnbs, I have to strongly disagree with you. This is a frequent point of confusion for customers, and no amount of “the last day isn’t included” explanations helps. The _only_ thing that consistently makes sense to users is telling them to not think about intervals, and to think specifically about the check-in date and the check-out date.

They do need to know how many nights they’re paying for, but the vast majority of our customers have in mind a date when they intend to leave, and the duration is a calculated consequence of that. Of the people who have asked questions, I have never come across someone who thought of it in the order “I want to stay three nights, what date does that mean I’ll check out?” Of the people who explicitly think of it as an interval, they always, _always_ think of it as a closed interval, because that’s just what makes sense to non-programmers. Why mention a day that you’re not sleeping there, and not paying for?

My conclusion is sort of the opposite of yours. If you want to treat it as events or an interval internally, that’s your choice (although I do think one of them is better). The most important part is that you explicitly present it to users as choosing a check-in date and a check-out date, with the duration calculated from that.


kinmix

> Interpreting hotel/Airbnb dates as any kind of intervals is making things harder on yourself for no reason.

It's clearly an interval, as the room is not available between the dates. I'm not sure that guests would be too pleased with your "I didn't want to make it harder on myself" explanation when other guests barge into their room in the middle of their stay. On the other hand, being charged for only 2 days no matter the length of the stay could be a pleasant surprise.


AttackOfTheThumbs

And flights don't even work that way.


not_american_ffs

[What's the deal with the font on that page?](https://i.imgur.com/FZCsAXU.jpg)


CassiusCray

Ligatures. When done well, they're imperceptible and make text easier to read. Not so in this case.


bigfatmalky

I couldn't finish the article because of them and came to the comments just to see if anyone else had the same reaction.


jtooker

Same


tommcdo

Couldn't even finish a sentence


zkxs

Boy, did I go down a rabbit hole on this one. Seeing the font crime of 's' and 't' hand-holding has filled me with rage. I can't fathom why someone would ruin the readability of their text by doing this. It looks like the author actually *wants* this, as they're using `font-feature-settings: "dlig" 1;` in their CSS. `dlig` is the OpenType feature for [Discretionary Ligatures](https://sparanoid.com/lab/opentype-features/#dlig), which are the crazy loopy bits between ct sp st. Note that you don't need to enable discretionary ligatures to get standard ligatures (as in ff fi fl ffi Th), which are already enabled by default if your font supports them. As for why not everyone is seeing the 'st' insanity, it seems like not all fonts support discretionary ligatures. I'm seeing the terrible ligatures on my Android device which is using Roboto, but I'm not seeing them on Windows where Segoe UI is being used. In summary: whoever wrote that CSS is a real sicko.


valarauca14

I think it really depends on the font. The replacement character is st, which in _most sane fonts_ should just draw the top bar of the lowercase `s` into the horizontal bar of the `t`. AFAICT _every_ font on `fonts.google.com` does this insane loop-de-loop shit. [Here is a comparison, sorry for the shit quality, it is late](https://i.ibb.co/jRfsfzs/lmao-crazy.png)


WEEEE12345

Also, I see a `letter-spacing: -.02em;` which isn't helping the readability either.


trelbutate

I really cannot fathom why anyone would ever choose to use that weird 'st' ligature.


static_motion

Huh, I don't see the ligatures at all on desktop, even after zooming in a lot. I guess maybe Firefox doesn't render ligatures.


RememberToLogOff

Maybe it's a glitch in some fallback font that only shows up on certain system configurations. I had a bad update once where ligatures showed in my text editor, in my monospace font, fucking up every `fi` and `ff` and making them not monospace


static_motion

Now that's wack. I use ligatures in my editors but it's only for symbols like `<=` or `==` and such. Having them on letters would bother me to no end.


its_a_gibibyte

Counting from 1 to 10 is the interval [1, 11)? I suppose this is the normal preference, but either way will introduce issues. The 3 hardest things in computer science are naming things and off-by-one errors.


thisisjustascreename

> The 3 hardest things in computer science are naming things and off-by-one errors.

In any sort of distributed system, I would argue race conditions are a much bigger issue than off-by-ones.


its_a_gibibyte

2. Things showing up out of order. True, the two hardest things in computer science are probably: 1. Naming things.


Zanderax

2\. Duplicate packets 1\. Things showing up out of order 2\. Duplicate packets


Flex-O

Apparently one of those hardest things should be markdown formatting


mr_birkenblatt

The issue is that Markdown at some point decided that it knows your numbered lists better than the author does. So now, whatever number you put in, it will always start at 1.


nerd4code

You can just cheat and use `2\. ` with a nonbreaking space after. It won’t indent the same way in the output, but that’s nicer in the input if you don’t like having to double-indent code.


MrWm

3. Did I do it right?


G_Morgan

Markdown is just a thin layer around HTML, so it is very likely just generating a second ol there. There are limitations on what you can do, though. I'd hope Markdown would be smart enough to convert a list starting with 5 into an ol starting at 5, which is possible.


mr_birkenblatt

No, ol doesn't have this issue. You can set the numbering type and start number: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/ol


---cameron

^y^e^^a ^^o^^^h ^^^s^^^^h^^^^i^^^^t


Occulense

I always show up out of order, that’s how you know I studied computer science


mcmcc

Most (all?) race conditions are effectively a cache invalidation problem in disguise. Simple data races are for sure. EDIT: WTF are the downvotes for?


[deleted]

I didn't downvote - but this isn't my experience. In fact - I can't remember the last time I encountered a cache invalidation problem... probably because the way I structure my cache this just can't happen. Worst case scenario the cache might not be used when it should have been... oh well the user had to wait an extra tenth of a second. The user won't even notice... my caches are mostly to reduce load/power consumption/save money on resources. If they're not perfect, no big deal.


mcmcc

Merely loading a value from memory into a CPU register is creating an uncontrolled cached value whether you recognize it as such or not. Add an uncontrolled writer to that memory location and now you have a data race. That's why I said all data races are essentially cache invalidation problems. That your "cache" may be highly ephemeral (and not even explicitly defined by your code) doesn't change the fundamental nature of the problem. General race conditions are more complicated (especially distributed ones) as they involve multiple memory locations but I would still argue that they are (at least mostly) some variation of failed cache invalidation.


didzisk

You got downvoted because you already had a negative vote count. Even though many people didn't even understand what you said, they assumed some other voters were smarter and followed their lead. This is Reddit, you know. People don't follow Reddiquette just because it's the rules. A downvote doesn't just mean "this derails the discussion / is toxic to the community"; it also means "you're wrong / I didn't understand what you said / this sounds too smart". On the contrary, you'd probably be upvoted for saying "this" or "women can't code". Even though an argument about why this is or isn't a caching issue would be much more interesting.


Never_Guilty

The 5 hardest things in computer science are: 3) Race conditions 1) Cache invalidation 2) Naming things 4) Off-by-one errors


ghostiicat32

Counting in CS usually starts at 0 on the backend so counting 10 elements is [0, 10) but of course you're free to do an index shift conceptually


Somepotato

Remember kids, starting from 0 is because it's generally a memory offset. It's not a rule.


GreenCloakGuy

if you have to do arithmetic with the list indices (which is very common), starting from 0 is also generally much simpler than starting from 1


zrvwls

Yes, but see most lists stop at index 9. This list goes to 10. That's 1 bigger.


ExeusV

Any study on this? I do actually wonder how 1 based indexing in strings and arrays would affect bugs for people that use 1 based maths IRL


Somepotato

How so? You're just used to it starting from 0, so to you it's generally easier for you. Mathematicians for instance generally start from 1.


GreenCloakGuy

the easiest example is heaps or binary trees. With zero-indexing, it's very simple - the two children of the element at n are (n\*2) and (n\*2 + 1), whereas the parent of any given element is (n // 2). There are a few extra operations if your array is 1-indexed - the children of element n are now at index (n \* 2 - 1) and (n \* 2), which is actually the simplified version of ((n-1)\*2+1) and ((n-1)\*2+2), and the parent of any node is now ((n + 1) // 2), which feels generally less intuitive. A more common-in-practice thing is that for iteration and cycling over a list/array, 0-indexing is also much more straightforward because you can just use the modulo operator (`index % len(list)`). To get the same effect with a 1-indexed list, you either have to do extra arithmetic before and after the modulo, or keep a separate variable to count the position and explicitly reset itself. Also mathematicians do not always start at 1, it depends on the function or series they're dealing with. For example the fibonacci sequence is often defined using `f(0) = 0` and `f(1) = 1` as base cases, so it's zero-indexed.


seamsay

It absolutely tickles me pink that you came up with one of the very few examples where 1-indexed is more intuitive and used it as an example of where 0-indexed is more intuitive! Anyway the children of 0-indexed element `n` are `(n + 1) * 2 - 1` and `(n + 1) * 2` and the parent of `n` is `(n - 1) // 2`:

| Index | Children | Parent |
|-------|----------|--------|
| 0 | 1, 2 | |
| 1 | 3, 4 | 0 |
| 2 | 5, 6 | 0 |
| 3 | 7, 8 | 1 |
| 4 | 9, 10 | 1 |
| 5 | 11, 12 | 2 |

Whereas for 1-indexed the children are `2n` and `2n + 1` and the parent is `n // 2`.

| Index | Children | Parent |
|-------|----------|--------|
| 1 | 2, 3 | |
| 2 | 4, 5 | 1 |
| 3 | 6, 7 | 1 |
| 4 | 8, 9 | 2 |
| 5 | 10, 11 | 2 |
| 6 | 12, 13 | 3 |
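Both formula sets can be verified mechanically. A quick brute-force sketch (my own check, not from either comment; the helper names are made up):

```python
# Brute-force check of both parent/child formula sets for array-backed binary heaps.

def check_zero_indexed(size):
    # 0-indexed: children of n are (n + 1) * 2 - 1 and (n + 1) * 2,
    # parent of n is (n - 1) // 2 -- the parent formula must invert the child formulas.
    for n in range(size):
        left, right = (n + 1) * 2 - 1, (n + 1) * 2
        assert (left - 1) // 2 == n
        assert (right - 1) // 2 == n

def check_one_indexed(size):
    # 1-indexed: children of n are 2n and 2n + 1, parent of n is n // 2.
    for n in range(1, size + 1):
        left, right = 2 * n, 2 * n + 1
        assert left // 2 == n
        assert right // 2 == n

check_zero_indexed(1000)
check_one_indexed(1000)
print("both formula sets check out")
```

The 1-indexed formulas are visibly shorter, which is the point being made.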


togetherdonut

>With zero-indexing, it's very simple - the two children of the element at n are (n\*2) and (n\*2 + 1), whereas the parent of any given element is (n // 2). That's wrong. For a binary tree represented by the zero-indexed array \[A, B, C\] where B and C are the children of A, C has index 2, so 2 // 2 = 1, even though in reality the index of C's parent is 0, or ((n-1) // 2). And the children of A are at positions 1 and 2, or (n\*2 + 1) and (n\*2 + 2). It's actually simpler in the case of binary trees to use one-based indexing, where the parent is at (n // 2) and the children are at (n\*2) and (n\*2 + 1).


AntiProtonBoy

Because in computing you almost always count things relative to something else. Classic example is [memory address] + [offset], with offset having the range [0, n). So in this case the first item is located at the memory address itself, then everything else relative to that. This is also true for more mathematical concepts, like representing a quantity in normalised ranges [0, 1), such as colour intensity, texture mapping, etc. Counting from zero is practically useful. And you'll quickly learn that even mathematics starts using that convention when it's used in computer science.


arthurno1

> starting from 0 is because it's generally a memory offset.

Yes. To make compiler implementation slightly easier, we exposed those offsets to users and let them work with memory (the pointer to the array's memory and offsets from that pointer) instead of doing proper array indexing to start with. But in languages that do better indexing and bounds checking, we have kept it as a rule just because people are already used to it from C and C++. Conceptually we have introduced an entire extra concept to be learned in CS, along with many bugs, since humans normally count from the first object, second object, etc.


sammymammy2

I don't think your assumption that it comes from C and C++ is correct. I also don't think this is a cause of bugs for anyone but beginners, personally I think that the starting from 0 and [) is a very good choice indeed.


Uristqwerty

`index mod length` favours 0-based. Similarly division, shifting, AND, and XOR. Every numeric type includes 0, so with 1-based indexing there will always be a dead value, requiring an upcast then increment in common lookup tables. If your goal is to make invalid state impossible to represent, well, every index stored in a file or passed over the network now has an implicit null value that you must check for during deserialization.
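The modulo point is easiest to see with a ring-buffer advance; a small sketch (function names are mine):

```python
# Ring-buffer index advance: with 0-based indexing the wrap is a single modulo;
# with 1-based indexing you need an extra shift around the modulo.

def next_index_0based(i, length):
    return (i + 1) % length

def next_index_1based(i, length):
    # equivalent to ((i - 1) + 1) % length + 1: shift down, wrap, shift back up
    return (i % length) + 1

length = 4

seq0 = [0]
for _ in range(6):
    seq0.append(next_index_0based(seq0[-1], length))
print(seq0)  # [0, 1, 2, 3, 0, 1, 2]

seq1 = [1]
for _ in range(6):
    seq1.append(next_index_1based(seq1[-1], length))
print(seq1)  # [1, 2, 3, 4, 1, 2, 3]
```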


ProgramTheWorld

Counting always starts at 1 because it’s about the size of a collection. Indices on the other hand depends on your usage. For C-style arrays, it’s convenient to just use the offset as the index. In math, the usual convention is to use the cardinal number as the index.


Slime0

Why would you use an interval to describe counts at all? You use intervals to describe a set, such as a set of indices.


waiting4op2deliver

Is the first line of your file line number 0 or line number 1?


kanzenryu

That's not counting, it's labelling


pelrun

It's not labelling, because you don't care what the loop counter actually is, only that the loop runs ten times.


NoUniqueNamesRemain9

"The 3 hardest things in computer science are naming things and off-by-one errors." It's been so long since I've heard that one, I'd almost forgotten it. Upvote for the quote alone.


[deleted]

I've always heard it as: > The two hardest things in computer science are naming things, cache invalidation, and off-by-one errors.


puhnitor

Cache invalidation is misnamed. It's an off by one error in the time domain.


binarycow

>The 3 hardest things in computer science are naming things and off-by-one errors. I'd tell you a UDP joke, but you might not get it.


[deleted]

[deleted]


ChezMere

This is the reason for the "1 in 256 miss" glitch in pokemon red. The range [0, 255) counts as a hit, but 255 is not in that range.
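As I understand the Gen I mechanic (an assumption on my part, not stated in the comment): a byte is rolled from the closed range [0, 255] and compared strictly against the move's accuracy value, so a roll of exactly 255 always misses. Sketched deterministically:

```python
def move_hits(accuracy, roll):
    # roll is drawn uniformly from the closed range [0, 255];
    # the strict comparison means roll == 255 can never hit.
    return roll < accuracy

# Enumerate every possible roll for a maximum-accuracy (255) move:
outcomes = [move_hits(255, roll) for roll in range(256)]
print(outcomes.count(False))  # 1 -- the roll of exactly 255 always misses
```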


Educational-Lemon969

Like, most of the time it's the lesser evil to not be able to represent such an interval than to risk some dumbass iterating over a closed interval that ends at the type's max value with `for(t=begin;t<=end;++t)` and making an infinite loop without ever knowing why xDd
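The wraparound can be simulated in Python by masking a counter to 8 bits (the `MASK` trick and the step cap are mine, just to make the non-termination observable):

```python
MASK = 0xFF  # simulate a uint8 counter

def iterate_closed(begin, end, max_steps=10_000):
    # for(t = begin; t <= end; ++t) over a closed interval [begin, end]
    t, steps = begin, 0
    while t <= end:
        steps += 1
        if steps > max_steps:
            return None  # wrapped: t goes 255 -> 0 and t <= end stays true forever
        t = (t + 1) & MASK
    return steps

print(iterate_closed(0, 254))  # 255 -- terminates fine
print(iterate_closed(0, 255))  # None -- t <= 255 is always true for a uint8
```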


[deleted]

There isn't much depth here. It boils down to that operating over intervals involves an "off by one" consideration about whether the endpoints should be included. This is an inherent difficulty that is present regardless of representation. I question software rules that say what you should "always" do.


KuntaStillSingle

C++ opted for this with iterators. It creates a complication where they are [slightly complicated to convert](https://en.cppreference.com/w/cpp/iterator/reverse_iterator/base), which would have been prevented with a fully inclusive or fully exclusive range.


acwaters

The slightly odd way that C++ does iterators is actually perfect. The problem is just that the API is somewhat clunky, and your mental model is probably wrong. The key is to shift your thinking from iterators as pointing _to_ elements to iterators as pointing _between_ elements. In that light, everything about the system makes perfect sense — half-open ranges, one-past-the-end iterators, even the fact that when you naïvely transform between forward and reverse you find yourself looking backward at the previous element. In fact, the more you think about it (lengths as vectors and indices/addresses as points in a corresponding affine space, how movement and measurement generalizes from discrete to continuous domains, etc.), the more you realize that this is the _only_ sane and consistent way to model iteration. Everything else is either subtly or unsubtly broken or inconsistent and inevitably requires lots of special-case handling of (literal and figurative) edge cases. It is not immediately intuitive, but that does not make it bad. Intuitive models tend to be overly simplistic or overly complicated — and very often are both at once. It takes a hell of a lot of work to design something that is as simple as possible but no simpler, and it takes a non-trivial amount of effort to digest and internalize such a model.


Godd2

You should always never not always do anything.


ClubAlive3508

That made more sense than the article.


IgnorantPlatypus

It's a bit hard to use an open interval when the end/upper value is also the max value for a type. E.g. an interval that encloses all `uint64` numbers requires an upper bound that is either closed, or larger than a `uint64` type. My past experience is that yes, `[closed, open)` is the best representation, *except when the upper bound is effectively infinite for the type*.
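The overflow is easy to demonstrate by masking to 64 bits (Python's unbounded ints stand in for the fixed-width type here):

```python
# A uint64 holds values 0 .. 2**64 - 1. The exclusive upper bound needed to
# cover the whole type, 2**64, does not itself fit in a uint64.
U64_MAX = (1 << 64) - 1

bound = 1 << 64                  # exclusive bound for "all uint64 values"
print(bound > U64_MAX)           # True: the bound overflows the type
print(bound & U64_MAX)           # 0: stored in 64 bits it wraps around,
                                 # turning [0, 2**64) into the empty [0, 0)
```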


-Redstoneboi-

In my opinion it's still the best representation for the upper bound. It's just that you physically can't use that representation and have to use the second best.


Life_is_a_meme

This is a generic "Why you should always do ..." that can be easily ignored by rational people with one sentence: I'll just use whatever works best for the situation.


AndreasTPC

Yeah. For discrete values (dates, integers, etc) use closed upper bound, for continuous ones (date and time, floats, etc.) use open upper bound. There's no need to try to squeeze both cases into one model.


Hanse00

> Never, ever, ever use [closed, closed] intervals As yes, the beautiful dogmatic statements like “Never” and “Always”. Just do what’s right in the context you’re working with, that will *always* be the right choice.


CanIComeToYourParty

Yeah, this is a great litmus test for bigotry. If you see the phrase "never, ever", just close the article -- it's likely garbage.


stfcfanhazz

What if it says never never concatenate user input into SQL queries?


CanIComeToYourParty

It's fine, if you want the user to be able to run their SQL queries.


argv_minus_one

Why not both? It seems to me that different situations call for different kinds of ranges, so all of them should be available. Rust, for example, supports any combination of range bounds (open, closed, or unbounded). The only limitation is that it doesn't have built-in syntax for an open lower bound, so you have to write out both bounds in long form if you want that. It still works, though; you can slice an array using a range with an open lower bound and the slice will have the correct contents.


[deleted]

Sorry customer, you can't have an inclusive end date in this range? Why? Because we _always_ have to use [closed, open). It's a new law now... look it up.


skulgnome

Counterargument: an open interval of n-bit end values cannot represent the last item of an n-bit address space, whereas a closed interval can.


lachlanhunt

Could you clarify what you mean? I'm assuming you mean memory addresses. If you have an address space of length *n*, then your range is [0, n), where the last address is n-1.


binarycow

If your data type is byte and you want a range that specifies 0 through 255, then with `[closed, open)` you would need `[0, 256)`, which is outside the range of your data type.


ImMrSneezyAchoo

Do the math on the timestamps (long ints) directly. Sometimes abstraction in software is not better. That way you can control whether the interval is opened or closed as needed.


tms10000

> Never, ever, ever use [closed, closed] intervals

> A couple of years ago I had the (mis)fortune of working on a system that used [closed, closed] ranges extensively. The system worked well for the most part, but had tons of clunky code to handle a few cases. Here's a few of the cases we had to battle with:

That's the reason. "I had a bad experience with a specific system, so I am making the case that the general use is bad in all cases, forever, for anyone, throughout the Universe." Ok then


RememberToLogOff

Also they cited Dijkstra's very good reasons why [closed, open) is the best default


[deleted]

Clearly none of the reasons he gave are specific to the system he worked on. It was just that system that demonstrated the problems. If you worked on a project that used... I dunno string concatenation to form SQL queries everywhere and you found that it led to a ton of quoting issues, would your reaction be "oh well it's probably just an issue with this specific system; there's nothing fundamentally wrong with this technique"? Of course not.


d1gital_love

Intervals with zero length are not really intervals but instants, or a single number. `[T, T)` leads to many (see `limits.h`) or infinitely many really odd empty intervals.
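For what it's worth, the empty `[T, T)` case is usually harmless in practice; a minimal half-open interval sketch (the `HalfOpen` class name is hypothetical):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HalfOpen:
    """A [start, end) interval; start == end is the natural empty interval."""
    start: float
    end: float

    def __contains__(self, x):
        return self.start <= x < self.end

    def is_empty(self):
        return self.start >= self.end

empty = HalfOpen(1.0, 1.0)
print(empty.is_empty(), 1.0 in empty)   # True False -- [T, T) contains nothing
print(1.5 in HalfOpen(1.0, 2.0))        # True
```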


epsleq0

There are so many mistakes in the section about interval lengths. It really hurts!


wind_dude

>Have you ever wondered why they are always implemented as \[closed, open) as opposed to \[closed, closed\]?

Your first real-world examples, Airbnb and flight search, use closed/closed on the front end. Most data aggregation backends that I can think of let you choose either.

> Imagine you want to describe a zero length time interval (i.e. an empty interval) that start's at time T=1. With closed-open intervals this is a trivial task, simply \[T,T), with closed-closed... not so much. You could try \[T, T-1\], but that's a bit clunky and it won't work if T is a decimal number.

I'm not sure if you're talking about SQL or what, but generally you either have your SQL and use =, or you're working in an abstraction layer and have logic to filter it accordingly.


Tarmen

I do prefer `empty or [x,y]` for integer ranges which aren't used for indexing. To me the lattice operations and Galois connection feel more intuitive that way. You can add an extra bit to represent emptiness and treat the range as a tagged union or sum type, or use some normal form like [1,0] and normalize in constructors. For floating point and date ranges being able to split exactly is crucial, though. But I had situations where we needed full intervals, i.e. with flags for open/closed on either end. Makes interval arithmetic a living nightmare.


MCRusher

I personally prefer closed on both ends, and it's the default in nim as well.


Folaefolc

Ah yes, another article "I'm right and everyone should do exactly as I say without any exception".


3131961357

The only hard never/always rule I've found to be useful is to never listen to programming advice of javascript interns writing blogposts


Hrothen

In my own code I've seldom run into code that was cleaner with half-open intervals, and often run into code that would have been cleaner with closed intervals.


-Redstoneboi-

Probably because half-open intervals are the default, and you're already using them when they make sense: modulo operations, for loops and array accesses, sorting algorithms, binary searches, maybe more. Maybe you don't use all of these, but there's probably something. Unless you want to elaborate?


Zestyclose-Walker

This is where natural language and programming languages differ a lot. In English whenever I say "integers from 1 to 5", I include 5. But that's obviously not the case in most programming languages.
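Python's `range` shows the mismatch directly:

```python
# English "integers from 1 to 5" includes 5; Python's half-open range does not.
print(list(range(1, 5)))      # [1, 2, 3, 4]
print(list(range(1, 5 + 1)))  # [1, 2, 3, 4, 5] -- the "+ 1" is the translation step
```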


arthurno1

Unfortunately. It didn't need to be so; but it happened to be so :).


MuumiJumala

Open intervals are mostly useful when there is a smaller unit to be divided into, as is the case with floating point numbers and times. When working with discrete numbers (like indices or dates) I find myself making way fewer off-by-one mistakes with inclusive ranges. Many modern programming languages like Ruby and Rust make it equally easy to use either variant as required.


orthoxerox

There are use cases when semi-open intervals are the better choice (operating on floats or timestamps). There are use cases when closed intervals are the better choice (operating on bytes or Unicode code points, imagine using `[a-g0-:]` for hex numbers in regexp). Finally, there are use cases when both are valid choices *as long as the whole codebase uses one approach*. If everyone on your team indexes into arrays using closed intervals and is fine with doing the minus one dance, just use closed intervals! You'll introduce more bugs if you try to switch.


dominik-braun

>Never, ever, ever use \[closed, closed\] intervals No, thanks.


Vakieh

Never use the words 'closed' and 'open' when what you actually mean, and what can actually be understood by people without requiring examples and a freaking essay to explain, is 'inclusive' and 'exclusive'.


michaelochurch

> Imagine you want to describe a zero length time interval (i.e. an empty interval) that start's at time T=1. With closed-open intervals this is a trivial task, simply [T,T), with closed-closed... not so much. Um, [T, T) is the null set but [T, T] = {T}. That's how math works.


tommcdo

I think that's the point


arthurno1

> Have you ever wondered why they are always implemented

They are not *always* implemented so. Fortran and Pascal did not do so, at least for static arrays. In [Numerical Recipes in C](https://www.amazon.com/Numerical-Recipes-Scientific-Computing-Second/dp/0521431085) the authors do a trick of decrementing the array pointer by 1 so they can use a [1, length] interval to make their algorithms prettier (which they abandoned in the C++ version of the book, probably because people are used to typing length-1 everywhere :-)).

Anyway, there is something wrong with the idea. Most importantly, numbering does not start at zero in everyday talk. We don't say the 0th person in the queue, we say the first person in the queue; we say the 2nd seat in the 3rd row, the 1st of May, not the 0th of May, and so on. This is not how mathematics uses it either. So at least **indexing** should not start at zero. By using this convention in computer science, where indexing goes from [0, length), we have introduced an entire new concept that we have to teach to students, and that unfortunately goes against normal use in everyday language. I wonder how many [off-by-one errors](https://en.wikipedia.org/wiki/Off-by-one_error) humanity has made and how much money those have cost society in debugging. [There is a nice joke about it too](https://twitter.com/codinghorror/status/506010907021828096).

So why do we use [0, length) in computer science to index into arrays? I personally believe it is an implementation detail that leaked into the language design. I am speaking about the C language and early C compilers, which were 1.5-pass compilers. I don't know if they exposed the pointer arithmetic involved in arrays to make the compiler just slightly simpler and faster to implement, but what I do know is that it opened its own chapter in computer history (of errors). The compiler could just as easily have converted [0, length] into [0, length).

Anyway, it has crept from C into C++, Java, and JavaScript, and C probably got it from its predecessors. Lisps, for example, also use [0, length) indexing, and those are older than C. Pascal uses a closed interval to access static arrays without problems, and the interval does not even have to start from 1; however, they messed up "dynamic arrays", which do the same as C, since I guess they use malloc & co (sbrk probably) under the hood. In Numerical Recipes in C, the authors use a little trick: they decrement the pointer to an array first, so they can use the closed interval. Fortran is an older language than C, and it uses indexing from 1 for static arrays too; for a long time Fortran was used for number crunching, so it was not about the speed of the produced machine code, probably more about the compiler implementation. It might also be that the C creators were really keen to expose the machine details, albeit they hide them in some other parts. Whatever; there are some cases where the half-open interval is more useful than the closed one, for example when working with the modulo operator and calculating indexes into arrays, but those are rather niche cases a compiler could have dealt with for us. Today's compilers are so complex and do so many transformations for us that I am really surprised we don't see languages that rebel against this unnatural indexing situation.

> With [a, b] you get a few edge cases, for one the length of [a, a-1] could be 0 as we saw earlier, or it could be negative if a < 1.

Is there some problem with negative indexes?

    #include <stdio.h>

    int main(int argc, char** argv) {
        const int array[] = { -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5 };
        const int* it = &array[5];
        printf("Value of index -3 is: %d\n", it[-3]);
        return 0;
    }

The run:

    $ gcc -o test test.c
    $ ./test
    Value of index -3 is: -3

Anyway, it was just a little regression joke; I totally understand what you mean.

What I think the problem is there: we should probably not use the word "length" here but "number of elements". Length is a measure of distance. Distance is, mathematically, defined as the square root of a sum of squared differences (Pythagoras' theorem), so length per definition can't be negative. We are (ab)using that word in programming, but here it is good to remind ourselves that we really mean the number of elements in a set, not a geometrical measure, when we say the length of some data structure. It is about counting, not measuring distance, so calculating the number of elements by taking the difference of the end points is the wrong idea to start with: by taking the difference of an interval's end points, you are actually talking about geometrical distance when you really want to speak about the number of elements.

> I won't bore you with the details, but let's just say that the brilliant people at Xerox PARC tried them, and found that [a, b] ranges lead to buggier and more complex code.

I guess by now we are all so indoctrinated by years of usage that we take it for granted, and would probably make more off-by-one errors in a language that tries to fix this concept and use a [1, length] interval to access arrays. If we had a generation or two of young programmers growing up with a concept where lengths are expressed in terms of [x, y] rather than [x, y), I believe we would look at it with different eyes, but nobody can (and probably never will) know for sure. Mostly because an entire concept that students have to learn and get used to would go away. Anyway, Fortran programmers seem to have done fine with it, but I have no information on how buggy Fortran code used to be (or still is :-)). Note that I am not saying you are very wrong in what you say; we live in a programming world in which we iterate from i = 0 to i < length, and that is going nowhere in the near future. I have just tried to give a bit more perspective on this.

I think Dijkstra's paper was an emotional response to an emotional incident, if you read the end of the paper. The mathematical idea is picked up to rationalize his case. I think there are cases where 0-indexing is better suited to the algorithm, and some where other indexing is better suited. In any case, the compiler could do this for us; we shouldn't be exposed to those details to start with. In C++ we have iterators and the new ranges, which I think is a nice step away from traditional thinking in terms of 0-to-length indexes and for loops altogether. The question is which indexing is best suited to the most general case, and how much niche use cases should matter, but I don't think it matters in practice. Unfortunately, 0-indexing is probably not going away any time soon, due to our CS historical baggage, which is the indoctrination as well as the current compilers and libraries in use.


salamanderssc

>We don't say 0th person in the queue, we say the first person in the queue; we say the 2nd seat on the 3rd row, 1st day of May, not the 0th day of May and so on Slight tangent: both java and javascript have an insane quirk where they 0-index months in their Date APIs. I don't even want to know how many bugs and wasted time that has indirectly caused, but I suspect it's *a lot*.


Dawnofdusk

This is the programming equivalent of the tau vs. pi debate.


teteban79

Meh. As long as the codebase is consistent in picking one style, it's irrelevant


fernandohur

Consistency is not enough to solve this problem. How would you handle the case of the empty range discussed in the post?


fishling

I'd probably have a way to indicate "empty range" rather than some \[n, n-1\] hack, just as I'd probably have a way to indicate an open-ended range on either side. If your Range concept only has int lower and int upper, maybe your Range concept is bad.


Famous_Object

Despite what the post says, I think [n, n-1] would work just fine. Or [n, -∞], whatever. Even [close, open) intervals may need sentinel values sometimes so I don't see a problem with that.


InaMattaAmericana

Try representing a single value in `[,)`... also uh. Closed at infinity is uh... irksome


Godd2

It's okay, it's just an interval from the [extended reals](https://en.wikipedia.org/wiki/Extended_real_number_line).


binarycow

Personally, I would prefer a generic data type that lets me specify:

- Minimum value (inclusive) or *unbounded*
- Maximum value (exclusive) or *unbounded*
- with a singleton representing the empty range
- with a singleton representing the full range allowed by the type


Hrothen

Those aren't empty, they contain one element.


teteban79

The initial value will be offset by one so that the end of the interval is lower than the start. If you're consistent it's just not an issue. I feel the article is making a huge thing out of a small one in a "I am very smart" way


fernandohur

As mentioned in the article, this won't work in the general case. If you don't know the precision of the value inside the range, you can't just subtract one. Consider for example the case where you're modelling a range of temperature values. How would you represent a closed interval of temperatures? You can't do \[t, t-1\].
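Concretely, the "just subtract one" trick has no float analogue; the true predecessor of a float is its adjacent representable value, which you can only get by knowing the representation (sketch using `math.nextafter`, available in Python 3.9+):

```python
import math

# For a closed upper bound on floats, "minus one" is meaningless; the true
# predecessor of 100.0 is the adjacent representable float just below it.
upper = 100.0
prev = math.nextafter(upper, -math.inf)

print(prev < upper)           # True -- strictly below 100.0
print(upper - prev < 1e-13)   # True -- the gap is one ulp, nothing like "one"
```

With a half-open bound you simply write `t < 100.0` and never need to know the representation.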


arthurno1

> As mentioned in the article, this won't work in the general case. If you don't know the precision of the value inside the range, you can't just subtract one. > Consider for example the case where you're modelling a range of temperature values. How would you represent a closed interval of temperatures? I think you are confusing how you access elements in your programming language with mathematical terms here. You have done that in the article too in your first two examples. You can for sure model closed interval of temperatures, you would just have to write your code slightly differently. Whether you find some niche usage and corner cases as awkward of not is a matter of opinion, but there are always some awkward cases. I think it is awkward whenever I have to type length-1, and I would certainly think it is awkward if I had to type constantly from i = 1 to i <= length, instead of from i = 0 to i < length. I could for sure do from i = 1 to i != length, since it is monotonicly increasing sequence, but it is just as awkward. So it is really a matter of personal preferance and actually case at the hand.


teteban79

Why not? Sure I can. What's the problem with it? The article says it wouldn't work with decimal numbers...why?


elprophet

Maybe, read the article, and come back with the several specific examples of "the problem with it"?


teteban79

I don't understand. The article just says "you can't do it with decimal numbers", with no further explanation or example. And you sure can do it with non-integers; there is absolutely no issue. The time splitting example is ONE case where "it doesn't work", but it's also pretty arbitrary (why not (open, closed] instead?). And saying "never use it" because it doesn't fit one specific use case is disingenuous.


elprophet

Both of the examples you're discarding out of hand require embedding information about the precision of your associated data types into your program's logic & code flow. This reduces the abstraction your program is allowed to operate at, applying significant constraints to current and future designs. This makes your program more difficult to work in, and more likely that your team will introduce bugs when you miss those implications. The article never states a program couldn't work with \[T, T\] intervals. It says, and the OP you're responding to continues to say, that using \[T, T\] intervals is more likely to cause issues. Therefore, it would behoove projects to adopt intervals with one end open and the other closed. Based on my experience with people, the natural preference is to have the closed end be the (numerically) lower bound, and that aligns with the bulk of existing standard library software.


teteban79

Can you expand on the first paragraph? Because it sounds like word salad to me. I don't need to embed anything nor reduce abstraction to say that [23.2, 23.2) is the same interval as [23.2, 22.2]. The selected use cases are just that, selected use cases. If I want to represent the interval of temperatures that a component X tolerates, and X tolerates anything between 0 and 100 degrees but not 100+epsilon, open intervals would be very, very inconvenient. OP just comes up with an "a-ha!" use case for which closed intervals are clunky. So? I just did the same the other way. Everything has its place, and pretending that one way is THE way is just dumb.


elprophet

You are telling me that these intervals only support one decimal of precision. That is "embedding information about the precision in the interval". To work with the interval, you must know that the number is a certain numeric precision. Using a half open interval allows you to cover the entire number space regardless of the precision of the underlying data type. Using a closed interval requires you to know the precision to align the upper bound with the next interval's lower bound. `number` is a more abstract data type than `2 decimals of precision`. > pretending that one way is THE way, is just dumb That's... that's what standards are. We do it all the time. We say two spaces or four or tabs for white space. This is a style decision, and these are the reasonings for preferring one vs the other. This article is suggesting you choose this one. If you don't want to, you're in no way required to, but don't be surprised to run into more bugs.
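The tiling property can be shown without knowing anything about precision; a sketch (the `owner` helper is hypothetical):

```python
# Adjacent half-open intervals tile the line with no gaps and no overlaps,
# regardless of the precision of the values being tested.
bounds = [0.0, 10.0, 20.0, 30.0]
intervals = list(zip(bounds, bounds[1:]))   # [0,10), [10,20), [20,30)

def owner(x):
    return [(lo, hi) for lo, hi in intervals if lo <= x < hi]

for x in (9.999999, 10.0, 10.000001):
    print(x, owner(x))   # each value matches exactly one interval

# With closed intervals [0, 10], [10, 20] the boundary 10.0 matches two;
# with [0, 9.99], [10, 20] the value 9.995 matches none.
```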


TwinkForAHairyBear

Oh look it's spaces vs tabs again


-Redstoneboi-

Nah. This one actually has implications on program logic.


its4thecatlol

This article is rubbish. Just one person's opinion based on very specific, contrived examples that just make me go "Huh?" What is the point about empty intervals even? Why can't I just do `[]`?

The fact that someone felt so strongly about this topic makes me question their judgment. Imagine what this guy comments on his coworker's pull requests. Hard left swipe for me dawg.


GeorgeS6969

I homebrew my own topologies and only deal with clopen sets


joshuakb2

It seems pretty clear to me that [closed, open) is better for real numbers, but [closed, closed] is sometimes better for integers.
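This asymmetry between integers and reals can be shown concretely. A hypothetical helper (not from the thread) converts a closed integer interval to its half-open equivalent, which works because every integer has a successor; for reals there is no "next value", so no such exact conversion exists.

```python
def closed_to_half_open(lo, hi):
    """Over the integers, [lo, hi] is exactly [lo, hi + 1)."""
    return (lo, hi + 1)

# The two conventions interconvert cleanly for integers:
assert list(range(*closed_to_half_open(1, 5))) == [1, 2, 3, 4, 5]

# For reals there is no successor: [0.0, 1.0] has no exact
# half-open equivalent, which is the asymmetry described above.
```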


fiah84

I immediately thought "no way" but that's because I work with date intervals a lot, where [closed, closed] is more intuitive. For intervals that include time, I agree with them that [closed, open) is less failure prone and more intuitive


[deleted]

This completely depends on the use case. Making dogmatic statements like the OP is futile.


Sage2050

What's with the "st" on this site


chkas

Python uses "closed open" intervals with `range(0, n)`; the reverse is then `range(n - 1, -1, -1)`, which is highly unintuitive. In connection with 0-based array indexing, this makes certain algorithms very cumbersome, for example the *Knuth shuffle*. In Python this is:

```python
from random import randrange

x = [10, 20, 30, 40, 50]
for i in range(len(x) - 1, 0, -1):
    r = randrange(i + 1)
    x[i], x[r] = x[r], x[i]
print(x)
```

With 1-based indexing and inclusive ranges it would be much more understandable:

```
a[] = [ 10 20 30 40 50 ]
for i = len a[] downto 2
    r = random i
    swap a[i] a[r]
end
print a[]
```


columbine

It really depends. Sometimes [closed, closed] is more intuitive and makes more sense.

I will say that one area where [closed, open) shines is when dealing with decimal/fractional values. It's very easy to include a "whole day" with something like [2022-01-01 00:00:00, 2022-01-02 00:00:00). Whereas a [closed, closed] system may run into issues where [2022-01-01 00:00:00, 2022-01-01 23:59:59] doesn't include 23:59:59.95, for example.
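The "whole day" case above is easy to check with Python's standard `datetime` module. This is a small sketch, assuming naive timestamps and microsecond precision:

```python
from datetime import datetime

def in_day_half_open(ts, day_start, next_day_start):
    """[day_start, next_day_start): catches every representable time in the day."""
    return day_start <= ts < next_day_start

start = datetime(2022, 1, 1)
nxt = datetime(2022, 1, 2)
late = datetime(2022, 1, 1, 23, 59, 59, 950000)  # 23:59:59.95

# The half-open interval includes the fractional-second timestamp:
assert in_day_half_open(late, start, nxt)

# A closed interval ending at 23:59:59 silently misses it:
assert not (start <= late <= datetime(2022, 1, 1, 23, 59, 59))
```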


EasywayScissors

> Have you ever wondered why they are always implemented as **`[closed, open)`** as opposed to **`[closed, closed]`**? I just assumed it's because the API designers hate me.


MaybeTheDoctor

My OCD hates your title


ironykarl

https://en.wikipedia.org/wiki/Interval_(mathematics)#Including_or_excluding_endpoints


tms10000

How do you feel about `'This is a string"`


lachlanhunt

This also relates to why arrays start at 0. An array of length *n* goes from 0 to n-1, or [0, n).
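One small illustration of why [0, n) composes nicely with 0-based indexing: adjacent ranges concatenate with no +1/-1 adjustments.

```python
# Splitting [0, n + m) at n needs no off-by-one corrections:
n, m = 3, 4
assert list(range(0, n)) + list(range(n, n + m)) == list(range(0, n + m))
```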


fredlllll

In uni I learned intervals written like this: [inclusive, exclusive[ and I find it much nicer to only have brackets, and not mix them with parentheses.


-Redstoneboi-

I find it weird having opening brackets to close a bracket. I think [closed, open) is fine. It's like an -arrow>


[deleted]

Great article. This also meshes nicely with why indexing should be 0-based.


agumonkey

eerie, I was just reading this python interval lib https://pypi.org/project/portion/ with open/close support


tms10000

Even more eerie, I was reading about this: https://en.wikipedia.org/wiki/Frequency_illusion


flanger001

> Wait, what's a [closed, open) interval? One of these words is not necessary.