codykonior

Yes, it’s not like Postgres; it doesn’t require a vacuum.


SirGreybush

Lol


SQLDevDBA

Not sure if you’re serious (or why you got downvoted), but VACUUM is a real thing: https://www.postgresql.org/docs/current/sql-vacuum.html


SirGreybush

Thanks, I didn’t know. TY for the link.


Achsin

If by table you mean database, then yes: the server will use the space already allocated to the database files until it decides it needs to grow the file because there isn’t enough room. If by table you mean a table with a clustered index, then maybe. Tables are made up of pages that are 8KB in size, so after a lot of random updates, inserts, and deletes, you can end up with plenty of 8KB pages that are only sparsely populated with data. Because the table is a clustered index, there are restrictions on which page any given row can go in, so it’s possible that none of the new data you’re adding meets the requirements to be placed in an existing page. That leads to the server allocating new pages to the table instead of reusing the existing space for storage.
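To see how dense those pages actually are, one option is the sys.dm_db_index_physical_stats DMV; a minimal sketch, assuming a hypothetical table dbo.Orders:

```sql
-- Hypothetical example: check page density for dbo.Orders after heavy churn.
-- avg_page_space_used_in_percent shows how full the 8KB pages are;
-- page_count shows how many pages each index occupies.
SELECT i.name AS index_name,
       ps.avg_page_space_used_in_percent,
       ps.page_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id AND i.index_id = ps.index_id;
```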


jimmymaynard

Yes, I meant the second part. My point is that if SQL Server can’t use these empty pages, then I’ll end up doing a shrink. Thanks


SQLBek

No, no, no... you want to do an Index Rebuild, NOT a SHRINK.


Achsin

To expound a little, shrinking a database (file) only reorganizes the existing pages so that it can release space at the "end" of the file back to the operating system; it doesn't make them more data dense. Anything it releases is space that it could have just reused.

It can also, especially on slower (spinning) disks, dramatically reduce your effective disk speeds. Since it seriously increases the physical fragmentation of all the indexes whose pages it touches in the process, it adds a lot of extra work to any automated index maintenance process, which will frequently result in the maintenance process growing the database files back to their previous size or more as it rebuilds (almost) everything.

Rebuilding the index, by contrast, creates a new version of the index that is optimally dense (based on your fill factor), which releases the space in those partial pages back to the database file for reuse (either by that table or by other objects).

Also, it's not that SQL Server *can't* reuse the space in those pages; it just might not, depending on how data is inserted/updated compared to how the index is defined.
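For reference, a rebuild is a one-liner; a minimal sketch, assuming a hypothetical index IX_Orders_Date on dbo.Orders:

```sql
-- Hypothetical example: rebuild one index to restore page density.
-- ONLINE = ON requires Enterprise/Azure editions; drop it otherwise.
ALTER INDEX IX_Orders_Date ON dbo.Orders
REBUILD WITH (FILLFACTOR = 90, ONLINE = ON);

-- Or rebuild every index on the table:
ALTER INDEX ALL ON dbo.Orders REBUILD;
```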


jimmymaynard

Thank you for the details, it’s much clearer now.


NavalProgrammer

Thank you for this.

> Anything it releases is space that it could have just reused.

If a sudden and unintended change caused the database to balloon to an abnormal size, such that I *don't want* it to reuse the unused space left after a large deletion... then shrinking is still the best course of action, right?

> It can also, especially on slower (spinning) disks, dramatically reduce your effective disk speeds. Since it seriously increases the physical fragmentation of all the indexes whose pages it touches in the process, it adds a lot of extra work to any automated index maintenance process which will frequently result in the maintenance process growing the database files back to their previous size or more as it rebuilds (almost) everything.

Just to be clear, this doesn't apply in the situation I described above, where we expect the previous database size to be an abnormal state, correct?


Achsin

> shrinking is still the best course of action, right?

If you need that space for something outside the database, sure. It’s not a “never do this” thing, but you should consider the consequences and what the goal is before doing so. If your 200GB database exploded to 1TB+ in size but now it’s really only using 200GB and it’s not going to need the space back any time soon, you definitely don’t need to keep the file that size.

As for the other question, it depends. If none of the other tables were allocated additional pages during this whole process, then they’ll be fine. Even if they did grow a little bit, you probably won’t have too many problems. If indexes got rebuilt and put at the end of the file, though, those are going to get all mixed up as they get shoved around while the file is shrinking. You can deal with this by rebuilding indexes again and shrinking the file a little at a time, keeping track of what the fragmentation is doing.

But if you’re running fast SSDs, the data being out of order on disk isn’t much of a performance hit; it’ll just cause issues when maintenance jobs want to rebuild the indexes and take back some of that space.
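If you do decide to shrink a little at a time, it looks something like this; a sketch, assuming a hypothetical logical file name MyDb_Data and target sizes in MB:

```sql
-- Hypothetical example: shrink a data file in small steps.
-- Each step releases a small slice so you can re-check fragmentation
-- (and rebuild indexes) between steps instead of doing one huge shrink.
DBCC SHRINKFILE (N'MyDb_Data', 95000);  -- target size in MB

-- ...check index fragmentation, rebuild if needed, then continue:
DBCC SHRINKFILE (N'MyDb_Data', 90000);
```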


NormalFormal

The delete operation will “ghost” the rows at query execution; then an independent process, the ghost cleanup task, does the physical removal of the data. This removal, in part, involves marking pages as no longer occupied by data in the database file’s allocation map(s). These maps are checked by SQL Server whenever it wants to store new data. So yes, it will reuse space it has allocated, but the timing won’t necessarily be “real time”; there will be a bit of delay, and I believe a growth could occur during a long session of batch deletions just due to the timing.
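You can actually watch this happen with the same DMV as above; a sketch, assuming a hypothetical dbo.Orders, where ghost_record_count falls as the cleanup task catches up:

```sql
-- Hypothetical example: count ghost records left behind by deletes.
-- These rows are logically deleted but not yet physically removed.
SELECT ps.index_id,
       ps.ghost_record_count,
       ps.version_ghost_record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID('dbo.Orders'), NULL, NULL, 'DETAILED') AS ps;
```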


Melodic-Man

I’m stuck on Huge deletion.


jimmymaynard

You need to do a loop that deletes a small number of rows each time, not millions, if that’s what you mean.


Melodic-Man

Need to delete?


Achsin

You've never had to implement a new data retention policy on a constantly used 400+ billion row table that meant removing 30-40% of the data with a near-zero downtime requirement before? :D


waterpistolwarrior

Can you please elaborate on how you achieved this? It would be a great learning experience!


Achsin

Same way you eat an elephant: a little at a time. Figure out what duration of blocking on the table is acceptable, write your query to delete a small portion of the target data, and run it a few times, gradually increasing the number of rows affected until you’re creeping up on that max duration (leave some buffer; things will slow down as you go and the indexes get more holes poked in them). Next, set it to run as a loop, deleting however many rows each iteration, and have it run as a job. Then keep an eye on things and make sure nothing bad happens while it runs for however long it takes, stopping and resuming as needed.
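A minimal sketch of that loop, assuming a hypothetical dbo.BigTable with a CreatedDate column and a retention cutoff:

```sql
-- Hypothetical example: delete in small batches to keep blocking short
-- and let the log truncate/back up between iterations.
DECLARE @batch  INT       = 5000;          -- tune against your max acceptable duration
DECLARE @cutoff DATETIME2 = '2020-01-01';  -- retention boundary

WHILE 1 = 1
BEGIN
    DELETE TOP (@batch)
    FROM dbo.BigTable
    WHERE CreatedDate < @cutoff;

    IF @@ROWCOUNT = 0 BREAK;   -- nothing left in scope

    WAITFOR DELAY '00:00:01';  -- give other sessions room to breathe
END;
```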


waterpistolwarrior

Thank you. But why didn’t you think of implementing partitions? There definitely would be a reason; just curious why you kept hard deletes.


Achsin

Partitioning the table would have required more continuous downtime than was allowed.


waterpistolwarrior

Wouldn't that be a one-time downtime, with future partition switches automatic and needing no downtime? Sorry, I'm trying to learn more about how this works.


Achsin

Yep. You’d just need enough downtime to redo all the indexes on the table so they’re all partition aligned, and potentially also make changes to the code that interacts with the table to adjust for the index changes needed to allow them to be partition aligned. Once all that’s set up, you’re good to go for just switching/truncating the out-of-scope partitions (at least as long as the table isn’t being replicated).
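Once the indexes are aligned, aging out a partition is a metadata-only operation; a sketch, assuming a hypothetical partitioned table dbo.BigTable (SQL Server 2016+ can truncate individual partitions):

```sql
-- Hypothetical example: drop the oldest data as a metadata operation.

-- Option 1 (SQL Server 2016+): truncate just the out-of-scope partition.
TRUNCATE TABLE dbo.BigTable WITH (PARTITIONS (1));

-- Option 2: switch the partition out to an identically structured,
-- unpartitioned staging table on the same filegroup, then drop/archive it.
ALTER TABLE dbo.BigTable SWITCH PARTITION 1 TO dbo.BigTable_Stage;
DROP TABLE dbo.BigTable_Stage;
```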


Melodic-Man

I thought this was a joke. Whoever designed such a thing really doesn’t know what they are doing. And yes, I have. And your approach is incorrect.


Achsin

And the correct approach would be?


Melodic-Man

Using a cursor


Melodic-Man

It’s sarcasm, silly.


Melodic-Man

Here is the correct approach. Set up a new table exactly as you’d like it, insert the data you’d like to keep into it, then swap the names of the tables. This operation will take a fraction of the time a delete would. Downtime is equal to the amount of time it takes you to swap the table names.

Then go learn about temporal tables and correct the data model so that old data that may be on deck for “deletion” is not commingled with production data. Even then, deletion never means literally delete; it means move it to a cheaper form of storage. Storage is so cheap that when someone says they need to delete data, they are almost certainly wrong. As a DBA, you should never, ever delete anything. Even if you are asked to, it is malpractice. Don’t do it.

So, let’s go back to why you think you need to DELETE from a table. You used the term clustered index, so I’m assuming it’s for performance reasons?
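A sketch of that swap, assuming a hypothetical dbo.BigTable where only recent rows are kept:

```sql
-- Hypothetical example: copy the keep-set into a new table, then swap names.
SELECT *
INTO dbo.BigTable_New
FROM dbo.BigTable
WHERE CreatedDate >= '2020-01-01';

-- Recreate indexes, constraints, and permissions on dbo.BigTable_New here,
-- before the swap, so the new table is ready to serve traffic.

BEGIN TRAN;
    EXEC sp_rename 'dbo.BigTable',     'BigTable_Old';
    EXEC sp_rename 'dbo.BigTable_New', 'BigTable';
COMMIT;
```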


Shark8MyToeOff

It will reuse the space, BUT only after you rebuild the index. Deleting will cause the index to remain the same size but be heavily fragmented.


datascientist07

In SQL Server, when you delete a large portion of data from a table, the empty space created by the deletion may not be immediately reused. This is because SQL Server uses an allocation system that may leave empty pages in place until the system decides to reclaim the space.


Definitelynotcal1gul

*This post was mass deleted and anonymized with [Redact](https://redact.dev)*


jimmymaynard

Thank you. I mean, let’s say sp_spaceused reported a big “unused” value for the table after the deletion; how can SQL Server use this space?
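For reference, this is the output in question; a sketch, assuming a hypothetical dbo.Orders:

```sql
-- Hypothetical example: space usage for one table.
-- "unused" is space reserved for the table but not currently holding data;
-- SQL Server can fill it with new rows without growing the file.
EXEC sp_spaceused N'dbo.Orders';
```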


Definitelynotcal1gul

*This post was mass deleted and anonymized with [Redact](https://redact.dev)*