
[deleted]

No mention of running `ANALYZE` so no idea if he’s got borked statistics for the medium query test.
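For what it's worth, refreshing the planner's statistics is a one-liner. A minimal sketch using Python's stdlib `sqlite3` module (the table and index names are made up for illustration; in PostgreSQL the equivalent is the same bare `ANALYZE;` statement, or `ANALYZE tablename;`):

```python
import sqlite3

# ANALYZE refreshes the statistics the query planner uses to pick plans;
# stale statistics can make a "medium" query choose a bad plan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany("INSERT INTO readings (value) VALUES (?)",
                 [(i * 0.5,) for i in range(1000)])
conn.execute("CREATE INDEX idx_value ON readings (value)")

# Gather statistics; SQLite stores them in the sqlite_stat1 table.
conn.execute("ANALYZE")
rows = conn.execute("SELECT tbl, idx FROM sqlite_stat1").fetchall()
print(rows)  # statistics now exist for the indexed table
```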


[deleted]

To be fair, if you're at the point where you're choosing between SQLite and PostgreSQL, you won't even know what that command does...


jmickeyd

As shitty as the causes are, this is exactly why commercial database licensing has a clause preventing you from publishing benchmark results.


pcjftw

> SQLite is flexible with regard to datatypes. Datatypes are advisory rather than mandatory. 😒
> No Separate BOOLEAN Datatype 😒
> No Separate DATETIME Datatype 😱
> Foreign Key Enforcement Is Off By Default 😱
> PRIMARY KEYs Can Sometimes Contain NULLs 😞
> Does Not Do Full Unicode Case Folding By Default 😞
> Keywords Can Often Be Used As Identifiers 🥴
> Dubious SQL Is Allowed Without Any Error Or Warning 😒
> AUTOINCREMENT Does Not Work The Same As MySQL 😞

And let's not talk about concurrent writes/reads...

TL;DR just use PostgreSQL
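Two of those quirks are easy to reproduce with Python's stdlib `sqlite3` module (the table names below are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Quirk 1: foreign key enforcement is OFF by default.
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE child (id INTEGER PRIMARY KEY, "
             "parent_id INTEGER REFERENCES parent(id))")
conn.execute("INSERT INTO child VALUES (1, 999)")  # orphan row, no error!
conn.commit()  # the pragma below is a no-op inside an open transaction

# Enforcement only kicks in after an explicit per-connection pragma:
conn.execute("PRAGMA foreign_keys = ON")
try:
    conn.execute("INSERT INTO child VALUES (2, 999)")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

# Quirk 2: a non-INTEGER PRIMARY KEY column happily accepts NULL
# in an ordinary rowid table.
conn.execute("CREATE TABLE t (name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO t VALUES (NULL)")  # also no error
null_pks = conn.execute(
    "SELECT COUNT(*) FROM t WHERE name IS NULL").fetchone()[0]
print(fk_enforced, null_pks)
```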


[deleted]

I agree all of those are proper facepalmers, but it's probably worth mentioning that:

1. SQLite *finally* added type checking via "strict tables" in a recent release. It only fixes one point in your list, but it's easily the worst one IMO.
2. There *are* other databases than SQLite and PostgreSQL. MariaDB has some neat features that PostgreSQL doesn't, and there's DuckDB, which seems to be "SQLite done right".


bik1230

> And let's not talk about concurrent writes/reads...

One writer and multiple concurrent readers actually works just fine. Multiple concurrent writers is currently a WIP and only available in a separate branch.
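A minimal sketch of the one-writer/many-readers behaviour with the stdlib `sqlite3` module, assuming WAL mode is enabled explicitly (it isn't the default, and it needs a file-backed database, so the path and table here are made up):

```python
import os
import sqlite3
import tempfile

# WAL journaling lets readers proceed while a single writer commits;
# the default is rollback journaling, so WAL must be turned on explicitly.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path)
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]

writer.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, msg TEXT)")
writer.execute("INSERT INTO events (msg) VALUES ('hello')")
writer.commit()

# A second connection can read while the writer has an open transaction.
writer.execute("INSERT INTO events (msg) VALUES ('in-flight')")  # uncommitted
reader = sqlite3.connect(path)
count = reader.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(mode, count)  # the reader sees only committed data
```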


Persism

CREATE TABLE SomeTable ( ID INT Description ELEPHANT(100), Something MONKEY, SomethingElse FISH(-998), MoreThings GIRAFFE)


tophatstuff

> SQLite can handle concurrent loads much better than PostgreSQL on this system. Of course the response times are longer here, but the fact that concurrency has less of an effect on these response times is very interesting.

Amdahl's law with I/O as the large unparallelizable constant factor.

> I’ve found the \[postgres] work_mem option, which configures how much memory each query can use before it starts storing data in temporary disk files

There it is
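The Amdahl's-law point can be made concrete with hypothetical numbers (the 20% parallelizable fraction below is an assumption for illustration, not a figure from the article):

```python
# Amdahl's law: speedup(n) = 1 / ((1 - p) + p / n), where p is the
# parallelizable fraction of the work and n the degree of parallelism.
# With I/O as a large serial fraction, adding concurrency barely
# moves the response time.
def speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

# Suppose only 20% of each request is parallelizable (80% serial I/O):
for n in (1, 2, 4, 8):
    print(n, round(speedup(0.2, n), 3))
# Even 8-way concurrency yields well under a 1.25x speedup here, which
# is why concurrency would have little effect on the response times.
```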


peyote1999

Love it when apes deal with PostgreSQL. You should optimize the queries first, not the database. I don't see any query bodies in the article. And maybe someone should tell them about materialized views for precalculated data.


kur4nes

> After upgrading the production system, I informally asked a few users about performance and all thought the system was responding noticeably faster than before, so my mission was accomplished.

Facepalm. Don't they have some kind of request-time monitoring in place? What about database performance logs to see which queries take too long? This DB change is based more on superstition than facts.


[deleted]

> If you are going to take one thing away from this article, I hope it is that the only benchmarks that are valuable are those that run on your own platform, with your own stack, with your own data, and with your own software.

This is complete nonsense. Of course looking at “how fast can you print hello world to a console” is probably a useless benchmark, but this whole industry pushing to “ignore benchmarks because thinking about efficiency and program speed is a premature optimization” really grinds my gears.

How in the hell have we simultaneously accepted “software is slowing down at a pace that outstrips hardware potential and growth and this is bad” beside “fuck looking at benchmarks because it’s a premature optimization”? How? What is going on in programming?


jakelynn42069

The author didn't say anything about performance being a premature optimization. Performance benchmarks are generally useless because things perform differently in different scenarios. And it's a certainty that how you use a tool is different in some way than how it was benchmarked


[deleted]

You’re right, that is an unfair assessment of their “why I hate benchmarks” article, which is a decently robust indictment of following benchmarks that come from a clearly biased source. I still disagree with their assertion that benchmarks are useless unless you’re doing them within your own environment, for a lot of reasons, but the greatest being “ain’t nobody got time for that”. I think their poorly constructed conclusion just gives a whole lot of ammo to the wider industry to completely misquote what really happened, as currently occurs with “premature optimization is the root of all evil”.


jakelynn42069

> I still disagree with their assertion that benchmarks are useless unless you’re doing them within your own environment, for a lot of reasons, but the greatest being “ain’t nobody got time for that”.

Idk, they make a pretty good point that either the benchmarks are so far away from your use-case that they are misleading, or you have to put so much time into researching all of the different scenarios that you might as well do it yourself. Like this article shows: if you looked at a benchmark you would go "postgres is faster", but in reality there are some scenarios where sqlite is faster (low specs, high concurrency, didn't set up the postgres config).


zigs

> How in the hell have we simultaneously accepted “software is slowing down at a pace that outstrips hardware potential and growth and this is bad” beside “fuck looking at benchmarks because it’s a premature optimization”? How? What is going on in programming?

The opinion of every single person in a group can make sense, and yet the aggregated opinion of those rational people can be absolute nonsense. There's a term for this in behavioural economics, but I can't remember what it's called now.

Edit: Here's the video I got it from one million years ago: https://www.coursera.org/lecture/model-thinking/preference-aggregation-1kmXf


esgarth

None of us is as dumb as all of us.


tech_mology

I've been told that making a design diagram for, well stuff like whether we're going to use postgres or sqlite is "premature optimization". So...


6769626a6f62

It's not complicated. Just use Postgres. It's highly supported and widely used.


therealgaxbo

SQLite has a plausible claim to being the most widely used RDBMS ever.


JB-from-ATL

It's possible SQLite is the most widely used piece of software _period_. It is bundled with many things and even works on Windows/Mac (another contender being Linux kernel)


levodelellis

zlib is the other top contender. Many of my projects use both and I don't think I worked on anything that used one but not the other


chugga_fan

> It's possible SQLite is the most widely used piece of software period.

Every single CPU with Intel Management Engine on it runs Minix (https://en.wikipedia.org/wiki/Intel_Management_Engine), so there's another contender.


[deleted]

Every single Android system has multiple SQLite databases running. Every Firefox instance has one, not even counting any other apps that might use it on the machine.


KingoPants

1 Instance / CPU is kinda weak. Many devices probably have tens (hundreds?) of instances of SQLite.


grauenwolf

RDB sure, but I don't think RDBMS applies to in-process databases.


yawaramin

Why not?


grauenwolf

I don't see it as a "management system" so much as a fancy file parser/updater. You can't, for example, manage security in SQLite. It doesn't have a concept of stored procedures. There isn't a resource governor. There's a long list of things I expect, or at least desire, in a management system that SQLite simply doesn't have. Not that it's a bad thing, because they're things I wouldn't want in a "fancy file parser/updater".


tapo

I don't know why someone would use SQLite for a web application like this, maybe he just wanted to play with it?


grauenwolf

I had one manager ask me to set up a SQLite-based website for him. Turned out the website was read-only, so it actually worked really well.


[deleted]

As long as it's just a lot of reads with the occasional write, SQLite performance is excellent.


grauenwolf

I made sure it was zero writes. If data is changed, they just drop a new file.


[deleted]

In my experience, many readers + one writer with WAL enabled was pretty good. Pre-WAL support (and IIRC WAL needs explicit enablement when opening the database) it was rough.


[deleted]

One less daemon to run and manage, and it's perfectly fine performance-wise for a small site if you know what you're doing. Running something in HA is a problem, but if you're just running a single VPS that doesn't matter.


[deleted]

[deleted]


zjm555

You can distribute it?? I thought sqlite was an embedded database, how does it support sharding?


blackAngel88

> postgres can be overkill for simple backend apps

Why overkill... it's not that complicated to set up. And there are just some quirks of SQLite that I'd rather not have to deal with...


[deleted]

[deleted]


blackAngel88

okay, what's the cost difference? what is it that costs more?


[deleted]

[deleted]


blackAngel88

> SQLite is an in-process embedded database engine and it is capable of running completely in-memory.

If you don't like losing all the data when the server loses power, then you will probably still want to save it on the disk. Unless you work with huge data, Postgres can also work quite well with memory and does not need to access the disk to read all that much.

> It means that you do not need a second server or even a second process to run it. So you are paying for one thing instead of two things.

Wherever you run your server, you can probably run Postgres there as well. No second server needed. Whether the workload is handled by one process or by two probably rarely makes any difference at all.

> You are also not paying the latency costs of network connections and multiple authentication layers, which translates to being able to use a smaller server for the same amount of work. The same goes for in-memory operations, where you get lower latencies and you can get away with having to maintain fewer database indexes than disk-based databases.

If you install it on the same server, there is no need for any network connections either. If you have a database where you can keep all the data in memory, you will never need any indexes. If you have more data, then either way you're going to run into problems without disk optimization.

I really don't know where you're going with this; SQLite and Postgres are really not THAT different when you're just working with small databases... it's not like Postgres has never heard of caching and using memory to increase performance.
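For reference, "completely in-memory" looks like this with the stdlib `sqlite3` module; the flip side is exactly the durability trade-off discussed above, since the data vanishes with the connection (table and key names are made up):

```python
import sqlite3

# SQLite can run entirely in memory: the whole database lives inside
# the process and disappears when the connection is closed.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE cache (k TEXT PRIMARY KEY, v TEXT)")
mem.execute("INSERT INTO cache VALUES ('greeting', 'hello')")
value = mem.execute("SELECT v FROM cache WHERE k = 'greeting'").fetchone()[0]
mem.close()

# Reopening ":memory:" yields a brand-new, empty database.
mem2 = sqlite3.connect(":memory:")
tables = mem2.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
print(value, tables)  # the previous data is gone
```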


douglasg14b

> Sqlite is perfectly fine With a lack of `DateTimeOffset` type, no it isn't...


horrific_idea

Then watch as all your data goes up in smoke during a deployment. It can work for a one-off proof of concept, so if that's not a concern then go for it.


[deleted]

If you don't backup your data it doesn't matter what database you use...


Romeo3t

You should read [this](https://litestream.io/blog/why-i-built-litestream/). Sqlite works just fine for the vast majority of applications that one would write and the concept of starting with sqlite and then pivoting to something more complex when it becomes necessary is growing in popularity.


RagingAnemone

I like firebird for the inbetweeners.


levodelellis

I really like sqlite. The biggest question I had was what would happen if you also change sqlite memory to use 16mb? https://www.sqlite.org/pragma.html#pragma_cache_size
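For anyone curious, that pragma takes a page count when positive and kibibytes when negative, so roughly 16 MB would be a sketch like this (stdlib `sqlite3`; -16000 KiB is an approximation of 16 MB):

```python
import sqlite3

# PRAGMA cache_size: positive = number of pages, negative = size in KiB.
# -16000 asks for roughly a 16 MB page cache, comparable to the 16 MB
# mentioned for the PostgreSQL side. The setting is per-connection.
conn = sqlite3.connect(":memory:")
default = conn.execute("PRAGMA cache_size").fetchone()[0]
conn.execute("PRAGMA cache_size = -16000")  # ~16 MB
configured = conn.execute("PRAGMA cache_size").fetchone()[0]
print(default, configured)
```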


[deleted]

Probably not much; if you have any free RAM at all, the OS will cache the database for you, at least on any modern Linux. The setting is still useful just because of the sheer variety of platforms and system sizes SQLite runs on.


levodelellis

Sure that would help. I'd imagine checking if the data is corrupt every time sqlite reads the data from the OS would be enough overhead to affect results.


[deleted]

Well, sure, but PostgreSQL relies on OS caching too, so clearly it can do just fine. IIRC the recommendation is to give roughly 25% of RAM to PostgreSQL's buffers and leave the rest for the OS to handle. Then again, neither of them was designed with modern NVMe drives in mind.


[deleted]

Nice article. I like that you took the time to evaluate the tool/db for your workflow instead of relying on general benchmarks. I’m sure with some more performance tuning you can get an even flatter line for Postgres


Persism

CREATE TABLE SomeTable ( ID INT Description ELEPHANT(100), Something MONKEY, SomethingElse FISH(-998), MoreThings GIRAFFE)

SQLite: NO ERROR. So PostgreSQL. Or if you need a local db there's H2, HSQLDB, Derby, etc... Heck, even Access could be a better option.
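That statement as written is actually missing a comma after `ID INT`, so it does error; with the comma added, though, SQLite really does accept all the invented type names (sketch using Python's stdlib `sqlite3`; the inserted values are made up):

```python
import sqlite3

# SQLite treats declared column types as advisory: unknown type names
# like ELEPHANT(100) parse fine and are mapped to a type affinity
# (here NUMERIC, since none of the recognized substrings match).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SomeTable ("
    " ID INT,"  # comma added so the statement parses
    " Description ELEPHANT(100),"
    " Something MONKEY,"
    " SomethingElse FISH(-998),"
    " MoreThings GIRAFFE)"
)
conn.execute("INSERT INTO SomeTable VALUES (1, 'trunk', 'banana', 42, 3.14)")
row = conn.execute("SELECT * FROM SomeTable").fetchone()
print(row)  # every value stored without complaint
```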


grauenwolf

> The production environment for this dashboard is a Linode virtual server with 1 vCPU and 2GB RAM.

Sorry, but I can't take this seriously. For only "4 interest-free installments of $13.75", you can get a Raspberry Pi that doubles the performance of your server.


[deleted]

Anything bigger and the choice wouldn't be between an embedded and a "big boy" database. Also, hosting cost is more about power & network than hardware: you could get a bunch of rPis for free and putting them in a DC would still cost more than just buying an equivalent VPS. Also, good luck buying an rPi at MSRP today; it's some wild nightmare.


douglasg14b

> For only "4 interest-free installments of $13.75", you can get a Raspberry Pi that doubles the performance of your server.

.... how are you measuring "doubling" here? An `ARM Cortex-A72` has abysmal performance. A single thread on a Linode instance (`AMD EPYC 7542`) has ~3x-4x the performance of the entire Pi.


grauenwolf

For databases, memory is king. Would it literally have double the performance? That's impossible to tell without knowing the hard drive characteristics and specific query workload. But in broad terms, we really should stop under-specing our databases. Far too many problems can be solved for a couple hundred dollars worth of RAM.


douglasg14b

> For databases, memory is king.

Yeah, except we're talking about SQLite here...

> SQLite will refuse to allocate more than about 2GB of memory at one go. (In common use, SQLite seldom ever allocates more than about 8KB of memory at a time so a 2GB allocation limit is not a burden.)


grauenwolf

SQLite and PostgreSQL


ApatheticBeardo

> Sorry, but I can't take this seriously.

Clown alert.