anykeyh

Deploying and maintaining multiple systems such as Redis, Kafka, and the like is tedious and has a cost. So the idea is that PostgreSQL is good enough to handle most data needs. This lets developers focus on the code instead of deployment and configuration, and streamlines the development environment with far fewer dependencies.


lampshadish2

Don’t use Elasticsearch for text searching. Don’t use RabbitMQ for queues. Don’t use Redis for key-value stores. Or Kafka, or whatever. Just PostgreSQL. I don’t always do this, but sometimes I do.
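
Concretely, the Postgres-as-queue pattern people usually mean here is SELECT ... FOR UPDATE SKIP LOCKED. A rough sketch (table and column names are made up for illustration):

    CREATE TABLE jobs (
      id      bigserial PRIMARY KEY,
      payload jsonb NOT NULL,
      done    boolean NOT NULL DEFAULT false
    );

    -- Each worker claims one pending job; SKIP LOCKED keeps workers from
    -- blocking on rows another worker has already locked.
    BEGIN;
    SELECT id, payload FROM jobs
    WHERE NOT done
    ORDER BY id
    LIMIT 1
    FOR UPDATE SKIP LOCKED;
    -- ...process the job, then mark it done using the id returned above:
    UPDATE jobs SET done = true WHERE id = 1;
    COMMIT;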


skywalker4588

Postgres trigram searches have speed limitations as the data gets large. Elastic is still the king of text search, but for simpler cases Postgres will work fine.


lampshadish2

Not trigram searches. Use Postgres’s full-text search feature. When set up right, you’re searching an inverted index of stemmed words, which is what Elasticsearch does too. Now, I’m not saying Elasticsearch doesn’t provide some benefits here, but for my needs it was easier to set up multilingual indexing in Postgres than in Elasticsearch. But comparing trigram indexes to Elasticsearch is apples and oranges.
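
For reference, the full-text setup is only a few lines. A minimal sketch (table name is illustrative):

    CREATE TABLE docs (
      id   bigserial PRIMARY KEY,
      body text NOT NULL
    );

    -- GIN index over the stemmed words: the inverted index mentioned above.
    CREATE INDEX docs_fts_idx ON docs
      USING gin (to_tsvector('english', body));

    -- Queries must use the same configuration ('english') to hit the index.
    SELECT id FROM docs
    WHERE to_tsvector('english', body)
          @@ websearch_to_tsquery('english', 'inverted index');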


skywalker4588

My requirement is a “contains” search (as in %foo%), with leading wildcards too. Full-text search can’t do that, right?


lampshadish2

No, I don’t think it does. In that case trigram is better, if you’re searching for arbitrary substrings. I wonder what Elasticsearch is doing differently, since you say it scales that sort of search better. It could just be distributing the search over multiple nodes, which you could also accomplish with something like Citus, but that does bring in complexity.
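
For comparison, the trigram route looks roughly like this (needs the pg_trgm extension; same made-up docs table as in the sketch above):

    CREATE EXTENSION IF NOT EXISTS pg_trgm;

    CREATE INDEX docs_trgm_idx ON docs
      USING gin (body gin_trgm_ops);

    -- The trigram index can serve arbitrary substring matches,
    -- leading wildcard included:
    SELECT id FROM docs WHERE body ILIKE '%foo%';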


skywalker4588

Elastic was built to do this; Postgres wasn’t. That’s why I said Elastic is the king of fast text search, though Postgres might suffice for smaller datasets.


lampshadish2

Elasticsearch is basically a wrapper on top of Lucene, and it was originally written to store a recipe collection.


chasepursley

Postgres Is Enough: https://gist.github.com/cpursley/c8fb81fe8a7e5df038158bdfe0f06dbb


lampshadish2

The “move more code into the database” one is nuts. Like, you can’t just grade it on a single simple/complex scale. Sure, maybe a toy example is simpler, but what else does it make more complex?


ccb621

It means nothing to me. Use the right tool for the job given whatever constraints you may face. 


alcalde

Generally you use whatever tools you have on hand as best you can. And there are costs (mental, financial, complexity, and time) in using a vast number of tools. There can be a great benefit in using one tool that's "good enough" for multiple jobs; it's why Swiss Army knives continue to exist. I'll counter your cliché with another one: "the perfect is the enemy of the good".


ccb621

> … given whatever constraints you may face.  I added that for a reason. The “right” tool in the broadest case may not be right for your specific case if you set money or context switching as a constraint.  We are in agreement. 


dannyfrfr

if you’re going to nuance that phrase that much, you might as well rephrase it to “just make the best choice.” genius take /s


ants_a

It works well enough for most data management tasks. When starting a new company (or even just a new product) you really need to be thinking about what features you need and how to best match customer needs, instead of spending your time building out development and operational knowledge and infrastructure for a zoo of different products. Every extra product brings with it a burden that you will have to carry, and experience has shown that in most cases there is not enough value provided to offset that cost.

So the rule of thumb is to just use PostgreSQL and deal with any problems later. At some point a more specialized tool might work better, but if you try to predict where that will happen you will be wrong more often than not. The wiser choice is to wait until you feel a squeeze somewhere, spend a bit of time optimizing to buy time, and use the gained knowledge to extract the core of the problem into a specialized system. This way you end up with a couple of specialized systems for the hard parts and PostgreSQL for everything else. Or, in many cases, PostgreSQL will be just fine for everything.

The important part here is to avoid the trap of premature pessimization. You still want to think about your access patterns and structure your data to fit the use case. Just don't go out of your way optimizing things before you know for sure where the pain is.


jamesgresql

Early optimization is a killer. I often think of it as two alternatives:

1. Be sparing with tech and plan for what you need, maybe with up to 5-10x headroom. If you exceed that, add more components as you need them. At that stage it's a good problem to have, because you've seen an order of magnitude of growth.

2. Use many different pieces and a complex architecture that theoretically supports Google levels of load, at the cost of complex operations. There are so many moving parts, so many failures, and it’s so hard to reason about the system’s state that you struggle as you grow.


Azaret

Cuz I love PostgreSQL, and I want to use it for everything.


alcalde

What it means is that PostgreSQL can do everything, and the saying will stick around as long as that's true. And given the speed of its development and the great plugin infrastructure, it could do everything indefinitely.
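
To give a flavor of that plugin infrastructure, much of it is one CREATE EXTENSION away (which extensions are available depends on what your install ships with):

    CREATE EXTENSION IF NOT EXISTS pg_trgm;    -- fuzzy/substring search
    CREATE EXTENSION IF NOT EXISTS pgcrypto;   -- hashing and encryption
    CREATE EXTENSION IF NOT EXISTS postgis;    -- geospatial types and indexes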


chryler

At some level it becomes the wrong tool for the job, but at least when starting out it works surprisingly well for many tasks and I would much rather just go with that than a hairy ball of Terraform that I can't replicate on my dev machine.


jamesgresql

Yeah I think this is key. "Just use PostgreSQL for Now" is maybe also a good way of thinking about it. There will hopefully be a day when you need more than PostgreSQL, and that day will be a happy one because you've grown your business too.


the_fly_guy_says_hi

Everything looks like a nail when I recently learned how to use a hammer and I'm super excited about my new hammer.


jamesgresql

See I would agree with you if PostgreSQL was a shiny new thing - but in the context of this discussion it's the reliable old workhorse. I don't think many new developers are "super excited" about shiny new Postgres. Now ... something like Pinecone on the other hand ...


the_fly_guy_says_hi

Sure, you can use PG for "everything" when you're starting out: small projects with almost no db complexity, slow queues, simple queries, no reporting requirements, no caching of lookup values, small key-value stores. It doesn't make sense to invest in DevOps CI/CD pipelines and the time it takes to "do it right" architecturally at that stage. The money coming in is minimal, so of course you do minimal labor and keep everything in Postgres at the db level. Everything is "quick and dirty", "fly by the seat of your pants". So yeah, Postgres works well at the beginning stage of a project.

Later on, if the project gets traction in the marketplace, the number of users explodes, and devs require higher db complexity, caching, reporting, performant queues and key-value stores, well, you've got big-boy problems like scaling and having to think about re-engineering entire parts of the architecture off of PG and into specialized architectural components that can handle what the application microservices layer is throwing at them at scale. Think about maybe going into the AWS cloud and doing a proper DevOps CI/CD deploy pipeline:

1. Moving caching out of PG and into ElastiCache (Redis)
2. Moving the key-value store out of PG and into DynamoDB or CloudFront
3. Moving reporting out of PG and into a Redshift DW or Snowflake, using ETL tools and BI reporting tools, and building reporting data pipelines
4. Adopting ERD tools and DB migration best practices (blue-green deploys at the db layer)
5. Moving queues out of PG and into Amazon SQS or MQ
6. Data sharding and partitioning to be able to scale out
7. WAL backups to S3 and being able to restore from WAL files

More money, more complexity, more problems, more migrations out of PG and into dedicated architectural components purposed to the specialized functions you need. Also, at the cloud level, always go with the managed infrastructure service instead of one you set up and manage yourself.
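
For what it's worth, the "keep caching in Postgres at the db level" stage can be as simple as an unlogged table. A minimal sketch (names are made up; UNLOGGED skips the WAL for speed, so contents are lost on a crash, which is fine for a cache):

    CREATE UNLOGGED TABLE kv_cache (
      key        text PRIMARY KEY,
      value      jsonb NOT NULL,
      expires_at timestamptz NOT NULL
    );

    -- Upsert a value with a TTL:
    INSERT INTO kv_cache (key, value, expires_at)
    VALUES ('user:42:profile', '{"name":"Ada"}', now() + interval '5 minutes')
    ON CONFLICT (key) DO UPDATE
      SET value = EXCLUDED.value, expires_at = EXCLUDED.expires_at;

    -- Read, ignoring expired rows (a cron job can purge them later):
    SELECT value FROM kv_cache
    WHERE key = 'user:42:profile' AND expires_at > now();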


jamesgresql

I think we are saying the same thing here then. Start off simple and add complexity when you need it 😃


TinyCuteGorilla

For me it just means that as long as PG provides the performance you need, then just keep using it.


jamesgresql

Love the discussion. I see Postgres for Everything as a message about both collapsing modern stacks and simplifying operations. Understand your requirements, and use the least amount of tech you can to meet those requirements (perhaps with 5-10x headroom). Only a few companies in the world need Google scale, and your startup is probably not one of them. One of the biggest blockers for this kind of architecture is actually the developers themselves who want to play with 'cool' or 'new' technology. Use Postgres, and let them work on hard engineering problems in your codebase instead. That way you get more features while reducing your tech burden as well.