Hello,

As the title suggests, how do you manage your DBs for docker services?

Do you spin up a new DB for every new docker cluster, or do you have a centralized DB that is accessible to the docker clusters?

What are the pros and cons of both methods?

For the moment, I spin up a new DB for every service, as I feel it is easier to back up the service in case of a problem.

  • placebo@lemmy.zip
    2 hours ago

    Given that database management systems already provide clear separation between services in the form of databases, users, and permissions, I see no need to spin up new database instances for each individual service. You say it’s easier to back up tightly coupled services and databases, but why? I find it easier to back up a single database server than multiple servers.

    The real concern with shared databases is performance: some services, under certain conditions, can generate load that degrades database performance for everyone. But that’s usually a problem for large enterprises, not self-hosters.

    • Croquette@sh.itjust.worksOP
      43 minutes ago

      You say it’s easier to back up tightly coupled services and databases, but why?

      Because it is set-and-forget in my docker compose. I can back up the container and bring it down without affecting other services.

      But that is my inexperience talking, and this is why I wanted to see what other people were doing and to get perspectives like yours to learn from.

  • Daniel Quinn@lemmy.ca
    2 hours ago

    I’ve had a really hard time figuring out how to get cloud native pg working 'cause I couldn’t get longhorn working for disk space.

    So instead I went with a separate Raspberry Pi that isn’t part of the cluster to host a single Postgres instance.

    It’s inelegant, but has worked for years. Still, I’d rather host a separate pg instance for each project… I just have to figure the above out first.

  • Consti@lemmy.world
    8 hours ago

    I typically have one DB service per app service (not just per cluster, unless multiple services need the same DB).

    Advantages:

    • Simple backup/data organization, each service is self-contained
    • True isolation: Unless you manually create DB accounts for each service, likely all your services have access to all data, and even with accounts there are data leaks and exploits

    Disadvantages:

    • You have more services running than strictly needed, but this has a minuscule impact on performance (the overhead of the DB service is typically not noticeable)

  • irmadlad@lemmy.world
    8 hours ago

    I just run whatever db is required in the docker compose stack. I’ve thought about running separate DBs like mongo, mysql, sqlite and having them as central db services for the containers. I don’t see a downside to doing that. It just seems to me that it’s easier just to run what is required by the compose stack itself. For what I’m doing, they don’t seem to eat up many resources.

  • Zikeji@programming.dev
    8 hours ago

    Individually. If the app requires a DB, I put it in the compose file. This simplifies both backups and migrations. My tooling for backups has a pre and post script I can customize on a per app basis so I just have the pre do whatever *dump for that DB and the post clean it up (backup takes a tar of the folder).
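
    The pre/post hook flow described in this comment could look roughly like the sketch below. It is not the commenter's actual tooling: the function names, the `dump.sql` filename, and the use of `docker exec ... pg_dump` are all assumptions chosen for illustration, and the archive is assumed to live outside the app folder so it does not end up inside itself.

```python
import subprocess
import tarfile
from pathlib import Path
from typing import Callable, Optional

Hook = Callable[[Path], None]


def backup_app(app_dir: Path, archive: Path,
               pre: Optional[Hook] = None,
               post: Optional[Hook] = None) -> Path:
    """Run the pre hook, tar the app folder, then always run the post hook."""
    if pre is not None:
        pre(app_dir)
    try:
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(app_dir, arcname=app_dir.name)
    finally:
        # Clean up even if writing the archive failed.
        if post is not None:
            post(app_dir)
    return archive


def pg_dump_hook(container: str, user: str, database: str) -> Hook:
    """Hypothetical pre hook: write a pg_dump of one database into the app folder."""
    def hook(app_dir: Path) -> None:
        with open(app_dir / "dump.sql", "wb") as out:
            subprocess.run(
                ["docker", "exec", container, "pg_dump", "-U", user, database],
                stdout=out, check=True)
    return hook


def remove_dump(app_dir: Path) -> None:
    """Post hook: delete the dump once it is inside the archive."""
    (app_dir / "dump.sql").unlink(missing_ok=True)
```

    A call such as `backup_app(Path("/srv/myapp"), Path("/backups/myapp.tar.gz"), pre=pg_dump_hook("myapp-db", "myapp", "myapp"), post=remove_dump)` would then cover one app; the paths and container name here are placeholders.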

  • cecilkorik@lemmy.ca
    8 hours ago

    I use a single centralized database running directly on the host (postgres) and make it accessible to the docker clusters.

    Pros: It’s all in one place to back up all important data for every service. I find this a much easier and more reliable way to back up all services, and I’m confident that if you tried it you would too. The application data becomes a first-class citizen instead of hiding inside some nameless docker volume. Significantly less database overhead. Consistent database version and tools.

    Cons: Lots. You need to manage and back up the database. You need to manage the database users and passwords too. Making the DB accessible to the docker clusters is nontrivial and can be fragile. You can no longer use the default “official” docker compose files, since they will almost never support an external database service without several modifications; they’re always built for spinning up their own docker container database. So you’re going to be doing a bunch of work setting up users and passwords and performing extensive surgery on the docker-compose before you even start up the application, which adds a lot of friction, with lots of opportunity for error. All things considered, it’s actually quite painful. Technically, if one application abuses your database hard enough and exhausts its memory to crash it or something, it would affect other applications too, but that’s true of any service running on shared hardware abusing anything on that hardware, so it’s not a realistically concerning con.

    I consider it worthwhile, but I might be wrong. Also I hate docker in general. I understand why people use it. It’s the same reason I use it. But I still hate it. I think system installations are so much easier to manage in the long run, but initially more work, and you need to invest that work at a time when you’re not even sure if you really want to run this application or if it’s going to be compatible with the rest of your environment. So docker is the easy solution. But then you’re basically trapped in dockerland. It’s not that bad, I just hate it in principle. I wish there were a better way.

    • glizzyguzzler@piefed.blahaj.zone
      2 hours ago

      Containers lower the bar since the developer doesn’t need to make their program work on every system - just the container’s system.

      Price we pay for more programs. And they bring boons like read-only, rootless, limited capabilities, and constrained perf limits (esp if you use Podman with Quadlets).

      And don’t feel trapped - the Dockerfile is a recipe to build that program. You probably want to do it in an LXC container, since it’ll want to use /data for its data or something. The LXC container can also be run as an unprivileged user while the program inside thinks it’s root. Plenty of security abounds!

      I think it’s worth the price and you’re not trapped. They’re trapped with you and your robust Quadlet files

    • northernlights@fedia.io
      4 hours ago

      Same, exactly what I do. For the part about backups, there are tools that make it really easy. I’m using databasus for instance. Once I had set up a couple of applications, adding a new db, user, and backup config took just a few minutes every time, really. In the end, figuring out how to back up every docker stack’s individual DB sounds more complicated to me.

      Edit: plus my first service hosted in this lab was my own matrix instance so I needed a solid DB, then it was already there so might as well use it.

    • Croquette@sh.itjust.worksOP
      5 hours ago

      That’s a good perspective.

      Also I hate docker in general. I understand why people use it. It’s the same reason I use it.

      I am the same. But many services offer docker as the main installation method and many times, the bare metal method is poorly documented.

      So docker it is. And it’s a good skill to have no matter what since it is so widespread.

      I never thought about the issues of setting up a docker service with an external database. I don’t mind dealing with the users and tables of a database, but having to dig deep in docker compose settings is always a bad time.

    • tburkhol@slrpnk.net
      7 hours ago

      I started doing the One True Database method because I got worried that the high write count on all the little db’s was abusing a raspberry pi’s SD card. Moved them all to a bigger server with NVME and mirroring to a RAID.

      Not all the compose files make it obvious how to reconfigure the db host. Homeassistant uses a sqlite db built into the container, rather than a separate unit, but you can force it to use a remote db through its config file. May or may not be worth hiding db user/pass in a .env. And sometimes there’s trouble restarting after power failure, depending on what order the database, pi, and various containers come back up.

      I also feel it’s worthwhile. I feel better being able to check on all the databases. Feel better not writing to the SD card so much. Feel better offloading those megabytes and cpu cycles from the little pi. It’s been fun snooping through database structures. There have been a couple of times where I decided to query one of the containers’ databases directly, or cross from one project to another, and it’s easier (for me) to give a different user privileges to the database and query some deep bit of data than to figure out how to extract it from an API or frontend.

      I’m not even running that many services, but why would I want the overhead of 6 separate mysql instances when I could just have one?
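
      On the point about keeping the db user/pass in a .env file: one way an application or helper script might assemble a connection URL from environment variables is sketched below. The variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, `DB_NAME`) are hypothetical, and the `mysql://user:password@host:port/dbname` shape is assumed here as a common URL form rather than taken from the thread.

```python
import os
from typing import Mapping
from urllib.parse import quote


def db_url_from_env(env: Mapping[str, str] = os.environ) -> str:
    """Build a mysql:// connection URL from (hypothetical) environment variables."""
    # Percent-encode the credentials so characters like '@' or '/' cannot
    # be mistaken for URL delimiters.
    user = quote(env["DB_USER"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")
    host = env["DB_HOST"]
    port = env.get("DB_PORT", "3306")  # default MySQL/MariaDB port
    name = env["DB_NAME"]
    return f"mysql://{user}:{password}@{host}:{port}/{name}"
```

      Passing a plain dict instead of `os.environ` makes the function easy to try out without touching the real environment.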

      • ohulancutash@feddit.uk
        3 hours ago

        sometimes there’s trouble restarting after power failure, depending on what order the database, pi, and various containers come back up.

        depends_on solves this problem
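
        depends_on orders services that are declared in the same compose project. When the database runs on a separate machine, as in the setup quoted above, one alternative is to have the dependent service wait until the database port accepts connections before starting. The following is a minimal sketch of that idea; the function name and defaults are made up for illustration, and an open port does not by itself guarantee the database is ready to answer queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt is itself bounded by `interval` seconds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

        A container entrypoint could call something like `wait_for_port("db.lan", 5432)` (host and port are placeholders) and only launch the application when it returns True.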

  • bacon_pdp@lemmy.world
    5 hours ago

    One database service but separate databases running inside of the service. Each database has 3 accounts: table_owner (no remote access), proc_owner (only table specific permissions and the owner of all stored procedures; no remote access) and application_account (no table access and only execute permissions on the proc_owner’s stored procedures).

    Which means that even if the application is compromised, it can not compromise the database. It can only use approved stored procedures that check their inputs and abort on the smallest deviation from expected inputs.

  • NGram@piefed.ca
    8 hours ago

    I too spin up a new DB for every service that needs one. It makes maintenance less intrusive since I only have to shut down one service at a time to update or do a full file backup and I can safely tweak the DB config to best suit the service without worrying about negatively impacting other services on the same DB. I also like the flexibility to choose the best DB for the service, e.g. I use Postgres for a lot of the services I develop but they do also support MySQL/MariaDB, but I have other services that are more tested on MariaDB.

    I’d imagine having one database for everything could eventually cause performance problems across all services if the database gets too big or clunky or queries get too complex, even if just one service is causing the problem.

    I also run a lot of physical servers (I have about 10 low power computers I use as servers right now) so having a dedicated DB per service allows me to get better performance since the DB can be on the same machine as the service. Not that they need high performance, but it also helps with the efficiency.

  • IratePirate@feddit.org
    6 hours ago

    //Edit: terminology improvements

    One per service. Not just because it’s easy, but because it allows you to minimise blast radius. If you’re running one DBMS per service and one container gets compromised, only that one DBMS is compromised. With one centralised DBMS for all containers, a breach of one container means access to the data of all containers that use this without the need for any lateral movement.

    • placebo@lemmy.zip
      2 hours ago

      a breach of one container means access to the data of all containers

      How so? Each service uses its own database with credentials that provide access only to that database. Unless on top of a breach in your container there is some zero-day in your DBMS - which I find highly improbable - no other data will be affected.
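
      The per-service database and credentials described here could be provisioned on a shared PostgreSQL server with statements like the ones this sketch generates. The helper name and the convention of naming the role and the database after the service are assumptions made for illustration; the statements would still have to be run by an administrator, for example through psql.

```python
import re
from typing import List

# Only allow simple lowercase identifiers so the name can be used unquoted.
_IDENT = re.compile(r"^[a-z_][a-z0-9_]*$")


def provision_sql(service: str, password: str) -> List[str]:
    """Build PostgreSQL statements giving one service its own role and database."""
    if not _IDENT.match(service):
        raise ValueError(f"unsafe identifier: {service!r}")
    quoted_pw = password.replace("'", "''")  # escape quotes for a SQL string literal
    return [
        f"CREATE ROLE {service} LOGIN PASSWORD '{quoted_pw}';",
        f"CREATE DATABASE {service} OWNER {service};",
        # By default any role may connect to a new database; revoke that
        # so only the owner (and superusers) can connect.
        f"REVOKE CONNECT ON DATABASE {service} FROM PUBLIC;",
    ]
```

      Running `provision_sql("nextcloud", "...")` once per service yields a role that owns its own database and that other service roles can no longer connect to.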

  • ser@lemmy.zip
    8 hours ago

    I’m having problems getting MariaDB to work with NextCloud. Any help would be appreciated.

    • darcmage@lemmy.dbzer0.com
      7 hours ago

      Start a new thread with your compose file (minus the sensitive stuff). I’d be happy to take a look and I’m sure others will step in as well.