What Is a Database Transaction?

(planetscale.com)

69 points | by 0x54MUR41 2 hours ago

4 comments

MHordecki 1 hour ago
I’ve found this article lacking. Like some other articles in this space, it introduces isolation levels through the lens of the phenomena described in the SQL standard, but I find that there’s a different, more intuitive approach.
I think it’s more tractable to define this problem space starting from the concept of (strict) serializability, which is really a generalization of the concept of thread safety. Every software engineer has an intuitive understanding of it. Lack of serializability can lead to execution-dependent behavior, which usually results in hard-to-diagnose bugs. Thus, all systems should strive towards serializability, and the database can be a tool in achieving it.
Various non-serializable levels of database transaction isolation are relaxations of the serializability guarantee, where the database no longer enforces the guarantee and it’s up to the database user to ensure it through other means.
The isolation phenomena are a useful tool for visualizing various corner cases of non-serializability, but they are not inherently tied to it. It's possible to achieve serializability while observing all of the SQL phenomena. For example, a Kubernetes cluster with carefully-written controllers can be serializable.
[-]
- bddicken 59 minutes ago
  Author here. This is good feedback.
  The combination of transactions, isolation levels, and MVCC is such a huge undertaking to cover all at once, specially when comparing how it's done across multiple DBs which I attempted here. Always a balance between technical depth, accessibility to people with less experience, and not letting it turn into an hour-long read.
  [-]
  - jaxr 9 minutes ago
    I love the work planetscale does on keeping this type of content accurate yet accessible. Keep it up!
- lateforwork 14 minutes ago
  Most RDBMSs offer serializable isolation if you need it. Often you don't need it. The downside of using serializable isolation unnecessarily is reduced concurrency and throughput due to increased coordination between transactions.
- Rapzid 59 minutes ago
  https://aphyr.com/posts/327-jepsen-mariadb-galera-cluster
  More notation, more citations, more better.
  [-]
  - bddicken 49 minutes ago
    Notation is useful. Citations are nice for further reading. But I don't agree more of this makes for a better article!
  - peterclary 49 minutes ago
    Looks like the author is geoblocking in protest of the UK Online Safety Act (and fair enough).
- mika6996 1 hour ago
  Then recommend a better explanation?
interlocutor 44 minutes ago
A lot of database tools these days prioritize instant sharing of updates over transactions and ACID properties. Example: Airtable. As soon as you update a field the update shows up on your coworkers screen who also has the same table open. The downside of this is that Airtable doesn't do transactions. And the downside of not doing transactions is potentially dangerous data inconsistencies. More about that here: https://visualdb.com/blog/concurrencycontrol/
rishabhaiover 2 hours ago
It's an absolute pleasure reading planetscale blogs. I'm curious about what tool is used to make these visualizations?
[-]
- bddicken 58 minutes ago
  Author here. Thank you! These visuals are built with js + gsap (https://gsap.com)
zzzeek 48 minutes ago
I think this is a great post to have but I'm going to make a critical usability suggestion:
* the videos should have "pause" and a "step at a time" control *
Even at the "half speed", without a deep knowledge of the context, the videos move way too fast for me to read the syntax that's invoking and line it up with the data on the left side. I (and im definitely not the only one) need to be able to sit on one step and stare at the whole thing without the latent anxiety of the state changing before I've had a chance to grok the whole thing.
this has nothing to do with familiarity with the concepts (read my profile). I literally need time to read all the words and connect them together mentally (ooh, just noticed this is pseudo-SQL syntax also, e.g. "select id=4", that probably added some load for me) without worrying they're going to change before watching things move.
please add a step-at-a-time button!