Tell me why this UUID pattern is stupid

Consider a setup with multiple databases using UUIDv7 primary keys, with Debezium + Kafka pushing updates from each to the others, and simple consumers that just insert, update, and delete records from the Kafka queue. At a minimum you'll end up with a round trip for each operation, and you risk infinite loops if your Debezium connector doesn't drop messages where before and after are identical.

Why not then implement a different UUID spec, similar to v7, in which the last 62 or 60 bits are a combination of random data and an identifier of the datacenter (say, the first 30 bits of a hash of the datacenter name)?
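A minimal sketch of such a generator, assuming a UUIDv7-style layout where the 62-bit rand_b field is split into 30 datacenter-hash bits followed by 32 random bits (the split, the function names, and the SHA-256 choice are illustrative assumptions, not part of any spec):

```python
import hashlib
import os
import time
import uuid

DC_HASH_BITS = 30  # assumed: top 30 bits of rand_b identify the datacenter

def datacenter_bits(name: str) -> int:
    # First 30 bits of a SHA-256 hash of the datacenter name (hash choice is illustrative)
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - DC_HASH_BITS)

def make_dc_uuid(dc_name: str) -> uuid.UUID:
    ts_ms = time.time_ns() // 1_000_000                      # 48-bit ms timestamp
    rand_a = int.from_bytes(os.urandom(2), "big") & 0x0FFF   # 12 random bits
    rand_tail = int.from_bytes(os.urandom(4), "big")         # 32 random bits
    rand_b = (datacenter_bits(dc_name) << 32) | rand_tail    # 62-bit rand_b field
    value = (ts_ms << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)
```

Keeping the v7 version and RFC 4122 variant bits means standard tooling still parses these as valid, time-sortable UUIDs.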

Then, in your Debezium connectors, add a plugin that takes the datacenter name as a config option, parses the UUID of every payload's record, and drops those in which the datacenter bits do not match its config, thereby eliminating round trips as well as the risk of infinite loops.
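The drop decision itself is just a bit comparison. A sketch of the predicate such a plugin could apply (in practice this would be a Kafka Connect SMT in Java; the names and the 30/32 bit split here are hypothetical):

```python
import hashlib
import uuid

DC_HASH_BITS = 30  # must match the generator's assumed split

def datacenter_bits(name: str) -> int:
    # First 30 bits of a SHA-256 hash of the datacenter name (illustrative)
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - DC_HASH_BITS)

def should_forward(record_id: str, local_dc: str) -> bool:
    # Forward only records whose PK was minted in this connector's own
    # datacenter; anything else originated elsewhere and forwarding it
    # would cause the round trip we are trying to avoid.
    id_dc_bits = (uuid.UUID(record_id).int >> 32) & ((1 << DC_HASH_BITS) - 1)
    return id_dc_bits == datacenter_bits(local_dc)
```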

There’s a somewhat greater risk of collision, but as long as you’re not taking tens of thousands of inserts per millisecond you should be fine. Maybe this is a poor man’s version of something with a more robust spec, but it seems viable to me.
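A back-of-the-envelope check via the birthday bound, assuming the IDs generated within one millisecond in one datacenter differ only in their truly random bits (the exact number of random bits depends on how you split the field):

```python
import math

def collision_prob(ids_per_ms: int, random_bits: int) -> float:
    # Birthday-bound approximation within a single millisecond window:
    # P(collision) ~= 1 - exp(-n(n-1) / 2^(b+1))
    space = 2.0 ** random_bits
    return 1.0 - math.exp(-ids_per_ms * (ids_per_ms - 1) / (2.0 * space))
```

For example, with 32 random bits left after a 30-bit datacenter prefix, collision_prob(10_000, 32) comes out to roughly 1% per millisecond, while collision_prob(1_000, 32) is around 10^-4, which matches the intuition that the scheme holds up below tens of thousands of inserts per millisecond.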

3 points | by yakkasean 1 hour ago

1 comment

  • benoau 1 hour ago
    It sounds similar to Twitter's Snowflake?

    https://en.wikipedia.org/wiki/Snowflake_ID

    • yakkasean 56 minutes ago
It does look similar. I’m confused, though, about how they coordinate sequence bits with only 2^10 (1024) workers. Surely they have more web servers than that, so the sequence must be coordinated in a centralized way. Also, this is a 62 bit spec.
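For context, the commonly documented Snowflake layout packs a per-machine sequence next to the machine id, so the sequence itself needs no central coordination: each worker increments its own 12-bit counter locally and resets it every millisecond (only the worker ids were assigned centrally, historically via ZooKeeper). A decoding sketch:

```python
def decode_snowflake(sid: int) -> dict:
    # Standard 64-bit Snowflake layout: 41-bit ms timestamp (after a
    # custom epoch), 10-bit machine id, 12-bit per-machine sequence.
    return {
        "timestamp_ms": (sid >> 22) & ((1 << 41) - 1),
        "machine_id": (sid >> 12) & ((1 << 10) - 1),
        "sequence": sid & ((1 << 12) - 1),
    }
```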