Well, compared to Mastodon's behavior up to now, it's this new behavior that is the breaking change, so the point that we had simply gotten used to the old way is, well, fair…

@unarist Hmmm. I don't think this breaks since_id. Even if it's a newly-fetched status, since_id is about chronology: "give me toots written after this one".

I agree with the concern of faking time.

The delay should not be a problem, since regardless of when Sidekiq executes the delivery, "published" is the original created_at, right?

@Gargron The new behavior seems correct for published order, but it differs from the current one, arrival order. Right now we can fetch a complete timeline with since_id={cached_statuses.latest.id}, but that's not guaranteed with the new behavior.

Technically, even a 1 msec delay in a status arriving may cause it to be missing from the timeline for apps that only use the REST API (e.g. Amaroq). Also, ISO8601 encoding strips the msec part, so the timestamp will be in the past even with zero delivery latency. This case is not rare.
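For illustration, here is how the msec-stripping looks in Ruby's stdlib (a minimal sketch; the actual serialization path in Mastodon may differ):

```ruby
require "time"

# A time 999.999 ms into a second (epoch 1500000000 = 2017-07-14T02:40:00Z).
t = Time.at(1500000000, 999_999).utc

# The default ISO8601 encoding drops the fractional part entirely, so a
# status created at .999 serializes as if it were created almost 1s earlier:
puts t.iso8601     # => "2017-07-14T02:40:00Z"

# Passing a precision argument would keep the milliseconds:
puts t.iso8601(3)  # => "2017-07-14T02:40:00.999Z"
```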

cc @Clworld

@unarist @Clworld But arrival order was wrong all along... If you refresh someone's profile with since_id, do you expect a 3mo old status to show up in the results? Even if it just arrived?

@unarist @Gargron I think really big delays aren't caused by Sidekiq itself but by other steps (fetching the boosted original status, resolving a searched toot URL, etc.). I suggest arrival-order numbering for plain toot delivery. (Sidekiq delay is not big, because it only retries 3 times for OStatus and 8 times for ActivityPub.)

@Clworld @unarist Aha, I think I can understand your concern now. It's the very short term delay fluctuations that could mess up since_id. Yes, I understand. Okay. Then there needs to be a flag on the processing classes that's true only when they're called from the inbox controller.

@unarist @Clworld So when called from inbox controller -> local timestamp; otherwise -> created_at timestamp

@Clworld @unarist And if created_at timestamp is in the future, fallback to local timestamp too.

@Clworld @Gargron Btw, since the chronological order of the lower bits of the new ids is not guaranteed, a newer status may get a lower id than an older one if both were created within the same millisecond, I think.

If so, using the latest status id as since_id may still cause statuses to be skipped, which is exactly what many apps (incl. the WebUI) do, and what we've provided via the Link header github.com/tootsuite/mastodon/ If a client wants to fetch new statuses without gaps, it should clear the lower bits before using the id as since_id.
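A minimal sketch of the client-side workaround being proposed, assuming the low 16 bits are the non-chronological part (`SEQUENCE_BITS` and `safe_since_id` are illustrative names, not part of any Mastodon API):

```ruby
# Assumption: the low 16 bits of a snowflake-style id carry a sequence/salt
# that is not chronologically ordered within a millisecond.
SEQUENCE_BITS = 16

# Zero the non-chronological bits so that same-millisecond statuses with
# "smaller" random bits are not skipped when paginating with since_id.
def safe_since_id(latest_id)
  (latest_id >> SEQUENCE_BITS) << SEQUENCE_BITS
end
```

The trade-off is that the client may receive statuses from that same millisecond again and has to deduplicate by id.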

@unarist @Clworld True. Maybe there is a way to make the sequence bits increment for each duplicated timestamp, instead of using randomness and retries?

@Clworld @unarist Ah, but I think the salt is still misplaced... I don't think we need that...

@unarist @Gargron Note: remember that the home TL ordering keys in redis are treated as doubles and only have 53 bits of precision; the msec timestamp is already 41 bits long, and the snowflake id will now be 57 bits. So the last 4 bits of the snowflake id are ignored in the redis home TL sorting key.
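The precision loss is easy to demonstrate (a sketch, assuming a 57-bit id as discussed; IEEE-754 doubles have a 53-bit significand):

```ruby
# A 57-bit id whose low 4 bits are set.
id = (1 << 56) | 0xF

# Stored as a redis sorted-set score, the id goes through a double, whose
# 53-bit significand cannot represent all 57 bits:
score = id.to_f

puts score.to_i == id  # => false: the low bits were rounded away
```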

@Clworld @unarist Okay, I guess we should make the IDs a bit smaller than 57 bits? The 4 bits are the salt, I think, so without the salt it should be exactly 53 bits? @aschmitz

@Gargron @Clworld @aschmitz Also, I've realized the sequence I mentioned is shared across the whole table, so it often overflows 16 bits and we can't use it for the salt part while keeping correct order. I don't know how expensive it would be to count the existing records for a given timestamp...

@unarist @Clworld @aschmitz We could make a table that stores and increments a sequence for each timestamp, maybe? I'm afraid this couldn't be done with redis, because we can't afford to lose that data...

@Gargron @unarist @Clworld Jumping in here: strong agree on "don't backdate IDs when they're 'normal' incoming statuses".

Redis home TL keys aren't an issue: they now have their ID as value too, and Redis sorts on key and then value. (Although it sorts on ASCII value, which means there's a minor chance of a wrong ordering in a little bit when that rolls over to another digit, but it's unlikely to bite anyone and won't happen again for decades.)

@Clworld @unarist @Gargron

The "sequence" ending is effectively random for each millisecond (and then incremented therein), so it's somewhat unlikely that it will ever roll over in practice, but zeroing out the last 16 bits in Link headers when the most current status is included is probably safest, yes. I can investigate doing that if you'd like.

Any other concerns? 🙂

@Gargron @Clworld @unarist

We still need something in case multiple things happen in the same millisecond (which is increasingly likely/common). Removing it also doesn't actually get us anything: a few bits fewer, but we'd still need to deal with biggish ints and a potentially wrapping sequence counter. If we keep with millis, the impact is minimal and easy to address. (Using a separate table to reset that to zero every milli adds a lot of overhead too.)

@Gargron @Clworld @unarist I actually missed that specific post, but it's a common way to handle things: that's what I'd do if I had to merge lots of app servers into the database without a single serial counter. The low 16 bits let us do that (say 2**6 app servers and then 10 bits for a counter) in the future without expanding. The hashed ID is just a way of letting the DB assign things without giving away a count or requiring coordination of servers.

@aschmitz @Clworld @unarist can we do it more like that? No salt, and consecutive inserts are guaranteed to be chronologically correct

@Gargron @Clworld @unarist In practice, not without a lot of extra work. This would require coordinated assigning of worker IDs and process/thread-global storage of counters that got reset every millisecond. The latter is easy enough, the former is practically intractable given the myriad configurations people deploy.
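The "easy" half can be sketched as a process-local counter that resets whenever the millisecond changes (illustrative only; the intractable part, coordinating worker ids across arbitrary deployments, is not shown):

```ruby
# A per-process sequence that restarts at 0 each new millisecond.
class MillisecondSequence
  def initialize
    @mutex = Mutex.new
    @last_ms = nil
    @seq = 0
  end

  # now_ms: current time in milliseconds since some epoch.
  def next(now_ms)
    @mutex.synchronize do
      if now_ms == @last_ms
        @seq += 1
      else
        @last_ms = now_ms
        @seq = 0
      end
      @seq
    end
  end
end
```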

@aschmitz @Clworld @unarist Hmmmmm no, wait. The twitter version of snowflake IDs requires worker_id, but the instagram one does not. shard_id in the code refers to the Postgres shard!! Of which there is only ever one in Mastodon.
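A sketch of that Instagram-style layout (assumed field sizes: 41-bit millisecond timestamp, 13-bit shard id, 10-bit per-shard sequence; the epoch constant is hypothetical):

```ruby
# Hypothetical custom epoch (2015-01-01T00:00:00Z in ms).
EPOCH_MS = 1_420_070_400_000

# 41 bits of milliseconds | 13-bit shard id | 10-bit wrapping sequence.
# With a single Postgres "shard", shard_id is just a constant.
def instagram_style_id(now_ms, shard_id, sequence)
  ((now_ms - EPOCH_MS) << 23) | (shard_id << 10) | (sequence % 1024)
end
```

Ordering between milliseconds is guaranteed by the high bits; within a millisecond it depends on the wrapping sequence.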

@Gargron @Clworld @unarist I should note that both the Instagram and Twitter models are only serializable between milliseconds, not overall: if two different processes handle messages in the same millisecond, they will be out of order by ID. This is basically standard. We can handle it from the Link headers by just zeroing out the last bits with effectively no ill effects.

@aschmitz @Clworld @unarist but there is only one postgres process? the timestamp function is within postgres

@Gargron @Clworld @unarist Oh, sorry, yes, if you narrow to one shard on the database. However, note that their low bits still wrap around, because they do a modulus on a Postgres sequence. There's no good way to reset that to zero every millisecond in Postgres.

(I maintain that this isn't a problem, and if two statuses that arrived in the same millisecond show up out of order, nobody will notice/care. We just have to make sure they aren't skipped.)

@aschmitz @Clworld @unarist I am concerned about the zeroing of lower bits requirement for apps because while other collections use Link headers only, status timelines and notifications often use id values straight out of the result set rather than Link headers

@unarist @Clworld @aschmitz As a last resort we could try using redis to keep the sequence increment, but I feel like it might be possible in postgres if you keep a table of timestamp->sequence rows?

@Gargron @unarist @Clworld I think that adds a lot of overhead (row lock and update?) for every addition. Not sure Redis helps much, other than being faster.

@aschmitz @unarist @Clworld If you zero the lowest bits in a max_id/since_id, won't it return the same status as the last/first one in same-millisecond cases?

@Gargron @Clworld @unarist Major version bumps are the time for breaking changes, right? 😉

@aschmitz @Clworld @unarist Yes, but it's not a good change. It makes the API harder to use instead of easier. I'd like to try to avoid that.

@Gargron @unarist @aschmitz (I was thinking we'd give up on sub-msec ordering and pray that toots won't happen within the same msec. :blob_sweat_smile:)
