Well, compared to Mastodon's behavior up to now, it's this new behavior that is the breaking change, so the point that we had simply gotten used to the old way is, well, fair…

@unarist Hmmm. I don't think this breaks since_id. Even if it's a newly-fetched status, since_id is about chronology: "give me toots written after this one".

I agree with the concern of faking time.

The delay should not be a problem, since regardless of when Sidekiq executes the delivery, "published" is the original created_at, right?

@Gargron The new behavior seems correct for published order, but it differs from the current one, arrival order. Right now we can fetch a complete timeline with since_id={cached_statuses.latest.id}, but that's not guaranteed with the new behavior.

Technically, even a 1 msec delay in a status arriving may cause it to be missing from the timeline for apps that only use the REST API (e.g. Amaroq). Also, ISO8601 encoding strips the msec part, so the timestamp will be in the past even with zero delivery latency. This case is not rare.
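For illustration, here is how the msec-stripping looks in Ruby's stdlib (a minimal sketch; the actual serialization path in Mastodon may differ):

```ruby
require "time"

# A time 999.999 ms into a second (epoch 1500000000 = 2017-07-14T02:40:00Z).
t = Time.at(1500000000, 999_999).utc

# The default ISO8601 encoding drops the fractional part entirely, so a
# status created at .999 serializes as if it were created almost 1s earlier:
puts t.iso8601     # => "2017-07-14T02:40:00Z"

# Passing a precision argument would keep the milliseconds:
puts t.iso8601(3)  # => "2017-07-14T02:40:00.999Z"
```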

cc @Clworld

@unarist @Clworld But arrival order was wrong all along... If you refresh someone's profile with since_id, do you expect a 3mo old status to show up in the results? Even if it just arrived?

@unarist @Gargron I think really big delays aren't caused by Sidekiq itself but by other steps (fetching the boosted original status, resolving a searched toot URL, etc.). I suggest arrival-order numbering for plain toot delivery. (Sidekiq delay is not big, because it only retries 3 times for OStatus and 8 times for ActivityPub.)

@Clworld @unarist Aha, I think I can understand your concern now. It's the very short term delay fluctuations that could mess up since_id. Yes, I understand. Okay. Then there needs to be a flag on the processing classes that's true only when they're called from the inbox controller.

@unarist @Clworld So when called from inbox controller -> local timestamp; otherwise -> created_at timestamp

@Clworld @unarist And if created_at timestamp is in the future, fallback to local timestamp too.

@Clworld @Gargron Btw, since the chronological order of the lower bits of the new ids is not guaranteed, a newer status may get a lower id than an older one if both were created within the same millisecond, I think.

If so, using the latest status id as since_id may still cause statuses to be skipped, which is exactly what many apps (incl. the WebUI) do, and what we've provided via the Link header github.com/tootsuite/mastodon/ If a client wants to fetch new statuses without gaps, it should clear the lower bits before using the id as since_id.
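A minimal sketch of the client-side workaround being proposed, assuming the low 16 bits are the non-chronological part (`SEQUENCE_BITS` and `safe_since_id` are illustrative names, not part of any Mastodon API):

```ruby
# Assumption: the low 16 bits of a snowflake-style id carry a sequence/salt
# that is not chronologically ordered within a millisecond.
SEQUENCE_BITS = 16

# Zero the non-chronological bits so that same-millisecond statuses with
# "smaller" random bits are not skipped when paginating with since_id.
def safe_since_id(latest_id)
  (latest_id >> SEQUENCE_BITS) << SEQUENCE_BITS
end
```

The trade-off is that the client may receive statuses from that same millisecond again and has to deduplicate by id.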

@unarist @Clworld True. Maybe there is a way to make the sequence bits increment for each duplicated timestamp, instead of using randomness and retries?

@Clworld @unarist Ah, but I think the salt is still misplaced... I don't think we need that...

@unarist @Gargron Note: remember that the home TL ordering keys in redis are treated as doubles and only have 53 bits of precision; the msec timestamp is already 41 bits long, and the snowflake id will now be 57 bits. So the last 4 bits of the snowflake id are ignored in the redis home TL sorting key.
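The precision loss is easy to demonstrate (a sketch, assuming a 57-bit id as discussed; IEEE-754 doubles have a 53-bit significand):

```ruby
# A 57-bit id whose low 4 bits are set.
id = (1 << 56) | 0xF

# Stored as a redis sorted-set score, the id goes through a double, whose
# 53-bit significand cannot represent all 57 bits:
score = id.to_f

puts score.to_i == id  # => false: the low bits were rounded away
```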

@Clworld @unarist Okay, I guess we should make the IDs a bit smaller than 57 bits? The 4 bits are the salt, I think, so without the salt it should be exactly 53 bits? @aschmitz

@Gargron @Clworld @aschmitz Also, I've realized the sequence I mentioned is shared across the whole table, so it often overflows 16 bits and we can't use it for the salt part while keeping correct order. I don't know how expensive it would be to count the existing records for a given timestamp...

@unarist @Clworld @aschmitz We could make a table that stores and increments a sequence for each timestamp, maybe? I'm afraid this couldn't be done with redis, because we can't afford to lose that data...

@Gargron @unarist @Clworld Jumping in here: strong agree on "don't backdate IDs when they're 'normal' incoming statuses".

Redis home TL keys aren't an issue: they now have their ID as value too, and Redis sorts on key and then value. (Although it sorts on ASCII value, which means there's a minor chance of a wrong ordering in a little bit when that rolls over to another digit, but it's unlikely to bite anyone and won't happen again for decades.)

@Clworld @unarist @Gargron

The "sequence" ending is effectively random for each millisecond (and then incremented therein), so it's somewhat unlikely that it will ever roll over in practice, but zeroing out the last 16 bits in Link headers when the most current status is included is probably safest, yes. I can investigate doing that if you'd like.

Any other concerns? 🙂

@Gargron @Clworld @unarist

We still need something in case multiple things happen in the same millisecond (which is increasingly likely/common). Removing it also doesn't actually get us anything: a few bits fewer, but we'd still need to deal with biggish ints and a potentially wrapping sequence counter. If we keep with millis, the impact is minimal and easy to address. (Using a separate table to reset that to zero every milli adds a lot of overhead too.)

@Gargron @Clworld @unarist I actually missed that specific post, but it's a common way to handle things: that's what I'd do if I had to merge lots of app servers into the database without a single serial counter. The low 16 bits let us do that (say 2**6 app servers and then 10 bits for a counter) in the future without expanding. The hashed ID is just a way of letting the DB assign things without giving away a count or requiring coordination of servers.

@aschmitz @Clworld @unarist can we do it more like that? No salt, and consecutive inserts are guaranteed to be chronologically correct

@Gargron @Clworld @unarist In practice, not without a lot of extra work. This would require coordinated assigning of worker IDs and process/thread-global storage of counters that got reset every millisecond. The latter is easy enough, the former is practically intractable given the myriad configurations people deploy.
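The "easy" half can be sketched as a process-local counter that resets whenever the millisecond changes (illustrative only; the intractable part, coordinating worker ids across arbitrary deployments, is not shown):

```ruby
# A per-process sequence that restarts at 0 each new millisecond.
class MillisecondSequence
  def initialize
    @mutex = Mutex.new
    @last_ms = nil
    @seq = 0
  end

  # now_ms: current time in milliseconds since some epoch.
  def next(now_ms)
    @mutex.synchronize do
      if now_ms == @last_ms
        @seq += 1
      else
        @last_ms = now_ms
        @seq = 0
      end
      @seq
    end
  end
end
```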

@aschmitz @Clworld @unarist Hmmmmm no, wait. The twitter version of snowflake IDs requires worker_id, but the instagram one does not. shard_id in the code refers to the Postgres shard!! Of which there is only ever one in Mastodon.
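A sketch of that Instagram-style layout (assumed field sizes: 41-bit millisecond timestamp, 13-bit shard id, 10-bit per-shard sequence; the epoch constant is hypothetical):

```ruby
# Hypothetical custom epoch (2015-01-01T00:00:00Z in ms).
EPOCH_MS = 1_420_070_400_000

# 41 bits of milliseconds | 13-bit shard id | 10-bit wrapping sequence.
# With a single Postgres "shard", shard_id is just a constant.
def instagram_style_id(now_ms, shard_id, sequence)
  ((now_ms - EPOCH_MS) << 23) | (shard_id << 10) | (sequence % 1024)
end
```

Ordering between milliseconds is guaranteed by the high bits; within a millisecond it depends on the wrapping sequence.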

@Gargron @Clworld @unarist I should note that both the Instagram and Twitter models are only serializable between milliseconds, not overall: if two different processes handle messages in the same millisecond, they will be out of order by ID. This is basically standard. We can handle it from the Link headers by just zeroing out the last bits with effectively no ill effects.

@aschmitz @Clworld @unarist but there is only one postgres process? the timestamp function is within postgres

@Gargron @Clworld @unarist Oh, sorry, yes, if you narrow to one shard on the database. However, note that their low bits still wrap around, because they do a modulus on a Postgres sequence. There's no good way to reset that to zero every millisecond in Postgres.

(I maintain that this isn't a problem, and if two statuses that arrived in the same millisecond show up out of order, nobody will notice/care. We just have to make sure they aren't skipped.)

@aschmitz @Clworld @unarist I am concerned about the zeroing of lower bits requirement for apps because while other collections use Link headers only, status timelines and notifications often use id values straight out of the result set rather than Link headers

@unarist @Clworld @aschmitz As a last resort we could try using redis to keep the sequence increment, but I feel like it might be possible in postgres if you keep a table of timestamp->sequence rows?

@Gargron @unarist @Clworld I think that adds a lot of overhead (row lock and update?) for every addition. Not sure Redis helps much, other than being faster.

@aschmitz @unarist @Clworld If you zero the lowest bits in a max_id/since_id, won't it return the same status as the last/first one in same-millisecond cases?

@Gargron @Clworld @unarist Major version bumps are the time for breaking changes, right? 😉

@aschmitz @Clworld @unarist Yes, but it's not a good change. It makes the API harder to use instead of easier. I'd like to try to avoid that.

@Gargron @unarist @aschmitz (I was thinking we'd give up on sub-msec ordering and pray that toots won't happen within the same msec. :blob_sweat_smile:)
