@unarist Hmmm. I don't think this breaks since_id. Even if it's a newly-fetched status, since_id is about chronology: "Give me toots written after this."
I agree with the concern of faking time.
The delay shouldn't be a problem: regardless of when Sidekiq executes the delivery, "published" is still the original created_at, isn't it?
@Gargron The new behavior seems correct for published order, but it differs from the current behavior, which is arrival order. Today we can fetch the complete timeline with since_id={cached_statuses.latest.id}, but that is no longer guaranteed with the new behavior.
Technically, even a 1 ms delay in a status's arrival may cause it to be dropped from the timeline for apps that only use the REST API (e.g. Amaroq). Also, ISO8601 encoding strips the millisecond part, so the timestamp will already be in the past even with zero delivery latency. This case is not rare.
cc @Clworld
@unarist @Gargron I think really big delays aren't caused by Sidekiq itself but by other work (fetching the boosted original status, searching for the toot URL, etc.). I suggest arrival-order numbering for simple toot delivery. (Sidekiq delay is not big, because it retries only 3 times for OStatus and 8 times for ActivityPub.)
@Clworld @Gargron btw, since the chronological order of the lower bits of the new ID is not guaranteed, a newer status may get a smaller ID than an older one if both were created within the same millisecond, I think.
If so, using the latest status ID as since_id may still cause statuses to be missed. That is what many apps (incl. the WebUI) do, and what we've provided via the Link header: https://github.com/tootsuite/mastodon/blob/d5091387c6ddbe03b118b0cfb6d74cf821b84fb2/app/controllers/api/v1/timelines/public_controller.rb#L48 If a client wants to fetch new statuses without gaps, it should clear the lower bits of the ID before using it as since_id.
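To illustrate, here is a minimal client-side sketch (hypothetical helper name, assuming the ID layout discussed in this thread: a 48-bit millisecond timestamp in the high bits and 16 unordered low bits):

```ruby
# Hypothetical client-side helper, assuming snowflake-style IDs with a
# 48-bit millisecond timestamp in the high bits and 16 unordered low bits.
SEQ_MASK = 0xffff

# Zero the non-chronological low 16 bits before reusing the newest
# cached status ID as since_id, so other statuses created in the same
# millisecond are not skipped.
def safe_since_id(latest_cached_id)
  latest_cached_id & ~SEQ_MASK
end

id_a = (1_506_643_200_000 << 16) | 0x9f3c # two statuses in the same ms
id_b = (1_506_643_200_000 << 16) | 0x02e1
# Using id_a directly as since_id would skip id_b (since id_b < id_a);
# the widened since_id includes both.
safe_since_id(id_a) < id_b # => true
```

The trade-off is that this re-fetches every status from that millisecond, so the client still needs to drop duplicates it has already seen.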
@Gargron @unarist @Clworld Jumping in here: strong agree on "don't backdate IDs when they're 'normal' incoming statuses".
Redis home TL keys aren't an issue: they now have their ID as the value too, and Redis sorts on key and then value. (Although it sorts on ASCII value, which means there's a minor chance of a wrong ordering around the point where IDs roll over to another digit, but that's unlikely to bite anyone and won't happen again for decades.)
The "sequence" part is effectively random for each millisecond (and then incremented within it), so it's somewhat unlikely to ever roll over in practice, but zeroing out the last 16 bits in Link headers when the most recent status is included is probably safest, yes. I can investigate doing that if you'd like.
Any other concerns? 🙂
We still need something in case multiple things happen in the same millisecond (which is increasingly likely/common). Removing it also doesn't actually gain us anything: a few bits fewer, but we'd still need to deal with big-ish ints and a potentially wrapping sequence counter. If we keep millisecond precision, the impact is minimal and easy to address. (Using a separate table to reset the counter to zero every millisecond adds a lot of overhead too.)
@Gargron @Clworld @unarist I actually missed that specific post, but it's a common way to handle things: it's what I'd do if I had to merge lots of app servers into the database without a single serial counter. The low 16 bits would let us do that (say, 2**6 app servers and then 10 bits for a counter) in the future without expanding the ID. The hashed ID is just a way of letting the DB assign IDs without giving away a count or requiring coordination between servers.
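A sketch of that hypothetical future partition (illustrative only; this is not Mastodon's actual implementation, just one way the 16 low bits could be split):

```ruby
# Illustrative only: one possible partition of the 16 low bits into
# a 6-bit app-server ID (up to 2**6 servers) and a 10-bit counter.
def compose_low_bits(server_id, counter)
  ((server_id & 0x3f) << 10) | (counter & 0x3ff)
end

compose_low_bits(0x3f, 0x3ff) # => 0xffff, the full 16-bit field
```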
@Gargron @Clworld @unarist I should note that both the Instagram and Twitter models are only serializable between milliseconds, not overall: if two different processes handle messages in the same millisecond, they may be out of order by ID. This is basically standard. We can handle it from the Link headers by just zeroing out the last bits, with effectively no ill effects.
@Gargron @Clworld @unarist Oh, sorry, yes, if you narrow to one shard on the database. However, note that their low bits still wrap around, because they do a modulus on a Postgres sequence. There's no good way to reset that to zero every millisecond in Postgres.
(I maintain that this isn't a problem, and if two statuses that arrived in the same millisecond show up out of order, nobody will notice/care. We just have to make sure they aren't skipped.)
@Gargron @aschmitz @unarist I think what this means is:
The snowflake IDs Mastodon currently uses are not sequential within a millisecond.
So, not returning statuses from the same millisecond can cause statuses within that millisecond to be missed in some paging cases.
Returning same-millisecond statuses helps clients fetch all statuses, but it causes duplicates when multiple statuses were created in the same millisecond.
So, when calling the REST API, clients should dedup, ignoring statuses they have already received.
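That client-side dedup step could look like this (a minimal sketch; the helper name and status shape are made up):

```ruby
require 'set'

# Minimal client-side dedup across successive REST API pages:
# Set#add? returns nil when the ID was already present, so previously
# seen statuses are filtered out.
def dedup(statuses, seen_ids)
  statuses.select { |status| seen_ids.add?(status[:id]) }
end

seen = Set.new
page1 = [{ id: 3 }, { id: 2 }, { id: 1 }]
page2 = [{ id: 5 }, { id: 4 }, { id: 3 }] # id 3 repeated at the range border
dedup(page1, seen).size               # => 3
dedup(page2, seen).map { |s| s[:id] } # => [5, 4]
```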
@Gargron @aschmitz @unarist
There are 2 problems:
1. ID non-sequentiality within a millisecond. This causes statuses to be missed when fetching new statuses.
2. The Redis key precision problem.
Additional filtering of the Redis result can solve 2, but cannot solve 1.
Returning same-millisecond statuses at the range border can solve both 1 and 2, but it occasionally duplicates statuses, so clients should dedup them.
The public page should also do some additional filtering for dedup (I think the public page does not have a 'read new' function).
@Clworld @unarist @Gargron The proposed fix returns statuses with IDs:
> since_id & ~0xffff
and
< max_id | 0xffff
but excludes the statuses whose IDs are exactly since_id or max_id.
In the (somewhat uncommon) case that two toots occurred in the same millisecond, this will return a spurious duplicate that the client will have to dedup (or otherwise deal with).
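In code, that range check might look like this (a sketch with hypothetical names, following the two expressions above):

```ruby
# Widen since_id down and max_id up to whole-millisecond boundaries
# (the 16 low bits), then drop the two boundary statuses themselves.
MASK = 0xffff

def in_widened_range?(id, since_id:, max_id:)
  id > (since_id & ~MASK) &&
    id < (max_id | MASK) &&
    id != since_id &&
    id != max_id
end

since = (1 << 16) | 3
max   = (2 << 16) | 7
in_widened_range?((1 << 16) | 5, since_id: since, max_id: max) # => true (same ms as since_id)
in_widened_range?(since, since_id: since, max_id: max)         # => false (boundary itself)
```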
@aschmitz @unarist @Gargron Note: when filtering the Redis result, we should take care that the API still returns up to DEFAULT_STATUSES_LIMIT statuses whenever more statuses exist in the range. Returning fewer than DEFAULT_STATUSES_LIMIT statuses signals that there are no more statuses in the requested range. Since we remove the max_id and since_id statuses themselves, we should fetch 2 more statuses from Redis.
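A sketch of that over-fetch (DEFAULT_STATUSES_LIMIT is Mastodon's real constant; the helper name and its value here are assumptions for illustration):

```ruby
DEFAULT_STATUSES_LIMIT = 20 # assumed value for illustration

# Because the since_id and max_id statuses themselves are removed after
# filtering, fetch DEFAULT_STATUSES_LIMIT + 2 candidates from Redis so
# that a full page can still be returned when more statuses exist.
def page_from_candidates(candidate_ids, since_id:, max_id:)
  candidate_ids
    .reject { |id| id == since_id || id == max_id }
    .first(DEFAULT_STATUSES_LIMIT)
end

candidates = (1..(DEFAULT_STATUSES_LIMIT + 2)).to_a.reverse # 22 candidates, newest first
page_from_candidates(candidates, since_id: 1, max_id: 22).size # => 20
```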
@Clworld @unarist @Gargron I made some notes to move this off a super-long thread: https://github.com/tootsuite/mastodon/issues/5228
@aschmitz @Gargron @Clworld I like option 4 too, although we should take care that it only skips the *most recent* millisecond. If the latest status was created long enough ago, it shouldn't be skipped. We may want to determine the "most recent millisecond" in the DB, since the DB and Rails can be running on different servers (with different clocks).