19:01:55 Hi guys, I was here a couple weeks ago asking for research project ideas for my master seminar course. Moneromooo suggested I take a look at new methods for selecting ring inputs for large ring sizes. So I went ahead and read the Miller et al paper that discusses binning and feel that its a good place to start for my project. I have a few questions for you guys (Apologies for the incoming wall of text): 19:02:09 * humandoinghumant sent a long message: < https://matrix.org/_matrix/media/r0/download/matrix.org/AKZoSvNXfKWAqsrIoLfElzBF/message.txt > 19:05:39 If a bin consists of N contiguous outputs, it can be described as a start offset plus a small number for how many outputs to pick. This is smaller than a start offset plus deltas for each of the bin members (which would be one byte per extra bin member). 19:05:57 If bins are always the same size, the small number can also be omitted. 19:06:47 Performance should not be impacted. If anything, cache locality might make it a bit faster, but it'll probably be within noise. 19:07:29 I'm not sure what you're asking about ring reuse. 19:11:12 cool thesis idea 19:11:58 "how to select decoys without being spotted" sort of stuff. 19:23:20 Keep in mind that Miller et al. discuss a method for ensuring bin assignment that can't be easily gamed by miners 19:24:36 Malicious miners may attempt to order transactions to pack blocks in malicious ways 19:33:51 That would be quite cool. Picking something like 8 rings of 8 contiguous outputs, or 16 x 4 19:36:41 How to choose bins can also be deterministic, ie: pick a seed, generate a set of bin offsets deterministically, add an offset so one of the chosen bins includes your real output, and your ring is (seed, offset). 19:37:32 ^ that would be ideal 19:38:31 But this leaks probabilistic age: the higher the offset, the older the real out is likely to be. Can be countered though. 19:41:24 moneromooo: Of course! Sorry, in hindsight that was pretty straightforward and I should have thought of it myself, but thank you. 19:41:24 As for ring reuse, you mentioned it was a related subject so I was just wondering if there was any specific literature that covered it just like how miller et al discussed binning. 19:41:47 Thanks for the reminder Sarang. 19:42:16 I can't think of any. Ring reuse or partial ring reuse can leak information about which outputs are spent. 19:42:39 The simplest example is: if a N-ring is used N times, you can deduce all the outputs it references are spent. 19:43:09 There's a tool (blockchain_blackball.cpp) which calculates such things. It's not exhaustive though. 19:45:11 Good to know, thank you! And dang, deterministically choosing bins is also a good idea... 19:46:10 If it's deterministic, it also needs a reference height so the distribution references past outputs. This leaks the rough date/time the tx was made. 19:47:01 That can be quantized to leak less, but too coarse quantization will make the distribution start to be different from the expected spend distribution, so it's a tradeoff. 19:47:05 What if the bins were also counted the way outputs are, from 0? 19:47:46 That's half a question. There's a premise, but no... how would you call it... 19:48:01 implication candidate maybe ? 19:48:14 Typing on mobile, gimme a sec :p 19:49:32 So if I'm spending output 175934 and the bin size is 8, that bin includes all outputs from 175928 to 175935 19:50:28 That way you only need to specify the bin numbers 19:50:45 Yes, that works if they're all aligned. 19:51:43 That way, with a ring size N, each bin can be reused N*8 times before being blackballed 19:52:02 Would that improve anything? 19:52:24 How would it compare to non-aligned bins? 19:52:49 If there's one bin per ring, it doesn't change anything. If there are multiple bins per ring, it doesn't follow AFAICT. 19:53:20 But maybe it depends on how you set those bins up. 19:53:39 I'm considering the scenario of multiple bins, so like ring size 64 with 8x8 bins 19:56:38 I'm not sure what you're implying. If you consider a given bin, it can be reused many more times than 8x8==64 since there are other candidates for the real spend in other bins. Is that irrelevant to your point ? 19:57:43 The reusability would be bin size * ring size, no? 19:58:01 So 64x8 in this example ? 19:58:06 Yes 19:58:28 One thing I like doing to sanity check these things is take extreme numbers to see if they pass the smell check. 19:58:43 So imagine you have bins of size... the entire size of the chain. 19:59:23 If there are a million outputs on the chain, you'll need a million txes before you can say all outputs are spent. 19:59:37 Ring size would be a million. Bin size would be a million too. 20:00:05 So your argument seems to break here, 20:00:45 No, because each new tx would be a adding an output to the chain 20:00:48 Picking a bin the size of the entire chain kinda breaks the point of having a bin, doesn't it? 20:01:02 Not if the million txes are made all at once. 20:01:11 Right 20:01:13 Well, it's the perfcet case :) 20:01:20 Kinda what zcash does. 20:01:48 (Although it would be extremely cool to have a ring signature that allowed to include "the entire chain until now" in its sig without having to go through every single output) 20:01:49 It doesn't matter whether it makes sense in practice, it's here just to check extreme cases, which are often how you can see if a theory breaks. 20:03:03 I'll mull it over over dinner and while :D might come up with something that makes sense 20:03:09 * I'll mull it over over dinner and wine :D might come up with something that makes sense 20:03:12 ty 20:39:23 Ok, here's one thing that I've noticed: output binning poses a deanonymization problem when someone is trying to spend an output as quickly as possible when bins are ordered. At worst, you can't spend an output before at least $bin_size outputs have been generated (if the output you're trying to spend is the first one in the new bin) 20:40:26 There is already a minimum output age for spends 20:41:18 Given that we currently enforce a 10 block output age before allowing an output to be spent, that would mean that the "maximum" bin size would be 11 20:42:04 Or in general, the user would have to wait max(min_output_age, bin size) outputs to be generated before being able to spend a fresh output 20:44:37 And even then you'd still reveal that your output was probably in the last bin (I think?). A good way to mitigate this for people in a hurry would be forcing one bin to always be the latest one available for spending 20:47:12 Although we could probably get rid of the minimum output age completely and only use the bin size as a reference for spending 20:47:38 "You can only spend an output if it's part of a full bin" 20:49:05 Which, given the current number of transactions per block, would actually enable users to spend their funds much more quickly in terms of block times, even with a hypothetically larger bin size (like 16, or 32) 20:51:47 Though I'm not sure how all this would be affected in case of a chain reorg. Perhaps a middle ground could be reached? Like "Wait 5 blocks AND 16 new outputs before spending an output": best case you can spend within 5 blocks (10 minutes) if there's enough usage, or worst case 16 blocks (32 minutes) if there are 32 empty blocks in a row right after you received the monero you're trying to spend 20:52:49 (I've picked 5 blocks because I remember seeing a reply on reddit earlier about chain reorgs of >=5 blocks being rare, but that number could be adjusted accordingly of course) 20:54:07 oh wait, the empty blocks scenario would actually be bad, because block rewards can only be spent after 60 blocks. So if you're in a bin full of coinbase transactions getting spent before 60 blocks you're screwed 20:58:20 The 60 blocks confirmation time of coinbase outputs throws a massive wrench in the output binning problem. Either you include them and you risk getting screwed when there's not enough chain activity, or you don't and you can't spend coinbase outputs at all (???) 22:10:02 We currently weight the output selection to help mitigate the effects of nonuniform block density 22:10:09 It's nontrivial