00:37:47 I always assume that one of us is secretly a 3-letter agency, and somebody else is probably a Fortune 500 company that is still in stealth mode about cryptocurrency adoption. :-P
00:38:21 Anyways, what do y'all think about extra decoy outputs for the sake of artificially increasing the anonymity set?
00:38:23 https://github.com/MyHush/sietch
00:38:35 I don't agree with 100% of the statements in that GitHub issue, but I do find the idea thought-provoking
00:39:25 Kind of depends on the ratios between transaction volume, output volume, and ring size
00:39:59 s/GitHub issue/Sietch writeup on GitHub
00:39:59 Isthmus meant to say: I don't agree with 100% of the statements in that Sietch writeup on GitHub, but I do find the idea thought-provoking
00:42:42 This has been brought up several times over the years
00:50:29 The previous context was mainly about the idea of reducing the effects of entities colluding with known outputs to reduce the effective anonymity set
01:38:00 <[keybase] unseddd>: sarang: agree that increasing the anonymity set is a good idea, along with removing the ability of malicious actors to collude in a way that reduces the anonymity set for all users. am advocating for not using an algorithm that is too greedy, rejecting legitimate auxiliary input(s), claiming they are malicious
01:41:52 You can't prevent entities from colluding
01:42:55 Nor can you reliably detect which outputs belong to such entities without significant external information
01:43:33 <[keybase] unseddd>: what i mean is removing user control over vulnerable aspects of the protocol as much as possible, e.g.
deterministic tx-building to defeat Janus attacks
01:43:54 FWIW the percentage of outputs that need to be controlled by colluding entities (or otherwise suspected/known to be non-signers) to destroy a ring signature is nontrivial
01:44:01 and keep in mind that this is very time-based
01:44:12 since decoy selection by default is not uniform over time
01:44:26 unseddd: decoy selection is not protocol enforced
01:45:04 There have been proposals to do so, but they do not play nicely with accurate spend distribution estimates
01:46:32 <[keybase] unseddd>: great! that should mean implementing/integrating hardening changes there won't require any consensus changes / hard-forks, right?
01:46:49 What kind of change?
01:47:13 Changes to non-enforced decoy selection are, by definition, not enforced =p
01:47:56 <[keybase] unseddd>: still don't fully understand the idea of non-uniform distribution, if spend distributions can't be analyzed on Monero
01:48:33 They can't directly
01:48:47 But some early transactions can be
01:48:54 and this distribution can be compared to transparent chains
01:49:01 They turn out to be similar
01:49:26 <[keybase] unseddd>: the distribution algo from stdlib also causes clang builds to fail for Monero, so replacing it with a uniform random distribution would fix two things
01:49:27 This is assumed to be a reasonable approximation
01:49:43 A uniform random selection from the chain is not suitable
01:49:58 <[keybase] unseddd>: maybe Monero used uniform random distribution before, and it caused issues?
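The age-weighted selection described above can be sketched roughly as follows. This is an illustrative toy, not Monero's actual wallet code: the gamma parameters over log-seconds are of the kind reported in the spend-age literature and should be treated as assumptions, as should the helper names.

```python
import math
import random

# Illustrative gamma parameters over log(age in seconds); treat as assumptions,
# not the exact values any wallet ships with.
SHAPE, RATE = 19.28, 1.61
BLOCK_TIME = 120  # target seconds per block

def pick_decoy_age_blocks(rng):
    """Sample a plausible spend age: gamma over log-seconds, mapped to blocks."""
    log_age = rng.gammavariate(SHAPE, 1.0 / RATE)  # log of age in seconds
    age_seconds = math.exp(log_age)
    return max(1, int(age_seconds / BLOCK_TIME))

def select_decoys(num_outputs, ring_size, rng=random):
    """Pick ring_size - 1 distinct decoy indices, skewed toward recent outputs.

    Output indices grow with chain height, so "age in blocks" maps to
    num_outputs - age: small ages land on recent (high) indices.
    """
    decoys = set()
    while len(decoys) < ring_size - 1:
        age = pick_decoy_age_blocks(rng)
        if age < num_outputs:              # discard ages older than the chain
            decoys.add(num_outputs - age)  # newer outputs have higher indices
    return sorted(decoys)
```

The point of the sketch is the shape, not the constants: most sampled ages land within a few thousand blocks, matching the observation later in the log that newer outputs are much more likely to be spent.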
01:50:01 because outputs are not equally likely to be spent as a function of their age on chain
01:50:19 Newer outputs are _much_ more likely to be spent
01:51:05 If the selection algorithm differs significantly from the expected spend-age distribution, you can build a heuristic for the most likely signer based on this
01:51:27 So while moving to a simple distribution is useful for protocol-enforced decoy selection, it is terrible against adversarial heuristics
01:52:15 If there are problems with the current distribution, it might be possible to write a custom version that builds properly with other tools
01:52:21 (I have not run into this problem personally)
01:57:15 <[keybase] unseddd>: try building Monero with clang, you'll run into the issue
02:00:44 I want R = fxn[seed, height]
02:01:08 seed is any integer, height is a block height
02:01:14 <[keybase] unseddd>: also advocating for enforcing decoy selection, at least in some of the ways mentioned for Janus mitigation, i.e. making tx-building verifiable, even if optional
02:01:31 Isthmus: ?
02:01:52 And the output R is a set of ring members [technically, a list of output indices] that satisfies our decoy selection algorithm
02:01:53 Oh, you mean deterministic ring selection
02:02:00 Yes, this method exists
02:02:08 but in general requires inverse transform sampling
02:02:16 Oooh, tell me more
02:02:33 <[keybase] unseddd>: Isthmus: you want fxn? ;p
02:02:36 Oh, having to generate sets of R's to find one that includes your output?
02:02:44 fxn, yes plz
02:03:06 http://spar.isi.jhu.edu/~mgreen/mixing.pdf
02:03:13 For simple distributions it's efficient
02:03:19 (I have simple code that shows examples of this)
02:03:30 For more complex distributions, not so much
02:03:39 <[keybase] unseddd>: _gib Isthmus fxn_
02:03:45 Because the verifier needs to use the seed data to reconstruct the output set
02:04:39 The paper shows it for uniform sampling and the old triangular sampling
02:05:09 O rly?
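The R = fxn[seed, height] idea above can be sketched with inverse transform sampling: a keyed hash produces uniform values, and an inverse CDF maps each one to an output index, so anyone holding the seed reconstructs the same ring. This is a hypothetical sketch using the old triangular distribution mentioned in the log (one of the simple cases the paper covers), not any deployed scheme; the function names are invented for illustration.

```python
import hashlib
import math

def prf_uniform(seed: bytes, counter: int) -> float:
    """Keyed hash as a PRF mapping (seed, counter) -> uniform float in [0, 1)."""
    h = hashlib.blake2b(counter.to_bytes(8, "big"), key=seed, digest_size=8)
    return int.from_bytes(h.digest(), "big") / 2**64

def deterministic_ring(seed: bytes, num_outputs: int, ring_size: int):
    """Derive a ring of distinct output indices from a seed.

    Inverse transform sampling over a triangular distribution:
    u in [0, 1) maps to floor(N * sqrt(u)), whose CDF is (x/N)^2,
    so newer (higher-index) outputs are more likely.  Any verifier
    with the seed recomputes exactly the same set.
    """
    ring, counter = [], 0
    while len(ring) < ring_size:
        u = prf_uniform(seed, counter)
        idx = int(num_outputs * math.sqrt(u))
        if idx not in ring:
            ring.append(idx)
        counter += 1
    return sorted(ring)
```

As the log notes, the signer would then grind seeds until the derived set happens to contain their real output, which is cheap only when the distribution has a closed-form inverse CDF like this one.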
02:05:13 yes
02:05:20 * Isthmus scopes it out
02:05:24 but our distribution is not nearly so straightforward as lines
02:05:39 It relies on keyed hash functions
02:06:09 Here's a weird, maybe heretical question
02:06:48 How closely does our decoy selection algorithm need to match the real spend-time distribution?
02:07:03 If we ask "what is the ideal decoy selection algorithm", the trivial answer is "our best approximation of the spend time"
02:07:03 <[keybase] unseddd>: not familiar, will check it out. perhaps uniform-random is naive, as am still familiarizing myself with Monero attack surface / design choices. however, continuing to rely on a heuristic based only on information exposed by the earliest txes seems a large technical debt
02:07:17 But is this the *only* possible approach that would provide adequate cover?
02:08:54 Miller et al. look at this difference
02:09:28 Well, it's important to note that there is (ideally) no way to check the validity of age-based guesses
02:10:01 So on their own, they provide no particular provable data
02:10:43 and it's not at all clear what a quantifiable definition of "plausible deniability" for a ring signature looks like in practice
02:11:28 unseddd: how does the selection algorithm imply technical debt?
02:12:09 We have no particular reason to think that Monero spend-age distributions differ from the combination of known Monero spend data and Bitcoin spend data, which show the same trends
02:12:14 (and we can't check this anyway)
02:12:21 <[keybase] unseddd>: unfortunately don't think practical knowledge of plausible deniability will be known until it's used in a court case or similar
02:12:31 and FWIW the distribution is not that complicated
02:12:51 it just doesn't play as nicely with sampling that would make deterministic selection reasonable
02:15:22 <[keybase] unseddd>: since the effect of uniform random selection on plausible deniability is unknown, and the effect of non-deterministic selection is known (Janus), would favor mitigating the latter pending more research on the former
02:17:19 I don't really think much about "plausible deniability," since it's an artificial construct that will vary by every jurisdiction and decade. My only interest is in statistical obfuscation.
02:18:09 Janus is about subaddress faking, not decoy selection
02:18:21 And uniform selection is a terrible option
02:18:35 <[keybase] unseddd>: basically, if we can measure the effects of uniform random distribution on selection, and it is found to not break anything, let's use uniform random
02:18:35 Under all but the simplest risk assumptions
02:19:01 Uniform selection doesn't break anything except statistical expectations of spend ages
02:19:02 <[keybase] unseddd>: why terrible?
02:19:12 Because it basically gives away the likely signer
02:19:35 And an adversary can use that to weight the likely true tx graph
02:19:43 <[keybase] unseddd>: statistics that cannot be verified, or even measured, on the current chain?
02:20:03 No, but they can be used as part of broader graph techniques
02:20:13 And they're a really good heuristic
02:20:59 <[keybase] unseddd>: giving away the likely signer is disastrous, if that really is the effect, definitely no uniform random
02:22:33 <[keybase] unseddd>: good heuristic based only on unspent early txes, right? am misunderstanding something? create a turnstile-like protocol, and remove the ability to perform those heuristics. problem solved, no?
02:25:32 That's not really the point
02:25:59 The point is that we know how users tend to spend outputs based on other chains and early Monero data
02:26:41 <[keybase] unseddd>: thought removing heuristics was the point, where have i gone wrong?
02:27:13 Selecting decoys according to expected spend patterns is the mitigation to this heuristic
02:27:32 That's exactly why we use the algorithm we do
02:27:41 And continue to iterate on it
02:28:37 <[keybase] unseddd>: right, just trying to think of a more permanent solution, so you do not have the technical debt going forward
02:28:53 How is it technical debt?
02:29:08 I don't really follow
02:29:57 <[keybase] unseddd>: having to continually update a selection algo based on best guesses and heuristics sounds like technical debt to me
02:29:58 yeah, the output selection doesn't create technical debt afaict.
02:30:24 it's more like a technical burden
02:30:51 a lot of aspects of security are continually moving targets
02:30:56 <[keybase] unseddd>: burden == debt ...
02:31:10 It's not ideal, but it is a consequence of ring signatures
02:31:31 well yeah, semantics
02:31:32 you can't really pick a single algorithm and carve it into stone. it'd be like the Maginot Line, someone will walk around it.
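The "uniform selection gives away the likely signer" claim above can be made concrete with a toy simulation: if real spends skew young while decoys are drawn uniformly over the whole chain, an adversary who simply guesses the newest ring member wins far more often than the 1/ring-size baseline. All parameters here are illustrative assumptions, not measured chain data.

```python
import math
import random

RING_SIZE = 11
CHAIN_LEN = 1_000_000  # total outputs on a hypothetical chain (in blocks of age)

def real_spend_age(rng):
    """Real spends skew very young (gamma over log-seconds; illustrative params)."""
    return max(1, int(math.exp(rng.gammavariate(19.28, 1 / 1.61)) / 120))

def guess_newest_accuracy(trials, rng):
    """Simulate rings with uniformly chosen decoys; the adversary guesses the
    ring member with the smallest age (newest output) as the true spend."""
    hits = 0
    for _ in range(trials):
        decoy_ages = [rng.randrange(1, CHAIN_LEN) for _ in range(RING_SIZE - 1)]
        real_age = real_spend_age(rng)
        hits += all(real_age < a for a in decoy_ages)  # newest member is the real one
    return hits / trials
```

Under these assumptions the newest-member guess succeeds in the large majority of trials, which is exactly why matching the decoy distribution to expected spend patterns is described above as the mitigation.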
02:31:53 simple answer is ringsize a bajillion
02:32:03 The goal is transaction uniformity, which depends in part on usage patterns
02:32:15 We need to adapt to them as best we can
02:32:29 This is part of it
02:33:15 <[keybase] unseddd>: hyc: get that security, and the fight for privacy, is ever-moving, just trying to think of things that will make Monero have to move less in this particular direction. totally understand that i do not comprehend the full picture yet
02:33:56 those are good thoughts to have. the less human hands the better
02:33:56 There are techniques like binning that can apply better at larger ring sizes
02:34:26 These have the added advantage of reducing communication complexity
02:35:09 <[keybase] unseddd>: sarang: are those ring sizes practical?
02:35:19 That's the current goal
02:35:25 We're getting there
02:36:11 <[keybase] unseddd>: ok awesome! then will direct my attention there :)
02:36:18 I assure you it is not an easy problem to solve
02:36:47 nonsense! just put it on a blockchain!
02:36:58 Brilliant
02:37:17 <[keybase] unseddd>: no, wouldn't think it is, but am willing to throw some braincells at it
02:37:24 Please do
02:37:43 Nobody has totally solved it yet
02:38:33 <[keybase] unseddd>: is there a tracking issue for ongoing efforts?
02:39:28 the current effort is in CLSAG, isn't it?
02:40:26 CLSAG is more of a stopgap
02:40:52 <[keybase] unseddd>: oh, so CLSAG and its successors would enable large enough ring sizes for binning?
02:41:07 Other efforts include Omniring, Triptych, Lelantus, RCT3, Arcturus
02:41:19 CLSAG would not
02:41:38 Some are externally developed, others are in-house
02:45:41 <[keybase] unseddd>: sarang: is there a paper(s) or description of the binning technique you mentioned?
02:48:15 unseddd: https://eprint.iacr.org/2019/186.pdf
02:48:22 i think that's the reference
02:49:20 https://petsymposium.org/2018/files/papers/issue3/popets-2018-0025.pdf
02:51:25 whoops
03:00:40 <[keybase] unseddd>: gingeropolous, sarang: many thanks :)
03:29:18 <[keybase] unseddd>: yeah, that paper makes it obvious that uniform distribution is terrible for input selection. sorry for wasting time suggesting it
04:25:15 <[keybase] unseddd>: would be interesting to re-run the POPETS18 analysis using the most current chain data, to see if their results still hold. still need to read the IACR paper
07:48:39 <[keybase] unseddd>: by re-running the analysis, i only mean using the current 11-mixin with gamma distribution, and comparing with, say, an 11-mixin, (3,4)-bin binning strategy
07:51:56 <[keybase] unseddd>: the 7-mixin, 4-bin gives 4.0 min-untraceability in the POPETS18 paper, so curious to see if the 11-mixin, 4-bin strategy gets close to the ideal 1/11 chance of 100% traceability
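The binning strategy discussed above can be sketched as follows: instead of choosing every ring member independently, choose a few bins of temporally adjacent outputs and hide the real spend inside one bin, so timing heuristics cannot separate members within a bin. This is a toy illustration of the idea, not the construction from either linked paper; the function names and parameters are invented for the sketch.

```python
import random

def binned_ring(real_idx, num_outputs, num_bins, bin_size, sample_center, rng):
    """Build a ring as num_bins bins of bin_size consecutive output indices.

    The real spend is hidden inside one bin; the remaining bins are placed
    around centers drawn by sample_center (e.g. a spend-age-shaped sampler).
    Consecutive indices also shrink what must be communicated per ring:
    one starting index per bin instead of one index per member.
    """
    def bin_around(center):
        # Random offset so the anchor index is not always at the bin edge,
        # clamped to stay on-chain.
        start = max(0, min(center - rng.randrange(bin_size), num_outputs - bin_size))
        return list(range(start, start + bin_size))

    bins = [bin_around(real_idx)]          # bin containing the real spend
    while len(bins) < num_bins:
        b = bin_around(sample_center(rng))
        if real_idx not in b:              # decoy bins must not cover the real spend
            bins.append(b)
    rng.shuffle(bins)                      # hide which bin holds the real spend
    return [i for b in bins for i in b]
```

A (3,4)-style configuration as mentioned in the log would be num_bins=4 with small bins; the POPETS18 paper linked above evaluates how such strategies trade off against per-member sampling.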