11:04:18 sarang: possible avenue for publication? https://www.mdpi.com/journal/cryptography/special_issues/Preserve_Enhance_Privacy
11:04:27 or anybody else interested for that matter ^
14:20:26 I'll share this later at the meeting, but I completed an analysis comparing the spend-age distributions of deducible coinbase outputs to those of non-coinbase outputs
14:20:46 Here's a plot: https://usercontent.irccloud-cdn.com/file/Wa3Y2Zrs/cdf.png
14:21:08 I also included the gamma distribution from Miller et al. (dashed line)
14:21:34 The distributions for coinbase/non-coinbase/all are extremely close; it's tough to differentiate the lines in this CDF
14:52:07 is that good or bad?
14:55:37 I think it's a good thing for the current selection algorithm; or at least pretty neutral
14:56:01 If the spend-age distribution was very different for coinbase, it could indicate the need for a change to that part of output selection
14:56:27 This doesn't necessarily remove the need to consider coinbase outputs differently; but it is good data to know
14:56:54 It's also a good confirmation of the Miller et al. parameters
14:57:03 and is reproducible
14:58:58 Getting recent data from a chain like bitcoin would still be useful, since all this data is from deducible transactions, the vast majority of which are quite old at this point
14:59:29 and the Miller et al. paper's look at the bitcoin chain found its spend patterns trended newer than for monero
15:00:03 I'll share some additional plots on transaction types and deducibility over time during the meeting
16:27:31 Weekly research meeting begins here at 17:00 UTC (about half an hour from now)
16:27:33 .time
16:27:33 2020-06-17 - 16:27:33
16:27:36 good bot
16:41:07 sarang, over time does the distribution get stretched out? so more recent tx will tend to have older members than historical tx?
16:41:44 I'm running that but don't have windowed results just yet
16:42:52 my thought is the analytical 'gamma distribution' might be less accurate than a numerically generated 'pure selection distribution'
16:44:10 While it's not directly comparable, the Miller analysis did identify a shift from newer to older on the bitcoin blockchain when looking at more recent transactions
16:44:34 See page 11, figure 10: https://maltemoeser.de/paper/monerolink.pdf
16:45:50 I'm referring to the decoy distribution, as in a gamma selection over 10k blocks will pick more recent tx than a gamma selection over 100k blocks
16:46:22 Yes, but there's a pretty sharp decay
16:46:22 which would imply no global 'gamma distribution' curve
16:46:29 I get what you mean though
16:46:36 it can't reasonably scale forever
16:47:06 There have been recommendations (anecdotal and in wallet) to avoid very old spends for this reason
16:47:19 Or rather, to carefully note that they may stick out
16:47:27 yes for sure, although my immediate concern is the graph you posted :p
16:48:05 FWIW that data is all old... there's no ground-truth data for after the CT transition except for old spends
16:48:37 But yes, I'm running time windows as well to watch the difference in distribution
16:49:26 There's also the meta-question of diminishing returns
16:50:06 Using any kind of reasonable approximation to spend patterns (like the current selection distribution) is of huge value over the previous iterations, but small tweaks may be of limited marginal value compared to other possible heuristics
16:50:10 e.g. output merging
16:50:12 etc.
16:58:24 OK, we'll start the meeting presently!
16:58:45 The usual agenda: https://github.com/monero-project/meta/issues/474
16:58:52 Logs will be posted there after the meeting
17:00:11 Let's start with GREETINGS
17:00:13 Hello!
17:00:52 Hi
17:01:54 Greetings_
17:03:05 hello
17:03:16 Hi
17:03:52 Righto, on to ROUNDTABLE, where anyone is welcome to share research of interest
17:04:11 Isthmus_: noticed you just added to the agenda; care to go first?
17:06:26 If Isthmus_ isn't quite ready, I can share a few things
17:06:35 Ah come back to me in 5
17:06:41 juggling something else real quick
17:06:46 I've done some further analysis on transaction tracing
17:06:54 I'll post a few plots of interest here... one sec
17:07:29 All transactions, by date and type: https://usercontent.irccloud-cdn.com/file/Pqx68K2E/all.png
17:07:56 I like hidden, transition, denominated
17:07:57 The same data, scaled: https://usercontent.irccloud-cdn.com/file/Cm4J3FRc/all-scaled.png
17:08:42 Deducible transactions, by date and type (same scale as first plot): https://usercontent.irccloud-cdn.com/file/rcPhKQZ4/deducible.png
17:08:55 Note that "deducible" means "at least input deducible" for this analysis
17:09:42 From the deducible transactions, I ran spend age distributions and further categorized by coinbase and non-coinbase
17:09:55 "at least 1 input deducible"
17:10:02 The CDF of this distribution: https://usercontent.irccloud-cdn.com/file/teVxZrIe/cdf.png
17:10:15 s/input deducible/1 input deducible/
17:10:15 sarang meant to say: Note that "deducible" means "at least 1 input deducible" for this analysis
17:10:19 thanks sgp_
17:10:42 That CDF plot also includes the gamma distribution used for output selection, and first examined in the Miller et al. paper
17:11:03 yup, for transactions with multiple inputs rings (to the observer, sarang obviously knows this)
17:11:18 Notably, coinbase outputs have an essentially identical spend-age distribution to non-coinbase outputs
17:11:29 We didn't previously know if/how they might differ
17:11:55 A big disclaimer is that _zero_ "hidden"-type transactions are deducible, and don't factor into this data at all
17:12:08 We have _no_ hidden-type transactions to use as a direct ground-truth dataset for this
17:12:12 Which is a good thing!
17:13:17 The fraction of all transactions (non-cumulative) that have at least one deducible input: https://usercontent.irccloud-cdn.com/file/jiDfKje2/proportion.png
17:13:26 I'm astonished they are so similar tbh
17:13:33 One guess as to when the CT crossover happened...
17:13:40 lol
17:14:11 UkoeHB_ had asked before the meeting about if/how this distribution changes over the time period of the dataset, which is data I'm presently running and should have later today
17:14:54 But at the very least, this is useful since it both shows that the Miller distribution is reasonable, as well as suggests that coinbase outputs do not require any particular special treatment from a spend-age perspective
17:15:07 This is not to say that's the only factor in selection
17:15:15 but it is one factor that we previously had no data for
17:16:03 but it gives me initial confidence that we should separate the ring types and shouldn't make the selection of a particular different selection algorithm a showstopper
17:17:07 I'm also updating a writeup that includes scripts and instructions for how to run this data yourself
17:17:10 BTC data would obviously be nice to confirm since it's more recent
17:17:21 as well as supporting incremental updates, to make it straightforward to produce these plots over time in a consistent way
17:17:43 I strongly encourage folks to review these scripts once posted and/or run them yourself to verify my conclusions
17:17:56 as will the over time data UkoeHB_ suggested, in the case that Monero coinbase outputs were spent later than average in its early history, for example
17:18:47 Unfortunately I don't have the proper setup to run the BTC data
17:19:03 I will of course have the time-based Monero data
17:19:21 it just takes time to run the deducibility analysis
17:19:44 how will you graph time-based? the average number of blocks after the generation of a coinbase output before it's spent?
17:19:55 I'll pull some windows within the dataset and overlay them
17:20:13 Using the spend transaction as the target for the window
17:20:21 Ages are always relative to the spend transaction
17:20:51 This should make it straightforward to see any substantive changes
17:21:24 okay, I'll let you know later if I have concerns or am confused
17:21:26 Aside from this, the CLSAG audit by JP Aumasson and his colleague Antony Vennard is continuing
17:21:49 That's what I wanted to share today; were there other questions on any of this, before I pass the baton to Isthmus_ or others?
17:22:49 Isthmus_: ready to go?
17:22:55 Sure
17:23:00 Have at it!
17:23:01 Here’s our first draft of the audit framework for post-quantum security. Thoughts on mechanisms or algorithms to add?
17:23:07 https://usercontent.irccloud-cdn.com/file/mRxqUX65/image.png
17:23:17 This is the same image as posted to the agenda?
17:23:30 How are you defining the concern types?
17:23:36 TL;DR of image is:
17:23:41 Adversary definition: {Shor's Algorithm, Grover's algorithm, Fourier Fishing/Checking, Simon's Algorithm, Deutsch–Jozsa algorithm, Bernstein–Vazirani algorithm (Hidden Linear Function Problem), Possibly vulnerable to a future method employed by a Quantum Computer but lacking any known algorithm}
17:23:45 Attack surface: {Ring Signatures, RingCT, One-time "Stealth" Addresses, Pubkey derivation, Forge amounts?, Bulletproofs, RandomX proof-of-work, Block / Transaction hashing, PRNG, Fiat-Shamir Transform, Schnorr Signature, ??}
17:24:27 Anything jump out that we're missing?
17:25:09 Does the secrecy/privacy of amount commitments fall under RingCT?
17:25:16 "RingCT" can be interpreted broadly
17:25:18 or not broadly
17:25:42 Yea, you could label it either way
17:25:50 The questions are essentially 1) forgery, 2) unmasking amounts
17:25:55 Perhaps payment IDs could be added as well, since they're intended to be private
17:26:02 Oh yea!
17:26:16 Also: how are you defining the "concern" types on the chart?
17:26:57 keccak, (our particular usage of) chacha20
17:27:04 Oh, those "concerns" are just our research notes to each other. Not formally part of the table
17:27:14 * Isthmus_ makes note of that
17:27:34 Isthmus_: related to mooo's notes... are you concerning yourself with only on-chain stuff, or local stuff too?
17:27:56 e.g. is local wallet encryption out of scope
17:28:21 sarang may want to add any new crypto primitive from triptych ?
17:28:54 I'd certainly welcome Triptych proof analysis from that
17:29:00 since it's heavily DL based
17:29:20 ^^
17:29:21 "all DL stuff is toast" :/
17:29:26 RE on-chain vs local: Hmm, I had only been considering the on-chain stuff until now, but we could also glance at local :- )
17:29:36 I think the local stuff depends on the threat model
17:29:37 My #1 priority is attack vectors that enable retroactive deanonymization
17:29:46 Which would mostly be on-chain stuff
17:29:47 If someone gets on your machine, you have bigger worries
17:31:14 If you have ideas for local security mechanisms to check (e.g. local encryption) feel free to let me know and I'll add them to the list
17:31:20 Pubkey derivation is very general and is used in other feature/mech as a primitive, or is it more like account and subaddress?
17:31:33 Right now it looks like the biggest fundamental issue is that an adversary leveraging Shor’s algorithm can find private keys based on public keys. This means that if you give your public address to somebody, they could create a wallet with your private key and scan your entire account history (circumventing almost all privacy)
17:31:34 Yeah I also wondered what that term means
17:31:42 sounds like diffie-hellman
17:31:48 The proposed recipient encrypted data scheme in the rpd branch uses chacha20 fwiw.
17:31:58 :D
17:32:01 So neither keccak nor this are local only.
17:32:06 excellen
17:32:16 s/excellen/excellent
17:32:16 sarang meant to say: excellent
17:32:18 good bot
17:32:31 I'm really looking forward to the results of this analysis
17:32:39 Primary key to private key should be breakfast for Shor's algorithm
17:32:46 Will also look at subaddresses, etc
17:32:47 Isthmus_: do you know of other projects doing this kind of in-depth work? just curious
17:33:23 I expect an unfortunate whirlwind of "Monero is not quantum-safe! Run for it!" from this =p
17:33:48 But having a solid picture of the protocol relative to a hypothetical quantum adversary will be fascinating
17:33:55 Haha, yea we're going to add a lot of "this also applies to Bitcoin, Zcash, and anything else in your portfolio" disclaimers
17:34:05 The big question for me is whether stealth addresses are secure. If there’s a way to go from stealth addresses to private keys, we’re all toast.
17:34:20 As opposed to only toast if you've given somebody else your address
17:35:26 Isthmus_: is there anything you need from this group to assist with your current work on this?
17:35:33 Can I ask a silly question?
17:35:36 sure
17:35:57 If I send sarang a transaction, and then erase my computer, and restore from seed, will I be able to recover your address from the on-chain transaction?
17:36:18 Not without external information
17:36:22 *PHEW*
17:36:28 Okay, otherwise things were going to get scary recursive real fast
17:36:34 There have been ideas from time to time to encode this for precisely this reason
17:36:48 and you can do this in extra yourself, I suppose
17:37:09 It would be bad from this perspective, since if I get pubkeyA, then I derive privkeyA
17:37:17 It would
17:37:38 and it's certainly a design tradeoff, regardless of quantum considerations
17:37:59 Cool, that's all I have for today.
17:38:06 Thanks Isthmus_ !
17:38:12 Looking forward to future updates for this project
17:38:17 Were there other questions for Isthmus_?
17:38:45 I think that the likely conclusion is that a quantum adversary would be able to steal everyone's funds, but would not be able to link payments unless they also know your address
17:39:08 the "privacy" of stealth addresses shouldn't be obviously compromised by QC, but other far more catastrophic things would be
17:40:10 I am personally interested to see how the conclusions from this project compare to the risks of other protocols
17:40:10 Yea, I think we're going to look closely at QRL and quantum cryptocash for inspiration to address these fundamental issues first
17:40:21 you can't link blockchain to unknown address but you could link known addresses
17:40:24 That is, does the Monero protocol do better, worse, or about the same as other protocols with similar goals
17:40:43 luigi1111w: AFAIK this is also what Zcash has concluded about their quantum resistance
17:40:50 (as a comparison)
17:41:09 "you can't link blockchain to unknown address but you could link known addresses" < could you elaborate slightly?
17:41:32 given a particular address you could determine if it matches an output
17:41:45 but you couldn't derive that address from the output
17:41:58 Ah, gotcha
17:42:05 When you say "matches an output" do you mean created an output or received an output
17:42:16 received
17:42:33 although by extension also created because untraceability is compromised
17:42:48 Yep, that makes sense and matches our preliminary thining
17:42:56 s/ini/inki
17:42:56 Isthmus_ meant to say: Yep, that makes sense and matches our preliminary thinking
17:45:09 OK, did anyone else wish to share research of general interest to this group?
17:45:33 Yesterday I published an updated version of the swap
17:45:36 https://github.com/h4sh3d/xmr-btc-atomic-swap/blob/dl-proof/whitepaper/xmr-btc.pdf
17:46:02 Nice!
17:46:08 Anything in particular to comment on it here?
17:46:37 I corrected the one-time VES usage, and I confirm that it is now correct ; )
17:46:51 noted
17:47:25 thanks h4sh3d[m]
17:47:44 I'll add more details in the paper in the next days but the protocol is done
17:47:55 Before moving on, are there any other general questions, or other research topics to address today?
17:48:01 (from anyone)
17:51:00 OK, on to ACTION ITEMS
17:51:30 I'll get my analysis toolset updated and posted, as well as finalize that time-windowed spend-age distribution data and provide it on this channel
17:51:42 And continue working with the CLSAG audit team
17:52:00 I had to set aside the output merging algorithm design, but will return to it
17:52:09 Anyone else have action items they care to share?
17:53:47 I want some confirmation that we think coinbase-only rings make sense for the CLSAG update specifically
17:55:24 What data will/do you use to make this assessment?
17:55:51 obviously I support this as-is, even without more coinbase vs non-coinbase data
17:56:47 noted
17:57:03 Is there particular data you think would help assess this, that we currently do not have?
17:57:19 I see it as an incremental improvement either way, even more so if we can adjust to fit a different selection algo for each
17:57:50 Such an algorithm likely wouldn't need to separately account for spend age, as we now know
17:58:23 makes implementation even easier then
17:59:00 All right, our hour is just about up
17:59:10 Are there any last questions or comments before we adjourn?
17:59:13 I have a minimum-age algorithm on a whiteboard that might help with this
17:59:19 but haven't ported over to data queries yet
17:59:22 What do you mean Isthmus_?
18:00:30 Every single monero output has a minimum plausible age. If you follow up the transaction tree far enough, you'll always encounter coinbases.
18:00:46 So I can point at any output and say, "it is no younger than N hops from this coinbase"
18:01:10 Ah, got it
18:01:12 Ah I'm late for another meeting crap
18:01:13 g2g
18:01:16 * Isthmus_ bolts to a zoom room
18:01:18 Yeah, would love to see what you come up with
18:01:25 OK, I suppose we can adjourn
18:01:32 Thanks to everyone for joining in!
18:01:36 is that necessary for this specific change, or is that mostly for a broader analysis?
18:01:38 * sarang goes to post logs to the agenda
18:01:43 Thanks for hosting
18:02:02 (I see it as the latter personally)
18:03:55 (waiting for people to join the call)
18:05:00 It's part of a bigger picture. Relating to coinbase-only, I like the idea of changing blockchain tracing capabilities from "came from *this* coinbase" to "came from *a* coinbase"
18:05:20 Ah call starting now :- /
18:05:59 Isthmus_: thanks, agreed on both counts there
19:08:05 Isthmus if tx private keys are produced deterministically (as has been suggested by some) then for subaddress recipients you could multiply the tx pub key by the inverse of the tx private key to see the recipient's spend key. Tx private keys are randomly generated by the core software atm.
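The relation in the 19:08:05 message can be written out as a short sketch. It assumes the usual subaddress construction, in which a sender with transaction private key r publishes the transaction public key R = rD for a recipient whose subaddress spend key is D. The symbols r, R, D and the group order \ell are notation introduced here for illustration, not taken from the log.

```latex
R = rD
\quad\Longrightarrow\quad
\left(r^{-1} \bmod \ell\right) R
  = \left(r^{-1} r \bmod \ell\right) D
  = D
```

Under that assumption, anyone who can recompute r (for example a sender restoring from seed, if r were derived deterministically) could recover D from the on-chain R.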
19:09:27 and there may be a way to extract someone's private view key from transaction data if keccak preimages can be found, since output commitments can be brute forced if the DLP is broken
19:10:24 and if the view key is known then the spend key is trivially found
19:10:47 so keccak preimage is the stealth address line-of-defense
19:17:04 Time-windowed spend-age CDF, all deducible outputs: https://usercontent.irccloud-cdn.com/file/5EccXpmE/cdf_window.png
19:19:21 If anyone has trouble interpreting due to the colors, let me know and I'll see what I can do
19:19:54 ^ UkoeHB_ in particular had been curious about this data
19:20:37 Note that all these CDF plots take into account the change in target block time, but do not otherwise take into account block timestamps
19:20:53 They assume fixed constant block targets specified by the protocol
19:46:36 yeah kinda lines up with my expectation
19:47:57 wait nvm those are all pre-ringct lol, so not referring to decoys at all
19:53:25 These are actual spend ages for those blocks
19:53:34 Nothing more, nothing less
20:29:27 it's actually slowed down over time? lol
20:29:32 my intuition is so wrong
20:30:19 still they are all mostly similar
21:20:53 Sarang what is the "output merging" you mentioned?
21:37:08 Looks like an IRCCloud error
21:37:16 Any missed messages recently?
21:44:32 xmrmatterbridge> Sarang what is the "output merging" you mentioned?
21:47:59 thanks scoobybejesus
21:48:23 cankerwort: suppose that (for some reason) a single transaction directed multiple outputs to you
21:48:49 no prob. that was the only one
21:49:12 If you later generate a transaction where these outputs are spent, each such output will be contained within a different ring in the transaction
21:49:22 This may also occur by chance due to the selection process, of course
21:49:37 So an adversary may try to use these statistics to build a heuristic
21:52:46 This can be generalized through the graph as well
22:24:22 Very interesting thank you
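The output-merging observation described at 21:48:23 to 21:49:37 can be illustrated with a minimal sketch. It is not the algorithm design mentioned in the meeting. The data model is hypothetical: each ring is a list of (source_txid, output_index) pairs, and the function name and example identifiers are invented for illustration. The sketch only flags source transactions whose outputs appear in two or more different rings of one spending transaction, which, as noted in the log, can also happen by chance through decoy selection.

```python
from collections import defaultdict


def merged_source_candidates(rings):
    """Return {source_txid: sorted ring indices} for source transactions
    whose outputs appear in two or more distinct rings.

    `rings` is a hypothetical representation of one spending transaction:
    a list of rings, each a list of (source_txid, output_index) pairs.
    """
    rings_by_source = defaultdict(set)
    for ring_index, ring in enumerate(rings):
        for source_txid, _output_index in ring:
            rings_by_source[source_txid].add(ring_index)
    # Keep only sources that show up in at least two different rings.
    return {
        txid: sorted(indices)
        for txid, indices in rings_by_source.items()
        if len(indices) >= 2
    }


# Invented example: "tx_a" contributes a member to ring 0 and ring 1.
example_rings = [
    [("tx_a", 0), ("tx_b", 1), ("tx_c", 0)],
    [("tx_d", 2), ("tx_a", 1), ("tx_e", 0)],
    [("tx_f", 0), ("tx_g", 3), ("tx_h", 1)],
]
print(merged_source_candidates(example_rings))  # {'tx_a': [0, 1]}
```

A match here is only a candidate signal, not proof of a true spend; an adversary would have to weigh it against how often such coincidences arise from the selection process alone.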