-
bertptrs
Hi all, I've previously asked about replacing the JSON serialization in epee with a nicer library. There were some valid questions regarding the perceived benefits and potential slowdown, so I've sunk some time into testing this out. You can see my results here:
github.com/bertptrs/json-speedtest
-
bertptrs
In short: everything I've tested compiles faster, results in a smaller binary, and runs faster
-
bertptrs
Given this information, would you be willing to accept a PR that replaces the current serialization code (in the wallet-rpc etc) with something like json_dto for example?
-
kovri-slack
<woodser> this is good work
-
moneromooo
It seems odd json_dto is faster than rapidjson, yet is said to be a wrapper around it.
-
rockhouse
awesome work bertptrs!
-
moneromooo
That'd be the KV_SERIALIZE system only, right ?
-
moneromooo
(or also the FIELD one, which can do both json and binary)
-
moneromooo
Anyway, if those timings are fair and there are no hidden strings, they're enough for me to agree with the replacement.
-
bertptrs
moneromooo: rapidjson doesn't really have any deserialization primitives, so if you look at the code, you'll see that it's very ad-hoc. Most of the difference in speed between json_dto and rapidjson is due to small optimizations, such as reserve()ing the correct size for a vector
-
bertptrs
This would indeed just be for the KV_SERIALIZE system
-
moneromooo
vtnerd: ^
-
moneromooo
BTW, we do have some invalid JSON in so called binary calls, eg getblocks.bin.
-
moneromooo
The intent is extra speed. It'd be nice to look at whether the bin/notbin distinction still makes sense with json_dto.
-
moneromooo
I don't think I've ever checked how much speed is gained by this.
-
tomsmeding
The only real reason why that could improve performance is because it reduces the amount of network data used; there are ridiculously fast (en|de)coders for e.g. base64
-
bertptrs
moneromooo: for hidden strings: feel free to browse the test code, it's only ~300LOC, and most of them are trivial. For invalid JSON: that's a bit unfortunate, but I'd need to look into it
-
bertptrs
Could you give me some direction on how I could best compare the bin/not bin functions? AFAICT there's just 6 (or 9, if you count the aliases) of them, so I'm hoping it's not too much trouble
-
moneromooo
Compare ? If you mean test them, then they're called from wallet2.
-
moneromooo
ie, refresh, and getblocks.bin gets called.
-
moneromooo
The difference is KV_SERIALIZE_AS_BLOB (or similar name).
-
moneromooo
The "blob" means the vector is memcpied as is, rather than go through json dump/parse.
-
bertptrs
Ah, I see. That might be faster in some cases, question is of course how much
-
vtnerd
I have no idea why we would use json_dto - we can do the same thing without that library
-
vtnerd
it uses rapidjson internally, we might as we switch the epee json from a custom `std::map` to rapidjson, which is most of the speed is probably coming from
-
vtnerd
and json_dto is going to be slower than the work I just for zmq-json (which unfortunately has always been separate from epee-json stuff) because it writes to a rapidjson DOM first instead of writing directly to a stream
-
vtnerd
and fwiw, I can get something like an additional 3x speedup (from that last PR) in zmq-json output by using the json output used by blocks/txes (which is separate from epee! oh lord) .... but the patch required a bit more custom work
-
vtnerd
json_dto is going to b slower-> json_dto output is going to be slower
-
vtnerd
as for bin/not bin ... the time spent in hex is decent based on my timing profile I did. I looked at the ASM for the function in use, and theres nothing obvious that can be done
-
vtnerd
I re-worked the zmq stuff some more - I have some ideas on how to get something like epee but with proper .h/.cpp and support msgpack. A little worried about the advanced macro usage though
-
vtnerd
in summary, bertptrs if you had something based around rapidson for epee then it might be worth it, but Im still not seeing json_dto usage
-
luigi1111w
vtnerd fluffypony only fluffy has access to the supercop repo easily I think
-
vtnerd
hmm, ok
-
vtnerd
and unfortunately the next PR is hell. Theres a decent chunk of ASM to maintain constant-time and batched field inversions on the ECDH. But hey moo, its not C++ lol
-
vtnerd
its 95% identical to existing ASM in the library, so that should help with the review at least
-
selsta
new supercop PR incoming?
-
selsta
:D
-
vtnerd
anyway luigi1111w -> should that repo be distributed on write access? I wouldn't expect frequent changes
-
vtnerd
yes, I had a really nice benchmark utility and everything thats just sitting here on my drives which is absurd
-
vtnerd
this will differ from the ryo implementation because its using C++ function aliases (`using generate_key_derivation`, etc.)to do compile-time binding, so theres no runtime overhead if using the default portable monero library
-
moneromooo
If we can get most of the speedup by changing small things, I'm all for it.
-
luigi1111w
we just have to ping fluffypony fluffypony fluffypony until he shows up to grant access
-
selsta
I can ping him on wire but I used my weekly ping there already
-
selsta
:D
-
vtnerd
moneromooo : json_dto is doing what the epee/serialization folder is doing or what the src/serialization/json_object.h (zmq-json stuff) is doing.
-
vtnerd
more specifically its what src/serialization/json_object.h is doing - its trying to provide some utilities for mapping C++ to rapidjson
-
vtnerd
thats kind of why I'm negative on it ... we already have something similar (in another part of the code :/), and I just did a PR that accelerates its output further but not wasting cycles on creating a DOM (which json_dto does)
-
moneromooo
Odd. I had kinda assumed the speedup was from not creating a DOM :)
-
fluffypony
luigi1111w: done
-
fluffypony
see, you just have to say my name 3 times
-
vtnerd
rapidjson DOM should be more performant than the one in epee, but I've never clocked it do know the difference
-
vtnerd
well the secret is out now, all hell will break loose
-
moneromooo
bertptrs: how do you feel about checking a bastard of epee and rapidjson as vtnerd is suggesting ? :)
-
moneromooo
Not having to move the loads of KV_SERIALIZE would be nice.
-
luigi1111w
the genie has arrived
-
vtnerd
the serialize functions all take a templated storage type, so starting with a copy of `epee::portable_storage` is what you'd want. I'm not sure how much hell its going to be with the function signatures though
-
moneromooo
When I tried, it pulled lots (that patch you did not like due to the std::string needed for find).
-
moneromooo
Well, I was modifying the class, not replacing it I guess.
-
vtnerd
if you were blowing up the epee stuff entirely, then we might as well leverage `src/serialization/json_object.h`, unless there's a good reason not to use it
-
vtnerd
*yes I realize re-doing KV_SERIALIZE is not ideal, I'm saying why blow it for json_dto when we got nearly the same thing already
-
vtnerd
moo : yup, there might be some funky things when working with teh function signatures. as an example, the epee thing isn't doing `reserve` anywhere due to its design
-
vtnerd
there should be a way to work that in, but requires getting a headache first
-
endogenic
<3 rapidjson