Medium size interesting projects

I originally posted this to Reddit, but since the site is a clusterfuck, a moderator immediately took it down. Anyway, I hope to have better luck here.



Hi guys.

I have been interested in poker solvers lately. I want to understand their innards, not just use them. I found this open-source project: https://github.com/bupticybee/TexasSolver , so I had to go back to C++ (after lots and lots of years), and I was not good at it to begin with. Anyway, I like the project because it hits a pretty good balance: it is interesting, it is a manageable size so I can "recreate" it and learn from it, and it has the potential to be expanded with things I'd like to add. This is a very rare case, since most projects are either trivial (Hangman, school directory), boring or not interesting to me (a logger!!!!, yet another parser, games), or impenetrable behemoths. Do you know of other projects along these lines?

Good hypothetical examples: beer recipe software, a small and simple ERM, a statistical program (with GUI) for sports analytics, a streaming program.

Bad examples: Web frameworks, any programming tool in general, games, game of life type of stuff.

There is of course https://github.com/fffaraz/awesome-cpp , but it is so big that you still don't know what is good and what is just a waste of time, so I wanted to take a shortcut.

BONUS UNRELATED QUESTION:

The original project (the poker solver) generates a solution as a JSON file (around 50 MB) in some cases, and this carries a huge performance cost (it takes up to 30 min!! to write it to disk). I am just starting to work with the project, but I was wondering if there is a simpler option, like just using a plain text file, or something supposedly with better performance like protobuf. The file holds the complete strategy for a particular case, so it has some kind of tree structure, but I suppose it can be serialized in a simpler way without losing information.
JSON is reasonably good for text. It isn't too much bloat, just a little.
50 MB takes a fraction of a second to read and write to a disk. The time is not down to the file size; it's being squandered in the code.
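
To put a rough number on the raw write cost, here is a minimal sketch that dumps filler text to a file and times it. The function name `time_bulk_write` and the 1 MiB chunk size are made up for illustration, and the timing only covers handing the data to the OS, which may still be caching it, so treat the result as a ballpark rather than a benchmark.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes `total_bytes` of filler text to `path`
// in 1 MiB chunks and returns the elapsed wall-clock time in seconds.
double time_bulk_write(const std::string& path, std::size_t total_bytes) {
    const std::string chunk(1u << 20, 'x');  // 1 MiB of filler
    const auto start = std::chrono::steady_clock::now();

    std::ofstream out(path, std::ios::binary);
    assert(out && "could not open output file");

    std::size_t written = 0;
    while (written < total_bytes) {
        const std::size_t n = std::min(chunk.size(), total_bytes - written);
        out.write(chunk.data(), static_cast<std::streamsize>(n));
        written += n;
    }
    out.close();  // flush the stream buffer before stopping the clock

    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_bulk_write("dump.txt", 50u * 1024 * 1024)` and printing the result gives a baseline to compare against the time the solver takes for a file of the same size.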

Binary files are faster, smaller, and generally better all around ... but they are difficult to debug and validate.

You can serialize a tree easily. You can look up the example of putting a binary tree into an array to see how to do it, and that concept extends to any sized tree.
Another way to do it in C++ is to use a vector for your memory management. Instead of new, use push_back. Instead of a pointer, use the [index] integer. The deletion strategy varies: you can lazy delete and purge them later, or swap to the end of the vector and pop off, or find other ideas.
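
A minimal sketch of the vector-plus-index idea, not the solver's actual data structure: the `Node`, `Tree`, and `serialize` names and the one-line-per-node text layout are all assumptions for illustration. Each node lives in one `std::vector`, children are stored as indices, and serializing is just a walk over the vector.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical node: a label plus the indices of its children.
struct Node {
    std::string label;                    // assumed to contain no spaces
    std::vector<std::uint32_t> children;  // indices into Tree::nodes
};

struct Tree {
    std::vector<Node> nodes;  // the vector owns every node; no new/delete

    // Appends a node and returns its index (used instead of a pointer).
    std::uint32_t add(const std::string& label) {
        nodes.push_back(Node{label, {}});
        return static_cast<std::uint32_t>(nodes.size() - 1);
    }

    // Appends a node and links it under `parent`.
    std::uint32_t add_child(std::uint32_t parent, const std::string& label) {
        assert(parent < nodes.size());
        const std::uint32_t id = add(label);
        // Indices stay valid even if push_back reallocated the vector.
        nodes[parent].children.push_back(id);
        return id;
    }
};

// Flattens the tree to text, one line per node:
// "label child_count child_index child_index ..."
std::string serialize(const Tree& t) {
    std::string out;
    for (const Node& n : t.nodes) {
        out += n.label;
        out += ' ';
        out += std::to_string(n.children.size());
        for (std::uint32_t c : n.children) {
            out += ' ';
            out += std::to_string(c);
        }
        out += '\n';
    }
    return out;
}
```

Because the links are plain integers rather than addresses, the same layout can be read back into a vector in a single pass, which is what makes this form easy to dump as text or binary.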
JSON is reasonably good for text. It isn't too much bloat, just a little.
50 MB takes a fraction of a second to read and write to a disk. The time is not down to the file size; it's being squandered in the code.


I thought so too, but according to this open issue: https://github.com/bupticybee/TexasSolver/issues/52
it took 20 min to save the file. But tbf the guy says it saved more than 1 GB!!!, which is absurd for something that is pure text and should take a few dozen MB tops! As I wrote, I am just beginning with the project, but it seems there is potential for a huge optimization there. Commercial solvers don't have that issue.
It is still a code problem. You can read and write a GB PDQ too, in a few seconds on a modern computer. Whether a GB is absurd or not I leave to you -- not sure what the output really is here -- but the time is not tied to the file size. I just installed an 11 GB game over the web in far less time than what you are quoting. It could be improved, but it's not the size of the file that is taking the time here; it's either the way it is being written or some other tangential processing.

I see the coder is blaming the JSON format as unsuitable (bad pun?). I didn't look at the code, but maybe he has a library to make the JSON records, and it's taking its sweet time to bundle it up? That would do it. It may be too much work, but if that is the issue, just write the JSON format directly; don't run something that tries to do it for you.
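
As a rough idea of what "write the JSON directly" could look like, here is a sketch that appends the text straight into one string. It is not taken from the project: `StrategyMap`, `append_number`, and `strategy_to_json` are invented names, the layout just mimics an actions list plus a per-hand probability array, and it assumes the keys need no JSON escaping and every value is finite.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <map>
#include <string>
#include <vector>

// Hypothetical shape: hand name -> one probability per action.
using StrategyMap = std::map<std::string, std::vector<double>>;

// Appends `v` as decimal text; 17 significant digits round-trip a double.
void append_number(std::string& out, double v) {
    char buf[32];
    const int n = std::snprintf(buf, sizeof buf, "%.17g", v);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    out.append(buf, static_cast<std::size_t>(n));
}

// Builds {"actions":[...],"strategy":{"hand":[...],...}} with no
// intermediate JSON object tree. Assumes strings need no escaping.
std::string strategy_to_json(const std::vector<std::string>& actions,
                             const StrategyMap& strategy) {
    std::string out;
    out += "{\"actions\":[";
    for (std::size_t i = 0; i < actions.size(); ++i) {
        if (i != 0) out += ',';
        out += '"';
        out += actions[i];
        out += '"';
    }
    out += "],\"strategy\":{";
    bool first = true;
    for (const auto& entry : strategy) {
        if (!first) out += ',';
        first = false;
        out += '"';
        out += entry.first;
        out += "\":[";
        for (std::size_t j = 0; j < entry.second.size(); ++j) {
            if (j != 0) out += ',';
            append_number(out, entry.second[j]);
        }
        out += ']';
    }
    out += "}}";
    return out;
}
```

Whole values come out as `0` and `1` rather than `0.0` and `1.0`, which is still a valid JSON number. The resulting string can then be written to disk in a single call.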
I agree with you mostly. Be aware I am very green at C++, so I could be talking out of my ass, but he seems to be using a well-regarded JSON library: https://github.com/nlohmann/json , so I suppose, as you wrote, the problem is not the writing per se but the app code. What I find weird is that the author is aware of the problem and does not seem to be a novice, so I was kinda confused.

The solution files can be "big" (50 MB to 1 GB), but in the end it is just text you dump:

strategy":{"actions":["CALL","FOLD"],"strategy":{"4c3c":[0.0,1.0],"4d3d":[0.0,1.0],"4h3h":[0.0,1.0],"4s3s":[0.0,1.0],"5c3c":[0.0,1.0],"5c4c":[0.0,1.0],"5d3d":[0.0,1.0],"5d4d":[0.0,1.0],"5d5c":[0.0,1.0],"5h3h":[0.0,1.0],"5h4h":[0.0,1.0],"5h5c":[2.5824634576565586e-05,0.9999741911888123],"5h5d":[2.5824634576565586e-05,0.9999741911888123],"5s3s":[0.0,1.0],"5s4s":[0.0,1.0],"5s5c":[0.0,1.0],"5s5d":[0.0,1.0],"5s5h":[2.626265268190764e-05,0.999973714351654],"6c4c":[0.0,1.0],"6c5c":[0.0,1.0],"6d4d":[0.0,1.0],"6d5d":[0.0,1.0],"6d6c":[2.4980874513857998e-05,0.9999749660491943],"6h4h":[0.0,1.0],"6h5h":[0.0,1.0],"6h6c":[4.3094147258671e-05,0.9999569058418274],"6h6d":[4.3094147258671e-05,0.9999569058418274],"6s4s":[0.0,1.0],"6s5s":[0.0,1.0],"


Besides, JSON is terrible to work with; a 24 MB file takes forever to open on my laptop.

Well, any 24 MB text file would chug up a lot of text editors. Notepad would probably just keel over and die on the spot.

JSON shines when used to transfer requests. A lot of cloud databases and online services use it. It's not meant to be super human-readable / friendly. It's not really meant to be output/results; if it is, you would likely have a viewer that presents it prettily. I've loaded millions of large records into a database with it in less time than this poker program is running. It's all about generating the text rapidly, which for this sample data also means optimizing the number-to-text routines (some of the built-in C++ ones perform poorly; it's a well-known issue).
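
On the number-to-text point, one candidate worth trying is `std::to_chars` from `<charconv>` (C++17), which is locale-independent and writes into a caller-supplied buffer. The sketch below is only an illustration: `append_double` and `join_doubles` are made-up names, and the `snprintf` branch is a fallback for standard libraries that do not advertise floating-point `to_chars` support.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdio>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper: appends `v` to `out` as decimal text that parses
// back to the same double.
void append_double(std::string& out, double v) {
    char buf[64];
#if defined(__cpp_lib_to_chars)
    // Shortest representation that round-trips; no locale, no allocation.
    const auto res = std::to_chars(buf, buf + sizeof buf, v);
    assert(res.ec == std::errc());
    out.append(buf, static_cast<std::size_t>(res.ptr - buf));
#else
    // Fallback: 17 significant digits also round-trip, but are longer.
    const int n = std::snprintf(buf, sizeof buf, "%.17g", v);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    out.append(buf, static_cast<std::size_t>(n));
#endif
}

// Formats a list of doubles as "[a,b,c]" by reusing one output string.
std::string join_doubles(const std::vector<double>& values) {
    std::string out;
    out.reserve(values.size() * 20 + 2);  // rough guess to limit regrowth
    out += '[';
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) out += ',';
        append_double(out, values[i]);
    }
    out += ']';
    return out;
}
```

Whether this beats what the solver currently does would need measuring; the point is only that the conversion can be done without streams or temporary strings per number.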
Possibly c++44 might have std::json........