PDF files have been compared before, of course, as have various other document types. You are looking for very specific things, so that part is specific to your case. If you just want a more general "is this file the same as that one, within some tolerance" check, you might find that it has already been done somewhere, and such a thing might make a good starting point. Whether it has been done in C++ or not, I don't know. Probably someone somewhere has, C++ being popular and PDF being an old format.
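To give an idea of what "the same within some tolerance" could look like, here is a minimal sketch. It assumes you have already pulled the text out of both PDFs with some extraction tool (not shown here), and it only compares whitespace-separated tokens, treating numeric tokens as equal when they differ by no more than a tolerance. The names parseNumber, tokensMatch and textMatches are made up for this example and do not come from any PDF library.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <sstream>
#include <string>

// Try to parse the whole token as a number; returns false for non-numeric tokens.
bool parseNumber(const std::string& token, double& value) {
    if (token.empty()) return false;
    char* end = nullptr;
    value = std::strtod(token.c_str(), &end);
    return end == token.c_str() + token.size();
}

// Two tokens match if they are identical, or if both are numbers
// whose difference is within the tolerance.
bool tokensMatch(const std::string& a, const std::string& b, double tolerance) {
    if (a == b) return true;
    double x = 0.0;
    double y = 0.0;
    if (parseNumber(a, x) && parseNumber(b, y)) {
        return std::fabs(x - y) <= tolerance;
    }
    return false;
}

// Compare two already-extracted text dumps token by token.
// Whitespace differences are ignored; a differing token count is a mismatch.
bool textMatches(const std::string& left, const std::string& right, double tolerance) {
    assert(tolerance >= 0.0);
    std::istringstream leftStream(left);
    std::istringstream rightStream(right);
    std::string a;
    std::string b;
    while (true) {
        const bool hasA = static_cast<bool>(leftStream >> a);
        const bool hasB = static_cast<bool>(rightStream >> b);
        if (!hasA || !hasB) return hasA == hasB;  // both must end together
        if (!tokensMatch(a, b, tolerance)) return false;
    }
}
```

For example, textMatches("Total: 10.00 USD", "Total:   10.004 USD", 0.01) would report a match, while a value of 10.50 on one side would not. This says nothing about layout, fonts or images, so for the very specific checks you want it is only a starting point.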