If anyone is able and willing to review and help, below is the CMake configuration that I'm using to build the arm64-v8a library. Why is my custom-built library slower than the official OpenCV Android arm64-v8a release? What can I do to match its performance?