BeyondBench Leaderboard
| Rank | Model (Param) | Easy Acc (%) | Easy Inst (%) | Easy Tokens (avg) | Medium Acc (%) | Medium Inst (%) | Medium Tokens (avg) | Hard Acc (%) | Hard Inst (%) | Hard Tokens (avg) | Overall Acc (%) | Overall Inst (%) | Overall Tokens (avg) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
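Each difficulty tier reports three metrics per model: accuracy (Acc), instruction-following rate (Inst), and average response length in tokens. As a rough illustration of how the Overall columns could be derived from the per-tier numbers, here is a minimal Python sketch assuming a micro-average over all evaluated problems; the `TierResult` structure and its field names are hypothetical, not BeyondBench's actual schema.

```python
from dataclasses import dataclass

@dataclass
class TierResult:
    """Per-tier tallies for one model (hypothetical structure, not the official schema)."""
    correct: int   # problems answered correctly
    followed: int  # responses that obeyed the output-format instructions
    total: int     # problems attempted in this tier
    tokens: int    # total response tokens across the tier

def overall(tiers: list[TierResult]) -> tuple[float, float, float]:
    """Micro-average Acc (%), Inst (%), and avg tokens over all problems."""
    n = sum(t.total for t in tiers)
    acc = 100.0 * sum(t.correct for t in tiers) / n
    inst = 100.0 * sum(t.followed for t in tiers) / n
    avg_tokens = sum(t.tokens for t in tiers) / n
    return acc, inst, avg_tokens

# Example: aggregate Easy / Medium / Hard tallies into the Overall columns.
print(overall([TierResult(90, 98, 100, 12_000),
               TierResult(70, 95, 100, 20_000),
               TierResult(40, 90, 100, 31_000)]))
# -> approximately (66.67, 94.33, 210.0)
```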
Benchmark-Free Evaluation of Reasoning in Language Models
BeyondBench introduces a benchmark-free approach to evaluating the reasoning capabilities of language models, replacing traditional static benchmarks with problems generated on the fly. The system dynamically generates novel instances across 44 distinct reasoning tasks with 117 variations, so models cannot recall memorized solutions and must demonstrate genuine reasoning.
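To make the idea concrete, below is a minimal sketch of one dynamically generated task in this spirit; the task type, parameter ranges, and function name are illustrative assumptions, not BeyondBench's actual generator.

```python
import random

def make_sequence_problem(rng: random.Random) -> tuple[str, int]:
    """Generate a fresh arithmetic-sequence completion problem (illustrative task).

    Because the parameters are drawn at evaluation time, each run yields
    novel instances that cannot appear verbatim in any static training set.
    """
    start = rng.randint(1, 50)
    step = rng.randint(2, 9)
    length = rng.randint(4, 6)
    seq = [start + i * step for i in range(length)]
    prompt = (f"The sequence {', '.join(map(str, seq))} continues with which number? "
              f"Answer with a single integer.")
    answer = start + length * step  # next term after the shown prefix
    return prompt, answer

rng = random.Random()  # unseeded: a new problem instance every evaluation run
prompt, answer = make_sequence_problem(rng)
print(prompt)          # e.g. "The sequence 7, 12, 17, 22 continues with which number? ..."
print("gold:", answer)
```

In a setup like this, scoring reduces to comparing the model's reply against the gold `answer` (feeding an accuracy metric), while checking that the reply is a single integer covers the format constraint (feeding an instruction-following metric).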
If you use BeyondBench in your work, please cite:

@article{srivastava2025beyondbench,
  title={BeyondBench: Benchmark-Free Evaluation of Reasoning in Language Models},
  author={Srivastava, Gaurav and Hussain, Aafiya and Bi, Zhenyu and Roy, Swastik and Pitre, Priya and Lu, Meng and Ziyadi, Morteza and Wang, Xuan},
  journal={arXiv preprint arXiv:xxxx.xxxxx},
  year={2025}
}