Object detection has been applied in many real-world settings, such as facial recognition, traffic monitoring and medical imaging, but object detection in drawings and cartoons has received far less research attention. The Where's Wally puzzle books offer a good opportunity to apply some of these real-world methods to a fictional world. The proposed Wally detection framework is composed of two stages: i) a Haar-cascade classifier based on the Viola-Jones framework, which detects candidate regions in a scene from the Where's Wally books, and ii) a lightweight convolutional neural network (CNN) that re-labels the objects detected by the cascade classifier. The cascade classifier was trained on 85 positive images and 172 negative images. When applied to 12 test images, it produced over 400 false positives. To increase the accuracy of the models, hard negative mining was implemented. The framework achieved a recall of 84.61% and an F1 score of 78.54%. Further gains could come from improving the training data or the CNN.
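
For reference, recall and F1 follow the standard definitions in terms of true positives (TP), false positives (FP) and false negatives (FN). The sketch below is a minimal illustration of those definitions, not the evaluation code used for the framework; the function names and the example counts are hypothetical. It also shows how a precision value is implied by a given recall and F1 pair, which is plain arithmetic on the two reported scores and not a figure stated in the abstract.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for P, given R and F1 as fractions."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision for this recall/F1 pair")
    return f1 * recall / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
    print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")

    # Reported scores from the abstract, expressed as fractions.
    print(f"implied precision={implied_precision(0.8461, 0.7854):.4f}")
```

Under these definitions, the reported recall and F1 together imply a precision of roughly 73%, which suggests that remaining false positives, rather than missed detections, account for most of the gap in the F1 score.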