Classifying Arabic dialect text is a critical preprocessing step for many natural language processing tasks and helps identify the demographics of text sources. However, the considerable variability among dialects and the lack of a standardized orthography make the task highly challenging. This study explores the use of a token-free large language model for Arabic dialect classification. We fine-tune the pretrained ByT5 model, available through Hugging Face, which operates directly on raw bytes rather than relying on a traditional subword tokenizer and can therefore handle the diverse, non-standard vocabulary of Arabic dialects effectively. Comparative experiments against convolutional neural network (CNN) and long short-term memory (LSTM) baselines show that ByT5 achieves superior performance in both accuracy and robustness. Extensive evaluation on the QADI dataset confirms the ByT5 model's state-of-the-art results, with an F1 score of 74%. These findings underscore the value of token-free approaches and transfer learning in overcoming the complexities of Arabic dialect classification.
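As a minimal sketch (not the authors' released code), the following shows how such a fine-tuning setup might look with the Hugging Face transformers library, framing dialect classification in T5's text-to-text style so the model generates the dialect label as output bytes. The checkpoint name, example sentence, label tag, and hyperparameters are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of fine-tuning ByT5 for dialect classification in the
# text-to-text style. Checkpoint, example sentence, label tag, and the
# learning rate are illustrative assumptions, not the paper's exact setup.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "google/byt5-small"  # assumed checkpoint; other ByT5 sizes also work
tokenizer = AutoTokenizer.from_pretrained(model_name)  # byte-level: no subword vocabulary
model = T5ForConditionalGeneration.from_pretrained(model_name)

# One hypothetical training pair: a dialectal sentence and its dialect tag.
text = "ازيك عامل ايه"  # Egyptian Arabic greeting (illustrative example)
label = "EG"            # hypothetical label string for the Egyptian dialect

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
targets = tokenizer(label, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# A single optimization step; in practice this sits inside a DataLoader
# loop or is handled by transformers.Trainer.
model.train()
optimizer.zero_grad()
loss = model(input_ids=inputs.input_ids,
             attention_mask=inputs.attention_mask,
             labels=targets.input_ids).loss
loss.backward()
optimizer.step()

# At inference time, the predicted dialect is decoded from generated bytes.
model.eval()
pred_ids = model.generate(inputs.input_ids, max_new_tokens=8)
print(tokenizer.decode(pred_ids[0], skip_special_tokens=True))
```

The text-to-text framing shown here is one common way to use ByT5 for classification; an alternative is to attach a classification head to the encoder output.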