Systems that enable natural interaction between humans and machines are evolving rapidly, and identifying the user’s language is a particularly important task within them. This article analyzes the problem of language identification (LID) from speech signals, covering its application areas, challenges, and modern approaches. It compares traditional machine learning methods (GMM, SVM, i-vector) with deep neural network-based approaches (CNN, RNN, Transformer) for language recognition. The paper also discusses key evaluation metrics for assessing system performance, including Accuracy, Precision, F1-score, and Equal Error Rate (EER). Advanced methods for handling complex scenarios such as code-switching and open-set LID are reviewed, with a focus on practical perspectives for under-resourced languages such as Uzbek. The results of the study provide a solid theoretical and practical foundation for developing multilingual interactive voice systems.