AN ASSESSMENT OF LITERATURE ON THE EXTRACTION OF MULTIWORD EXPRESSIONS: FOCUS ON PEDAGOGICAL IMPLICATIONS
Abstract
Multiword expressions (MWEs) are difficult to handle, and the difficulty ranges from definition to classification. Researchers in Natural Language Processing and Computational Linguistics have neither provided a universal definition of multiword expressions nor classified them clearly. This paper does not aim to provide a generally acceptable definition or a clear classification of multiword expressions. Rather, it provides a pathway for approaching multiword expressions in linguistic research, especially through their extraction. Through an extensive literature review, we argue that multiword expressions can be dealt with statistically or linguistically. The paper concludes that whereas some researchers have argued for the use of statistics-based methods, others have adhered to linguistic approaches. The two approaches to MWE extraction are not mutually exclusive, however: some studies have used them concurrently in a hybrid approach. Therefore, there is no single best method for extracting MWEs from a corpus; a study adopts a specific approach according to its overall objective. Whatever approach to handling multiword expressions is adopted, the overriding objective must be to add pedagogical value to the extracted multiword expressions.