Open Educational Resources (OERs) have high potential to satisfy learners in many different circumstances, as they are available in a wide range of contexts. However, the generally low quality of OER metadata is one of the main reasons behind the lack of personalised, OER-based services such as search and recommendation. As a result, the applicability of OERs remains limited. Yet learners need OER metadata about covered topics (subjects) in order to build effective learning pathways towards their individual learning objectives. Therefore, in this paper, we report on a work-in-progress project that proposes an OER topic extraction approach, applying text mining techniques, to generate high-quality OER metadata about topic distribution. This is done by: 1) collecting 27 lectures from Coursera and Khan Academy on an important Data Science skill (Text Mining, as our first focus), 2) applying Latent Dirichlet Allocation (LDA) to the collected resources in order to extract existing topics related to the skill, and 3) defining the topic distribution covered by a particular OER. To evaluate our model, we used a dataset of educational resources from YouTube and compared our topic distribution results against the resources' manually defined target topics, with the help of three experts in data science. Our model extracted topics with an F1-score of 76%.
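The evaluation step described above can be illustrated with a minimal sketch of how an F1-score might be computed for a single resource. This is an assumption-laden illustration, not the authors' implementation: the abstract does not say how a topic distribution is turned into a set of predicted topics, so the thresholding rule, the threshold value of 0.2, the function names, the topic labels and the weights below are all hypothetical.

```python
from typing import Dict, Set


def predicted_topics(distribution: Dict[str, float], threshold: float = 0.2) -> Set[str]:
    """Keep topics whose weight in the OER's topic distribution reaches the threshold.

    The thresholding rule and its default value are assumptions made for illustration.
    """
    return {topic for topic, weight in distribution.items() if weight >= threshold}


def f1_score(predicted: Set[str], target: Set[str]) -> float:
    """Harmonic mean of precision and recall over topic labels."""
    true_positives = len(predicted & target)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(target)
    return 2 * precision * recall / (precision + recall)


# Hypothetical topic distribution for one video (weights sum to 1).
distribution = {
    "tokenization": 0.5,
    "topic modeling": 0.3,
    "sentiment analysis": 0.15,
    "clustering": 0.05,
}
# Hypothetical target topics, standing in for the manually defined labels.
expert_topics = {"tokenization", "topic modeling", "sentiment analysis"}

predicted = predicted_topics(distribution)
score = f1_score(predicted, expert_topics)
print(sorted(predicted), round(score, 2))
```

In this toy case two of the three target topics pass the threshold and no spurious topic is predicted, so precision is 1.0 and recall is 2/3. A score over a whole dataset would additionally require an averaging scheme across resources, which the abstract does not specify.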
CITATION STYLE
Molavi, M., Tavakoli, M., & Kismihók, G. (2020). Extracting topics from open educational resources. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12315 LNCS, pp. 455–460). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-57717-9_44