Accuracy and functionality of Albanian language texts classification with SentAl algorithm
Session
Information Systems and Security
Description
Text classification is one of the important and typical tasks of supervised machine learning in the field of sentiment analysis. Sentiment analysis is a field dedicated to extracting subjective emotions and sentiments from text. Corpora of written texts are excellent data sets for sentiment analysis and for algorithm training. A common use of sentiment analysis is to determine whether a text expresses negative, positive, or neutral sentiment. Sentiment orientation is usually expressed as a positive or negative opinion (binary classification), but classification can also be multi-class (e.g., neutral, very positive, positive, negative, very negative) or associated with emotions (e.g., sad, angry, scared, happy). Existing classifiers have not been very successful on Albanian-language texts, and the aim of this paper is to show the accuracy and functionality of our proposed algorithm, called SentAl, for classifying Albanian-language texts. The SentAl algorithm is based on grammatical categories and on the frequency of words expressing positive or negative opinion. The results of the SentAl algorithm are compared with those of the Naive Bayes algorithm for text classification.
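To make the word-frequency idea concrete, the following is a minimal sketch of a lexicon-based sentiment classifier. It is not the SentAl algorithm itself: the abstract does not give its details, and the grammatical-category component is not modeled here. The word lists, the function names `tokenize` and `classify`, and the tie-breaking rule (equal counts give "neutral") are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicons for illustration only; not taken from the paper.
POSITIVE = {"mirë", "bukur"}      # "good", "beautiful"
NEGATIVE = {"keq", "shëmtuar"}    # "bad", "ugly"


def tokenize(text):
    """Lowercase the text and return its runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def classify(text):
    """Label a text by comparing counts of positive and negative words."""
    counts = Counter(tokenize(text))
    pos = sum(counts[word] for word in POSITIVE)
    neg = sum(counts[word] for word in NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"  # assumption: ties and texts with no lexicon words are neutral
```

A call such as `classify("Filmi ishte shumë i mirë dhe i bukur")` counts two positive words and no negative ones, so the sketch labels the text positive. A real system would need a much larger lexicon and, as the abstract indicates, some use of grammatical categories.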
Keywords:
Sentiment, classification, SentAl, orientation, accuracy, classifiers, algorithm
Session Chair
Naim Preniqi
Session Co-Chair
Blerton Abazi
Proceedings Editor
Edmond Hajrizi
ISBN
978-9951-550-19-2
Location
Pristina, Kosovo
Start Date
26-10-2019 1:30 PM
End Date
26-10-2019 3:30 PM
DOI
10.33107/ubt-ic.2019.82
Recommended Citation
Neziri, Vehbi; Dervishi, Ramadan; and Caka, Ali, "Accuracy and functionality of Albanian language texts classification with SentAl algorithm" (2019). UBT International Conference. 82.
https://knowledgecenter.ubt-uni.net/conference/2019/events/82