Accuracy and functionality of Albanian language texts classification with SentAl algorithm

Session

Information Systems and Security

Description

Text classification is one of the most common supervised machine learning tasks in the field of sentiment analysis. Sentiment analysis is dedicated to extracting subjective emotions and sentiments from text. Corpora of written texts are excellent data sets for sentiment analysis and for training algorithms. A common use of sentiment analysis is to determine whether a text expresses negative, positive or neutral sentiment. Sentiment orientation is usually expressed as a positive or negative opinion (binary classification), but classification can also be multi-class (e.g. neutral, very positive, positive, negative, very negative) or associated with emotions (e.g. sad, angry, afraid, happy). Existing classifiers have not been very successful on Albanian language texts, and the aim of this paper is to show the accuracy and functionality of our proposed algorithm, called SentAl, in classifying Albanian language texts. The SentAl algorithm is based on grammatical categories and on the frequency of words that carry positive or negative opinion. The results of the SentAl algorithm will be compared with those of the Naive Bayes algorithm for text classification.
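The general idea of scoring a text by the frequency of opinion words, weighted by grammatical category, can be illustrated with a minimal sketch. This is not the SentAl algorithm itself, whose details are not given here: the tiny lexicon, the category tags, the weights and the function name `classify` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon: word -> (polarity, grammatical category).
# A real lexicon would be far larger; these entries and tags are illustrative.
LEXICON = {
    "mirë": (+1, "adverb"),
    "bukur": (+1, "adjective"),
    "keq": (-1, "adverb"),
}

# Hypothetical weights per grammatical category (an assumption, not from the paper).
CATEGORY_WEIGHT = {"adjective": 2.0, "adverb": 1.0}


def classify(text):
    """Return 'positive', 'negative' or 'neutral' from weighted opinion-word counts."""
    # \w is Unicode-aware for str patterns, so letters such as 'ë' stay inside tokens.
    tokens = re.findall(r"\w+", text.lower())
    score = 0.0
    for token in tokens:
        if token in LEXICON:
            polarity, category = LEXICON[token]
            # Each occurrence contributes, so repeated words raise the score.
            score += polarity * CATEGORY_WEIGHT.get(category, 1.0)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A call such as `classify("Filmi ishte shumë i bukur")` sums the weights of the opinion words it finds and maps the sign of the total to a label. A baseline such as Naive Bayes would instead learn word probabilities per class from labeled training texts.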

Keywords:

Sentiment, classification, SentAl, orientation, accuracy, classifiers, algorithm

Session Chair

Naim Preniqi

Session Co-Chair

Blerton Abazi

Proceedings Editor

Edmond Hajrizi

ISBN

978-9951-550-19-2

Location

Pristina, Kosovo

Start Date

26-10-2019 1:30 PM

End Date

26-10-2019 3:30 PM

DOI

10.33107/ubt-ic.2019.82

