Computer Science and Communication Engineering

Buduri_icbme2025-German Reddit Sentiment Analysis Using BERT and ML Techniques

Bledi Buduri, University for Business and Technology - UBT

Session

Computer Science and Communication Engineering

Description

This study presents a complete workflow for collecting, preprocessing, and analyzing German-language Reddit comments for sentiment analysis. First, a Python-based scraper built on the Reddit API was developed to extract topic-specific comments, which were then cleaned by removing special characters, links, and irrelevant tokens. The dataset underwent tokenization, stopword removal, and normalization using stemming and lemmatization to produce a structured corpus suitable for machine learning tasks.

Sentiment classification was performed using the German BERT model (oliverguhr/german-sentiment-bert), which labeled each comment as positive, negative, or neutral. To further evaluate performance, the comments were vectorized with Bag-of-Words and TF-IDF and passed to machine learning classifiers including Logistic Regression, Random Forest, and Naive Bayes. Performance was assessed using confusion matrices, classification reports, and error analysis.

Additionally, visualizations were created to highlight the words that contributed most to positive and negative sentiment classification, along with graphical representations of prediction errors. This integrated approach demonstrates how structured preprocessing combined with advanced modeling can enhance the accuracy of sentiment analysis on social media data. The methodology provides a solid foundation for monitoring public opinion and extracting insights from user-generated content in German.
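The cleaning and vectorization steps described above can be illustrated with a minimal sketch. It is not the author's code: the function names, the regular expressions, and the very short stopword list are assumptions made for illustration, and stemming and lemmatization are left out. It uses only the Python standard library and one common TF-IDF variant (relative term frequency times the log of inverse document frequency); libraries such as scikit-learn apply a smoothed formula that gives different numbers.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption); a real run would use a full German list.
GERMAN_STOPWORDS = {"der", "die", "das", "und", "ist", "ich", "nicht", "ein", "eine", "es"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
NON_LETTER_RE = re.compile(r"[^a-zäöüß\s]")     # digits, punctuation, emoji, etc.


def clean_comment(text):
    """Lowercase, strip links and special characters, tokenize, drop stopwords."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return [tok for tok in text.split() if tok not in GERMAN_STOPWORDS]


def bag_of_words(tokens):
    """Bag-of-Words: raw term counts for one comment."""
    return Counter(tokens)


def tf_idf(corpus):
    """TF-IDF for a list of token lists; returns one {term: weight} dict per comment."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for tokens in corpus:
        doc_freq.update(set(tokens))  # count each term once per comment
    weights = []
    for tokens in corpus:
        counts = bag_of_words(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty comments
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


comments = [
    "Das ist ein SUPER Film! https://example.com/trailer",
    "Der Film ist schlecht :(",
]
corpus = [clean_comment(c) for c in comments]
vectors = tf_idf(corpus)
```

In this toy corpus the word "film" occurs in both comments, so its TF-IDF weight is zero, while "super" and "schlecht" each occur in only one comment and receive positive weights. Vectors of this kind are what classifiers such as Logistic Regression or Naive Bayes would take as input.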

Keywords:

Sentiment Analysis, German BERT, Reddit

Proceedings Editor

Edmond Hajrizi

ISBN

978-9951-982-41-2

Location

UBT Kampus, Lipjan

Start Date

25-10-2025 9:00 AM

End Date

26-10-2025 6:00 PM

DOI

10.33107/ubt-ic.2025.73

Recommended Citation

Buduri, Bledi, "Buduri_icbme2025-German Reddit Sentiment Analysis Using BERT and ML Techniques" (2025). UBT International Conference. 5.
https://knowledgecenter.ubt-uni.net/conference/2025UBTIC/CS/5

