Abstract

Localising Twitter users is a useful capability when analysing local trends, events, or mood. However, no existing method achieves both high precision and high recall. Research projects that attempted to localise Twitter users within a precise radius (e.g., 10 km) localised at most 60% of users correctly. In this paper, we propose classifying users by the country they are located in instead of estimating a precise location. We apply our technique to Switzerland, classifying users as located inside or outside the country. Among other features, we use the relations of users to a list of "Swiss Influencers" accounts, that is, accounts which are mostly of interest to Swiss people. A full classification pipeline was implemented and tested. Our best classification models achieved an accuracy of 95%, with a maximum precision of 98% and a maximum recall of 91%. This shows that our binary classification problem, while potentially not specific enough for certain types of applications, can yield significantly more reliable results.
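As a rough illustration of the ideas above, the following minimal sketch shows one way a "Swiss Influencers" feature could be encoded and how accuracy, precision, and recall are computed for the binary inside/outside task. The function names, the choice of a follow-share feature, and the toy data are assumptions made for illustration; the abstract does not state how user relations are actually encoded or which classifiers are used.

```python
from typing import Iterable, Sequence, Set, Tuple


def influencer_share(followed: Iterable[str], influencers: Set[str]) -> float:
    """Hypothetical feature: fraction of followed accounts that are on the influencer list."""
    followed_set = set(followed)
    if not followed_set:
        return 0.0
    return len(followed_set & influencers) / len(followed_set)


def binary_metrics(y_true: Sequence[bool], y_pred: Sequence[bool]) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall), where True means 'located in Switzerland'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # correctly labelled inside
    tn = sum(1 for t, p in pairs if not t and not p)  # correctly labelled outside
    fp = sum(1 for t, p in pairs if not t and p)      # outside, predicted inside
    fn = sum(1 for t, p in pairs if t and not p)      # inside, predicted outside
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy data, invented for this example only.
influencers = {"acct_a", "acct_b"}
share = influencer_share(["acct_a", "acct_b", "acct_c", "acct_d"], influencers)
metrics = binary_metrics([True, True, False, False], [True, False, False, False])
```

Precision here is the share of users predicted as inside the country who really are, and recall is the share of users inside the country that the classifier finds, which is how the reported maxima of 98% and 91% should be read.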
