Channel: Snurblog - Social Media

The Limitations of Twitter as a Data Source


The next speaker in this ICA 2018 session is Fabian Pfaffenberger, who also highlights the unreliability of Twitter data. The API’s 1% sample is extremely biased, and the search API is also unreliable in what it delivers. Historical data is especially incomplete: the search API delivers only tweets posted in the past 6-7 days, and it will not include deleted tweets or tweets from accounts that have since been deleted or suspended.

User information is also incomplete, and geodata is largely unreliable and limited to some 1% of all tweets. Further, genuine users are mixed with bots in the datasets, and better bot identification tools are sorely needed. And whatever we encounter may not be representative in any meaningful way: Twitter is already a niche medium, and Twitter users may be especially interested in engaging with leading users. Twitter’s userbase also appears to be stagnating at this stage.

