When I was active in the Bitcoin community, I experienced a surge in Twitter followers who tweet in foreign languages. After that, I stopped running the cron job that used a Twitter command-line interface to automatically follow users back.
Since then, I have been trying to figure out a way to identify foreign-looking tweets and unfollow the twits/tweeps who post them. (Twitter users are commonly referred to as twits or tweeps.) I have several Twitter accounts, so cleaning them up by hand was sure to be time-consuming.
To my knowledge, Twitter doesn’t provide language information about tweets. Recently, I became aware of a language detection module for Python, and I knew I had found a solution.
On the first run, this script identified nearly 100 Twitter accounts to be unfollowed. As I improve the logic for identifying tweets that look like gobbledygook, this single file is likely to evolve into a full-blown Python package.
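To give a feel for the kind of check involved, here is a minimal sketch that uses only the standard library. It is not the language detection module mentioned above and not the script's actual logic: it swaps in a much cruder heuristic that measures how many letters in a tweet fall outside the Latin script. The function names (`non_latin_ratio`, `looks_foreign`, `accounts_to_review`), the thresholds, and the shape of the input dictionary are all assumptions made for illustration.

```python
import unicodedata


def non_latin_ratio(text):
    """Return the fraction of alphabetic characters that are not Latin script."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    # Unicode character names start with the script, e.g. "LATIN SMALL LETTER A"
    # or "CYRILLIC SMALL LETTER PE"; unnamed characters count as non-Latin.
    non_latin = sum(1 for ch in letters if "LATIN" not in unicodedata.name(ch, ""))
    return non_latin / len(letters)


def looks_foreign(text, threshold=0.5):
    """Hypothetical rule: a tweet looks foreign if most letters are non-Latin."""
    return non_latin_ratio(text) > threshold


def accounts_to_review(recent_tweets, min_share=0.8):
    """Return screen names whose recent tweets mostly look foreign.

    recent_tweets is assumed to map a screen name to a list of tweet texts.
    """
    flagged = []
    for screen_name, tweets in recent_tweets.items():
        if not tweets:
            continue
        share = sum(looks_foreign(t) for t in tweets) / len(tweets)
        if share >= min_share:
            flagged.append(screen_name)
    return sorted(flagged)


if __name__ == "__main__":
    sample = {
        "alice": ["Hello world", "Buying coffee with bitcoin"],
        "boris": ["Привет всем", "Биткойн растёт"],
    }
    print(accounts_to_review(sample))
```

A script-based check like this only catches languages written in non-Latin alphabets; a tweet in Spanish or German would pass straight through, which is where a real language detection module earns its keep.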