It should be so simple. If you're on Twitter, and you don't want people to know where you're tweeting from, don't click the "share your location" button. Put another way: If you don't want your tweets geotagged ... don't geotag them.
Data being what they are, though, things may be a little more complicated than that. A new paper from IBM Research claims to have created an algorithm that can predict a user's location, at the city level, based on his or her most recent tweets. With nearly 70 percent accuracy. And without the help of geotags.
Yes. The researchers (Jalal Mahmud, Jeffrey Nichols, and Clemens Drews) note that, per one study, fewer than 1 percent of tweets are actively geotagged. Furthermore, according to another study, only 26 percent of a random sample of over 1 million Twitter users reported their city-level locations in their profiles. So, in theory, it should be really hard to tell where people are tweeting from based on their tweets alone. The IBM researchers wanted to find a way to bypass that locational opacity ... and, it seems, they did.
Their algorithm looks at Twitter users' last 200 tweets and analyzes the semantic clues those tweets provide about their locations of origin. It looks at things like references to city-specific locations, mentions of sports teams, and hashtags within the tweets themselves. It then cross-references those mentions against known frequencies of those references on a city-by-city basis. Using all that information, the algorithm creates a probabilistic model of a user's location.
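
To make that idea concrete, here is a minimal sketch in Python of how term mentions might be scored against per-city frequencies. It is not the IBM researchers' actual method: it swaps in a simple naive Bayes-style scorer with a uniform prior over cities, and the city names, terms, and counts in `CITY_TERM_COUNTS`, along with the function names `tokenize` and `city_probabilities`, are hypothetical and chosen purely for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical per-city term counts (made-up numbers, not real Twitter data).
CITY_TERM_COUNTS = {
    "Boston": {"fenway": 40, "redsox": 50, "#bos": 10},
    "Chicago": {"wrigley": 45, "cubs": 50, "#chi": 5},
}


def tokenize(tweet):
    """Lowercase a tweet and split it into words and hashtags."""
    return re.findall(r"#?\w+", tweet.lower())


def city_probabilities(tweets, city_term_counts=CITY_TERM_COUNTS, max_tweets=200):
    """Return a probability for each city given a user's most recent tweets.

    Assumes a uniform prior over cities and add-one smoothing; both are
    simplifications made for this sketch.
    """
    recent = tweets[-max_tweets:]  # only the last `max_tweets` tweets
    tokens = Counter(tok for tweet in recent for tok in tokenize(tweet))
    vocab = {term for counts in city_term_counts.values() for term in counts}

    log_scores = {}
    for city, counts in city_term_counts.items():
        total = sum(counts.values())
        score = 0.0
        for term, n in tokens.items():
            if term in vocab:  # ignore words that carry no known city signal
                p = (counts.get(term, 0) + 1) / (total + len(vocab))
                score += n * math.log(p)
        log_scores[city] = score

    # Turn log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    weights = {city: math.exp(s - top) for city, s in log_scores.items()}
    norm = sum(weights.values())
    return {city: w / norm for city, w in weights.items()}


if __name__ == "__main__":
    sample = ["Heading to Fenway tonight!", "Go redsox"]
    print(city_probabilities(sample))
```

In this toy setup, a user who mentions Fenway and the Red Sox is assigned to Boston with high probability, while a user whose tweets contain none of the known terms gets an even split across cities. A real system would need far larger vocabularies and frequencies estimated from actual tweets.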