Spatial Twitter Analysis

Collection of Open Source GIScience work



Wang et al. (2016) developed a methodology to analyze Twitter activity during the wildfires around the San Diego area in May 2014. They used the Twitter search API to collect tweets containing the words “wildfire” and “fire” between May 13 and May 22, 2014, and then collected a subset of tweets containing information about the specific locations where fires occurred during that time frame. Of all the tweets collected, they were only interested in those with geospatial information, so they used the ‘coordinates’ field to filter out tweets without a location. With the georeferenced tweets, they performed kernel density estimation to understand the spatial pattern of the tweets. A raw density surface can be misleading, however, because areas with larger populations tend to produce more Twitter activity in general. The results may then simply resemble a population map, which would not provide useful information for the analysis. Therefore, they normalized the surface with a dual kernel density estimation, in which the number of tweets in each unit of analysis was divided by its corresponding population value. They also conducted other analyses, such as interaction network analysis and word analysis, to understand how people were communicating about the fires. Finally, they graphed Twitter activity over time and concluded that as wildfires evolve, Twitter activity about them increases too.
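To make the filtering and normalization steps more concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the Gaussian kernel, the bandwidth, and the toy data are all assumptions made for illustration. It also assumes each tweet is a dictionary whose 'coordinates' field is either empty or a GeoJSON-style point, and it treats coordinates as planar for simplicity.

```python
import math

def has_point(tweet):
    """Return True if the tweet's 'coordinates' field holds an [x, y] pair."""
    geo = tweet.get("coordinates")
    return isinstance(geo, dict) and len(geo.get("coordinates") or []) == 2

def kernel_density(x, y, points, weights, bandwidth):
    """Weighted 2-D Gaussian kernel density evaluated at (x, y)."""
    total = 0.0
    for (px, py), w in zip(points, weights):
        dist_sq = (x - px) ** 2 + (y - py) ** 2
        total += w * math.exp(-dist_sq / (2 * bandwidth ** 2))
    return total / (2 * math.pi * bandwidth ** 2)

def dual_kde(tweet_pts, pop_pts, pop_counts, cells, bandwidth, min_pop=1e-9):
    """Tweet density divided by population density at each cell centre."""
    ratios = []
    for x, y in cells:
        tweets = kernel_density(x, y, tweet_pts, [1.0] * len(tweet_pts), bandwidth)
        people = kernel_density(x, y, pop_pts, pop_counts, bandwidth)
        # Cells with (almost) no population get 0 instead of a division error.
        ratios.append(tweets / people if people > min_pop else 0.0)
    return ratios

# Hypothetical toy data: 10 tweets in a city, 5 in a town, 1 without a location.
raw_tweets = (
    [{"id": i, "coordinates": {"type": "Point", "coordinates": [0.0, 0.0]}}
     for i in range(10)]
    + [{"id": 10 + i, "coordinates": {"type": "Point", "coordinates": [10.0, 0.0]}}
       for i in range(5)]
    + [{"id": 99, "coordinates": None}]
)

# Step 1: keep only tweets that carry a point location.
geotagged = [t for t in raw_tweets if has_point(t)]
tweet_pts = [tuple(t["coordinates"]["coordinates"]) for t in geotagged]

# Step 2: compare the raw tweet density with the population-normalized one.
pop_pts = [(0.0, 0.0), (10.0, 0.0)]   # city centre, town centre
pop_counts = [1000.0, 100.0]          # assumed population values
cells = pop_pts                       # evaluate at the two centres only

raw = [kernel_density(x, y, tweet_pts, [1.0] * len(tweet_pts), 1.0) for x, y in cells]
normalized = dual_kde(tweet_pts, pop_pts, pop_counts, cells, 1.0)

print("raw tweet density:", raw)
print("tweets per person:", normalized)
```

In this toy example the city has the higher raw tweet density, but once the surface is divided by population the smaller town stands out, which is the effect the normalization is meant to reveal.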

Regarding replicability and reproducibility, this methodology has growing potential. In terms of reproducibility, we would need the tweet IDs in order to use the rehydratoR package to obtain the raw data. Nevertheless, I think they provided a clear outline of their methods, although some unclear details could hinder the reproduction of this study. In terms of replicability, I think this analysis can be applied to different contexts, such as other types of natural hazards, or extended to other social phenomena that spark Twitter activity. Nonetheless, the replicability of this sort of analysis may be limited to cultural contexts where Twitter is widely used as a source of information, and where there is reliable electrical infrastructure and access to the internet, given the short temporal scales of analysis. For example, in countries like Venezuela, where Twitter is widely used, these kinds of analyses may be limited by people's ability to tweet, given the constant internet and electricity outages, especially when coupled with natural disasters or civil unrest.

Relevant articles

Wang, Z., X. Ye, and M. H. Tsou. 2016. Spatial, temporal, and content analysis of Twitter for wildfire hazards. Natural Hazards 83 (1):523–540. DOI:10.1007/s11069-016-2329-6
