TWITTER LOCATION-BASED DATA: EVALUATING THE METHODS OF DATA COLLECTION PROVIDED BY TWITTER API
Keywords: Social media, Twitter, location data, data analysis
Twitter data analysis is an emerging field of research that uses data collected from Twitter to address problems such as disaster response, sentiment analysis, and demographic studies. Its success depends on collecting data that are accurate and representative of the studied group or phenomenon. Many Twitter analysis applications rely on the locations of the users sending the tweets, but this information is not always available. Several studies have attempted to estimate location-based aspects of a tweet; however, few have investigated the data collection methods that focus on location. In this paper, we investigate the two methods that the Twitter API provides for obtaining location-based data: Twitter places and geocode parameters. We studied these methods to determine their accuracy and their suitability for research. The study concludes that the places method is more accurate but excludes much of the data, while the geocode method returns more data but requires special attention to outliers.
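To make the two collection methods concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard search API convention of a `geocode` value written as "latitude,longitude,radius" with a `km` or `mi` unit, and a `place:` operator followed by a place ID in the query string. The helper names, the tweet dictionaries with `lat` and `lon` keys, and the distance-based outlier check are all hypothetical choices made for illustration. No network request is made.

```python
import math


def geocode_param(lat, lon, radius, unit="km"):
    """Format a geocode value as 'latitude,longitude,radius' plus unit (assumed convention)."""
    if unit not in ("km", "mi"):
        raise ValueError("unit must be 'km' or 'mi'")
    return f"{lat},{lon},{radius}{unit}"


def place_query(place_id, terms=""):
    """Build a query string that restricts results to a place ID (assumed 'place:' operator)."""
    return f"{terms} place:{place_id}".strip()


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def split_outliers(tweets, center, radius_km):
    """Separate tweets whose coordinates fall inside the search circle from those outside it.

    Each tweet is a hypothetical dict with 'lat' and 'lon' keys.
    """
    inside, outliers = [], []
    for tweet in tweets:
        distance = haversine_km(center[0], center[1], tweet["lat"], tweet["lon"])
        (inside if distance <= radius_km else outliers).append(tweet)
    return inside, outliers
```

A possible use is to collect tweets with the geocode value, then pass any tweets that carry coordinates through `split_outliers` with the same center and radius, and inspect the second list by hand before analysis.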
License
International Journal of Computing is an open access journal. Authors who publish with this journal agree to the following terms:
• Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
• Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
• Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.