Curated by Deen Freelon, Ph.D. | freelon at american dot edu | http://dfreelon.org | @dfreelon
This is a list of data collection tools for social media—just Twitter and Facebook for now, but please feel free to add headings for other social media services. If you want to add content, you'll need to join this wiki by creating a Wikidot account. Each tool is listed once for each social media platform it can collect data from.
I have not personally used all of the software on this list, so I can’t vouch for the quality or functionality of those I haven’t used. I do, however, personally verify that each product at least claims to be able to collect Twitter and/or Facebook-related data. If your submission is removed, it is probably because I could not verify that it claims to do so. I also remove products that violate Twitter or Facebook’s terms of service, e.g. by scraping their web interfaces directly. If I’ve removed something you’ve added and you think it fits the aforementioned criteria, please contact me.
Twitter data collection tools (applications):
- DD-CSS: http://dd-css.com/ (exports CSV/JSON: follower IDs, last 3,200 tweets, list member info)
- DiscoverText: https://discovertext.com/
- DMI-TCAT: https://github.com/digitalmethodsinitiative/dmi-tcat
- BU-TCAT: http://www.bu.edu/com/bu-tcat/
- Flocker: http://flocker.outliers.es/
- Follow the Hashtag: http://analytics.followthehashtag.com/#/
- iScience Maps: http://maps.iscience.deusto.es/
- Naoyun: http://matthieu-totet.fr/Koumin/tools/naoyun/
- Netlytic: https://netlytic.org/
- NodeXL: http://nodexl.codeplex.com/
- NVivo/NCapture: http://www.qsrinternational.com/products_nvivo_add-ons.aspx
- SocioViz: http://socioviz.net/
- Sodato: http://cssl.cbs.dk/software/sodato/ (I haven’t been able to create an account for this)
- TAGS: http://tags.hawksey.info/
- Tweet Archivist: https://www.tweetarchivist.com/
- Chorus-TweetCatcherDesktop: http://www.chorusanalytics.co.uk/
- Twitter Demand Collector and Analyzer: http://bensresearch.com/TwitterDemand/ (Paper on site explains use of the tools)
- Twitonomy: http://www.twitonomy.com
- Webometrics: http://lexiurl.wlv.ac.uk/index.html
Twitter data collection tools (modules & libraries; require programming knowledge):
- 140dev: http://140dev.com/
- Hosebird: https://github.com/twitter/hbc
- Pattern: http://www.clips.ua.ac.be/pattern
- poll.emic: https://github.com/sbenthall/poll.emic
- Python-Twitter: https://github.com/bear/python-twitter
- Social Feed Manager: http://gwu-libraries.github.io/social-feed-manager/
- SocialMediaMineR: http://cran.r-project.org/web/packages/SocialMediaMineR/
- streamR: http://cran.r-project.org/web/packages/streamR/
- T: https://github.com/sferik/t
- tStreamingArchiver: https://github.com/brendam/tStreamingArchiver
- twarc: https://github.com/edsu/twarc
- Twecoll: https://github.com/jdevoo/twecoll (tweets and graphs)
- tweepy: https://github.com/tweepy/tweepy
- Twitter Stream Downloader: https://github.com/mdredze/twitter_stream_downloader
- twitteR: http://cran.r-project.org/web/packages/twitteR/index.html
- TwitterGoggles: https://github.com/libbyh/TwitterGoggles and https://github.com/pmaconi/TwitterGoggles
- TWurl: https://github.com/twitter/twurl
- twutil: https://github.com/tapilab/twutil
- Twython: https://github.com/ryanmcgrath/twython
- yourTwapperKeeper: https://github.com/540co/yourTwapperKeeper
Facebook data collection tools (applications):
- Digitalfootprints: http://digitalfootprints.dk/
- DiscoverText: https://discovertext.com/
- Infoextractor: http://www.infoextractor.org/
- Netvizz: https://wiki.digitalmethods.net/Dmi/ToolNetvizz and https://apps.facebook.com/netvizz/
- NodeXL (with Social Network Importer): http://socialnetimporter.codeplex.com/
- NVivo/NCapture: http://www.qsrinternational.com/products_nvivo_add-ons.aspx
- Sodato: http://cssl.cbs.dk/software/sodato/ (I haven’t been able to create an account for this)
Facebook data collection tools (modules & libraries; require programming knowledge):
- Facebook Python SDK: https://github.com/pythonforfacebook/facebook-sdk
- Facepager: https://github.com/strohne/Facepager
- fb_scrape_public: https://github.com/dfreelon/fb_scrape_public
- Rfacebook: http://cran.r-project.org/web/packages/Rfacebook/index.html
- SocialMediaMineR: http://cran.r-project.org/web/packages/SocialMediaMineR/
Social data vendors (these sell data for analysis within their own platforms or independently):
- Brandwatch: http://www.brandwatch.com/
- Crimson Hexagon: http://www.crimsonhexagon.com/
- Datasift: http://datasift.com/
- Gnip: http://gnip.com/
- Plus one social: http://plusonesocial.com/
- Pulsar: http://www.pulsarplatform.com
- SocialPeeks: http://socialpeeks.com/
- Sysomos: http://www.sysomos.com
- Texifter: http://texifter.com (no longer sells data; the website redirects to DiscoverText)
- Twitris: http://twitris.knoesis.org
Instagram data collection tools (applications):
- Snoopreport: https://snoopreport.com
Other (misc links that don’t belong under any of the other sections):
- All My Plus: http://www.allmyplus.com/ [read the public data available through the Google+ API]
- Collecting Twitter Data: Introduction (R & Python tutorial): http://stats.seandolinar.com/collecting-twitter-data-introduction/
- Cur.to: http://cur.to
- Google Fusion Tables: https://support.google.com/fusiontables/answer/2571232 (alternative to NodeXL; not as powerful, e.g. no data collection and no centrality calculations, but good for simple visualizations)
- Import.io: https://import.io/ (advanced web scraping tool; no programming skills required)