An investigation into representation learning for natural language and social media as an initial step towards modeling social dynamics and combating social issues online.
The internet, especially social media, is rapidly changing how people interact with information. The extreme shifts in information dynamics that have brought about innovation have also created a broad class of social problems that must be addressed. These problems include misinformation, disinformation, social manipulation, social bots, political radicalization, and filter bubbles. Rather than being allowed to propagate further, they should be reckoned with directly, and doing so requires a set of tools designed to handle social data. In line with these goals, Look, Don't Tweet is a research-based thesis that qualifies the aspects necessary for learning representations designed to tackle these problems. As part of this work, a Python module, PyConversations, is developed for cross-platform social media analysis. Additionally, large-scale representation learning is explored, producing results that question the default approach in NLP of using off-the-shelf large Transformer models for social media-based tasks.