To Err.Is Human: Characterizing the Threat of Unintended URLs in Social Media

Abstract

To make their services more user friendly, online social media platforms automatically identify text that corresponds to URLs and render it as clickable links. In this paper, we show that the techniques used by such services to recognize URLs are often too permissive and can result in unintended URLs being displayed in social network messages. Among others, we show that popular platforms (such as Twitter) will render text as a clickable URL if a user forgets a space after a full stop at the end of a sentence, and the first word of the next sentence happens to be a valid Top Level Domain. Attackers can take advantage of these unintended URLs by registering the corresponding domains and exposing millions of Twitter users to arbitrary malicious content. To characterize the threat that unintended URLs pose to social media users, we perform a large-scale study of unintended URLs in tweets over a period of 7 months. By designing a classifier capable of differentiating between intended and unintended URLs posted in tweets, we find more than 26K unintended URLs posted by accounts with tens of millions of followers. As part of our study, we also register 45 unintended domains and quantify the traffic that attackers can get by merely registering the right domains at the right time. Finally, due to the severity of our findings, we propose a lightweight browser extension which can, on the fly, analyze the tweets that users compose and warn them of potentially unintended URLs, allowing users to fix their mistake before the tweet is posted.
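
The missing-space pattern described in the abstract can be illustrated with a short sketch. This is not the classifier from the paper or the logic of the proposed browser extension; it is a minimal heuristic written for illustration. The TLD set `SAMPLE_TLDS`, the regular expression, the function name, and the capital-letter rule (a capitalized word after the full stop suggests the start of a new sentence) are all assumptions.

```python
import re

# Hypothetical, tiny sample of TLDs for illustration only.
# A realistic check would load the complete list of delegated TLDs.
SAMPLE_TLDS = {"is", "it", "in", "so", "at", "me", "to", "no", "be",
               "com", "net", "org"}

# A word, a full stop with no space after it, then another word.
CANDIDATE = re.compile(r"\b([A-Za-z0-9-]+)\.([A-Za-z]+)\b")


def find_unintended_url_candidates(text, tlds=SAMPLE_TLDS):
    """Return substrings that look like 'word.Tld' from a missing space.

    Assumed heuristic: the text after the full stop is a known TLD and
    starts with an uppercase letter, which hints that it was meant to
    begin a new sentence rather than end a domain name.
    """
    candidates = []
    for match in CANDIDATE.finditer(text):
        right = match.group(2)
        if right.lower() in tlds and right[0].isupper():
            candidates.append(match.group(0))
    return candidates


if __name__ == "__main__":
    print(find_unintended_url_candidates("I love this place.It is great"))
    print(find_unintended_url_candidates("Visit example.com today"))
```

Under these assumptions, a sentence boundary such as "place.It" is flagged, while a lowercase, deliberately typed domain such as "example.com" is not. The capital-letter rule is crude and would miss lowercase sentence starts, which is one reason a trained classifier is a better fit for real tweets.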

Publication
In Network and Distributed System Security Symposium 2021
Web Security