| Consumer Internet of Things (IoT) devices are extremely popular, providing users with rich and diverse functionalities, from voice assistants to home appliances.
These functionalities often come with significant privacy and security risks, with notable recent large-scale coordinated global attacks disrupting large service providers.
Thus, an important first step to address these risks is to know what IoT devices are where in a network.
While some limited solutions exist, a key question is whether device discovery can be done by Internet service providers that only see sampled flow statistics.
In particular, it is challenging for an ISP to efficiently and effectively track and trace activity from IoT devices deployed by its millions of subscribers---all with sampled network data. In this paper, we develop and evaluate a scalable methodology to accurately detect and monitor IoT devices at subscriber lines with limited, highly sampled data in-the-wild. Our findings indicate that millions of IoT devices are detectable and identifiable within hours, both at a major ISP as well as an IXP, using passive, sparsely sampled network flow headers. Our methodology is able to detect devices from more than 77% of the studied IoT manufacturers, including popular devices such as smart speakers. While our methodology is effective for providing network analytics, it also highlights significant privacy consequences. |