Web measurement studies can shed light on phenomena that are not yet fully understood and thus are essential for analyzing how the modern Web works. This often requires building new and adjusting existing crawling setups, which has led to a wide variety of analysis tools for different (but related) aspects. If these efforts are not sufficiently documented, the reproducibility and replicability of the measurements may suffer—two properties that are crucial to sustainable research. In this paper, we survey 117 recent research papers to derive best practices for Web-based measurement studies and specify criteria that need to be met in practice. When applying these criteria to the surveyed papers, we find that the experimental setup and other aspects essential to reproducing and replicating results are often missing. We underline the criticality of this finding by performing a large-scale Web measurement study on 4.5 million pages with 24 different measurement setups to demonstrate the influence of the individual criteria. Our experiments show that slight differences in the experimental setup directly affect the overall results and must be documented accurately and carefully.
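A minimal sketch of the underlying concern, not the paper's actual crawling pipeline: visiting the same page with two browser configurations (here headless vs. headful Chromium via Playwright, an assumed tooling choice) and comparing how many requests each setup observes. The target URL is a placeholder.

    # Compare request counts for the same page under two browser configurations.
    from playwright.sync_api import sync_playwright

    def count_requests(url: str, headless: bool) -> int:
        """Visit a page and count the network requests issued while it loads."""
        seen = []
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=headless)
            page = browser.new_page()
            page.on("request", lambda request: seen.append(request.url))
            page.goto(url, wait_until="networkidle")
            browser.close()
        return len(seen)

    if __name__ == "__main__":
        url = "https://example.com"  # placeholder target
        print("headless:", count_requests(url, headless=True))
        print("headful: ", count_requests(url, headless=False))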
Third-party tracking is a common and broadly used technique on the Web. Different defense mechanisms have emerged to counter these practices (e.g., browser vendors banning all third-party cookies). However, these countermeasures only target third-party trackers and ignore the first party because the narrative is that such monitoring is mostly used to improve the utilized service (e.g., analytics services). In this paper, we present a large-scale measurement study that analyzes tracking performed by the first party but utilized by a third party to circumvent standard tracking prevention techniques. We visit the top 15,000 websites to analyze first-party cookies used to track users and a technique called “DNS CNAME cloaking”, which can be used by a third party to place first-party cookies. Using this data, we show that 76% of sites effectively utilize such tracking techniques. In a long-running analysis, we show that the usage of such cookies increased by more than 50% over the course of 2021.
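A minimal sketch of one way to flag potential CNAME cloaking, assuming dnspython is available: resolve the CNAME record of a first-party subdomain and check whether it points into a known tracker domain. The tracker domains and the subdomain are illustrative placeholders, not the data used in the study.

    # Flag first-party subdomains whose CNAME record points at a tracker domain.
    import dns.resolver

    KNOWN_TRACKER_DOMAINS = {"eulerian.net", "at-o.net", "omtrdc.net"}  # illustrative

    def cname_targets(hostname: str) -> list[str]:
        """Return the CNAME targets of a hostname (empty list if none)."""
        try:
            answers = dns.resolver.resolve(hostname, "CNAME")
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
            return []
        return [str(rdata.target).rstrip(".") for rdata in answers]

    def looks_cloaked(first_party_subdomain: str) -> bool:
        """True if the subdomain's CNAME points into a known tracker domain."""
        return any(
            target == tracker or target.endswith("." + tracker)
            for target in cname_targets(first_party_subdomain)
            for tracker in KNOWN_TRACKER_DOMAINS
        )

    print(looks_cloaked("metrics.example.com"))  # placeholder subdomain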
Measurement studies are essential for research and industry alike to understand the Web’s inner workings better and help quantify specific phenomena. Performing such studies is demanding due to the dynamic nature and size of the Web. An experiment’s careful design and setup are complex, and many factors might affect the results. However, while several works have independently observed differences in the outcome of an experiment (e.g., the number of observed trackers) based on the measurement setup, it is unclear what causes such deviations. This work investigates the reasons for these differences by visiting 1.7M webpages with five different measurement setups. Based on this, we build ‘dependency trees’ for each page and cross-compare the nodes in the trees. The results show that the measured trees differ considerably, that the cause of differences can be attributed to specific nodes, and that even identical measurement setups can produce different results.
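A minimal sketch of the cross-comparison idea, assuming each page load is recorded as (initiator URL, requested URL) pairs; the edges below are made up and only illustrate how nodes observed by two setups can be diffed, not the study's exact tree format.

    # Build a simple dependency structure per setup and diff the observed nodes.
    from collections import defaultdict

    def build_tree(edges):
        """edges: (initiator_url, requested_url) pairs observed during one page load."""
        tree = defaultdict(set)
        for initiator, requested in edges:
            tree[initiator].add(requested)
        return tree

    def node_set(tree):
        return set(tree) | {child for children in tree.values() for child in children}

    setup_a = build_tree([
        ("page", "cdn.example/app.js"),
        ("cdn.example/app.js", "tracker.example/t.gif"),
    ])
    setup_b = build_tree([("page", "cdn.example/app.js")])

    print("only in setup A:", node_set(setup_a) - node_set(setup_b))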
Software updates play an essential role in keeping IT environments secure. If service providers delay or skip updates, their environments can suffer unwanted security implications. This paper conducts a large-scale measurement study of the update behavior of websites and their utilized software stacks. Over 18 months, we analyze over 5.6M websites and 246 distinct client- and server-side software distributions. We find that almost all analyzed sites use outdated software. To understand the possible security implications of outdated software, we analyze the potential vulnerabilities that affect the utilized software. We show that software components are getting older and more vulnerable because they are not updated. We find that 95% of the analyzed websites use at least one product for which a vulnerability existed.
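A minimal sketch of the core check behind such an analysis, assuming version strings have already been extracted (e.g., from response headers): compare a detected version against the latest release and against the fixed-in version of a known vulnerability. All version numbers are illustrative.

    # Decide whether a detected software version is outdated or potentially vulnerable.
    from packaging.version import Version

    detected = Version("1.19.2")   # version string extracted during the crawl (assumed)
    latest = Version("1.25.3")     # latest release at measurement time (assumed)
    fixed_in = Version("1.21.0")   # fixed-in version of a hypothetical CVE

    print("outdated:", detected < latest)
    print("potentially vulnerable:", detected < fixed_in)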
Artificial intelligence (AI) makes it possible to extract complex relationships and patterns from large amounts of data and to capture them in a statistical model. This AI model can then make predictions about data encountered in the future. With the increasing use of artificial intelligence, such systems are also moving more and more into the focus of cybercriminals. The article comprehensively describes attack scenarios and possible countermeasures.
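A minimal numpy sketch of one attack class typically covered in such overviews, an evasion (adversarial example) attack: for a logistic-regression model the input gradient has a closed form, so a single FGSM-style perturbation can be computed directly. The weights, input, and perturbation budget are made up and not tied to the article's specific scenarios.

    # FGSM-style evasion attack against a tiny logistic-regression model.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    w = np.array([1.5, -2.0, 0.5])   # model weights (assumed)
    b = 0.1
    x = np.array([0.2, 0.4, -0.3])   # benign input
    y = 1.0                          # true label

    # Gradient of the cross-entropy loss w.r.t. the input is (p - y) * w.
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w

    epsilon = 0.1                    # perturbation budget
    x_adv = x + epsilon * np.sign(grad_x)

    print("model score before attack:", sigmoid(w @ x + b))
    print("model score after attack: ", sigmoid(w @ x_adv + b))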
Cookie notices (or cookie banners) are a popular mechanism for websites to provide (European) Internet users with a tool to choose which cookies the site may set. Banner implementations range from merely informing users that a site uses cookies, to offering the choice to accept or deny all cookies, to allowing fine-grained control of cookie usage. Users frequently get annoyed by the banners’ pervasiveness as they interrupt “natural” browsing on the Web. As a remedy, different browser extensions have been developed to automate the interaction with cookie banners.
In this work, we perform a large-scale measurement study comparing the effectiveness of extensions for “cookie banner interaction.” We configured the extensions to express different privacy choices (e.g., accepting all cookies, accepting functional cookies, or rejecting all cookies) to understand their capabilities to execute a user’s preferences. The results show statistically significant differences in which cookies are set, how many are set, and which types are set—even for extensions that aim to implement the same cookie choice. Extensions for “cookie banner interaction” can effectively reduce the number of cookies set compared to no interaction with the banners. However, all extensions significantly increase the number of tracking requests, except when rejecting all cookies.
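A minimal sketch of the comparison idea rather than the study's crawler: load a page in a Chromium profile with and without a banner-handling extension and count how many cookies end up in the browsing context. The extension path, wait time, and URL are placeholders; Chromium extensions require a headed, persistent context in Playwright.

    # Count cookies after visiting a page, optionally with an unpacked extension loaded.
    from playwright.sync_api import sync_playwright

    def cookies_after_visit(url, extension_path=None):
        args = []
        if extension_path:
            args = [
                f"--disable-extensions-except={extension_path}",
                f"--load-extension={extension_path}",
            ]
        with sync_playwright() as p:
            context = p.chromium.launch_persistent_context(
                user_data_dir="",   # empty string -> temporary profile
                headless=False,     # extensions need a headed Chromium session
                args=args,
            )
            page = context.new_page()
            page.goto(url, wait_until="networkidle")
            page.wait_for_timeout(5000)   # give the extension time to act on the banner
            count = len(context.cookies())
            context.close()
        return count

    url = "https://example.com"  # placeholder
    print("without extension:", cookies_after_visit(url))
    print("with extension:   ", cookies_after_visit(url, "/path/to/extension"))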
Filter lists are used by various users, tools, and researchers to identify tracking technologies on the Web. These lists are created and maintained by dedicated communities. Aside from popular blocking lists (e.g., EasyList), the communities create region-specific blocklists that account for trackers and ads that are only common in these regions. The lists aim to keep the size of a general blocklist minimal while protecting users against region-specific trackers.
In this paper, we perform a large-scale Web measurement study to understand how different region-specific filter lists (e.g., a blocklist specifically designed for French users) protect users when visiting websites. We define three privacy scenarios to understand when and how users benefit from these regional lists and what effect they have in practice. The results show that although the lists differ significantly, the number of rules they contain is unrelated to the number of blocked requests. We find that the lists’ overall efficacy varies notably. Filter lists also do not meet the expectation that they increase user protection in the regions for which they were designed. Finally, we show that the majority of the rules on the lists were not used in our experiment and that only a fraction of the rules would provide comparable protection for users.
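A minimal sketch of how a single request URL is matched against filter-list rules, assuming the adblockparser library; the two rules stand in for entries of a regional list and are not actual EasyList content.

    # Match request URLs against a couple of Adblock-style filter rules.
    from adblockparser import AdblockRules

    raw_rules = [
        "||regional-ads.example^",   # block an entire ad host
        "*/banners/*",               # block any URL with a /banners/ path segment
    ]
    rules = AdblockRules(raw_rules)

    print(rules.should_block("https://regional-ads.example/spot.js"))     # expected: True
    print(rules.should_block("https://cdn.example/banners/top.png"))      # expected: True
    print(rules.should_block("https://cdn.example/articles/story.html"))  # expected: False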