
30 Wash. Int'l L.J. 28 (2020-2021)
Two Decades of Laws and Practice around Screen Scraping in the Common Law World and Its Open Banking Watershed Moment

Copyright © 2020 Washington International Law Journal Association

TWO DECADES OF LAWS AND PRACTICE AROUND
SCREEN SCRAPING IN THE COMMON LAW WORLD
AND ITS OPEN BANKING WATERSHED MOMENT
Han-Wei Liu†
Abstract: Screen scraping, a technique using an agent to collect, parse, and
organize data from the web in an automated manner, has found countless applications
over the past two decades. It is now employed in targeted advertising,
price aggregation, budgeting apps, website preservation, academic research, and
journalism, to name a few. However, this tool has raised enormous controversy in the
age of big data. This article takes a comparative law approach to explore two sets of
analytical issues in three common law jurisdictions: the United States, the United
Kingdom, and Australia. As the first step, this article maps out the trajectory of
relevant laws and jurisprudence around screen scraping legality in those
jurisdictions. Specifically, the
article focuses on five selected issue areas: digital
trespass statutes, tort, intellectual property rights, contract, and data protection. Our
findings reveal some level of divergence in the way each country addresses the legality
of screen scraping. Despite such divergence, one may see a sea change amid the trend
of data-sharing under the banner of Open Banking in coming years. This article
argues that to the extent that these data-sharing initiatives enable information flow
between entities, they could reduce the demand for screen scraping generally, thereby
bringing some level of convergence. Yet, this convergence is qualified by the
institutional design of data-sharing schemes: whether or not they explicitly address
screen scraping (as in Australia and the United Kingdom) and whether there is a top-down,
government-mandated data-sharing regime (as in the United States).
Cite as: Han-Wei Liu, Two Decades of Laws and Practice Around Screen
Scraping in the Common Law World and its Open Banking Watershed
Moment, 30 WASH. INT'L L.J. 28 (2020).
INTRODUCTION
Text and data mining are, broadly speaking, overarching terms
covering a range of techniques to extract useful information and explore
patterns in data that might not be identified otherwise.1 One popular
technique is screen scraping (also known as web scraping, data
scraping, web data extraction, or web data mining), which refers to
constructing an agent to download, parse, and organize data from the web
in an automated manner.2 Put differently, screen scraping uses a software
agent to mimic browsing interactions between web servers and people.3
† Dr. Han-Wei Liu, Lecturer, Monash University, Australia. The author is grateful to Tiana
Moutafis and Lily Raynes for excellent research assistance.
1 See generally RONEN FELDMAN & JAMES SANGER, THE TEXT MINING HANDBOOK: ADVANCED
APPROACHES IN ANALYZING UNSTRUCTURED DATA (2006).
2 The terms web scraping and web crawling are sometimes used interchangeably. Some data
scientists remark that although the difference is vague, the term crawler refers to a program's ability
to navigate web pages on its own, perhaps even without a well-defined end goal or purpose, endlessly
exploring what a site or the web has to offer. SEPPE VANDEN BROUCKE & BART BAESENS, PRACTICAL
WEB SCRAPING FOR DATA SCIENCE: BEST PRACTICES AND EXAMPLES WITH PYTHON 3, 155 (2018).
3 Daniel Glez-Peña et al., Web Scraping Technologies in an API World, BRIEFINGS IN
BIOINFORMATICS 788, 789 (2014) (describing web scraping as [s]tep by step, the robot accesses as many
