  • InsideTrack
  • January 06, 2021

    Legal Research: Surf the Discarded Web with the Wayback Machine

    Stymied by a broken link, or need to look at older copies of local ordinances or a website that has disappeared? With records going back to 1996, the Wayback Machine is the first place to look for old website content.

    Carol Hassler

    Jan. 6, 2021 – Constant change may be the internet’s most reliable attribute. When websites change, most of us don’t notice. After all, having the most updated information is usually ideal for researchers and casual browsers alike. What happens when you need to research older information that has been removed from a website?

    There are several research strategies to find information that has disappeared, but my first source is always the same – the Wayback Machine.

    About the Wayback Machine

    With a little over 20 years of archived publicly available webpages, the Wayback Machine is the first place to look for older website content. A project of the Internet Archive, it captures snapshots of websites and publicly available online documents at a given point in time and makes them available to researchers.

    Carol Hassler is a law librarian at the Wisconsin State Law Library. She is a member of the Law Librarians Association of Wisconsin (LLAW). LLAW's Public Relations Committee coordinates regular contributions by its members to InsideTrack.

    Launched in 2001, the Wayback Machine’s archive goes back to 1996. Find everything, from ridiculous fads like the "Hampster Dance" from 1997, to official government websites like this version of the State Law Library website from 1999.

    Archived Sites Are Useful

    The Wayback Machine is very useful for legal research. Law reviews, bar publications, or legal opinions may include links to websites or online reports, and a continuing legal education (CLE) program from last year might have provided a list of websites that are already out of date.

    Current websites also suffer this fate. Broken links are common on the web, and sometimes the information that was linked is simply no longer online. The Wayback Machine is one way to recover this information.

    Researching archived websites also delivers a treasure trove of historical information. Company websites, a rich resource for competitive data, product sheets, or analysis, change over time.

    Historical local ordinances or other sources that aren’t typically found in statewide databases can sometimes be found using the Wayback Machine.

    How to Use the Wayback Machine

    Using the Wayback Machine is simple:

    • visit web.archive.org/web/;
    • type or paste in the URL you want to find; and
    • search for the results.
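
    The steps above can also be expressed as a web address, which is handy for bookmarking or sharing a search. The following is a minimal Python sketch, assuming the commonly seen pattern in which web.archive.org/web/ is followed by an asterisk (for the calendar view) or a numeric timestamp (for a specific capture) and then the original address. The function names are hypothetical and chosen only for illustration.

```python
# Sketch only: the URL pattern is an assumption based on how Wayback Machine
# addresses commonly appear; the function names are hypothetical.
WAYBACK_PREFIX = "https://web.archive.org/web"


def calendar_url(page_url: str) -> str:
    """Return the address of the calendar view listing captures of page_url."""
    return f"{WAYBACK_PREFIX}/*/{page_url}"


def snapshot_url(page_url: str, timestamp: str) -> str:
    """Return the address of a capture near timestamp.

    timestamp is a string of 4 to 14 digits, from a bare year (YYYY)
    up to a full YYYYMMDDhhmmss value.
    """
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits, e.g. 2016 or 20160101")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"
```

    For example, calendar_url("wisbar.org") produces https://web.archive.org/web/*/wisbar.org, which can be pasted into a browser in place of typing the address into the search box.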

    If a URL is archived, the first page you see is a calendar of snapshots. You can browse archived pages using the year slider at the top, and then the individual dates within each year (see Figure 1).

    Figure 1. The Wayback Machine’s calendar shows the number of captures per year for the Wisconsin Department of Justice website.

    The year slider near the top graphs how many captures the archive has per year. The number of archived snapshots can also vary widely from page to page even on the same website.
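
    To make the idea behind the year slider concrete, the sketch below tallies captures by year. It assumes capture timestamps in the 14-digit YYYYMMDDhhmmss form that appears in Wayback Machine snapshot addresses. The sample timestamps are invented for illustration and are not real capture data, and the function name is hypothetical.

```python
from collections import Counter


def captures_per_year(timestamps):
    """Count captures per year from YYYYMMDDhhmmss timestamp strings."""
    # The first four characters of each timestamp are the year.
    counts = Counter(ts[:4] for ts in timestamps)
    # Return the years in chronological order, like the slider.
    return dict(sorted(counts.items()))


# Invented sample data, for illustration only.
sample = ["20160305120000", "20130712081500", "20161120173000"]
yearly = captures_per_year(sample)
```

    A page with only a handful of entries in such a tally is a sign that you may need to check neighboring years for the version you want.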

    Choose the year you want to research, then navigate to the month and date. Circles around dates indicate that an archive of the page exists for that day. The larger the circle, the more captures were made of that page. Select a highlighted date to see the captures for that day, and click on the snapshot link to view the archived page (see Figure 2).

    Figure 2: The circles around the dates show when a webpage was captured by the Wayback Machine.

    When a Webpage Isn’t Archived

    The Wayback Machine has not archived every page or document that was ever published to the web. Some websites may be crawled every month, or almost daily. Other websites – or specific pages on a website – may only be visited once a year or every other year.

    The archive also only captures publicly accessible pages. If a page is protected by a password, has on-the-fly content, or issues a “do not crawl” request to web crawlers, that page will not end up in the archive.

    It can sometimes take weeks or months for archived snapshots to appear on the site, so the Wayback Machine isn’t very helpful for researching extremely current information that appeared online and was removed a few days later.
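
    If you need to check many links, it can help to test whether a capture exists before opening the calendar. The Internet Archive offers an availability lookup at archive.org/wayback/available that answers in JSON. The Python sketch below builds such a query and reads a response. It assumes the response holds an archived_snapshots object that contains a closest entry when a capture exists and is empty when none does. The function names and the sample responses are illustrative and not taken from any real lookup.

```python
import json
from urllib.parse import urlencode

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the lookup address; timestamp (YYYYMMDD...) is optional."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return the address of the closest capture, or None if not archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None


# Illustrative responses in the assumed shape, not real lookup results.
not_archived = '{"url": "example.gov/missing.pdf", "archived_snapshots": {}}'
archived = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "status": "200",
            "timestamp": "20160305120000",
            "url": "http://web.archive.org/web/20160305120000/http://example.gov/",
        }
    }
})
```

    The sketch stops short of sending the request so that it runs offline. In practice the query address could be fetched with a standard HTTP client and the response text passed to closest_snapshot.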

    When a URL is not archived, use the options on the result page to explore the archived website in general. For example, some links on an archived page lead to an error page. This happens when you click the link for the PDF “Guidance on Possession and Sale of Cannabidiol (CBD) in Wisconsin” on this version of the Wisconsin Department of Agriculture, Trade and Consumer Protection (DATCP) website from 2018-20 (see Figure 3).

    Figure 3: Not all links are captured by the Wayback Machine, but the website provides a link to search for other archived documents on that page – such as on the DATCP website.

    While that particular document was not captured, you can explore other archived pages by browsing through a related list of captured links. Since many websites like to store reports or other downloadable documents in similar folders, this can be a helpful way to browse old archived documents.

    With this list, you can broaden your search of the archive for more results, or browse for documents with a slightly different file name than the one you tried to find.
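
    One way to reach a similar folder-level list directly is a wildcard address. The Python sketch below assumes the Wayback Machine accepts an asterisk at the end of a path (as in web.archive.org/web/*/example.gov/Documents/*) to list captured URLs that share that prefix. The function name and the example.gov address are hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def folder_listing_url(document_url):
    """Build a wildcard address for the folder that holds document_url."""
    # urlsplit needs a scheme to separate the host from the path.
    if "://" not in document_url:
        document_url = "http://" + document_url
    parts = urlsplit(document_url)
    # Drop the file name and keep only the containing folder.
    folder = posixpath.dirname(parts.path).rstrip("/")
    return f"https://web.archive.org/web/*/{parts.netloc}{folder}/*"


# Hypothetical document address, for illustration only.
listing = folder_listing_url("https://example.gov/Documents/report.pdf")
```

    Here listing is https://web.archive.org/web/*/example.gov/Documents/*, which targets every captured address under the same folder as the missing report.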

    If a specific URL turns up no results, search the main domain name instead – that is, the main URL of a website (such as WisBar.org). Then, use the calendar to browse the website’s captured pages over time. Surfing through the captured webpage feels similar to browsing the website when it was live.

    As an added bonus, surfing an archived webpage will often lead to other archived webpages from other agencies, companies, or organizations, making broader archive research feel seamless.

    Case Study: Old Ordinances

    I often use the Wayback Machine to look for older copies of local ordinances. While the most accurate option is to check with the municipal clerk or corporation counsel, you may find immediate answers by browsing their website archives using the Wayback Machine.

    Start by entering the URL for the current ordinances webpage into the Wayback Machine search box. If the URL for the ordinances page or municipality has not changed over the years, this is an easy way to see which years have been captured.

    In this example, using the current City of Alma ordinances link shows several captures going back to 2013 (see Figure 4).

    Figure 4: The Wayback Machine shows 23 captures from 2013 to 2020 for the City of Alma ordinance website.

    Clicking on links in an archived page will automatically land you on the version with the closest date. For the City of Alma’s ordinances, selecting the archived link for Chapter 30 lands the user at a 2016 capture of the document, which is the earliest version available.
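
    The "closest date" behavior can be pictured as a nearest-match lookup. The Python sketch below is an illustration of that idea only, not the Wayback Machine's actual selection logic, which the site may implement differently. It assumes 14-digit YYYYMMDDhhmmss timestamps, and the capture dates are invented.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def closest_capture(target, captures):
    """Return the capture timestamp nearest in time to target.

    Raises ValueError if captures is empty.
    """
    target_time = datetime.strptime(target, TIMESTAMP_FORMAT)
    return min(
        captures,
        key=lambda ts: abs(datetime.strptime(ts, TIMESTAMP_FORMAT) - target_time),
    )


# Invented capture dates, for illustration only.
captures = ["20160305120000", "20181110093000"]
from_2013 = closest_capture("20130601000000", captures)
from_2020 = closest_capture("20200101000000", captures)
```

    With these sample dates, a reader browsing a 2013 page would land on the 2016 capture, because nothing earlier exists. That matches the Alma example, where the earliest available version is from 2016.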

    The calendar toolbar follows you and allows you to jump from year to year, even if you’re viewing an older PDF document (see Figure 5).

    Figure 5: The Wayback Machine provides a 2016 version of a City of Alma ordinance.

    In my experience, it’s fairly common to find a wide range of captured years – even for links on a single page like this ordinances page.

    Finding Older URLs and Domain Names

    Sometimes websites change their domain name. Local government websites can have a variety of different domains (.gov, .us, .com, .org), and it can be tough to keep track of URLs over time.

    If a capture only goes back a few years, sometimes it means that the domain name has changed, and older archives are available through the older URL.

    To find older domain names, browse the State Law Library’s ordinances page in the Wayback Machine. Then, use the calendar to go back to the year prior to the earliest capture of the website you are researching and look for the municipality’s link in that list (see Figure 6).

    Figure 6: One way to find old ordinance URLs is to use the State Law Library’s ordinances page on the Wayback Machine.

    This trick also works with agency websites. If a government agency was merged in 2010 and you want to research an older agency’s website, you first need to determine the old URL.

    Use the Wayback Machine to visit a site you know would have links to agency pages – like the State Law Library or Wisconsin.gov portal. Navigate to the pages captured before 2010 to find older links to related agency websites.

    Browse these sites to find older links to government information:

    Research is a Journey – Other Sources

    While the Wayback Machine is a useful tool, it’s not the only online archive you can search.

    There are many other website archives to investigate as well – such as the Wisconsin Digital Archives, Library of Congress, and HathiTrust. You can learn more about these resources in these InsideTrack articles:

    And as always, don’t hesitate to ask a law librarian for help with your research or to get more tips on how to leverage the Wayback Machine for your own work.

