soft 404

From Wiktionary, the free dictionary

English

Etymology

Term introduced in 2004 by Ziv Bar-Yossef et al.

Noun

soft 404 (plural soft 404s)

  1. A 404 page ("the result of trying to follow a broken link within a website") which reports a "200 OK" HTTP response code rather than "404 Not Found", falsely indicating to the browser that the page exists and was loaded correctly.
    • 2012, Luis Meneses, Richard Furuta, Frank Shipman, “Identifying “Soft 404” Error Pages: Analyzing the Lexical Signatures of Documents in Distributed Collections”, in Panayiotis Zaphiris, George Buchanan, Edie Rasmussen, Fernando Loizides, editors, Theory and Practice of Digital Libraries: Second International Conference, TPDL 2012, Paphos, Cyprus, September 23-27, 2012, Proceedings, Springer-Verlag Berlin Heidelberg, →ISBN, pages 201–202:
      To ensure that the server would return a soft 404, the random sequence of characters was appended to an absolute URL and file path. […] Given that our research deals with identifying and predicting soft 404s, we chose to use a ratio of 50% normal pages and 50% soft 404s.
    • 2013, Richard Rogers, “National Web Studies”, in Digital Methods, The MIT Press, →ISBN, page 141:
      However, redirects also may be “soft 404” messages to hide broken links.
    • 2016, B. Barla Cambazoglu, Ricardo Baeza-Yates, Scalability Challenges in Web Search Engines, Springer Nature Switzerland AG, published 2022, →ISBN, page 23:
      These so-called soft 404 error pages can degrade the performance of a web crawler since they are usually not worth downloading and indexing.
    • 2020, Natércia A. Batista, Michele A. Brandão, Michele B. Pinheiro, Daniel H. Dalip, Mirella M. Moro, “Data from Multiple Web Sources: Crawling, Integrating, Preprocessing, and Designing Applications”, in Valter Roesler, Eduardo Barrére, Roberto Willrich, editors, Special Topics in Multimedia, IoT and Web Technologies, Springer Nature Switzerland AG, →ISBN, part III (Data Collection and Analysis), page 228:
      The data crawling also has specific problems that must be observed such as time between requests, soft-404 error, identification of URL patterns of the pages, and extraction of data from the source code.
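
The probing idea mentioned in the 2012 quotation (appending a random sequence of characters to a URL so that the requested page cannot exist) can be sketched in Python as follows. This is a minimal illustration, not the method of any cited work: the function names `make_probe_url` and `classify_probe` are hypothetical, the labels they return are chosen here for illustration, and the caller is assumed to fetch the probe URL itself (following any redirects) and pass in the final HTTP status code.

```python
import secrets
from urllib.parse import urljoin


def make_probe_url(base_url: str) -> str:
    """Append a random path segment so the page almost certainly does not exist."""
    # 16 random bytes rendered as 32 hexadecimal characters.
    return urljoin(base_url, secrets.token_hex(16))


def classify_probe(status_code: int) -> str:
    """Classify the server's answer to a request for a nonexistent page.

    Assumption: status_code is the final status after redirects were followed.
    """
    if status_code == 404:
        return "hard 404"      # the server reports the missing page honestly
    if 200 <= status_code < 300:
        return "soft 404"      # success code returned for a page that cannot exist
    return "inconclusive"      # e.g. server errors or unfollowed redirects


probe = make_probe_url("https://example.com/docs/")
print(probe)                   # random each run
print(classify_probe(200))     # soft 404
print(classify_probe(404))     # hard 404
```

A status code alone does not settle every case. A real crawler would probably also compare the body of the probe response with the body of the page under test, since a site may answer "200 OK" to both.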
