Tuesday, 26 September 2023

################################################################################
wiki.lwn.net
################################################################################

wiki.lwn.net, a subdomain of LWN.net, redirects to a malicious site claiming to offer free downloads of copyrighted material.

https://wiki.lwn.net/ on its own, without any path, redirects to google.com.

````````````````````````````````````````````````````````````````````````````````
$ curl "https://wiki.lwn.net/"
<html>
<head>
    <link rel="icon" href="favicon.ico" type="image/x-icon" />
<meta http-equiv="refresh" content="1; url=https://google.com/">
<title>google</title>
</head>
<body style="text-align:center">
<h3>You are now being GOOGLE.COM...</h3>
</body>
</html>
````````````````````````````````````````````````````````````````````````````````

(note the particularly suspicious heading "You are now being GOOGLE.COM...")
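the redirect target can also be pulled out of a response like this programmatically. below is a minimal python sketch, standard library only, that finds the first meta refresh tag in a page and returns its delay and target. the names MetaRefreshParser and find_meta_refresh are made up for illustration, and the sample page is a trimmed copy of the curl output above.

```python
from html.parser import HTMLParser


class MetaRefreshParser(HTMLParser):
    """Records the first <meta http-equiv="refresh"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.refresh = None  # becomes (delay_seconds, target_url)

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.refresh is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        # content looks like "1; url=https://google.com/"
        delay, _, target = attr.get("content", "").partition(";")
        target = target.strip()
        if target.lower().startswith("url="):
            target = target[4:].strip()
        try:
            self.refresh = (float(delay), target)
        except ValueError:
            pass  # malformed delay: ignore this tag


def find_meta_refresh(html):
    """Return (delay_seconds, target_url), or None if there is no refresh."""
    parser = MetaRefreshParser()
    parser.feed(html)
    return parser.refresh


page = '<html><head><meta http-equiv="refresh" content="1; url=https://google.com/"></head></html>'
print(find_meta_refresh(page))  # (1.0, 'https://google.com/')
```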

https://wiki.lwn.net/<anything>, where <anything> is a character sequence of non-zero length (e.g. https://wiki.lwn.net/afssaf), redirects to a canadian HTTP 404 error page:



HOWEVER

https://wiki.lwn.net/<anything>/<anything> will bring us to a much more interesting page.
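the three cases so far depend only on how many path segments follow the domain. as a rough python sketch of that observation (the function name observed_behaviour is hypothetical, and ignoring empty segments such as a trailing slash is an assumption, not something that was tested):

```python
from urllib.parse import urlsplit


def observed_behaviour(url):
    """Map a wiki.lwn.net URL to the behaviour observed for its path depth."""
    # Assumption: empty segments (e.g. from a trailing slash) do not count.
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) == 0:
        return "redirect to google.com"
    if len(segments) == 1:
        return "404 error page"
    if len(segments) == 2:
        return "generated scam page"
    return "not tested"  # deeper paths were not examined


print(observed_behaviour("https://wiki.lwn.net/"))
print(observed_behaviour("https://wiki.lwn.net/afssaf"))
print(observed_behaviour("https://wiki.lwn.net/lwn_is_fake_news/text"))
```

the query string is not part of the path, so a URL with two path segments and a long query still counts as two segments.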

consider https://wiki.lwn.net/lwn_is_fake_news/text.

if you try to access the site via the provided link, you may be deceived by a fake HTTP 404 error page.

it looks like this:



if we look at the HTML, we can see a cheeky little bit of javascript that checks if you were referred to the site through a search engine, or if the URL was entered directly.

````````````````````````````````````````````````````````````````````````````````
$ curl "https://wiki.lwn.net/lwn_is_fake_news/text"

<html lang='en'>
<head>
	<script>
		var ars = 'https://download.booklibrary.website/lwn-is-fake-news.pdf';
		if(['.google.', 'bing.', 'yahoo.', 'duckduckgo.', 'yandex.', '.sogou.', 'facebook.', 'pinterest.'].some(s => document.referrer.toLowerCase().includes(s)) || ['fb', 'facebook', 'pinterest', 'twitter'].some(s => navigator.userAgent.toLowerCase().includes(s))){ window.location.href = ars }
	</script>

	<meta charset='utf-8'>
    <title>Page Not Found</title>
    <meta name='viewport' content='width=device-width, initial-scale=1'>	
	<meta http-equiv='refresh' content='15;url=https://wiki.lwn.net/404.html'>
</head>
<body>
 <h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
<p>Additionally, a 404 Not Found
error was encountered while trying to use an ErrorDocument to handle the request.</p>
<hr>
<address>Apache/2.4.41 (Ubuntu) Server at wiki.lwn.net Port 443</address>

</body>
</html>
````````````````````````````````````````````````````````````````````````````````

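the code reads document.referrer (the browser's view of the "Referer" HTTP request header) to check if the site was accessed from Google, Bing, Yahoo, DuckDuckGo, Yandex, Sogou, Facebook, or Pinterest. it also checks the user agent string for "fb", "facebook", "pinterest", or "twitter". if either check matches, it redirects to a page on the domain download.booklibrary.website. otherwise the visitor stays on the fake 404 page, which is why curl (which does not run javascript) only ever sees the error.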
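to make the condition easier to poke at, here is a python transcription of the check in the script. the two marker lists are copied from the page source; the function name would_redirect and the sample referrers and user agents are made up for illustration.

```python
# Substrings copied from the script in the page source.
REFERRER_MARKERS = ['.google.', 'bing.', 'yahoo.', 'duckduckgo.',
                    'yandex.', '.sogou.', 'facebook.', 'pinterest.']
USER_AGENT_MARKERS = ['fb', 'facebook', 'pinterest', 'twitter']


def would_redirect(referrer, user_agent):
    """True if the page's script would send this visitor to the scam page."""
    referrer = referrer.lower()
    user_agent = user_agent.lower()
    return (any(marker in referrer for marker in REFERRER_MARKERS)
            or any(marker in user_agent for marker in USER_AGENT_MARKERS))


# Arriving from a search result page: redirected.
print(would_redirect("https://www.google.com/", "Mozilla/5.0"))  # True
# URL typed directly (empty referrer): stays on the fake 404.
print(would_redirect("", "Mozilla/5.0"))  # False
# A social media link preview bot: redirected via the user agent check.
print(would_redirect("", "facebookexternalhit/1.1"))  # True
```

one quirk of the substring match: '.google.' needs a dot on both sides, so a referrer of exactly https://google.com/ (no www) would not trigger the redirect.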

The full URL is download.booklibrary.website/lwn-is-fake-news.pdf.

it looks like this:



note that the keywords "lwn", "is", "fake", and "news" were taken from the original URL and used to form the URL https://ts2.mm.bing.net/th?q=lwn+is+fake+news, which hosts the image.

regardless of whether the resource actually exists or not, an image is generated based on keywords extracted from the URL.

in particular, keywords are taken only from the first directory after the root (e.g. the "text" keyword was not taken from wiki.lwn.net/lwn_is_fake_news/text).
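the mapping from page URL to image URL can be reproduced in a few lines of python. this is only a sketch of the observed behaviour, not the scam's actual code: the function name thumbnail_url is hypothetical, splitting on underscores matches the example above, and splitting on hyphens as well is an assumption.

```python
import re
from urllib.parse import urlencode, urlsplit


def thumbnail_url(page_url):
    """Build the image URL from the first path segment of a page URL."""
    segments = [s for s in urlsplit(page_url).path.split("/") if s]
    if not segments:
        return None  # no path, so no keywords to extract
    # Only the first segment is used; later ones (like "text") are ignored.
    # Assumption: hyphens separate keywords the same way underscores do.
    keywords = [w for w in re.split(r"[_\-]+", segments[0]) if w]
    return "https://ts2.mm.bing.net/th?" + urlencode({"q": " ".join(keywords)})


print(thumbnail_url("https://wiki.lwn.net/lwn_is_fake_news/text"))
# https://ts2.mm.bing.net/th?q=lwn+is+fake+news
```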

so there is theoretically an infinite number of these pages.

some of them are meant to look like legitimate downloads for existing literature, and have been indexed by google. some of them rank highly in search results (especially when searching for particularly obscure literature).

if one good thing is to come from this, we can generate some of these pages to be rather comical.

here are a few examples:
https://context4book.com/being_blunt_and_smoking_blunts_a_stoners_guide_to_successful_relationships
https://context4book.com/100totallylegitbookname69Omega9000WeedSmoker420
https://context4book.com/can_i_pee_in_the_sink
https://download.booklibrary.website/lwn_net
https://download.booklibrary.website/this_site_is_a_scam

--------------------------------------------------------------------------------
fake userbase
--------------------------------------------------------------------------------

each of these pages contains a populated comment section seeming to imply a userbase, but these are obviously fake for a number of reasons:
 - they all contain poorly written english.
 - the timestamps are all within the last 8 hours (+).
 - they are exactly the same in all pages, aside from titles being swapped where necessary.
 
(+) because, of course, dozens of people over the last eight hours were desperately searching for an infringing copy of Theoretical Acoustics by Philip M. Morse and K. Uno Ingard, and decided to share what a great experience they had providing their credit card information and downloading a PDF from download.booklibrary.website.
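the third point can be checked mechanically: replace each page's title with a placeholder and compare what is left. a minimal python sketch follows. the function names are hypothetical, the match is assumed to be exact and case-sensitive, and the sample comments are invented placeholders, not text copied from the site.

```python
def strip_title(comment, title):
    """Replace the page's title with a placeholder so pages can be compared."""
    return comment.replace(title, "{title}")


def same_comment_template(comments_a, title_a, comments_b, title_b):
    """True if two comment sections differ only by the swapped-in title."""
    a = [strip_title(c, title_a) for c in comments_a]
    b = [strip_title(c, title_b) for c in comments_b]
    return a == b


# Invented placeholder comments, for illustration only.
page_a = ["finally found Theoretical Acoustics here", "works great"]
page_b = ["finally found Discrete Mathematics here", "works great"]

print(same_comment_template(page_a, "Theoretical Acoustics",
                            page_b, "Discrete Mathematics"))  # True
```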

--------------------------------------------------------------------------------
the site
--------------------------------------------------------------------------------

while I was looking for a textbook online (related to graph theory), a particular result caught my eye. it was a download for "Discrete Mathematics and Its Applications" by Kenneth H. Rosen, highly ranked by google. it wasn't the text I was looking for, but what interested me was the URL.

this was it:
https://wiki.lwn.net/textbooks/Book?docid=IYN:1941&Academia=Discrete_mathematics_graph_theory_rosen_7th_edition(1).pdf

Let me emphasize, wiki.lwn.net/textbooks/.

it redirects to a sketchy looking page offering a free download for the resource:
https://download.booklibrary.website/discrete-mathematics-graph-theory-rosen-7th-edition.pdf.

it looks like this:



this page is not unique. searching google for "site:wiki.lwn.net" returns hundreds of thousands of these links.

pages like this are not even specific to the wiki.lwn.net subdomain. a myriad of similar links can be found on other domains, leading to pages that are nearly identical.

for example,
https://uniport.edu.ng/theoretical_acoustics_morse_ingard/view=9743330.

this page, indexed by google, offers a download for "Theoretical Acoustics" by Philip M. Morse and K. Uno Ingard. it is hosted on a deceptive URL, uniport.edu.ng, which will look the same in the address bar of most browsers as the legitimate homepage for the Nigerian University of Port Harcourt, www.uniport.edu.ng.

in this case, the user is redirected to a page on a different domain, context4book.com.

The full URL is https://context4book.com/download/4330427-theoretical-acoustics-morse-ingard.

it is a very similar looking page:
 


unlike before, the image is generic. it is hosted at https://context4book.com/img/pdf.jpg, and was not generated based on keywords from the URL like with the wiki.lwn.net links.

this occurs on URLs of the form https://context4book.com/download/<anything>.

more interestingly,

URLs of the form https://context4book.com/downloads/<anything> bring us to a different page altogether.

it looks like this:



the blurred pages are unrelated to the requested resource. they are generic so as to imply that the resource actually exists on the server.

although this page looks substantially different, it runs the same scam.

HOWEVER

putting these two exceptions aside (and any potential others), in general, URLs of the form https://context4book.com/<anything>/<anything> generate images through ts2.mm.bing.net/th?q=<keywords>, as with the wiki.lwn.net links (which have no such exceptions).
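the context4book.com cases described above can be summarised as a lookup on the first path segment. a rough python sketch of the observations (the function name context4book_page_type is hypothetical, "/downloads/example" is a made-up path, and anything outside the two-segment forms described here is reported as not covered):

```python
from urllib.parse import urlsplit


def context4book_page_type(url):
    """Map a context4book.com URL to the kind of page observed for it."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if len(segments) != 2:
        return "not covered"
    if segments[0] == "download":
        return "generic pdf.jpg image"
    if segments[0] == "downloads":
        return "blurred preview pages"
    return "keyword image from ts2.mm.bing.net"


print(context4book_page_type(
    "https://context4book.com/download/4330427-theoretical-acoustics-morse-ingard"))
print(context4book_page_type("https://context4book.com/downloads/example"))
print(context4book_page_type("https://context4book.com/lwn_is_fake_news/text"))
```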

--------------------------------------------------------------------------------
index pages
--------------------------------------------------------------------------------

the index page of a domain hosting this scam looks like this:



this is the index of context4book.com, the domain from the uniport.edu.ng links.

the index of download.booklibrary.website, the domain from the wiki.lwn.net links, looks similar.

in this case, a contact page is linked. this is not present on download.booklibrary.website.

it looks like this:



the "EMAIL US" link goes to https://www.watchdogsecurity.online, which is parked by a company called Sedo.

the "DMCA request" link goes to https://context4book.com/dmca.php.

it looks like this:



(the "contact form" link goes back to the contact page)

--------------------------------------------------------------------------------
payment and registration
--------------------------------------------------------------------------------

each of these pages will lead the user inevitably, in one way or another, to a login or registration form.

after entering an e-mail and password, the user is redirected to a payment page requesting credit card details. it claims that an amount will be charged to verify the payment method, and if successful, a free trial will be initiated permitting download of the requested resource.

the registration and payment pages can vary, both between domains and even within the same domain.

Here is one of the payment pages:



Here is another:



--------------------------------------------------------------------------------
the scam
--------------------------------------------------------------------------------

for obvious reasons, I cannot tell what will happen if one provides legitimate payment information.

but if I do pay for the service, and it is legitimate, I'd best be provided downloads for "100totallylegitbookname69Omega9000WeedSmoker420" and "Being Blunt and Smoking Blunts: A Stoner's Guide to a Successful Relationship".

i.e. the service is a scam by the impossibility of providing a download for user-generated content that does not exist.

--------------------------------------------------------------------------------
so what does LWN have to do with all this?
--------------------------------------------------------------------------------

obviously the folks over at LWN don't endorse, nor would they be complicit in, the operation of these scams. I will of course ensure that they are made aware of this.

that being said, this issue is not specific to LWN. The same scam is being run on several other reputable domains. I suspect account details were leaked from a domain registrar.

This is all for now.

~ cordac

--------------------------------------------------------------------------------

addendum 1:

This is the e-mail I sent to the folks at LWN.

================================================================================
Hello.

I recently became aware of a potential scam site being run on one of your subdomains, wiki.lwn.net. I am unsure if you are aware of this, but my intent is to inform you of it.

The index of the subdomain, on its own, will redirect to google.com, but any URL of the form wiki.lwn.net/<anything>/<anything> will redirect you to a scam page automatically generated based on keywords extracted from the URL.

VERY IMPORTANT:
If you try to access one of these links directly, you may be deceived by a fake 404 error page. There is JS in the page that checks the "Referer" HTTP request header and only redirects you to the scam page if you came from one of several hard-coded domains (mostly search engines).

A lot of these links are indexed and rank highly on google search. For example, the link I discovered initially was https://wiki.lwn.net/textbooks/Book?docid=IYN:1941%26Academia=Discrete_mathematics_graph_theory_rosen_7th_edition(1).pdf. 

Searching on google "site:wiki.lwn.net" provides hundreds of thousands of these links.

The same scam is being run on several other reputable domains, this is not specific to lwn.net. I suspect account details were leaked from a domain registrar.

I have recorded my finding (informally) on my website at  https://cordac.neocities.org/notes/wiki.lwn.net, should it be of interest.

Thanks for your time,
~ cordac
================================================================================

addendum 2:

(Update) after almost three weeks, the scam is still up, and I've not received any response. to be fair, it would probably have made more sense to contact the domain registrar instead. I will not pursue this issue further. I suspect this will be the last edit to this page, unless I end up receiving a response, in which case I will paste a version of it here.

addendum 3:

(Update) the wiki.lwn.net scam page is no longer up.