Respect nofollow links #32

Open
opened 2022-07-03 21:43:55 +02:00 by seanhamlin · 1 comment
seanhamlin commented 2022-07-03 21:43:55 +02:00 (Migrated from github.com)

The tool currently does not appear to respect nofollow links. As a result, a scan of one site enumerated every facet and pagination link on a search page, to the extent of 122,000 pages.

It would be great if nofollow links were respected by default. There is generally a good reason when people use them.

omohokcoj commented 2022-08-21 19:18:02 +02:00 (Migrated from github.com)

@seanhamlin adding an 'exclude' XPath selector should solve this issue:
image
I don't think excluding rel=nofollow links by default is a good idea, since some people might expect those pages to be crawled.
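
For anyone weighing the two options, here is a minimal sketch of what filtering nofollow links could look like in a crawler's link-extraction step. It is not this tool's implementation: `LinkCollector`, `extract_links`, the `respect_nofollow` flag, and the sample markup are all hypothetical names made up for illustration. In XPath terms the same idea is roughly `//a[contains(@rel, 'nofollow')]`, though whether the 'exclude' setting accepts exactly that expression is an assumption.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper, not the tool's code)."""

    def __init__(self, respect_nofollow=True):
        super().__init__()
        self.respect_nofollow = respect_nofollow
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if not href:
            return
        # rel is a space-separated token list, e.g. rel="nofollow noopener",
        # so compare tokens case-insensitively instead of the whole string.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if self.respect_nofollow and "nofollow" in rel_tokens:
            return
        self.links.append(href)


def extract_links(html, respect_nofollow=True):
    """Return the hrefs a crawler would queue from the given HTML."""
    parser = LinkCollector(respect_nofollow)
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up sample page: one normal link, two nofollow facet/pagination links.
SAMPLE = """
<a href="/about">About</a>
<a href="/search?page=2" rel="nofollow">Next</a>
<a href="/search?color=red" rel="noopener NoFollow">Red</a>
"""

if __name__ == "__main__":
    print(extract_links(SAMPLE))
    print(extract_links(SAMPLE, respect_nofollow=False))
```

Keeping the behaviour behind a flag covers both positions in this thread: the default can stay either way, and people who expect nofollow pages to be crawled can still opt in.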

dan added this to the siteinspector project 2025-10-26 13:12:46 +01:00
dan removed this from the siteinspector project 2025-10-26 13:14:51 +01:00
dan added this to the siteinspector project 2025-10-26 13:23:50 +01:00
dan removed this from the siteinspector project 2025-10-26 13:25:40 +01:00
This discussion has been locked. Commenting is limited to contributors.