Scanning a Website

Adding and scanning your website in OneTrust Cookie Consent is the first step in organizing and owning website content for ease of compliance. This is done by scanning and identifying first- and third-party cookies, tags, trackers, pixels, beacons and more.

When a new website is added to the application to be scanned, a record is created for the domain and the scan. The scan will remain in the Pending status while records are being created and until the scanner starts.

The scanner, a virtual machine that runs a Mozilla Firefox browser, picks up a message from the queue indicating that a scan has been submitted and begins the scanning process. The scanner then looks for the site's robots exclusion standard file, robots.txt.

Robots.txt communicates which areas of the site are not available for scanning. It also allows the scanner to find the sitemap for the website. From this point, the scanner acts like a user clicking from page to page within the website and records the cookies that are found on each page load.
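As a rough illustration of what a robots.txt file tells a crawler, the following Python sketch reads a sample file with the standard library's urllib.robotparser module. The file contents, the example.com URLs, and the ExampleBot user agent are made up for illustration. This is not the OneTrust scanner's actual implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
"""


def read_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer crawl questions."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = read_robots(SAMPLE_ROBOTS_TXT)

# Pages outside the disallowed path may be visited.
allowed = parser.can_fetch("ExampleBot", "https://www.example.com/products/")

# Pages under /private/ are off limits for every user agent.
blocked = parser.can_fetch("ExampleBot", "https://www.example.com/private/report")

# Sitemap URLs declared in the file (requires Python 3.8 or later).
sitemaps = parser.site_maps()
```

If cookies on a page are missing from your scan results, checking whether your own robots.txt disallows that path is a reasonable first step.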

The scanner will only scan pages that are within the domain that you have entered. For example, if you scan onetrust.com and a page on onetrust.com links to zentoso.com, the zentoso.com pages are not scanned. Subdomains are included in the scan only when you scan from the root domain. For example, if you scan www.onetrust.com, only that subdomain is scanned. If you scan onetrust.com, all subdomains are included in the scan.
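The scope rule above can be sketched as a small Python function. The function name in_scan_scope and the exact matching logic are assumptions made for illustration, based only on the description in this article. The scanner's real behavior may differ in edge cases.

```python
from urllib.parse import urlsplit


def in_scan_scope(entered_domain: str, url: str) -> bool:
    """Return True if `url` falls inside the scope implied by `entered_domain`.

    Assumed rule, taken from the description above: a name starting with
    "www." limits the scan to that exact host, while a root domain also
    covers all of its subdomains.
    """
    host = (urlsplit(url).hostname or "").lower()
    entered = entered_domain.lower().strip(".")
    if entered.startswith("www."):
        return host == entered
    return host == entered or host.endswith("." + entered)


# Scanning from the root domain covers its subdomains.
root_covers_subdomain = in_scan_scope("onetrust.com", "https://cookies.onetrust.com/")

# Scanning from www. covers only that subdomain.
www_covers_other = in_scan_scope("www.onetrust.com", "https://cookies.onetrust.com/")

# Links to other domains are never followed.
other_domain = in_scan_scope("onetrust.com", "https://zentoso.com/")
```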

Caution

If your scanned domain does not align with the domain where the script is integrated, consent will not be correctly captured and the banner will reappear every time you load the page.

The scanner treats any URL starting with www. as a subdomain.

If you scan www.onetrust.com, the OneTrust cookies will only write to this subdomain with base script functionality. 

If you scan from onetrust.com, the OneTrust cookies will write to www.onetrust.com, cookies.onetrust.com, consent.onetrust.com, and so on.

For more information, see OneTrust Cookies.

Cookies that are dropped on a user action rather than on page load (for example, a form submission other than login, or an add-to-cart action) are not recorded in the scan results.

While the scanner will attempt to reach all pages it can, scanning every page in your site is not necessary.  Scanning the majority of the pages will record the cookies and allow you to perform the necessary blocking activities efficiently.

If there is a particular page that you would like to include in the scan, or scan with priority, add it to the target pages in the scan configuration. You can also enter your sitemap directly in the scan configuration. For detailed instructions on both options, see Add Website Screen Reference. If you set a page limit in the scan configuration, the scanner scans only up to that number of pages.

The scanner can reach pages behind a login form. Using the credentials and form details that you enter in the application, the scanner matches the form on the site and submits the credentials.

The scanner will attempt to reach the login URL first. The login flow can redirect to a different domain to reach a login form. Both the login URL and the redirect domain must be entered in the login configuration.

When the scanning activity has completed, the status changes to Processing Data.

During the data processing activity, the scan data is migrated to your tenant and categorized based on Cookiepedia, the OneTrust cookie database.

Once the data processing activity is complete, the scan will move into a Completed status.

Scanner Statuses

  1. Pending

    Your scan has been submitted and the proper tasks are being created for the scan to begin.

  2. Scanning

    The scanner is now scanning the domain that you entered. The scanner acts like a user clicking from page to page. During this process, the cookies dropped on each page load are recorded.

    Note

    While it is normal for scans to take multiple days, if your scan is taking too long to complete, you can use Get Help in the Context menu to retrieve additional scan metadata. This will help support resolve the issue.

  3. Processing Data

    The data collected by the scanner is being processed and migrated from our scanner database to your tenant database. During this process, the cookies recorded are compared to our Cookiepedia database for categorization. If a prior categorization is not found, the cookie is placed in the "Unknown" category.

    Note

    If you attempt to reprocess data that is still in Processing Data, you will receive a notification. Please allow more time for the reprocess to be completed.

  4. Completed

    The scan has completed. In this state, you can click into the domain and view the scan results.

  5. Failed Login

    Authentication of the scanner failed. Check your login credentials.

  6. Scan Error

    This status most likely means that the scan has completed, but there was an issue migrating the data. Open the Context menu and select Re-process. Allow the scanner at least 2 hours to re-process the scan results.
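For teams that track scans in their own runbooks or scripts, the statuses above can be summarized as a simple lookup. The ScanStatus and suggested_action names below are hypothetical and do not correspond to any OneTrust API. The suggested actions are paraphrased from the descriptions in this list.

```python
from enum import Enum


class ScanStatus(Enum):
    """Status labels as they are described in this article."""
    PENDING = "Pending"
    SCANNING = "Scanning"
    PROCESSING_DATA = "Processing Data"
    COMPLETED = "Completed"
    FAILED_LOGIN = "Failed Login"
    SCAN_ERROR = "Scan Error"


# Suggested follow-up for each status, paraphrased from the list above.
SUGGESTED_ACTION = {
    ScanStatus.PENDING: "Wait for the scanner to start.",
    ScanStatus.SCANNING: "Wait; use Get Help in the Context menu if the scan seems stuck.",
    ScanStatus.PROCESSING_DATA: "Wait for processing to finish before reprocessing.",
    ScanStatus.COMPLETED: "Open the domain and view the scan results.",
    ScanStatus.FAILED_LOGIN: "Check the login credentials.",
    ScanStatus.SCAN_ERROR: "Select Re-process in the Context menu and allow at least 2 hours.",
}


def suggested_action(status_label: str) -> str:
    """Look up the follow-up for a status label as shown in the Websites list."""
    return SUGGESTED_ACTION[ScanStatus(status_label)]
```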

To add a website to the scanner

  1. On the Cookie Compliance menu, select Websites. The Websites screen appears. 

  2. Click the Add Website button. The Scan Website screen appears.

  3. Complete the fields. For more information, see Add Website Screen Reference.

  4. Click the Scan and Configure button.

    Note

    Click the Scan Only button to scan your website without configuring a container.

  5. Select an Experience Kit (geolocation rule group) to which you want to assign the website.

    For information on configuring custom Experience Kits, see To configure a custom Experience Kit.

  6. Click the Scan button. The Review Configurations screen appears.

  7. Review your Banner and Preference Center configurations.

  8. Click the Confirm button.

  9. When the website scan is finished, the status will display as Completed.

To configure a custom Experience Kit

  1. On the Assign to Container screen, click the Create Custom Experience Kit button. The Select Audience screen appears.

  2. Enter a name and select the framework(s) for your Experience Kit.

    Configure additional languages by clicking the Manage Languages button.

  3. Click the Next button. The Choose a Cookie Banner Layout screen appears.

  4. Configure your banner layout and branding.

  5. Click the Next button. The Choose a Preference Center Layout screen appears.

  6. Configure your Preference Center layout and branding.

  7. Click the Next button. The Review Configurations screen appears.

  8. Review your configurations and make changes as needed.

  9. Click the Confirm button.

Add Website Screen Reference


Field

Description

Website URL

Enter the URL of the site you want to scan. The depth of the scan depends on how the URL is entered: the domain you enter is the top-level domain scanned, and subdomains are scanned or skipped as described below.

URL Entered            Scanning Result
www.example.com        Only scan the root domain.
example.com            Scan the root domain and all subdomains.
www.example.com/sub    Only scan the subdomain.
example.com/sub        Scan the subdomain and all lower domains.

Note

The scanner text field has a character limit of 1024 characters.

Organization

Select the organization responsible for the website.

Geolocation

Note

This field is currently available upon request to enterprise licenses only. To enable this feature, contact OneTrust support.

Select the location from which you want the scan to originate.


Limit scan to [number] pages

Enter the number of pages to which the scan should be limited.

This will make the scan complete faster, but may produce incomplete results.

Slow Scan

Increases the timeout values on scanned pages to detect "lazy loading" cookies.

Note

To enable the Slow Scan setting, contact OneTrust Support.

Limit to this path within site

Enable this setting to limit the scan to a certain path within the domain.

Note

In the Website URL field, enter the URL using the format domain.com/path/.

Enable Unique User Agent

Enable this setting to set the scanning agent to OneTrustBot.

To whitelist the OneTrust user agent, use the following user agent string:

Mozilla/5.0 (X11; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0;OneTrustBot;
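On your own servers or firewall, recognizing the scanner usually comes down to checking the User-Agent request header for the OneTrustBot token. The Python sketch below shows one way to do that. The function name is_onetrust_scanner is hypothetical, and how you apply the check depends on your infrastructure.

```python
ONETRUST_BOT_TOKEN = "OneTrustBot"


def is_onetrust_scanner(user_agent: str) -> bool:
    """Return True if the User-Agent header carries the OneTrustBot token."""
    return ONETRUST_BOT_TOKEN in (user_agent or "")


# The user agent string documented above.
scanner_ua = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:87.0) "
    "Gecko/20100101 Firefox/87.0;OneTrustBot;"
)

# An ordinary Firefox user agent without the token.
visitor_ua = "Mozilla/5.0 (X11; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0"

scanner_detected = is_onetrust_scanner(scanner_ua)
visitor_detected = is_onetrust_scanner(visitor_ua)
```

Because any client can send an arbitrary User-Agent header, it is safer to combine this check with the scanner IP addresses listed under Cookie Scanner IP Addresses.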

Scan Pages with Query Parameters

If you want to limit the scan to pages with certain query parameters (which appear at the end of a URL in the format ?parameter=value), enter the parameters separated by commas.
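To preview which of your URLs carry a given query parameter, a short Python check along these lines may help. The function name has_query_parameters is hypothetical, and the sketch assumes that a page qualifies when it carries at least one of the listed parameters. The scanner's exact matching rule is not documented here.

```python
from urllib.parse import parse_qs, urlsplit


def has_query_parameters(url: str, wanted: str) -> bool:
    """Return True if `url` carries at least one of the comma-separated names.

    `wanted` uses the same comma-separated format as the field above.
    Matching on "at least one" is an assumption made for this sketch.
    """
    names = {name.strip() for name in wanted.split(",") if name.strip()}
    present = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name in present for name in names)


with_parameter = has_query_parameters(
    "https://example.com/shop?category=shoes&page=2", "category, color"
)
without_parameter = has_query_parameters("https://example.com/shop", "category")
```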

Target Pages to Scan

If you want to include, exclude, or target certain pages in the scan, configure the fields and enter the URLs for the pages or paths.


Field

Description

Page List Name

Enter a name for the list of pages you want to scan.

Include / Exclude / Target

Select an option to include only, exclude, or target specific pages in your scan.

  • Include: scan only the pages, paths, or subdomains you specify.

  • Exclude: scan your domain except for the pages, paths, or subdomains you specify.

  • Target: scan the pages, paths, or subdomains you specify first before scanning the rest of your domain.

Page / Path / Subdomain

Select an option to define the scope of your inclusion, exclusion, or targeting.

  • Page: limit the scan to specific pages on your domain.

  • Path: limit the scan to all the pages on specific paths within your domain.

  • Subdomain: limit the scan to specific subdomains within your domain.

URLs

Enter the exact URLs for the pages, paths, or subdomains, with each one on its own line.
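The difference between Include, Exclude, and Target can be illustrated with a short Python sketch. The plan_scan function, the prefix-based matching, and the example URLs are all assumptions made for illustration. The optional page_limit argument mirrors the Limit scan to [number] pages setting.

```python
from typing import List, Optional


def plan_scan(discovered: List[str], listed: List[str], mode: str,
              page_limit: Optional[int] = None) -> List[str]:
    """Apply an Include / Exclude / Target list to discovered page URLs.

    Matching is a simple prefix test, which is an assumption of this sketch.
    """
    def matches(url: str) -> bool:
        return any(url.startswith(entry) for entry in listed)

    if mode == "include":
        planned = [url for url in discovered if matches(url)]
    elif mode == "exclude":
        planned = [url for url in discovered if not matches(url)]
    elif mode == "target":
        # sorted() is stable, so listed pages move to the front and the
        # original order is kept inside each group.
        planned = sorted(discovered, key=lambda url: not matches(url))
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return planned if page_limit is None else planned[:page_limit]


pages = [
    "https://example.com/",
    "https://example.com/blog/a",
    "https://example.com/shop/cart",
    "https://example.com/blog/b",
]
blog = ["https://example.com/blog/"]

included = plan_scan(pages, blog, "include")
excluded = plan_scan(pages, blog, "exclude")
targeted = plan_scan(pages, blog, "target", page_limit=3)
```

With a page limit in place, Target is the only mode that changes which of the remaining pages are reached first, which is why it is useful for pages that must appear in the results.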

Sitemaps URLs

If you want to scan a site in a particular pattern or scan pages that may not be easily accessible through the website interface, enter the URIs for hosted XML sitemaps, each on its own line.
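A hosted XML sitemap is a list of page URLs in the sitemaps.org format. The Python sketch below parses a small, made-up sitemap with the standard library to show which URLs a crawler would learn from it. The sample content and the sitemap_urls function name are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, for illustration only.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/account/preferences</loc></url>
</urlset>
"""


def sitemap_urls(xml_text: str) -> List[str]:
    """Return the page URLs listed in the <loc> elements of a sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


urls = sitemap_urls(SAMPLE_SITEMAP)
```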

To re-scan an existing website

  1. On the Cookie Compliance menu, select Websites. The Websites screen appears.

  2. Hover over the row for the website that you want to re-scan until the Context Menu icon appears.

  3. Click the Context Menu icon. The Context menu appears.

  4. Select Re-scan. The Re-audit modal appears.

  5. Complete the fields. For more information, see Add Website Screen Reference.

  6. Click the Start Scan button.

To reassign a website to a different organization

  1. On the Cookie Compliance menu, select Websites.  The Websites screen appears.

  2. Hover over the row for a website you want to reassign until the Context Menu icon appears. 

  3. Click the Context Menu icon. The Context menu appears.

  4. Select Reassign. The Reassign Organization modal appears.

  5. In the Organization field, select the organization to which you want to assign the website.

  6. Click the Reassign button.

To schedule a website scan

  1. On the Cookie Compliance menu, select Websites.  The Websites screen appears.

  2. Hover over the row for the website for which you want to schedule a scan until the Context Menu icon appears. 

  3. Click the Context Menu icon. The Context menu appears.

  4. Select Schedule. The Schedule Scan modal appears.

  5. Configure the fields.

    Field                         Description
    Scan frequency (in months)    Select the number of months to determine scan frequency.
    Next Scan                     Select a date for the next scan.
    Geo-location                  Select a scanner location.

  6. Click the Save button.
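When planning a schedule, it can help to see which dates a monthly frequency produces. The following Python sketch adds whole months to a date. The next_scan_date function is hypothetical, and clamping to the last day of shorter months is an assumption of this sketch. The application may calculate recurring dates differently.

```python
import calendar
from datetime import date


def next_scan_date(last_scan: date, frequency_months: int) -> date:
    """Add whole months to a date, clamping to the end of shorter months."""
    month_index = last_scan.month - 1 + frequency_months
    year = last_scan.year + month_index // 12
    month = month_index % 12 + 1
    # monthrange() returns (weekday of first day, number of days in month).
    day = min(last_scan.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


quarterly = next_scan_date(date(2024, 11, 15), 3)
month_end = next_scan_date(date(2024, 1, 31), 1)
```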

Note

Schedule scans for your domains based on how often new content is added, so that your Cookie Consent implementation stays up to date.

To stop a website scan

  1. On the Cookie Compliance menu, select Websites.  The Websites screen appears.

  2. Hover over the row for an In Progress website scan you want to stop until the Context Menu icon appears.

  3. Click the Context Menu icon. The Context menu appears.

  4. Select Stop. A confirmation modal appears.

  5. Click the Confirm button.

Note

If a scan is in the Scanning status, the scan stops and its results are available as the latest scan result.

If a scan is in the Pending status and there are no previously completed scans for the site, the scan is removed.

If a scan is in the Pending status and there is a history of one or more successful scans, only the new scan is removed from the pending list, and previous reports remain available as before.

To view a website's activity history

In Cookie Consent, you can access an audit log of activity for each of your domains to review any changes made over time.

  1. On the Cookie Consent menu, select Websites. The Websites list screen appears.

  2. Select a website from the list. The Website Details screen appears.

  3. Go to the Activity tab. The Activity screen appears.

  4. Review the website's activity history.

To export website data

  1. On the Cookie Consent menu, select Websites. The Websites list screen appears.

  2. Click the Column Selector icon. The Website Column Selector modal appears.

  3. Use the arrow keys to configure column visibility for the Websites list screen. This configuration will be reflected in the export.

    Note

    Column sorting will also be reflected in the export file. Click the header of a column to sort the list screen in ascending or descending order of that column's data.

  4. Click the Save button.

  5. Click the Export button in the header. A confirmation modal appears.

  6. Click the Confirm button.

  7. Click the Notifications icon. A website data export appears in the Notifications popover.

  8. Click Download. An Excel report downloads to your device.

  9. View the data export.

To delete a website from the scanner

  1. On the Cookie Compliance menu, select Websites.  The Websites screen appears.

  2. Hover over the row for a website you want to delete until the Context Menu icon appears.

  3. Click the Context Menu icon. The Context menu appears.

  4. Select Delete. A confirmation modal appears.

  5. Select the data you want to delete.

  6. Click the Confirm button.

Cookie Scanner IP Addresses

Follow the links to find the current scanner IP addresses.

OneTrust

CookiePro

 
