SharePoint Crawls - Full, Incremental, Continuous

Nov 30, 2013 at 10:05 PM


Full Crawl

A full crawl of a content source re-indexes all of its content from the beginning.
 
Important Points to Consider:
 
  • If a new managed property has been introduced, a full crawl of the content source is required
  • If crawl rules have been created, updated, or deleted, a full crawl of the content source is required
  • If an incremental crawl has failed, run a full crawl
  • If a software update or service pack has been installed on the servers, run a full crawl
  • A full crawl is expensive in terms of performance, because every item is processed again
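
A full crawl can also be started from PowerShell. The following is a minimal sketch, assuming the farm has a single Search Service Application, that the content source is named "Local SharePoint sites" (the name used later in this article), and that the script runs on a farm server with the SharePoint snap-in available.

```powershell
# Load the SharePoint cmdlets when not running in the SharePoint Management Shell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: a single Search Service Application exists in the farm
$ssa = Get-SPEnterpriseSearchServiceApplication

# Assumption: the content source name; replace with your own
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"

# Request a full crawl of this content source
$cs.StartFullCrawl()
```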

Incremental Crawl

An incremental crawl of a content source processes only the items that have changed since the last crawl.
 
Important Points to Consider:
 
  • It is the preferred crawl type once a full crawl has completed.
  • It has far less performance impact than a full crawl, because it crawls only modified documents rather than the entire content source.
  • It retries items that return an error and postpones crawling an item if the error persists.
 
A limitation of full and incremental crawls is that they cannot run in parallel on the same content source. For example, if a full crawl is already running, an incremental crawl cannot be triggered until the full crawl completes. If you stop the full crawl, you must still finish at least one successful full crawl before any incremental crawl can be triggered.
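
Because of this limitation, a script that triggers an incremental crawl would normally check the crawl status first. The following is a rough sketch, again assuming a single Search Service Application and a content source named "Local SharePoint sites".

```powershell
# Assumption: a single Search Service Application and this content source name
$ssa = Get-SPEnterpriseSearchServiceApplication
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"

# Start an incremental crawl only when no other crawl is running
if ($cs.CrawlStatus -eq "Idle") {
    $cs.StartIncrementalCrawl()
} else {
    Write-Host "Crawl not started; current status is $($cs.CrawlStatus)"
}
```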
 
To address this limitation, Microsoft introduced the concept of continuous crawl.

Continuous Crawl

With continuous crawl you can keep the content index as fresh as possible.

  • More than one continuous crawl can run in parallel
  • One deep change will not degrade the freshness of all the changes that follow it
 
The performance impact of a continuous crawl is the same as that of an incremental crawl.
 
When crawls run in parallel, the continuous crawl stays within the parameters defined in the crawler impact rule, which controls the maximum number of simultaneous requests that can be made to the server (default 8).
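
A crawler impact rule can be created from PowerShell as well. This is only a sketch: the host name below is hypothetical, and depending on the farm topology the cmdlet may also need a `-SearchService` argument, so check `Get-Help New-SPEnterpriseSearchSiteHitRule` before relying on it.

```powershell
# Hypothetical host name; replace with the host the crawler should throttle
$hostName = "intranet.contoso.com"

# Limit the crawler to 8 simultaneous requests against that host
New-SPEnterpriseSearchSiteHitRule -Name $hostName -Behavior "SimultaneousRequests" -HitRate 8

# List the crawler impact rules that are currently defined
Get-SPEnterpriseSearchSiteHitRule
```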

Enable Continuous Crawl using PowerShell

 
#Get Search Service Application(SSA)
$ssa = Get-SPEnterpriseSearchServiceApplication
 
#Get the Content Source for which you want to enable continuous crawl
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"
 
#Set the EnableContinuousCrawls property to true
Set-SPEnterpriseSearchCrawlContentSource -Identity $cs -EnableContinuousCrawls $True
 
#Set the interval in minutes (optional). By default SharePoint starts a continuous crawl every 15 minutes; here it is set to 30
$interval = 30
 
$ssa.SetProperty("ContinuousCrawlInterval", $interval)
 
$ssa.Update()
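
To confirm that the change took effect, the same objects can be read back. A minimal sketch, assuming the same Search Service Application and content source name as in the script above:

```powershell
$ssa = Get-SPEnterpriseSearchServiceApplication
$cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity "Local SharePoint sites"

# True when continuous crawl is enabled on this content source
$cs.EnableContinuousCrawls

# Interval, in minutes, currently stored on the Search Service Application
$ssa.GetProperty("ContinuousCrawlInterval")
```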
 

Disable Continuous Crawl using PowerShell

 
To disable continuous crawl, use the same PowerShell script as above, but set the 'EnableContinuousCrawls' property to 'False':
 
Set-SPEnterpriseSearchCrawlContentSource -Identity $cs -EnableContinuousCrawls $False
 
 

Article by Dhaval Shah