I've been having an issue on an Umbraco 7.5.6 site hosted in an App Service on Azure where the indexes seem to be dropped after an unpredictable amount of time.

We're storing information, including some custom fields, on published news articles in the External Examine index to query stories from the index. This is consumed by our client-facing search API.

Initially, we thought that this might be caused by Azure swapping servers, so we removed the {computerName} parameter from the path under ExamineSettings.config. However, that didn't appear to have any effect.

Our current index path is ~/App_Data/TEMP/ExamineIndexes/External/

The ExamineSettings.config file is as follows:

<Examine>
  <ExamineIndexProviders>
    <providers>
      <add name="InternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
           supportUnpublished="true"
           supportProtected="true"
           analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

      <add name="InternalMemberIndexer" type="UmbracoExamine.UmbracoMemberIndexer, UmbracoExamine"
           supportUnpublished="true"
           supportProtected="true"
           analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net"/>

      <!-- default external indexer, which excludes protected and unpublished pages -->
      <add name="ExternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"/>

    </providers>
  </ExamineIndexProviders>

  <ExamineSearchProviders defaultProvider="ExternalSearcher">
    <providers>
      <add name="InternalSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
           analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net"/>

      <add name="ExternalSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine" />

      <add name="InternalMemberSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
           analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" enableLeadingWildcard="true"/>

    </providers>
  </ExamineSearchProviders>

</Examine>

Due to the unpredictable nature of this issue, short of writing a WebJob to republish the articles on a regular basis, I'm unsure of what to try next.

  • I would open a ticket to Azure support to see if they can figure out the process that is deleting the files. – Mario Lopez Oct 30 '17 at 05:31
  • I suspect that rather than being "dropped" the re-index process is deleting the index and then failing to replace it. Can you post the relevant lines from your `ExamineSettings.config` file? – Jason Elkin Oct 31 '17 at 12:40
  • @JasonElkin the `ExamineSettings.config` file is pretty much the default provided with Umbraco, the only relevant setting I saw in the [docs](https://our.umbraco.org/documentation/Reference/Config/ExamineSettings/) was `RebuildOnAppStart="true"`, which defaults to true. I've added settings to the ticket – Joe Mitchard Oct 31 '17 at 15:13

1 Answer

First, update your Examine config

The file system attached to Azure Web Apps is actually a UNC share, which can suffer from IO latency issues. Those in turn can cause Umbraco to flip out a little bit.

Try updating your ExamineSettings.config by adding the following attribute to the indexer(s):

directoryFactory="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory,Examine"

The SyncTempEnvDirectoryFactory enables Examine to sync indexes between the remote file system and the local environment's temporary storage directory. The indexes are then accessed from the temporary storage directory. This setting is required due to the nature of Lucene files and IO latency on Azure Web Apps.

This should take performance issues out of the equation.
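As a rough sketch of where the attribute goes, here is the ExternalIndexer entry from the question with the directoryFactory attribute added. This assumes the Examine version installed on the site includes SyncTempEnvDirectoryFactory, which is worth checking before deploying. The same attribute can be added to the InternalIndexer and InternalMemberIndexer entries.

```xml
<!-- Sketch only: ExternalIndexer from the question, plus the directoryFactory attribute.
     Assumes the installed Examine assembly contains SyncTempEnvDirectoryFactory. -->
<add name="ExternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
     directoryFactory="Examine.LuceneEngine.Directories.SyncTempEnvDirectoryFactory,Examine"/>
```

Changing ExamineSettings.config restarts the app, so expect the indexes to be rebuilt afterwards.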

Then, debugging

Indexing issues should be picked up in Umbraco's logs (some at Info level, some at Debug). If you're not already capturing Umbraco's logs, then use something like Papertrail or Application Insights to collect them and see if you can identify what's causing the deletion (you may need to drop the logging level to Debug to catch it).

N.B. if you do push logs to an external service, wrap the appender in the Async/Parallel provider from Umbraco Core. Here's an example config:

<log4net>

  <root>
    <priority value="Info"/>
    <appender-ref ref="AsynchronousLog4NetAppender" />
  </root>

  <appender name="AsynchronousLog4NetAppender" type="Umbraco.Core.Logging.ParallelForwardingAppender,Umbraco.Core">
    <appender-ref ref="PapertrailRemoteSyslogAppender"/>
  </appender>

  <appender name="PapertrailRemoteSyslogAppender" type="log4net.Appender.RemoteSyslogAppender">
    <facility value="Local6" />
    <identity value="%date{yyyy-MM-ddTHH:mm:ss.ffffffzzz} your-site-name %P{log4net:HostName}" />
    <layout type="log4net.Layout.PatternLayout" value="%level - %message%newline" />
    <remoteAddress value="logsN.papertrailapp.com" />
    <remotePort value="XXXXX" />
  </appender>

  <!--Here you can change the way logging works for certain namespaces  -->

  <logger name="NHibernate">
    <level value="WARN" />
  </logger>

</log4net>
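
To catch the Debug-level indexing entries, one option is to lower the root level in the config above. This is a minimal sketch that assumes the same appender names as the example. It will produce noticeably more log volume, so it is probably worth switching back to Info once the cause has been found.

```xml
<!-- Sketch only: same root element as above, with the level lowered from Info to Debug -->
<root>
  <priority value="Debug"/>
  <appender-ref ref="AsynchronousLog4NetAppender" />
</root>
```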
Jason Elkin