So, we’ve been experiencing an intermittent issue with our production Epicor for the past 3-4 weeks and are running out of ideas to try to fix it.
About 2-4 times a day, users’ Epicor will close with the message “Epicor is Offline”. Clicking “Retry” gives the message “The Caller was not Authenticated by the Service”.
Recycling the app pool after this happens helps for a time, but it inevitably goes down again.
Epicor suggested removing the server from the domain and adding it back, which we did last night, but we are still seeing the issue.
From an Epicor server standpoint, nothing changed when it started happening. Epicor support is claiming it’s some sort of domain issue and isn’t really providing assistance because of that.
We’re hosted privately in the Azure cloud if that helps anyone.
I’m finding the following in the Event Viewer around the time it went down.
Here’s some more detail of the error I found in my local Event Viewer:
Unable to reach the server. Retrying...
System.ServiceModel.Security.SecurityNegotiationException: The caller was not authenticated by the service. ---> System.ServiceModel.FaultException: The request for security token could not be satisfied because authentication failed.
at System.ServiceModel.Security.SecurityUtils.ThrowIfNegotiationFault(Message message, EndpointAddress target)
at System.ServiceModel.Security.SspiNegotiationTokenProvider.GetNextOutgoingMessageBody(Message incomingMessage, SspiNegotiationTokenProviderState sspiState)
--- End of inner exception stack trace ---
Server stack trace:
at System.ServiceModel.Security.IssuanceTokenProviderBase`1.DoNegotiation(TimeSpan timeout)
at System.ServiceModel.Security.SspiNegotiationTokenProvider.OnOpen(TimeSpan timeout)
at System.ServiceModel.Security.WrapperSecurityCommunicationObject.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Security.CommunicationObjectSecurityTokenProvider.Open(TimeSpan timeout)
at System.ServiceModel.Security.SecurityProtocol.OnOpen(TimeSpan timeout)
at System.ServiceModel.Security.WrapperSecurityCommunicationObject.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.SecurityChannelFactory`1.ClientSecurityChannel`1.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
at System.ServiceModel.Channels.CommunicationObject.Open()
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at System.ServiceModel.ICommunicationObject.Open()
at Epicor.ServiceModel.Channels.ChannelEntry`1.CreateNewChannel()
at Epicor.ServiceModel.Channels.ImplBase`1.GetChannel()
at Epicor.ServiceModel.Channels.ImplBase`1.HandleContractBeforeCall()
at Ice.Proxy.BO.ReportMonitorImpl.GetRowsKeepIdleTimeWithBallonInfo(String whereClauseSysRptLst, Boolean getBallonInfo, String whereClauseSysTask, String whereClauseSysTaskLog, DataSet& sysMonitorData)
And this is where I falter. I’m not too sure how to troubleshoot a domain controller issue. Was kinda hoping someone would have some suggestions to look at.
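One place to start is the trace itself. Here is a rough Python sketch, not anything Epicor ships, that pulls the exception types and stack frames out of a pasted Event Viewer entry so you can compare several outages side by side. The function name `summarize_trace` and the `sspi_negotiation` flag are my own inventions for illustration. In the trace above, the failure is raised from `SspiNegotiationTokenProvider` while the channel is being opened, which means the client is failing during Windows (SSPI) authentication negotiation, before any Epicor business logic runs.

```python
import re

# Fully qualified .NET exception names, e.g.
# System.ServiceModel.Security.SecurityNegotiationException
EXCEPTION_RE = re.compile(r"\b((?:[A-Za-z_]\w*\.)+\w*Exception)\b")
# Stack frame lines look like: "at Namespace.Type.Method(args)"
FRAME_RE = re.compile(r"^\s*at\s+([^\s(]+)\(")


def summarize_trace(text):
    """Summarize a pasted .NET stack trace (hypothetical helper).

    Returns the exception types in order of first appearance, the
    method name of every stack frame, and a flag telling whether any
    frame belongs to an SSPI negotiation class.
    """
    exceptions, frames = [], []
    for line in text.splitlines():
        frame = FRAME_RE.match(line)
        if frame:
            frames.append(frame.group(1))
            continue
        # Non-frame lines may carry one or more exception type names.
        for name in EXCEPTION_RE.findall(line):
            if name not in exceptions:
                exceptions.append(name)
    return {
        "exceptions": exceptions,
        "frames": frames,
        "sspi_negotiation": any("Sspi" in f for f in frames),
    }
```

If every outage produces the same exception pair and the same SSPI frames, that at least confirms you are chasing one authentication problem rather than several unrelated ones.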
Otherwise, DNS issues are what it seems like. You can flush DNS on the client machine if it’s a single-user issue; otherwise you might need to add a record for the Epicor server address in your DNS.
Close Epicor
Open Command Prompt
ipconfig /flushdns
Open Epicor and Connect
ipconfig /displaydns
This should show every hostname that has been resolved since the flush, plus the automatically assigned records for your local network. It might give you some ideas.
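If you want to repeat that check from several client machines, something like the Python sketch below might help. It only uses the standard library's `socket.getaddrinfo`. The function names, the hostname `epicor-app01.example.local`, and the address `10.0.0.5` are placeholders I made up, so substitute your own server name and the IP you expect it to have.

```python
import socket


def resolve_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted IP addresses hostname resolves to, or [] on failure.

    resolver is injectable so the function can be exercised without
    touching real DNS; by default it is socket.getaddrinfo.
    """
    try:
        infos = resolver(hostname, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address.
    return sorted({info[4][0] for info in infos})


def compare_resolution(hostname, expected_ip, resolver=socket.getaddrinfo):
    """Classify the lookup as 'ok', 'mismatch', or 'unresolved'."""
    addresses = resolve_addresses(hostname, resolver)
    if not addresses:
        return "unresolved"
    return "ok" if expected_ip in addresses else "mismatch"


if __name__ == "__main__":
    # Placeholder values: replace with your Epicor server name and its IP.
    print(compare_resolution("epicor-app01.example.local", "10.0.0.5"))
```

A result of "unresolved" or "mismatch" on an affected client, while a healthy client reports "ok", would point toward a missing or stale DNS record rather than something on the Epicor server.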
Try leaving and rejoining the domain on a client computer you’re not worried about breaking. If it can’t successfully rejoin the domain, that’s a problem.
You should contact an MSP if you aren’t able to tackle this issue yourself.