We’re currently running a debugger on the TaskAgent to see if we can catch the problem in the act. With some additional logging, @josecgomez has narrowed the issue down to the client side. We’re hoping it’s not a race condition that attaching a debugger will perturb enough to hide the issue, but even that would be a clue, I suppose.
Additional logging showed the IIS server responding to the client with a 200, but the client never saw the response and hung. Other loops in the task agent continued to run, so the service didn’t lock up entirely; something dropped into an exception or an infinite loop.
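To illustrate the spirit of that client-side logging (this is just a sketch for the thread, not the actual TaskAgent code; the wrapper name, timeout, and URL are placeholders), the idea is to put a hard upper bound and a timing log around each call, so a request the server has already answered but the client never completes shows up as a cancellation in the log instead of a silent hang:

```csharp
// Illustrative only -- not the actual TaskAgent code. All names here are made up.
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

static class HangDiagnostics
{
    static readonly HttpClient Client = new HttpClient();

    public static async Task<string> CallWithLoggingAsync(string url)
    {
        var sw = Stopwatch.StartNew();
        // Give the call a hard upper bound so a "server said 200 but the client
        // never returned" situation surfaces as a cancellation in the log.
        using (var cts = new CancellationTokenSource(TimeSpan.FromSeconds(120)))
        {
            try
            {
                using (var response = await Client.GetAsync(url, cts.Token).ConfigureAwait(false))
                {
                    var body = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
                    Console.WriteLine($"{url} -> {(int)response.StatusCode} in {sw.ElapsedMilliseconds} ms");
                    return body;
                }
            }
            catch (OperationCanceledException)
            {
                Console.WriteLine($"{url} -> no response after {sw.ElapsedMilliseconds} ms (hung?)");
                throw;
            }
        }
    }
}
```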
We’ve been testing in a few SaaS customers, and I haven’t heard of any issues so far.
That being said, it sounds like there is an issue. I haven’t been able to reproduce anything locally that I could debug. It’s always difficult when you can’t see the problem yourself and have to rely on the kindness (and logs) of strangers.
The bit of information from @jgiese.wci about Fiddler has me thinking of issues I’ve seen in the past.
I’ve been working with @josecgomez and things seem to be pointing back to something very low level.
I’ve got a couple ideas I can try, but without being able to reproduce on my machine, it is a bit slow going.
Also, define “future”… I guarantee you, 10 years from now, I will still have customers on 10.2.500… Hell we still have 5-6 customers on 905 with a Progress backend… lol
There was a pretty good network performance improvement in 11.2.200.18. If you aren’t on that release or higher, you will probably want to move. It cuts down the number of round trips needed to make a call.
Is anybody here who is having this issue running a version earlier than 11.2.200.18?
You are right. I don’t know the future plans for DMT, but the TA definitely does not need any UI.
The latency suspicion is understandable, but AFAIK the TA does not get huge responses from the server, so compression vs. no compression shouldn’t make much difference in its case from a latency standpoint…
If it were an async deadlock, the CPU would be low…
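For anyone not familiar with the “async deadlock” scenario being referenced: the textbook case is sync-over-async blocking under a single-threaded SynchronizationContext, where both sides wait on each other forever and the CPU sits idle. A purely illustrative sketch, not TaskAgent code:

```csharp
// Illustrative only: the classic sync-over-async deadlock. Under a single-threaded
// SynchronizationContext (WinForms/WPF/classic ASP.NET), blocking with .Result pins
// the only thread the awaited continuation needs, so both sides wait forever while
// CPU stays near zero. (A plain console app has no such context, so this exact
// snippet would not deadlock there.)
using System.Net.Http;
using System.Threading.Tasks;

class DeadlockExample
{
    static readonly HttpClient Client = new HttpClient();

    static async Task<string> GetDataAsync()
    {
        // Without ConfigureAwait(false), the continuation wants to resume on the
        // captured SynchronizationContext -- the same thread that is blocked below.
        return await Client.GetStringAsync("https://example.com/api");
    }

    public static string GetDataBlocking()
    {
        // Blocking the context thread here is what completes the deadlock.
        return GetDataAsync().Result;
    }
}
```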
But I am sure Jeff will figure it out
I highly suspect it’s due to a specific .NET version incompatibility (or glitch) with .NET Core… The HttpClient implementation in Core is completely different, and I know of other cases where it caused issues. Probably 2023 is not hitting the issue because .NET 4.8 should be pretty much at parity with .NET 6, but 4.7.2 might not be… Not sure when the version on the client side was changed…
I also have it set up in a 2023.1 migration environment on premises; no issue has been reported as of yet…
Theoretically, the server is completely independent from the client. I could have the server on Linux running PHP and still be able to create a .NET client application.
Of course, our custom serialization is involved here, but I think that would fail on deserialization, not on compression.
4.7.2 was many years ago. Since we moved away from WCF, the client has already been on 4.8 for a long time.
Compression is done in IIS, outside of .NET, and it’s independent of the .NET version.
The compression happens in the IIS engine after the .NET app server returns the payload.
The compression being an issue isn’t about the compression itself; it seems to just be highlighting a race condition or timing issue because it made things much faster.
Hence why the problem disappears when we introduce latency via Fiddler or the debugger.
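If anyone wants to sanity-check the compression theory in their own environment, one rough way to do it purely from the client is to hit the same endpoint once advertising gzip support and once not, and see whether the behavior changes. This is only an illustrative sketch; the URL and class name are placeholders:

```csharp
// Illustrative only -- a quick client-side test of the "compression just changes the
// timing" theory: call the same endpoint with and without asking IIS for gzip.
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class CompressionToggleTest
{
    public static async Task RunAsync(string url)
    {
        // Request 1: opt in to gzip; the handler sends Accept-Encoding and
        // auto-decompresses whatever IIS returns.
        var gzipHandler = new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
        using (var gzipClient = new HttpClient(gzipHandler))
        {
            var body = await gzipClient.GetStringAsync(url);
            Console.WriteLine($"gzip request: {body.Length} chars");
        }

        // Request 2: no Accept-Encoding header is sent, so IIS returns the payload
        // uncompressed and the response timing changes accordingly.
        using (var plainClient = new HttpClient(new HttpClientHandler
        {
            AutomaticDecompression = DecompressionMethods.None
        }))
        {
            var body = await plainClient.GetStringAsync(url);
            Console.WriteLine($"plain request: {body.Length} chars");
        }
    }
}
```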
Good news, it looks like we’ve found a solution. Still testing, but everything looks good. I was able to run a test application that pummeled the application server all weekend and had no issues. Usually, the test app would break in under 10 minutes.
This looks like an issue with the automatic decompression that .NET Framework 4.8 does on the client. I turned off the automatic decompression and added code to do the decompression ourselves. The current guess is that it is a threading issue, which moving the decompression to a later point avoids.
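For anyone curious, the general shape of that workaround looks something like the sketch below. To be clear, this is my own illustration of the approach, not the shipped code; the names and structure are placeholders. The idea: turn off the handler’s automatic decompression, keep advertising gzip to the server, and inflate the payload ourselves at a later, controlled point.

```csharp
// A sketch of the general approach described above, not the actual product code.
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class ManualDecompression
{
    static readonly HttpClient Client = new HttpClient(new HttpClientHandler
    {
        // Opt out of the framework's built-in automatic decompression.
        AutomaticDecompression = DecompressionMethods.None
    });

    public static async Task<string> GetAsync(string url)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Get, url))
        {
            // Still tell the server we accept gzip, since we inflate it ourselves.
            request.Headers.AcceptEncoding.ParseAdd("gzip");

            using (var response = await Client.SendAsync(request).ConfigureAwait(false))
            {
                var raw = await response.Content.ReadAsStreamAsync().ConfigureAwait(false);

                // Decompress at a later, controlled point instead of inside the handler.
                var isGzip = response.Content.Headers.ContentEncoding
                    .Any(e => string.Equals(e, "gzip", StringComparison.OrdinalIgnoreCase));
                Stream stream = isGzip ? new GZipStream(raw, CompressionMode.Decompress) : raw;

                using (var reader = new StreamReader(stream))
                {
                    return await reader.ReadToEndAsync().ConfigureAwait(false);
                }
            }
        }
    }
}
```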
I will be adding this code to 2024.1 (11.2.500), as it is our current development branch. We are readying 2023.2 (11.2.400) for release soon, so I will probably backport the code there as well.