We are considering adding another non-production environment at our company and I am looking for some general feedback on what other On Prem customers are doing with their Test/Dev/Pilot/etc. environments…
Not interested in hearing about the server setup / infrastructure side… I am curious about the functional use cases for each environment - development, testing, training, etc. Also how frequently do you overwrite it with a copy of production.
I’ve seen a lot of environments over the years and have seen little consistency once a customer is past their initial go live… For us our environments have historically been dictated by whatever major projects or upgrades we’re working on, and I think we’ll always have to deal with that to a degree, but I also think we should have “steady state” environments that are all on the same version of Epicor and have clear and unchanging purposes.
I can share more about where I think we’re going with this at our company, but I’d really like to hear what others are doing, what has worked or not worked, or what you would do in an ideal world.
I keep mine simple - I have a Prod and Test environment that are always the same version and open to the user community all the time. I copy Prod to test every quarter and have a SQL script to make all the necessary changes in the DB for company names, directories, external services, task agent, etc.
Then we have a Development sandbox for use by two of my team. This is a copy of Prod as well, but only when we feel like updating.
Everything is a VM and we use snapshots religiously. The Sandbox is also where we ‘test’ all the patches and version updates over and over again until we feel like the process will go smoothly on actual ‘upgrade day’.
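For anyone wondering what a post-refresh script like the one mentioned above might look like in spirit, here is a minimal sketch. It is not the poster's actual script and it does not use the real Epicor schema: the `settings` table, the keys, and every value in `TEST_OVERRIDES` are hypothetical, and an in-memory SQLite database stands in for the restored copy of Prod. The idea it illustrates is keeping all environment-specific overrides in one place and failing loudly if an expected setting is missing.

```python
import sqlite3

# Hypothetical overrides. None of these keys or values come from a real Epicor schema.
TEST_OVERRIDES = {
    "company_name": "ACME (TEST)",
    "report_directory": r"\\testserver\reports",
    "tax_service_url": "https://sandbox.example.invalid/tax",
    "task_agent_enabled": "0",
}


def apply_overrides(conn, overrides):
    """Rewrite environment-specific settings after a Prod -> Test restore."""
    with conn:  # commits on success, rolls back if anything raises
        for key, value in overrides.items():
            cur = conn.execute(
                "UPDATE settings SET value = ? WHERE key = ?", (value, key)
            )
            # A missing key usually means the schema or the script has drifted.
            if cur.rowcount != 1:
                raise RuntimeError(f"expected exactly one row for {key!r}")


def demo():
    """Build a stand-in for a freshly restored copy of Prod, then fix it up."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")
    conn.executemany(
        "INSERT INTO settings VALUES (?, ?)",
        [
            ("company_name", "ACME"),
            ("report_directory", r"\\prodserver\reports"),
            ("tax_service_url", "https://live.example.invalid/tax"),
            ("task_agent_enabled", "1"),
        ],
    )
    conn.commit()
    apply_overrides(conn, TEST_OVERRIDES)
    return dict(conn.execute("SELECT key, value FROM settings"))
```

Calling `demo()` returns the settings as a dictionary so you can eyeball that nothing still points at production before opening the environment to users.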
I don’t recommend this… but we have 9. Do we need 9? Absolutely not. It’s a pain to keep up with. I would say we could probably work with Live, Test, Train, maybe a second Train environment since we are implementing multiple branches at once, a pure Dev environment, and one slot for various other uses where you don’t want to mess up another environment. So that would be down to 6 I guess, but only because we are a large team within a large company. Once the whole company is live, I would think we could drop down to Live, Test, Train, and Dev. And maybe that free spot for weird situations.
Pilot is my Dev instance.
Test is for user experimentation, but I’ll also promote items from dev to Test for user feedback. Everyone has access.
Prod is prod.
Pilot and Test are refreshed from Prod about monthly, or whenever I complete a project or series of projects. Actually doing a refresh right now.
For major changes and updates we’ll spin up a duplicate VM as needed.
We have 5 active companies along with a consolidations company. Refreshing a pilot/test could affect several ongoing projects. I’ve found it easier to have multiple segregated environments, but we haven’t reached 9 … yet …
This is a DevOps question. Without Adult Supervision (meaning good DevOps practices), what I call “development diarrhea” becomes a significant risk. I had one customer with three really outstanding (kids) young professionals in their development department, each with their own 4-instance Epicor environment, and zero Adult Supervision, and it was a mess. They kept overwriting each other’s work. I had a different customer with only two developers, both fairly seasoned professionals, one using TEST and one using TRAIN, and THEY kept overwriting each other.
It’s not the number of environments or instances (although it is self-evident that fewer is better), it’s the management of what you have.
DevVNext (whatever version we plan to upgrade to next; this is on a separate server)
Our DB is well over 200GB, so the hard disk space for 3x that number is already a lot. Add space for the log files, 2 app server OSes, and the SQL Server OS, and I just find it hard to justify asking for more environments than that. (I’m pooling all that because even though it’s 3 servers, it’s still one virtual environment.)
I mean sure, it would be nice to have a 4th one just for integration testing, because I hate to sabotage the DB and have them think it was something they did wrong.
But I think 3 is plenty for the small crew we have (just 2 of us full time).
Refresh schedule is when I feel like it and/or when a project is done etc.
Staging
To test customizations prior to release to production. Updated nightly.
Upgrade
Updated for the most recent version running production data. Similar to Pilot or Test in cloud. To test new functionality and run automated testing.
Projects
Copy of production restored or updated for special projects as requested.
Education
Using Azure Serverless AKS (2024.1+), which only runs as needed. Restore to Epicor base images on demand.
Development
This will be the most controversial… Each developer has their own VM running locally with no production data whatsoever. There have been too many data breaches with devs and 3rd party consultants. The test image contains only the static information as a base install (GL, GL Control Codes, UOMs, Part Class, Group, etc.). Test data is added as needed. This test data is used to drive automated tests. Snapshots are taken and systems are restored between test runs. It has to be stupid quick to spin up dev instances.
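To make the snapshot-and-restore cycle described above concrete, here is a small sketch of the pattern. It is an illustration under loud assumptions: an in-memory SQLite database stands in for the dev VM image, and the `uom` and `part` tables are made up rather than taken from the Kinetic schema. The real setup uses VM snapshots, not SQL dumps. The point is only the shape of the workflow: build a base image with static reference data, snapshot it once, and give every test run a fresh copy.

```python
import sqlite3


def build_base_image():
    """Create a database holding only static reference data, no production data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE uom (code TEXT PRIMARY KEY)")    # hypothetical table
    conn.execute("CREATE TABLE part (num TEXT PRIMARY KEY, uom TEXT)")  # hypothetical
    conn.executemany("INSERT INTO uom VALUES (?)", [("EA",), ("KG",)])
    conn.commit()
    return conn


def take_snapshot(conn):
    """Capture the whole database as a SQL script."""
    return "\n".join(conn.iterdump())


def restore_snapshot(snapshot_sql):
    """Spin up a fresh, independent instance from a snapshot."""
    fresh = sqlite3.connect(":memory:")
    fresh.executescript(snapshot_sql)
    return fresh


def run_test_with_data(snapshot_sql, parts):
    """Restore the base image, add per-test data, and return the part count."""
    conn = restore_snapshot(snapshot_sql)
    conn.executemany("INSERT INTO part VALUES (?, ?)", parts)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM part").fetchall()[0][0]
```

Because each run starts from the same snapshot, test data added by one run never leaks into the next, which is what makes the results repeatable.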
Other Considerations
Have to think about interfacing systems when in non-production environments:
Didn’t even consider those for this question. I have SSRS installed on all instances (and keep it up to date even with my custom reports). I also have a Dev ECM instance, and my ‘script’ takes care of ECM, QuickShip, Avalara, and Collaborate connections to the associated “dev” instances (or whatever needs to change). All of my non-Prod instances point to the Dev instance of those products.
Yup. And neither of the environments I spoke about had ANY effective management. (Note the use of the word “effective”… in both cases there WERE managers…)
I can’t comprehend developing like this. I would spend eternity just trying to recreate the data conditions that would represent a valid test of a customization, especially with reporting. So many things you only discover AFTER you test it. There is no way you can know BEFORE you test it what all those edge cases are that you need to worry about, in order to create the right data in advance.
That’s because most of us don’t! Hey, I did say it was controversial!
Kent Beck champions a programming method called Test Driven Development, which states that you write your tests AS YOU DEVELOP. You don’t continue coding until all tests pass. Well, to write those tests, you need data to cover your test scenarios. These tests are mostly unit tests and not the integration tests that we do. One could certainly do this for Epicor Functions though. Method and Data Directives would be tougher since they can’t be tested in isolation without some kind of test harness, but I bet smart folks like @jgiese.wci, @hkeric.wci, @klincecum, and @Olga can rig something like this.
The point of this is to be able to test for regressions any time your code or the underlying framework changes. Think about how much faster we could upgrade Kinetic versions if we actually did work this way.
You are absolutely correct. But when edge cases do come up, and you’ve figured out the issue, you can then add a new test to cover that case. You will also know if you break anything when you do fix it.
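As a rough illustration of adding a test once an edge case turns up, here is a sketch using Python's `unittest`. The `extended_price` function is a hypothetical stand-in for a customization's business rule; it is not Epicor code. The first two tests represent what existed originally, and the last two represent tests added after an edge case was found and fixed.

```python
import unittest


def extended_price(quantity, unit_price, discount_pct=0):
    """Hypothetical business rule standing in for a customization under test."""
    if not 0 <= discount_pct <= 100:
        raise ValueError("discount_pct must be between 0 and 100")
    return round(quantity * unit_price * (1 - discount_pct / 100), 2)


class ExtendedPriceTests(unittest.TestCase):
    # Original tests written alongside the code.
    def test_no_discount(self):
        self.assertEqual(extended_price(3, 10.0), 30.0)

    def test_partial_discount(self):
        self.assertEqual(extended_price(2, 50.0, 10), 90.0)

    # Added later, after an edge case surfaced in a shared environment.
    def test_full_discount_is_zero(self):
        self.assertEqual(extended_price(5, 20.0, 100), 0.0)

    def test_discount_over_100_is_rejected(self):
        with self.assertRaises(ValueError):
            extended_price(1, 10.0, 150)
```

Saved in a file, this runs with `python -m unittest`. Once the edge-case tests are in the suite, any later change or version upgrade that breaks the fix shows up as a failing test.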
What I didn’t mention above is that I would love a way to right mouse click a field and download all of the related data, anonymize it, and use that as input into the dev system. There are tools out there that will create test data too, but I’m not sure how well that works with a 3rd party system like Kinetic vs home grown.
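The anonymizing step in that wish could be sketched along these lines. This is only an assumption about how one might do it, not an existing tool: the field names in `SENSITIVE_FIELDS` are hypothetical, and rows are plain dictionaries and not a real Kinetic export. Sensitive values are replaced with keyed-hash pseudonyms so that the same input always maps to the same token, which keeps related rows joinable in the dev system.

```python
import hashlib
import hmac

# Hypothetical field names. A real extract would need its own reviewed list.
SENSITIVE_FIELDS = frozenset({"customer_name", "email"})


def pseudonymize(value, secret):
    """Map a value to a stable token using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"anon-{digest[:12]}"


def anonymize_rows(rows, secret, sensitive=SENSITIVE_FIELDS):
    """Return copies of the rows with sensitive fields replaced by pseudonyms."""
    cleaned = []
    for row in rows:
        new_row = {}
        for key, value in row.items():
            if key in sensitive and value is not None:
                new_row[key] = pseudonymize(str(value), secret)
            else:
                new_row[key] = value
        cleaned.append(new_row)
    return cleaned
```

The secret should stay out of the dev environment. Anyone who has it can check whether a guessed value matches a token, so pseudonymized data still needs to be protected.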
Yes, but this is debugging and not testing. Here we would use the staging or test system and observability tools to look for the bad data or logic and then write the test to cover it.
The question to ask ourselves: is the current methodology helping us debug and upgrade faster? I’m certainly open to other suggestions.
We are currently doing automated testing (POC) using BrowserStack + Selenium for both UI & endpoints.
My colleague designed our unit tests and has written up some awesome documentation. (Reserved for Ry)
Of course. The danger we’re trying to avoid here is data breaches from test systems. Recently, Microsoft was breached because of an insecure test system. It’s OK to use production data, we just need to make sure we’re protecting production data no matter which environment it is in. I’m not the only crazy person to suggest this. We just don’t want to end up where Fox News did a couple of years ago.