I sometimes have to import a CSV file with 250,000 rows, which takes a very long time, usually around 22 hours. Someone told me that if I split the file into 4 separate ones and run them simultaneously, then it will be much faster. This seems silly but sure enough, it definitely uses more CPU that way and finishes in almost 1/4 of the time when split into 4.
My question is, does anyone know of a better way to speed up DMT imports that doesn’t require me to split up my CSVs? What is the rate limiting factor, why is DMT so slow? When I run a DMT import, DMT only uses about 2% of the available CPU. It seems as if it is artificially throttled in some way.
It’s all to do with the fact that the server is geared up to run x number of sessions and you are trying to push everything through 1 of those sessions.
The server is still looking to perform work for everyone else, even if there is no one else….
It really depends what you are uploading. The limiting factor is usually the checking of the database for data which needs to be there to validate your imported record.
As an example, Part import is one of the slowest imports ever in Epicor because it has many mandatory fields which need to be checked against other tables.
How you can make it faster:
make sure you only have the minimum columns needed in your DMT. Ex: if you do a part update, do you really need the part class. If not, skip it and Epicor will have one less check to do before validating your record;
split the file in a few DMT imports - look like DMT is single thread, when you start multiple DMTs it will use more processors and be slightly faster up to some point (i.e. if you split the file in 10, it will not be 10 times faster);
if your DMT is for update, not all mandatory fields might be needed. Less columns, more speed;
you might have to consider importing less data but more often (instead 250000 records per month better 12500 imported per day or even automate it with some service connect workflow to pick the data from some place every 30 minutes or so).
It seems to me that by breaking your main file into smaller files and running them simultaneously you are in a sense forcing the DMT to run multi-threaded. It would be nice if DMT had a multi-thread setting or feature but I’m not sure that it does? Maybe something to submit as an Enhancement/Feature Request here on e10help if they aren’t already working on it?
Until then, could you use Powershell to break apart your main file into smaller ones and then call the DMT to process them simultaneously? Essentially use Powershell to force DMT to run multi-threaded on files over a certain size?
If you do a search here on e10help and on Google I think you’ll find some good examples of how to automate DMT using Powershell and then you just need to figure out how to get Powershell to break the file up into smaller files…just a thought that I hope can maybe help.
Also turn off any BPMs and run your Benchmark in TST. It may even be a BPM… We had a BPM written by a Certified Epicor Consultant… Which ran at 35 seconds PER!!! I cut that bad boy down to 0.001s
We DMT in sometimes 200,000 at a time, no problem.
It also depends on what you are importing. Allowing changes to engineered jobs, for example, will allow you to import updates without first having to un-engineer and then going back later and engineering them. Disable credit checking on open orders in the Customer will allow you to add sales order lines more quickly than if credit checking was happening after every add/update row.
My import is Bill of Operations. The reason I am hesitant to cut up my CSV is twofold: 1) I can’t do it in Excel, because Excel drops leading zeros from our PartNums, changes PartNums like 3202E20 into 3.2e24, drops the trailing period if the revision has a “1.”, etc. So this all has to be done in Notepad++ manually. 2) I’m not sure about this but I was told that operations have to be imported in consecutive order. This means if I split up my .CSV and the last two lines in one are Op 20, Op 30, then the first line in the next .CSV created starts at Op 40. So by the time it gets to Op 20 in the first .CSV, then Op 30, 40 have already been imported form the second .CSV. I can work around this manually by only splitting the CSV at the end of the last Op for any given part, but this means that it can’t be done programmatically through powershell, etc.
After writing that out, it occurs to me that I can split the initial export-to-CSV up into several queries though, each producing its own CSV. This will work much better.
To prevent the truncation of zeros in excel, set the format of the cells to ‘text’ and then paste in your data and save it as a .xlsx file extension.
Not sure what version of DMT you are using but you might want to look into maybe using a DMT playlist which would enable you to use a list of import files if you break them apart. It is similar to going the powershell route.