The three different types of data manipulation


Do your research and compare your options before making a decision on data manipulation. Picture: Pexels

It is worth exploring what data manipulation means and the three different types there are.

In simple terms, data manipulation is the moving around and preparing of data before any analysis takes place. The three types are manual, semi-automated and fully automated.

So if you collect data from your bench test equipment, put it on your local machine, re-arrange the data in Excel and then push it to your analysis tool, that is manual data manipulation.

If you collect data onto a machine and run scripts on it, perhaps in Perl or R, that would be considered semi-automated data manipulation.
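
As a rough illustration of what such a tidy-up script might look like (shown here in Python rather than Perl or R; the file name, column names and pass limit are hypothetical, not a real tester format):

```python
# A minimal sketch of a semi-automated step: a script that tidies a raw
# bench-test datalog before analysis. The file name, column names and
# the pass limit are hypothetical, not a real tester format.
import pandas as pd

def prepare_datalog(path: str) -> pd.DataFrame:
    """Load a raw CSV datalog, drop blank rows and sort by device ID."""
    df = pd.read_csv(path)
    df = df.dropna(how="all")                # remove blank rows
    df = df.sort_values("device_id")         # align devices for comparison
    df["passed"] = df["measurement"] < 1.5   # hypothetical pass limit
    return df

if __name__ == "__main__":
    prepared = prepare_datalog("bench_log.csv")   # hypothetical file name
    prepared.to_csv("bench_log_prepared.csv", index=False)
```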

If your data is automatically copied from testers, moved across the network to a server, and prepared so that you only need to choose the analysis you want (or receive automatic reports), that is fully automated.
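
As a sketch of that idea only, and not of how any particular product works, a minimal polling loop could watch a drop folder and prepare each new datalog on its own; the folder names and the preparation step below are assumptions for illustration:

```python
# A minimal sketch of the fully automated idea: a polling loop that
# watches a drop folder and prepares each new datalog without anyone
# touching it. Folder names and the prepare step are illustrative
# assumptions; a real system would add logging, retries and a proper
# service wrapper rather than a bare loop.
import shutil
import time
from pathlib import Path

INCOMING = Path("incoming")   # testers copy datalogs here (hypothetical)
PREPARED = Path("prepared")   # analysis tool reads from here (hypothetical)

def prepare(src: Path, dst: Path) -> None:
    """Stand-in for the real preparation step (e.g. the script above)."""
    shutil.copy(src, dst)

def watch(poll_seconds: float = 5.0) -> None:
    INCOMING.mkdir(exist_ok=True)
    PREPARED.mkdir(exist_ok=True)
    seen: set[str] = set()
    while True:
        for path in INCOMING.glob("*.csv"):
            if path.name not in seen:
                prepare(path, PREPARED / path.name)
                seen.add(path.name)   # handle each file exactly once
        time.sleep(poll_seconds)

if __name__ == "__main__":
    watch()
```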

As we move through these levels, a number of things change. With manual manipulation, it takes a long time to do anything with a data file or datalog.

It entails setting up columns, checking alignment and sorting, as well as plotting a chart and adding labels. Even with scripts, you still have to move the file to a folder, run the script and check the output.

With a fully automated system, by contrast, you can simply view the results. More automation means less time per file.

Figure 1 below shows how time per file goes down as automation increases.


Figure 1 – Time per file

The other thing to consider is the chance of error when manipulating data manually. In some instances manual manipulation is still necessary, but it tends to increase the risk of error, and with it the time spent error-checking and debugging.

Macros and scripts take longer to write, but once they are known to be correct nothing should change, so the chance of error per file drops significantly.

With full automation, as long as the data format does not change, there is far less chance of error: professional software teams test the code thoroughly, and eagle-eyed engineers around the world keep checking it on top of that. This is demonstrated in figure 2 below.

pastedGraphic_1.png

Figure 2 – Chance of error

Another aspect to consider is the cost per file of manual data manipulation.

It is huge when time per file and chance of error are considered together. While automated systems are expensive, it would seem that the return-on-investment point could come at a far smaller number of data files than we might intuitively imagine. See figure 3 below.


Figure 3 – Combined: time per file × chance of error
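
To see why the break-even point can arrive sooner than expected, consider a back-of-the-envelope calculation; the numbers below are invented purely to show the shape of the argument and do not come from the figures in this article:

```python
# A back-of-the-envelope break-even calculation. Every number here is
# invented purely to illustrate the argument; none comes from the
# figures in this article.
manual_cost_per_file = 50.0     # engineer time plus error-checking, per file
automated_cost_per_file = 1.0   # marginal cost once the system is running
system_cost = 20_000.0          # up-front cost of the automated system

saving_per_file = manual_cost_per_file - automated_cost_per_file
break_even_files = system_cost / saving_per_file
print(f"Break-even after about {break_even_files:.0f} files")  # ~408 files
```

With these made-up figures, the system pays for itself after roughly 400 files, a number a busy test operation could reach quickly.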

In conclusion, there are a few choices when it comes to manipulating data. You can do it manually at a huge cost per file, or add automation with scripts written by your engineers (although make sure documentation, source code, backups and support are in place in case those engineers leave the company!).

The third option is to buy a system that can take care of all of this, along with a strong support package.

All things considered, it won't be long before you see a return on investment!

Interested in learning more about yieldHUB or confident in making the switch? Start the process today and contact us.