google.com, pub-4497197638514141, DIRECT, f08c47fec0942fa0 Industries Needs: Data Engineer vs Data Scientists

Monday, February 7, 2022

Data Engineer vs Data Scientists


2.1 Data Scientist

Data scientists aren’t like every other scientist.

Data scientists do not wear white coats or work in high tech labs full of science fiction movie equipment. They work in offices just like you and me.

What differs them from most of us is that they are the math experts. They use linear algebra and multivariable calculus to create new insight from existing data.

How exactly does this insight look?

Here’s an example:

An industrial company produces a lot of products that need to be tested before shipping.

Usually such tests take a lot of time because there are hundreds of things to be tested. All to make sure that your product is not broken.

Wouldn’t it be great to know early if a test fails ten steps down the line? If you knew that you could skip the other tests and just trash the product or repair it.

That’s exactly where a data scientist can help you, big-time. This field is called predictive analytics and the technique of choice is machine learning.

Machine what? Learning?

Yes, machine learning, it works like this:

You feed an algorithm with measurement data. It generates a model and optimises it based on the data you fed it with. That model basically represents a pattern of how your data is looking You show that model new data and the model will tell you if the data still represents the data you have trained it with. This technique can also be used for predicting machine failure in advance with machine learning. Of course the whole process is not that simple.

The actual process of training and applying a model is not that hard. A lot of work for the data scientist is to figure out how to pre-process the data that gets fed to the algorithms.

Because to train a algorithm you need useful data. If you use any data for the training the produced model will be very unreliable.

A unreliable model for predicting machine failure would tell you that your machine is damaged even if it is not. Or even worse: It would tell you the machine is ok even when there is an malfunction.

Model outputs are very abstract. You also need to post-process the model outputs to receive health values from 0 to 100.


 2.2 Data Engineer

Data Engineers are the link between the management’s big data strategy and the data scientists that need to work with data.

Data Engineers are the link between the management’s big data strategy and the data scientists that need to work with data.

These platforms are usually used in five different ways:

• Data ingestion and storage of large amounts of data 13

• Algorithm creation by data scientists

• Automation of the data scientist’s machine learning models and algorithms for production use

• Data visualisation for employees and customers

• Most of the time these guys start as traditional solution architects for systems that involve SQL databases, web servers, SAP installations and other “standard” systems.

 

But to create big data platforms the engineer needs to be an expert in specifying, set[1]ting up and maintaining big data technologies like: Hadoop, Spark, HBase, Cassandra, MongoDB, Kafka, Redis and more.

What they also need is experience on how to deploy systems on cloud infrastructure like at Amazon or Google or on premise hardware.

 

2.3 Who Companies Need

For a good company it is absolutely important to get well trained data engineers and data scientists. Think of the data scientist as the professional race car driver. A fit athlete with talent and driving skills like you have never seen.

What he needs to win races is someone who will provide him the perfect race car to drive. That’s what the solution architect is for.

Like the driver and his team the data scientist and the data engineer need to work closely together. They need to know the different big data tools Inside and out.

That’s why companies are looking for people with Spark experience. It is a common ground between both that drives innovation.

Spark gives data scientists the tools to do analytics and helps engineers to bring the data scientist’s algorithms into production. After all, those two decide how good the data platform is, how good the analytics insight is and how fast the whole system gets into a production ready state.

 

No comments:

Post a Comment

Tell your requirements and How this blog helped you.

Labels

ACTUATORS (10) AIR CONTROL/MEASUREMENT (38) ALARMS (20) ALIGNMENT SYSTEMS (2) Ammeters (12) ANALYSERS/ANALYSIS SYSTEMS (33) ANGLE MEASUREMENT/EQUIPMENT (5) APPARATUS (6) Articles (3) AUDIO MEASUREMENT/EQUIPMENT (1) BALANCES (4) BALANCING MACHINES/SERVICES (1) BOILER CONTROLS/ACCESSORIES (5) BRIDGES (7) CABLES/CABLE MEASUREMENT (14) CALIBRATORS/CALIBRATION EQUIPMENT (19) CALIPERS (3) CARBON ANALYSERS/MONITORS (5) CHECKING EQUIPMENT/ACCESSORIES (8) CHLORINE ANALYSERS/MONITORS/EQUIPMENT (1) CIRCUIT TESTERS CIRCUITS (2) CLOCKS (1) CNC EQUIPMENT (1) COIL TESTERS EQUIPMENT (4) COMMUNICATION EQUIPMENT/TESTERS (1) COMPARATORS (1) COMPASSES (1) COMPONENTS/COMPONENT TESTERS (5) COMPRESSORS/COMPRESSOR ACCESSORIES (2) Computers (1) CONDUCTIVITY MEASUREMENT/CONTROL (3) CONTROLLERS/CONTROL SYTEMS (35) CONVERTERS (2) COUNTERS (4) CURRENT MEASURMENT/CONTROL (2) Data Acquisition Addon Cards (4) DATA ACQUISITION SOFTWARE (5) DATA ACQUISITION SYSTEMS (22) DATA ANALYSIS/DATA HANDLING EQUIPMENT (1) DC CURRENT SYSTEMS (2) DETECTORS/DETECTION SYSTEMS (3) DEVICES (1) DEW MEASURMENT/MONITORING (1) DISPLACEMENT (2) DRIVES (2) ELECTRICAL/ELECTRONIC MEASUREMENT (3) ENCODERS (1) ENERGY ANALYSIS/MEASUREMENT (1) EQUIPMENT (6) FLAME MONITORING/CONTROL (5) FLIGHT DATA ACQUISITION and ANALYSIS (1) FREQUENCY MEASUREMENT (1) GAS ANALYSIS/MEASURMENT (1) GAUGES/GAUGING EQUIPMENT (15) GLASS EQUIPMENT/TESTING (2) Global Instruments (1) Latest News (35) METERS (1) SOFTWARE DATA ACQUISITION (2) Supervisory Control - Data Acquisition (1)