Python for Engineers: From Zero to Data Analysis

Why Python for Industrial Engineers?

Imagine a factory floor with dozens of sensors -- temperature, pressure, vibration, flow rate -- each logging a reading every second. By the end of the day you have millions of numbers. How do you analyze them? How do you detect that a motor is overheating?

Python is one of the most widely used languages for data analysis and machine learning. Its syntax is close to plain English, and its ecosystem of scientific libraries makes it well suited to engineers who need to process real-world data without years of programming training.

Why Python specifically?

Feature               | Details
Easy syntax           | Code reads like English sentences
Engineering libraries | NumPy, Pandas, Matplotlib, SciPy
Machine learning      | TensorFlow, scikit-learn, PyTorch
Automation            | File processing, report generation, device control
Huge community        | Answers exist for virtually every question

Variables and Data Types

A variable in Python is a name that refers to a value. You do not need to declare the type -- Python determines it from the value you assign:

# Sensor data
temperature = 78.5          # float
machine_id = "CNC-042"      # string
is_running = True            # boolean
sensor_count = 12            # integer

# List of sensor readings
readings = [78.5, 79.1, 80.3, 77.8, 81.2]

# Dictionary for machine information
machine = {
    "id": "CNC-042",
    "type": "Lathe",
    "location": "Hall 3",
    "max_temp": 95.0
}

Core data types:

  • int -- whole numbers: 42, -7, 1000
  • float -- decimal numbers: 3.14, 78.5
  • str -- text: "hello", "CNC-042"
  • bool -- logical values: True or False
  • list -- ordered sequences: [1, 2, 3]
  • dict -- key-value pairs: {"temp": 78.5}
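To confirm which type Python chose for a value, the built-in type() function can be used. The following is a minimal sketch that reuses the sample values from the snippet above; the names type_names and headroom are introduced here only for illustration.

```python
# Sample values copied from the snippet above
temperature = 78.5
machine_id = "CNC-042"
is_running = True
sensor_count = 12
readings = [78.5, 79.1, 80.3, 77.8, 81.2]
machine = {"id": "CNC-042", "type": "Lathe", "location": "Hall 3", "max_temp": 95.0}

# type(value).__name__ gives the name of the type Python assigned
values = (temperature, machine_id, is_running, sensor_count, readings, machine)
type_names = [type(v).__name__ for v in values]
print(type_names)

# Lists are read by position (counting from 0), dictionaries by key
first_reading = readings[0]
last_reading = readings[-1]                     # negative indices count from the end
headroom = machine["max_temp"] - temperature    # margin below the machine's limit
print(first_reading, last_reading, headroom)
```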

Conditions: Making Decisions

Suppose you are monitoring a motor's temperature and want an alert when it exceeds a threshold:

temperature = 88.5
max_allowed = 85.0
critical = 95.0

if temperature > critical:
    print("CRITICAL! Shut down motor immediately!")
    # send_emergency_stop()
elif temperature > max_allowed:
    print(f"Warning: temperature {temperature}°C exceeds limit")
    # send_alert_to_operator()
else:
    print("Temperature within normal range")

Notice how the code reads almost like a plain English sentence.
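Conditions can also be combined with and / or, and Python allows chained comparisons such as a <= x <= b. The sketch below is an illustration only: min_expected is a hypothetical lower bound (not part of the example above) used to flag a sensor that may be faulty, and is_running stands in for a machine state flag.

```python
temperature = 88.5
min_expected = 10.0   # hypothetical: readings below this suggest a faulty sensor
max_allowed = 85.0
is_running = True

if min_expected <= temperature <= max_allowed:
    # chained comparison: both bounds are checked in one expression
    status = "normal"
elif temperature < min_expected:
    status = "sensor fault suspected"
elif is_running and temperature > max_allowed:
    # both conditions must hold for this branch
    status = "overheating while running"
else:
    status = "hot but stopped"

print(status)
```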

Loops: Repeating Tasks

In industrial settings you often need to scan through a batch of readings or process an entire day of data:

# for loop: check each reading
readings = [78.5, 85.2, 92.1, 77.3, 88.9, 96.5]
max_allowed = 90.0

alerts = []
for i, temp in enumerate(readings):
    if temp > max_allowed:
        alerts.append(f"Reading {i+1}: {temp}°C - exceeded!")

print(f"Total violations: {len(alerts)}")
for alert in alerts:
    print(f"  - {alert}")

# while loop: continuous monitoring
import random
import time

def read_sensor():
    """Simulate a sensor reading between 70 and 100 °C."""
    return round(random.uniform(70, 100), 1)

monitoring = True
while monitoring:
    temp = read_sensor()
    print(f"Current temperature: {temp}°C")
    if temp > 95:
        print("ALERT! Stopping monitor")
        monitoring = False
    time.sleep(1)  # wait one second
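The while loop above keeps running until a reading happens to exceed 95°C. When experimenting, it can be safer to cap the number of iterations. The following sketch is one possible variant, not the only way to do it: max_samples and the fixed random seed are assumptions added for repeatability, and the one-second pause is left out so the example finishes immediately.

```python
import random

def read_sensor(rng):
    """Simulate a sensor reading in the same 70-100 °C range as above."""
    return round(rng.uniform(70, 100), 1)

rng = random.Random(42)   # fixed seed so repeated runs give the same readings
max_samples = 20          # hypothetical cap on the number of readings
history = []

for _ in range(max_samples):
    temp = read_sensor(rng)
    history.append(temp)
    if temp > 95:
        print(f"ALERT after {len(history)} readings: {temp}°C")
        break             # leave the loop as soon as the limit is exceeded
else:
    # a for loop's else branch runs only when the loop ended without break
    print(f"No alert in {max_samples} readings")
```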

Functions: Organizing Code

A function is a reusable block of code that performs a specific task. Instead of repeating yourself, write it once and call it whenever needed:

def check_temperature(temp, machine_name, max_limit=85.0):
    """
    Check machine temperature and return status.
    """
    if temp > max_limit * 1.1:  # more than 110% of limit
        return "critical", f"{machine_name}: critical temp ({temp}°C)"
    elif temp > max_limit:
        return "warning", f"{machine_name}: above limit ({temp}°C)"
    else:
        return "normal", f"{machine_name}: normal ({temp}°C)"


# Using the function
machines = [
    ("CNC-01", 82.3),
    ("CNC-02", 91.7),
    ("Pump-05", 96.2),
    ("Compressor-03", 78.1),
]

for name, temp in machines:
    status, message = check_temperature(temp, name)
    if status != "normal":
        print(f"[{status.upper()}] {message}")
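Because check_temperature returns a status string, its results are easy to summarize. The sketch below repeats the function so the block runs on its own, counts statuses with collections.Counter from the standard library, and shows the default max_limit being overridden for one call. The 90.0 limit for the pump is a made-up value for illustration.

```python
from collections import Counter

def check_temperature(temp, machine_name, max_limit=85.0):
    """Same logic as the function above, repeated to keep this block self-contained."""
    if temp > max_limit * 1.1:
        return "critical", f"{machine_name}: critical temp ({temp}°C)"
    elif temp > max_limit:
        return "warning", f"{machine_name}: above limit ({temp}°C)"
    else:
        return "normal", f"{machine_name}: normal ({temp}°C)"

machines = [
    ("CNC-01", 82.3),
    ("CNC-02", 91.7),
    ("Pump-05", 96.2),
    ("Compressor-03", 78.1),
]

# Count how many machines fall into each status using the default limit
counts = Counter(check_temperature(temp, name)[0] for name, temp in machines)
print(dict(counts))

# Override the default limit with a keyword argument (90.0 is a hypothetical pump limit)
status_custom, message_custom = check_temperature(96.2, "Pump-05", max_limit=90.0)
print(status_custom, "-", message_custom)
```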

NumPy: Fast Scientific Computing

NumPy is the foundational library for scientific computing in Python. It handles arrays and mathematical operations at near-C speed:

import numpy as np

# Vibration sensor readings for one hour (3600 samples)
vibration_data = np.random.normal(loc=2.5, scale=0.8, size=3600)

# Basic statistics
print(f"Mean:       {np.mean(vibration_data):.2f} mm/s")
print(f"Std Dev:    {np.std(vibration_data):.2f} mm/s")
print(f"Max:        {np.max(vibration_data):.2f} mm/s")
print(f"Min:        {np.min(vibration_data):.2f} mm/s")

# Detect anomalies (beyond 3 standard deviations)
mean = np.mean(vibration_data)
std = np.std(vibration_data)
anomalies = vibration_data[np.abs(vibration_data - mean) > 3 * std]
print(f"Anomalies detected: {len(anomalies)}")
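On purely random normal data, a 3-sigma rule still flags a small share of samples (roughly 0.3%) by chance, so it helps to test the method on data with known faults. The sketch below is an illustration under stated assumptions: the three spikes of 12.0 mm/s and their positions are invented, and a seeded generator is used so the run is repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded generator for repeatable results
vibration = rng.normal(loc=2.5, scale=0.8, size=3600)

# Inject three artificial spikes at known positions (hypothetical fault events)
spike_idx = np.array([500, 1800, 3000])
vibration[spike_idx] = 12.0

# z-score: distance from the mean measured in standard deviations
z_scores = (vibration - vibration.mean()) / vibration.std()

# Positions of flagged samples, i.e. seconds into the hour at one sample per second
anomaly_idx = np.flatnonzero(np.abs(z_scores) > 3)

print(f"Flagged samples: {anomaly_idx.size}")
print(f"All injected spikes found: {np.isin(spike_idx, anomaly_idx).all()}")
```

Returning indices rather than values makes it possible to look up when each anomaly occurred.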

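NumPy vs plain Python performance (indicative figures only; actual timings depend on hardware, Python version, and how the plain-Python code is written):

Operation                 | Plain Python | NumPy
Sum 1M elements           | ~150 ms      | ~1 ms
1000x1000 matrix multiply | Minutes      | Fraction of a second
Memory usage              | Inefficient  | Optimized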
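The comparison above can be checked on your own machine. The following is a rough sketch using the standard timeit module; it times only the summation case, and the results will differ from one system to another, so no particular numbers should be expected.

```python
import timeit
import numpy as np

n = 1_000_000
py_list = [float(i) for i in range(n)]      # plain Python list of floats
np_array = np.arange(n, dtype=np.float64)   # the same values as a NumPy array

repeats = 10
# Average time per call, in seconds
py_time = timeit.timeit(lambda: sum(py_list), number=repeats) / repeats
np_time = timeit.timeit(lambda: np_array.sum(), number=repeats) / repeats

print(f"Built-in sum on a list: {py_time * 1000:.2f} ms")
print(f"NumPy array .sum():     {np_time * 1000:.2f} ms")
```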

Pandas: Sensor Data Analysis

Pandas is the go-to tool for reading and analyzing tabular data -- perfect for sensor logs stored in CSV files:

import pandas as pd

# Read sensor CSV
df = pd.read_csv("sensor_log.csv")

# View the first 5 rows
print(df.head())

Assume the CSV contains columns: timestamp, machine_id, temperature, vibration, pressure

# Quick summary statistics
print(df.describe())

# Filter: only readings above 85°C
high_temp = df[df["temperature"] > 85]
print(f"High temperature readings: {len(high_temp)}")

# Group by: average temperature per machine
avg_per_machine = df.groupby("machine_id")["temperature"].mean()
print(avg_per_machine.sort_values(ascending=False))

# Add a new column: temperature alert flag
df["temp_alert"] = df["temperature"] > 85

# Save results
df.to_csv("analyzed_data.csv", index=False)
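To try these operations without a sensor_log.csv file on disk, the CSV text can be read from memory. The sketch below uses io.StringIO with four invented rows that follow the column layout assumed above; the values are sample data made up for illustration, not real measurements.

```python
import io
import pandas as pd

# Invented sample rows using the assumed column layout
csv_text = """timestamp,machine_id,temperature,vibration,pressure
2024-01-01 08:00:00,CNC-01,82.3,2.1,120
2024-01-01 08:00:00,CNC-02,91.7,3.4,131
2024-01-01 08:00:01,CNC-01,86.0,2.3,121
2024-01-01 08:00:01,CNC-02,79.5,3.1,129
"""

# read_csv accepts any file-like object, so StringIO stands in for a file
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])

# Boolean mask: keep only the rows where the condition is True
high_temp = df[df["temperature"] > 85]
print(high_temp[["machine_id", "temperature"]])

# Split rows by machine, then average each group's temperature
avg_per_machine = df.groupby("machine_id")["temperature"].mean()
print(avg_per_machine)
```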

Practical Example: Analyzing a Full Day of Factory Data

Suppose you receive a CSV file with 24 hours of sensor data from a factory:

import pandas as pd
import numpy as np

# Load data
df = pd.read_csv("factory_24h.csv", parse_dates=["timestamp"])

# 1. General summary
print("=== Daily Summary ===")
print(f"Total readings: {len(df):,}")
print(f"Period: {df['timestamp'].min()} to {df['timestamp'].max()}")
print(f"Machines monitored: {df['machine_id'].nunique()}")

# 2. Detect violations
limits = {"temperature": 85, "vibration": 5.0, "pressure": 150}

for param, limit in limits.items():
    violations = df[df[param] > limit]
    pct = len(violations) / len(df) * 100
    print(f"\n{param}: {len(violations)} violations ({pct:.1f}%)")
    if len(violations) > 0:
        worst = violations.loc[violations[param].idxmax()]
        print(f"  Worst reading: {worst[param]} at {worst['timestamp']}")
        print(f"  Machine: {worst['machine_id']}")

# 3. Hourly analysis: when do problems peak?
df["hour"] = df["timestamp"].dt.hour
hourly_avg = df.groupby("hour")["temperature"].mean()
peak_hour = hourly_avg.idxmax()
print(f"\nPeak average temperature at hour: {peak_hour}:00")

# 4. Per-machine report
print("\n=== Machine Report ===")
for machine in df["machine_id"].unique():
    m_data = df[df["machine_id"] == machine]
    avg_t = m_data["temperature"].mean()
    max_t = m_data["temperature"].max()
    alerts = int((m_data["temperature"] > limits["temperature"]).sum())  # reuse the limit defined in step 2
    print(f"{machine}: avg={avg_t:.1f}°C, max={max_t:.1f}°C, alerts={alerts}")
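If no real log is available, a synthetic file with the same columns can be generated to exercise the script above. The sketch below is only a stand-in: the helper make_synthetic_log, the machine names, the one-minute sampling interval, and the distribution parameters are all assumptions, and the file is written to the system temporary directory rather than the working directory.

```python
import os
import tempfile

import numpy as np
import pandas as pd

def make_synthetic_log(path, machines=("CNC-01", "CNC-02", "Pump-05"), seed=0):
    """Write a made-up 24-hour log with one row per machine per minute."""
    rng = np.random.default_rng(seed)
    timestamps = pd.date_range("2024-01-01", periods=24 * 60, freq="min")
    frames = []
    for machine_id in machines:
        n = len(timestamps)
        frames.append(pd.DataFrame({
            "timestamp": timestamps,
            "machine_id": machine_id,                          # scalar is repeated for every row
            "temperature": rng.normal(80, 4, n).round(1),      # invented mean and spread
            "vibration": rng.normal(2.5, 0.8, n).round(2),
            "pressure": rng.normal(120, 10, n).round(1),
        }))
    df = pd.concat(frames, ignore_index=True)
    df.to_csv(path, index=False)
    return df

out_path = os.path.join(tempfile.gettempdir(), "factory_24h.csv")
synthetic = make_synthetic_log(out_path)

# Read the file back the same way the analysis script does
check = pd.read_csv(out_path, parse_dates=["timestamp"])
print(len(check), check["machine_id"].nunique())
```

Pointing pd.read_csv in the analysis script at out_path then lets each step run end to end on the synthetic data.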

Next Steps

Once you have mastered these basics, you can expand into:

  • Matplotlib for plotting sensor data charts
  • SciPy for advanced statistics and Fourier analysis
  • scikit-learn for predictive maintenance and anomaly detection
  • openpyxl for automated Excel report generation
  • PLC integration over the Modbus or OPC UA protocols, for example with the pymodbus or opcua libraries

Python is not a replacement for industrial control languages like Structured Text or Ladder Logic -- but it is the ideal tool for data analysis and intelligent decision-making on the factory floor.
