Factory Data Analytics: Turning Sensor Numbers Into Smart Decisions
Why Data Analytics Matters for Factories
A modern factory produces enormous volumes of data every second: temperatures, rotation speeds, pressure levels, product weights, and more. But data alone has no value unless it is transformed into actionable information.
What Can Data Analytics Reveal?
- Hidden downtime causes: Recurring patterns that operators do not notice
- Energy waste: Machines consuming more power than usual
- Quality deviations: Subtle changes that appear before defects become visible
- Improvement opportunities: Bottlenecks in the production line that can be eliminated
- Maintenance predictions: Early indicators of impending failure
The shift from intuition-based decisions to data-driven decisions is the essence of Industry 4.0.
Data Collection: From Sensor to Database
Data Sources in a Factory
- Direct Sensors: Temperature, pressure, vibration, flow
- Control Systems (PLC/SCADA): Machine states, production counters
- MES System: Production orders, inspection results, downtime records
- ERP System: Production plans, inventory, customer orders
- Other Systems: Inspection cameras, HVAC systems, energy meters
Data Flow Architecture
Sensor → Edge Gateway → MQTT Broker → Time-Series Database → Dashboard
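As a rough sketch of the broker-to-database hop in this pipeline, the snippet below converts one MQTT message into an InfluxDB line-protocol record. The topic layout (factory/<line>/<machine>/<sensor>), the JSON field names value and ts_ms, and the function name to_line_protocol are assumptions made for illustration; neither MQTT nor InfluxDB prescribes them.

```python
import json


def to_line_protocol(topic: str, payload: bytes) -> str:
    """Convert one MQTT message into an InfluxDB line-protocol record.

    Assumes the topic looks like factory/<line>/<machine>/<sensor> and the
    payload is JSON with "value" and "ts_ms" (epoch milliseconds).
    Tag values containing spaces or commas would need escaping, which this
    sketch does not handle.
    """
    msg = json.loads(payload)
    _, line, machine, sensor = topic.split("/")
    tags = f"line={line},machine={machine}"
    value = float(msg["value"])
    ts_ns = int(msg["ts_ms"]) * 1_000_000  # line protocol defaults to nanoseconds
    return f"{sensor},{tags} value={value} {ts_ns}"


record = to_line_protocol(
    "factory/line1/press3/temperature",
    b'{"value": 71.5, "ts_ms": 1700000000000}',
)
print(record)
# temperature,line=line1,machine=press3 value=71.5 1700000000000000000
```

Putting the line and machine into tags rather than fields is a common choice, because tags are indexed and dashboards typically filter by them.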
Time-Series Databases
Industrial data is timestamped and arrives continuously, so it is best stored in a database optimized for time-series workloads:
- InfluxDB: One of the most widely used, open-source
- TimescaleDB: An extension for PostgreSQL
- QuestDB: Very high write performance
- SurrealDB: A multi-model database with time-series support
Data Cleaning: Handling Gaps and Anomalies
Raw industrial data is rarely clean. Cleaning is a critical stage before any analysis.
Common Problems
Missing Data (Gaps)
- Sensor connectivity lost for a period
- Edge gateway failed to transmit data
- Solution: Use linear interpolation for short gaps; mark long gaps as "unavailable" rather than inventing values
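The gap-handling rule above can be sketched as follows. Missing readings are represented as None, and the 60-second limit for a "short" gap is an assumed value that should be tuned per sensor; the function name fill_short_gaps is likewise illustrative.

```python
from typing import List, Optional, Tuple

# (timestamp in seconds, value or None when the reading is missing)
Sample = Tuple[float, Optional[float]]


def fill_short_gaps(samples: List[Sample], max_gap_s: float = 60.0) -> List[Sample]:
    """Linearly interpolate runs of missing values that span at most max_gap_s.

    Longer gaps, and gaps at the very start or end, are left as None so that
    downstream code can treat them as "unavailable".
    """
    out = list(samples)
    n = len(out)
    i = 0
    while i < n:
        if out[i][1] is not None:
            i += 1
            continue
        # Find the end of this run of missing values: indices [i, j)
        j = i
        while j < n and out[j][1] is None:
            j += 1
        # Interpolate only when valid readings exist on both sides
        if i > 0 and j < n:
            t0, v0 = out[i - 1]
            t1, v1 = out[j]
            if t1 - t0 <= max_gap_s:
                for k in range(i, j):
                    t = out[k][0]
                    out[k] = (t, v0 + (v1 - v0) * (t - t0) / (t1 - t0))
        i = j
    return out


readings = [(0, 10.0), (10, None), (20, 14.0), (30, None), (100, 20.0)]
print(fill_short_gaps(readings))
# [(0, 10.0), (10, 12.0), (20, 14.0), (30, None), (100, 20.0)]
```

The first gap spans 20 seconds and is filled; the second spans 80 seconds and stays unavailable.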
Outliers
- Unrealistic sensor reading (such as 500 degrees for a motor)
- Conversion error or unit mismatch
- Solution: Apply physical plausibility bounds to reject impossible values, and use moving averages for smoothing
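A minimal sketch of that two-step treatment: readings outside a plausible range are dropped, and the rest are smoothed with a trailing moving average. The bounds (-20 to 150) and the window size are assumptions for a hypothetical motor temperature sensor, not recommended values.

```python
from collections import deque
from typing import Iterable, List


def clean_readings(values: Iterable[float], low: float, high: float,
                   window: int = 3) -> List[float]:
    """Drop readings outside [low, high], then apply a trailing moving average."""
    recent = deque(maxlen=window)  # holds the last `window` accepted readings
    smoothed = []
    for v in values:
        if not (low <= v <= high):
            continue  # physically implausible, e.g. 500 degrees on a motor
        recent.append(v)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


print(clean_readings([60.0, 62.0, 500.0, 64.0, 66.0], low=-20, high=150))
# [60.0, 61.0, 62.0, 64.0]
```

Rejecting outliers before smoothing matters: averaging first would let a single 500-degree spike distort several neighboring points.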
Duplicate Data
- Same reading delivered twice, because MQTT QoS 1 guarantees at-least-once delivery and may resend a message
- Solution: Deduplicate based on timestamp and sensor identifier
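Deduplication on the (sensor identifier, timestamp) pair might look like the sketch below. The dictionary keys sensor_id and ts are assumed field names; it also assumes the timestamp is set at the sensor or gateway, so a resent message carries the same timestamp as the original.

```python
from typing import Dict, List


def deduplicate(readings: List[Dict]) -> List[Dict]:
    """Keep the first reading for each (sensor_id, ts) pair, preserving order."""
    seen = set()
    unique = []
    for r in readings:
        key = (r["sensor_id"], r["ts"])
        if key in seen:
            continue  # duplicate delivery of an already stored reading
        seen.add(key)
        unique.append(r)
    return unique


batch = [
    {"sensor_id": "T1", "ts": 1700000000, "value": 71.5},
    {"sensor_id": "T1", "ts": 1700000000, "value": 71.5},  # resent by the broker
    {"sensor_id": "T2", "ts": 1700000000, "value": 3.2},
]
print(len(deduplicate(batch)))
# 2
```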
Sensor Drift
- Gradual change in reading accuracy over time
- Solution: Periodic calibration and comparison with reference sensors
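One simple way to compare a sensor against a reference is to track the mean offset between paired readings taken at the same moments. The sketch below assumes such paired samples are available; the tolerance of 0.3 is an illustrative value, and the function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def drift_offset(sensor: Sequence[float], reference: Sequence[float]) -> float:
    """Mean difference between a sensor and a reference over paired readings."""
    return mean(s - r for s, r in zip(sensor, reference))


def needs_calibration(sensor: Sequence[float], reference: Sequence[float],
                      tolerance: float) -> bool:
    """Flag the sensor when its mean offset exceeds the tolerance."""
    return abs(drift_offset(sensor, reference)) > tolerance


sensor = [20.5, 21.5, 22.5]
reference = [20.0, 21.0, 22.0]
print(drift_offset(sensor, reference))             # 0.5
print(needs_calibration(sensor, reference, 0.3))   # True
```

Running this check on a schedule and plotting the offset over time makes gradual drift visible well before it affects quality decisions.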
Live Dashboards: Grafana and Power BI
A dashboard is the interface that transforms raw data into a visual picture that decision-makers can understand.
Grafana
An open-source tool specialized in live monitoring:
- Supports dozens of data sources (InfluxDB, PostgreSQL, MQTT)
- Interactive time-series charts
- Flexible alerting system (email, Slack, webhooks; SMS through third-party integrations)
- Web-shareable dashboards
- Free and well-suited for industrial time-series data
Power BI
A Microsoft tool for business analytics:
- Excellent integration with Excel and SharePoint
- Interactive analytics with data drill-down capabilities
- Scheduled reports sent automatically
- Well-suited for management and financial reporting
When to Use Each?
| Need | Grafana | Power BI |
|---|---|---|
| Live machine monitoring | Excellent | Limited |
| Monthly management reports | Limited | Excellent |
| Instant alerts | Excellent | Basic |
| Interactive historical analysis | Good | Excellent |
Performance Indicators: KPI and OEE
Core Industrial KPIs
- OEE: Overall Equipment Effectiveness (Availability × Performance × Quality)
- MTBF: Mean Time Between Failures
- MTTR: Mean Time To Repair
- Production Rate: Parts per hour or per shift
- Reject Rate: Percentage of rejected parts
- Energy Per Unit: Kilowatt-hours per product
- Changeover Time: Time needed to switch between products
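To make the first three indicators concrete, here is a small sketch that computes them from shift totals. The component formulas used (availability as run time over planned time, performance as ideal cycle time times parts produced over run time, quality as good parts over total parts) are the commonly used definitions and are an assumption here, as are all parameter names and the sample figures.

```python
def oee(planned_min: float, downtime_min: float, ideal_cycle_s: float,
        total_parts: int, good_parts: int) -> float:
    """OEE = Availability x Performance x Quality, returned as a fraction (0 to 1)."""
    run_min = planned_min - downtime_min
    availability = run_min / planned_min
    performance = (ideal_cycle_s * total_parts) / (run_min * 60)
    quality = good_parts / total_parts
    return availability * performance * quality


def mtbf(operating_hours: float, failures: int) -> float:
    """Mean Time Between Failures: operating time divided by number of failures."""
    return operating_hours / failures


def mttr(total_repair_hours: float, repairs: int) -> float:
    """Mean Time To Repair: total repair time divided by number of repairs."""
    return total_repair_hours / repairs


# Hypothetical 8-hour shift: 48 min of downtime, 36 s ideal cycle,
# 648 parts produced, 635 of them good.
shift_oee = oee(planned_min=480, downtime_min=48, ideal_cycle_s=36,
                total_parts=648, good_parts=635)
print(f"OEE:  {shift_oee:.1%}")       # about 79.4%
print(f"MTBF: {mtbf(400, 4):.1f} h")  # 100.0 h
print(f"MTTR: {mttr(6, 4):.1f} h")    # 1.5 h
```

In this example availability and performance are both 90% and quality is about 98%, which shows how three individually reasonable figures multiply into an OEE just under 80%.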
How to Choose the Right KPIs
- Define your goals (reduce downtime? improve quality? save energy?)
- Select only 5-7 indicators to avoid clutter
- Ensure they can be measured automatically
- Set numerical targets for each indicator
- Review monthly and adjust as needed
Practical Example: Dashboard for a Production Line
Let us build a live dashboard for an electronics assembly line:
Required Data
- Parts produced counter (from PLC)
- Status of each station: running, stopped, in maintenance (from MES)
- Quality inspection results (from AOI cameras)
- Energy consumption (from smart meters)
- Clean room temperature and humidity (from environmental sensors)
Dashboard Layout
Top Row: Large key indicators
- Current OEE (target: 80%)
- Parts produced today / target
- Reject rate (target: below 2%)
Middle Row: Time-series charts
- OEE over the past week
- Downtime causes (pie chart)
- Production rate trend (line chart)
Bottom Row: Station details
- Status of each station with color coding (green/yellow/red)
- Last 10 alerts
Configured Alerts
- OEE drops below 70% for 30 minutes
- Station stopped for more than 15 minutes
- Reject rate exceeds 3%
- Room temperature goes outside the allowed range
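Duration-based rules such as the first two would normally be configured in the dashboard tool itself, but the underlying logic can be sketched in a few lines. The function below checks whether a value has stayed below a threshold continuously up to the latest sample; the sample format (timestamp in seconds, value) and the function name are assumptions for illustration.

```python
from typing import List, Tuple


def sustained_below(samples: List[Tuple[float, float]], threshold: float,
                    duration_s: float) -> bool:
    """True if the value has been below threshold for at least duration_s,
    measured back from the most recent sample. Samples must be sorted by time."""
    start = None  # timestamp at which the current below-threshold run began
    for ts, value in samples:
        if value < threshold:
            if start is None:
                start = ts
        else:
            start = None  # recovery resets the timer
    if start is None:
        return False
    return samples[-1][0] - start >= duration_s


# OEE sampled every 10 minutes; rule: below 70% for 30 minutes
oee_samples = [(0, 0.75), (600, 0.68), (1200, 0.66), (1800, 0.69), (2400, 0.65)]
print(sustained_below(oee_samples, threshold=0.70, duration_s=1800))
# True
```

Requiring a sustained condition rather than a single low reading avoids alert storms caused by brief dips.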
Summary
Data analytics transforms a factory from an intuition-based environment to a fact-based one. The journey starts with collecting data from sensors and storing it in a time-series database, continues with cleaning out gaps and anomalies, and ends with dashboards that support rapid decision-making. Choose a small number of clear performance indicators, build a simple dashboard for a single line, then extend it as experience accumulates.