Huawei Cloud Lab2 realtime analytics reference project.
This repository extracts the reusable code and configuration from the lab:
- Generate simulated ecommerce event data with Python.
- Ingest files with Flume and write them to Kafka.
- Analyze Kafka streams with DLI Flink SQL.
- Store aggregate results in RDS MySQL.
- Visualize results in DLV.
No credentials are stored in this repository. Use environment variables or a local credential file outside the repository when running commands.
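As one way to follow this rule in a helper script, the sketch below reads the MySQL connection settings from environment variables and fails early when one is missing. The variable names match those used in the commands in this README; the function name `load_mysql_settings` and the error class are illustrative assumptions and do not describe the contents of scripts/verify_mysql.py.

```python
import os


class MissingSettingError(RuntimeError):
    """Raised when a required environment variable is not set."""


def load_mysql_settings(env=None):
    """Collect MySQL connection settings from environment variables.

    MYSQL_HOST, MYSQL_USER and MYSQL_PASSWORD are required. MYSQL_PORT
    falls back to 3306 when unset or empty, like ${MYSQL_PORT:-3306}.
    """
    env = os.environ if env is None else env
    required = ("MYSQL_HOST", "MYSQL_USER", "MYSQL_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise MissingSettingError(
            "missing environment variables: " + ", ".join(missing)
        )
    return {
        "host": env["MYSQL_HOST"],
        "port": int(env.get("MYSQL_PORT") or "3306"),
        "user": env["MYSQL_USER"],
        "password": env["MYSQL_PASSWORD"],
    }
```

Passing a plain dict as `env` makes the function easy to try without touching the real environment, and nothing secret needs to be written to a file inside the repository.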
Data flow: Python generator -> Flume spooldir -> Kafka topic fludesc -> DLI Flink SQL -> RDS MySQL -> DLV dashboard
The lab requires adding store 313024 / 陈致远店.
This repo includes that change in:
- scripts/autodatagen.py: can generate records for store 313024.
- sql/mysql_schema.sql: inserts 陈致远店 into desc_store_info.
Repository layout:

    conf/
      flume.properties.template    Flume spooldir -> Kafka template
    docs/
      environment.md               Environment and service checklist
      runbook.md                   Step-by-step lab runbook
    scripts/
      autodatagen.py               Test data generator
      create_kafka_topic.sh        Kafka topic helper
      manage_cron.sh               Start/stop periodic data generation
      verify_mysql.py              Query result tables via env vars
    sql/
      flink_job.sql                DLI Flink SQL template
      mysql_schema.sql             RDS MySQL schema and dimension data
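sql/flink_job.sql is a template whose `${...}` placeholders must be replaced before the job is created. A minimal sketch of doing that substitution, assuming the placeholders use the `${NAME}` form that Python's `string.Template` understands. The placeholder names `KAFKA_BROKERS` and `KAFKA_TOPIC` and the sample line are hypothetical and are not taken from the actual template.

```python
from string import Template


def render_template(text, values):
    """Fill ${name} placeholders; raise if any placeholder has no value."""
    try:
        return Template(text).substitute(values)
    except KeyError as exc:
        # substitute() raises KeyError naming the first unfilled placeholder.
        raise ValueError(
            f"no value supplied for placeholder {exc.args[0]}"
        ) from exc


# Hypothetical template line and placeholder names, for illustration only.
sample = "'properties.bootstrap.servers' = '${KAFKA_BROKERS}', 'topic' = '${KAFKA_TOPIC}'"
print(render_template(sample, {
    "KAFKA_BROKERS": "192.168.0.117:9092",
    "KAFKA_TOPIC": "fludesc",
}))
```

Using `substitute` rather than `safe_substitute` is deliberate: a forgotten placeholder stops the run instead of reaching the job unchanged. Note that `string.Template` treats a lone `$` as an error, so a template containing literal dollar signs would need them written as `$$`.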
Quick start:

- Prepare RDS tables:

      mysql -h "$MYSQL_HOST" -P "${MYSQL_PORT:-3306}" -u "$MYSQL_USER" -p < sql/mysql_schema.sql

- Upload scripts/autodatagen.py to the MRS master node, for example:

      scp scripts/autodatagen.py root@<MRS_MASTER_EIP>:/opt/client/autodatagen.py
      ssh root@<MRS_MASTER_EIP> 'chmod 755 /opt/client/autodatagen.py'

- Create Kafka topic:

      KAFKA_HOME=/opt/Bigdata/client/Kafka/kafka \
      BOOTSTRAP_SERVERS=192.168.0.117:9092,192.168.0.153:9092,192.168.0.86:9092 \
      scripts/create_kafka_topic.sh

- Configure Flume from conf/flume.properties.template, then start Flume with the full FusionInsight Flume runtime.

- Create a DLI Flink OpenSource SQL job from sql/flink_job.sql, replacing all ${...} placeholders.

- Generate a fixed 235-row validation batch:

      python3 /opt/client/autodatagen.py /tmp/flume_spooldir/lab2_100_1.txt 100
      python3 /opt/client/autodatagen.py /tmp/flume_spooldir/lab2_100_2.txt 100
      python3 /opt/client/autodatagen.py /tmp/flume_spooldir/lab2_35_1.txt 35

- Verify RDS results:

      MYSQL_HOST=<rds-private-ip> MYSQL_USER=root MYSQL_PASSWORD=<password> \
      python3 scripts/verify_mysql.py

For the full procedure, see docs/runbook.md.
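The three generator calls in the validation step should add up to 100 + 100 + 35 = 235 rows. Here is a small sketch for checking that on the node before looking at RDS. It assumes the generator writes one record per line and that its numeric argument is the row count; both are inferred from the commands, not confirmed from the script. It also assumes the Flume spooling directory source may already have renamed ingested files with its default `.COMPLETED` suffix, so it matches on the `lab2_` prefix. The function names are illustrative.

```python
from pathlib import Path


def find_batch_files(spool_dir, prefix="lab2_"):
    """List batch files in the spool directory, ingested or not."""
    return sorted(p for p in Path(spool_dir).glob(prefix + "*") if p.is_file())


def count_rows(paths):
    """Return the total number of non-empty lines across the given files."""
    total = 0
    for path in paths:
        with Path(path).open(encoding="utf-8") as handle:
            total += sum(1 for line in handle if line.strip())
    return total


if __name__ == "__main__":
    files = find_batch_files("/tmp/flume_spooldir")
    print(f"{len(files)} files, {count_rows(files)} rows (expected 235)")
```

If the Flume source is configured to delete files after ingestion, the directory will be empty and this check only makes sense before Flume picks the files up.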