The Cyberinfrastructure Knowledge Network (CKN) is an extensible and portable distributed framework designed to optimize AI at the edge, particularly in dynamic environments where workloads may change suddenly (for example, in response to motion detection). CKN enhances edge–cloud collaboration by using historical data, graph representations, and adaptable deployment of AI models to satisfy changing accuracy and latency demands on edge devices.
Tag: CI4AI, Software, PADI
Explanation
CKN facilitates seamless connectivity between edge devices and the cloud through event streaming, enabling real-time data capture and processing. By leveraging event-stream processing, it captures, aggregates, and stores historical system-performance data in a knowledge graph that models application behaviour and guides model selection and deployment at the edge.
CKN comprises several core components:
CKN Daemon – A lightweight service that resides on each edge server. It manages communication with edge devices, handles requests, captures performance data, and deploys AI models as needed. The daemon connects with the cloud-based CKN system via a pub/sub system, capturing real-time events from edge devices (model usage, resource consumption, prediction accuracy, latency, and more).
Event Streaming & Processing – Stream-processing techniques (for example, tumbling windows) aggregate events and generate real-time alerts from edge-device streams.
Knowledge Graph – A Neo4j graph database that stores historical and provenance information about applications, models, and edge events. This comprehensive view of the system enables CKN to track model usage and analyse performance over time.
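To make the tumbling-window aggregation mentioned under Event Streaming & Processing more concrete, here is a minimal sketch in plain Python. It is not CKN's stream processor; the event shape (timestamp, device id, latency) and the function name are assumptions chosen for illustration. Each event is assigned to exactly one fixed-size, non-overlapping window, and latencies are averaged per window and device.

```python
from collections import defaultdict
from statistics import mean


def tumbling_window_averages(events, window_seconds):
    """Average latency per (window_start, device_id) over fixed, non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, device_id, latency_ms) tuples.
    This tuple layout is hypothetical and only serves the illustration.
    """
    buckets = defaultdict(list)
    for ts, device_id, latency_ms in events:
        # A timestamp belongs to exactly one window, identified by its start time.
        window_start = int(ts // window_seconds) * window_seconds
        buckets[(window_start, device_id)].append(latency_ms)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Hypothetical latency events from two edge devices.
events = [
    (0.5, "edge-1", 100.0),
    (3.0, "edge-1", 120.0),
    (9.9, "edge-1", 80.0),
    (10.0, "edge-1", 200.0),  # first event of the next 10-second window
    (12.0, "edge-2", 50.0),
]
averages = tumbling_window_averages(events, window_seconds=10)
for (window_start, device_id), avg in averages.items():
    print(f"window starting at {window_start}s, {device_id}: {avg:.1f} ms")
```

An alert rule could then compare each window average against a threshold; in a real deployment this aggregation would run inside the stream processor rather than over an in-memory list.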
The primary objective of CKN is to provide a robust framework for optimising AI-application deployment and resource allocation at the edge. Leveraging real-time event streaming and knowledge graphs, CKN efficiently handles AI workloads, adapts to changing requirements, and supports scalable edge–cloud collaboration.
Refer to this paper for more information: https://ieeexplore.ieee.org/document/10254827
How-To Guide
See the full documentation at https://cyberinfrastructure-knowledge-network.readthedocs.io/en/latest/ for detailed instructions on creating custom plug-ins and streaming events to the knowledge graph.
Prerequisites
Docker and Docker Compose installed and running.
Open network access to the following ports:
- 7474 (Neo4j Web UI)
- 7687 (Neo4j Bolt)
- 2181 (ZooKeeper)
- 9092 (Kafka Broker)
- 8083 (Kafka Connect)
- 8502 (CKN dashboard)
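Once the services are running, a quick TCP probe can confirm that the ports listed above are reachable from your machine. The sketch below uses only the Python standard library; the helper name `is_port_open` and the use of `localhost` are assumptions, so adjust the host if CKN runs elsewhere.

```python
import socket

# Ports taken from the prerequisites list.
CKN_PORTS = {
    7474: "Neo4j Web UI",
    7687: "Neo4j Bolt",
    2181: "ZooKeeper",
    9092: "Kafka Broker",
    8083: "Kafka Connect",
    8502: "CKN dashboard",
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for port, service in CKN_PORTS.items():
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{port:>5}  {service:<15} {state}")
```

A successful TCP connection only shows that something is listening on the port; it does not prove that the service behind it is healthy.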
Quick-Start
Clone the repository and start services
git clone https://github.com/Data-to-Insight-Center/cyberinfrastructure-knowledge-network.git
cd cyberinfrastructure-knowledge-network
make up
After setup completes, verify that all modules are running:
docker compose ps
Stream an example camera-trap event
docker compose -f examples/docker-compose.yml up -d --build
View the streamed data on the CKN dashboard at http://localhost:8502/Camera_Traps or open the Neo4j browser at http://localhost:7474/browser/ and run:
MATCH (n) RETURN n
To shut down services:
make down
docker compose -f examples/docker-compose.yml down
Topics & Event Types
Tutorial: Create a Custom CKN Plug-in
In this tutorial we create a CKN topic named temperature-sensor-data to store temperature events. The existing CKN topics and their details are listed in docs/topics.md.
Update docker-compose.yml and add:
services:
  broker:
    environment:
      KAFKA_CREATE_TOPICS: "temperature-sensor-data:1:1"
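The value "temperature-sensor-data:1:1" appears to follow the name:partitions:replication-factor convention that several Kafka Docker images use for this variable. That reading is an assumption, not something this README states, so check the broker image CKN ships before relying on it. Under that assumption, a small helper (hypothetical, for illustration only) can unpack the value:

```python
def parse_create_topics(spec):
    """Split a 'name:partitions:replicas[,name:partitions:replicas...]' string.

    Assumes the colon-separated convention described above; any extra
    fields after the third one are ignored.
    """
    topics = []
    for entry in spec.split(","):
        name, partitions, replicas = entry.strip().split(":")[:3]
        topics.append(
            {
                "name": name,
                "partitions": int(partitions),
                "replication_factor": int(replicas),
            }
        )
    return topics


topics = parse_create_topics("temperature-sensor-data:1:1")
print(topics)
```

Read this way, the entry requests one partition and a replication factor of one, which would suit a single-broker development setup.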
Apply the change:
make down
make up
Create produce_temperature_events.py:
import json
import time

from confluent_kafka import Producer

# Connect to the Kafka broker exposed by the CKN stack.
producer = Producer({"bootstrap.servers": "localhost:9092"})

try:
    # Ten rounds, one event per sensor in each round.
    for i in range(10):
        for sensor_id in ["sensor_1", "sensor_2", "sensor_3"]:
            event = {
                "sensor_id": sensor_id,
                # Synthetic temperature derived from the clock's sub-second part.
                "temperature": round(20 + 10 * (0.5 - time.time() % 1), 2),
                "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
            }
            producer.produce("temperature-sensor-data", key=sensor_id, value=json.dumps(event))
        # Wait until this round's messages have been delivered.
        producer.flush()
        time.sleep(1)
    print("Produced 30 events (10 rounds x 3 sensors) successfully.")
except Exception as e:
    print(f"An error occurred: {e}")
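If you want to check the event payload without a running broker, one option is to separate event construction from sending. The sketch below is a possible refactoring, not part of CKN; `build_event` and `encode_event` are hypothetical helper names. It reproduces the payload fields used by the producer and accepts an explicit clock value so the result is deterministic.

```python
import json
import time


def build_event(sensor_id, now=None):
    """Build the same payload as the producer script for a given clock value."""
    now = time.time() if now is None else now
    return {
        "sensor_id": sensor_id,
        # Same synthetic formula as the producer: 20 + 10 * (0.5 - fractional second).
        "temperature": round(20 + 10 * (0.5 - now % 1), 2),
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now)),
    }


def encode_event(event):
    """Return the (key, value) pair the producer would hand to Kafka."""
    return event["sensor_id"], json.dumps(event)


event = build_event("sensor_1", now=0.25)
key, value = encode_event(event)
print(key, value)
```

With these helpers the loop body of the producer reduces to `producer.produce("temperature-sensor-data", key=key, value=value)`.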
Open a shell inside the broker container and start the consumer:
kafka-console-consumer --bootstrap-server localhost:9092 --topic temperature-sensor-data --from-beginning
Create ckn_broker/connectors/neo4jsink-temperature-connector.json:
{
  "name": "Neo4jSinkConnectorTemperature",
  "config": {
    "topics": "temperature-sensor-data",
    "connector.class": "streams.kafka.connect.sink.Neo4jSinkConnector",
    "errors.retry.timeout": "-1",
    "errors.retry.delay.max.ms": "1000",
    "errors.tolerance": "all",
    "errors.log.enable": true,
    "errors.log.include.messages": true,
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "key.converter.schemas.enable": false,
    "value.converter": "org.apache.kafka.connect.json.JsonConverter",
    "value.converter.schemas.enable": false,
    "neo4j.server.uri": "bolt://neo4j:7687",
    "neo4j.authentication.basic.username": "neo4j",
    "neo4j.authentication.basic.password": "PWD_HERE",
    "neo4j.topic.cypher.temperature-sensor-data": "MERGE (sensor:Sensor {id: event.sensor_id}) MERGE (reading:TemperatureReading {timestamp: datetime(event.timestamp)}) SET reading.temperature = event.temperature MERGE (sensor)-[:REPORTED]->(reading)"
  }
}
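To see what graph the Cypher statement in this connector builds, the following sketch mimics its MERGE and SET clauses with plain Python sets and dictionaries. It is a simplified model for illustration, not how Neo4j executes the query, and the sample events are made up.

```python
def apply_event(graph, event):
    """Apply one event to an in-memory stand-in for the graph."""
    # MERGE (sensor:Sensor {id: event.sensor_id}): create the sensor only if missing.
    graph["sensors"].add(event["sensor_id"])
    # MERGE (reading:TemperatureReading {timestamp: ...}): keyed on timestamp only.
    reading = graph["readings"].setdefault(event["timestamp"], {})
    # SET reading.temperature = event.temperature: the last write wins.
    reading["temperature"] = event["temperature"]
    # MERGE (sensor)-[:REPORTED]->(reading): at most one edge per pair.
    graph["reported"].add((event["sensor_id"], event["timestamp"]))
    return graph


graph = {"sensors": set(), "readings": {}, "reported": set()}
sample_events = [
    {"sensor_id": "sensor_1", "temperature": 21.5, "timestamp": "2024-01-01T00:00:00Z"},
    {"sensor_id": "sensor_1", "temperature": 21.5, "timestamp": "2024-01-01T00:00:00Z"},  # replayed event
    {"sensor_id": "sensor_2", "temperature": 19.0, "timestamp": "2024-01-01T00:00:00Z"},
    {"sensor_id": "sensor_1", "temperature": 22.0, "timestamp": "2024-01-01T00:00:01Z"},
]
for sample in sample_events:
    apply_event(graph, sample)

print(len(graph["sensors"]), "sensors,", len(graph["readings"]), "readings,", len(graph["reported"]), "edges")
```

Two things stand out in this model. Replaying an event does not create duplicates, which is the point of MERGE. Also, because the reading is matched on its timestamp alone, events from different sensors that carry the same timestamp resolve to one TemperatureReading node whose temperature is overwritten by the later event. If each sensor should keep its own reading, one option would be to include the sensor id in the reading's MERGE key.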
Register the connector with Kafka Connect through its REST API:
curl -X POST -H "Content-Type: application/json" --data @/app/neo4jsink-temperature-connector.json http://localhost:8083/connectors
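The same registration can be done from Python with the standard library, which may be convenient in setup scripts. This is a sketch under the assumption that Kafka Connect is reachable at http://localhost:8083, as in the curl command; the function names are hypothetical.

```python
import json
import urllib.request

CONNECT_URL = "http://localhost:8083/connectors"


def build_register_request(connector, connect_url=CONNECT_URL):
    """Build the POST request that submits a connector definition to Kafka Connect."""
    body = json.dumps(connector).encode("utf-8")
    return urllib.request.Request(
        connect_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_connector(path, connect_url=CONNECT_URL):
    """Load a connector JSON file and submit it; returns (status, response body)."""
    with open(path, "r", encoding="utf-8") as fh:
        connector = json.load(fh)  # fails early if the file is not valid JSON
    request = build_register_request(connector, connect_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, json.loads(response.read())


# Building the request does not touch the network, so it can be inspected safely.
request = build_register_request({"name": "Neo4jSinkConnectorTemperature", "config": {}})
print(request.get_method(), request.full_url)
```

Calling `register_connector("ckn_broker/connectors/neo4jsink-temperature-connector.json")` performs the actual submission; `urlopen` raises `urllib.error.HTTPError` if Kafka Connect rejects the request, for example when a connector with the same name already exists.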
Restart CKN and run the producer again:
make down
make up
python produce_temperature_events.py
Open the Neo4j browser at http://localhost:7474/browser/ to view streamed data.
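To look at only the temperature data rather than the whole graph, a narrower Cypher query helps. The sketch below builds such a query and wraps it in a request for Neo4j's transactional HTTP endpoint. The endpoint path `/db/neo4j/tx/commit` and the database name `neo4j` are assumptions that hold for recent Neo4j versions but may differ in your deployment; the password is the same placeholder used in the connector file. You can also paste the query string directly into the Neo4j browser.

```python
import base64
import json
import urllib.request

# Labels and relationship type come from the connector's Cypher statement.
READINGS_QUERY = (
    "MATCH (s:Sensor)-[:REPORTED]->(r:TemperatureReading) "
    "RETURN s.id AS sensor, r.timestamp AS timestamp, r.temperature AS temperature "
    "ORDER BY r.timestamp"
)


def build_query_request(statement, user, password,
                        base_url="http://localhost:7474", database="neo4j"):
    """Build a POST request that runs one Cypher statement over HTTP (assumed endpoint)."""
    payload = json.dumps({"statements": [{"statement": statement}]}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url}/db/{database}/tx/commit",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Constructing the request does not contact Neo4j.
query_request = build_query_request(READINGS_QUERY, "neo4j", "PWD_HERE")
print(query_request.full_url)
```

Passing `query_request` to `urllib.request.urlopen` sends the query once the stack is running.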
License
See LICENSE.txt for license details.
Acknowledgements
Funded by NSF award #2112606 (ICICLE) and the Data to Insight Center at Indiana University.