Blazegraph Database¶
Blazegraph is a high-performance, open-source RDF graph database (triplestore) that supports the Blueprints and RDF/SPARQL APIs. Its developers report scaling to 50 billion edges on a single machine, and Amazon Neptune is widely reported to build on Blazegraph technology.
CIMantic Graphs provides the BlazegraphConnection class for seamless integration with Blazegraph databases, making it one of the recommended backends for large-scale CIM power system models.
Overview¶
Key Features:
- High-performance SPARQL query execution
- Excellent scalability for large distribution models (the IEEE 8500-node feeder and larger)
- Fast bulk loading of RDF data
- Built-in web-based SPARQL query interface
- Support for parsing enumerations from multiple namespaces
- Docker images available with IEEE test feeders pre-loaded
Best For:
- Large distribution feeder models (1000+ nodes)
- Production deployments requiring fast query performance
- Applications using GridAPPS-D platform
- Development and testing with IEEE standard test cases
Performance:
- IEEE 8500-node model: ~10-15 seconds to load base topology
- Parallel query execution with batching
- Efficient graph traversal and expansion
Installation and Setup¶
Docker Installation (Recommended)¶
The easiest way to get started with Blazegraph is using Docker. GridAPPS-D provides pre-configured Docker images with IEEE test feeders already loaded.
For CIM17 / RC4_2021 profile (GridAPPS-D v2021-v2024):
docker pull gridappsd/blazegraph:v2024.09.0
docker run -p 8889:8080 gridappsd/blazegraph:v2024.09.0
For CIM100 / CIMHub_2023 profile (GridAPPS-D v2025+):
docker pull gridappsd/blazegraph:v2025.01.0
docker run -p 8889:8080 gridappsd/blazegraph:v2025.01.0
Available tags: https://hub.docker.com/r/gridappsd/blazegraph/tags
Standalone Installation¶
For standalone installation:
- Download Blazegraph JAR from https://github.com/blazegraph/database/releases
- Start the server:
java -server -Xmx4g -jar blazegraph.jar
- Access web interface at http://localhost:9999/blazegraph/
Python Dependencies¶
The BlazegraphConnection class requires the SPARQLWrapper library:
pip install SPARQLWrapper
This is installed automatically when you install CIMantic Graphs.
Environment Configuration¶
For GridAPPS-D / Blazegraph v2021-v2024 (CIM17)¶
If using GridAPPS-D Docker images with tags between v2021.01.0 and v2024.09.0:
import os
os.environ['CIMG_CIM_PROFILE'] = 'rc4_2021'
os.environ['CIMG_URL'] = 'http://localhost:8889/bigdata/namespace/kb/sparql'
os.environ['CIMG_IEC61970_301'] = '7'
import cimgraph.data_profile.rc4_2021 as cim
For GridAPPS-D / Blazegraph v2025+ (CIM100)¶
If using GridAPPS-D Docker images with tag v2025.01.0 or later:
import os
os.environ['CIMG_CIM_PROFILE'] = 'cimhub_2023'
os.environ['CIMG_URL'] = 'http://localhost:8889/bigdata/namespace/kb/sparql'
os.environ['CIMG_IEC61970_301'] = '8'
import cimgraph.data_profile.cimhub_2023 as cim
For Standalone Blazegraph¶
If running standalone Blazegraph (default port 9999):
import os
os.environ['CIMG_CIM_PROFILE'] = 'cimhub_2023'
os.environ['CIMG_URL'] = 'http://localhost:9999/blazegraph/sparql'
os.environ['CIMG_IEC61970_301'] = '8'
import cimgraph.data_profile.cimhub_2023 as cim
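The three configurations above differ only in the profile name, the endpoint URL, and the IEC 61970-301 version. The following is a minimal sketch of keeping them in one place. The helper `configure_cimgraph()` and the deployment labels are hypothetical and not part of CIMantic Graphs; the values are the ones listed above.

```python
import os

# Settings copied from the three configurations above.
# The dictionary keys are hypothetical labels, not CIMantic Graphs names.
DEPLOYMENTS = {
    'gridappsd_cim17': {
        'CIMG_CIM_PROFILE': 'rc4_2021',
        'CIMG_URL': 'http://localhost:8889/bigdata/namespace/kb/sparql',
        'CIMG_IEC61970_301': '7',
    },
    'gridappsd_cim100': {
        'CIMG_CIM_PROFILE': 'cimhub_2023',
        'CIMG_URL': 'http://localhost:8889/bigdata/namespace/kb/sparql',
        'CIMG_IEC61970_301': '8',
    },
    'standalone': {
        'CIMG_CIM_PROFILE': 'cimhub_2023',
        'CIMG_URL': 'http://localhost:9999/blazegraph/sparql',
        'CIMG_IEC61970_301': '8',
    },
}


def configure_cimgraph(deployment, environ=os.environ):
    """Copy the settings for one deployment into the given environment mapping."""
    settings = DEPLOYMENTS[deployment]
    environ.update(settings)
    return settings


# Demonstrate on a plain dict so the real environment is left untouched.
env = {}
configure_cimgraph('standalone', environ=env)
print(env['CIMG_URL'])
```

As in the snippets above, the variables would be set before importing the matching `cimgraph.data_profile` module.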
Creating a Connection¶
Import and instantiate the BlazegraphConnection class:
from cimgraph.databases import BlazegraphConnection
# Create connection (automatically connects to database)
db = BlazegraphConnection()
The connection is established automatically when the object is instantiated. It retrieves the SPARQL endpoint URL from the CIMG_URL environment variable.
Blazegraph-Specific Features¶
The BlazegraphConnection class extends SPARQLEndpointConnection with Blazegraph-specific implementations and features.
Constructor¶
BlazegraphConnection.__init__()
Creates a new Blazegraph database connection.
Parameters: None (uses environment variables)
Behavior:
- Calls parent SPARQLEndpointConnection.__init__() to initialize base attributes
- Retrieves SPARQL endpoint URL from CIMG_URL environment variable via get_url()
- Configures multiple namespace support:
  - Primary namespace from CIM profile
  - Additional namespace: http://epri.com/gmdm/2025# for EPRI GMDM enumerations
- Automatically calls connect() to establish database connection
Source: cimgraph/databases/blazegraph/blazegraph.py:22
Database-Specific Method Implementations¶
Blazegraph implements the four required abstract methods from SPARQLEndpointConnection:
_setup_connection()¶
Initializes the SPARQLWrapper connection object for Blazegraph.
Implementation:
- Creates SPARQLWrapper instance with Blazegraph SPARQL endpoint URL
- Sets return format to JSON for standardized response parsing
- Stores connection object in self.connection_obj
Source: cimgraph/databases/blazegraph/blazegraph.py:33
_execute_raw_query(query_message)¶
Executes a SPARQL query and returns raw JSON results.
Parameters:
- query_message (str): The SPARQL query to execute
Returns:
dict: Query results in SPARQL JSON format
Implementation:
- Sets query on SPARQLWrapper connection object
- Uses POST method for query submission
- Converts response to Python dictionary
- Returns standardized SPARQL JSON results
Source: cimgraph/databases/blazegraph/blazegraph.py:38
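The steps above amount to a SPARQL query submitted over HTTP POST with JSON results requested. The sketch below is not the CIMantic Graphs implementation (which uses SPARQLWrapper); it only builds an equivalent request with the standard library so the shape of the exchange is visible. The helper name `build_sparql_post()` is hypothetical.

```python
import urllib.parse
import urllib.request


def build_sparql_post(endpoint_url, query_message):
    """Build (but do not send) a form-encoded SPARQL query request."""
    body = urllib.parse.urlencode({'query': query_message}).encode('utf-8')
    return urllib.request.Request(
        endpoint_url,
        data=body,
        method='POST',
        headers={
            'Content-Type': 'application/x-www-form-urlencoded',
            # Ask the endpoint for SPARQL JSON results
            'Accept': 'application/sparql-results+json',
        },
    )


request = build_sparql_post(
    'http://localhost:8889/bigdata/namespace/kb/sparql',
    'SELECT * WHERE { ?s ?p ?o } LIMIT 1',
)
print(request.get_method(), request.full_url)
```

With a running server, passing the request to `urllib.request.urlopen()` and decoding the body with `json.load()` would give a dictionary in the SPARQL JSON results format.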
_parse_result_field(result, field_name)¶
Extracts field values from SPARQLWrapper query results.
Parameters:
- result (dict): A single result binding from query results
- field_name (str): The name of the field to extract
Returns:
str: The field value, or None if the field doesn't exist
Implementation:
- SPARQLWrapper returns results as nested dictionaries: result[field_name]['value']
- Checks if field exists in result binding
- Returns value or None
Source: cimgraph/databases/blazegraph/blazegraph.py:44
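The lookup described above can be illustrated with a standalone function. This is a sketch of the described behavior, not a copy of the library source; the binding shown is a hand-written example in SPARQL JSON form.

```python
def parse_result_field(result, field_name):
    """Return result[field_name]['value'], or None if the field is unbound."""
    if field_name in result:
        return result[field_name]['value']
    return None


# One binding as it appears under results['results']['bindings']
binding = {
    'identifier': {
        'type': 'literal',
        'value': '49AD8E07-3BF9-A4E2-CB8F-C3722F837B62',
    },
}

print(parse_result_field(binding, 'identifier'))
print(parse_result_field(binding, 'name'))  # unbound OPTIONAL variable -> None
```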
_update_raw(update_message)¶
Executes a SPARQL UPDATE statement.
Parameters:
- update_message (str): The SPARQL update to execute
Returns:
str: Response from the database
Implementation:
- Sets update query on SPARQLWrapper connection object
- Uses POST method for update submission
- Returns database response
Source: cimgraph/databases/blazegraph/blazegraph.py:50
_get_namespaces()¶
Returns list of namespaces for enumeration parsing (optional override).
Returns:
list[str]: List of namespace URIs
Implementation:
- Returns self.namespaces list configured in __init__()
- Includes both CIM profile namespace and EPRI GMDM namespace
- Enables parsing of enumerations from multiple namespace sources
Source: cimgraph/databases/blazegraph/blazegraph.py:56
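To show why a list of namespaces is useful, here is a minimal sketch of reducing an enumeration URI to its local name by trying each namespace in turn. The function `strip_namespace()` and the EPRI enumeration value are hypothetical; only the two namespace URIs come from this page (the CIM100 one is assumed as the profile namespace).

```python
NAMESPACES = [
    'http://iec.ch/TC57/CIM100#',  # CIM profile namespace (CIM100 assumed)
    'http://epri.com/gmdm/2025#',  # EPRI GMDM enumerations
]


def strip_namespace(uri, namespaces=NAMESPACES):
    """Return the local name of uri if it starts with a known namespace."""
    for namespace in namespaces:
        if uri.startswith(namespace):
            return uri[len(namespace):]
    return uri  # unknown namespace: leave the value unchanged


print(strip_namespace('http://iec.ch/TC57/CIM100#PhaseCode.ABC'))
# 'ExampleKind.value' is a made-up enumeration used only for illustration
print(strip_namespace('http://epri.com/gmdm/2025#ExampleKind.value'))
```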
Common SPARQL Endpoint Methods¶
Important: BlazegraphConnection inherits all standard methods from SPARQLEndpointConnection. These methods are documented in the Databases Overview - Common SPARQL Endpoint Methods section.
Inherited Methods:
- Connection Management: connect(), disconnect()
- Query Execution: execute(), update()
- Object Retrieval: get_object(), get_from_triple()
- Graph Creation: create_new_graph(), create_distributed_graph(), build_graph_from_list()
- Graph Expansion: get_all_edges(), get_all_attributes(), get_edges_query()
- Query Parsing: parse_node_query(), edge_query_parser()
- Data Upload: upload()
Refer to the Databases Overview for complete documentation, usage examples, and parameter details for these methods.
Usage Examples¶
Example 1: Loading a Feeder Model¶
import os
os.environ['CIMG_CIM_PROFILE'] = 'cimhub_2023'
os.environ['CIMG_URL'] = 'http://localhost:8889/bigdata/namespace/kb/sparql'
import cimgraph.data_profile.cimhub_2023 as cim
from cimgraph.databases import BlazegraphConnection
from cimgraph.models import FeederModel
# Connect to Blazegraph
db = BlazegraphConnection()
# Load IEEE 13-bus feeder
feeder = cim.Feeder(mRID="49AD8E07-3BF9-A4E2-CB8F-C3722F837B62")
network = FeederModel(container=feeder, connection=db)
print(f"Loaded {len(network.graph[cim.ACLineSegment])} line segments")
Example 2: Retrieving an Object by mRID¶
# Retrieve a feeder object directly from database
feeder = db.get_object(mRID="49AD8E07-3BF9-A4E2-CB8F-C3722F837B62")
feeder.pprint()
Output:
{
"@id": "49ad8e07-3bf9-a4e2-cb8f-c3722f837b62",
"@type": "Feeder"
}
Example 3: Querying Object Attributes¶
# Get feeder name (string attribute)
name = db.get_from_triple(subject=feeder, predicate='IdentifiedObject.name')
print(f"Feeder name: {name}")
# Get associated substation (object reference)
substation = db.get_from_triple(subject=feeder, predicate='Feeder.NormalEnergizingSubstation')
print(f"Substation: {substation}")
Example 4: Executing Custom SPARQL Query¶
# Custom SPARQL query to find all feeders
query_text = '''
PREFIX r: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX cim: <http://iec.ch/TC57/CIM100#>
SELECT DISTINCT ?identifier ?name
WHERE {
?feeder r:type cim:Feeder .
BIND(strafter(str(?feeder),"urn:uuid:") as ?identifier)
OPTIONAL { ?feeder cim:IdentifiedObject.name ?name . }
}
ORDER BY ?name
'''
results = db.execute(query_text)
# Parse results
print("Available Feeders:")
for result in results['results']['bindings']:
mrid = result['identifier']['value']
name = result.get('name', {}).get('value', 'Unnamed')
print(f" - {name}: {mrid}")
Example 5: Expanding Graph with Specific Classes¶
# Load base topology
feeder = cim.Feeder(mRID="49AD8E07-3BF9-A4E2-CB8F-C3722F837B62")
network = FeederModel(container=feeder, connection=db)
# Expand breakers with all attributes
network.get_all_edges(cim.Breaker)
# Expand line segments
network.get_all_edges(cim.ACLineSegment)
# Now all breakers and lines have complete information
breaker = network.first(cim.Breaker)
print(f"Breaker: {breaker.name}")
print(f"Rated Current: {breaker.ratedCurrent}")
Example 6: Debugging SPARQL Queries¶
# Get the SPARQL query that would be executed
query_text = db.get_edges_query(network.graph, cim.Breaker)
print("SPARQL Query for Breaker expansion:")
print(query_text)
# Copy this query to Blazegraph web interface for debugging
About Triplestore Databases¶
A triplestore offers a semantic approach to data management. Unlike a relational database, which requires a schema defined up front in DDL, a triplestore holds its data as Resource Description Framework (RDF) statements.
RDF Triple Structure¶
RDF statements take the form subject (node) - predicate (relation) - object (node or literal value). Statements can be generated dynamically to form complex, inter-related class structures.
Example CIM Triple:
Subject: ACLineSegment (urn:uuid:1234...)
Predicate: ACLineSegment.length
Object: 105 meters
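As an illustration, the triple above can be written as a small Python structure and serialized to one N-Triples line. This is a sketch only: the all-zero UUID is a placeholder (the real identifier is elided above), the CIM100 namespace is assumed, and the literal is emitted as a plain string without escaping or a datatype.

```python
from typing import NamedTuple

CIM = 'http://iec.ch/TC57/CIM100#'  # assumed CIM namespace


class Triple(NamedTuple):
    subject: str    # IRI of the node
    predicate: str  # IRI of the relation
    obj: str        # literal value (plain string in this sketch)


def to_ntriples(triple):
    """Serialize a triple with a literal object as one N-Triples line."""
    return f'<{triple.subject}> <{triple.predicate}> "{triple.obj}" .'


line_length = Triple(
    subject='urn:uuid:00000000-0000-0000-0000-000000000000',  # placeholder
    predicate=CIM + 'ACLineSegment.length',
    obj='105',
)
print(to_ntriples(line_length))
```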
Advantages for CIM¶
The RDF statement structure intuitively corresponds to the structure of object-attribute specifications used in CIM. Key advantages include:
- Direct CIM Support: CIM models translate directly to RDF with automated data correlation
- Polymorphism: RDF Schema (RDFS) supports inheritance and class hierarchies
- Graph Constructs: Native support for complex graph structures and subgraphs
- Standards-Based: Uses mature, standardized languages (RDF, RDFS, OWL, SPARQL)
- Validation: Supports type-checking and SHACL (Shapes Constraint Language)
- Agility: Dynamic data structures support multiple developers and evolving schemas
Serialization Formats¶
RDF can be serialized in multiple formats:
- RDF/XML - Standard exchange format for CIM models (IEC 61970-552; the CIM itself is defined in IEC 61970-301)
- Turtle (TTL) - Human-readable format
- JSON-LD - JSON-based RDF for web applications
- N-Triples - Simple line-based format
Data Management Considerations¶
Advantages:
- No schema migration required for model changes
- Easy integration of multiple data sources
- Standardized query language (SPARQL)
- Support for reasoning and inference (with OWL)
Challenges:
- Risk of "garbage-in-garbage-out" without data governance
- Potential for dangling references without validation
- Requires rigor from data contributors
- Need for well-thought-out data management strategy
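The dangling-reference risk can be checked mechanically. The sketch below assumes triples are available as plain (subject, predicate, object) tuples and that the caller knows which predicates are object references; the function name and the identifiers in the sample data are hypothetical.

```python
def find_dangling_references(triples, reference_predicates):
    """Return reference triples whose object never appears as a subject."""
    subjects = {s for s, _, _ in triples}
    return [
        (s, p, o)
        for s, p, o in triples
        if p in reference_predicates and o not in subjects
    ]


triples = [
    ('urn:uuid:feeder-1', 'rdf:type', 'cim:Feeder'),
    ('urn:uuid:feeder-1', 'cim:IdentifiedObject.name', 'example feeder'),
    # The substation below has no triples of its own, so the reference dangles
    ('urn:uuid:feeder-1', 'cim:Feeder.NormalEnergizingSubstation',
     'urn:uuid:substation-1'),
]

dangling = find_dangling_references(
    triples, {'cim:Feeder.NormalEnergizingSubstation'}
)
print(dangling)
```

A check like this only covers references inside one dataset; a fuller validation strategy would typically rely on SHACL shapes as mentioned above.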
UML Sequence Diagrams¶
This section contains UML sequence diagrams explaining how the BlazegraphConnection class executes database queries and API calls. The diagrams are rendered from flat text using mermaid.js.
Blazegraph Connection Initialization¶
from mermaid import Mermaid
with open('./images/3_3_connect.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
get_object() - Retrieving Object by mRID¶
with open('./images/3_3_get_object.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
get_from_triple() - Querying Object Attributes¶
with open('./images/3_3_get_from_triple.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
execute() - Executing SPARQL Query¶
with open('./images/3_3_execute.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
update() - Executing SPARQL Update¶
with open('./images/3_3_update.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
create_new_graph() - Building Feeder Topology¶
with open('./images/3_3_create_new_graph.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
create_distributed_graph() - Building Distributed Model¶
with open('./images/3_3_create_distributed_graph.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
get_all_edges() - Parallel Graph Expansion¶
with open('./images/3_3_get_all_edges.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
get_edges_query() - Query Debugging¶
with open('./images/3_3_get_edges_query.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
parse_node_query() - Parsing Network Topology¶
with open('./images/3_3_parse_node_query.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
edge_query_parser() - Parsing Edge Query Results¶
with open('./images/3_3_parse_edges_query.txt', 'r') as diagram:
diagram_text = diagram.read()
Mermaid(diagram_text)
Blazegraph Web Interface¶
Blazegraph provides a built-in web-based SPARQL query interface for interactive querying and debugging.
Access:
- GridAPPS-D Docker: http://localhost:8889/bigdata/
- Standalone: http://localhost:9999/blazegraph/
Features:
- SPARQL query editor with syntax highlighting
- Query execution with result display
- Namespace management
- Database statistics and monitoring
- Data loading interface (bulk load RDF files)
Usage Tip: Copy SPARQL queries from get_edges_query() into the web interface to test and debug query performance.
Performance Optimization¶
Memory Configuration¶
For large models, increase JVM heap size:
java -server -Xmx8g -jar blazegraph.jar
Query Batching¶
CIMantic Graphs automatically batches queries in groups of 100 objects. This is optimized for Blazegraph's query processing.
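The batching idea can be sketched in a few lines. This is an illustration of splitting a list into groups of 100, not the library's own code; `make_batches()` and the sample identifiers are hypothetical.

```python
def make_batches(items, batch_size=100):
    """Yield consecutive slices of items, each at most batch_size long."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


mrids = [f'object-{i}' for i in range(250)]
print([len(batch) for batch in make_batches(mrids)])  # [100, 100, 50]
```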
Parallel Execution¶
The get_all_edges() method uses parallel query execution with ThreadPoolExecutor for maximum performance on multi-core systems.
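A minimal sketch of the pattern, assuming each batch becomes one independent query: `run_batch()` is a stand-in that only counts its input, and `run_in_parallel()` is a hypothetical name. It shows how ThreadPoolExecutor distributes batches across worker threads, not how get_all_edges() is actually written.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(batch):
    """Stand-in for one database query; returns the number of items handled."""
    return len(batch)


def run_in_parallel(batches, max_workers=4):
    """Run run_batch on every batch using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, even if batches finish out of order
        return list(pool.map(run_batch, batches))


batches = [list(range(100)), list(range(100)), list(range(50))]
print(run_in_parallel(batches))  # [100, 100, 50]
```

Threads suit this workload because each worker spends most of its time waiting on the database response rather than computing.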
Indexing¶
Blazegraph automatically indexes RDF triples for fast query execution. No manual index configuration required.
Troubleshooting¶
Connection Refused¶
ConnectionRefusedError: [Errno 111] Connection refused
- Verify Blazegraph is running
- Check CIMG_URL environment variable
- Ensure correct port (8889 for Docker, 9999 for standalone)
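These checks can be scripted with the standard library. The sketch below extracts the host and port from an endpoint URL and offers a simple TCP reachability test; both function names are hypothetical.

```python
import socket
from urllib.parse import urlsplit


def endpoint_host_port(url):
    """Return (host, port) for a URL, falling back to the scheme default port."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == 'https' else 80)
    return parts.hostname, port


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # includes ConnectionRefusedError and timeouts
        return False


host, port = endpoint_host_port(
    'http://localhost:8889/bigdata/namespace/kb/sparql'
)
print(host, port)
```

Calling `is_port_open(host, port)` with the value of `CIMG_URL` distinguishes a server that is not running from a wrong path or namespace in the URL.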
Query Timeout¶
TimeoutError: Query execution exceeded timeout
- Increase JVM heap size for Blazegraph
- Reduce batch size for large models
- Check Blazegraph web interface for slow queries
Empty Results¶
- Verify data is loaded into Blazegraph
- Check namespace matches CIM profile
- Use Blazegraph web interface to verify data exists
Profile Mismatch¶
Class XYZ not in data profile
- Verify CIMG_CIM_PROFILE matches database content
- Check Docker image version corresponds to correct profile
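The tag-to-profile correspondence given under Environment Configuration can be encoded as a small check. This is a sketch under the assumption that tags follow the `vYYYY.MM.P` pattern shown on this page; `expected_profile()` is a hypothetical helper, and tags outside the documented ranges are rejected rather than guessed.

```python
def expected_profile(image_tag):
    """Map a gridappsd/blazegraph tag such as 'v2024.09.0' to a profile name."""
    version = tuple(int(part) for part in image_tag.lstrip('v').split('.'))
    if version >= (2025, 1, 0):
        return 'cimhub_2023'
    if (2021, 1, 0) <= version <= (2024, 9, 0):
        return 'rc4_2021'
    raise ValueError(f'No profile documented for tag {image_tag}')


print(expected_profile('v2024.09.0'))  # rc4_2021
print(expected_profile('v2025.01.0'))  # cimhub_2023
```

Comparing the result with the value of `CIMG_CIM_PROFILE` catches the most common cause of this error before any query is sent.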
Loading Data into Blazegraph¶
Via Web Interface¶
- Navigate to http://localhost:8889/bigdata/
- Click "Update" tab
- Select "File Path or URL"
- Choose RDF file format (RDF/XML for CIM)
- Click "Update" to load data
Via CIMantic Graphs¶
Load from XML file and upload to Blazegraph:
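# Data profile module; use the one matching your CIM profile
import cimgraph.data_profile.cimhub_2023 as cim
from cimgraph.databases import XMLFile, BlazegraphConnection
from cimgraph.models import FeederModel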
# Load from XML
xml_file = XMLFile(filename='../../sample_models/ieee13.xml')
network = FeederModel(container=cim.Feeder(), connection=xml_file)
# Upload to Blazegraph
db = BlazegraphConnection()
db.upload(network.graph)
Via REST API¶
Use curl to bulk load RDF files:
curl -X POST \
-H 'Content-Type: application/rdf+xml' \
--data-binary @model.xml \
http://localhost:8889/bigdata/namespace/kb/sparql