Introduction
Welcome to Chapter 9! By now, you’ve grasped the core philosophy of OpenZL: its power lies in understanding your data’s structure to achieve superior compression. But theory is only half the battle, right? In this chapter, we’re going to roll up our sleeves and dive into the practical side of things: integrating OpenZL directly into your C++ applications.
This is where the magic truly happens! You’ll learn how to leverage OpenZL’s C++ API to define your data’s structure, create specialized compressors, and efficiently compress and decompress structured data. We’ll build up a working example piece by piece, ensuring you understand every step.
Before we begin, it’s assumed you have a basic understanding of C++ programming, including structs, classes, and memory management. Familiarity with CMake for C++ project setup will also be beneficial. Most importantly, you should be comfortable with the OpenZL core concepts like data descriptors and compression plans, as covered in previous chapters. Let’s get started!
Core Concepts: OpenZL’s C++ Integration Model
Integrating OpenZL into a C++ application involves interacting with its dedicated C++ API. The fundamental idea remains the same: OpenZL needs a detailed “map” of your data to work its magic. In C++, this map is constructed programmatically using OpenZL’s descriptor building utilities.
The OpenZL C++ API Workflow
At its heart, the OpenZL C++ API provides classes and functions to:
- Describe your data’s format: Define a DataDescriptor that matches your C++ struct or class layout.
- Create a Compressor: Instantiate an openzl::Compressor object using your DataDescriptor. This object encapsulates the specialized compression logic tailored for your data.
- Compress data: Feed your raw structured data to the openzl::Compressor.
- Create a Decompressor: Instantiate an openzl::Decompressor object, often from the compressed data itself or using the same descriptor.
- Decompress data: Use the openzl::Decompressor to reconstruct the original data.
Think of it like this: you’re giving OpenZL a blueprint of your house (the data descriptor), and OpenZL then builds a custom moving truck (the compressor) specifically designed to pack and unpack your furniture (your data) in the most efficient way possible.
In order, the workflow runs: C++ struct, DataDescriptor, Compressor, compressed bytes, Decompressor, reconstructed struct.
Data Descriptors in C++
In C++, defining a DataDescriptor typically involves using a builder pattern or a set of helper functions provided by OpenZL. You’ll specify the types, sizes, and offsets of your struct members. For example, if you have a struct with an integer and a float, your descriptor will reflect that. This is crucial because OpenZL operates on raw memory, so it needs to know exactly how your C++ types map to bytes.
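To make "types, sizes, and offsets" concrete, here is a minimal sketch that uses only the standard offsetof and sizeof facilities to collect the information a descriptor has to capture. It does not call OpenZL at all. Sample, FieldInfo, and describeSample are hypothetical names introduced purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical example struct: one integer and one float, as in the text.
struct Sample {
    std::int32_t count;
    float value;
};

// One row of a hand-written layout table (illustrative, not an OpenZL type).
struct FieldInfo {
    std::string name;
    std::size_t offset;  // byte offset from the start of the struct
    std::size_t size;    // size of the member in bytes
};

// Collect name, offset, and size for each member of Sample.
inline std::vector<FieldInfo> describeSample() {
    return {
        {"count", offsetof(Sample, count), sizeof(Sample::count)},
        {"value", offsetof(Sample, value), sizeof(Sample::value)},
    };
}
```

Printing each row of describeSample() next to your descriptor definition is a quick way to confirm that the two describe the same bytes.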
openzl::Compressor and openzl::Decompressor
These are the primary classes you’ll interact with.
- The openzl::Compressor is initialized with a DataDescriptor. It then intelligently selects and combines internal codecs to form an optimal “compression plan” for your specific data structure.
- The openzl::Decompressor is used to reverse the process. It often infers the necessary decompression plan from metadata embedded within the compressed data itself, or it can be initialized with the same DataDescriptor.
Step-by-Step Implementation
Let’s walk through integrating OpenZL into a simple C++ program. We’ll define a basic data structure, create its descriptor, compress an instance, and then decompress it.
Step 1: Setting up Your C++ Project with CMake
First, you’ll need a CMakeLists.txt file to manage your project and link against the OpenZL library. Assuming OpenZL is installed on your system (or you’ve built it from source and know its location), linking it is straightforward.
Create a CMakeLists.txt file:
# CMakeLists.txt
cmake_minimum_required(VERSION 3.15) # OpenZL requires C++17, so a modern CMake is good.
project(OpenZL_Cpp_Example LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
# Find OpenZL. This assumes OpenZL is installed in a standard location
# or discoverable via CMAKE_PREFIX_PATH.
# Replace with actual path if not found automatically, e.g.,
# find_package(OpenZL CONFIG REQUIRED HINTS "/path/to/openzl_install")
find_package(OpenZL CONFIG REQUIRED)
# Add your executable
add_executable(my_openzl_app main.cpp)
# Link against the OpenZL library
target_link_libraries(my_openzl_app PRIVATE OpenZL::OpenZL)
# If OpenZL requires specific include directories that find_package doesn't handle:
# target_include_directories(my_openzl_app PRIVATE ${OpenZL_INCLUDE_DIRS})
Explanation:
- cmake_minimum_required(VERSION 3.15): Specifies the minimum CMake version. Using 3.15 or newer is good practice for C++17 projects.
- project(OpenZL_Cpp_Example LANGUAGES CXX): Defines your project name and specifies it’s a C++ project.
- set(CMAKE_CXX_STANDARD 17): Ensures your project uses the C++17 standard, which OpenZL requires.
- find_package(OpenZL CONFIG REQUIRED): Tells CMake to look for the OpenZL library. It expects OpenZL to have an installed configuration file (OpenZLConfig.cmake). If OpenZL isn’t in a standard system path, you might need to set CMAKE_PREFIX_PATH or use HINTS to point to its installation directory (e.g., find_package(OpenZL CONFIG REQUIRED HINTS "/usr/local/openzl")).
- add_executable(my_openzl_app main.cpp): Creates an executable named my_openzl_app from main.cpp.
- target_link_libraries(my_openzl_app PRIVATE OpenZL::OpenZL): Links your executable against the OpenZL library. OpenZL::OpenZL is the target name this example assumes. The exported name is defined by OpenZL’s CMake package configuration, so check the installed config files if CMake reports an unknown target.
Step 2: Defining Your Data Structure
Let’s create a simple C++ struct that we want to compress. Imagine we’re tracking sensor readings.
Create main.cpp:
// main.cpp
#include <cstdint>   // For uint8_t (byte buffers in later steps)
#include <cstring>   // For std::memcpy (used in Step 5)
#include <iostream>
#include <string>    // Not directly compressed in this example, but useful for context
#include <vector>

// Include OpenZL headers
// The exact header paths might vary slightly based on OpenZL's installation,
// but these are common patterns.
#include <openzl/openzl.h>       // Main header
#include <openzl/descriptor.h>   // For DataDescriptor definition
#include <openzl/compressor.h>   // For Compressor class
#include <openzl/decompressor.h> // For Decompressor class

// Our simple structured data type
struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

int main() {
    std::cout << "Starting OpenZL C++ Integration Example (2026-01-26)" << std::endl;

    // --- Data Definition ---
    SensorReading myReading = {101, 25.5, 60.2f, true};
    std::cout << "Original Reading: ID=" << myReading.id
              << ", Temp=" << myReading.temperature
              << ", Humid=" << myReading.humidity
              << ", Active=" << (myReading.isActive ? "true" : "false") << std::endl;

    // We'll add OpenZL descriptor and compression logic here.
    return 0;
}
Explanation:
- We include necessary standard C++ headers and the core OpenZL headers.
- SensorReading is our custom data structure. We’ll be compressing instances of this struct.
- The main function starts by initializing an instance of SensorReading and printing its values, just to have some data to work with.
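Because the later steps treat SensorReading as a block of raw bytes, it helps to have the compiler confirm that this is safe. The sketch below uses only standard type traits. isRawByteSafe and LabeledReading are hypothetical names added for illustration and are not part of OpenZL.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

// Hypothetical counterexample: std::string owns heap memory, so the
// struct's raw bytes do not contain the text itself.
struct LabeledReading {
    std::string label;
    double value;
};

// True when T can be copied byte-for-byte and has a predictable member order.
template <typename T>
constexpr bool isRawByteSafe() {
    return std::is_trivially_copyable<T>::value &&
           std::is_standard_layout<T>::value;
}

// Fail the build early if someone later adds a non-trivial member.
static_assert(isRawByteSafe<SensorReading>(),
              "SensorReading must be safe to copy as raw bytes");
static_assert(!isRawByteSafe<LabeledReading>(),
              "LabeledReading is expected to be rejected");
```

Placing a static_assert like this next to the struct definition turns a subtle runtime corruption into a compile-time error.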
Step 3: Defining the OpenZL DataDescriptor
Now, let’s tell OpenZL about the SensorReading struct. We’ll use OpenZL’s descriptor API to map the C++ types to OpenZL’s internal representation.
Modify main.cpp (add this after std::cout << "Original Reading..."):
    // --- Define OpenZL DataDescriptor for SensorReading ---
    // OpenZL needs to know the layout of your C++ struct in memory.
    // We create a descriptor that matches the struct's fields.
    openzl::DataDescriptor sensorDescriptor =
        openzl::descriptor::struct_("SensorReading")                  // Name of the struct for debugging/identification
            .field("id", openzl::descriptor::int32())                 // 'id' field is a 32-bit integer
            .field("temperature", openzl::descriptor::float64())      // 'temperature' is a 64-bit float (double)
            .field("humidity", openzl::descriptor::float32())         // 'humidity' is a 32-bit float (float)
            .field("isActive", openzl::descriptor::boolean());        // 'isActive' is a boolean
    std::cout << "\nDataDescriptor created for SensorReading." << std::endl;

    // We'll continue adding compression logic here.
Explanation:
- openzl::DataDescriptor sensorDescriptor = ...: We declare an openzl::DataDescriptor object.
- openzl::descriptor::struct_("SensorReading"): This starts the definition of a structured type. The string “SensorReading” is a human-readable name.
- .field("id", openzl::descriptor::int32()): For each member of our SensorReading struct, we add a field to the descriptor. We provide the field’s name (matching the struct member) and its OpenZL type (e.g., int32(), float64(), boolean()). OpenZL provides a rich set of primitive type descriptors.
- Crucial Point: The order of .field() calls should match the order of members in your C++ struct. This example gives no explicit offsets, so field order is the only thing tying each descriptor entry to a position in memory. Keep in mind that the compiler usually inserts padding between members (here, typically after id and after isActive), so sizeof(SensorReading) is usually larger than the sum of the field sizes.
Step 4: Creating a Compressor and Compressing Data
With the descriptor ready, we can now create an openzl::Compressor instance and use it to compress our SensorReading.
Modify main.cpp (add this after std::cout << "\nDataDescriptor created..."):
    // --- Create OpenZL Compressor ---
    // The compressor is initialized with the data descriptor.
    // OpenZL then builds an optimized compression plan based on this descriptor.
    openzl::Compressor compressor(sensorDescriptor);
    std::cout << "OpenZL Compressor initialized." << std::endl;

    // --- Compress Data ---
    // OpenZL works with raw byte buffers.
    // We cast the address of our struct to a const uint8_t* and provide its size.
    // The compress method returns a vector of bytes (the compressed data).
    std::vector<uint8_t> compressedData = compressor.compress(
        reinterpret_cast<const uint8_t*>(&myReading),  // Pointer to the start of our struct
        sizeof(SensorReading)                          // Total size of our struct in bytes
    );
    std::cout << "Original data size: " << sizeof(SensorReading) << " bytes" << std::endl;
    std::cout << "Compressed data size: " << compressedData.size() << " bytes" << std::endl;

    // We'll add decompression logic here.
Explanation:
- openzl::Compressor compressor(sensorDescriptor);: We create an instance of openzl::Compressor, passing our sensorDescriptor. This is where OpenZL analyzes the structure and prepares its internal compression logic.
- compressor.compress(...): This method takes two arguments:
  - reinterpret_cast<const uint8_t*>(&myReading): A pointer to the raw bytes of our SensorReading struct. We cast it to const uint8_t* as OpenZL expects a byte array.
  - sizeof(SensorReading): The total size of our struct in bytes.
- The compress method returns a std::vector<uint8_t> containing the compressed data. We print the sizes to see the compression ratio.
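If the pointer-plus-size view of a struct feels opaque, the following sketch shows the same idea using only the standard library: it copies an object's bytes into a std::vector<uint8_t>. toBytes is a hypothetical helper, not an OpenZL function, and it copies the data rather than aliasing it as the reinterpret_cast above does.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

// Copy the object representation of value into a byte vector.
template <typename T>
std::vector<std::uint8_t> toBytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "toBytes requires a trivially copyable type");
    std::vector<std::uint8_t> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}
```

The resulting vector always holds exactly sizeof(T) bytes, padding included, which is the same length the compress call above receives.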
Step 5: Creating a Decompressor and Decompressing Data
Finally, let’s decompress the data back into a SensorReading struct.
Modify main.cpp (add this after std::cout << "Compressed data size..."):
    // --- Create OpenZL Decompressor ---
    // The decompressor can often infer the descriptor from the compressed data itself,
    // or you can explicitly provide it. For simplicity, we'll use the compressed data.
    openzl::Decompressor decompressor;  // Default constructor, will use metadata from compressed data
    std::cout << "OpenZL Decompressor initialized." << std::endl;

    // --- Decompress Data ---
    // The decompress method returns a vector of bytes representing the decompressed data.
    std::vector<uint8_t> decompressedBytes = decompressor.decompress(
        compressedData.data(),  // Pointer to the compressed data
        compressedData.size()   // Size of the compressed data
    );

    // Verify decompressed size matches original struct size
    if (decompressedBytes.size() != sizeof(SensorReading)) {
        std::cerr << "Error: Decompressed size does not match original struct size!" << std::endl;
        return 1;
    }

    // --- Reconstruct Original Struct ---
    // Copy the decompressed bytes back into a new SensorReading struct.
    // std::memcpy requires #include <cstring> at the top of main.cpp.
    SensorReading decompressedReading;
    std::memcpy(&decompressedReading, decompressedBytes.data(), sizeof(SensorReading));
    std::cout << "\nDecompressed Reading: ID=" << decompressedReading.id
              << ", Temp=" << decompressedReading.temperature
              << ", Humid=" << decompressedReading.humidity
              << ", Active=" << (decompressedReading.isActive ? "true" : "false") << std::endl;

    // --- Verification ---
    if (myReading.id == decompressedReading.id &&
        myReading.temperature == decompressedReading.temperature &&
        myReading.humidity == decompressedReading.humidity &&
        myReading.isActive == decompressedReading.isActive) {
        std::cout << "\nDecompression successful! Data matches original." << std::endl;
    } else {
        std::cerr << "\nDecompression failed! Data mismatch." << std::endl;
    }
Explanation:
- openzl::Decompressor decompressor;: We create a decompressor. OpenZL is designed to embed metadata within the compressed stream, allowing the decompressor to automatically determine the structure and plan needed for decompression.
- decompressor.decompress(...): This method takes the compressed byte array and its size. It returns a std::vector<uint8_t> containing the original, uncompressed bytes.
- std::memcpy(&decompressedReading, decompressedBytes.data(), sizeof(SensorReading));: We use memcpy to copy the raw decompressed bytes back into a new SensorReading struct. This is a common pattern when dealing with raw memory and structs in C++, and it is valid here because SensorReading is trivially copyable (it contains only plain scalar members).
- Finally, we print the decompressed values and perform a simple field-by-field check to ensure they match the original data.
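The size check and the memcpy belong together, so it can be convenient to wrap them in one helper. This is a minimal sketch using only the standard library. fromBytes is a hypothetical name, and the byte buffer in the usage note stands in for whatever your decompressor returns.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

// Rebuild a T from raw bytes. Returns false, leaving out untouched,
// when the buffer is not exactly sizeof(T) bytes long.
template <typename T>
bool fromBytes(const std::vector<std::uint8_t>& bytes, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "fromBytes requires a trivially copyable type");
    if (bytes.size() != sizeof(T)) {
        return false;
    }
    std::memcpy(&out, bytes.data(), sizeof(T));
    return true;
}
```

A call such as fromBytes(decompressedBytes, decompressedReading) would replace the separate size check and memcpy in the step above, and a false return signals a truncated or oversized buffer before any memory is written.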
Full main.cpp for Reference:
#include <cstdint>   // For uint8_t
#include <cstring>   // For std::memcpy
#include <iostream>
#include <string>
#include <vector>

// Include OpenZL headers
#include <openzl/openzl.h>
#include <openzl/descriptor.h>
#include <openzl/compressor.h>
#include <openzl/decompressor.h>

// Our simple structured data type
struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

int main() {
    std::cout << "Starting OpenZL C++ Integration Example (2026-01-26)" << std::endl;

    // --- Data Definition ---
    SensorReading myReading = {101, 25.5, 60.2f, true};
    std::cout << "Original Reading: ID=" << myReading.id
              << ", Temp=" << myReading.temperature
              << ", Humid=" << myReading.humidity
              << ", Active=" << (myReading.isActive ? "true" : "false") << std::endl;

    // --- Define OpenZL DataDescriptor for SensorReading ---
    openzl::DataDescriptor sensorDescriptor =
        openzl::descriptor::struct_("SensorReading")
            .field("id", openzl::descriptor::int32())
            .field("temperature", openzl::descriptor::float64())
            .field("humidity", openzl::descriptor::float32())
            .field("isActive", openzl::descriptor::boolean());
    std::cout << "\nDataDescriptor created for SensorReading." << std::endl;

    // --- Create OpenZL Compressor ---
    openzl::Compressor compressor(sensorDescriptor);
    std::cout << "OpenZL Compressor initialized." << std::endl;

    // --- Compress Data ---
    std::vector<uint8_t> compressedData = compressor.compress(
        reinterpret_cast<const uint8_t*>(&myReading),
        sizeof(SensorReading)
    );
    std::cout << "Original data size: " << sizeof(SensorReading) << " bytes" << std::endl;
    std::cout << "Compressed data size: " << compressedData.size() << " bytes" << std::endl;

    // --- Create OpenZL Decompressor ---
    openzl::Decompressor decompressor;
    std::cout << "OpenZL Decompressor initialized." << std::endl;

    // --- Decompress Data ---
    std::vector<uint8_t> decompressedBytes = decompressor.decompress(
        compressedData.data(),
        compressedData.size()
    );
    if (decompressedBytes.size() != sizeof(SensorReading)) {
        std::cerr << "Error: Decompressed size does not match original struct size!" << std::endl;
        return 1;
    }

    // --- Reconstruct Original Struct ---
    SensorReading decompressedReading;
    std::memcpy(&decompressedReading, decompressedBytes.data(), sizeof(SensorReading));
    std::cout << "\nDecompressed Reading: ID=" << decompressedReading.id
              << ", Temp=" << decompressedReading.temperature
              << ", Humid=" << decompressedReading.humidity
              << ", Active=" << (decompressedReading.isActive ? "true" : "false") << std::endl;

    // --- Verification ---
    if (myReading.id == decompressedReading.id &&
        myReading.temperature == decompressedReading.temperature &&
        myReading.humidity == decompressedReading.humidity &&
        myReading.isActive == decompressedReading.isActive) {
        std::cout << "\nDecompression successful! Data matches original." << std::endl;
    } else {
        std::cerr << "\nDecompression failed! Data mismatch." << std::endl;
    }
    return 0;
}
Step 6: Building and Running
- Save the CMakeLists.txt and main.cpp files in the same directory.
- Create a build directory: mkdir build && cd build
- Configure CMake: cmake .. (This tells CMake to look for CMakeLists.txt in the parent directory).
- Build your application: cmake --build .
- Run: ./my_openzl_app
You should see output similar to this (compression sizes may vary slightly depending on OpenZL version and internal codecs):
Starting OpenZL C++ Integration Example (2026-01-26)
Original Reading: ID=101, Temp=25.5, Humid=60.2, Active=true
DataDescriptor created for SensorReading.
OpenZL Compressor initialized.
Original data size: 24 bytes
Compressed data size: 18 bytes
OpenZL Decompressor initialized.
Decompressed Reading: ID=101, Temp=25.5, Humid=60.2, Active=true
Decompression successful! Data matches original.
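In this sample output the Compressed data size is smaller than the Original data size. Treat the exact numbers as illustrative: with a single 24-byte record, the framing and metadata in the compressed stream can outweigh any savings, so your compressed size may well be larger than the input. The benefit of structure-aware compression shows up when you compress many records at once.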
Mini-Challenge: Compressing an Array of Structs
You’ve successfully compressed a single SensorReading. Now, let’s take it up a notch.
Challenge: Modify the main.cpp code to compress a std::vector<SensorReading> (an array of sensor readings) instead of just one. You’ll need to:
- Create a std::vector<SensorReading> and populate it with a few instances.
- Adjust the DataDescriptor to describe an array of SensorReading structs.
- Modify the compressor.compress() and decompressor.decompress() calls to handle the vector’s data and total size.
- Verify that all readings in the decompressed vector match the original.
Hint: OpenZL’s descriptor namespace offers functions for arrays. Look for openzl::descriptor::array_of(...) or similar to define a descriptor for a sequence of your SensorReading struct. Remember that std::vector stores its elements contiguously in memory, so you can treat its data as a raw array. The total size will be vector.size() * sizeof(SensorReading).
What to Observe/Learn: This challenge will solidify your understanding of how OpenZL handles collections of structured data, which is a very common use case. You’ll see how flexible the descriptor system is.
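The buffer-handling half of the challenge does not depend on OpenZL, so it can be sketched with the standard library alone. In the sketch below, recordsToBytes, bytesToRecords, and sameReading are hypothetical helper names. The descriptor and the compress/decompress calls are deliberately left out, since they are the part you are meant to work out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

// Field-by-field comparison (padding bytes are ignored on purpose).
inline bool sameReading(const SensorReading& a, const SensorReading& b) {
    return a.id == b.id && a.temperature == b.temperature &&
           a.humidity == b.humidity && a.isActive == b.isActive;
}

// Flatten a vector of records into size() * sizeof(SensorReading) bytes.
inline std::vector<std::uint8_t> recordsToBytes(
        const std::vector<SensorReading>& records) {
    std::vector<std::uint8_t> bytes(records.size() * sizeof(SensorReading));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), records.data(), bytes.size());
    }
    return bytes;
}

// Rebuild the records. Fails if the byte count is not a whole number of structs.
inline bool bytesToRecords(const std::vector<std::uint8_t>& bytes,
                           std::vector<SensorReading>& out) {
    if (bytes.size() % sizeof(SensorReading) != 0) {
        return false;
    }
    out.resize(bytes.size() / sizeof(SensorReading));
    if (!bytes.empty()) {
        std::memcpy(out.data(), bytes.data(), bytes.size());
    }
    return true;
}
```

In a full solution, the output of recordsToBytes (or simply records.data() with the same length) is what you would hand to the compressor, and bytesToRecords is what you would run on the decompressed buffer before comparing each element with sameReading.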
Common Pitfalls & Troubleshooting
Incorrect DataDescriptor Definition:
- Symptom: Poor compression ratios, data corruption after decompression, or even crashes.
- Cause: The DataDescriptor doesn’t accurately reflect the C++ struct’s memory layout. This could be incorrect types (e.g., int32() for a long long), wrong field order, or padding that the descriptor does not account for.
- Fix: Double-check your struct definition and ensure every field in the C++ struct has a corresponding, correctly typed field in the DataDescriptor, in the same order. Be mindful of compiler padding: sizeof(SensorReading) is typically 24 bytes on a 64-bit platform even though its fields total 17 bytes. If round-tripped data looks wrong, compare offsetof and sizeof for your struct against what the descriptor describes, and specify offsets explicitly if your OpenZL version supports it.
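One way to take compiler padding out of the picture is to serialize the fields yourself into a packed buffer, so that the bytes you compress contain exactly the listed fields in order and nothing else. The sketch below does this with the standard library only. packReading, unpackReading, and kPackedSize are hypothetical names. Whether a packed or a padded layout is the better match for your descriptor is an assumption you should verify against your OpenZL version. The bytes are written in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

// Packed size: the three numeric fields plus one byte for the flag.
constexpr std::size_t kPackedSize =
    sizeof(int) + sizeof(double) + sizeof(float) + 1;

// Write the fields back to back, in declaration order, with no padding.
inline std::vector<std::uint8_t> packReading(const SensorReading& r) {
    std::vector<std::uint8_t> out(kPackedSize);
    std::size_t pos = 0;
    std::memcpy(out.data() + pos, &r.id, sizeof(r.id));
    pos += sizeof(r.id);
    std::memcpy(out.data() + pos, &r.temperature, sizeof(r.temperature));
    pos += sizeof(r.temperature);
    std::memcpy(out.data() + pos, &r.humidity, sizeof(r.humidity));
    pos += sizeof(r.humidity);
    out[pos] = r.isActive ? 1 : 0;
    return out;
}

// Read the fields back. Fails if the buffer is not exactly kPackedSize bytes.
inline bool unpackReading(const std::vector<std::uint8_t>& in, SensorReading& r) {
    if (in.size() != kPackedSize) {
        return false;
    }
    std::size_t pos = 0;
    std::memcpy(&r.id, in.data() + pos, sizeof(r.id));
    pos += sizeof(r.id);
    std::memcpy(&r.temperature, in.data() + pos, sizeof(r.temperature));
    pos += sizeof(r.temperature);
    std::memcpy(&r.humidity, in.data() + pos, sizeof(r.humidity));
    pos += sizeof(r.humidity);
    r.isActive = in[pos] != 0;
    return true;
}
```

The trade-off is an extra copy per record in exchange for a byte layout that no longer depends on the compiler's alignment rules.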
Linking Errors (CMake Issues):
- Symptom: The build fails with messages like “undefined reference to openzl::Compressor::Compressor(...)” at link time, or “OpenZL not found” when CMake configures the project.
- Cause: CMake cannot locate the OpenZL library, or your project isn’t correctly linking against it.
- Fix:
  - Ensure OpenZL is installed and its OpenZLConfig.cmake file is discoverable.
  - If OpenZL is installed in a non-standard location, set CMAKE_PREFIX_PATH before running cmake (e.g., export CMAKE_PREFIX_PATH=/usr/local/openzl:$CMAKE_PREFIX_PATH).
  - Verify target_link_libraries uses the correct target name (OpenZL::OpenZL).
Memory Management and Buffer Handling:
- Symptom: Crashes, segfaults, or unexpected behavior.
- Cause: Incorrectly managing the raw byte buffers, e.g., passing invalid pointers or incorrect sizes to compress or decompress. Using memcpy with wrong sizes can lead to buffer overflows.
- Fix: Always ensure the pointers passed to OpenZL functions are valid and point to the beginning of the data, and that the sizes accurately reflect the number of bytes available/expected. When copying back, use sizeof(YourStruct) or the expected total size of the collection.
Summary
Congratulations! You’ve successfully integrated OpenZL into a C++ application, covering the crucial steps from project setup to compression and decompression.
Here are the key takeaways from this chapter:
- CMake is essential for setting up C++ projects and linking against OpenZL.
- openzl::DataDescriptor is the cornerstone, providing OpenZL with the structural blueprint of your C++ data.
- openzl::Compressor and openzl::Decompressor are the primary classes for performing compression and decompression operations.
- OpenZL works with raw byte buffers, requiring you to cast your C++ structs or objects to uint8_t* pointers.
- Verification is critical to ensure data integrity after decompression.
- Careful descriptor definition and memory handling are crucial for stable and correct operation.
You’ve now got the tools to start building powerful, data-aware compression into your C++ projects. In the next chapter, we’ll explore more advanced topics, such as performance tuning and integrating OpenZL with different data sources.
References
- OpenZL GitHub Repository
- Introducing OpenZL: An Open Source Format-Aware Compression Framework - Engineering at Meta
- CMake Documentation
- C++ Standard Library (cppreference.com)
This page is AI-assisted and reviewed. It references official documentation and recognized resources where relevant.