Introduction

Welcome to Chapter 9! By now, you’ve grasped the core philosophy of OpenZL: its power lies in understanding your data’s structure to achieve superior compression. But theory is only half the battle, right? In this chapter, we’re going to roll up our sleeves and dive into the practical side of things: integrating OpenZL directly into your C++ applications.

This is where the magic truly happens! You’ll learn how to leverage OpenZL’s C++ API to define your data’s structure, create specialized compressors, and efficiently compress and decompress structured data. We’ll build up a working example piece by piece, ensuring you understand every step.

Before we begin, it’s assumed you have a basic understanding of C++ programming, including structs, classes, and memory management. Familiarity with CMake for C++ project setup will also be beneficial. Most importantly, you should be comfortable with the OpenZL core concepts like data descriptors and compression plans, as covered in previous chapters. Let’s get started!

Core Concepts: OpenZL’s C++ Integration Model

Integrating OpenZL into a C++ application involves interacting with its dedicated C++ API. The fundamental idea remains the same: OpenZL needs a detailed “map” of your data to work its magic. In C++, this map is constructed programmatically using OpenZL’s descriptor building utilities.

The OpenZL C++ API Workflow

At its heart, the OpenZL C++ API provides classes and functions to:

  1. Describe your data’s format: Define a DataDescriptor that matches your C++ struct or class layout.
  2. Create a Compressor: Instantiate an openzl::Compressor object using your DataDescriptor. This object encapsulates the specialized compression logic tailored for your data.
  3. Compress data: Feed your raw structured data to the openzl::Compressor.
  4. Create a Decompressor: Instantiate an openzl::Decompressor object, often from the compressed data itself or using the same descriptor.
  5. Decompress data: Use the openzl::Decompressor to reconstruct the original data.

Think of it like this: you’re giving OpenZL a blueprint of your house (the data descriptor), and OpenZL then builds a custom moving truck (the compressor) specifically designed to pack and unpack your furniture (your data) in the most efficient way possible.

Let’s visualize this workflow:

flowchart TD
    A[Your Raw C++ Data] -->|Contains structure| B{Define DataDescriptor}
    B --> C[Create openzl::Compressor]
    C -->|Feed raw data| D[Compress Data]
    D --> E[Compressed Binary Data]
    E --> F[Create openzl::Decompressor]
    F -->|Feed compressed data| G[Decompress Data]
    G --> H[Reconstructed C++ Data]

Data Descriptors in C++

In C++, defining a DataDescriptor typically involves using a builder pattern or a set of helper functions provided by OpenZL. You’ll specify the types, sizes, and offsets of your struct members. For example, if you have a struct with an integer and a float, your descriptor will reflect that. This is crucial because OpenZL operates on raw memory, so it needs to know exactly how your C++ types map to bytes.
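To make "types, sizes, and offsets" concrete, here is a minimal sketch that uses only the standard library to list where each member of a small struct sits in memory. The names `Sample`, `FieldLayout`, `describe_sample`, and `print_layout` are hypothetical helpers for illustration and are not part of OpenZL.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t
#include <iostream>
#include <vector>

// Hypothetical example struct: one integer and one float.
struct Sample {
    int count;
    float ratio;
};

// One row of the layout table: member name, byte offset, byte size.
struct FieldLayout {
    const char* name;
    std::size_t offset;
    std::size_t size;
};

// Collects the compiler-chosen offset and size of every member of Sample.
std::vector<FieldLayout> describe_sample() {
    return {
        {"count", offsetof(Sample, count), sizeof(Sample::count)},
        {"ratio", offsetof(Sample, ratio), sizeof(Sample::ratio)},
    };
}

// Prints the table so it can be compared against a hand-written descriptor.
void print_layout(const std::vector<FieldLayout>& fields) {
    for (const FieldLayout& f : fields) {
        // Every member must lie entirely inside the struct.
        assert(f.offset + f.size <= sizeof(Sample));
        std::cout << f.name << ": offset " << f.offset
                  << ", size " << f.size << '\n';
    }
    std::cout << "sizeof(Sample) = " << sizeof(Sample) << '\n';
}
```

Calling `print_layout(describe_sample())` prints one line per member. A table like this is a handy cross-check when you write a descriptor by hand, since the values come from the compiler rather than from your expectations.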

openzl::Compressor and openzl::Decompressor

These are the primary classes you’ll interact with.

  • The openzl::Compressor is initialized with a DataDescriptor. It then intelligently selects and combines internal codecs to form an optimal “compression plan” for your specific data structure.
  • The openzl::Decompressor is used to reverse the process. It often infers the necessary decompression plan from metadata embedded within the compressed data itself, or it can be initialized with the same DataDescriptor.
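To give a feel for why knowing the structure helps, the sketch below shows one general technique that structure-aware compressors can use: regrouping an array of records into one stream per field, so similar values sit next to each other. This is an illustration of the idea only. It is not OpenZL code, it does not describe OpenZL's internals, and `SensorColumns`, `split_into_columns`, and `join_columns` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

// Hypothetical column layout: one contiguous stream per field.
struct SensorColumns {
    std::vector<int> ids;
    std::vector<double> temperatures;
    std::vector<float> humidities;
    std::vector<unsigned char> activeFlags;  // 0 or 1; avoids std::vector<bool> bit packing
};

// Regroups row-oriented records into per-field streams.
SensorColumns split_into_columns(const std::vector<SensorReading>& rows) {
    SensorColumns cols;
    cols.ids.reserve(rows.size());
    cols.temperatures.reserve(rows.size());
    cols.humidities.reserve(rows.size());
    cols.activeFlags.reserve(rows.size());
    for (const SensorReading& r : rows) {
        cols.ids.push_back(r.id);
        cols.temperatures.push_back(r.temperature);
        cols.humidities.push_back(r.humidity);
        cols.activeFlags.push_back(r.isActive ? 1 : 0);
    }
    return cols;
}

// Reverses split_into_columns: rebuilds the original records in order.
std::vector<SensorReading> join_columns(const SensorColumns& cols) {
    const std::size_t n = cols.ids.size();
    // All streams must describe the same number of records.
    assert(cols.temperatures.size() == n);
    assert(cols.humidities.size() == n);
    assert(cols.activeFlags.size() == n);
    std::vector<SensorReading> rows;
    rows.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        rows.push_back({cols.ids[i], cols.temperatures[i],
                        cols.humidities[i], cols.activeFlags[i] != 0});
    }
    return rows;
}
```

After the split, each stream holds values of a single type and meaning, which a generic byte-oriented compressor would never see if it were handed the interleaved struct bytes.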

Step-by-Step Implementation

Let’s walk through integrating OpenZL into a simple C++ program. We’ll define a basic data structure, create its descriptor, compress an instance, and then decompress it.

Step 1: Setting up Your C++ Project with CMake

First, you’ll need a CMakeLists.txt file to manage your project and link against the OpenZL library. Assuming OpenZL is installed on your system (or you’ve built it from source and know its location), linking it is straightforward.

Create a CMakeLists.txt file:

# CMakeLists.txt
cmake_minimum_required(VERSION 3.15) # OpenZL requires C++17, so a modern CMake is good.
project(OpenZL_Cpp_Example LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Find OpenZL. This assumes OpenZL is installed in a standard location
# or discoverable via CMAKE_PREFIX_PATH.
# Replace with actual path if not found automatically, e.g.,
# find_package(OpenZL CONFIG REQUIRED HINTS "/path/to/openzl_install")
find_package(OpenZL CONFIG REQUIRED)

# Add your executable
add_executable(my_openzl_app main.cpp)

# Link against the OpenZL library
target_link_libraries(my_openzl_app PRIVATE OpenZL::OpenZL)

# If OpenZL requires specific include directories that find_package doesn't handle:
# target_include_directories(my_openzl_app PRIVATE ${OpenZL_INCLUDE_DIRS})

Explanation:

  • cmake_minimum_required(VERSION 3.15): Specifies the minimum CMake version. Using 3.15 or newer is good practice for C++17 projects.
  • project(OpenZL_Cpp_Example LANGUAGES CXX): Defines your project name and specifies it’s a C++ project.
  • set(CMAKE_CXX_STANDARD 17): Ensures your project uses the C++17 standard, which OpenZL requires.
  • find_package(OpenZL CONFIG REQUIRED): This command tells CMake to look for the OpenZL library. It expects OpenZL to have an installed configuration file (OpenZLConfig.cmake). If OpenZL isn’t in a standard system path, you might need to set CMAKE_PREFIX_PATH or use HINTS to point to its installation directory (e.g., find_package(OpenZL CONFIG REQUIRED HINTS "/usr/local/openzl")).
  • add_executable(my_openzl_app main.cpp): Creates an executable named my_openzl_app from main.cpp.
  • target_link_libraries(my_openzl_app PRIVATE OpenZL::OpenZL): Links your executable against the OpenZL library. OpenZL::OpenZL is the standard target name provided by OpenZL’s CMake configuration.

Step 2: Defining Your Data Structure

Let’s create a simple C++ struct that we want to compress. Imagine we’re tracking sensor readings.

Create main.cpp:

// main.cpp
#include <cstdint> // For uint8_t
#include <cstring> // For std::memcpy (used in Step 5)
#include <iostream>
#include <vector>
#include <string> // Not directly compressed in this example, but useful for context

// Include OpenZL headers
// The exact header paths might vary slightly based on OpenZL's installation,
// but these are common patterns.
#include <openzl/openzl.h> // Main header
#include <openzl/descriptor.h> // For DataDescriptor definition
#include <openzl/compressor.h> // For Compressor class
#include <openzl/decompressor.h> // For Decompressor class

// Our simple structured data type
struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

int main() {
    std::cout << "Starting OpenZL C++ Integration Example (2026-01-26)" << std::endl;

    // --- Data Definition ---
    SensorReading myReading = {101, 25.5, 60.2f, true};
    std::cout << "Original Reading: ID=" << myReading.id
              << ", Temp=" << myReading.temperature
              << ", Humid=" << myReading.humidity
              << ", Active=" << (myReading.isActive ? "true" : "false") << std::endl;

    // We'll add OpenZL descriptor and compression logic here.

    return 0;
}

Explanation:

  • We include necessary standard C++ headers and the core OpenZL headers.
  • SensorReading is our custom data structure. We’ll be compressing instances of this struct.
  • The main function starts by initializing an instance of SensorReading and printing its values, just to have some data to work with.

Step 3: Defining the OpenZL DataDescriptor

Now, let’s tell OpenZL about the SensorReading struct. We’ll use OpenZL’s descriptor API to map the C++ types to OpenZL’s internal representation.

Modify main.cpp (add this after std::cout << "Original Reading..."):

    // --- Define OpenZL DataDescriptor for SensorReading ---
    // OpenZL needs to know the layout of your C++ struct in memory.
    // We create a descriptor that matches the struct's fields.
    openzl::DataDescriptor sensorDescriptor =
        openzl::descriptor::struct_("SensorReading") // Name of the struct for debugging/identification
            .field("id", openzl::descriptor::int32()) // 'id' field is a 32-bit integer
            .field("temperature", openzl::descriptor::float64()) // 'temperature' is a 64-bit float (double)
            .field("humidity", openzl::descriptor::float32()) // 'humidity' is a 32-bit float (float)
            .field("isActive", openzl::descriptor::boolean()); // 'isActive' is a boolean

    std::cout << "\nDataDescriptor created for SensorReading." << std::endl;

    // We'll continue adding compression logic here.

Explanation:

  • openzl::DataDescriptor sensorDescriptor = ...: We declare an openzl::DataDescriptor object.
  • openzl::descriptor::struct_("SensorReading"): This starts the definition of a structured type. The string “SensorReading” is a human-readable name.
  • .field("id", openzl::descriptor::int32()): For each member of our SensorReading struct, we add a field to the descriptor. We provide the field’s name (matching the struct member) and its OpenZL type (e.g., int32(), float64(), boolean()). OpenZL provides a rich set of primitive type descriptors.
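  • Crucial Point: Keep the .field() calls in the same order as the members of your C++ struct so the bytes are interpreted correctly. Also account for alignment padding: on common 64-bit platforms sizeof(SensorReading) is 24 bytes, while the four fields occupy only 17, because the compiler inserts 4 padding bytes after id and 3 after isActive. The descriptor above does not mention that padding, so check how your OpenZL version treats it, and specify offsets explicitly if the API supports it, before relying on inferred offsets.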
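The padding point is easy to verify yourself. The following sketch, which uses only the standard library, computes how many bytes the members of SensorReading really occupy and copies them field by field, in declaration order, into a padding-free buffer and back. The names `kFieldBytes`, `kPaddingBytes`, `PackedReading`, `pack`, and `unpack` are hypothetical, and this is not how OpenZL itself handles padding.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

// Bytes occupied by the four members themselves, ignoring alignment padding.
constexpr std::size_t kFieldBytes =
    sizeof(int) + sizeof(double) + sizeof(float) + sizeof(bool);
static_assert(sizeof(SensorReading) >= kFieldBytes,
              "members cannot be larger than the struct");

// Bytes the compiler added between or after members to satisfy alignment.
constexpr std::size_t kPaddingBytes = sizeof(SensorReading) - kFieldBytes;

using PackedReading = std::array<unsigned char, kFieldBytes>;

// Copies each member, in declaration order, into a buffer with no padding.
PackedReading pack(const SensorReading& r) {
    PackedReading out{};
    std::size_t pos = 0;
    std::memcpy(out.data() + pos, &r.id, sizeof(r.id));
    pos += sizeof(r.id);
    std::memcpy(out.data() + pos, &r.temperature, sizeof(r.temperature));
    pos += sizeof(r.temperature);
    std::memcpy(out.data() + pos, &r.humidity, sizeof(r.humidity));
    pos += sizeof(r.humidity);
    std::memcpy(out.data() + pos, &r.isActive, sizeof(r.isActive));
    pos += sizeof(r.isActive);
    assert(pos == kFieldBytes);  // every byte of the buffer was written
    return out;
}

// Reads the members back in the same order they were written.
SensorReading unpack(const PackedReading& in) {
    SensorReading r{};
    std::size_t pos = 0;
    std::memcpy(&r.id, in.data() + pos, sizeof(r.id));
    pos += sizeof(r.id);
    std::memcpy(&r.temperature, in.data() + pos, sizeof(r.temperature));
    pos += sizeof(r.temperature);
    std::memcpy(&r.humidity, in.data() + pos, sizeof(r.humidity));
    pos += sizeof(r.humidity);
    std::memcpy(&r.isActive, in.data() + pos, sizeof(r.isActive));
    return r;
}
```

If `unpack` read the fields in a different order than `pack` wrote them, every value after the first mismatch would be garbage. That is the same failure mode as a descriptor whose field order or offsets disagree with the struct.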

Step 4: Creating a Compressor and Compressing Data

With the descriptor ready, we can now create an openzl::Compressor instance and use it to compress our SensorReading.

Modify main.cpp (add this after std::cout << "\nDataDescriptor created..."):

    // --- Create OpenZL Compressor ---
    // The compressor is initialized with the data descriptor.
    // OpenZL then builds an optimized compression plan based on this descriptor.
    openzl::Compressor compressor(sensorDescriptor);
    std::cout << "OpenZL Compressor initialized." << std::endl;

    // --- Compress Data ---
    // OpenZL works with raw byte buffers.
    // We cast the address of our struct to const uint8_t* and provide its size.
    // The compress method returns a vector of bytes (the compressed data).
    std::vector<uint8_t> compressedData = compressor.compress(
        reinterpret_cast<const uint8_t*>(&myReading), // Pointer to the start of our struct
        sizeof(SensorReading)                       // Total size of our struct in bytes
    );

    std::cout << "Original data size: " << sizeof(SensorReading) << " bytes" << std::endl;
    std::cout << "Compressed data size: " << compressedData.size() << " bytes" << std::endl;

    // We'll add decompression logic here.

Explanation:

  • openzl::Compressor compressor(sensorDescriptor);: We create an instance of openzl::Compressor, passing our sensorDescriptor. This is where OpenZL analyzes the structure and prepares its internal compression logic.
  • compressor.compress(...): This method takes two arguments:
    • reinterpret_cast<const uint8_t*>(&myReading): A pointer to the raw bytes of our SensorReading struct. We cast it to const uint8_t* as OpenZL expects a byte array.
    • sizeof(SensorReading): The total size of our struct in bytes.
  • The compress method returns a std::vector<uint8_t> containing the compressed data. We print the sizes to see the compression ratio.
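If you find yourself writing the same reinterpret_cast in several places, a small helper can centralize it and reject types that are unsafe to treat as raw bytes. The sketch below is independent of OpenZL; `ByteView` and `as_bytes` are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper type: a non-owning pointer-plus-length view over raw bytes.
struct ByteView {
    const std::uint8_t* data;
    std::size_t size;
};

// Returns a view over the object representation of `value`.
// The view is only valid while `value` is alive.
template <typename T>
ByteView as_bytes(const T& value) {
    // Types with virtual functions, owning pointers, or custom copy logic
    // must not be handed to a byte-oriented API; fail at compile time instead.
    static_assert(std::is_trivially_copyable<T>::value,
                  "as_bytes requires a trivially copyable type");
    return {reinterpret_cast<const std::uint8_t*>(&value), sizeof(T)};
}
```

With this in place, the two arguments passed to a byte-oriented compress call would come from a single `as_bytes(myReading)` result, so the pointer and the size can never disagree.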

Step 5: Creating a Decompressor and Decompressing Data

Finally, let’s decompress the data back into a SensorReading struct.

Modify main.cpp (add this after std::cout << "Compressed data size..."):

    // --- Create OpenZL Decompressor ---
    // The decompressor can often infer the descriptor from the compressed data itself,
    // or you can explicitly provide it. For simplicity, we'll use the compressed data.
    openzl::Decompressor decompressor; // Default constructor, will use metadata from compressed data
    std::cout << "OpenZL Decompressor initialized." << std::endl;

    // --- Decompress Data ---
    // The decompress method returns a vector of bytes representing the decompressed data.
    std::vector<uint8_t> decompressedBytes = decompressor.decompress(
        compressedData.data(),          // Pointer to the compressed data
        compressedData.size()           // Size of the compressed data
    );

    // Verify decompressed size matches original struct size
    if (decompressedBytes.size() != sizeof(SensorReading)) {
        std::cerr << "Error: Decompressed size does not match original struct size!" << std::endl;
        return 1;
    }

    // --- Reconstruct Original Struct ---
    // Copy the decompressed bytes back into a new SensorReading struct.
    SensorReading decompressedReading;
    std::memcpy(&decompressedReading, decompressedBytes.data(), sizeof(SensorReading));

    std::cout << "\nDecompressed Reading: ID=" << decompressedReading.id
              << ", Temp=" << decompressedReading.temperature
              << ", Humid=" << decompressedReading.humidity
              << ", Active=" << (decompressedReading.isActive ? "true" : "false") << std::endl;

    // --- Verification ---
    if (myReading.id == decompressedReading.id &&
        myReading.temperature == decompressedReading.temperature &&
        myReading.humidity == decompressedReading.humidity &&
        myReading.isActive == decompressedReading.isActive) {
        std::cout << "\nDecompression successful! Data matches original." << std::endl;
    } else {
        std::cerr << "\nDecompression failed! Data mismatch." << std::endl;
    }

Explanation:

  • openzl::Decompressor decompressor;: We create a decompressor. OpenZL is designed to embed metadata within the compressed stream, allowing the decompressor to automatically determine the structure and plan needed for decompression.
  • decompressor.decompress(...): This method takes the compressed byte array and its size. It returns a std::vector<uint8_t> containing the original, uncompressed bytes.
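  • std::memcpy(&decompressedReading, decompressedBytes.data(), sizeof(SensorReading));: We use memcpy to copy the raw decompressed bytes back into a new SensorReading struct. This works because SensorReading is trivially copyable (plain members, no pointers or virtual functions), and the size check just before it guarantees the buffer holds exactly sizeof(SensorReading) bytes.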
  • Finally, we print the decompressed values and perform a simple check to ensure they match the original data.
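The size check and the memcpy belong together, so it can help to wrap them in one function that refuses to copy when the sizes disagree. This is a minimal sketch using only the standard library; `from_bytes` is a hypothetical helper and not an OpenZL function.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Copies `bytes` into `out` only if the buffer is exactly sizeof(T) bytes long.
// Returns false and leaves `out` untouched on a size mismatch.
template <typename T>
bool from_bytes(const std::vector<std::uint8_t>& bytes, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy reconstruction requires a trivially copyable type");
    if (bytes.size() != sizeof(T)) {
        return false;  // too short would read out of bounds; too long hints at a wrong type
    }
    std::memcpy(&out, bytes.data(), sizeof(T));
    return true;
}
```

In the example program this would replace the separate size check and memcpy with a single `if (!from_bytes(decompressedBytes, decompressedReading)) { ... }` branch.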

Full main.cpp for Reference:

#include <iostream>
#include <vector>
#include <string>
#include <cstring> // For std::memcpy
#include <cstdint> // For uint8_t

// Include OpenZL headers
#include <openzl/openzl.h>
#include <openzl/descriptor.h>
#include <openzl/compressor.h>
#include <openzl/decompressor.h>

// Our simple structured data type
struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

int main() {
    std::cout << "Starting OpenZL C++ Integration Example (2026-01-26)" << std::endl;

    // --- Data Definition ---
    SensorReading myReading = {101, 25.5, 60.2f, true};
    std::cout << "Original Reading: ID=" << myReading.id
              << ", Temp=" << myReading.temperature
              << ", Humid=" << myReading.humidity
              << ", Active=" << (myReading.isActive ? "true" : "false") << std::endl;

    // --- Define OpenZL DataDescriptor for SensorReading ---
    openzl::DataDescriptor sensorDescriptor =
        openzl::descriptor::struct_("SensorReading")
            .field("id", openzl::descriptor::int32())
            .field("temperature", openzl::descriptor::float64())
            .field("humidity", openzl::descriptor::float32())
            .field("isActive", openzl::descriptor::boolean());

    std::cout << "\nDataDescriptor created for SensorReading." << std::endl;

    // --- Create OpenZL Compressor ---
    openzl::Compressor compressor(sensorDescriptor);
    std::cout << "OpenZL Compressor initialized." << std::endl;

    // --- Compress Data ---
    std::vector<uint8_t> compressedData = compressor.compress(
        reinterpret_cast<const uint8_t*>(&myReading),
        sizeof(SensorReading)
    );

    std::cout << "Original data size: " << sizeof(SensorReading) << " bytes" << std::endl;
    std::cout << "Compressed data size: " << compressedData.size() << " bytes" << std::endl;

    // --- Create OpenZL Decompressor ---
    openzl::Decompressor decompressor;
    std::cout << "OpenZL Decompressor initialized." << std::endl;

    // --- Decompress Data ---
    std::vector<uint8_t> decompressedBytes = decompressor.decompress(
        compressedData.data(),
        compressedData.size()
    );

    if (decompressedBytes.size() != sizeof(SensorReading)) {
        std::cerr << "Error: Decompressed size does not match original struct size!" << std::endl;
        return 1;
    }

    // --- Reconstruct Original Struct ---
    SensorReading decompressedReading;
    std::memcpy(&decompressedReading, decompressedBytes.data(), sizeof(SensorReading));

    std::cout << "\nDecompressed Reading: ID=" << decompressedReading.id
              << ", Temp=" << decompressedReading.temperature
              << ", Humid=" << decompressedReading.humidity
              << ", Active=" << (decompressedReading.isActive ? "true" : "false") << std::endl;

    // --- Verification ---
    if (myReading.id == decompressedReading.id &&
        myReading.temperature == decompressedReading.temperature &&
        myReading.humidity == decompressedReading.humidity &&
        myReading.isActive == decompressedReading.isActive) {
        std::cout << "\nDecompression successful! Data matches original." << std::endl;
    } else {
        std::cerr << "\nDecompression failed! Data mismatch." << std::endl;
    }

    return 0;
}

Step 6: Building and Running

  1. Save the CMakeLists.txt and main.cpp files in the same directory.
  2. Create a build directory: mkdir build && cd build
  3. Configure CMake: cmake .. (This tells CMake to look for CMakeLists.txt in the parent directory).
  4. Build your application: cmake --build .
  5. Run: ./my_openzl_app

You should see output similar to this (compression sizes may vary slightly depending on OpenZL version and internal codecs):

Starting OpenZL C++ Integration Example (2026-01-26)
Original Reading: ID=101, Temp=25.5, Humid=60.2, Active=true

DataDescriptor created for SensorReading.
OpenZL Compressor initialized.
Original data size: 24 bytes
Compressed data size: 18 bytes
OpenZL Decompressor initialized.

Decompressed Reading: ID=101, Temp=25.5, Humid=60.2, Active=true

Decompression successful! Data matches original.

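In this sample output the compressed size is smaller than the original. Treat the exact figures as illustrative: for a single 24-byte record, the metadata embedded in the compressed stream can offset or even outweigh the savings, and structure-aware compression pays off mainly when you compress many records at once.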

Mini-Challenge: Compressing an Array of Structs

You’ve successfully compressed a single SensorReading. Now, let’s take it up a notch.

Challenge: Modify the main.cpp code to compress a std::vector<SensorReading> (an array of sensor readings) instead of just one. You’ll need to:

  1. Create a std::vector<SensorReading> and populate it with a few instances.
  2. Adjust the DataDescriptor to describe an array of SensorReading structs.
  3. Modify the compressor.compress() and decompressor.decompress() calls to handle the vector’s data and total size.
  4. Verify that all readings in the decompressed vector match the original.

Hint: OpenZL’s descriptor namespace offers functions for arrays. Look for openzl::descriptor::array_of(...) or similar to define a descriptor for a sequence of your SensorReading struct. Remember that std::vector stores its elements contiguously in memory, so you can treat its data as a raw array. The total size will be vector.size() * sizeof(SensorReading).
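The byte-handling half of the challenge does not depend on OpenZL at all, so here is a minimal sketch of just that part: turning a vector of readings into a flat byte buffer and back, with a check that the byte count is a whole number of records. `readings_to_bytes` and `bytes_to_readings` are hypothetical helper names. The descriptor and the compress/decompress calls are still yours to work out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct SensorReading {
    int id;
    double temperature;
    float humidity;
    bool isActive;
};

// Copies the contiguous storage of the vector into a flat byte buffer.
// Total size is readings.size() * sizeof(SensorReading), padding included.
std::vector<std::uint8_t> readings_to_bytes(const std::vector<SensorReading>& readings) {
    std::vector<std::uint8_t> bytes(readings.size() * sizeof(SensorReading));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), readings.data(), bytes.size());
    }
    return bytes;
}

// Rebuilds the vector from a flat byte buffer.
// Returns false if the buffer is not a whole number of records.
bool bytes_to_readings(const std::vector<std::uint8_t>& bytes,
                       std::vector<SensorReading>& out) {
    if (bytes.size() % sizeof(SensorReading) != 0) {
        return false;
    }
    out.resize(bytes.size() / sizeof(SensorReading));
    if (!bytes.empty()) {
        std::memcpy(out.data(), bytes.data(), bytes.size());
    }
    return true;
}
```

Deriving the record count from the byte count, rather than passing it separately, means a truncated buffer is caught before any copy happens.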

What to Observe/Learn: This challenge will solidify your understanding of how OpenZL handles collections of structured data, which is a very common use case. You’ll see how flexible the descriptor system is.

Common Pitfalls & Troubleshooting

  1. Incorrect DataDescriptor Definition:

    • Symptom: Poor compression ratios, data corruption after decompression, or even crashes.
    • Cause: The DataDescriptor doesn’t accurately reflect the C++ struct’s memory layout. This could be incorrect types (e.g., int32() for a long long), wrong field order, or missing padding.
    • Fix: Double-check your struct definition and ensure every field in the C++ struct has a corresponding, correctly typed field in the DataDescriptor, in the same order. Be mindful of compiler-inserted padding: compare sizeof(YourStruct) with the sum of the field sizes, and if they differ, check how your descriptor accounts for the extra bytes or specify offsets explicitly.
  2. Linking Errors (CMake Issues):

    • Symptom: The build fails at link time with messages like “undefined reference to openzl::Compressor::Compressor(...)”, or CMake configuration stops with an error saying the OpenZL package could not be found.
    • Cause: CMake cannot locate the OpenZL library, or your project isn’t correctly linking against it.
    • Fix:
      • Ensure OpenZL is installed and its OpenZLConfig.cmake file is discoverable.
      • If OpenZL is installed in a non-standard location, set CMAKE_PREFIX_PATH before running cmake (e.g., export CMAKE_PREFIX_PATH=/usr/local/openzl:$CMAKE_PREFIX_PATH).
      • Verify target_link_libraries uses the correct target name (OpenZL::OpenZL).
  3. Memory Management and Buffer Handling:

    • Symptom: Crashes, segfaults, or unexpected behavior.
    • Cause: Incorrectly managing the raw byte buffers, e.g., passing invalid pointers or incorrect sizes to compress or decompress. Using memcpy with wrong sizes can lead to buffer overflows.
    • Fix: Always ensure the pointers passed to OpenZL functions are valid and point to the beginning of the data, and that the sizes accurately reflect the number of bytes available/expected. When copying back, use sizeof(YourStruct) or the expected total size of the collection.

Summary

Congratulations! You’ve successfully integrated OpenZL into a C++ application, covering the crucial steps from project setup to compression and decompression.

Here are the key takeaways from this chapter:

  • CMake is essential for setting up C++ projects and linking against OpenZL.
  • openzl::DataDescriptor is the cornerstone, providing OpenZL with the structural blueprint of your C++ data.
  • openzl::Compressor and openzl::Decompressor are the primary classes for performing compression and decompression operations.
  • OpenZL works with raw byte buffers, so you pass a uint8_t* pointer to your data together with its size in bytes. This is only safe for trivially copyable types such as plain structs.
  • Verification is critical to ensure data integrity after decompression.
  • Careful descriptor definition and memory handling are crucial for stable and correct operation.

You’ve now got the tools to start building powerful, data-aware compression into your C++ projects. In the next chapter, we’ll explore more advanced topics, such as performance tuning and integrating OpenZL with different data sources.
