Guiding an LLM for Robust Java ByteBuffer Code

AI editors like Cursor can generate code with remarkable speed using LLMs,
handling boilerplate and providing functional snippets for various tasks.
However, when building robust systems, functional correctness is only the
starting point. Code must also be safe, predictable, maintainable, and free
from subtle side effects.

This article demonstrates, through a dialogue between a developer and an
LLM, how expert guidance turns an initial, potentially unsafe code snippet
into a robust, system-ready component. The session was part of a larger
effort to write a BoltDB-like storage engine for an internal workshop. It
also shows that breaking a large development effort into smaller, focused
programming sessions can make LLM-based development more effective. We’ll
use a common task as the example: reading and writing a fixed-structure
header within a Java ByteBuffer.

The Conversation Begins

Initial Prompt to Start the Conversation

Developer: “Can you show me how to write and read a simple page header
structure to a Java ByteBuffer? The header should contain: Page Type (byte),
Page Level (byte), Number of Keys (short), Free Space Offset (int), Right
Sibling Page ID (int).”

LLM-Generated Code (Using position())

LLM: “Certainly. Here’s a basic example using ByteBuffer’s position()
method to manage reads and writes:”

import java.nio.ByteBuffer;

public class PageHeaderV1 {
    // Offsets (implicitly managed by position)
    private static final int PAGE_TYPE_OFFSET = 0;
    private static final int LEVEL_OFFSET = 1; // pageType is 1 byte
    private static final int NUM_KEYS_OFFSET = 2; // level is 1 byte
    private static final int FREE_SPACE_OFFSET_OFFSET = 4; // numKeys is 2 bytes
    private static final int RIGHT_SIBLING_OFFSET = 8; // freeSpaceOffset is 4 bytes
    public static final int HEADER_SIZE = 12; // rightSibling is 4 bytes

    public static void write(ByteBuffer buffer, byte pageType, byte level, short numKeys, int freeSpaceOffset, int rightSiblingPageId) {
        buffer.clear(); // Reset position to 0, limit to capacity

        buffer.position(PAGE_TYPE_OFFSET);
        buffer.put(pageType);

        buffer.position(LEVEL_OFFSET);
        buffer.put(level);

        buffer.position(NUM_KEYS_OFFSET);
        buffer.putShort(numKeys);

        buffer.position(FREE_SPACE_OFFSET_OFFSET);
        buffer.putInt(freeSpaceOffset);

        buffer.position(RIGHT_SIBLING_OFFSET);
        buffer.putInt(rightSiblingPageId);
    }

    public static byte getPageType(ByteBuffer buffer) {
        buffer.position(PAGE_TYPE_OFFSET);
        return buffer.get();
    }

    public static byte getLevel(ByteBuffer buffer) {
        buffer.position(LEVEL_OFFSET);
        return buffer.get();
    }

    public static short getNumKeys(ByteBuffer buffer) {
        buffer.position(NUM_KEYS_OFFSET);
        return buffer.getShort();
    }

    public static int getFreeSpaceOffset(ByteBuffer buffer) {
        buffer.position(FREE_SPACE_OFFSET_OFFSET);
        return buffer.getInt();
    }

    public static int getRightSiblingPageId(ByteBuffer buffer) {
        buffer.position(RIGHT_SIBLING_OFFSET);
        return buffer.getInt();
    }
}

LLM: “This code uses position() to move to the correct offset before each
read or write.”

Developer Insight: The LLM produced functionally correct code for a
simple case. However, the expert developer immediately recognizes a potential
problem: every accessor calls buffer.position(), which modifies the state of
the ByteBuffer object itself, so each read or write moves the caller’s cursor.
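To make the side effect concrete, here is a minimal sketch. The class name PositionSideEffectDemo and its helper methods are illustrative and not part of the original session; the getter simply mirrors the shape of PageHeaderV1.getNumKeys. It assumes a caller that has parked the buffer's position at 12, intending to append data after the header.

```java
import java.nio.ByteBuffer;

class PositionSideEffectDemo {
    private static final int NUM_KEYS_OFFSET = 2;

    // Same shape as PageHeaderV1.getNumKeys: move the cursor, then do a relative read.
    static short getNumKeysViaPosition(ByteBuffer buffer) {
        buffer.position(NUM_KEYS_OFFSET);
        return buffer.getShort(); // relative read advances the position by 2
    }

    // Returns the buffer position observed after the getter has run.
    static int positionAfterRead(ByteBuffer buffer) {
        getNumKeysViaPosition(buffer);
        return buffer.position();
    }

    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(64);
        page.position(12); // caller intends to append right after the 12-byte header
        System.out.println("before: " + page.position()); // before: 12
        System.out.println("after:  " + positionAfterRead(page)); // after:  4
    }
}
```

A relative put() issued by the caller after this "read-only" call would land at index 4 and overwrite the Free Space Offset field instead of appending after the header.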

Developer Suggests Absolute Offsets

Developer: “Thanks. However, constantly changing the buffer’s position can
cause issues, especially with shared buffers like MappedByteBuffer or in
concurrent code. It introduces side effects. Can we rewrite this using the
absolute get(index) and put(index, value) methods instead to make it stateless
regarding the buffer’s position?”

Developer Insight: The developer identifies the specific issue (side
effects via state mutation) and understands the context where it becomes
dangerous (shared buffers, concurrency). They guide the LLM towards a safer,
stateless pattern.
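The shared-buffer hazard can be reproduced deterministically without threads. The sketch below is an illustration under assumptions, not code from the session: the class InterleavedReadDemo is hypothetical, and the "second reader" is simulated by a single extra position() call placed where a concurrent caller could run.

```java
import java.nio.ByteBuffer;

class InterleavedReadDemo {
    static final int NUM_KEYS_OFFSET = 2;
    static final int FREE_SPACE_OFFSET_OFFSET = 4;

    // Simulates reader A being interrupted by reader B between
    // position() and getShort() on the same ByteBuffer instance.
    static short interleavedNumKeys(ByteBuffer shared) {
        shared.position(NUM_KEYS_OFFSET);          // reader A: seek to numKeys
        shared.position(FREE_SPACE_OFFSET_OFFSET); // reader B: seeks elsewhere
        return shared.getShort();                  // reader A: reads the wrong field
    }

    public static void main(String[] args) {
        ByteBuffer shared = ByteBuffer.allocate(12);
        shared.putShort(NUM_KEYS_OFFSET, (short) 3);
        // Big-endian bytes 00 64 00 00, so the first two bytes decode to 100.
        shared.putInt(FREE_SPACE_OFFSET_OFFSET, 0x00640000);
        System.out.println(interleavedNumKeys(shared)); // prints 100, not 3
    }
}
```

No exception is thrown: reader A silently receives the upper half of the Free Space Offset field, which is what makes this class of bug hard to spot.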

LLM: “You are absolutely right. Using absolute offset methods avoids
modifying the buffer’s position state, which is much safer for shared or
concurrent use cases. Here’s how you can read the numKeys and freeSpaceOffset
using the absolute methods:”

import java.nio.ByteBuffer;

// Example correction for a couple of methods:
public class PageHeaderCorrection {
    private static final int NUM_KEYS_OFFSET = 2;
    private static final int FREE_SPACE_OFFSET_OFFSET = 4;

    // Corrected getter using absolute offset
    public static short getNumKeys(ByteBuffer buffer) {
        return buffer.getShort(NUM_KEYS_OFFSET); // Reads at index 2, doesn't change position
    }

    // Corrected getter using absolute offset
    public static int getFreeSpaceOffset(ByteBuffer buffer) {
        return buffer.getInt(FREE_SPACE_OFFSET_OFFSET); // Reads at index 4, doesn't change position
    }

    // Similarly, writing would use put(index, value):
    public static void writeNumKeys(ByteBuffer buffer, short numKeys) {
        buffer.putShort(NUM_KEYS_OFFSET, numKeys); // Writes at index 2, doesn't change position
    }
}

LLM: “Using get(index) and put(index, value) ensures the operations are
side-effect-free concerning the buffer’s position.”
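As a quick check of that claim, the following sketch (the class AbsoluteAccessDemo and the method roundTripNumKeys are illustrative names, not from the session) writes and reads numKeys with the absolute accessors while the caller's position is parked at 12.

```java
import java.nio.ByteBuffer;

class AbsoluteAccessDemo {
    private static final int NUM_KEYS_OFFSET = 2;

    // Writes and reads back numKeys using only absolute accessors.
    static short roundTripNumKeys(ByteBuffer buffer, short numKeys) {
        buffer.putShort(NUM_KEYS_OFFSET, numKeys);
        return buffer.getShort(NUM_KEYS_OFFSET);
    }

    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(64);
        page.position(12); // caller's cursor, e.g. just past the header
        short read = roundTripNumKeys(page, (short) 7);
        // prints "numKeys=7, position=12": the cursor is untouched
        System.out.println("numKeys=" + read + ", position=" + page.position());
    }
}
```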

Developer Insight: The LLM correctly implements the suggested pattern.
The developer now pushes for better software design, moving from static
utility methods to an object-oriented approach.

Developer Requests Refactoring to Instance-Based Class

Developer: “Excellent, that’s exactly the pattern needed. Please provide
the complete PageHeader class using only the absolute offset methods
(get(index), put(index, value), getShort(index), putShort(index, value), etc.)
for all fields. Also, let’s refactor this. Instead of static methods, make
PageHeader a class that takes the ByteBuffer in its constructor. It should act
as a view or wrapper around that buffer section. Remove the static write
method that takes all attributes. Instead, provide instance methods for
getting and setting each field directly on the buffer provided in the
constructor, still using the safe absolute offset methods.”

Developer Insight: The developer applies software design principles:
Encapsulation (wrapping the buffer) and Object-Oriented Design (instance
methods acting on object state – the buffer reference). This improves
usability and maintainability.

LLM: “Understood. Refactoring the PageHeader into a class that wraps a
ByteBuffer provided at construction is a good way to encapsulate header access
for a specific page buffer. Here’s the refactored class with instance methods
for getting and setting fields using absolute offsets:”

import java.nio.ByteBuffer;

public class PageHeader {
    // Define constants for offsets and size
    private static final int OFFSET_PAGE_TYPE = 0;
    private static final int OFFSET_LEVEL = 1;
    private static final int OFFSET_NUM_KEYS = 2;
    private static final int OFFSET_FREE_SPACE = 4;
    private static final int OFFSET_RIGHT_SIBLING = 8;
    public static final int HEADER_SIZE = 12; // Total size of the header

    private final ByteBuffer buffer; // Store the buffer instance

    /**
     * Creates a PageHeader view wrapping the given ByteBuffer.
     * Assumes the buffer's relevant section starts at position 0
     * or that the buffer slice/view is correctly positioned.
     * @param buffer The ByteBuffer containing the page data.
     */
    public PageHeader(ByteBuffer buffer) {
        // Consider adding checks: buffer != null, buffer.capacity() >= HEADER_SIZE
        this.buffer = buffer;
    }

    // --- Getters using absolute offsets ---

    public byte getPageType() {
        return buffer.get(OFFSET_PAGE_TYPE);
    }

    public byte getLevel() {
        return buffer.get(OFFSET_LEVEL);
    }

    public short getNumKeys() {
        return buffer.getShort(OFFSET_NUM_KEYS);
    }

    public int getFreeSpaceOffset() {
        return buffer.getInt(OFFSET_FREE_SPACE);
    }

    public int getRightSiblingPageId() {
        return buffer.getInt(OFFSET_RIGHT_SIBLING);
    }

    // --- Setters using absolute offsets ---

    public void setPageType(byte pageType) {
        buffer.put(OFFSET_PAGE_TYPE, pageType);
    }

    public void setLevel(byte level) {
        buffer.put(OFFSET_LEVEL, level);
    }

    public void setNumKeys(short numKeys) {
        buffer.putShort(OFFSET_NUM_KEYS, numKeys);
    }

    public void setFreeSpaceOffset(int freeSpaceOffset) {
        buffer.putInt(OFFSET_FREE_SPACE, freeSpaceOffset);
    }

    public void setRightSiblingPageId(int rightSiblingPageId) {
        buffer.putInt(OFFSET_RIGHT_SIBLING, rightSiblingPageId);
    }

    /**
     * Initializes a new header structure with default/zero values
     * in the underlying buffer.
     */
    public void initializeNewHeader() {
        setPageType((byte) 0);
        setLevel((byte) 0);
        setNumKeys((short) 0);
        setFreeSpaceOffset(HEADER_SIZE); // Default free space starts after header
        setRightSiblingPageId(0); // Assuming 0 represents no sibling initially
        // Or use specific constants for null/invalid IDs if available
    }
}

LLM: “This version encapsulates access to the header fields within an
instance tied to a specific ByteBuffer. All read and write operations use
absolute offsets, ensuring safety and avoiding side effects on the buffer’s
position. I’ve also added an example initializeNewHeader method for
convenience.”
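Two points in the class above are left as comments: the constructor checks, and the assumption that the header starts at index 0 of the buffer. The sketch below shows one possible way to address both. It is an illustration under assumptions, not the session's code: CheckedHeaderView, pageSlice, and the 4096-byte page size are hypothetical, and only one field is carried over to keep it short. The size check uses limit() rather than capacity() because absolute accessors are bounds-checked against the buffer's limit.

```java
import java.nio.ByteBuffer;
import java.util.Objects;

class CheckedHeaderView {
    static final int OFFSET_NUM_KEYS = 2;
    static final int HEADER_SIZE = 12;

    private final ByteBuffer buffer;

    CheckedHeaderView(ByteBuffer buffer) {
        Objects.requireNonNull(buffer, "buffer");
        // Absolute get/put are checked against limit, so validate against limit.
        if (buffer.limit() < HEADER_SIZE) {
            throw new IllegalArgumentException(
                "buffer too small for header: " + buffer.limit());
        }
        this.buffer = buffer;
    }

    short getNumKeys() { return buffer.getShort(OFFSET_NUM_KEYS); }

    void setNumKeys(short numKeys) { buffer.putShort(OFFSET_NUM_KEYS, numKeys); }

    // Hypothetical helper: a view over one page of a larger buffer.
    // Works on a duplicate, so the original's position and limit are untouched.
    static ByteBuffer pageSlice(ByteBuffer file, int pageIndex, int pageSize) {
        ByteBuffer dup = file.duplicate(); // shares bytes, independent cursor
        dup.position(pageIndex * pageSize);
        dup.limit((pageIndex + 1) * pageSize);
        return dup.slice(); // index 0 of the slice is the start of the page
    }

    public static void main(String[] args) {
        ByteBuffer file = ByteBuffer.allocate(3 * 4096);
        CheckedHeaderView header = new CheckedHeaderView(pageSlice(file, 1, 4096));
        header.setNumKeys((short) 5);
        // The write landed inside page 1 of the backing buffer.
        System.out.println(file.getShort(4096 + OFFSET_NUM_KEYS)); // prints 5
    }
}
```

With a slice per page, the header offsets stay constant regardless of where the page lives in the backing buffer, and the backing buffer's own position is never moved.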

Takeaways: The Importance of the Developer Role

Further improvements are possible, but this dialogue already highlights the
areas where developer expertise is indispensable when working with LLMs:

Identifying Hidden Risks: The initial code worked, but the developer
recognized the latent danger of side effects from position() in broader system
contexts.

Contextual Awareness: Understanding implications for
MappedByteBuffer, concurrency, and library design was key to steering away from
the stateful approach.

Guiding Towards Robust Patterns: The developer explicitly prompted
for stateless, absolute offset methods, drawing on deep knowledge of safe
low-level programming practices.

Applying Software Design Principles: The final refactoring wasn’t
just about correctness but about better design: encapsulation, instance-based
logic, maintainability (constants), and usability.

Critical Evaluation: Throughout the process, the developer evaluated the
LLM’s output against functional requirements and also against non-functional
requirements such as safety, stability, and maintainability.

Conclusion

LLMs are incredibly powerful coding assistants, accelerating development
and handling complex tasks. However, as this case study shows, they are tools
that respond to guidance. Building robust, reliable, and performant systems
requires the critical thinking, contextual understanding, and deep systems
knowledge of an experienced developer. The expert doesn’t just prompt for
code; they evaluate, guide, refine, and integrate, ensuring the final product
meets the rigorous demands of real-world software engineering.
