stream and buffer usage

Scott Hunt

#16800

November 19, 2018

Good evening everyone, couple questions as I'm attempting to learn the stream and buffer mechanisms Casey has added recently.

Making a quick OBJ 3D model parser, I've pulled in the read stream from the file reader. While stepping through it with a slightly modified version of the token struct, I noticed in the debugger that the stream pointer (At) appears corrupted at the end of its data. It's not clear to me why this is happening, except that maybe it is reading off the end of the stream's contents? It doesn't fail, since Contents.Count hits 0 and breaks out of the read loop, but I can't figure out why it's doing this.

Also, is there a better way to handle reading this text stream data? I wanted to learn how to tokenize strings, so I went this direction in lieu of fscanf and a FILE handle. While it works, I'm not convinced this is how Casey's buffers and streams were intended to be used, as I'm having to cast to a (char*) to convert a token to a value.

Any thoughts are appreciated.

Sample code:

static void ConsumeLine(FString* Source)
{
    for(;;)
    {
        FToken Token = PopToken(Source);
        if(StringsAreEqual(Token.Text, "\n"))
        {
            break;
        }
    }
}

static FOBJModelData ParseOBJ(FMemoryArena* Arena, FStream File)
{
    FStream* At = &File;
    while(At->Contents.Count > 0)
    {
        FToken Token = PopToken(&At->Contents);
        if(StringsAreEqual(Token.Text, "#"))
        {
            // Comment: skip the rest of the line.
            ConsumeLine(&At->Contents);
        }
        else if(StringsAreEqual(Token.Text, "v"))
        {
            // Vertex position: three floats, then the end-of-line token.
            Vertices[VertexCount++] = vec3{strtof((char*)PopToken(&At->Contents).Text.Data, 0),
                    strtof((char*)PopToken(&At->Contents).Text.Data, 0),
                    strtof((char*)PopToken(&At->Contents).Text.Data, 0)};
            PopToken(&At->Contents);
        }

        ...

    }
}

Edited by Scott Hunt on November 19, 2018, 7:16am

Simon Anciaux

#16808

November 20, 2018

It's hard to find the problem with your At pointer without more code (if possible, a simple reproduction case that we can compile and debug). At the very least we would need your PopToken function and the FStream and FString struct definitions.

I haven't watched the last few episodes of HmH, so I don't know what Casey's approach is, but if your problem is the casting, you can use a union to have several pointer types.

typedef struct token {
    umm length;
    union {
        void* data;
        char* text;
        umm offset;
    };
} token;

token t = PopToken( &stream );
r32 number = strtof( t.text, 0 );

Edited by Simon Anciaux on November 20, 2018, 8:38pm Reason: typo

Scott Hunt

#16898

December 3, 2018

Hey Simon, sorry for the delayed follow-up, it's been a hectic week. I added you to a git repository I'm using as a test bed for the things I'm learning from Handmade Hero. If you have a quick moment, you can clone and build it (using the HH-style build.bat). Any insight into why Visual Studio shows junk at the end of the stream would be helpful. I believe it's from memory being written over, but I can't tell for sure. The part I'm curious about is the *At stream when loading an OBJ file.

Edited by Scott Hunt on December 3, 2018, 5:47am

Simon Anciaux

#16903

December 3, 2018

Is there something I need to do to load an obj? The code I downloaded doesn't call ParseOBJ as it is, and the test code in UpdateAndRender is inside an #if 0. Switching it to 1 causes a few compile errors.

Scott Hunt

#16908

December 4, 2018

Hey Simon, I apologize for not providing better information and appreciate you taking time to review.

It uses Casey's asset file style. Use oeaedit to create a local.oea under the data folder. Once that's done, any .obj files in the art folder will automatically get parsed into the local.oea for future runs. If the local.oea is already there, you can delete it and recreate it with oeaedit.

Edited by Scott Hunt on December 4, 2018, 4:18pm

Simon Anciaux

#16919

December 5, 2018

There doesn't seem to be a problem with the At pointer, as it's never modified in the ParseOBJ function.
At->Contents.Data points to garbage after parsing the file because each time you pop a token you advance that pointer. At the end of the loop it's expected to point 1 byte past the end of the valid data. The garbage is whatever was in memory before you used it to read the file. If there were a 0 after the file contents (you could, for instance, make your file reading function add a zero at the end), the debugger would stop showing the garbage, since it would treat the data as a null-terminated string.
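The zero-terminator suggestion can be sketched roughly as follows. This is only an illustration using the C standard library instead of the platform layer from the project; FileBuffer and ReadEntireFileAndNullTerminate are hypothetical names, not the actual FBuffer or file-reading code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for the thread's buffer type; not the real struct.
struct FileBuffer
{
    size_t Count;   // number of valid file bytes (excludes the terminator)
    uint8_t* Data;  // Count bytes of file data followed by a single 0 byte
};

// Reads a whole file and appends one 0 byte after the contents.
// Returns {0, nullptr} on failure. The caller releases Data with free().
static FileBuffer ReadEntireFileAndNullTerminate(const char* Path)
{
    assert(Path != nullptr);
    FileBuffer Result = {0, nullptr};

    FILE* File = fopen(Path, "rb");
    if(!File)
    {
        return Result;
    }

    if(fseek(File, 0, SEEK_END) == 0)
    {
        long Size = ftell(File);
        if(Size >= 0 && fseek(File, 0, SEEK_SET) == 0)
        {
            // Allocate one extra byte for the terminator.
            uint8_t* Data = (uint8_t*)malloc((size_t)Size + 1);
            if(Data)
            {
                size_t Read = fread(Data, 1, (size_t)Size, File);
                Data[Read] = 0;       // terminator sits just past the valid data
                Result.Count = Read;  // Count does not include the terminator
                Result.Data = Data;
            }
        }
    }

    fclose(File);
    return Result;
}
```

With this layout, a pointer advanced to Data + Count lands on the 0 byte, so a debugger displaying it as a C string shows an empty string instead of leftover memory.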

It's a personal choice, but instead of using a counter, I like to write those loops like this:

uint8_t* current = At->Contents.Data;
uint8_t* end = current + At->Contents.Count;

while ( current < end ) { /* The loop ends when the current pointer is past the end pointer. */
    ...
}

To avoid casting to char* you could change your FBuffer struct to be

struct FBuffer
{
    umm Count;
    union {
        u8* Data;
        char* Text;
    };
};

Your PopToken function could just return an FBuffer, since you don't seem to use the "value" part of the Token.
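As a rough sketch of a PopToken that returns a plain counted buffer and walks the data with current/end pointers: the real PopToken was not posted in the thread, so everything here is an assumption, including the Buffer struct, the whitespace rules, and the choice to return "\n" as its own token (inferred from ConsumeLine comparing tokens against "\n").

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical counted buffer, standing in for FBuffer/FString.
struct Buffer
{
    size_t Count;
    const uint8_t* Data;
};

// Assumed separator set: spaces, tabs and carriage returns.
static bool IsBlank(uint8_t C)
{
    return C == ' ' || C == '\t' || C == '\r';
}

// Removes one token from the front of Source and returns it.
// A newline is returned as a one-byte token of its own.
// When nothing is left, the returned token has Count == 0.
static Buffer PopToken(Buffer* Source)
{
    assert(Source != nullptr);
    const uint8_t* Current = Source->Data;
    const uint8_t* End = Current + Source->Count;

    // Skip leading blanks without ever stepping past End.
    while(Current < End && IsBlank(*Current))
    {
        ++Current;
    }

    const uint8_t* Start = Current;
    if(Current < End && *Current == '\n')
    {
        ++Current;
    }
    else
    {
        while(Current < End && !IsBlank(*Current) && *Current != '\n')
        {
            ++Current;
        }
    }

    Buffer Token = {(size_t)(Current - Start), Start};

    // Advance the source past what was consumed.
    Source->Data = Current;
    Source->Count = (size_t)(End - Current);
    return Token;
}
```

Note that a source ending in trailing blanks yields a final token with Count == 0, so callers should check the count before using the data.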

The following code could potentially read invalid memory if the file were corrupted (but maybe you can trust the file and don't care about that), since you don't test the length of the token before using it: if the file were truncated, PopToken would return a token with the data pointer past the end of the valid data and a count of 0.

Vertices[VertexCount++] = vec3{
    strtof((char*)PopToken(&At->Contents).Text.Data, 0),
    strtof((char*)PopToken(&At->Contents).Text.Data, 0),
    strtof((char*)PopToken(&At->Contents).Text.Data, 0)
};
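One way to guard against that, sketched under assumptions: check the token's count, copy it into a small null-terminated scratch buffer, and only then call strtof, so the parse cannot run past the token. Token and ParseFloatToken are hypothetical names and the 64-byte scratch size is an arbitrary choice; none of this is from the project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical counted token; the data is not assumed to be null-terminated.
struct Token
{
    size_t Count;
    const uint8_t* Data;
};

// Parses a float from a counted token.
// Returns false if the token is empty, too long for the scratch buffer,
// or not entirely consumed by strtof (for example "1.5abc").
static bool ParseFloatToken(Token T, float* Out)
{
    assert(Out != nullptr);

    char Scratch[64];
    if(T.Count == 0 || T.Count >= sizeof(Scratch))
    {
        return false;
    }

    // Copy exactly Count bytes and terminate, so strtof stops at the token end.
    memcpy(Scratch, T.Data, T.Count);
    Scratch[T.Count] = 0;

    char* ParseEnd = nullptr;
    float Value = strtof(Scratch, &ParseEnd);
    if(ParseEnd != Scratch + T.Count)
    {
        return false;
    }

    *Out = Value;
    return true;
}
```

A vertex line would then be accepted only if all three calls succeed, which turns a truncated file into a reported parse error instead of a read of stale memory.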

Edited by Simon Anciaux on December 5, 2018, 5:21pm Reason: typo

Scott Hunt

#16923

December 6, 2018

Thanks Simon, that makes sense about it showing garbage due to the missing null terminator. I went ahead and updated the FBuffer struct with the union; it's definitely cleaner. I appreciate you taking the time to review it.