tech_log: Zero length arrays in C

Wednesday, August 12, 2009

Zero length arrays in C

http://www.mail-archive.com/freebsd-hackers@freebsd.org/msg21219.html
http://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html

5.14 Arrays of Length Zero

Zero-length arrays are allowed in GNU C. They are very useful as the last element of a structure which is really a header for a variable-length object:

     struct line {
       int length;
       char contents[0];
     };
    
     struct line *thisline = (struct line *)
       malloc (sizeof (struct line) + this_length);

     thisline->length = this_length;


In ISO C90, you would have to give contents a length of 1, which means either you waste space or complicate the argument to malloc.
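The C90 workaround can be sketched roughly as follows. This is only an illustration: the names `line90`, `line90_alloc_size` and `line90_new` are made up here, and indexing past `contents[0]` is a long-standing idiom rather than something the C90 standard guarantees.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdlib.h>   /* malloc */
#include <string.h>   /* memcpy */

/* C90-style header: the trailing array is declared with length 1. */
struct line90 {
    int  length;
    char contents[1];
};

/* Bytes to request for a line of `len` characters.  Plain
   sizeof(struct line90) + len would over-allocate, because sizeof
   already counts contents[0] and any trailing padding. */
static size_t line90_alloc_size(size_t len)
{
    size_t need = offsetof(struct line90, contents) + len;

    /* Never request less than the declared struct itself. */
    return need < sizeof(struct line90) ? sizeof(struct line90) : need;
}

/* Allocate a line and copy `len` bytes of `text` into it (no terminator). */
struct line90 *line90_new(const char *text, size_t len)
{
    struct line90 *l;

    assert(text != NULL);
    l = malloc(line90_alloc_size(len));
    if (l == NULL)
        return NULL;
    l->length = (int)len;
    memcpy(l->contents, text, len);
    return l;
}
```

The `offsetof` arithmetic is the "complicated argument to malloc" the manual refers to; the simpler `sizeof (struct line90) + len` works too but wastes a few bytes per object.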

In ISO C99, you would use a flexible array member, which is slightly different in syntax and semantics:

    * Flexible array members are written as contents[] without the 0.
    * Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero.
    * Flexible array members may only appear as the last member of a struct that is otherwise non-empty.
    * A structure containing a flexible array member, or a union containing such a structure (possibly recursively), may not be a member of a structure or an element of an array. (However, these uses are permitted by GCC as extensions.)
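For comparison, a minimal sketch of the C99 flexible array member form. The names `line99` and `line99_new` are invented for this example, and storing a terminating '\0' is an assumption made here, not something the manual requires.

```c
#include <assert.h>
#include <stdlib.h>   /* malloc */
#include <string.h>   /* strlen, memcpy */

struct line99 {
    int  length;
    char contents[];   /* flexible array member: last member, no size */
};

/* Allocate a line holding a copy of the string `text`. */
struct line99 *line99_new(const char *text)
{
    size_t len;
    struct line99 *l;

    assert(text != NULL);
    len = strlen(text);

    /* sizeof *l covers only the header.  Writing sizeof l->contents
       would not compile, because the member has incomplete type. */
    l = malloc(sizeof *l + len + 1);
    if (l == NULL)
        return NULL;

    l->length = (int)len;
    memcpy(l->contents, text, len + 1);   /* include the '\0' */
    return l;
}
```

Because the array size cannot be recovered with `sizeof`, the element count has to travel in the header, which is what the `length` field is for.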

GCC versions before 3.0 allowed zero-length arrays to be statically initialized, as if they were flexible arrays. In addition to those cases that were useful, it also allowed initializations in situations that would corrupt later data. Non-empty initialization of zero-length arrays is now treated like any case where there are more initializer elements than the array holds, in that a suitable warning about "excess elements in array" is given, and the excess elements (all of them, in this case) are ignored.

Instead GCC allows static initialization of flexible array members. This is equivalent to defining a new structure containing the original structure followed by an array of sufficient size to contain the data. I.e. in the following, f1 is constructed as if it were declared like f2.

     struct f1 {
       int x; int y[];
     } f1 = { 1, { 2, 3, 4 } };
    
     struct f2 {
       struct f1 f1; int data[3];
     } f2 = { { 1 }, { 2, 3, 4 } };


The convenience of this extension is that f1 has the desired type, eliminating the need to consistently refer to f2.f1.

This has symmetry with normal static arrays, in that an array of unknown size is also written with [].

Of course, this extension only makes sense if the extra data comes at the end of a top-level object, as otherwise we would be overwriting data at subsequent offsets. To avoid undue complication and confusion with initialization of deeply nested arrays, we simply disallow any non-empty initialization except when the structure is the top-level object. For example:

     struct foo { int x; int y[]; };
     struct bar { struct foo z; };
    
     struct foo a = { 1, { 2, 3, 4 } };        // Valid.
     struct bar b = { { 1, { 2, 3, 4 } } };    // Invalid.

     struct bar c = { { 1, { } } };            // Valid.
     struct foo d[1] = { { 1, { 2, 3, 4 } } };  // Invalid.



-------------------


The idea is to use the zero-length array as a handle on the variable-length
data that is stored directly after the struct. If you are wondering why
a pointer is not used: a pointer member has a non-zero size, whereas in
GNU C the size of a zero-length array is zero.
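The size difference can be checked with a small sketch. The struct names are hypothetical, and the array is written with the C99 `[]` spelling instead of GNU C's `[0]` so that it compiles in strict ISO mode; that swap is an assumption of this example.

```c
#include <assert.h>
#include <stddef.h>   /* size_t */

/* Header that refers to its data through a pointer. */
struct with_pointer { int length; char *contents; };

/* Header whose data follows it directly in memory. */
struct with_array   { int length; char contents[]; };

/* Extra bytes per object that the pointer-based header costs. */
size_t pointer_overhead(void)
{
    /* The pointer version carries the pointer itself plus any padding. */
    assert(sizeof(struct with_pointer) > sizeof(struct with_array));
    return sizeof(struct with_pointer) - sizeof(struct with_array);
}
```

The array form also keeps header and data in one allocation, whereas the pointer form needs somewhere separate for `contents` to point.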


-------------------


****************    Cool explanation   *********************************

On Tue, Mar 20, 2001 at 01:03:21PM -0600, Peter Seebach wrote:
> In message <[EMAIL PROTECTED]>, Shankar Agarwal writes:

> >Can someone pls tell me if it is possible to define an array of size 0.
>
> Not in C.

Actually you can (see below).  It depends on the compiler and how strict
you have it checking things.  It only works in this case because the
memory manager is allocating an entire page for the structure, not
just the size of the structure.

It's not uncommon to use this for message-based communications where you
have a header and payload and want to use sizeof(struct message) to get
the header size, but also want to use foo.payload to access the message
itself.  In that case, it's more likely to be used as a cast on a buffer,
(e.g., ((struct message*) buffer)->payload)
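A hedged sketch of that header-plus-payload pattern follows. `message_pack` and `message_payload` are names invented for this example, and the buffer is taken from malloc so that it is suitably aligned for `struct message`; an arbitrary char array would not carry that guarantee.

```c
#include <assert.h>
#include <stdlib.h>   /* malloc */
#include <string.h>   /* memcpy */

struct message {
    int  header;
    char payload[];   /* C99 spelling; the GNU C form would be payload[0] */
};

/* Build header + payload in one malloc'd buffer.
   sizeof(struct message) is the header size only. */
void *message_pack(int header, const void *data, size_t n)
{
    void *buffer;
    struct message *m;

    assert(data != NULL || n == 0);
    buffer = malloc(sizeof(struct message) + n);
    if (buffer == NULL)
        return NULL;

    m = (struct message *)buffer;
    m->header = header;
    if (n > 0)
        memcpy(m->payload, data, n);
    return buffer;
}

/* View a raw buffer as a message and return the start of its payload. */
const char *message_payload(const void *buffer)
{
    return ((const struct message *)buffer)->payload;
}
```

The receiver only needs the cast; no separate pointer to the payload is stored or transmitted.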

Realize, tho, there's a potential portability issue, if you use this.


What follows was done on a NetBSD 1.5 system.

[24]% cat zero.c && make zero && ./zero
#include <stdio.h>

struct zero_array {
        int header;
        int payload[0];
};

int main()
{
        struct zero_array foo;

        foo.header=1;
        foo.payload[0]=10;
        foo.payload[1]=12;
        printf("Foo:\n\theader: %d\n", foo.header);
        printf("\tpayload 0: %d\n", foo.payload[0]);
        printf("\tpayload 1: %d\n", foo.payload[1]);
        return 0;
}
cc -O2   -o zero zero.c
Foo:
        header: 1
        payload 0: 10
        payload 1: 12




****************************************************************************


In message <[EMAIL PROTECTED]>, John Franklin writes:
>On Tue, Mar 20, 2001 at 01:03:21PM -0600, Peter Seebach wrote:

>> In message <[EMAIL PROTECTED]>, Shankar Agarwal writes:
>> >Can someone pls tell me if it is possible to define an array of size 0.

 
>> Not in C.

>Actually you can (see below).  It depends on the compiler and how strict
>you have it checking things.

The C language doesn't allow zero-sized objects.  Some systems may, but
C itself doesn't.

>What follows was done on a NetBSD 1.5 system.

More importantly, it was done with gcc, which (by default) compiles a
language called "GNU C", which is very similar to C, but has some extensions.


In C99, you can do this "portably" (C99 isn't exactly universally adopted yet)
by saying
        struct message {
                int header;
                char payload[];
        };


and then doing
        struct message *p;
        p = malloc(sizeof(struct message) + 10);
        p->header = 10;
        strcpy(p->payload, "123456789");


>int main()
>{

>        struct zero_array foo;
>
>        foo.header=1;
>        foo.payload[0]=10;
>        foo.payload[1]=12;


This isn't even a result of the page management, you're just overwriting
other space.

If you did
        struct zero_array foo;
        int a[2];

you would probably find that a[0] was 10, and a[1] was 12.  Probably.
The behavior is totally undefined, and it's not exactly reliable.  :)
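For contrast, here is one way the zero.c experiment could be given defined behaviour: allocate the struct on the heap with room for the payload. This is a sketch, not part of the original thread; `zero_array_new` and `zero_array_demo` are hypothetical helpers, and the member uses the C99 `[]` spelling.

```c
#include <stdlib.h>   /* malloc, free */

struct zero_array {
    int header;
    int payload[];    /* storage is supplied by the allocation below */
};

/* Allocate a zero_array with room for `count` payload ints, all zeroed. */
struct zero_array *zero_array_new(size_t count)
{
    size_t i;
    struct zero_array *z = malloc(sizeof *z + count * sizeof z->payload[0]);

    if (z == NULL)
        return NULL;
    z->header = 0;
    for (i = 0; i < count; i++)
        z->payload[i] = 0;
    return z;
}

/* Same assignments as the quoted program, but into memory we own.
   Returns payload[0] + payload[1], or -1 if allocation failed. */
int zero_array_demo(void)
{
    int sum;
    struct zero_array *foo = zero_array_new(2);

    if (foo == NULL)
        return -1;
    foo->header = 1;
    foo->payload[0] = 10;
    foo->payload[1] = 12;
    sum = foo->payload[0] + foo->payload[1];
    free(foo);
    return sum;
}
```

The stack version in the thread declares no storage for `payload` at all, which is why its writes land in whatever happens to follow `foo`.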




-------


On Fri, Mar 30, 2001 at 10:37:28AM -0500, Lord Isildur wrote:
> since one knows the size of the struct, who needs the pointer? just
> take the displacement.
>

> char* buf; /* some buffer */
> struct foo{
> int header;
> struct funkystruct blah;
> };
>
> (struct foo*)buf; /*your headers are here */
> (struct foo*)buf+1; /* and your data is here */


Could, true. But if foo is:

struct foo{
 struct header head;
 struct funkystruct data[0];
};

you can say:
        mesg->head.headerbits;
        mesg->data[x].databits;


A bit more readable, IMHO.
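The two styles can be put side by side in a short sketch. The contents of `struct header` and `struct funkystruct` are not given in the thread, so the single int fields below are assumptions, as are the helper names `foo_new` and `foo_data_by_displacement`.

```c
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* malloc */

struct header      { int headerbits; };   /* hypothetical header layout */
struct funkystruct { int databits; };     /* hypothetical payload item */

struct foo {
    struct header      head;
    struct funkystruct data[];            /* C99 spelling of data[0] */
};

/* Allocate a message with n zeroed payload items. */
struct foo *foo_new(int headerbits, size_t n)
{
    size_t i;
    struct foo *mesg = malloc(sizeof *mesg + n * sizeof mesg->data[0]);

    if (mesg == NULL)
        return NULL;
    mesg->head.headerbits = headerbits;
    for (i = 0; i < n; i++)
        mesg->data[i].databits = 0;
    return mesg;
}

/* Displacement style: step past one struct foo and reinterpret.
   This matches mesg->data only when sizeof(struct foo) equals the
   offset of data, which holds for the int-only layout used here. */
struct funkystruct *foo_data_by_displacement(struct foo *mesg)
{
    return (struct funkystruct *)(mesg + 1);
}
```

With the array member, `mesg->data[x].databits` needs no cast and keeps working if padding is ever added after the header.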
