I recently saw a post on the Arduino forum regarding initializing arrays – specifically, how to speed up filling values in arrays.
For example, how could you speed up this:
for(int i=0;i<5;i++) for(int j=0;j<5;j++) array[i][j] = 0;
When the value is zero, you can use memset():
memset(array,0,sizeof(array));
The sizeof(array) expression gives the size of your array in bytes. Since it is recalculated on every compile, you never have to change it if you change your array size. And because sizeof is a compile-time operator (although it looks like an ordinary function), the result compiles to a constant value in your code; that is, the compiler knows all the sizes, so it can replace the expression with a constant number, saving you that calculation time when you’re running the code.
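To make the compile-time point concrete, here is a minimal sketch. The 5x5 array named grid and the helper clearGrid() are illustrative names, not from the forum post. The static_assert only compiles because sizeof is resolved by the compiler rather than at run time (this assumes a C++11 compiler, which recent Arduino toolchains provide).

```cpp
#include <stddef.h>
#include <string.h>

// Hypothetical 5x5 array, mirroring the nested loop at the top of the post.
static int grid[5][5];

// sizeof is evaluated by the compiler, so it is usable in a static_assert.
static_assert(sizeof(grid) == 5 * 5 * sizeof(int),
              "sizeof covers the whole array, in bytes");

// Zeroes the whole array and returns the number of bytes that were cleared.
size_t clearGrid() {
    memset(grid, 0, sizeof(grid));
    return sizeof(grid);
}
```

If the array is later resized to 7x7, only the declaration (and the assertion) changes; the memset() call stays as it is.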
The result is about the fastest way to clear an array there is short of machine code.
Of course, it only works with arrays whose elements are valid when every byte is zero: char, int, and long, along with their unsigned versions. I wasn’t going to recommend it for floats, but one forum member noted that Arduino floating point uses the IEEE 754 format, in which 0.0 is stored as all zero bits, so clearing a float array with this is fine.
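A quick way to check the float claim on a given board is to clear a buffer and read it back. This is only a sketch: clearFloats() is a made-up helper, and it assumes the platform stores floats in IEEE 754 format, as the forum member reported for the Arduino.

```cpp
#include <string.h>

// Zeroes a float buffer with memset() and reports whether every element
// reads back as 0.0f. Assumes IEEE 754 floats, where 0.0f is all zero bits.
bool clearFloats(float *values, size_t count) {
    memset(values, 0, count * sizeof(float));
    for (size_t i = 0; i < count; ++i) {
        if (values[i] != 0.0f) {
            return false;
        }
    }
    return true;
}
```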
And clearing other types? Each is different; for example, an array of structs is fine as long as every item inside it can be zeroed safely (pointers will become NULL, for example, which is OK). Be careful of objects, however: an object should ideally do its own initialization, which means you shouldn’t use this type of code.
Of course, that’s zeroing – what of setting all array items to any value fast?
Well, a simple solution is to unroll parts of the loop:
for(int i=0;i<5;i++) array[i][0] = array[i][1] = array[i][2] = array[i][3] = array[i][4] = 0;
…or even
i=-1; ++i; array[i][0] = array[i][1] = array[i][2] = array[i][3] = array[i][4] = 0; ++i; array[i][0] = array[i][1] = array[i][2] = array[i][3] = array[i][4] = 0; ++i; array[i][0] = array[i][1] = array[i][2] = array[i][3] = array[i][4] = 0; ++i; array[i][0] = array[i][1] = array[i][2] = array[i][3] = array[i][4] = 0; ++i; array[i][0] = array[i][1] = array[i][2] = array[i][3] = array[i][4] = 0;
But now you’ve traded code size for (very small) time benefits – unless you really need to eke out every little cycle, it’s rather inflexible (imagine how much fun it would be to change the array size to 7, for example).
There is another trick, however, that is almost as fast as memset(): memcpy().
memcpy() moves bytes from one place to another. But it has a side effect: if the source and destination overlap, memcpy() is too dumb to notice, and will just keep copying, overwriting parts of the data (that’s why most programmers use memmove() for overlapping regions, since it takes care not to trash memory when they overlap).
To see how this works, imagine we have an array of numbers in memory:
100 101 102 103 104 105 106 107 108 109...
And let’s say we want to move everything 3 numbers over (or ‘up’ in memory). If we’re using memcpy(), where the source is the start (100), and the destination is three numbers later (103), you’d expect this to be the result:
100 101 102 100 101 102 103 104 105 106...
Which is how memmove() works. But with memcpy(), you’d likely get this:
100 101 102 100 101 102 100 101 102 100...
The reason? memcpy() blindly moved 100, then 101, then 102. But by the time it got to 103, it had already overwritten it with 100 earlier on, so it copied that instead. The result: instead of a whole copy, the first three entries were duplicated throughout the array.
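Since an overlapping memcpy() can’t be relied on to behave the same way everywhere, the duplication pattern is easier to reproduce with a hand-written copy. The forwardCopy() helper below is a hypothetical stand-in for a memcpy() that copies one byte at a time from the front; it is not how any particular library implements memcpy().

```cpp
#include <string.h>

// Naive front-to-back byte copy. Unlike memcpy(), its behavior with
// overlapping buffers is fully defined: each byte is read only after
// all earlier bytes have been written.
void forwardCopy(void *dest, const void *src, size_t n) {
    unsigned char *d = (unsigned char *)dest;
    const unsigned char *s = (const unsigned char *)src;
    for (size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
}
```

Starting from an int array holding 100 to 109, forwardCopy(a + 3, a, 7 * sizeof(int)) leaves 100 101 102 repeated through the array, while memmove(a + 3, a, 7 * sizeof(int)) leaves 100 101 102 100 101 102 103 104 105 106.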
Calling memcpy() with overlapping source and destination is what the C standard calls ‘undefined behavior’, and should normally be avoided (some versions copy from the back first, not the front, so the duplication would start from the back entries instead). However, if we wrap this up in a function, we can use this ‘glitch’ to make a fast array setup function that works on the Arduino:
void dupFirstItem(void *array,unsigned int totalLenBytes,unsigned int itemLenBytes) {
  // using the memcpy 'glitch' to duplicate array items
  memcpy( ((char*)array)+itemLenBytes, array, totalLenBytes-itemLenBytes );
}
To use this, you’d call it in your code like this:
int x[25]; x[0]=57; dupFirstItem(x,sizeof(x),sizeof(x[0]));
Or you could combine lines at initialization:
int x[25]={32000}; dupFirstItem(x,sizeof(x),sizeof(x[0]));
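If you’d rather not depend on the glitch, the same effect is possible with memcpy() calls that never overlap. The sketch below is an alternative, not the function from this post: dupFirstItemSafe() is a hypothetical name, and it works by repeatedly copying the already-filled front of the array onto the part just after it, doubling the filled region each pass.

```cpp
#include <string.h>

// Well-defined variant of dupFirstItem(): duplicates the first item across
// the array using only non-overlapping memcpy() calls.
void dupFirstItemSafe(void *array, size_t totalLenBytes, size_t itemLenBytes) {
    if (itemLenBytes == 0 || totalLenBytes <= itemLenBytes) {
        return;  // nothing to duplicate
    }
    char *base = (char *)array;
    size_t filled = itemLenBytes;  // bytes that already hold the value
    while (filled < totalLenBytes) {
        size_t chunk = totalLenBytes - filled;  // bytes still to fill
        if (chunk > filled) {
            chunk = filled;  // never copy more than is already filled
        }
        // Source [0, chunk) ends at or before the destination start (filled),
        // so the two regions cannot overlap.
        memcpy(base + filled, base, chunk);
        filled += chunk;
    }
}
```

It is called the same way: int x[25]; x[0]=57; dupFirstItemSafe(x,sizeof(x),sizeof(x[0])); The number of memcpy() calls grows with the logarithm of the array length, so it should stay close to the single-call version in speed, though I haven’t measured it.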
And if entering the array size and item size all the time is annoying, then how about a macro?
#define DUPFIRSTINARRAY(a) dupFirstItem((a),sizeof(a),sizeof((a)[0]))
You’d use it like this:
int x[99]={2048}; DUPFIRSTINARRAY(x);
You can even go further and use a macro to assign that first variable as well:
#define FILLARRAY(a,n) ((a)[0]=(n), memcpy( ((char*)(a))+sizeof((a)[0]), (a), sizeof(a)-sizeof((a)[0]) ))
Then your code becomes
int x[200]; FILLARRAY(x,1345);
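The tradeoff? The macro does no type checking, and it relies on sizeof(a) seeing a real array, which can cause problems if you pass in anything else. For example, you’d think you could use this to fill PART of an array, but it won’t work, because x[3] is a single int rather than an array: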
FILLARRAY(x[3],1345); // fill from x[3] on only
Whereas you could with the function:
int x[25]; x[7]=57; dupFirstItem( &x[7], sizeof(x)-7*sizeof(x[0]), sizeof(x[0]) );
(Whether you’d want to is another matter – at this point, a regular loop is neater and easier to maintain).
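For comparison, here is what the ‘regular loop’ route might look like with type checking added. This is a sketch using a C++ template; fillArray() and its optional first parameter are hypothetical names, and it trades the memcpy() speed for a plain assignment loop. Because the array is taken by reference, passing a pointer or a single element fails to compile instead of silently misbehaving.

```cpp
#include <stddef.h>

// Type-checked fill: T and N are deduced from the array itself, so the
// value must match the element type and the size can never be wrong.
// 'first' is the index to start filling from (0 fills the whole array).
template <typename T, size_t N>
void fillArray(T (&a)[N], const T &value, size_t first = 0) {
    for (size_t i = first; i < N; ++i) {
        a[i] = value;
    }
}
```

Usage looks like int x[25]; fillArray(x, 57, 7); to fill from x[7] on, or char c[200]; fillArray(c, 'd'); to fill everything.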
Note that these functions now copy whole items of an array, which means that not only are ints, char, floats, and longs easy to do:
char c[200]; FILLARRAY(c,'d');
But you can use it for (some) classes and structs:
myClassObjectWithManyThingsInside obj[200];
// set up obj[0] properly
DUPFIRSTINARRAY(obj); // copy to rest of array
The only issue here is that some objects and structs should not be copied bitwise like this. For example, a String class would have a pointer to its data. If you duplicate that, then all the Strings in the array would point to the same string! So take care when using this with objects and structs, unless you know for sure that you can do a bitwise copying of the item.
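The pointer problem is easy to see with a toy type. In this sketch, Label is a made-up stand-in for a String-like class that only stores a pointer to its text, and dupLabels() does a bitwise copy of the first item into the others (one item at a time, so the memcpy() calls themselves are well defined).

```cpp
#include <string.h>

// Hypothetical String-like type: it holds a pointer, not the characters.
struct Label {
    char *text;
};

// Bitwise-copies labels[0] into every other slot. Only the pointer is
// duplicated, so every Label ends up sharing the same character buffer.
void dupLabels(Label *labels, size_t count) {
    for (size_t i = 1; i < count; ++i) {
        memcpy(&labels[i], &labels[0], sizeof(Label));
    }
}
```

After the copy, changing the text through labels[2] also changes what labels[0] sees, which is rarely what you want from an array of strings.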
In any case, the resulting code here can make it easy to set up arrays with a single value – so when you want to trade a for loop for some speed, give these a try.
(Postscript: To give you an idea of how non-standard this side effect of memcpy() is, I tested the code on both the Arduino and Windows. Under Windows, FILLARRAY(c,'d'); for chars wouldn’t work, although it was fine on the Arduino. The reason? Some versions of memcpy() use quad words to copy four bytes at a time. But not all copies are four bytes long (like char), so extra code is needed for those parts – and obviously that code doesn’t have the same glitch. However, on the Arduino, the glitch is consistent – chars duplicate like everything else.)
Hi David, I find your post very interesting. I have posted a link to it in my blog. I’m just starting with the Arduino and I’m amazed what things you can do with it. Cheers!
You know, you can use memset() to put in any unsigned char; it doesn’t have to be zero. Clearing things is just its most used feature.
So if you want to fill your array with 42, you just type memset(array, 42, sizeof(array));
This will work, but only if the variables are char or unsigned char, otherwise it will be combined – for example, a 16-bit int will get 42 in the MSB, and 42 in the LSB, for 42*256+42, or 10,794
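The arithmetic can be checked with a couple of lines. In this sketch, fillWordWithByte() is a made-up helper that applies memset() to a single 16-bit value; since both bytes get the same value, the result doesn’t depend on byte order.

```cpp
#include <stdint.h>
#include <string.h>

// Sets both bytes of a 16-bit value to byteValue and returns the result,
// i.e. byteValue * 256 + byteValue.
uint16_t fillWordWithByte(uint8_t byteValue) {
    uint16_t word = 0;
    memset(&word, byteValue, sizeof(word));
    return word;
}
```

Calling fillWordWithByte(42) gives 10794 (0x2A2A), not 42.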
Hello,
If sizeof(array) returns the size in bytes, memset(array,0,sizeof(array)) only works if it’s an array of char. In case of an array of int, you must specify sizeof(array) * sizeof(int)
sizeof() does give you the size of any type of array in bytes (https://en.wikipedia.org/wiki/Sizeof) but it must be a direct array – if you use a pointer to memory, sizeof() will return the size of the pointer, not the array.
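A small sketch of the difference, with hypothetical helper names: once an array is passed to a function as a pointer, sizeof() inside that function only sees the pointer. Taking the array by reference (a C++ feature) keeps its size visible.

```cpp
#include <stddef.h>

// The array has decayed to a pointer here, so this is the size of the
// pointer itself, not of the array it points into.
size_t sizeSeenThroughPointer(int *p) {
    return sizeof(p);
}

// Taking the array by reference preserves its type, so sizeof() reports
// the full size in bytes (N * sizeof(int)).
template <size_t N>
size_t sizeOfArray(int (&a)[N]) {
    return sizeof(a);
}
```

For int x[25], sizeOfArray(x) matches sizeof(x) in the declaring scope, while sizeSeenThroughPointer(x) returns sizeof(int*), whatever that is on your board.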