Discussion:
Memory allocation is too slow
Alex
2010-06-29 22:17:59 UTC
Memory allocation in our application is too slow. On average, every
malloc or 'new' costs us about 3 to 4 milliseconds, which is a huge
performance hit, even though each 'new' allocates only a few bytes, or
1KB at most. Our application allocates a lot of memory at the
beginning, but it still uses only about 3MB in total, and
GlobalMemoryStatus shows 16MB of physical RAM and about 29MB of
virtual memory still available. So how can allocating a few bytes
take 4ms?

ms.dwTotalPhys = 0x017ba000 24296KB 23MB
ms.dwAvailPhys = 0x010a2000 17032KB 16MB
ms.dwTotalVirtual = 0x02000000 32768KB 32MB
ms.dwAvailVirtual = 0x01dc0000 30464KB 29MB

I wrote a separate test application to reproduce and analyze the
problem. It allocates 10MB at startup and then 10000 bytes at a time.
There, subsequent mallocs are not slow: each takes only about 70 us,
even when just 1MB of physical RAM is left.
XuanKe Liu
2010-06-30 09:01:38 UTC
That seems unreasonable.
What is the difference between your two applications?
Are there any special settings in your slow application?
Alex
2010-06-30 13:40:33 UTC
The test application is short and simple; it was written only to test
memory allocation. It allocates 10000 bytes at a time until most of
the physical RAM is used, then I allocate another 10000 bytes and
touch some of the contents at random. Each new malloc takes only about
70 us, even when less than 1MB of physical RAM is free.

Our real application is rather complicated, but it consumes only about
3MB of physical RAM, and each new allocation takes 3 to 4 ms. What is
interesting, and might be a clue, is this:
class1 *p1 = new class1; //This new takes 3 to 4 ms.
class1 *p2 = new class1; //This new only takes less than 100 us.
class2 *p3 = new class2; //This new only takes less than 100 us.

If I swap the first two lines, it is still the first new that takes
most of the time.
class1 *p2 = new class1; //This new takes 3 to 4 ms.
class1 *p1 = new class1; //This new only takes less than 100 us.
class2 *p3 = new class2; //This new only takes less than 100 us.

If I comment out the first two lines, the third line now takes 3 to
4 ms.
//class1 *p1 = new class1;
//class1 *p2 = new class1;
class2 *p3 = new class2; //This new is taking about 3 to 4 ms now.

Each new allocates from a few bytes to at most 2KB, but the class
might be complicated: it may hold a pointer to another class, which in
turn holds a pointer to another buffer, and so on.
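
A minimal sketch of how per-call timings like the ones above could be
collected. This is not the code used in the thread. It assumes standard
C++11 and std::chrono for portability; on Windows CE 5.0 something like
QueryPerformanceCounter would take its place. Class1, time_us and
time_consecutive_news are hypothetical names.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the application's class1; the real class
// is not shown in the thread.
struct Class1 {
    char payload[1024];
};

// Run one callable and return the elapsed time in microseconds.
template <typename F>
double time_us(F&& f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(t1 - t0).count();
}

// Allocate `count` objects back to back and record the cost of each
// individual `new`. The caller owns the pointers stored in `out`.
std::vector<double> time_consecutive_news(std::size_t count,
                                          std::vector<Class1*>& out) {
    std::vector<double> us;
    us.reserve(count);   // keep vector growth out of the timed region
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        Class1* p = nullptr;
        us.push_back(time_us([&] { p = new Class1; }));
        out.push_back(p);
    }
    return us;
}
```

Calling time_consecutive_news(3, objs) and printing the returned values
would show whether the cost follows the first call regardless of class,
which is what the swapped and commented-out variants above suggest.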
Alex
2010-06-30 17:13:53 UTC
This is on a Windows CE 5.0 PXA270 platform.
KMOS
2010-06-30 18:03:47 UTC
I suspect the initial delay is caused by CRT library initialization.
Does the OS image execute in place (XIP), and is it in RAM, flash, or
something else?
Is your application built into the OS image or loaded from external
storage?
Demand paging can sometimes be a performance killer. If real-time
performance matters more to you, disabling demand paging may be worth
a try.
You can disable demand paging globally by modifying ROMFLAGS in
config.bib.
For details, refer to http://msdn.microsoft.com/en-us/library/ee479089.aspx
Alex
2010-07-01 13:42:55 UTC
Thanks for the reply. The OS image is not XIP; it is loaded into RAM.
The application is not in the OS image; it is loaded from flash as
well, and its size is less than 300K. It consistently spends almost
all of its time in the 'new' operation.

VirtualAlloc returns addresses on 64K boundaries. Is there a similar
issue with new and malloc? If the application allocates 128 bytes at a
time and the system has 20MB of physical memory, can it allocate
(20*1024*1024)/128 times before running into problems? And if the
system has more than 32MB of physical memory available, does malloc
still return addresses only from the 32MB virtual range and then
return NULL once all 32MB is used?
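
The arithmetic behind this question can be sketched as follows. The
16-byte header and 16-byte granularity are assumptions made up for
illustration, not documented Windows CE heap values, and max_blocks is
a hypothetical helper. The 32MB figure is the dwTotalVirtual value
reported earlier in the thread.

```cpp
#include <cstddef>

// Upper bound on how many blocks of `request` bytes fit in `budget`
// bytes, given a per-block header and an allocation granularity.
// Both `header` and `align` are illustrative assumptions.
constexpr std::size_t max_blocks(std::size_t budget, std::size_t request,
                                 std::size_t header, std::size_t align) {
    return budget / (((request + header + align - 1) / align) * align);
}

constexpr std::size_t kMB = 1024 * 1024;

// Ideal case from the question: no bookkeeping overhead at all.
static_assert(max_blocks(20 * kMB, 128, 0, 1) == 163840,
              "20MB / 128 bytes");

// With an assumed 16-byte header, each block costs 144 bytes.
static_assert(max_blocks(20 * kMB, 128, 16, 16) == 145635,
              "20MB / 144 bytes");

// Same ideal calculation against the 32MB virtual range.
static_assert(max_blocks(32 * kMB, 128, 0, 1) == 262144,
              "32MB / 128 bytes");
```

Whatever the real overhead is, it can only lower the ideal count of
163840, so (20*1024*1024)/128 is an upper bound rather than the number
of allocations to expect.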
Dean Ramsier
2010-07-01 15:59:42 UTC
new and malloc use the heap; that memory has already been allocated
from the system. Your allocation times could also increase if the heap
is heavily fragmented (lots of new/delete of varying sizes). The heap
only requests more memory if it can't find a large enough free chunk,
but it has to look through the entire heap first...
--
Dean Ramsier - eMVP
BSQUARE Corporation
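
The search cost described above can be illustrated with a toy first-fit
free list. This is a model only: the actual layout and search strategy
of the Windows CE heap are not described in the thread, and ToyHeap and
make_fragmented_heap are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Toy model: the free list is a vector of free-block sizes in address
// order. It only counts how many blocks a first-fit search examines.
struct ToyHeap {
    std::vector<std::size_t> free_blocks;

    // Carve `request` bytes out of the first block that is large
    // enough and return the number of blocks examined. If nothing
    // fits, the whole list has been walked before the heap grows.
    std::size_t allocate(std::size_t request) {
        std::size_t examined = 0;
        for (std::size_t& block : free_blocks) {
            ++examined;
            if (block >= request) {
                block -= request;
                return examined;
            }
        }
        // Stand-in for requesting more memory from the system; the new
        // region is treated as fully consumed by this request.
        free_blocks.push_back(0);
        return examined;
    }
};

// Build a heap with `holes` small fragments of `hole_size` bytes
// followed by one large free block of `tail_size` bytes.
ToyHeap make_fragmented_heap(std::size_t holes, std::size_t hole_size,
                             std::size_t tail_size) {
    ToyHeap h;
    h.free_blocks.assign(holes, hole_size);
    h.free_blocks.push_back(tail_size);
    return h;
}
```

With 1000 holes of 8 bytes ahead of a 64KB block, allocate(4) examines
one block, while allocate(100) examines 1001. In this model the cost
depends on where the first suitable hole sits, not on the request size
alone.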
Alex
2010-07-01 19:55:31 UTC
Thanks, Dean.

This might be the reason. If it is, how can it be improved without
redesigning the application? After all, I am using less than 3MB of
physical memory and there is still 16MB available. What is the
default heap size? Can I change it?

KMOS
2010-07-02 19:18:44 UTC
By default, new and malloc use the default heap.
You may consider creating your own heap with HeapCreate or CeHeapCreate
to allocate and commit all of the memory you need at the beginning,
and overloading the new/delete operators to allocate from that heap.
For heap usage, refer to
http://msdn.microsoft.com/en-us/library/ee488377.aspx
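
A rough sketch of the private-heap suggestion, under several
assumptions. It uses HeapCreate, HeapAlloc, HeapFree and HeapDestroy
when built for Windows and falls back to malloc/free elsewhere so that
it still compiles. PrivateHeap, app_heap and Class1 are hypothetical
names, the 256KB initial size is a guess, and the sketch overloads the
operators per class rather than globally. It is written in C++11, which
an older Windows CE toolchain may not accept as is.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

#ifdef _WIN32
#include <windows.h>
#endif

// Thin wrapper around a private heap created once at startup.
class PrivateHeap {
public:
    explicit PrivateHeap(std::size_t initial_bytes) {
#ifdef _WIN32
        // Maximum size 0 is assumed here to mean "growable".
        heap_ = HeapCreate(0, initial_bytes, 0);
#else
        (void)initial_bytes;  // fallback path has nothing to reserve
#endif
    }
    ~PrivateHeap() {
#ifdef _WIN32
        if (heap_) HeapDestroy(heap_);
#endif
    }
    PrivateHeap(const PrivateHeap&) = delete;
    PrivateHeap& operator=(const PrivateHeap&) = delete;

    void* alloc(std::size_t n) {
#ifdef _WIN32
        return heap_ ? HeapAlloc(heap_, 0, n) : nullptr;
#else
        return std::malloc(n);
#endif
    }
    void release(void* p) {
#ifdef _WIN32
        if (p) HeapFree(heap_, 0, p);
#else
        std::free(p);
#endif
    }

private:
#ifdef _WIN32
    HANDLE heap_ = nullptr;
#endif
};

// One heap for the whole application, created on first use.
PrivateHeap& app_heap() {
    static PrivateHeap heap(256 * 1024);  // initial size is a guess
    return heap;
}

// Hypothetical application class whose allocations bypass the
// default heap and come from the private heap instead.
struct Class1 {
    int value = 0;

    static void* operator new(std::size_t n) {
        if (void* p = app_heap().alloc(n)) return p;
        throw std::bad_alloc();
    }
    static void operator delete(void* p) noexcept {
        app_heap().release(p);
    }
};
```

With this in place, `Class1* p = new Class1; delete p;` is served by the
private heap without any change at the call sites. Per-class overloads
keep the change local to the classes that are allocated most often; a
global replacement would also work but affects every allocation in the
process.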
Alex
2010-07-22 18:07:09 UTC
One more clue:
I added an extra malloc() at one specific point in the code. At that
point, malloc(2) takes 1.2 ms on average, malloc(10) takes 4.4 ms,
malloc(100) takes 0.5 ms, and malloc(1024) takes 0.2 ms.

What could explain this?
