Diffstat (limited to 'third_party/git/compat/nedmalloc/Readme.txt')
-rw-r--r-- | third_party/git/compat/nedmalloc/Readme.txt | 136 |
1 files changed, 0 insertions, 136 deletions
diff --git a/third_party/git/compat/nedmalloc/Readme.txt b/third_party/git/compat/nedmalloc/Readme.txt
deleted file mode 100644
index 07cbf50c0f9a..000000000000
--- a/third_party/git/compat/nedmalloc/Readme.txt
+++ /dev/null
@@ -1,136 +0,0 @@
-nedalloc v1.05 15th June 2008:
--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
-
-by Niall Douglas (http://www.nedprod.com/programs/portable/nedmalloc/)
-
-Enclosed is nedalloc, an alternative malloc implementation for multiple
-threads without lock contention based on dlmalloc v2.8.4. It is more
-or less a newer implementation of ptmalloc2, the standard allocator in
-Linux (which is based on dlmalloc v2.7.0) but also contains a per-thread
-cache for maximum CPU scalability.
-
-It is licensed under the Boost Software License which basically means
-you can do anything you like with it. This does not apply to the malloc.c.h
-file which remains copyright to others.
-
-It has been tested on win32 (x86), win64 (x64), Linux (x64), FreeBSD (x64)
-and Apple MacOS X (x86). It works very well on all of these and is very
-significantly faster than the system allocator on all of these platforms.
-
-By literally dropping in this allocator as a replacement for your system
-allocator, you can see real world improvements of up to three times in normal
-code!
-
-To use:
--=-=-=-
-Drop in nedmalloc.h, nedmalloc.c and malloc.c.h into your project.
-Configure using the instructions in nedmalloc.h. Run and enjoy.
-
-To test, compile test.c. It will run a comparison between your system
-allocator and nedalloc and tell you how much faster nedalloc is. It also
-serves as an example of usage.
-
-Notes:
--=-=-=
-If you want the very latest version of this allocator, get it from the
-TnFOX SVN repository at svn://svn.berlios.de/viewcvs/tnfox/trunk/src/nedmalloc
-
-Because of how nedalloc allocates an mspace per thread, it can cause
-severe bloating of memory usage under certain allocation patterns.
-You can substantially reduce this wastage by setting MAXTHREADSINPOOL
-or the threads parameter to nedcreatepool() to a fraction of the number of
-threads which would normally be in a pool at once. This will reduce
-bloating at the cost of an increase in lock contention. If allocated size
-is less than THREADCACHEMAX, locking is avoided 90-99% of the time and
-if most of your allocations are below this value, you can safely set
-MAXTHREADSINPOOL to one.
-
-You will suffer memory leakage unless you call neddisablethreadcache()
-per pool for every thread which exits. This is because nedalloc cannot
-portably know when a thread exits and thus when its thread cache can
-be returned for use by other code. Don't forget pool zero, the system pool.
-
-For C++ type allocation patterns (where the same sizes of memory are
-regularly allocated and deallocated as objects are created and destroyed),
-the threadcache always benefits performance. If however your allocation
-patterns are different, searching the threadcache may significantly slow
-down your code - as a rule of thumb, if cache utilisation is below 80%
-(see the source for neddisablethreadcache() for how to enable debug
-printing in release mode) then you should disable the thread cache for
-that thread. You can compile out the threadcache code by setting
-THREADCACHEMAX to zero.
-
-Speed comparisons:
--=-=-=-=-=-=-=-=-=
-See Benchmarks.xls for details.
-
-The enclosed test.c can do two things: it can be a torture test or a speed
-test. The speed test is designed to be a representative synthetic
-memory allocator test. It works by randomly mixing allocations with frees
-with half of the allocation sizes being a two power multiple less than
-512 bytes (to mimic C++ stack instantiated objects) and the other half
-being a simple random value less than 16Kb.
-
-The real world code results are from Tn's TestIO benchmark. This is a
-heavily multithreaded and memory intensive benchmark with a lot of branching
-and other stuff modern processors don't like so much. As you'll note, the
-test doesn't show the benefits of the threadcache mostly due to the saturation
-of the memory bus being the limiting factor.
-
-ChangeLog:
--=-=-=-=-=
-v1.05 15th June 2008:
- * { 1042 } Added error check for TLSSET() and TLSFREE() macros. Thanks to
-Markus Elfring for reporting this.
- * { 1043 } Fixed a segfault when freeing memory allocated using
-nedindependent_comalloc(). Thanks to Pavel Vozenilek for reporting this.
-
-v1.04 14th July 2007:
- * Fixed a bug with the new optimised implementation that failed to lock
-on a realloc under certain conditions.
- * Fixed lack of thread synchronisation in InitPool() causing pool corruption
- * Fixed a memory leak of thread cache contents on disabling. Thanks to Earl
-Chew for reporting this.
- * Added a sanity check for freed blocks being valid.
- * Reworked test.c into being a torture test.
- * Fixed GCC assembler optimisation misspecification
-
-v1.04alpha_svn915 7th October 2006:
- * Fixed failure to unlock thread cache list if allocating a new list failed.
-Thanks to Dmitry Chichkov for reporting this. Further thanks to Aleksey Sanin.
- * Fixed realloc(0, <size>) segfaulting. Thanks to Dmitry Chichkov for
-reporting this.
- * Made config defines #ifndef so they can be overridden by the build system.
-Thanks to Aleksey Sanin for suggesting this.
- * Fixed deadlock in nedprealloc() due to unnecessary locking of preferred
-thread mspace when mspace_realloc() always uses the original block's mspace
-anyway. Thanks to Aleksey Sanin for reporting this.
- * Made some speed improvements by hacking mspace_malloc() to no longer lock
-its mspace, thus allowing the recursive mutex implementation to be removed
-with an associated speed increase. Thanks to Aleksey Sanin for suggesting this.
- * Fixed a bug where allocating mspaces overran its max limit. Thanks to
-Aleksey Sanin for reporting this.
-
-v1.03 10th July 2006:
- * Fixed memory corruption bug in threadcache code which only appeared with >4
-threads and in heavy use of the threadcache.
-
-v1.02 15th May 2006:
- * Integrated dlmalloc v2.8.4, fixing the win32 memory release problem and
-improving performance still further. Speed is now up to twice the speed of v1.01
-(average is 67% faster).
- * Fixed win32 critical section implementation. Thanks to Pavel Kuznetsov
-for reporting this.
- * Wasn't locking mspace if all mspaces were locked. Thanks to Pavel Kuznetsov
-for reporting this.
- * Added Apple Mac OS X support.
-
-v1.01 24th February 2006:
- * Fixed multiprocessor scaling problems by removing sources of cache sloshing
- * Earl Chew <earl_chew <at> agilent <dot> com> sent patches for the following:
-   1. size2binidx() wasn't working for default code path (non x86)
-   2. Fixed failure to release mspace lock under certain circumstances which
-      caused a deadlock
-
-v1.00 1st January 2006:
- * First release