git clone http://www.coincoin169.org/git/ccsp.git ccsp
ccsp - a library for string pooling and interning
#include <ccsp.h>
Ccsp stands for CoinCoin String Pooling. It is a small library for
string pooling and interning. The C header file exports three types:
ccsp_strbuf, ccsp_pool and ccsp_string. Typically you first build a
string using a ccsp_strbuf variable and its related functions, then
you intern the string in a ccsp_pool variable to obtain a pointer of
type ccsp_string. The ccsp_strbuf variables are only provided to
simplify the construction of C-strings; you can also intern a
standard C-string into any pool to obtain a pointer of type
ccsp_string. The particularity of the library is that if you intern
the same C-string several times (either directly through a char*
pointer or via ccsp_strbuf variables) you get several pointers of
type ccsp_string that all point to the same structure, which contains
only one copy of the C-string. With each C-string interned in a pool
you can associate some data by means of a void* pointer, so that the
pool acts somewhat like a mapping from C-strings to void* values.
Note that two different variables of type ccsp_string that point to
the same C-string within the same pool have the same value (of type
void*).
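The idea of interning can be illustrated independently of ccsp. The sketch below is a hypothetical toy (the names toy_pool, toy_intern and toy_pool_clear are not part of the library) that keeps one copy per distinct string in a small array and scans it linearly. It only shows why two interns of equal content yield the same pointer; it is not how ccsp is implemented.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy interner: a fixed-size table scanned linearly.
   This only illustrates the idea of interning; it is not ccsp. */
#define TOY_POOL_MAX 64

struct toy_pool {
    char  *strings[TOY_POOL_MAX]; /* one owned copy per distinct string */
    size_t count;                 /* number of slots in use */
};

/* Return the pool's unique copy of s, adding it on first sight.
   Returns NULL if the table is full or the allocation fails. */
const char *toy_intern(struct toy_pool *p, const char *s)
{
    size_t i, len;
    char *copy;

    for (i = 0; i < p->count; i++)
        if (strcmp(p->strings[i], s) == 0)
            return p->strings[i];   /* already interned: reuse it */

    if (p->count == TOY_POOL_MAX)
        return NULL;
    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    p->strings[p->count++] = copy;
    return copy;
}

/* Release every copy owned by the pool. */
void toy_pool_clear(struct toy_pool *p)
{
    size_t i;

    for (i = 0; i < p->count; i++)
        free(p->strings[i]);
    p->count = 0;
}
```

With such a pool, interning the literal "abc" and a buffer built character by character from 'a', 'b', 'c' returns the same address, so equality of contents can be tested by comparing pointers.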
For its internal memory management ccsp uses malloc, realloc and
free by default, but you can supply your own memory management
functions. This can be useful if the library is used by a program
that has a garbage collector or some alignment requirements. The
functions given to ccsp must obey the following semantics: the
replacements for malloc and realloc must behave exactly like malloc
and realloc, except that they must always return a non-NULL pointer;
errors must be handled within the function. The replacement for free
must behave exactly like the free function.
All variables must be initialized before being used. And when a variable is not needed anymore you may clear it with the appropriate function.
void ccsp_set_memory_functions(void *(*malloc_func)(size_t),void *(*realloc_func)(void*,size_t),void (*free_func)(void*));
Set the memory functions to be used by ccsp.
The replacements for malloc and realloc must behave exactly like
malloc and realloc, except that they must always return a non-NULL
pointer; errors must be handled within the function. The replacement
for free must behave exactly like the free function. If NULL is
passed for any argument, the corresponding default memory function
is used.
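As an illustration of the contract described above, here is a minimal sketch of replacement functions that never return NULL. The names my_malloc, my_realloc, my_free and die_out_of_memory are hypothetical, and aborting on failure is only one possible way to handle the error within the function. Requests of size zero are rounded up to one byte, because the standard malloc and realloc are allowed to return NULL for them.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: report the failure and stop the program. */
static void die_out_of_memory(void)
{
    fputs("out of memory\n", stderr);
    abort();
}

/* malloc replacement that never returns NULL.
   A zero-size request is rounded up to one byte, since malloc(0)
   may legitimately return NULL. */
void *my_malloc(size_t size)
{
    void *p = malloc(size ? size : 1);

    if (p == NULL)
        die_out_of_memory();
    return p;
}

/* realloc replacement that never returns NULL (same rounding). */
void *my_realloc(void *ptr, size_t size)
{
    void *p = realloc(ptr, size ? size : 1);

    if (p == NULL)
        die_out_of_memory();
    return p;
}

/* free replacement: plain free, which accepts NULL. */
void my_free(void *ptr)
{
    free(ptr);
}
```

Functions of this shape match the three function-pointer parameters in the prototype above, so they could be passed to it in that order.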
void ccsp_strbuf_empty(ccsp_strbuf buf);
Set the buffer buf to be the empty string "".
void ccsp_strbuf_init2(ccsp_strbuf buf,size_t mem);
Initialize the buffer buf with at least mem bytes of memory.
This function must be called before doing anything else with buf.
void ccsp_strbuf_init(ccsp_strbuf buf);
Initialize the buffer buf.
This function must be called before doing anything else with buf.
void ccsp_strbuf_inits(ccsp_strbuf_ptr buf,...);
Initialize the NULL-terminated list of buffers beginning with buf.
This function must be called before doing anything else with buf
or any element in the list.
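The NULL-terminated list convention used by the functions ending in "s" can be sketched with a small variadic function. The function zero_all below is hypothetical and operates on plain int pointers, not on ccsp types; it only shows how such a list is walked with stdarg and why the call must end with a NULL pointer.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical function in the style of a NULL-terminated list call:
   set every int in the list to zero and return how many were visited. */
size_t zero_all(int *first, ...)
{
    va_list ap;
    size_t n = 0;
    int *cur;

    va_start(ap, first);
    /* Walk the arguments until the NULL sentinel is reached. */
    for (cur = first; cur != NULL; cur = va_arg(ap, int *)) {
        *cur = 0;
        n++;
    }
    va_end(ap);
    return n;
}
```

A call looks like zero_all(&a, &b, &c, (int *)NULL). In variadic calls in general, casting the sentinel to a pointer type is the portable choice, because a bare NULL may be defined as the integer 0. Forgetting the sentinel makes the function read past its arguments.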
void ccsp_strbuf_clear(ccsp_strbuf buf);
Clear the memory used by buf.
You may use this function when you no longer need buf.
void ccsp_strbuf_clears(ccsp_strbuf_ptr buf,...);
Clear the NULL-terminated list of string buffers beginning with buf.
You may use this function when you no longer need buf and the
other string buffers in the list.
size_t ccsp_strbuf_len(ccsp_strbuf buf);
Return the length of the string contained in buf.
Be aware that this function is nothing more than a simple call to
strlen, so it does not take into account the character set of the
string: it counts bytes, not characters.
void ccsp_strbuf_append(ccsp_strbuf buf,char c);
Append the character c to the string contained in buf.
char *ccsp_strbuf_get_cstring(ccsp_strbuf buf);
Return a pointer (char*) to the C-string contained in buf.
Do not free this pointer, or undefined behavior will result.
If you want to free the memory occupied by the string contained in
buf, use ccsp_strbuf_clear instead.
void ccsp_string_init(ccsp_string s);
Initialize s.
This function must be called before doing anything else with s.
void ccsp_string_inits(ccsp_string_ptr s,...);
Initialize the NULL-terminated list of string pointers beginning
with s.
This function must be called before doing anything else with s
or any element in the list.
void ccsp_string_clear(ccsp_string s);
Clear the memory used by s.
You may use this function when you no longer need s.
void ccsp_string_clears(ccsp_string_ptr s,...);
Clear the NULL-terminated list of string pointers beginning with s.
You may use this function when you no longer need s and the
other string pointers in the list.
void ccsp_string_copy(ccsp_string r,ccsp_string s);
Copy the string pointer s into r.
void ccsp_pool_init2(ccsp_pool p,size_t n,size_t (*hash_func)(const char*,size_t));
Initialize the pool of strings p. Internally the pool is implemented
with a hash table whose size can be given with the argument n. The
hash function used for the pool is given by hash_func. The hash
function must have the following signature: its first argument is a
pointer to the string whose hash will be computed, and its second
argument gives an upper bound on the hash values. If hash_func is
set to NULL then the default hash function djb2 will be used.
This function must be called before doing anything else with p.
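A hash function with the signature described above could look like the following sketch. The name my_djb2 is hypothetical, and the sketch assumes that the upper bound is exclusive and non-zero, so that every returned value is strictly smaller than the bound; the text above does not say whether the bound is inclusive. It uses the well-known djb2 recurrence (start at 5381, multiply by 33 and add each byte).

```c
#include <stddef.h>

/* djb2 hash reduced into the range [0, bound).
   Assumption: bound is non-zero and is an exclusive upper bound. */
size_t my_djb2(const char *s, size_t bound)
{
    size_t h = 5381;
    const unsigned char *p = (const unsigned char *)s;

    /* h = h * 33 + byte, for every byte of the string. */
    while (*p != '\0')
        h = h * 33 + *p++;
    return h % bound;
}
```

Reading the bytes as unsigned char keeps the result independent of whether plain char is signed on the platform.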
void ccsp_pool_init(ccsp_pool p);
Initialize pool p.
This function must be called before doing anything else with p.
void ccsp_pool_inits(ccsp_pool_ptr p,...);
Initialize the NULL-terminated list of pools beginning with p.
This function must be called before doing anything else with p
or any element in the list.
void ccsp_pool_clear(ccsp_pool p);
Clear the memory used by p.
You may use this function when you no longer need p.
Note that a string pointer (ccsp_string) that points to a string in
the pool p must not be used after p has been cleared, otherwise
undefined behavior will occur. For cleaner code I recommend clearing
all string pointers that point into p before clearing p.
void ccsp_pool_clears(ccsp_pool_ptr p,...);
Clear the NULL-terminated list of pools beginning with p.
You may use this function when you no longer need p and any
other element in the list. The same remark as for ccsp_pool_clear
applies to ccsp_string variables pointing into any element of
the list.
void ccsp_intern_strbuf(ccsp_string r,ccsp_pool p,ccsp_strbuf buf);
Intern the string contained in buf in pool p and set the string
variable r to point to the string within the pool p.
void ccsp_intern_cstring(ccsp_string r,ccsp_pool p,const char *s);
Intern the C-string pointed to by s in pool p and set the string
variable r to point to the string within the pool p.
A deep copy of s is made by this function, so you may dispose of s
as you wish afterwards.
void ccsp_intern_cstring_alias(ccsp_string r,ccsp_pool p,char *s);
Intern the C-string pointed to by s in pool p and set the string
variable r to point to the string within the pool p.
Note that no deep copy of s is made by this function; it is
therefore strongly recommended to leave the C-string pointed to by s
untouched after a call to this function, otherwise undefined
behavior will occur. Unless you know what you are doing, please
use ccsp_intern_cstring instead.
void ccsp_string_set_value(ccsp_string s,void *value);
If s points to a string in a pool, give this string the value
value. If s does not point to any string, this function does
nothing.
void *ccsp_string_get_value(ccsp_string s);
Return the value of the string pointed to by s. If s does not
point to any string in any pool then NULL is returned by this
function.
char *ccsp_string_get_cstring(ccsp_string s);
Return a pointer (char*) to the C-string pointed to by s.
If s does not point to any string, typically just after a call
to ccsp_string_init, this function returns NULL.
Do not free this pointer, or undefined behavior will result.
If you want to free the memory occupied by the string pointed to
by s, use ccsp_string_clear instead.
The following program shows the use of the library. It interns the
same C-string twice, in two different ways, and shows that the two
resulting pointers of type ccsp_string both point to the same
C-string in memory. Then a value is set through one string pointer
(ccsp_string) and checked through the other. As the two string
pointers point to the same C-string, they share the value.
#include <stdio.h>
#include <ccsp.h>

int main(void) {
    ccsp_pool p;
    ccsp_strbuf buf;
    ccsp_string s1,s2;

    /* Every variable must be initialized before use. */
    ccsp_string_init(s1);
    ccsp_string_init(s2);
    ccsp_strbuf_init(buf);
    ccsp_pool_init(p);

    /* Intern "abc" directly from a C-string. */
    ccsp_intern_cstring(s1,p,"abc");

    /* Build "abc" in a string buffer, then intern it. */
    ccsp_strbuf_append(buf,'a');
    ccsp_strbuf_append(buf,'b');
    ccsp_strbuf_append(buf,'c');
    ccsp_intern_strbuf(s2,p,buf);

    /* Both string pointers refer to the same C-string in memory. */
    fprintf(stdout,"s1 == s2 ? %d\n",
            ccsp_string_get_cstring(s1) == ccsp_string_get_cstring(s2));
    fflush(stdout);

    /* A value set through s1 is visible through s2. */
    ccsp_string_set_value(s1,(void*)-1);
    fprintf(stdout,"s1.value == s2.value ? %d\n",
            ccsp_string_get_value(s1) == ccsp_string_get_value(s2));
    fflush(stdout);

    /* Clear the string pointers before the pool they point into. */
    ccsp_string_clear(s1);
    ccsp_string_clear(s2);
    ccsp_strbuf_clear(buf);
    ccsp_pool_clear(p);
    return 0;
}
The main goal of the library is to be memory efficient, that is, to use as little memory as possible, although there is certainly room for improvement. I did not design the library for speed: I care about speed but, for this library and at the moment, not at a low level. I have implemented a hash table to get roughly O(1) time for all operations, although the constant behind the big-O may be large. You are welcome to improve the library, and I will be happy to receive your feedback or patches.
ccsp and its documentation are placed under the GPL.
Copyright (C) 2014 Guillaume Quintin
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
Written by Guillaume Quintin (quintin@lix.polytechnique.fr).
malloc(3), realloc(3), free(3), strlen(3),
djb2 (http://www.cse.yorku.ca/~oz/hash.html),
GPL(7).