sombok  2.3.0
gcstring

Grapheme cluster string.

Data Structures

struct  unistr_t
 
struct  gcchar_t
 
struct  gcstring_t
 

Functions

gcstring_t * gcstring_new (unistr_t *unistr, linebreak_t *lbobj)

gcstring_t * gcstring_newcopy (unistr_t *str, linebreak_t *lbobj)

gcstring_t * gcstring_new_from_utf8 (char *str, size_t len, int check, linebreak_t *lbobj)

void gcstring_destroy (gcstring_t *gcstr)

gcstring_t * gcstring_copy (gcstring_t *gcstr)

gcstring_t * gcstring_append (gcstring_t *gcstr, gcstring_t *appe)

int gcstring_cmp (gcstring_t *a, gcstring_t *b)

size_t gcstring_columns (gcstring_t *gcstr)

gcstring_t * gcstring_concat (gcstring_t *gcstr, gcstring_t *appe)

gcchar_t * gcstring_next (gcstring_t *gcstr)

void gcstring_setpos (gcstring_t *gcstr, int pos)

void gcstring_shrink (gcstring_t *gcstr, int length)

gcstring_t * gcstring_substr (gcstring_t *gcstr, int offset, int length)

gcstring_t * gcstring_replace (gcstring_t *gcstr, int offset, int length, gcstring_t *replacement)

propval_t gcstring_lbclass (gcstring_t *gcstr, int pos)

propval_t gcstring_lbclass_ext (gcstring_t *gcstr, int pos)

Detailed Description

Grapheme cluster string.

Function Documentation

gcstring_t * gcstring_append (gcstring_t *gcstr, gcstring_t *appe)

Append

Modify a grapheme cluster string by appending another string to it.

Parameters
[in]  gcstr  target grapheme cluster string; must not be NULL.
[in]  appe  grapheme cluster string to be appended. NULL is treated as an empty string, so gcstr is not modified.
Returns
The modified grapheme cluster string gcstr itself (not a copy). If an error occurs, errno is set and NULL is returned.

int gcstring_cmp (gcstring_t *a, gcstring_t *b)

Compare

Compare two grapheme cluster strings.

Parameters
[in]  a  grapheme cluster string.
[in]  b  grapheme cluster string.
Returns
A positive, zero, or negative value when a is greater than, equal to, or less than b, respectively.

size_t gcstring_columns (gcstring_t *gcstr)

Number of Columns

Returns the number of columns of a grapheme cluster string, as determined by the built-in character database according to UAX #11.

Parameters
[in]  gcstr  grapheme cluster string. NULL may be given and is treated as an empty string.
Returns
Number of columns.

gcstring_t * gcstring_concat (gcstring_t *gcstr, gcstring_t *appe)

Concatenate

Create a new grapheme cluster string that is the concatenation of two strings.

Parameters
[in]  gcstr  grapheme cluster string; must not be NULL.
[in]  appe  grapheme cluster string to be appended. NULL is treated as an empty string.
Returns
New grapheme cluster string. If an error occurs, errno is set and NULL is returned.
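
The difference between gcstring_append() (modifies its first argument and returns that same pointer) and gcstring_concat() (returns a new string and leaves both operands alone) can be sketched with a toy model. This is not sombok code: toy_str_t and every toy_* function below are hypothetical stand-ins that only mimic the return-value conventions described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for gcstring_t: a growable array of cluster ids. */
typedef struct {
    int *items;
    size_t len;
} toy_str_t;

/* Build a toy string from an array; NULL on allocation failure. */
toy_str_t *toy_str_new(const int *src, size_t len)
{
    toy_str_t *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    /* Allocate at least one slot so that items is never NULL. */
    s->items = malloc((len > 0 ? len : 1) * sizeof *s->items);
    if (s->items == NULL) {
        free(s);
        return NULL;
    }
    if (len > 0)
        memcpy(s->items, src, len * sizeof *s->items);
    s->len = len;
    return s;
}

/* Destructor; a NULL argument is ignored. */
void toy_str_destroy(toy_str_t *s)
{
    if (s == NULL)
        return;
    free(s->items);
    free(s);
}

/* In-place append: returns s itself; a NULL appe leaves s unchanged. */
toy_str_t *toy_str_append(toy_str_t *s, const toy_str_t *appe)
{
    int *grown;
    assert(s != NULL); /* the target must not be NULL */
    if (appe == NULL || appe->len == 0)
        return s;
    grown = realloc(s->items, (s->len + appe->len) * sizeof *grown);
    if (grown == NULL)
        return NULL;
    s->items = grown; /* update first so that appe == s stays valid */
    memcpy(s->items + s->len, appe->items, appe->len * sizeof *grown);
    s->len += appe->len;
    return s;
}

/* Concatenation: returns a new string; both operands are left untouched. */
toy_str_t *toy_str_concat(const toy_str_t *s, const toy_str_t *appe)
{
    toy_str_t *result = toy_str_new(s->items, s->len);
    if (result == NULL)
        return NULL;
    if (toy_str_append(result, appe) == NULL) {
        toy_str_destroy(result);
        return NULL;
    }
    return result;
}

/* Returns 0 when the toy functions behave as described in the text. */
int toy_append_concat_demo(void)
{
    const int ab[] = {1, 2}, c[] = {3};
    toy_str_t *x = toy_str_new(ab, 2);
    toy_str_t *y = toy_str_new(c, 1);
    toy_str_t *z = NULL;
    int ok = (x != NULL && y != NULL);

    if (ok) {
        z = toy_str_concat(x, y); /* new object, x keeps its length */
        ok = z != NULL && z != x && z->len == 3 && x->len == 2;
    }
    if (ok) /* append returns x itself and grows it */
        ok = toy_str_append(x, y) == x && x->len == 3;
    if (ok) /* NULL is an empty string: nothing changes */
        ok = toy_str_append(x, NULL) == x && x->len == 3;
    toy_str_destroy(x);
    toy_str_destroy(y);
    toy_str_destroy(z);
    return ok ? 0 : 1;
}
```

With the real API the same pattern would apply: the result of gcstring_concat() is a separate object that presumably needs its own gcstring_destroy() call, whereas the result of gcstring_append() is the original pointer.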

gcstring_t * gcstring_copy (gcstring_t *gcstr)

Copy Constructor

Create a deep copy of a grapheme cluster string.

Parameters
[in]  gcstr  grapheme cluster string; must not be NULL.
Returns
Deep copy of the grapheme cluster string. If an error occurs, errno is set and NULL is returned.

void gcstring_destroy (gcstring_t *gcstr)

Destructor

Free the memory allocated for a grapheme cluster string.

Parameters
[in]  gcstr  grapheme cluster string.
Returns
None. If gcstr is NULL, nothing is done.

propval_t gcstring_lbclass (gcstring_t *gcstr, int pos)

Get Line Breaking Class of grapheme base

Get the UAX #14 line breaking class of the grapheme base.

Parameters
[in]  gcstr  grapheme cluster string; must not be NULL.
[in]  pos  position.
Returns
Line breaking class property value.
Note
Introduced in sombok 2.2.

propval_t gcstring_lbclass_ext (gcstring_t *gcstr, int pos)

Get Line Breaking Class of grapheme extender

Get the UAX #14 line breaking class of the grapheme extender. If that class is CM, the class of the grapheme base is returned instead.

Parameters
[in]  gcstr  grapheme cluster string; must not be NULL.
[in]  pos  position.
Returns
Line breaking class property value.
Note
Introduced in sombok 2.2.

gcstring_t * gcstring_new (unistr_t *unistr, linebreak_t *lbobj)

Constructor

Create a new grapheme cluster string from a Unicode string. Use gcstring_newcopy() if you wish to copy the buffer of the Unicode string.

Parameters
[in]  unistr  Unicode string. NULL may be given as a zero-length string.
[in]  lbobj  linebreak object.
Returns
New grapheme cluster string sharing its str buffer with unistr. If an error occurs, errno is set and NULL is returned.

Option bits of lbobj:

  • If the LINEBREAK_OPTION_EASTASIAN_CONTEXT bit is set, LB_AI and EA_A are resolved to LB_ID and EA_F, respectively; otherwise they are resolved to LB_AL and EA_N.
  • If the LINEBREAK_OPTION_LEGACY_CM bit is set, a combining mark led by a SPACE is treated as an isolated combining mark (ID); otherwise such sequences are treated as degenerate cases.
  • If the LINEBREAK_OPTION_VIRAMA_AS_JOINER bit is set, the sequence of a virama and another letter is not broken.
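
The first option bit can be read as a two-way switch for ambiguous classes. The sketch below illustrates only that reading. The TOY_* enums, the flag value, and both functions are hypothetical stand-ins: they are not sombok's constants, and the numeric values are arbitrary.

```c
#include <assert.h>

/* Hypothetical stand-ins for the classes named in the text. */
enum toy_lb { TOY_LB_AI, TOY_LB_ID, TOY_LB_AL };
enum toy_ea { TOY_EA_A, TOY_EA_F, TOY_EA_N };

/* Illustrative flag value; the real option bit may differ. */
#define TOY_OPTION_EASTASIAN_CONTEXT 1u

/* Resolve an ambiguous line breaking class; other classes pass through. */
enum toy_lb toy_resolve_lb(enum toy_lb cls, unsigned int options)
{
    if (cls != TOY_LB_AI)
        return cls;
    return (options & TOY_OPTION_EASTASIAN_CONTEXT) ? TOY_LB_ID : TOY_LB_AL;
}

/* Resolve an ambiguous East Asian width; other widths pass through. */
enum toy_ea toy_resolve_ea(enum toy_ea width, unsigned int options)
{
    if (width != TOY_EA_A)
        return width;
    return (options & TOY_OPTION_EASTASIAN_CONTEXT) ? TOY_EA_F : TOY_EA_N;
}
```

For example, toy_resolve_lb(TOY_LB_AI, 0) yields TOY_LB_AL, while passing TOY_OPTION_EASTASIAN_CONTEXT yields TOY_LB_ID.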

gcstring_t * gcstring_new_from_utf8 (char *str, size_t len, int check, linebreak_t *lbobj)

Constructor from UTF-8 string

Create a new grapheme cluster string from a UTF-8 string.

Parameters
[in]  str  buffer of the UTF-8 string; must not be NULL.
[in]  len  length of the UTF-8 string.
[in]  check  check input. See sombok_decode_utf8().
[in]  lbobj  linebreak object.
Returns
New grapheme cluster string. If an error occurs, errno is set and NULL is returned. The source string buffer is not modified.

gcstring_t * gcstring_newcopy (unistr_t *str, linebreak_t *lbobj)

Constructor copying Unicode string

Create a new grapheme cluster string from a Unicode string. Use gcstring_new() if you do not wish to copy the buffer of the Unicode string.

Parameters
[in]  str  Unicode string. NULL may be given as a zero-length string.
[in]  lbobj  linebreak object.
Returns
New grapheme cluster string. If an error occurs, errno is set and NULL is returned.
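
The practical difference between gcstring_new() (shares the buffer of the Unicode string) and gcstring_newcopy() (copies it) is whether later writes to the caller's buffer are visible through the new object. A minimal sketch, using hypothetical toy_* types and functions rather than the real sombok structures; it deliberately does not model which side frees a shared buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real unistr_t and gcstring_t hold more. */
typedef struct { unsigned int *str; size_t len; } toy_unistr_t;
typedef struct { unsigned int *str; size_t len; } toy_gcs_t;

/* Sharing constructor: the result points at the caller's buffer. */
toy_gcs_t *toy_gcs_new(toy_unistr_t *u)
{
    toy_gcs_t *g = malloc(sizeof *g);
    if (g == NULL)
        return NULL;
    g->str = (u != NULL) ? u->str : NULL; /* NULL means zero-length */
    g->len = (u != NULL) ? u->len : 0;
    return g;
}

/* Copying constructor: the result owns a private copy of the buffer. */
toy_gcs_t *toy_gcs_newcopy(toy_unistr_t *u)
{
    toy_gcs_t *g = malloc(sizeof *g);
    if (g == NULL)
        return NULL;
    g->str = NULL;
    g->len = 0;
    if (u != NULL && u->len > 0) {
        g->str = malloc(u->len * sizeof *g->str);
        if (g->str == NULL) {
            free(g);
            return NULL;
        }
        memcpy(g->str, u->str, u->len * sizeof *g->str);
        g->len = u->len;
    }
    return g;
}

/* Returns 0 when sharing and copying behave as described. */
int toy_sharing_demo(void)
{
    unsigned int buf[] = {0x41, 0x42};
    toy_unistr_t u = {buf, 2};
    toy_gcs_t *shared = toy_gcs_new(&u);
    toy_gcs_t *copied = toy_gcs_newcopy(&u);
    int ok = (shared != NULL && copied != NULL);

    if (ok) {
        buf[0] = 0x5A; /* the caller overwrites its own buffer */
        assert(shared->str == buf);
        /* the shared object sees the write, the copy does not */
        ok = shared->str[0] == 0x5A && copied->str[0] == 0x41;
    }
    if (copied != NULL)
        free(copied->str);
    free(copied);
    free(shared); /* buf is on the stack, so only the struct is freed */
    return ok ? 0 : 1;
}
```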

gcchar_t * gcstring_next (gcstring_t *gcstr)

Iterator

Returns a pointer to the next grapheme cluster of the grapheme cluster string and advances the next position by one.

Parameters
[in]  gcstr  grapheme cluster string.
Returns
Pointer to the grapheme cluster. If the position is already at the end of the string, NULL is returned.

gcstring_t * gcstring_replace (gcstring_t *gcstr, int offset, int length, gcstring_t *replacement)

Replace substring

Replace a substring of a grapheme cluster string. Offset and length are specified in numbers of grapheme clusters.

Parameters
[in,out]  gcstr  grapheme cluster string; must not be NULL.
[in]  offset  offset of the substring.
[in]  length  length of the substring. offset and length must not be out of range.
[in]  replacement  if this is not NULL, the grapheme cluster string is modified by replacing the substring with it.
Returns
The modified gcstr itself (not a copy of it). If an error occurs, errno is set to a non-zero value and NULL is returned.
Todo:
In the next major release, offset and length will be ssize_t, not int.

void gcstring_setpos (gcstring_t *gcstr, int pos)

Set Next Position

Set the next position of a grapheme cluster string.

Parameters
[in]  gcstr  grapheme cluster string.
[in]  pos  new position.
Returns
None. If pos is out of the range of the string, the position is not updated.
Todo:
In the next major release, pos will be ssize_t, not int.
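
gcstring_next() and gcstring_setpos() together form a cursor-style iterator: the string remembers a next position, gcstring_next() returns the cluster there and advances, and gcstring_setpos() moves the cursor. The sketch below models that contract with hypothetical toy_* types and functions. It treats a negative pos as out of range, which is an assumption of this sketch and may not match the real function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for gcchar_t and gcstring_t. */
typedef struct { int id; } toy_gc_t;
typedef struct {
    toy_gc_t *gcstr; /* array of clusters */
    size_t gclen;    /* number of clusters */
    size_t pos;      /* next position */
} toy_iter_str_t;

/* Return the cluster at the next position and advance; NULL at the end. */
toy_gc_t *toy_next(toy_iter_str_t *s)
{
    assert(s != NULL);
    if (s->pos >= s->gclen)
        return NULL;
    return &s->gcstr[s->pos++];
}

/* Move the cursor; an out-of-range pos leaves it unchanged. */
void toy_setpos(toy_iter_str_t *s, int pos)
{
    assert(s != NULL);
    if (pos < 0 || (size_t)pos > s->gclen)
        return;
    s->pos = (size_t)pos;
}

/* Returns 0 when iteration and repositioning behave as described. */
int toy_iter_demo(void)
{
    toy_gc_t items[] = {{10}, {20}, {30}};
    toy_iter_str_t s = {items, 3, 0};
    toy_gc_t *gc;
    int sum = 0;

    /* Typical loop: walk until NULL signals the end. */
    while ((gc = toy_next(&s)) != NULL)
        sum += gc->id;
    if (sum != 60 || toy_next(&s) != NULL)
        return 1;

    toy_setpos(&s, 1);  /* rewind to the second cluster */
    toy_setpos(&s, 99); /* out of range: ignored */
    gc = toy_next(&s);
    return (gc != NULL && gc->id == 20) ? 0 : 1;
}
```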

void gcstring_shrink (gcstring_t *gcstr, int length)

Shrink

Modify a grapheme cluster string by shrinking its length. The length is specified in number of grapheme clusters.

Parameters
[in]  gcstr  grapheme cluster string.
[in]  length  new length.
Returns
None. If gcstr is NULL, nothing is done.
Todo:
In the next major release, length will be ssize_t, not int.

gcstring_t * gcstring_substr (gcstring_t *gcstr, int offset, int length)

Substring

Returns a substring of a grapheme cluster string. Offset and length are specified in numbers of grapheme clusters.

Parameters
[in]  gcstr  grapheme cluster string; must not be NULL.
[in]  offset  offset of the substring.
[in]  length  length of the substring.
Returns
Newly allocated substring. If an error occurs, errno is set to a non-zero value and NULL is returned.
Todo:
In the next major release, offset and length will be ssize_t, not int.
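
Because offset and length count grapheme clusters rather than code points, a substring of two clusters may span more than two code points. The sketch below illustrates this with a deliberately crude segmentation rule: a code point in U+0300..U+036F extends the preceding cluster. That rule, the toy_* functions, and the choice to return NULL for out-of-range arguments are all assumptions of this sketch; sombok's own segmentation and range handling are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified rule (assumption): U+0300..U+036F extend the previous cluster. */
static int toy_is_extender(unsigned int c)
{
    return c >= 0x0300 && c <= 0x036F;
}

/* Number of clusters in cp[0..n). */
size_t toy_cluster_count(const unsigned int *cp, size_t n)
{
    size_t i, count = 0;
    for (i = 0; i < n; i++)
        if (i == 0 || !toy_is_extender(cp[i]))
            count++;
    return count;
}

/* Code point index where cluster k starts; n if k is the cluster count. */
size_t toy_cluster_start(const unsigned int *cp, size_t n, size_t k)
{
    size_t i, seen = 0;
    for (i = 0; i < n; i++) {
        if (i == 0 || !toy_is_extender(cp[i])) {
            if (seen == k)
                return i;
            seen++;
        }
    }
    return n;
}

/* Newly allocated copy of `length` clusters starting at cluster `offset`.
 * *outlen receives the number of code points copied. Returns NULL when the
 * range does not fit or allocation fails. */
unsigned int *toy_substr(const unsigned int *cp, size_t n,
                         size_t offset, size_t length, size_t *outlen)
{
    size_t total = toy_cluster_count(cp, n), from, to;
    unsigned int *out;

    assert(outlen != NULL);
    if (offset > total || length > total - offset)
        return NULL;
    from = toy_cluster_start(cp, n, offset);
    to = toy_cluster_start(cp, n, offset + length);
    out = malloc((to - from + 1) * sizeof *out); /* never a zero-size request */
    if (out == NULL)
        return NULL;
    if (to > from)
        memcpy(out, cp + from, (to - from) * sizeof *out);
    *outlen = to - from;
    return out;
}

/* Returns 0 when cluster-based offsets behave as described. */
int toy_substr_demo(void)
{
    /* "e" + acute, "a", "b" + grave: five code points, three clusters. */
    const unsigned int cp[] = {0x65, 0x0301, 0x61, 0x62, 0x0300};
    size_t outlen = 0;
    unsigned int *sub;
    int ok;

    if (toy_cluster_count(cp, 5) != 3)
        return 1;
    sub = toy_substr(cp, 5, 1, 2, &outlen); /* clusters 1 and 2 */
    if (sub == NULL)
        return 1;
    ok = outlen == 3 && sub[0] == 0x61 && sub[1] == 0x62 && sub[2] == 0x0300;
    free(sub);
    if (toy_substr(cp, 5, 2, 5, &outlen) != NULL) /* range does not fit */
        return 1;
    return ok ? 0 : 1;
}
```

Here a request for two clusters starting at cluster 1 copies three code points, since the last cluster carries a combining mark.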