path: root/src/utf8.hh
Age | Commit message | Author
2025-04-02 | Reduce include creep | Maxime Coste
2024-12-05 | Split utf8::read_codepoint between single byte and multibyte code | Maxime Coste
    Make read_codepoint_multibyte noinline so that the common single-byte case gets inlined.
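
The split described in this commit can be sketched as follows. This is a minimal illustration, not the code from utf8.hh: the `_sketch` names, the `SKETCH_NOINLINE` macro, the raw-pointer interface, and the choice to return U+FFFD on truncated input are all assumptions made for the example. Continuation bytes are not validated, and the caller is assumed to pass `it != end`.

```cpp
using Codepoint = char32_t;

// Hypothetical portability macro for this sketch only.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#define SKETCH_NOINLINE __declspec(noinline)
#else
#define SKETCH_NOINLINE
#endif

// Slow path, kept out of line: decode a 2- to 4-byte sequence starting at *it.
// Assumption: truncated input or an unexpected lead byte yields U+FFFD.
SKETCH_NOINLINE
Codepoint read_codepoint_multibyte_sketch(const unsigned char*& it,
                                          const unsigned char* end)
{
    const unsigned char lead = *it++;
    int extra;
    Codepoint cp;
    if ((lead & 0xE0) == 0xC0)      { extra = 1; cp = lead & 0x1F; } // 110xxxxx
    else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; } // 1110xxxx
    else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; } // 11110xxx
    else return 0xFFFD; // stray continuation byte or invalid lead byte

    while (extra-- > 0)
    {
        if (it == end)
            return 0xFFFD; // sequence cut short by the end of the buffer
        cp = (cp << 6) | (*it++ & 0x3F);
    }
    return cp;
}

// Fast path: a single-byte (ASCII) codepoint needs one comparison, so this
// function stays small enough to be inlined at call sites.
inline Codepoint read_codepoint_sketch(const unsigned char*& it,
                                       const unsigned char* end)
{
    if (*it < 0x80)
        return *it++;
    return read_codepoint_multibyte_sketch(it, end);
}
```

Keeping the multibyte decoder out of line means the inlined body at each call site is only the ASCII test plus a call, which is presumably the effect the commit is after.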
2024-08-16 | include headers cleanup | Adrià Arrufat
2019-09-07 | Rank a word-boundary after a non-word-boundary | Jean-Louis Fuchs
2019-01-13 | Use an InvalidPolicy in utf8::dump and utf8::codepoint_size | Maxime Coste
    Do not throw on invalid codepoints by default; ignore them. Fixes #2686
2018-11-01 | Support different type for iterators and sentinel in utf8 functions | Maxime Coste
2017-10-10 | Fix utf8::to_previous that could go before the begin iterator | Maxime Coste
2017-04-23 | Add noexcept specifiers to unicode and utf8 functions | Maxime Coste
2017-04-20 | Change utf8::to_next/to_previous so that they are more symmetrical | Maxime Coste
    The previous implementation could yield different positions when iterating forward and backward, leading to confusion in boost regex. This makes an existing problem a bit more visible: iterating with to_next and with read_codepoint won't behave the same way, as read_codepoint will put the iterator onto the byte following the utf8 codepoint, whereas to_next will put it on the next utf8 character start byte, which might be different if the buffer content is not valid utf8. Fixes #1195
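
The divergence described in this commit message can be reproduced with two small stepping functions. These are hypothetical stand-ins written for illustration, not the utf8.hh functions: `step_by_length` models the read_codepoint behaviour (trust the lead byte and skip the announced number of bytes), and `step_to_next_start` models the to_next behaviour (stop at the next byte that can start a character). Both assume `it != end` on entry.

```cpp
#include <cstddef>

// A byte starts a character unless it is a continuation byte (10xxxxxx).
inline bool is_character_start_sketch(unsigned char c)
{
    return (c & 0xC0) != 0x80;
}

// Length announced by a lead byte; 1 for ASCII and for unexpected bytes.
inline std::size_t announced_size_sketch(unsigned char lead)
{
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

// read_codepoint-like stepping: skip as many bytes as the lead byte announces,
// without checking that they really are continuation bytes.
inline const unsigned char* step_by_length(const unsigned char* it,
                                           const unsigned char* end)
{
    std::size_t n = announced_size_sketch(*it);
    while (n-- > 0 && it != end)
        ++it;
    return it;
}

// to_next-like stepping: move one byte, then skip continuation bytes until
// the next character start byte (or the end of the buffer).
inline const unsigned char* step_to_next_start(const unsigned char* it,
                                               const unsigned char* end)
{
    ++it;
    while (it != end && !is_character_start_sketch(*it))
        ++it;
    return it;
}
```

On valid input such as `E2 82 AC 61` both functions stop at offset 3. On the invalid input `E2 61 62 63`, `step_by_length` still stops at offset 3, having swallowed two ASCII bytes, while `step_to_next_start` stops at offset 1.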
2016-10-01 | Rename get_width to codepoint_width | Maxime Coste
2016-10-01 | Support codepoints of variable width | Maxime Coste
    Add a ColumnCount type and use it in place of CharCount whenever more appropriate, take column size of codepoints into account for vertical movements and docstring wrapping. Fixes #811
2016-07-27 | Avoid underlying iterator copies in utf8_iterator | Maxime Coste
2016-07-15 | Faster implementation of utf8::advance not copying iterators at each step | Maxime Coste
2016-07-15 | Avoid postfix increment in utf8::distance | Maxime Coste
2016-02-05 | More string usage cleanup | Maxime Coste
2015-09-25 | Avoid (*it++) pattern in utf8.hh | Maxime Coste
2015-09-24 | Add utf8::read_codepoint that both gets the codepoint and advance iterator | Maxime Coste
2015-09-23 | Minor additional cleanup in utf8.hh | Maxime Coste
2015-09-23 | Avoid unneeded iterator copies in utf8.hh | Maxime Coste
2014-10-13 | Use Pass as default policy for invalid utf8 avoid asserting on that | Maxime Coste
2014-07-05 | utf8: use end of sequence iterators for more security | Maxime Coste
2014-07-05 | Use unsigned char rather than char in utf8 decoding to avoid sign extension | Maxime Coste
2014-05-14 | utf8::is_character_start takes directly the char value | Maxime Coste
2013-05-30 | Add utf8::codepoint_size function | Maxime Coste
2013-04-09 | sort includes directives | Maxime Coste
2013-04-09 | rename assert to kak_assert to avoid collisions | Maxime Coste
2013-02-27 | utf8::dump uses a copy of the output iterator instead of a reference | Maxime Coste
2013-02-26 | Add utf8::character_start function | Maxime Coste
2012-10-27 | utf8: use CharCount instead of size_t | Maxime Coste
2012-10-17 | utf8: replace InvalidBytePolicy::Throw with InvalidBytePolicy::Assert | Maxime Coste
2012-10-13 | utf8::codepoint: configurable invalid byte policy | Maxime Coste
2012-10-11 | use ByteCount instead of CharCount when we are really counting bytes | Maxime Coste
    (that is most of the time when we are not concerned with displaying)
2012-10-11 | Return something in utf8::distance, thanks again gcc for letting this work | Maxime Coste
2012-10-10 | Actually return something in utf8::codepoint, thanks gcc for using rax | Maxime Coste
2012-10-09 | add a unicode.hh header for Codepoint related functions, s/utf8::Codepoint/Codepoint/ | Maxime Coste
2012-10-09 | utf8: add dump(OutputIterator& it, Codepoint cp) | Maxime Coste
2012-10-08 | add utf8 helpers in utf8.hh | Maxime Coste