kakoune.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2025-07-08	Replace std::unique_ptr with a custom implementation	Maxime Coste
	<memory> is a costly header we can avoid by just implementing UniquePtr ourselves, which is a pretty straightforward in modern C++, this saves around 10% of the compilation time here.
2025-07-07	Add a CharRange regex op to optimize the common simple range case	Maxime Coste
	Instead of jumping into the general CharClass code, detect simple [a-z] style ranges and use a specific op. Also detect when a range can be converted to ignore case
2025-04-02	Reduce include creep	Maxime Coste

2025-02-04	Revert "Use uint64_t for regex step"	Maxime Coste
	This got pushed by accident This reverts commit d92496449d0c9655253ad16363685bb8446dc582.
2025-01-22	Use uint64_t for regex step	Maxime Coste
	Its unclear that maintaining a small instruction size outweigh the cost of handling wrapping of the current_step every 64K codepoints, this makes the code simpler.
2024-12-01	Add specific start desc optimization for single possible start byte	Maxime Coste
	Use tighter codegen for that pretty common use case.
2024-11-04	Fix backward regex search ending in DOTALL	Johannes Altmanninger
	I noticed that reverse searches ending in "." stopped working in version 2024.05.08: kak -n -e "exec %{%cfoobar<ret><esc>gj<a-/>foo.<ret>}' Bisects ca7471c25 (Compute StartDesc with an offset to effective start, 2024-03-18) which updated the find_next_start() logic for the forward case but not for backward case. Add a symmetrical change and test case, that seems to fix it. Not 100% sure if this is correct but feels so.
2024-08-16	include headers cleanup	Adrià Arrufat

2024-08-12	Reduce headers dependency graph	Maxime Coste
	Move more code into the implementation files to reduce the amount of code pulled by headers.
2024-08-12	Extract format implementation to its own file	Maxime Coste
	Split it to avoid pulling all string_utils dependencies for just format.
2024-08-04	Move most code in regex_impl inside the anonymous namespace	Maxime Coste
	This is all internal code that should not be exposed.
2024-07-15	Improve error message for alternations in lookarounds and add tests	Maxime Coste
	Fixes #5203
2024-06-26	Fix order of evaluation issue when compiling regex alternations	Maxime Coste
	The previous implementation was relying on the left hand side of the assignement's side effects to be sequenced before the right hand side evaluation (so --split_pos was expected to have taken place) This is actually counter to C++17 which guarantees evaluation of the right hand side of the assignement first. For some reason this is not the case with GCC on linux.
2024-06-15	Store instruction pointers directly in ThreadedRegexVM::Thread	Maxime Coste
	The previous tradeoff of having a very small Thread struct is not necessary anymore as we do not memcpy Threads on swap_next since d708b77186c1685dcbd2298246ada7d204acec2f. This requires offsets to be used instead of indices for jump/split ops.
2024-03-22	Match Op declaration order in switches	Maxime Coste

2024-03-21	Compute StartDesc with an offset to effective start	Maxime Coste
	This means `.{2,4}foo` will now consider 4 or less before f as a start candidate instead of every characters
2024-03-13	Simplify and accelerate start desc map	Maxime Coste
	Store values for all possible bytes and fill utf8 multi byte start values when necessary.
2024-03-13	Fix quantifier parsing bug	Maxime Coste

2024-03-12	Simplify Quantifier logic in regex parsing	Maxime Coste
	Remove redundant type enum
2023-11-05	Remove ignored packed attribute and static_assert on Node size	Maxime Coste
	This static_assert is not necessary for the code to work and is not valid on every platform.
2023-11-03	Add support for 0-padding in format and replace uses of sprintf	Maxime Coste

2023-06-27	Unbreak build on ppc	Sergey Fedorov
	Fixes: https://github.com/mawww/kakoune/issues/4937
2023-02-19	Optimize Regex CharacterClass matching	Maxime Coste
	Take advantage of ranges sorting to early out, make the logic inline.
2023-02-14	Fix broken corner cases in DualThreadStack::grow_ifn	Maxime Coste
	We only grow when the ring buffer is full, which allows for a nice simplification of the code. Tell grow_ifn if we pushed in current or next so that we can distinguish between filled by next or filled by current when m_current == m_next_begin
2023-02-13	Remove scheduled optimization from ThreadedRegexVM	Maxime Coste
	This does not seem to actually speed up execution as threads will be dropped on next step anyway
2023-01-23	Fix incorrect use of subject end/begin in regex execution	Maxime Coste
	This could lead to reading past subject string end in certain conditions Fixes #4794
2022-08-05	Reuse existing character classes when possible in regex	Maxime Coste

2022-02-02	Fix regex alternation execution priority	Maxime Coste
	The ThreadedRegexVM implementation does not execute split opcodes as expected: on split the pending thread is pushed on top of the thread stack, which means that when multiple splits are executed in a row (such as with a disjunction with 3 or more branches) the last split target gets on top of the thread stack and gets executed next (when the thread from the first split target would be the expected one) Fixing this in the ThreadedRegexVM would have a performance impact as we would not be able to use a plain stack for current threads, so the best solution at the moment is to reverse the order of splits generated by a disjunction. Fixes #4519
2021-12-11	Fix parsing nul bytes in regex	Maxime Coste
	Fixes #4460
2021-11-25	small regex impl code style tweak	Maxime Coste

2021-11-21	Micro-optimize regex character class/type matching	Maxime Coste
	Also force-inline step_thread as function call overhead has a mesurable impact.
2021-11-21	Reduce the amount of Regex VM Instruction code	Maxime Coste
	Merge all lookarounds into the same instruction, merge splits, merge literal ignore case with literal... Besides reducing the amount of almost duplicated code, this improves performance by reducing pressure on the (often failing) branch target prediction for instruction dispatching by moving branches into the instruction code themselves where they are more likely to be well predicted.
2021-08-21	Use the [[gnu::packed]] C++ attribute.	Peter Pentchev
	Suggested by: Maxime Coste <mawww@kakoune.org>
2021-08-20	Do not break non-GCC/g++ compilers.	Peter Pentchev

2021-08-20	Make sure the ParsedRegex structure has the right size.	Peter Pentchev
	Some versions of GCC/g++ will not necessarily pad the structure to a 32-bit boundary, so make the alignment and the filler explicit. Detected on: Debian/m68k; https://buildd.debian.org/status/fetch.php?pkg=kakoune&arch=m68k&ver=2020.09.01-1&stamp=1629387444&raw=0
2021-07-31	Code style tweak for Regex implementation TestVM	Maxime Coste

2021-01-03	Add missing limits includes	Maxime Coste
	Fixes #4003
2020-02-07	Fix regex start desc computation for case insensitive ranges	Maxime Coste
	Fixes #3345
2019-12-05	Restore regex optimization pass by introducing basic block analysis	Maxime Coste
	Run the peephole optimizer on each basic block, avoiding the previous issue that some instructions could move across their boundaries.
2019-11-09	Add static or const where useful	Jason Felice

2019-11-06	Support \x and \u escapes in regex character classes	Maxime Coste
	Change \u to use 6 digits to cover the full unicode range. Fixes #3172
2019-07-06	Fix build on FreeBSD	Tobias Kortkamp
	file.cc:390:21: error: use of undeclared identifier 'rename'; did you mean 'devname'? if (replace and rename(temp_filename, zfilename) != 0) ^~~~~~ devname /usr/include/stdlib.h:277:7: note: 'devname' declared here char devname(__dev_t, __mode_t); ^ file.cc:390:28: error: cannot initialize a parameter of type '__dev_t' (aka 'unsigned long') with an lvalue of type 'char [1024]' if (replace and rename(temp_filename, zfilename) != 0) ^~~~~~~~~~~~~ /usr/include/stdlib.h:277:22: note: passing argument to parameter here char devname(__dev_t, __mode_t); ^ 2 errors generated. --- highlighters.cc:1110:13: error: use of undeclared identifier 'snprintf'; did you mean 'vswprintf'? snprintf(buffer, 16, format, std::abs(line_to_format)); ^~~~~~~~ vswprintf /usr/include/wchar.h:139:5: note: 'vswprintf' declared here int vswprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict, ^ highlighters.cc:1110:22: error: cannot initialize a parameter of type 'wchar_t ' with an lvalue of type 'char [16]' snprintf(buffer, 16, format, std::abs(line_to_format)); ^~~~~~ /usr/include/wchar.h:139:35: note: passing argument to parameter here int vswprintf(wchar_t __restrict, size_t n, const wchar_t * __restrict, ^ 2 errors generated. --- json_ui.cc:60:13: error: use of undeclared identifier 'sprintf'; did you mean 'swprintf'? sprintf(buf, "\\u%04x", next); ^~~~~~~ swprintf /usr/include/wchar.h:133:5: note: 'swprintf' declared here int swprintf(wchar_t __restrict, size_t n, const wchar_t * __restrict, ^ json_ui.cc:60:21: error: cannot initialize a parameter of type 'wchar_t ' with an lvalue of type 'char [7]' sprintf(buf, "\\u%04x", next); ^~~ /usr/include/wchar.h:133:34: note: passing argument to parameter here int swprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict, ^ json_ui.cc:74:9: error: use of undeclared identifier 'sprintf' sprintf(buffer, R"("#%02x%02x%02x")", color.r, color.g, color.b); ^ 3 errors generated. --- regex_impl.cc:1039:9: error: use of undeclared identifier 'sprintf'; did you mean 'swprintf'? sprintf(buf, " %03d ", count++); ^~~~~~~ swprintf /usr/include/wchar.h:133:5: note: 'swprintf' declared here int swprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict, ^ regex_impl.cc:1039:17: error: cannot initialize a parameter of type 'wchar_t ' with an lvalue of type 'char [20]' sprintf(buf, " %03d ", count++); ^~~ /usr/include/wchar.h:133:34: note: passing argument to parameter here int swprintf(wchar_t __restrict, size_t n, const wchar_t * __restrict, ^ regex_impl.cc:1197:17: error: use of undeclared identifier 'puts' { if (dump) puts(dump_regex(this).c_str()); } ^ regex_impl.cc:1208:18: note: in instantiation of member function 'Kakoune::(anonymous namespace)::TestVM<Kakoune::RegexMode::Forward>::TestVM' requested here TestVM<> vm{R"(ab)"}; ^ regex_impl.cc:1197:17: error: use of undeclared identifier 'puts' { if (dump) puts(dump_regex(this).c_str()); } ^ regex_impl.cc:1283:56: note: in instantiation of member function 'Kakoune::(anonymous namespace)::TestVM<5>::TestVM' requested here TestVM<RegexMode::Forward \| RegexMode::Search> vm{R"(f.a(.o))"}; ^ regex_impl.cc:1197:17: error: use of undeclared identifier 'puts' { if (dump) puts(dump_regex(this).c_str()); } ^ regex_impl.cc:1423:57: note: in instantiation of member function 'Kakoune::(anonymous namespace)::TestVM<6>::TestVM' requested here TestVM<RegexMode::Backward \| RegexMode::Search> vm{R"(fo{1,})"}; ^ 5 errors generated. --- remote.cc:829:9: error: use of undeclared identifier 'rename'; did you mean 'devname'? if (rename(old_socket_file.c_str(), new_socket_file.c_str()) != 0) ^~~~~~ devname /usr/include/stdlib.h:277:7: note: 'devname' declared here char devname(__dev_t, __mode_t); ^ remote.cc:829:16: error: cannot initialize a parameter of type '__dev_t' (aka 'unsigned long') with an rvalue of type 'const char ' if (rename(old_socket_file.c_str(), new_socket_file.c_str()) != 0) ^~~~~~~~~~~~~~~~~~~~~~~ /usr/include/stdlib.h:277:22: note: passing argument to parameter here char devname(__dev_t, __mode_t); ^ 2 errors generated. --- string_utils.cc:126:20: error: use of undeclared identifier 'sprintf'; did you mean 'swprintf'? res.m_length = sprintf(res.m_data, "%i", val); ^~~~~~~ swprintf /usr/include/wchar.h:133:5: note: 'swprintf' declared here int swprintf(wchar_t __restrict, size_t n, const wchar_t * __restrict, ^ string_utils.cc:126:28: error: cannot initialize a parameter of type 'wchar_t ' with an lvalue of type 'char [15]' res.m_length = sprintf(res.m_data, "%i", val); ^~~~~~~~~~ /usr/include/wchar.h:133:34: note: passing argument to parameter here int swprintf(wchar_t __restrict, size_t n, const wchar_t * __restrict, ^ string_utils.cc:133:20: error: use of undeclared identifier 'sprintf'; did you mean 'swprintf'? res.m_length = sprintf(res.m_data, "%u", val); ^~~~~~~ swprintf [...]
2019-02-27	Fixed all reorder warnings	Justin Frank

2019-02-04	Remove peephole regex optimization pass	Maxime Coste
	The current implementation is wrong as it crosses basic blocks boundaries. Doing basic block decomposition of regex is probably a tad too complex for this single optimization. Fixes #2711
2019-02-04	Fix regex not always selecting the leftmost longest match	Maxime Coste
	(Actually the rightmost longest match when searching backwards) Fixes #2710
2019-01-20	Add a peephole optimization pass to the regex compiler	Maxime Coste

2019-01-20	Refactor regex find next start not to be an instruction anymore	Maxime Coste
	The same logic can be hard coded, avoiding one thread and 3 instructions, improving the regex matching speed.
2019-01-20	Split compile time regex flags from runtime ones	Maxime Coste

2019-01-20	Refactor parsed regex children iteration to use regular range-for loops	Maxime Coste

2019-01-03	Add support for named captures to the regex impl and regex highlighter	Maxime Coste
	ECMAScript is adding support for it, and it is a pretty isolated change to do. Fixes #2293