| Age | Commit message (Collapse) | Author |
|
<memory> is a costly header we can avoid by just implementing
UniquePtr ourselves, which is a pretty straightforward in modern
C++, this saves around 10% of the compilation time here.
|
|
Instead of jumping into the general CharClass code, detect simple
[a-z] style ranges and use a specific op.
Also detect when a range can be converted to ignore case
|
|
|
|
This got pushed by accident
This reverts commit d92496449d0c9655253ad16363685bb8446dc582.
|
|
Its unclear that maintaining a small instruction
size outweigh the cost of handling wrapping of
the current_step every 64K codepoints, this makes
the code simpler.
|
|
Use tighter codegen for that pretty common use case.
|
|
I noticed that reverse searches ending in "." stopped working in
version 2024.05.08:
kak -n -e "exec %{%cfoobar<ret><esc>gj<a-/>foo.<ret>}'
Bisects ca7471c25 (Compute StartDesc with an offset to effective start,
2024-03-18) which updated the find_next_start() logic for the forward
case but not for backward case. Add a symmetrical change and test
case, that seems to fix it. Not 100% sure if this is correct but
feels so.
|
|
|
|
Move more code into the implementation files to reduce the amount
of code pulled by headers.
|
|
Split it to avoid pulling all string_utils dependencies for just
format.
|
|
This is all internal code that should not be exposed.
|
|
Fixes #5203
|
|
The previous implementation was relying on the left hand side of the
assignement's side effects to be sequenced before the right hand side
evaluation (so --split_pos was expected to have taken place)
This is actually counter to C++17 which guarantees evaluation of the
right hand side of the assignement first. For some reason this is
not the case with GCC on linux.
|
|
The previous tradeoff of having a very small Thread struct is not
necessary anymore as we do not memcpy Threads on swap_next since
d708b77186c1685dcbd2298246ada7d204acec2f.
This requires offsets to be used instead of indices for jump/split
ops.
|
|
|
|
This means `.{2,4}foo` will now consider 4 or less before f as
a start candidate instead of every characters
|
|
Store values for all possible bytes and fill utf8 multi byte start
values when necessary.
|
|
|
|
Remove redundant type enum
|
|
This static_assert is not necessary for the code to work and is
not valid on every platform.
|
|
|
|
Fixes: https://github.com/mawww/kakoune/issues/4937
|
|
Take advantage of ranges sorting to early out, make the logic
inline.
|
|
We only grow when the ring buffer is full, which allows for a nice
simplification of the code.
Tell grow_ifn if we pushed in current or next so that we can
distinguish between filled by next or filled by current when
m_current == m_next_begin
|
|
This does not seem to actually speed up execution as threads will
be dropped on next step anyway
|
|
This could lead to reading past subject string end in certain
conditions
Fixes #4794
|
|
|
|
The ThreadedRegexVM implementation does not execute split opcodes as
expected: on split the pending thread is pushed on top of the thread
stack, which means that when multiple splits are executed in a row
(such as with a disjunction with 3 or more branches) the last split
target gets on top of the thread stack and gets executed next (when the
thread from the first split target would be the expected one)
Fixing this in the ThreadedRegexVM would have a performance impact as
we would not be able to use a plain stack for current threads, so the
best solution at the moment is to reverse the order of splits generated
by a disjunction.
Fixes #4519
|
|
Fixes #4460
|
|
|
|
Also force-inline step_thread as function call overhead has a
mesurable impact.
|
|
Merge all lookarounds into the same instruction, merge splits, merge
literal ignore case with literal...
Besides reducing the amount of almost duplicated code, this improves
performance by reducing pressure on the (often failing) branch target
prediction for instruction dispatching by moving branches into the
instruction code themselves where they are more likely to be well
predicted.
|
|
Suggested by: Maxime Coste <mawww@kakoune.org>
|
|
|
|
Some versions of GCC/g++ will not necessarily pad the structure to
a 32-bit boundary, so make the alignment and the filler explicit.
Detected on: Debian/m68k; https://buildd.debian.org/status/fetch.php?pkg=kakoune&arch=m68k&ver=2020.09.01-1&stamp=1629387444&raw=0
|
|
|
|
Fixes #4003
|
|
Fixes #3345
|
|
Run the peephole optimizer on each basic block, avoiding the
previous issue that some instructions could move across their
boundaries.
|
|
|
|
Change \u to use 6 digits to cover the full unicode range.
Fixes #3172
|
|
file.cc:390:21: error: use of undeclared identifier 'rename'; did you mean 'devname'?
if (replace and rename(temp_filename, zfilename) != 0)
^~~~~~
devname
/usr/include/stdlib.h:277:7: note: 'devname' declared here
char *devname(__dev_t, __mode_t);
^
file.cc:390:28: error: cannot initialize a parameter of type '__dev_t' (aka 'unsigned long') with an lvalue of type 'char [1024]'
if (replace and rename(temp_filename, zfilename) != 0)
^~~~~~~~~~~~~
/usr/include/stdlib.h:277:22: note: passing argument to parameter here
char *devname(__dev_t, __mode_t);
^
2 errors generated.
---
highlighters.cc:1110:13: error: use of undeclared identifier 'snprintf'; did you mean 'vswprintf'?
snprintf(buffer, 16, format, std::abs(line_to_format));
^~~~~~~~
vswprintf
/usr/include/wchar.h:139:5: note: 'vswprintf' declared here
int vswprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict,
^
highlighters.cc:1110:22: error: cannot initialize a parameter of type 'wchar_t *' with an lvalue of type 'char [16]'
snprintf(buffer, 16, format, std::abs(line_to_format));
^~~~~~
/usr/include/wchar.h:139:35: note: passing argument to parameter here
int vswprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict,
^
2 errors generated.
---
json_ui.cc:60:13: error: use of undeclared identifier 'sprintf'; did you mean 'swprintf'?
sprintf(buf, "\\u%04x", *next);
^~~~~~~
swprintf
/usr/include/wchar.h:133:5: note: 'swprintf' declared here
int swprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict,
^
json_ui.cc:60:21: error: cannot initialize a parameter of type 'wchar_t *' with an lvalue of type 'char [7]'
sprintf(buf, "\\u%04x", *next);
^~~
/usr/include/wchar.h:133:34: note: passing argument to parameter here
int swprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict,
^
json_ui.cc:74:9: error: use of undeclared identifier 'sprintf'
sprintf(buffer, R"("#%02x%02x%02x")", color.r, color.g, color.b);
^
3 errors generated.
---
regex_impl.cc:1039:9: error: use of undeclared identifier 'sprintf'; did you mean 'swprintf'?
sprintf(buf, " %03d ", count++);
^~~~~~~
swprintf
/usr/include/wchar.h:133:5: note: 'swprintf' declared here
int swprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict,
^
regex_impl.cc:1039:17: error: cannot initialize a parameter of type 'wchar_t *' with an lvalue of type 'char [20]'
sprintf(buf, " %03d ", count++);
^~~
/usr/include/wchar.h:133:34: note: passing argument to parameter here
int swprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict,
^
regex_impl.cc:1197:17: error: use of undeclared identifier 'puts'
{ if (dump) puts(dump_regex(*this).c_str()); }
^
regex_impl.cc:1208:18: note: in instantiation of member function 'Kakoune::(anonymous namespace)::TestVM<Kakoune::RegexMode::Forward>::TestVM' requested here
TestVM<> vm{R"(a*b)"};
^
regex_impl.cc:1197:17: error: use of undeclared identifier 'puts'
{ if (dump) puts(dump_regex(*this).c_str()); }
^
regex_impl.cc:1283:56: note: in instantiation of member function 'Kakoune::(anonymous namespace)::TestVM<5>::TestVM' requested here
TestVM<RegexMode::Forward | RegexMode::Search> vm{R"(f.*a(.*o))"};
^
regex_impl.cc:1197:17: error: use of undeclared identifier 'puts'
{ if (dump) puts(dump_regex(*this).c_str()); }
^
regex_impl.cc:1423:57: note: in instantiation of member function 'Kakoune::(anonymous namespace)::TestVM<6>::TestVM' requested here
TestVM<RegexMode::Backward | RegexMode::Search> vm{R"(fo{1,})"};
^
5 errors generated.
---
remote.cc:829:9: error: use of undeclared identifier 'rename'; did you mean 'devname'?
if (rename(old_socket_file.c_str(), new_socket_file.c_str()) != 0)
^~~~~~
devname
/usr/include/stdlib.h:277:7: note: 'devname' declared here
char *devname(__dev_t, __mode_t);
^
remote.cc:829:16: error: cannot initialize a parameter of type '__dev_t' (aka 'unsigned long') with an rvalue of type 'const char *'
if (rename(old_socket_file.c_str(), new_socket_file.c_str()) != 0)
^~~~~~~~~~~~~~~~~~~~~~~
/usr/include/stdlib.h:277:22: note: passing argument to parameter here
char *devname(__dev_t, __mode_t);
^
2 errors generated.
---
string_utils.cc:126:20: error: use of undeclared identifier 'sprintf'; did you mean 'swprintf'?
res.m_length = sprintf(res.m_data, "%i", val);
^~~~~~~
swprintf
/usr/include/wchar.h:133:5: note: 'swprintf' declared here
int swprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict,
^
string_utils.cc:126:28: error: cannot initialize a parameter of type 'wchar_t *' with an lvalue of type 'char [15]'
res.m_length = sprintf(res.m_data, "%i", val);
^~~~~~~~~~
/usr/include/wchar.h:133:34: note: passing argument to parameter here
int swprintf(wchar_t * __restrict, size_t n, const wchar_t * __restrict,
^
string_utils.cc:133:20: error: use of undeclared identifier 'sprintf'; did you mean 'swprintf'?
res.m_length = sprintf(res.m_data, "%u", val);
^~~~~~~
swprintf
[...]
|
|
|
|
The current implementation is wrong as it crosses basic blocks
boundaries. Doing basic block decomposition of regex is probably
a tad too complex for this single optimization.
Fixes #2711
|
|
(Actually the rightmost longest match when searching backwards)
Fixes #2710
|
|
|
|
The same logic can be hard coded, avoiding one thread and 3
instructions, improving the regex matching speed.
|
|
|
|
|
|
ECMAScript is adding support for it, and it is a pretty isolated
change to do.
Fixes #2293
|