path: root/src/regex_impl.cc
Age | Commit message | Author
2017-11-01 | Regex: compute start chars from matchers, do not compute it from lookarounds | Maxime Coste
    Computing potential start characters from lookarounds is more complex than expected, and not worth the complexity.
2017-11-01 | Regex: rename "flags" to the more common "modifiers" | Maxime Coste
2017-11-01 | Regex: Correctly handle ignore case mode for start chars computation | Maxime Coste
2017-11-01 | Regex: Rework parsing, treat lookarounds as assertions, and flags separately | Maxime Coste
2017-11-01 | Regex: Limit programs to std::numeric_limits<uint16_t>::max() instructions | Maxime Coste
2017-11-01 | Regex: Fix reverse searching behaviour, again | Maxime Coste
2017-11-01 | Regex: limit explicit quantifiers value (to 1000 for now) | Maxime Coste
    Fixes #1628
2017-11-01 | Regex: Fix handling of ^ and $ in backward matching mode | Maxime Coste
2017-11-01 | Regex: Fix support for ignore case in lookarounds | Maxime Coste
2017-11-01 | Regex: support more than two children in alternations | Maxime Coste
    Avoid deeply nested alternations, parse them flattened.
2017-11-01 | Regex: print instruction index in dump_regex | Maxime Coste
2017-11-01 | Regex: Tweak definition of character class and control escape tables | Maxime Coste
2017-11-01 | Regex: fix lookarounds handling when computing starting chars | Maxime Coste
2017-11-01 | Regex: Make boost checking disableable at compile time | Maxime Coste
2017-11-01 | Regex: switch to custom impl, use boost for checking | Maxime Coste
2017-11-01 | Regex: Fix lookaround use in moon.kak | Maxime Coste
    (?=[A-Z]\w*) is strictly equivalent to (?=[A-Z]), since \w* always matches at least the empty string.
2017-11-01 | Regex: Support any char and character classes in lookarounds | Maxime Coste
    Lookarounds still need to be fixed size, but accept character classes as well as plain literals.
2017-11-01 | Regex: Fix computation of potential starts for lookaheads | Maxime Coste
2017-11-01 | Regex: detect when all characters can start and avoid allocating | Maxime Coste
2017-11-01 | Regex: Fix wrong size of character_class_escapes array | Maxime Coste
2017-11-01 | Regex: Introduce RegexExecFlags::PrevAvailable | Maxime Coste
    Rework assertion code as well.
2017-11-01 | Regex: deallocate Saves memory on ThreadedRegexVM destruction | Maxime Coste
2017-11-01 | Regex: Fix handling of control escapes inside character classes | Maxime Coste
2017-11-01 | Regex: tag instructions as scheduled as well instead of searching | Maxime Coste
    And a few more code cleanups in the ThreadedRegexVM.
2017-11-01 | Regex: store the processed flag directly in CompiledRegex instructions | Maxime Coste
2017-11-01 | Regex: abandon bytecode and just use a simple list of instructions | Maxime Coste
    Makes the code simpler.
2017-11-01 | Regex: avoid infinite loops | Maxime Coste
2017-11-01 | Regex: Add support for backward matching | Maxime Coste
    A regex can be compiled for backward matching instead of forward matching, and the ThreadedRegexVM can then iterate in reverse over the subject string to find the last match instead of the first.
2017-11-01 | Regex: Remove static RegexCompiler::compile | Maxime Coste
2017-11-01 | Regex: remove use of buffer_utils.hh from regex_impl.cc | Maxime Coste
2017-11-01 | Regex: Use memcpy to write/read offsets from bytecode | Maxime Coste
    reinterpret_cast was undefined behaviour, as we do not guarantee that offsets are going to be stored properly aligned.
2017-11-01 | Regex: slight cleanup of the unit tests | Maxime Coste
2017-11-01 | Regex: Cleanup character class parsing a bit | Maxime Coste
2017-11-01 | Regex: Make ThreadedRegexVM a proper class, define a proper interface | Maxime Coste
2017-11-01 | Regex: Find potential start position using a map of valid start chars | Maxime Coste
    With this optimization we get close to performance parity with boost regex on the common use cases in Kakoune.
2017-11-01 | Regex: Optimize single char character classes as literals | Maxime Coste
2017-11-01 | Regex: reorder lookaround ops, group by direction | Maxime Coste
2017-11-01 | Regex: Fix handling of non capturing groups (?:...) | Maxime Coste
    We were wrongly keeping the `:` as literal content of the group.
2017-11-01 | Regex: do not write the search prefix inside the program bytecode | Maxime Coste
    It's faster to have specialized code in the VM directly.
2017-11-01 | Regex: Use a custom allocated buffer for Saves instead of a Vector | Maxime Coste
2017-11-01 | Regex: fix handling of negative escaped character classes | Maxime Coste
2017-11-01 | Regex: introduce RegexExecFlags to control various behaviours | Maxime Coste
2017-11-01 | Regex: Fix use of not-yet-constructed CompiledRegex in TestVM impl | Maxime Coste
2017-11-01 | Regex: min/max quantifiers can be non greedy as well | Maxime Coste
2017-11-01 | Regex: validate that our custom impl gets the same results as boost regex | Maxime Coste
    In addition to running boost regex, run our custom regex and compare the results to ensure the two regex engines agree.
2017-11-01 | Regex: support escaping characters in character classes | Maxime Coste
2017-11-01 | Regex: add support for case insensitive matching, controlled by (?i) | Maxime Coste
2017-11-01 | Regex: use \A \z for subject start/end | Maxime Coste
    This is the most common syntax in various regex variants.
2017-11-01 | Regex: Implement lookarounds for fixed literal strings | Maxime Coste
    We do not support anything other than a plain literal string for lookarounds.
2017-11-01 | Regex: Support non greedy quantifiers | Maxime Coste
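
The memcpy entry in this log notes that reading offsets through reinterpret_cast was undefined behaviour because offsets stored in the bytecode are not guaranteed to be aligned. A minimal sketch of the memcpy approach follows. It is not the code from regex_impl.cc: the `Offset` type and the `write_offset` and `read_offset` helpers are hypothetical names chosen for illustration, and a plain `std::vector<char>` stands in for the bytecode buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical offset type; the type actually used in regex_impl.cc
// is not shown in this log.
using Offset = std::uint16_t;

// Append an offset to the bytecode as raw bytes. memcpy has no alignment
// requirement, so the offset can land at any position in the buffer.
void write_offset(std::vector<char>& bytecode, Offset offset)
{
    char bytes[sizeof(Offset)];
    std::memcpy(bytes, &offset, sizeof(Offset));
    bytecode.insert(bytecode.end(), bytes, bytes + sizeof(Offset));
}

// Read an offset back from an arbitrary byte position. Copying into a
// properly aligned local avoids dereferencing a misaligned Offset pointer,
// which is what a reinterpret_cast based read would do.
// The caller must ensure pos + sizeof(Offset) <= bytecode.size().
Offset read_offset(const std::vector<char>& bytecode, std::size_t pos)
{
    Offset offset;
    std::memcpy(&offset, bytecode.data() + pos, sizeof(Offset));
    return offset;
}
```

Writing a single opcode byte before calling `write_offset` places the offset at an odd position, which is exactly the case where a cast-based read would be misaligned; `read_offset` handles it the same way as an aligned position.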