| Age | Commit message | Author | |
|---|---|---|---|
| 2018-11-04 | Dump start description as well when writing a regex dump | Maxime Coste | |
| 2018-11-03 | Remove most regex impl special casing for backwards matching | Maxime Coste | |
| 2018-11-01 | Support different type for iterators and sentinel in utf8 functions | Maxime Coste | |
| 2018-10-10 | Cleanup regex lookarounds implementation and reject incompatible regex | Maxime Coste | |
| | Fixes #2487 | | |
| 2018-07-08 | Tweak comment to make it less ambiguous | Maxime Coste | |
| 2018-06-24 | Use a dedicated vm op for dot when match-newline is false | Olivier Perret | |
| 2018-06-24 | Use bit-flags for storing regex options | Olivier Perret | |
| 2018-06-24 | Add support for regex flag to toggle dot-matches-newline | Olivier Perret | |
| 2018-04-30 | Fix wrong use of constexpr | Maxime Coste | |
| 2018-04-29 | Regex: Use only 128 characters in start desc and encode others as 0 | Maxime Coste | |
| | Using 257 was using lots of memory for no good reason, as codepoints above 127 are not common enough to be treated specially. | | |
| 2018-04-28 | Merge remote-tracking branch 'lenormf/regex-format-string' into HEAD | Maxime Coste | |
| 2018-04-28 | fix potential overflow in dump_regex | Maxime Coste | |
| 2018-04-27 | regex_impl: Fix a potential format string flaw | Frank LENORMAND | |
| 2018-04-27 | Add a debug regex command to dump regex instructions | Maxime Coste | |
| 2018-04-27 | Use indices instead of pointers for saves/instruction in ThreadedRegexVM | Maxime Coste | |
| | Performance seems unaffected, but memory usage should be lowered as the Thread struct is 4 bytes instead of 16. | | |
| 2018-04-05 | Fix some trailing spaces and a tab that sneaked into the code base | Maxime Coste | |
| 2018-03-20 | Regex: Only allow SyntaxCharacter and - to be escaped in a character class | Maxime Coste | |
| | Letting any character be escaped is error-prone, as it looks like \l could mean [:lower:] (as it used to with boost) when it only means literal l. Fix the haskell.kak file as well. Fixes #1945 | | |
| 2018-03-05 | Regex: take the full subject range as a parameter | Maxime Coste | |
| | To allow more general lookarounds outside of the actual search range, pass a second range (the actual subject). This allows us to remove various flags such as PrevAvailable or NotBeginOfSubject, which are now easy to check from the subject range. Fixes #1902 | | |
| 2018-02-24 | Regex: Improve comments and constify some variables | Maxime Coste | |
| | Reword various comments to make some tricky parts of the regex engine easier to understand. | | |
| 2018-02-09 | Regex: Use a template argument instead of a regular one for "forward" | Maxime Coste | |
| | forward (which controls whether we are compiling for forward or backward matching) is always statically known, and compilation will first compile forward, then backward (if needed), so by having separate compiled functions we get rid of runtime branches. | | |
| 2018-02-09 | Regex: minor code cleanup | Maxime Coste | |
| 2017-12-01 | Regex: Support forward and backward matching code in the same CompiledRegex | Maxime Coste | |
| | No need to have two separate regexes to handle forward and backward matching; just passing RegexCompileFlags::Backward will add support for backward matching to the regex. For backward-only regexes, pass RegexCompileFlags::NoForward as well to disable generation of forward matching code. | | |
| 2017-12-01 | Regex: Do not allow private use codepoints literals | Maxime Coste | |
| | We use them to encode non-literals in lookarounds, so they can trigger bugs. Fixes #1737 | | |
| 2017-12-01 | Regex: rename StartChars to StartDesc | Maxime Coste | |
| | It only contains chars for now, but it's still more generally describing where matches can start. | | |
| 2017-11-30 | Regex: optimize parsing a bit | Maxime Coste | |
| 2017-11-30 | Regex: smarter handling of start chars computation for character class | Maxime Coste | |
| 2017-11-28 | Regex: Various small code tweaks | Maxime Coste | |
| 2017-11-28 | Regex: optimize compilation by reserving data | Maxime Coste | |
| 2017-11-28 | Regex: Tweak is_ctype implementation style | Maxime Coste | |
| 2017-11-25 | Regex: Replace generic 'Matchers' with specialized functionality | Maxime Coste | |
| | Introduce CharacterClass and CharacterType Regex Op, and optimize their evaluation. | | |
| 2017-11-25 | Regex: do not decode utf8 in accept calls as they always run on ascii | Maxime Coste | |
| 2017-11-13 | Regex: add unit test for #1693 | Maxime Coste | |
| 2017-11-12 | Fix #1693: typo in RegexParser::character_class() | fsub | |
| 2017-11-01 | Regex: remove dead code | Maxime Coste | |
| 2017-11-01 | Regex: Tweak struct layouts of ParsedRegex data | Maxime Coste | |
| 2017-11-01 | Regex: Remove "Ast" from names in the ParsedRegex | Maxime Coste | |
| | It does not add much value, and makes names longer. | | |
| 2017-11-01 | Regex: Optimize parsing and compilation | Maxime Coste | |
| | AstNodes are now POD, stored in a single vector, accessed through their index. The children list is implicit, with nodes storing only the node index at which their child graph ends. That makes reverse iteration slower, but that is only used for reverse matching regexes, which are uncommon. In the general case compilation is now faster. | | |
| 2017-11-01 | Regex: minor cleanup of the regex parsing code | Maxime Coste | |
| 2017-11-01 | Regex: small code cleanup in the Save compilation code | Maxime Coste | |
| 2017-11-01 | Regex: put the other char boolean inside the general start char map | Maxime Coste | |
| 2017-11-01 | Regex: Fix handling of all unicode codepoint as start chars | Maxime Coste | |
| 2017-11-01 | Regex: fix wrong fallthrough in dump_regex | Maxime Coste | |
| 2017-11-01 | Regex: Go back to instruction based search of next start | Maxime Coste | |
| | The previous method, which was a bit faster in the general use case, can hit some cases where we get quadratic behaviour and very slow matching. By using an instruction, we can guarantee our complexity of O(N*M), as we will never have more than N threads (N being the instruction count) and we run the threads once per codepoint in the subject string. That slows down the general case slightly, but ensures we don't have pathological cases. This new version is much faster than the previous instruction-based search because it does not use a plain `.*` searcher, but a specific, smarter instruction specialized for finding the next start if we are in the correct conditions. | | |
| 2017-11-01 | Regex: add support for \0, \cX, \xXX and \uXXXX escapes | Maxime Coste | |
| 2017-11-01 | Regex: compute if codepoints outside of the start chars map can start | Maxime Coste | |
| 2017-11-01 | Regex: abort compilation as soon as we hit the instruction count limit | Maxime Coste | |
| 2017-11-01 | Regex: add a unit test for why lookaheads don't count for start chars anymore | Maxime Coste | |
| 2017-11-01 | Regex: comment the mutables in CompiledRegex::Instruction and fix their init | Maxime Coste | |
| 2017-11-01 | Regex: Introduce a Regex memory domain to track usage separately | Maxime Coste | |
| 2017-11-01 | Regex: use binary search for character class ranges check | Maxime Coste | |
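
The oldest entry in this log switches the character class range check to a binary search. As a rough illustration of that idea only, here is a minimal sketch. The names `Codepoint`, `Range` and `in_ranges` are hypothetical and are not taken from the actual source, and the sketch assumes the ranges are kept sorted by lower bound and do not overlap.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical types for illustration; not the project's actual definitions.
using Codepoint = char32_t;

struct Range
{
    Codepoint min;  // inclusive lower bound
    Codepoint max;  // inclusive upper bound
};

// Assumption: `ranges` is sorted by `min` and the ranges do not overlap,
// which also makes them sorted by `max`.
bool in_ranges(const std::vector<Range>& ranges, Codepoint cp)
{
    // Binary search for the first range whose upper bound is not below cp.
    auto it = std::lower_bound(ranges.begin(), ranges.end(), cp,
                               [](const Range& r, Codepoint c) { return r.max < c; });
    // cp belongs to the class only if that range also starts at or before cp.
    return it != ranges.end() && it->min <= cp;
}
```

With a class such as `[0-9A-Za-z]` stored as three sorted ranges, a lookup costs a logarithmic number of comparisons in the number of ranges rather than a linear scan.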
